Machine learning (or artificial intelligence) is a must-have for scaling malware detection. But what type of machine learning should you look for, and how should it be applied?
Scanning and assessing a file for malware in real time requires an anti-malware scan engine. These engines lie at the core of virtually all of today’s malware detection and protection systems and have evolved to include a wide range of detection technologies. Embedding artificial intelligence, code emulation and content extraction technologies, they are very much ‘next gen’ security technology. In this blog, we take a look at how anti-malware scan engines are deployed in security systems. We’ll also look at the benefits of using Software Development Kits (SDKs) to build an anti-malware system.
Anti-malware scan engines have come a long way from the tools of the 1990s, which commonly relied on a list of known signatures. Today, they still include signature-based detection, but the best-performing engines add further technologies such as heuristics and generics, machine learning, emulation and content extraction. Together, these capabilities produce powerful tools that can detect and stop malware in the vast majority of cases.
Anti-malware systems can be designed from scratch, or they can leverage Software Development Kits (SDKs) such as Avira’s Anti-malware SDK. The latter approach makes use of proven, existing engine designs and allows a scan engine to be implemented quickly, without the cost and time associated with in-house development. Put another way, it avoids reinventing the wheel. More importantly, leveraging the SDK of a vendor that consistently achieves the highest detection rates (and the lowest false-positive rates) means that your own security offering benefits from technology built by experts with decades of experience.
A scan engine is one of the key components of a security solution that protects the user (or system) from malware in real time. On occasion, however, it may assess a file as suspicious but be unable to determine definitively whether it contains malicious code. In this case, the scan engine must work with an online cloud security service (such as the Avira Protection Cloud). This brings the benefit of real-time knowledge of threats evolving worldwide. It also offers the option of using more powerful analysis engines to identify whether the code is malicious.
Avira’s business and consumer user base submits tens of millions of suspicious files a day to the Avira Protection Cloud. The vast majority of these submissions are simply enquiries to our cloud backend to check whether a file has already been identified as malware. However, about 400,000 files a day are identified as malware. When that happens, the local scan engine is told to block the file.
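To make the local-then-cloud flow described above more concrete, here is a minimal Python sketch. It is an illustration only and not Avira’s API: `Verdict`, `scan_file`, `demo_local_scan`, `demo_cloud_lookup` and the `KNOWN_BAD` set are all hypothetical names, and the sketch assumes the cloud enquiry is keyed on a SHA-256 hash of the file.

```python
import hashlib
from enum import Enum
from typing import Callable, Optional


class Verdict(Enum):
    CLEAN = "clean"
    SUSPICIOUS = "suspicious"
    MALICIOUS = "malicious"


# Hypothetical callback types; a real engine or cloud client will differ.
LocalScan = Callable[[bytes], Verdict]
CloudLookup = Callable[[str], Optional[Verdict]]


def scan_file(data: bytes, local_scan: LocalScan,
              cloud_lookup: Optional[CloudLookup] = None) -> Verdict:
    """Return the local verdict, escalating to the cloud only when unsure."""
    verdict = local_scan(data)
    if verdict is not Verdict.SUSPICIOUS or cloud_lookup is None:
        return verdict
    # Assumption: the enquiry sends only a hash of the file, not the file.
    digest = hashlib.sha256(data).hexdigest()
    cloud_verdict = cloud_lookup(digest)
    # Keep the local "suspicious" verdict if the cloud has no answer.
    return cloud_verdict if cloud_verdict is not None else verdict


# Stand-ins for a real engine and a real cloud service (demo only).
KNOWN_BAD = {hashlib.sha256(b"evil payload").hexdigest()}


def demo_local_scan(data: bytes) -> Verdict:
    return Verdict.SUSPICIOUS if b"evil" in data else Verdict.CLEAN


def demo_cloud_lookup(digest: str) -> Optional[Verdict]:
    return Verdict.MALICIOUS if digest in KNOWN_BAD else None


if __name__ == "__main__":
    print(scan_file(b"evil payload", demo_local_scan, demo_cloud_lookup))
```

Passing no `cloud_lookup` models an engine without a cloud connection: a suspicious file then stays suspicious, because nothing is available to confirm or clear it.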
Regular updates to the detection engine and the scanning engine are one way to ensure the highest levels of local detection. These updates, which occur multiple times a day, allow the local scan engine to detect the latest malware variants as soon as an update is applied. Consequently, the rate at which updates are applied to the scan engine is one of the factors that dictate how effective it will be at detecting emerging threats without recourse to a cloud security service. This is one reason why smaller updates, applied more frequently, improve detection efficacy. However, the simple fact is that the greatest detection and protection comes from using a local scan engine linked to a cloud security service.
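The effect of update frequency can be illustrated with some simple arithmetic. The Python sketch below rests on assumptions that are not from Avira: updates are evenly spaced through the day, a new detection becomes available at a uniformly random moment between two updates, and download and apply time is ignored. The function name `detection_delay_hours` is hypothetical.

```python
from typing import Tuple


def detection_delay_hours(updates_per_day: int) -> Tuple[float, float]:
    """Return (average, worst-case) hours until a new detection arrives locally.

    Assumes evenly spaced updates and a detection that becomes available
    at a uniformly random moment between two consecutive updates.
    """
    if updates_per_day <= 0:
        raise ValueError("updates_per_day must be positive")
    interval = 24.0 / updates_per_day
    # On average the wait is half the interval; at worst, the full interval.
    return interval / 2.0, interval


if __name__ == "__main__":
    for n in (1, 4, 24):
        average, worst = detection_delay_hours(n)
        print(f"{n} update(s) per day: average {average} h, worst case {worst} h")
```

Under these assumptions, moving from one update a day to hourly updates cuts the average wait from 12 hours to 30 minutes. A cloud lookup removes the wait for files the cloud already knows about.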
How well a scan engine detects new threats also depends on how it is deployed. There are three deployment modes to consider: online, offline and off-net.
Operating in online mode, engines developed using SDKs such as the Anti-malware SDK can leverage the power of a cloud-based security service such as the Avira Protection Cloud. They do not rely solely on the latest detection updates but can also query a live cloud database of known files and even upload unknown files for immediate assessment. Such an approach delivers, for all intents and purposes, complete detection in real time. A scan engine combined with a cloud security service is a very powerful anti-malware solution.
Scan engines connected to the internet but not paired with a cloud security service operate in offline mode. They continue to receive frequent detection updates but cannot access the live cloud database or the detection engines of the cloud security service. The risk is that new types of malware may emerge that differ enough from existing malware families to evade the detection capabilities available locally. In this case, the malware may remain undetected until the next database update.
A scan engine that is completely disconnected from the internet, and unable to receive any updates, is off-net. This (unusual) scenario can be found where critical infrastructure or data is protected by air-gapping. In this case, users are responsible for generating their own detection updates. They may use threat intelligence feeds to enrich their own threat database, from which they can build those updates.
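The three deployment modes can be summarized as a small decision table. The Python sketch below is illustrative only: `Mode`, `Capabilities`, `deployment_mode` and `CAPABILITIES` are hypothetical names rather than part of any SDK, and the capability flags simply restate the descriptions above.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    ONLINE = "online"
    OFFLINE = "offline"
    OFF_NET = "off-net"


@dataclass(frozen=True)
class Capabilities:
    vendor_updates: bool        # receives frequent detection updates
    cloud_lookup: bool          # can query the live cloud database
    upload_unknown_files: bool  # can submit unknown files for assessment


def deployment_mode(has_internet: bool, cloud_paired: bool) -> Mode:
    """Classify a deployment from its connectivity and cloud pairing."""
    if not has_internet:
        # Air-gapped: cloud pairing is irrelevant without connectivity.
        return Mode.OFF_NET
    return Mode.ONLINE if cloud_paired else Mode.OFFLINE


CAPABILITIES = {
    Mode.ONLINE: Capabilities(True, True, True),
    Mode.OFFLINE: Capabilities(True, False, False),
    # Off-net users build and apply their own updates instead.
    Mode.OFF_NET: Capabilities(False, False, False),
}

if __name__ == "__main__":
    mode = deployment_mode(has_internet=True, cloud_paired=False)
    print(mode.value, CAPABILITIES[mode])
```

Keeping the mode separate from the capability table makes it easy to see what is lost at each step: offline mode gives up the cloud, and off-net gives up vendor-supplied updates as well.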
Anti-malware scan engines underpin malware detection in today’s protection systems. While their essential purpose and modus operandi have remained fairly consistent over time, increasingly sophisticated developments in technology, including AI, now assist them in performing their role. Scan engines that work with a cloud-based security service optimize malware detection through live access to a threat intelligence database. For more information on Avira’s scan engine SDK, take a look at Avira’s and for an overview of how Avira enables technology partnerships, you can learn more here.