Preparing AV Engines for Fair and Independent Tests

Alexander Vukcevic

How does a cyber-security company prepare its AV engine for fair and independent tests? In our recent blog, we wrote about the importance of independent testing for anti-virus (AV) products, and how organizations such as AV-Comparatives and AV-Test test as rigorously as possible, in an environment as close to the real world as can reasonably be achieved. Their objective: to differentiate between products that protect users well and those that don’t.

They invest significant time and money in ensuring these tests are truly representative. Consequently, anti-malware and cyber-security vendors take the results seriously. A vendor’s own quality assurance process often incorporates these test results. After all, if a test house can find a flaw in a finished product, a malware author certainly would.

So, how do manufacturers make sure their AV solutions score well in these tests?

What does a ‘good’ score mean for a vendor? How can a test result help decide which manufacturer to partner with for an OEM integration?

Critical third-party assessments are essential

In the world of independent cyber-security testing, it is in all our interests to ensure test houses have access to the best facilities and use the most effective methodologies to test products. And it all must be done in a fair and independent manner.

Many of the top cyber-security vendors and test houses have signed up to the Anti-Malware Testing Standards Organization (AMTSO), which provides a standard by which tests are conducted. To use their words, AMTSO members “collaborate to improve the objectivity, quality, and transparency in security testing methodologies”.

The way in which each test house designs and undertakes an evaluation is up to the test house. The commitment to ethical and fair practices is a cornerstone of AMTSO, for both vendors and test houses. This includes the publishing of test methodologies by the test house and a right of appeal for vendors. The code submitted for testing by vendors should be their publicly available code. Everyone should be held to, and tested to, the same standard.

Therefore, the first step to delivering valid and valuable test results is to become a member of AMTSO, since AMTSO promotes fair and ethical testing. The second is to assess the test methodologies published by the test houses. Having done this, it is possible to decide whether a test is appropriate for the product. A product designed for use in consumer environments may not automatically be the best product for OEM integration into a vendor’s gateway or firewall.

The right product for the right test

We previously touched on the three key measurements in AV tests: detection rates, performance, and false positives. The effectiveness of an anti-malware solution is measured by these three indicators, and each is closely related to the others.
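To make these indicators concrete, here is a rough sketch of how they might be computed from raw test counts. The figures and field names are invented for illustration and are not taken from any real test report:

```python
# Hypothetical illustration of the three headline indicators from raw test
# counts. All numbers below are invented for the example.

malware_samples = 10_000   # malicious files in the protection test set
malware_detected = 9_968   # of those, correctly blocked or flagged
clean_samples = 50_000     # legitimate files in the false-positive set
clean_flagged = 7          # legitimate files wrongly flagged

detection_rate = malware_detected / malware_samples   # "protection" score
false_positive_rate = clean_flagged / clean_samples   # usability cost
miss_rate = 1 - detection_rate                        # the "last two percent"

print(f"Detection rate:      {detection_rate:.2%}")
print(f"False positive rate: {false_positive_rate:.4%}")
print(f"Missed samples:      {miss_rate:.2%}")
```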

Performance

In this context, performance measures the impact of hosting the anti-malware solution on the system. It’s relatively easy to build an AV system that has little impact on the local system – simply pass everything off to a cloud security service for analysis. But, taken to an extreme, such a system leaves the device unprotected the moment it has no internet connection.

An ideal architecture couples powerful on-device anti-malware scanning capabilities with a cloud security service that offers zero-day detection. It is the approach most vendors choose to couple performance with efficacy, and something Daniel Steiner wrote about in detail previously. Good performance also comes from tuning the detection system to the most commonly used platform. Unfortunately, the endpoint platform most vendors choose to optimize performance on (Windows 10, 4GB RAM) is not the same as the platform most equipment vendors choose for their firewall, UTM system, or web gateway! Vendors looking to integrate scan engines on network devices should not just take ‘performance’ at face value, but should consider how the underlying architecture of the engine (consumption of RAM and disk, and initialization speed) is affected by a product optimized for testing on an endpoint.
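One practical way for an integrator to look past a headline ‘performance’ score is to measure the engine’s own footprint on the target hardware. The sketch below only illustrates the idea; the engine-loading function is a stand-in to be replaced by whatever initialization call the vendor’s SDK actually provides:

```python
# Sketch of a footprint check an OEM integrator might run on target hardware:
# how long engine initialization takes and how much memory the process peaks at.
# load_engine() is a hypothetical stand-in for the vendor SDK's real init call.
import time
import resource   # POSIX-only; ru_maxrss is KiB on Linux, bytes on macOS

def load_engine(pattern_dir: str) -> object:
    """Hypothetical stand-in: replace with the real SDK initialization."""
    return object()

start = time.monotonic()
engine = load_engine("/opt/engine/patterns")
init_seconds = time.monotonic() - start

peak_kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print(f"Engine initialization: {init_seconds:.3f} s")
print(f"Peak memory (RSS):     {peak_kib / 1024:.1f} MiB")
```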

Detection and false positives

When it comes to detection, experience shows that it’s the last two percent that involves the most skill. Detecting 98 percent of malware is relatively straightforward; tuning a system to perform well on the last two percent is the difficult part. Getting it wrong risks dramatically increasing the false positive rate and reducing the usefulness of the system.

False positives soar if the system is too aggressive in classifying Potentially Unwanted Applications, or if the detection engine is tuned too aggressively. It’s here that a strong foundation in malware detection pays off: years of development work will have gone into refining the engine and producing quality detection scores.

Capabilities such as false positive control mechanisms play an important role in managing the risk of a false positive outbreak resulting from a mis-classified file. However, their benefit in a test environment is limited because of the small number of devices used in the test, which creates an environment analogous to an engine deployed in offline mode. Good engine design and well-architected rules are the most effective way to reduce false positives.
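One common shape such a control mechanism can take is a check against known-good information before a suspicious verdict is allowed to become a detection. The sketch below is only an illustration of that idea; the allowlist file, scoring threshold, and function names are placeholders, not a description of any vendor’s actual mechanism:

```python
# Sketch of a false-positive control step: a heuristic verdict only becomes a
# detection if the file is not already known to be good. The allowlist source
# and the 0.8 threshold are illustrative placeholders.
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def load_allowlist(path: Path) -> set[str]:
    # One known-good SHA-256 hash per line, e.g. hashes of popular clean software.
    return {line.strip() for line in path.read_text().splitlines() if line.strip()}

def final_verdict(sample: Path, heuristic_score: float,
                  allowlist: set[str], threshold: float = 0.8) -> str:
    if sha256_of(sample) in allowlist:
        return "clean"        # known-good file: suppress the detection
    if heuristic_score >= threshold:
        return "malicious"
    return "clean"
```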

Strong foundations: the anti-malware engine

For a scan engine to perform well, whether in the real world or in independent tests, it needs to be robust and maintained through rigorous processes.

Excellent cyber threat intelligence is at the heart of a great scan engine. Identifying the latest malware threats as early as possible requires diverse, high-quality malware sources from all over the world. It also depends on having a stable and modern threat analysis infrastructure, which allows new incoming samples to be classified by a range of threat detection capabilities, most of which are likely only available from within a cloud security infrastructure.

Of course, not all threats can be identified by machines, so the right human expertise is also vital. The hunt for new families – or significant variants of existing malware – that the system may not yet be able to detect on its own is the realm of highly trained and experienced malware researchers.

Finally, the scan engine requires regular updates. Unfortunately, no matter how many times a day a scan engine’s local detection patterns are updated, there will always be a gap between a threat appearing and the corresponding update. Great detection scores are therefore achieved not only through regular pattern updates but also through continuous access to a cloud-based security service such as the Avira Protection Cloud, which bridges the time gap between first sighting and local pattern detection.
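Conceptually, the cloud lookup sits behind the local patterns and is consulted only for files the local engine cannot decide. The sketch below illustrates that flow; the pattern store and the cloud query function are made-up placeholders, not the Avira Protection Cloud’s real API:

```python
# Illustration of how a cloud lookup can bridge the gap between a threat's
# first sighting and the next local pattern update. LOCAL_PATTERNS and
# query_protection_cloud() are hypothetical placeholders, not a real API.
import hashlib

# Tiny stand-in for the regularly updated local patterns: hash -> verdict.
LOCAL_PATTERNS: dict[str, str] = {}

def local_pattern_verdict(sha256: str) -> str | None:
    """Verdict if the local patterns already cover this file, else None."""
    return LOCAL_PATTERNS.get(sha256)

def query_protection_cloud(sha256: str) -> str:
    """Placeholder for a cloud reputation lookup on the file hash."""
    return "unknown"   # a real service would return e.g. 'malicious' or 'clean'

def scan(data: bytes) -> str:
    digest = hashlib.sha256(data).hexdigest()
    verdict = local_pattern_verdict(digest)
    if verdict is not None:
        return verdict   # covered by the latest shipped patterns
    # Not covered locally yet: the cloud may already know the sample,
    # bridging the gap until the next pattern update arrives.
    return query_protection_cloud(digest)
```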


Alexander Vukcevic

Alexander joined Avira in 2000 and leads the Protection Labs & QA teams. He is passionate and enthusiastic about always delivering the best protection and highest quality to customers and partners. With more than 18 years of experience in the anti-malware industry, Alex leads, guides, and motivates his team to deliver market-leading detection for millions of customers.
