Machine Learning: The Difference between Tool and Security Layer


After years of active research, artificial intelligence and machine learning have gained traction and are now integrated into every cybersecurity layer, improving unknown-malware detection, spam detection, URL filtering and network anomaly detection. Still, the industry has only scratched the surface, and many opportunities remain to be explored.

The security industry has actively invested in machine learning technology, attracted by its high detection rate and its ability to analyze large volumes of data quickly.

In 2016, global spending on AI was estimated at $20 billion to $30 billion, and a report from ABI Research forecasts that spending on big data, intelligence and analytics in cybersecurity will approach $96 billion by 2021, driven by the growing popularity of machine learning in the field.

According to one survey, 45 percent of CIOs will focus their efforts on machine learning, and 70 percent trust that implementing it will lead to more accurate decisions.

Machine learning is a key detection tool. It builds on rules and patterns defined by human analysts to make predictions and to manage the large volumes of data generated by the growing number of malware families.

Because it delivers data-driven insights based on pattern recognition, machine learning is a competitive advantage. It shouldn’t be labelled a universal solution for malware prevention, but rather a complementary approach that augments existing security layers.

One drawback is that engineers must train the models before they can improve existing security layers. Used as a stand-alone tool, machine learning also has limits in detection and efficiency that could leave it vulnerable to attacks. For example, a model may be trained only to scan files because the endpoint security solution it runs in cannot read raw memory, so additional technologies are needed to cover that gap. Unless it is supported by additional detection methods, an overly generic algorithm may deliver false positives, while an overly restrictive one could produce false negatives.
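
The trade-off between false positives and false negatives can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the scores are invented, and `count_errors` is a hypothetical helper, not part of any security product. It only shows how moving a single decision threshold trades one kind of error for the other.

```python
# Invented scores from an imagined malware classifier: (score, is_malware).
# A higher score means the model considers the file more suspicious.
SAMPLES = [
    (0.05, False), (0.20, False), (0.35, False), (0.55, False),
    (0.45, True), (0.65, True), (0.80, True), (0.95, True),
]


def count_errors(samples, threshold):
    """Return (false_positives, false_negatives) when every file with
    score >= threshold is flagged as malware."""
    false_positives = sum(
        1 for score, is_malware in samples
        if score >= threshold and not is_malware
    )
    false_negatives = sum(
        1 for score, is_malware in samples
        if score < threshold and is_malware
    )
    return false_positives, false_negatives


for threshold in (0.3, 0.5, 0.7):
    fp, fn = count_errors(SAMPLES, threshold)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

On this toy data, the loosest threshold flags two clean files and misses no malware, while the strictest one flags no clean files but misses two malicious ones. Neither setting removes both kinds of error, which is why the threshold alone cannot replace additional detection methods.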

Machine learning can be outsmarted because it cannot detect gaps in the data it analyzes. It all comes down to how the algorithm was originally trained. If the training data set contains a clean file mislabeled as malware, the algorithm will learn to treat similar files as malware, compromising the output. Likewise, if new data doesn’t resemble the patterns seen in training, the algorithm will not know what to do with it.
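
The effect of mislabeled training data can be sketched with a toy nearest-centroid classifier in Python. This is a simplified illustration under stated assumptions, not how any particular product works: the single numeric feature, the data values, and the `train` and `predict` helpers are all hypothetical.

```python
from statistics import mean


def train(samples):
    """Compute one centroid (average feature value) per label
    from a list of (feature, label) pairs."""
    by_label = {}
    for feature, label in samples:
        by_label.setdefault(label, []).append(feature)
    return {label: mean(values) for label, values in by_label.items()}


def predict(centroids, feature):
    """Return the label whose centroid is closest to the feature."""
    return min(centroids, key=lambda label: abs(centroids[label] - feature))


# Invented one-dimensional feature; low values stand for clean files.
correct_labels = [
    (1.0, "clean"), (2.0, "clean"), (3.0, "clean"),
    (8.0, "malware"), (9.0, "malware"),
]
# The same files, but two clean ones were mislabeled as malware.
corrupted_labels = [
    (1.0, "clean"), (2.0, "malware"), (3.0, "malware"),
    (8.0, "malware"), (9.0, "malware"),
]

new_clean_file = 3.5
print(predict(train(correct_labels), new_clean_file))    # clean
print(predict(train(corrupted_labels), new_clean_file))  # malware
```

The two mislabeled samples pull the malware centroid toward the clean files, so the same new file that the correctly trained model accepts is flagged by the corrupted one. Nothing in the model signals that its training labels were wrong.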

In short, if the training data is corrupted, the output will be flawed. Machine learning algorithms use trained models to extract patterns from data. Although they recognize those patterns in real time without human assistance, people are still needed to interpret the output and to revalidate the algorithm, removing false positives and improving detection.

Used as a static analysis tool, machine learning is only one piece of a larger puzzle in malware detection. The industry shouldn’t rely solely on machine learning algorithms as a universal solution for all security problems.

Beyond next-gen endpoint protection agents, machine learning has been deployed in earthquake prediction, fraud detection, face and voice recognition software, and search engines by top technology companies such as Google, Amazon and Apple.