In the years before the Sept. 11, 2001, attacks, terrorism analysts trying to make sense of the streams of incoming data were often overwhelmed, taking years to translate suspicious conversations or even look at important photographs.
Today, the government and private sector face similar problems with the daunting amount of cyber data, according to materials presented at GTC Spring 2022. “The volume and velocity of cyber data is extreme,” Rachel Allen, an NVIDIA senior data scientist, said at a GTC session, Transform Cybersecurity with Accelerated Data Science. “It’s estimated that over 90 percent of cyber data today is left on the floor, so to speak, and is never collected or analyzed at all.”
Enter artificial intelligence (AI).
As cyberattacks become more severe amid a shortage of cybersecurity professionals, AI data science platforms are potentially transformative, NVIDIA experts said at the session. AI, they said, is ideal for cyber use cases because it applies machine learning (ML) at very high speed to identify and act on threats that more traditional platforms tend to miss.
Allen and her colleague, NVIDIA senior cybersecurity engineer Michael Demoret, said the best way to apply AI against cyber threats is NVIDIA Morpheus, the company’s open-source AI cybersecurity framework, which became available to all developers in April.
NVIDIA has been pushing to bring AI – and Morpheus – to the forefront of the cybersecurity debate as agencies accelerate the move to zero trust security. “Given the volume and complexity of security data and the evolving sophistication of cyberattacks…it is an appealing field to apply data science and machine learning,” Allen said.
Walking through several use cases, Allen and Demoret demonstrated how enterprises can use Morpheus to apply the latest research in cybersecurity models to secure their networks. One use case was phishing email detection. Phishing has been on the rise for the past two years and poses a continued threat to Federal agencies.
Allen showed a screen with three emails: One asked the recipient to click a link and log onto a “Hilton Honors” account; the second gave instructions on purchasing $500 worth of iTunes gift cards; the third sought bank information so the recipient could allegedly receive $18 million.
“Can you identify the phishing email from these three examples?” she asked.
Answering her own question, Allen said the only legitimate email was from the Hilton Honors program. The other two were phishing attempts.
She then explained how ML – and Morpheus – identify such suspicious emails better and faster than other platforms. “Traditional approaches to detecting phishing emails may flag emails with features that match previously seen, known malicious phishing emails,” she said. “Some of these features are based solely on URLs. And if we were to only analyze URLs, we would miss both of these phishing emails.”
In contrast, she said, ML uses a “context-aware approach” that spots “these social engineering emails without URLs. Additionally, by feeding the model the whole email, including formatting and font selection, we can capture that … look that many phishing emails have.”
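The contrast Allen describes can be illustrated with a toy sketch. This is not Morpheus code and not the model NVIDIA presented: it swaps in a tiny bag-of-words naive Bayes classifier as a stand-in for a context-aware model, and the blocklist domain, training sentences, and paraphrased test emails are all invented for illustration. It only shows why a check based solely on URLs stays silent on emails that contain no links, while a model reading the whole message can still flag them.

```python
import math
import re
from collections import Counter
from urllib.parse import urlparse

URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)

# Hypothetical blocklist of previously seen malicious domains (assumption).
KNOWN_BAD_DOMAINS = {"evil-login.example"}


def url_only_flag(email_text):
    """Traditional-style check: flag only if a link points to a known-bad domain."""
    for url in URL_PATTERN.findall(email_text):
        if urlparse(url).netloc.lower() in KNOWN_BAD_DOMAINS:
            return True
    return False


def tokenize(text):
    """Lowercase the text and split it into simple word-like tokens."""
    return re.findall(r"[a-z0-9$]+", text.lower())


class TinyNaiveBayes:
    """Multinomial naive Bayes with add-one smoothing over whole-email tokens."""

    def __init__(self):
        self.word_counts = {"phish": Counter(), "legit": Counter()}
        self.doc_counts = Counter()
        self.vocab = set()

    def fit(self, examples):
        for text, label in examples:
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, counts in self.word_counts.items():
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.doc_counts[label] / total_docs)
            for tok in tokenize(text):
                score += math.log((counts[tok] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Invented toy training data (assumption); a real model needs far more.
TRAINING = [
    ("urgent please buy gift cards today and send me the codes", "phish"),
    ("send your bank account details to claim your inheritance of million dollars", "phish"),
    ("i need you to purchase gift cards right now keep this confidential", "phish"),
    ("your monthly statement is ready log in to your account to view it", "legit"),
    ("thanks for staying with us view your points balance in your rewards account", "legit"),
    ("meeting notes attached see you at the team lunch tomorrow", "legit"),
]

# Paraphrases of the three emails described in the session (wording is assumed).
EMAILS = {
    "hotel": "Log in to your Hilton Honors account to view your points: https://hilton.example/login",
    "gift_cards": "Please purchase $500 worth of iTunes gift cards and send me the codes",
    "windfall": "Send your bank information so you can receive $18 million",
}

model = TinyNaiveBayes()
model.fit(TRAINING)

url_results = {name: url_only_flag(text) for name, text in EMAILS.items()}
ml_results = {name: model.predict(text) for name, text in EMAILS.items()}

print(url_results)  # the URL-only check flags none of the three
print(ml_results)   # the whole-text model flags the two emails without links
```

On this toy data the URL-only check flags nothing, while the whole-text classifier labels the gift card and windfall messages as phishing and the hotel message as legitimate. A production system would rely on a far larger labeled corpus and a much stronger language model than this sketch.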
AI faces some challenges in the cybersecurity context, Allen acknowledged, “in comparison to other fields where machine learning has made a large and early impact, like computer vision.” She said the key to advancing ML for cybersecurity “is to borrow successful approaches from other fields with more mature ML applications. For example, we’d like to borrow the advanced graph-based techniques that are used in the world of financial fraud.”
But the NVIDIA experts said they have found ways to help overcome a number of hurdles that Federal leaders face in using AI for cybersecurity, including the requirement to label cybersecurity data sets. That requirement adds to the workload of cybersecurity experts, who are already often overwhelmed by the flood of incoming data.