CyberBERT: A Deep Dynamic-State Session-Based Recommender System for Cyber Threat Recognition

Title: CyberBERT: A Deep Dynamic-State Session-Based Recommender System for Cyber Threat Recognition
Publication Type: Conference Paper
Year of Publication: 2021
Authors: McDonnell, Serena; Nada, Omar; Abid, Muhammad Rizwan; Amjadian, Ehsan
Conference Name: 2021 IEEE Aerospace Conference (50100)
Date Published: March
Keywords: Aerodynamics, computer security, History, Human Behavior, Malware, Predictive models, pubcrawl, Real-time Systems, recommender systems, resilience, Resiliency, Scalability, Training
Abstract: Session-based recommendation is the task of predicting user actions during short online sessions. The user is considered to be anonymous in this setting, with no past behavior history available. Predicting anonymous users' next actions and their preferences in the absence of historical user behavior information is valuable from a cybersecurity and aerospace perspective, as cybersecurity measures rely on the prompt classification of novel threats. Our solution builds on prior representation learning work from natural language processing, namely BERT (Bidirectional Encoder Representations from Transformers; Devlin et al., 2018). In this paper, we propose CyberBERT, the first deep session-based recommender system to employ bidirectional transformers to model the intent of anonymous users within a session. The session-based setting lends itself to applications in threat recognition, through monitoring of real-time user behavior using the CyberBERT architecture. We evaluate the efficiency of this dynamic-state method using the Windows PE Malware API sequence dataset (Catak and Yazi, 2019), which contains behavior for 7107 API call sequences executed by 8 classes of malware. We compare the proposed CyberBERT solution to two high-performing benchmark algorithms on the malware dataset: LSTM (Long Short-Term Memory) and a transformer encoder (Vaswani et al., 2017). We also evaluate the method using the YOOCHOOSE 1/64 dataset, a session-based recommendation dataset that contains 37,483 items, 719,470 sessions, and 31,637,239 clicks. Our experiments demonstrate the advantage of a bidirectional architecture over the unidirectional approach, as well as the flexibility of the CyberBERT solution in modelling the intent of anonymous users in a session. Our system achieves state-of-the-art results as measured by F1 score on the Windows PE Malware API sequence dataset, and state-of-the-art results for P@20 and MRR@20 on YOOCHOOSE 1/64.
As CyberBERT allows for user behavior monitoring in the absence of behavior history, it acts as a robust malware classification system that can recognize threats in aerospace systems, where malicious actors may be interacting with a system for the first time. This work provides the backbone for systems that aim to protect aviation and aerospace applications from prospective third-party applications and malware.
DOI: 10.1109/AERO50100.2021.9438286
Citation Key: mcdonnell_cyberbert_2021