Biblio
Filters: Author is Yang, Shanchieh Jay
Translating Intrusion Alerts to Cyberattack Stages Using Pseudo-Active Transfer Learning (PATRL). 2021 IEEE Conference on Communications and Network Security (CNS). :110–118.
2021. Intrusion alerts continue to grow in volume, variety, and complexity. Their cryptic nature requires substantial time and expertise to interpret the intended consequence of observed malicious actions. To assist security analysts in effectively diagnosing what alerts mean, this work develops a novel machine learning approach that translates alert descriptions to intuitively interpretable Action-Intent-Stages (AIS) with only 1% labeled data. We combine transfer learning, active learning, and pseudo labels to develop the Pseudo-Active Transfer Learning (PATRL) process. The PATRL process begins with a language model trained without supervision on MITRE ATT&CK, CVE, and IDS alert descriptions. The language model feeds an LSTM classifier that is trained with 1% labeled data and further enhanced with active learning using pseudo labels predicted by the iteratively improved models. Our results suggest PATRL can predict correctly for 85% (top-1 label) and 99% (top-3 labels) of the remaining 99% unknown data. Recognizing the need to build the analysts' confidence in using the model, the system provides Monte-Carlo Dropout Uncertainty and a Pseudo-Label Convergence Score for each of the predicted alerts. These metrics give the analyst insight into whether to directly trust the top-1 or top-3 predictions and whether additional pseudo labels are needed. Our approach addresses a rarely tackled research problem in which minimal amounts of labeled data do not reflect the characteristics of the truly unlabeled data. Combining the advantages of transfer learning, active learning, and pseudo labels, the PATRL process translates complex intrusion alert descriptions for the analysts with confidence.
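The abstract above mentions Monte-Carlo dropout uncertainty and pseudo labels without giving formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each alert has already been scored by several stochastic forward passes of a classifier (dropout left on), and that uncertainty is measured as the entropy of the averaged class probabilities. The function names, the entropy measure, and the threshold are hypothetical choices made for illustration.

```python
import math


def summarize_mc_passes(passes, k=3):
    """Summarize several stochastic forward passes for one alert.

    passes: list of class-probability vectors, one per pass (assumed input).
    Returns (mean probabilities, predictive entropy, indices of the top-k classes).
    """
    n_classes = len(passes[0])
    mean = [sum(p[c] for p in passes) / len(passes) for c in range(n_classes)]
    # Entropy of the averaged distribution: an assumed uncertainty measure,
    # not necessarily the one used in the paper.
    entropy = -sum(q * math.log(q) for q in mean if q > 0)
    top_k = sorted(range(n_classes), key=lambda c: mean[c], reverse=True)[:k]
    return mean, entropy, top_k


def select_pseudo_labels(unlabeled, max_entropy, k=3):
    """Keep the top-1 prediction as a pseudo label only for low-uncertainty alerts.

    unlabeled: dict mapping a (hypothetical) alert id to its list of passes.
    """
    chosen = {}
    for alert_id, passes in unlabeled.items():
        _, entropy, top_k = summarize_mc_passes(passes, k)
        if entropy <= max_entropy:
            chosen[alert_id] = top_k[0]
    return chosen
```

In such a scheme, alerts whose entropy stays above the threshold would be left unlabeled for a later iteration or for analyst review, which mirrors the abstract's point that the uncertainty score helps decide whether a prediction can be trusted directly.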
Dynamic Generation of Empirical Cyberattack Models with Engineered Alert Features. MILCOM 2019 - 2019 IEEE Military Communications Conference (MILCOM). :1–6.
2019. Due to the increased diversity and complexity of cyberattacks, innovative and effective analytics are needed to identify critical cyber incidents on a corporate network even if no ground truth data is available. This paper develops an automated system that processes a set of intrusion alerts to create behavior aggregates and then classifies these aggregates into empirical attack models through a dynamic Bayesian approach with innovative feature engineering methods. Each attack model represents a unique collective attack behavior that helps to identify critical activities on the network. Using 2017 National Collegiate Penetration Testing Competition data, it is demonstrated that the developed system is capable of generating and refining unique attack models that make sense to humans, without a priori knowledge.