AI & ML Based Anamoly Detection and Response Using Ember Dataset

Submitted by aekwall on Thu, 07/14/2022 - 4:21pm

Title	AI & ML Based Anamoly Detection and Response Using Ember Dataset
Publication Type	Conference Paper
Year of Publication	2021
Authors	Rathod, Viraj, Parekh, Chandresh, Dholariya, Dharati
Conference Name	2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO)
Date Published	September
Keywords	anomaly detection, composability, cyber security, Cyber Security Analytics, EMBER, feature extraction, machine learning, Market research, Metrics, pubcrawl, ransomware, Resiliency, security, security tools, Threat Detection & Response, Tools, Training
Abstract	In an era of rapid technological growth, malicious traffic has drawn increased attention. Most well-known offensive security assessments today focus heavily on the pre-compromise stage. The volume of anomalous data is now massive, and analyzing it with primitive methods is highly challenging. The proposed solution is early detection: if adversary behaviors can be detected in the early stage of a compromise, defenders can protect themselves against various attacks, including ransomware and zero-day attacks. Integrating Artificial Intelligence and Machine Learning with manual anomaly detection can provide automated, machine-based detection, which in turn can yield a fast, error-free, simple, and scalable Threat Detection & Response system. Endpoint Detection & Response (EDR) tools provide a unified view of complex intrusions, using known adversarial behaviors to identify intrusion events. We used the EMBER dataset, a labelled benchmark dataset for training machine learning models to detect malicious portable executable files. The dataset consists of features derived from 1.1 million binary files: 900,000 training samples (300,000 malicious, 300,000 benign, and 300,000 unlabelled) and 200,000 evaluation samples (100,000 malicious and 100,000 benign). We have also included open-source code for extracting features from additional binaries, so that features from new samples can be added to the dataset.
DOI	10.1109/ICRITO51393.2021.9596451
Citation Key	rathod_ai_2021
