AI & ML Based Anamoly Detection and Response Using Ember Dataset

Title: AI & ML Based Anamoly Detection and Response Using Ember Dataset
Publication Type: Conference Paper
Year of Publication: 2021
Authors: Rathod, Viraj; Parekh, Chandresh; Dholariya, Dharati
Conference Name: 2021 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO)
Date Published: September
Keywords: anomaly detection, composability, cyber security, Cyber Security Analytics, EMBER, feature extraction, machine learning, Market research, Metrics, pubcrawl, ransomware, Resiliency, security, security tools, Threat Detection & Response, Tools, Training
Abstract: In an era of rapid technological growth, malicious traffic has drawn increased attention. Most well-known offensive security assessments today focus heavily on the pre-compromise stage. The volume of anomalous data is now massive, and analyzing it with primitive methods is highly challenging. The proposed solution rests on early detection: if adversary behaviors can be detected in the early stage of a compromise, defenders can protect themselves against a range of attacks, including ransomware and zero-day attacks. Integrating Artificial Intelligence and Machine Learning with manual anomaly detection can provide automated, machine-based detection, which in turn can yield a fast, error-free, simple, and scalable Threat Detection & Response system. Endpoint Detection & Response (EDR) tools provide a unified view of complex intrusions, using known adversarial behaviors to identify intrusion events. We have used the EMBER dataset, a labelled benchmark dataset for training machine learning models to detect malicious portable executable (PE) files. The dataset consists of features derived from 1.1 million binary files: 900,000 training samples (300,000 malicious, 300,000 benign, and 300,000 unlabelled) and 200,000 evaluation samples (100,000 malicious and 100,000 benign). We have also included open-source code for extracting features from additional binaries, so that features from further samples can be added to the dataset.
DOI: 10.1109/ICRITO51393.2021.9596451
Citation Key: rathod_ai_2021
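
The dataset split quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' code: the names `EMBER_SPLIT`, `split_totals`, and `labeled_only` are hypothetical, and the label encoding (1 for malicious, 0 for benign, -1 for unlabelled) is an assumption of this example. It tallies the counts stated in the abstract and shows how unlabelled samples might be set aside before supervised training.

```python
# Sample counts exactly as stated in the abstract.
EMBER_SPLIT = {
    "train": {"malicious": 300_000, "benign": 300_000, "unlabeled": 300_000},
    "test": {"malicious": 100_000, "benign": 100_000},
}

# Assumed label encoding for this sketch (hypothetical helper constant).
LABELS = {"malicious": 1, "benign": 0, "unlabeled": -1}


def split_totals(split):
    """Return the number of samples in each subset (train, test)."""
    return {name: sum(counts.values()) for name, counts in split.items()}


def labeled_only(samples):
    """Keep only (features, label) pairs that carry a real label.

    A supervised classifier cannot use unlabelled samples directly,
    so they are dropped here before training.
    """
    return [(x, y) for x, y in samples if y != LABELS["unlabeled"]]


totals = split_totals(EMBER_SPLIT)
total_samples = sum(totals.values())
supervised_train = totals["train"] - EMBER_SPLIT["train"]["unlabeled"]

print(totals)
print(total_samples)
print(supervised_train)
```

Under these assumptions, the training subset available to a purely supervised model is the 600,000 labelled samples, while the 300,000 unlabelled samples would only be usable by semi-supervised or unsupervised methods.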