Analyst Intuition Based Hidden Markov Model on High Speed, Temporal Cyber Security Big Data

Submitted by grigby1 on Wed, 11/14/2018 - 1:21pm

Title	Analyst Intuition Based Hidden Markov Model on High Speed, Temporal Cyber Security Big Data
Publication Type	Conference Paper
Year of Publication	2017
Authors	Teoh, T. T., Nguwi, Y. Y., Elovici, Y., Cheung, N. M., Ng, W. L.
Conference Name	2017 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD)
Date Published	July
Keywords	Analyst Intuition, attacker, Big Data, Clustering algorithms, computer security, cyber security, cyber security attack, cyber security data, cyber security expert, cyber security log, Data analysis, data mining, Data models, Expectation Regulated, expert systems, forecasting time series data, fuzzy k mean cluster, Fuzzy k-means (FKM), fuzzy set theory, hidden Markov model, Hidden Markov Model (HMM), Hidden Markov models, High Velocity, HMM state, Human Behavior, IP addresses, Malware, Multi-layer Perceptron (MLP), network protocols, Principal Component Analysis (PCA), probabilistic models, pubcrawl, resilience, Resiliency, Scalability, scoring system, security, security attacks, security of data, statistical data, temporal cyber security big data, time series, unsure attack, Virus
Abstract	Hidden Markov Models (HMMs) are probabilistic models that can be used for forecasting time series data. They have seen success in various domains such as finance [1-5], bioinformatics [6-8], healthcare [9-11], agriculture [12-14], and artificial intelligence [15-17]. However, the use of HMMs in cyber security to date is limited. We believe that the properties of HMMs, namely being predictive and probabilistic and being able to model different naturally occurring states, form a good basis for modeling cyber security data. The motivation of this work is hence to provide the initial results of our attempts to predict security attacks using an HMM. Large network datasets representing cyber security attacks have been used in this work to establish an expert system. The characteristics of attackers' IP addresses can be extracted from our integrated datasets to generate statistical data. The cyber security expert provides the weight of each attribute and forms a scoring system by annotating the log history. We applied an HMM to distinguish between a cyber security attack, an unsure case, and no attack by first breaking the data into three clusters using fuzzy k-means (FKM), then manually labeling a small amount of data (Analyst Intuition), and finally using an HMM state-based approach. Our results are very encouraging compared with anomaly detection on a cyber security log, which generally produces a large number of false detections.
DOI	10.1109/FSKD.2017.8393092
Citation Key	teoh_analyst_2017
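
The pipeline outlined in the abstract (cluster scored records into three groups with fuzzy k-means, then decode attack / unsure / no-attack states with an HMM) might be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the features, expert weights, or model parameters, so the scores, initial centers, and all probabilities here are hypothetical, and the choices to hard-assign each record to its strongest cluster and to decode with the Viterbi algorithm are assumptions as well.

```python
import math

def memberships(xs, centers, m):
    """Fuzzy membership of each scalar score in each cluster."""
    rows = []
    for x in xs:
        d = [abs(x - c) + 1e-9 for c in centers]  # small offset avoids division by zero
        rows.append([1.0 / sum((di / dj) ** (2.0 / (m - 1.0)) for dj in d) for di in d])
    return rows

def fuzzy_kmeans_1d(xs, centers, m=2.0, iters=50):
    """Plain fuzzy k-means (fuzzy c-means) on scalar scores."""
    centers = list(centers)
    for _ in range(iters):
        u = memberships(xs, centers, m)
        centers = [
            sum((u[n][i] ** m) * xs[n] for n in range(len(xs)))
            / sum(u[n][i] ** m for n in range(len(xs)))
            for i in range(len(centers))
        ]
    return centers, memberships(xs, centers, m)

def viterbi(obs, start, trans, emit):
    """Most likely hidden-state path for a discrete observation sequence."""
    n = len(start)
    logp = [math.log(start[s]) + math.log(emit[s][obs[0]]) for s in range(n)]
    back = []
    for o in obs[1:]:
        prev, ptr, logp = logp, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + math.log(trans[r][s]))
            ptr.append(best)
            logp.append(prev[best] + math.log(trans[best][s]) + math.log(emit[s][o]))
        back.append(ptr)
    state = max(range(n), key=lambda s: logp[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Hypothetical expert-weighted scores for a time-ordered sequence of log records.
scores = [0.1, 0.2, 0.15, 0.5, 0.55, 0.9, 0.95, 0.85]
centers, u = fuzzy_kmeans_1d(scores, centers=[0.0, 0.5, 1.0])

# Assumption: each record is observed as the index of its strongest cluster.
obs = [max(range(3), key=lambda i: row[i]) for row in u]

# Hypothetical HMM parameters; in practice they would be informed by the
# small analyst-labeled sample mentioned in the abstract.
STATES = ["no attack", "unsure", "attack"]
start = [0.6, 0.3, 0.1]
trans = [[0.8, 0.15, 0.05], [0.2, 0.6, 0.2], [0.05, 0.15, 0.8]]
emit = [[0.8, 0.15, 0.05], [0.2, 0.6, 0.2], [0.05, 0.15, 0.8]]

path = viterbi(obs, start, trans, emit)
print([STATES[s] for s in path])
```

The sticky transition matrix encodes the assumption that a source tends to stay in its current state from one record to the next, which is one way an HMM can smooth over isolated high-scoring records that a per-record anomaly threshold would flag.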
