Visible to the public Biblio

Found 245 results

Filters: Keyword is pattern classification  [Clear All Filters]
2020-10-29
Wei, Qu, Xiao, Shi, Dongbao, Li.  2019.  Malware Classification System Based on Machine Learning. 2019 Chinese Control And Decision Conference (CCDC). :647—652.

The main challenge for malware researchers is the large amount of data and files that need to be evaluated for potential threats. Researchers analyze a large number of new malware daily and classify them in order to extract common features. Therefore, a system that can ensure and improve the efficiency and accuracy of the classification is of great significance for the study of malware characteristics. A high-performance, high-efficiency automatic classification system based on multi-feature selection fusion of machine learning is proposed in this paper. Its performance and efficiency, according to our experiments, have been greatly improved compared to single-featured systems.

Mahajan, Ginika, Saini, Bhavna, Anand, Shivam.  2019.  Malware Classification Using Machine Learning Algorithms and Tools. 2019 Second International Conference on Advanced Computational and Communication Paradigms (ICACCP). :1—8.

Malware classification is the process of categorizing the families of malware on the basis of their signatures. This work focuses on classifying the emerging malwares on the basis of comparable features of similar malwares. This paper proposes a novel framework that categorizes malware samples into their families and can identify new malware samples for analysis. For this six diverse classification techniques of machine learning are used. To get more comparative and thus accurate classification results, analysis is done using two different tools, named as Knime and Orange. The work proposed can help in identifying and thus cleaning new malwares and classifying malware into their families. The correctness of family classification of malwares is investigated in terms of confusion matrix, accuracy and Cohen's Kappa. After evaluation it is analyzed that Random Forest gives the highest accuracy.

Tran, Trung Kien, Sato, Hiroshi, Kubo, Masao.  2019.  Image-Based Unknown Malware Classification with Few-Shot Learning Models. 2019 Seventh International Symposium on Computing and Networking Workshops (CANDARW). :401—407.

Knowing malware types in every malware attacks is very helpful to the administrators to have proper defense policies for their system. It must be a massive benefit for the organization as well as the social if the automatic protection systems could themselves detect, classify an existence of new malware types in the whole network system with a few malware samples. This feature helps to prevent the spreading of malware as soon as any damage is caused to the networks. An approach introduced in this paper takes advantage of One-shot/few-shot learning algorithms in solving the malware classification problems by using some well-known models such as Matching Networks, Prototypical Networks. To demonstrate an efficiency of the approach, we run the experiments on the two malware datasets (namely, MalImg and Microsoft Malware Classification Challenge), and both experiments all give us very high accuracies. We confirm that if applying models correctly from the machine learning area could bring excellent performance compared to the other traditional methods, open a new area of malware research.

Lo, Wai Weng, Yang, Xu, Wang, Yapeng.  2019.  An Xception Convolutional Neural Network for Malware Classification with Transfer Learning. 2019 10th IFIP International Conference on New Technologies, Mobility and Security (NTMS). :1—5.

In this work, we applied a deep Convolutional Neural Network (CNN) with Xception model to perform malware image classification. The Xception model is a recently developed special CNN architecture that is more powerful with less over- fitting problems than the current popular CNN models such as VGG16. However only a few use cases of the Xception model can be found in literature, and it has never been used to solve the malware classification problem. The performance of our approach was compared with other methods including KNN, SVM, VGG16 etc. The experiments on two datasets (Malimg and Microsoft Malware Dataset) demonstrated that the Xception model can achieve the highest training accuracy than all other approaches including the champion approach, and highest validation accuracy than all other approaches including VGG16 model which are using image-based malware classification (except the champion solution as this information was not provided). Additionally, we proposed a novel ensemble model to combine the predictions from .bytes files and .asm files, showing that a lower logloss can be achieved. Although the champion on the Microsoft Malware Dataset achieved a bit lower logloss, our approach does not require any features engineering, making it more effective to adapt to any future evolution in malware, and very much less time consuming than the champion's solution.

Jiang, Jianguo, Li, Song, Yu, Min, Li, Gang, Liu, Chao, Chen, Kai, Liu, Hui, Huang, Weiqing.  2019.  Android Malware Family Classification Based on Sensitive Opcode Sequence. 2019 IEEE Symposium on Computers and Communications (ISCC). :1—7.

Android malware family classification is an advanced task in Android malware analysis, detection and forensics. Existing methods and models have achieved a certain success for Android malware detection, but the accuracy and the efficiency are still not up to the expectation, especially in the context of multiple class classification with imbalanced training data. To address those challenges, we propose an Android malware family classification model by analyzing the code's specific semantic information based on sensitive opcode sequence. In this work, we construct a sensitive semantic feature-sensitive opcode sequence using opcodes, sensitive APIs, STRs and actions, and propose to analyze the code's specific semantic information, generate a semantic related vector for Android malware family classification based on this feature. Besides, aiming at the families with minority, we adopt an oversampling technique based on the sensitive opcode sequence. Finally, we evaluate our method on Drebin dataset, and select the top 40 malware families for experiments. The experimental results show that the Total Accuracy and Average AUC (Area Under Curve, AUC) reach 99.50% and 98.86% with 45. 17s per Android malware, and even if the number of malware families increases, these results remain good.

2020-10-26
Sethi, Kamalakanta, Kumar, Rahul, Sethi, Lingaraj, Bera, Padmalochan, Patra, Prashanta Kumar.  2019.  A Novel Machine Learning Based Malware Detection and Classification Framework. 2019 International Conference on Cyber Security and Protection of Digital Services (Cyber Security). :1–4.
As time progresses, new and complex malware types are being generated which causes a serious threat to computer systems. Due to this drastic increase in the number of malware samples, the signature-based malware detection techniques cannot provide accurate results. Different studies have demonstrated the proficiency of machine learning for the detection and classification of malware files. Further, the accuracy of these machine learning models can be improved by using feature selection algorithms to select the most essential features and reducing the size of the dataset which leads to lesser computations. In this paper, we have developed a machine learning based malware analysis framework for efficient and accurate malware detection and classification. We used Cuckoo sandbox for dynamic analysis which executes malware in an isolated environment and generates an analysis report based on the system activities during execution. Further, we propose a feature extraction and selection module which extracts features from the report and selects the most important features for ensuring high accuracy at minimum computation cost. Then, we employ different machine learning algorithms for accurate detection and fine-grained classification. Experimental results show that we got high detection and classification accuracy in comparison to the state-of-the-art approaches.
Walker, Aaron, Sengupta, Shamik.  2019.  Insights into Malware Detection via Behavioral Frequency Analysis Using Machine Learning. MILCOM 2019 - 2019 IEEE Military Communications Conference (MILCOM). :1–6.
The most common defenses against malware threats involves the use of signatures derived from instances of known malware. However, the constant evolution of the malware threat landscape necessitates defense against unknown malware, making a signature catalog of known threats insufficient to prevent zero-day vulnerabilities from being exploited. Recent research has applied machine learning approaches to identify malware through artifacts of malicious activity as observed through dynamic behavioral analysis. We have seen that these approaches mimic common malware defenses by simply offering a method of detecting known malware. We contribute a new method of identifying software as malicious or benign through analysis of the frequency of Windows API system function calls. We show that this is a powerful technique for malware detection because it generates learning models which understand the difference between malicious and benign software, rather than producing a malware signature classifier. We contribute a method of systematically comparing machine learning models against different datasets to determine their efficacy in accurately distinguishing the difference between malicious and benign software.
2020-10-16
Tong, Weiming, Liu, Bingbing, Li, Zhongwei, Jin, Xianji.  2019.  Intrusion Detection Method of Industrial Control System Based on RIPCA-OCSVM. 2019 3rd International Conference on Electronic Information Technology and Computer Engineering (EITCE). :1148—1154.

In view of the problem that the intrusion detection method based on One-Class Support Vector Machine (OCSVM) could not detect the outliers within the industrial data, which results in the decision function deviating from the training sample, an anomaly intrusion detection algorithm based on Robust Incremental Principal Component Analysis (RIPCA) -OCSVM is proposed in this paper. The method uses RIPCA algorithm to remove outliers in industrial data sets and realize dimensionality reduction. In combination with the advantages of OCSVM on the single classification problem, an anomaly detection model is established, and the Improved Particle Swarm Optimization (IPSO) is used for model parameter optimization. The simulation results show that the method can efficiently and accurately identify attacks or abnormal behaviors while meeting the real-time requirements of the industrial control system (ICS).

Zhang, Rui, Chen, Hongwei.  2019.  Intrusion Detection of Industrial Control System Based on Stacked Auto-Encoder. 2019 Chinese Automation Congress (CAC). :5638—5643.

With the deep integration of industrial control systems and Internet technologies, how to effectively detect whether industrial control systems are threatened by intrusion is a difficult problem in industrial security research. Aiming at the difficulty of high dimensionality and non-linearity of industrial control system network data, the stacked auto-encoder is used to extract the network data features, and the multi-classification support vector machine is used for classification. The research results show that the accuracy of the intrusion detection model reaches 95.8%.

2020-10-12
Sieu, Brandon, Gavrilova, Marina.  2019.  Person Identification from Visual Aesthetics Using Gene Expression Programming. 2019 International Conference on Cyberworlds (CW). :279–286.
The last decade has witnessed an increase in online human interactions, covering all aspects of personal and professional activities. Identification of people based on their behavior rather than physical traits is a growing industry, spanning diverse spheres such as online education, e-commerce and cyber security. One prominent behavior is the expression of opinions, commonly as a reaction to images posted online. Visual aesthetic is a soft, behavioral biometric that refers to a person's sense of fondness to a certain image. Identifying individuals using their visual aesthetics as discriminatory features is an emerging domain of research. This paper introduces a new method for aesthetic feature dimensionality reduction using gene expression programming. The advantage of this method is that the resulting system is capable of using a tree-based genetic approach for feature recombination. Reducing feature dimensionality improves classifier accuracy, reduces computation runtime, and minimizes required storage. The results obtained on a dataset of 200 Flickr users evaluating 40000 images demonstrates a 94% accuracy of identity recognition based solely on users' aesthetic preferences. This outperforms the best-known method by 13.5%.
2020-09-28
Akaishi, Sota, Uda, Ryuya.  2019.  Classification of XSS Attacks by Machine Learning with Frequency of Appearance and Co-occurrence. 2019 53rd Annual Conference on Information Sciences and Systems (CISS). :1–6.
Cross site scripting (XSS) attack is one of the attacks on the web. It brings session hijack with HTTP cookies, information collection with fake HTML input form and phishing with dummy sites. As a countermeasure of XSS attack, machine learning has attracted a lot of attention. There are existing researches in which SVM, Random Forest and SCW are used for the detection of the attack. However, in the researches, there are problems that the size of data set is too small or unbalanced, and that preprocessing method for vectorization of strings causes misclassification. The highest accuracy of the classification was 98% in existing researches. Therefore, in this paper, we improved the preprocessing method for vectorization by using word2vec to find the frequency of appearance and co-occurrence of the words in XSS attack scripts. Moreover, we also used a large data set to decrease the deviation of the data. Furthermore, we evaluated the classification results with two procedures. One is an inappropriate procedure which some researchers tend to select by mistake. The other is an appropriate procedure which can be applied to an attack detection filter in the real environment.
Sliwa, Benjamin, Haferkamp, Marcus, Al-Askary, Manar, Dorn, Dennis, Wietfeld, Christian.  2018.  A radio-fingerprinting-based vehicle classification system for intelligent traffic control in smart cities. 2018 Annual IEEE International Systems Conference (SysCon). :1–5.
The measurement and provision of precise and up-to-date traffic-related key performance indicators is a key element and crucial factor for intelligent traffic control systems in upcoming smart cities. The street network is considered as a highly-dynamic Cyber Physical System (CPS) where measured information forms the foundation for dynamic control methods aiming to optimize the overall system state. Apart from global system parameters like traffic flow and density, specific data, such as velocity of individual vehicles as well as vehicle type information, can be leveraged for highly sophisticated traffic control methods like dynamic type-specific lane assignments. Consequently, solutions for acquiring these kinds of information are required and have to comply with strict requirements ranging from accuracy over cost-efficiency to privacy preservation. In this paper, we present a system for classifying vehicles based on their radio-fingerprint. In contrast to other approaches, the proposed system is able to provide real-time capable and precise vehicle classification as well as cost-efficient installation and maintenance, privacy preservation and weather independence. The system performance in terms of accuracy and resource-efficiency is evaluated in the field using comprehensive measurements. Using a machine learning based approach, the resulting success ratio for classifying cars and trucks is above 99%.
Piskachev, Goran, Nguyen Quang Do, Lisa, Johnson, Oshando, Bodden, Eric.  2019.  SWAN\_ASSIST: Semi-Automated Detection of Code-Specific, Security-Relevant Methods. 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). :1094–1097.
To detect specific types of bugs and vulnerabilities, static analysis tools must be correctly configured with security-relevant methods (SRM), e.g., sources, sinks, sanitizers and authentication methods-usually a very labour-intensive and error-prone process. This work presents the semi-automated tool SWAN\_ASSIST, which aids the configuration with an IntelliJ plugin based on active machine learning. It integrates our novel automated machine-learning approach SWAN, which identifies and classifies Java SRM. SWAN\_ASSIST further integrates user feedback through iterative learning. SWAN\_ASSIST aids developers by asking them to classify at each point in time exactly those methods whose classification best impact the classification result. Our experiments show that SWAN\_ASSIST classifies SRM with a high precision, and requires a relatively low effort from the user. A video demo of SWAN\_ASSIST can be found at https://youtu.be/fSyD3V6EQOY. The source code is available at https://github.com/secure-software-engineering/swan.
2020-09-04
Khan, Aasher, Rehman, Suriya, Khan, Muhammad U.S, Ali, Mazhar.  2019.  Synonym-based Attack to Confuse Machine Learning Classifiers Using Black-box Setting. 2019 4th International Conference on Emerging Trends in Engineering, Sciences and Technology (ICEEST). :1—7.
Twitter being the most popular content sharing platform is giving rise to automated accounts called “bots”. Majority of the users on Twitter are bots. Various machine learning (ML) algorithms are designed to detect bots avoiding the vulnerability constraints of ML-based models. This paper contributes to exploit vulnerabilities of machine learning (ML) algorithms through black-box attack. An adversarial text sequence misclassifies the results of deep learning (DL) classifiers for bot detection. Literature shows that ML models are vulnerable to attacks. The aim of this paper is to compromise the accuracy of ML-based bot detection algorithms by replacing original words in tweets with their synonyms. Our results show 7.2% decrease in the accuracy for bot tweets, therefore classifying bot tweets as legitimate tweets.
Tsingenopoulos, Ilias, Preuveneers, Davy, Joosen, Wouter.  2019.  AutoAttacker: A reinforcement learning approach for black-box adversarial attacks. 2019 IEEE European Symposium on Security and Privacy Workshops (EuroS PW). :229—237.
Recent research has shown that machine learning models are susceptible to adversarial examples, allowing attackers to trick a machine learning model into making a mistake and producing an incorrect output. Adversarial examples are commonly constructed or discovered by using gradient-based methods that require white-box access to the model. In most real-world AI system deployments, having complete access to the machine learning model is an unrealistic threat model. However, it is possible for an attacker to construct adversarial examples even in the black-box case - where we assume solely a query capability to the model - with a variety of approaches each with its advantages and shortcomings. We introduce AutoAttacker, a novel reinforcement learning framework where agents learn how to operate around the black-box model by querying it, to effectively extract the underlying decision behaviour, and to undermine it successfully. AutoAttacker is a first of kind framework that uses reinforcement learning and assumes nothing about the differentiability or structure of the underlying function and is thus robust towards common defenses like gradient obfuscation or adversarial training. Finally, without differentiable output, as in binary classification, most methods cease to operate and require either an approximation of the gradient, or another approach altogether. Our approach, however, maintains the capability to function when the output descriptiveness diminishes.
Usama, Muhammad, Qayyum, Adnan, Qadir, Junaid, Al-Fuqaha, Ala.  2019.  Black-box Adversarial Machine Learning Attack on Network Traffic Classification. 2019 15th International Wireless Communications Mobile Computing Conference (IWCMC). :84—89.

Deep machine learning techniques have shown promising results in network traffic classification, however, the robustness of these techniques under adversarial threats is still in question. Deep machine learning models are found vulnerable to small carefully crafted adversarial perturbations posing a major question on the performance of deep machine learning techniques. In this paper, we propose a black-box adversarial attack on network traffic classification. The proposed attack successfully evades deep machine learning-based classifiers which highlights the potential security threat of using deep machine learning techniques to realize autonomous networks.

Liang, Jiaqi, Li, Linjing, Chen, Weiyun, Zeng, Daniel.  2019.  Targeted Addresses Identification for Bitcoin with Network Representation Learning. 2019 IEEE International Conference on Intelligence and Security Informatics (ISI). :158—160.

The anonymity and decentralization of Bitcoin make it widely accepted in illegal transactions, such as money laundering, drug and weapon trafficking, gambling, to name a few, which has already caused significant security risk all around the world. The obvious de-anonymity approach that matches transaction addresses and users is not possible in practice due to limited annotated data set. In this paper, we divide addresses into four types, exchange, gambling, service, and general, and propose targeted addresses identification algorithms with high fault tolerance which may be employed in a wide range of applications. We use network representation learning to extract features and train imbalanced multi-classifiers. Experimental results validated the effectiveness of the proposed method.

2020-08-28
Hasanin, Tawfiq, Khoshgoftaar, Taghi M., Leevy, Joffrey L..  2019.  A Comparison of Performance Metrics with Severely Imbalanced Network Security Big Data. 2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI). :83—88.

Severe class imbalance between the majority and minority classes in large datasets can prejudice Machine Learning classifiers toward the majority class. Our work uniquely consolidates two case studies, each utilizing three learners implemented within an Apache Spark framework, six sampling methods, and five sampling distribution ratios to analyze the effect of severe class imbalance on big data analytics. We use three performance metrics to evaluate this study: Area Under the Receiver Operating Characteristic Curve, Area Under the Precision-Recall Curve, and Geometric Mean. In the first case study, models were trained on one dataset (POST) and tested on another (SlowlorisBig). In the second case study, the training and testing dataset roles were switched. Our comparison of performance metrics shows that Area Under the Precision-Recall Curve and Geometric Mean are sensitive to changes in the sampling distribution ratio, whereas Area Under the Receiver Operating Characteristic Curve is relatively unaffected. In addition, we demonstrate that when comparing sampling methods, borderline-SMOTE2 outperforms the other methods in the first case study, and Random Undersampling is the top performer in the second case study.

Huang, Angus F.M., Chi-Wei, Yang, Tai, Hsiao-Chi, Chuan, Yang, Huang, Jay J.C., Liao, Yu-Han.  2019.  Suspicious Network Event Recognition Using Modified Stacking Ensemble Machine Learning. 2019 IEEE International Conference on Big Data (Big Data). :5873—5880.
This study aims to detect genuine suspicious events and false alarms within a dataset of network traffic alerts. The rapid development of cloud computing and artificial intelligence-oriented automatic services have enabled a large amount of data and information to be transmitted among network nodes. However, the amount of cyber-threats, cyberattacks, and network intrusions have increased in various domains of network environments. Based on the fields of data science and machine learning, this paper proposes a series of solutions involving data preprocessing, exploratory data analysis, new features creation, features selection, ensemble learning, models construction, and verification to identify suspicious network events. This paper proposes a modified form of stacking ensemble machine learning which includes AdaBoost, Neural Networks, Random Forest, LightGBM, and Extremely Randomised Trees (Extra Trees) to realise a high-performance classification. A suspicious network event recognition dataset for a security operations centre, which uses real network log observations from the 2019 IEEE BigData Cup Challenge, is used as an experimental dataset. This paper investigates the possibility of integrating big-data analytics, machine learning, and data science to improve intelligent cybersecurity.
Traylor, Terry, Straub, Jeremy, Gurmeet, Snell, Nicholas.  2019.  Classifying Fake News Articles Using Natural Language Processing to Identify In-Article Attribution as a Supervised Learning Estimator. 2019 IEEE 13th International Conference on Semantic Computing (ICSC). :445—449.

Intentionally deceptive content presented under the guise of legitimate journalism is a worldwide information accuracy and integrity problem that affects opinion forming, decision making, and voting patterns. Most so-called `fake news' is initially distributed over social media conduits like Facebook and Twitter and later finds its way onto mainstream media platforms such as traditional television and radio news. The fake news stories that are initially seeded over social media platforms share key linguistic characteristics such as making excessive use of unsubstantiated hyperbole and non-attributed quoted content. In this paper, the results of a fake news identification study that documents the performance of a fake news classifier are presented. The Textblob, Natural Language, and SciPy Toolkits were used to develop a novel fake news detector that uses quoted attribution in a Bayesian machine learning system as a key feature to estimate the likelihood that a news article is fake. The resultant process precision is 63.333% effective at assessing the likelihood that an article with quotes is fake. This process is called influence mining and this novel technique is presented as a method that can be used to enable fake news and even propaganda detection. In this paper, the research process, technical analysis, technical linguistics work, and classifier performance and results are presented. The paper concludes with a discussion of how the current system will evolve into an influence mining system.

2020-08-24
Liang, Dai, Pan, Peisheng.  2019.  Research on Intrusion Detection Based on Improved DBN-ELM. 2019 International Conference on Communications, Information System and Computer Engineering (CISCE). :495–499.
To leverage the feature extraction of DBN and the fast classification and good generalization of ELM, an improved method of DBN-ELM is proposed for intrusion detection. The improved model uses deep belief network (DBN) to train NSL-KDD dataset and feed them back to the extreme learning machine (ELM) for classification. A classifier is connected at each intermediate level of the DBN-ELM. By majority voting on the output of classifier and ELM, the final output is calculated by integration. Experiments show that the improved model increases the classification confidence and accuracy of the classifier. The model has been benchmarked on the NSL-KDD dataset, and the accuracy of the model has been improved to 97.82%, while the false alarm rate has been reduced to 1.81%. Proposed improved model has been also compared with DBN, ELM, DBN-ELM and achieves competitive accuracy.
Raghavan, Pradheepan, Gayar, Neamat El.  2019.  Fraud Detection using Machine Learning and Deep Learning. 2019 International Conference on Computational Intelligence and Knowledge Economy (ICCIKE). :334–339.
Frauds are known to be dynamic and have no patterns, hence they are not easy to identify. Fraudsters use recent technological advancements to their advantage. They somehow bypass security checks, leading to the loss of millions of dollars. Analyzing and detecting unusual activities using data mining techniques is one way of tracing fraudulent transactions. transactions. This paper aims to benchmark multiple machine learning methods such as k-nearest neighbor (KNN), random forest and support vector machines (SVM), while the deep learning methods such as autoencoders, convolutional neural networks (CNN), restricted boltzmann machine (RBM) and deep belief networks (DBN). The datasets which will be used are the European (EU) Australian and German dataset. The Area Under the ROC Curve (AUC), Matthews Correlation Coefficient (MCC) and Cost of failure are the 3-evaluation metrics that would be used.
Sarma, Subramonian Krishna.  2019.  Optimized Activation Function on Deep Belief Network for Attack Detection in IoT. 2019 Third International conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC). :702–708.
This paper mainly focuses on presenting a novel attack detection system to thread out the risk issues in IoT. The presented attack detection system links the interconnection of DevOps as it creates the correlation between development and IT operations. Further, the presented attack detection model ensures the operational security of different applications. In view of this, the implemented system incorporates two main stages named Proposed Feature Extraction process and Classification. The data from every application is processed with the initial stage of feature extraction, which concatenates the statistical and higher-order statistical features. After that, these extracted features are supplied to classification process, where determines the presence of attacks. For this classification purpose, this paper aims to deploy the optimized Deep Belief Network (DBN), where the activation function is tuned optimally. Furthermore, the optimal tuning is done by a renowned meta-heuristic algorithm called Lion Algorithm (LA). Finally, the performance of proposed work is compared and proved over other conventional methods.
Lavrenovs, Arturs, Visky, Gabor.  2019.  Exploring features of HTTP responses for the classification of devices on the Internet. 2019 27th Telecommunications Forum (℡FOR). :1–4.
Devices that are connected to the Internet are very interesting to security researchers as are at high risk of being attacked, compromised or otherwise abused. To investigate the root causes of the risks it is necessary to understand what classes of devices are affected in different ways. These devices are heterogeneous, thus making it impractical to classify large sets by applying static rules. We propose improvements for manually labelling training sets using HTTP response features for future classification using a neural network.
Gupta, Nitika, Traore, Issa, de Quinan, Paulo Magella Faria.  2019.  Automated Event Prioritization for Security Operation Center using Deep Learning. 2019 IEEE International Conference on Big Data (Big Data). :5864–5872.
Despite their popularity, Security Operation Centers (SOCs) are facing increasing challenges and pressure due to the growing volume, velocity and variety of the IT infrastructure and security data observed on a daily basis. Due to the mixed performance of current technological solutions, e.g. IDS and SIEM, there is an over-reliance on manual analysis of the events by human security analysts. This creates huge backlogs and slow down considerably the resolution of critical security events. Obvious solutions include increasing accuracy and efficiency in the automation of crucial aspects of the SOC workflow, such as the event classification and prioritization. In the current paper, we present a new approach for SOC event classification by identifying a set of new features using graphical analysis and classifying using a deep neural network model. Experimental evaluation using real SOC event log data yields very encouraging results in terms of classification accuracy.