Visible to the public Biblio

Filters: Author is Oprea, A.  [Clear All Filters]
2019-03-22
Duan, J., Zeng, Z., Oprea, A., Vasudevan, S..  2018.  Automated Generation and Selection of Interpretable Features for Enterprise Security. 2018 IEEE International Conference on Big Data (Big Data). :1258-1265.

We present an effective machine learning method for malicious activity detection in enterprise security logs. Our method involves feature engineering, or generating new features by applying operators on features of the raw data. We generate DNF formulas from raw features, extract Boolean functions from them, and leverage Fourier analysis to generate new parity features and rank them based on their highest Fourier coefficients. We demonstrate on real enterprise data sets that the engineered features enhance the performance of a wide range of classifiers and clustering algorithms. As compared to classification of raw data features, the engineered features achieve up to 50.6% improvement in malicious recall, while sacrificing no more than 0.47% in accuracy. We also observe better isolation of malicious clusters, when performing clustering on engineered features. In general, a small number of engineered features achieve higher performance than raw data features according to our metrics of interest. Our feature engineering method also retains interpretability, an important consideration in cyber security applications.

2018-02-06
Tiwari, T., Turk, A., Oprea, A., Olcoz, K., Coskun, A. K..  2017.  User-Profile-Based Analytics for Detecting Cloud Security Breaches. 2017 IEEE International Conference on Big Data (Big Data). :4529–4535.

While the growth of cloud-based technologies has benefited the society tremendously, it has also increased the surface area for cyber attacks. Given that cloud services are prevalent today, it is critical to devise systems that detect intrusions. One form of security breach in the cloud is when cyber-criminals compromise Virtual Machines (VMs) of unwitting users and, then, utilize user resources to run time-consuming, malicious, or illegal applications for their own benefit. This work proposes a method to detect unusual resource usage trends and alert the user and the administrator in real time. We experiment with three categories of methods: simple statistical techniques, unsupervised classification, and regression. So far, our approach successfully detects anomalous resource usage when experimenting with typical trends synthesized from published real-world web server logs and cluster traces. We observe the best results with unsupervised classification, which gives an average F1-score of 0.83 for web server logs and 0.95 for the cluster traces.