Biblio
As increasingly more enterprises are deploying cloud file-sharing services, this adds a new channel for potential insider threats to company data and IPs. In this paper, we introduce a two-stage machine learning system to detect anomalies. In the first stage, we project the access logs of cloud file-sharing services onto relationship graphs and use three complementary graph-based unsupervised learning methods: OddBall, PageRank and Local Outlier Factor (LOF) to generate outlier indicators. In the second stage, we ensemble the outlier indicators and introduce the discrete wavelet transform (DWT) method, and propose a procedure to use wavelet coefficients with the Haar wavelet function to identify outliers for insider threat. The proposed system has been deployed in a real business environment, and demonstrated effectiveness by selected case studies.