Fast Memory-efficient Anomaly Detection in Streaming Heterogeneous Graphs
Title | Fast Memory-efficient Anomaly Detection in Streaming Heterogeneous Graphs |
Publication Type | Conference Paper |
Year of Publication | 2016 |
Authors | Manzoor, Emaad, Milajerdi, Sadegh M., Akoglu, Leman |
Conference Name | Proceedings of the 22Nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining |
Date Published | August 2016 |
Publisher | ACM |
Conference Location | New York, NY, USA |
ISBN Number | 978-1-4503-4232-2 |
Keywords | advanced persistent threat, anomaly detection, Attack Graphs, attack vectors, composability, dynamic networks, edge detection, evolving graphs, graph sketches, heterogenous graphs, Human Behavior, Metrics, pubcrawl, Resiliency, Scalability, security, streamspot, temporal networks, threat vectors, Time Frequency Analysis, typed graphs |
Abstract | Given a stream of heterogeneous graphs containing different types of nodes and edges, how can we spot anomalous ones in real-time while consuming bounded memory? This problem is motivated by and generalizes from its application in security to host-level advanced persistent threat (APT) detection. We propose StreamSpot, a clustering based anomaly detection approach that addresses challenges in two key fronts: (1) heterogeneity, and (2) streaming nature. We introduce a new similarity function for heterogeneous graphs that compares two graphs based on their relative frequency of local substructures, represented as short strings. This function lends itself to a vector representation of a graph, which is (a) fast to compute, and (b) amenable to a sketched version with bounded size that preserves similarity. StreamSpot exhibits desirable properties that a streaming application requires: it is (i) fully-streaming; processing the stream one edge at a time as it arrives, (ii) memory-efficient; requiring constant space for the sketches and the clustering, (iii) fast; taking constant time to update the graph sketches and the cluster summaries that can process over 100,000 edges per second, and (iv) online; scoring and flagging anomalies in real time. Experiments on datasets containing simulated system-call flow graphs from normal browser activity and various attack scenarios (ground truth) show that StreamSpot is high-performance; achieving above 95% detection accuracy with small delay, as well as competitive time and memory usage. |
URL | https://dl.acm.org/doi/10.1145/2939672.2939783 |
DOI | 10.1145/2939672.2939783 |
Citation Key | manzoor_fast_2016 |
- Human behavior
- typed graphs
- Time Frequency Analysis
- threat vectors
- temporal networks
- streamspot
- security
- Scalability
- Resiliency
- pubcrawl
- Metrics
- advanced persistent threat
- heterogenous graphs
- graph sketches
- evolving graphs
- edge detection
- dynamic networks
- composability
- Attack vectors
- attack graphs
- Anomaly Detection