Ranking Causal Anomalies by Modeling Local Propagations on Networked Systems
Title | Ranking Causal Anomalies by Modeling Local Propagations on Networked Systems |
Publication Type | Conference Paper |
Year of Publication | 2017 |
Authors | Ni, J., Cheng, W., Zhang, K., Song, D., Yan, T., Chen, H., Zhang, X. |
Conference Name | 2017 IEEE International Conference on Data Mining (ICDM) |
Date Published | November |
ISBN Number | 978-1-5386-3835-4 |
Keywords | causal anomaly detection, Clustering algorithms, complex system behaviors, complex systems, Debugging, fault diagnosis, global fault propagation, Human Behavior, human factor, human factors, Indexes, invariant network, large-scale complex systems, low-rank network diffusion model, matrix factorization, Metrics, Monitoring, multiple fault diagnosis, networked systems, pattern clustering, probabilistic clustering, Probabilistic logic, pubcrawl, resilience, Resiliency, security of data, single fault propagation, system failure status, Time series analysis |
Abstract | Complex systems are prevalent in many fields such as finance, security and industry. A fundamental problem in system management is to perform diagnosis in case of system failure such that the causal anomalies, i.e., root causes, can be identified for system debugging and repair. Recently, the invariant network has proven to be a powerful tool for characterizing complex system behaviors. In an invariant network, a node represents a system component, and an edge indicates a stable interaction between two components. Recent approaches have shown that by modeling fault propagation in the invariant network, causal anomalies can be effectively discovered. Despite their success, the existing methods have a major limitation: they typically assume that there is only a single, global fault propagation in the entire network. However, in real-world large-scale complex systems, it is more common for multiple fault propagations to grow simultaneously and locally within different node clusters and jointly define the system failure status. Inspired by this key observation, we propose a two-phase framework to identify and rank causal anomalies. In the first phase, probabilistic clustering is performed to uncover impaired node clusters in the invariant network. Then, in the second phase, a low-rank network diffusion model is designed to backtrack causal anomalies in different impaired clusters. Extensive experimental results on real-life datasets demonstrate the effectiveness of our method. |
URL | https://ieeexplore.ieee.org/document/8215592 |
DOI | 10.1109/ICDM.2017.129 |
Citation Key | ni_ranking_2017 |