Ranking Causal Anomalies by Modeling Local Propagations on Networked Systems

Title: Ranking Causal Anomalies by Modeling Local Propagations on Networked Systems
Publication Type: Conference Paper
Year of Publication: 2017
Authors: Ni, J., Cheng, W., Zhang, K., Song, D., Yan, T., Chen, H., Zhang, X.
Conference Name: 2017 IEEE International Conference on Data Mining (ICDM)
Date Published: Nov
ISBN Number: 978-1-5386-3835-4
Keywords: causal anomaly detection, Clustering algorithms, complex system behaviors, complex systems, Debugging, fault diagnosis, global fault propagation, Human Behavior, human factor, human factors, Indexes, invariant network, large-scale complex systems, low-rank network diffusion model, matrix factorization, Metrics, Monitoring, multiple fault diagnosis, networked systems, pattern clustering, probabilistic clustering, Probabilistic logic, pubcrawl, resilience, Resiliency, security of data, single fault propagation, system failure status, Time series analysis
Abstract

Complex systems are prevalent in many fields such as finance, security and industry. A fundamental problem in system management is to perform diagnosis in case of system failure such that the causal anomalies, i.e., root causes, can be identified for system debugging and repair. Recently, the invariant network has proven to be a powerful tool for characterizing complex system behaviors. In an invariant network, a node represents a system component, and an edge indicates a stable interaction between two components. Recent approaches have shown that by modeling fault propagation in the invariant network, causal anomalies can be effectively discovered. Despite their success, the existing methods have a major limitation: they typically assume there is only a single, global fault propagation in the entire network. However, in real-world large-scale complex systems, it is more common for multiple fault propagations to grow simultaneously and locally within different node clusters and jointly define the system failure status. Inspired by this key observation, we propose a two-phase framework to identify and rank causal anomalies. In the first phase, probabilistic clustering is performed to uncover impaired node clusters in the invariant network. Then, in the second phase, a low-rank network diffusion model is designed to backtrack causal anomalies in different impaired clusters. Extensive experimental results on real-life datasets demonstrate the effectiveness of our method.
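
As a rough illustration of what "modeling fault propagation" on an invariant network can look like, the sketch below ranks nodes with a plain random walk with restart. This is a generic diffusion technique chosen for illustration, not the paper's two-phase method or its low-rank diffusion model. The function name `rank_by_diffusion`, the `restart` parameter, the toy edge list, and the `seed` scores (standing in for per-node evidence of broken invariants) are all assumptions.

```python
def rank_by_diffusion(edges, seed, restart=0.4, iters=200):
    """Rank nodes of an undirected graph by random walk with restart.

    edges: iterable of (node, node) pairs, e.g. invariant links (assumed input).
    seed:  dict node -> non-negative anomaly evidence (hypothetical values).
    """
    nodes = sorted({n for edge in edges for n in edge})
    neighbors = {n: set() for n in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    total = sum(seed.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("seed must give positive evidence to at least one node")
    # Normalized restart distribution: where the walk jumps back to.
    start = {n: seed.get(n, 0.0) / total for n in nodes}

    score = dict(start)
    for _ in range(iters):
        updated = {}
        for n in nodes:
            # Each neighbor m passes an equal share of its score to n.
            incoming = sum(score[m] / len(neighbors[m]) for m in neighbors[n])
            updated[n] = (1 - restart) * incoming + restart * start[n]
        score = updated

    ranking = sorted(nodes, key=lambda n: score[n], reverse=True)
    return ranking, score


# Toy chain a - b - c - d with all anomaly evidence placed on node "a".
edges = [("a", "b"), ("b", "c"), ("c", "d")]
seed = {"a": 1.0}
ranking, scores = rank_by_diffusion(edges, seed)
print(ranking)
```

In this toy setup the scores decay with distance from the seeded node, which is the intuition behind backtracking from observed impairment to likely root causes. A single walk of this kind corresponds to the "single and global" propagation assumption that the abstract criticizes.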

URL: https://ieeexplore.ieee.org/document/8215592
DOI: 10.1109/ICDM.2017.129
Citation Key: ni_ranking_2017