Practical and White-Box Anomaly Detection through Unsupervised and Active Learning
Title | Practical and White-Box Anomaly Detection through Unsupervised and Active Learning |
Publication Type | Conference Paper |
Year of Publication | 2020 |
Authors | Wang, Y., Wang, Z., Xie, Z., Zhao, N., Chen, J., Zhang, W., Sui, K., Pei, D. |
Conference Name | 2020 29th International Conference on Computer Communications and Networks (ICCCN) |
Keywords | active learning, anomaly detection, composability, Forestry, iRRCF, key performance indicators, KPI anomaly detection framework, Labeling, Metrics, Monitoring, Neural networks, pubcrawl, random forests, resilience, Resiliency, robust random cut forest, RRCF, RRCF algorithm, security, security of data, supervised learning, time series, Time series analysis, Unsupervised Anomaly Detection, unsupervised learning, user experience, white box, White Box Security, white-box anomaly detection |
Abstract | To ensure quality of service and user experience, large Internet companies often monitor various Key Performance Indicators (KPIs) of their systems so that they can detect anomalies and identify failure in real time. However, due to a large number of various KPIs and the lack of high-quality labels, existing KPI anomaly detection approaches either perform well only on certain types of KPIs or consume excessive resources. Therefore, to realize generic and practical KPI anomaly detection in the real world, we propose a KPI anomaly detection framework named iRRCF-Active, which contains an unsupervised and white-box anomaly detector based on Robust Random Cut Forest (RRCF), and an active learning component. Specifically, we novelly propose an improved RRCF (iRRCF) algorithm to overcome the drawbacks of applying original RRCF in KPI anomaly detection. Besides, we also incorporate the idea of active learning to make our model benefit from high-quality labels given by experienced operators. We conduct extensive experiments on a large-scale public dataset and a private dataset collected from a large commercial bank. The experimental results demonstrate that iRRCF-Active performs better than existing traditional statistical methods, unsupervised learning methods and supervised learning methods. Besides, each component in iRRCF-Active has also been demonstrated to be effective and indispensable. |
DOI | 10.1109/ICCCN49398.2020.9209704 |
Citation Key | wang_practical_2020 |