AESOP: Automatic Policy Learning for Predicting and Mitigating Network Service Impairments

Title	AESOP: Automatic Policy Learning for Predicting and Mitigating Network Service Impairments
Publication Type	Conference Paper
Year of Publication	2017
Authors	Deb, Supratim; Ge, Zihui; Isukapalli, Sastry; Puthenpura, Sarat; Venkataraman, Shobha; Yan, He; Yates, Jennifer
Conference Name	Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Date Published	August 2017
Publisher	ACM
Conference Location	New York, NY, USA
ISBN Number	978-1-4503-4887-4
Keywords	Automated Response Actions, composability, network management, policy learning, pubcrawl, Resiliency, supervised learning
Abstract	Efficient management and control of modern and next-gen networks is of paramount importance, as networks have to maintain highly reliable service quality whilst supporting rapid growth in traffic demand and new application services. Rapid mitigation of network service degradations is a key factor in delivering high service quality. Automation is vital to achieving rapid mitigation of issues, particularly at the network edge, where scale and diversity are greatest. This automation involves the rapid detection, localization and (where possible) repair of service-impacting faults and performance impairments. However, the most significant challenge here is knowing what events to detect, how to correlate events to localize an issue, and what mitigation actions should be performed in response to the identified issues. These are defined as policies to systems such as ECOMP. In this paper, we present AESOP, a data-driven intelligent system to facilitate automatic learning of policies and rules for triggering remedial actions in networks. AESOP combines best operational practices (domain knowledge) with a variety of measurement data to learn and validate operational policies to mitigate service issues in networks. AESOP's design addresses the following key challenges: (i) learning from high-dimensional noisy data, (ii) capturing multiple fault models, (iii) modeling the high service-cost of false positives, and (iv) accounting for the evolving network infrastructure. We present the design of our system and report results from our ongoing experiments that show the effectiveness of our policy learning framework.
URL	http://doi.acm.org/10.1145/3097983.3098157
DOI	10.1145/3097983.3098157
Citation Key	deb_aesop:_2017
