Title | API Security in Large Enterprises: Leveraging Machine Learning for Anomaly Detection |
Publication Type | Conference Paper |
Year of Publication | 2021 |
Authors | Baye, Gaspard, Hussain, Fatima, Oracevic, Alma, Hussain, Rasheed, Ahsan Kazmi, S.M. |
Conference Name | 2021 International Symposium on Networks, Computers and Communications (ISNCC) |
Keywords | Anomalies, API, APIs, Application program interface, application program interfaces, Application Programming Interface (API), composability, compositionality, feature extraction, gaussian distribution, machine learning, Micro-service, pubcrawl, reliability, resilience, Resiliency, security, support vector machine (SVM), support vector machine classification, Training |
Abstract | Large enterprises offer thousands of micro-service applications to support their daily business activities by using Application Programming Interfaces (APIs). These applications generate huge amounts of traffic via millions of API calls every day, which makes it difficult to analyze the traffic for potential abnormal behaviour and application outages. This makes Machine Learning (ML) a natural choice for analyzing the API traffic and obtaining intelligent predictions. This paper proposes an ML-based technique to detect and classify API traffic based on specific features such as bandwidth and number of requests per token. We employ a Support Vector Machine (SVM) with a linear kernel as a binary classifier to identify abnormal API traffic. Due to the scarcity of API datasets, we created a synthetic dataset inspired by a real-world API dataset. We then used the Gaussian distribution outlier detection technique to create a labeled training dataset simulating real-world API log data, which we used to train the SVM classifier. Furthermore, to find a trade-off between accuracy and false positives, we search for the optimal value of the error term (C) of the classifier. The proposed anomaly detection method can be used in a plug-and-play manner and fits into an existing micro-service architecture with minor adjustments, providing accurate results in a fast and reliable way. Our results demonstrate that the proposed method achieves an F1-score of 0.964 in detecting anomalies in API traffic, with a false positive rate of 7.3%. |
DOI | 10.1109/ISNCC52172.2021.9615638 |
Citation Key | baye_api_2021 |
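
The pipeline described in the abstract (Gaussian outlier labeling followed by a linear SVM with an error term C) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature values, the 3-sigma threshold `k`, the learning rate, the epoch count, and the simple subgradient-descent trainer are all assumptions chosen to keep the example dependency-free, and the helper names are hypothetical.

```python
import random
import statistics


def gaussian_labels(rows, k=3.0):
    """Label a row 1 (anomalous) if any feature lies more than k
    standard deviations from that feature's mean, else 0.
    The threshold k=3.0 is an assumption, not a value from the paper."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    # Guard against constant features so later division is safe.
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    labels = [
        1 if any(abs(x - m) > k * s for x, m, s in zip(row, means, stds)) else 0
        for row in rows
    ]
    return labels, means, stds


def standardize(rows, means, stds):
    """Scale every feature to zero mean and unit variance."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def train_linear_svm(X, y, C=1.0, epochs=200, lr=0.01, seed=0):
    """Minimise 0.5*||w||^2 + C*sum(hinge loss) with stochastic
    subgradient descent. A larger C penalises training errors more."""
    rng = random.Random(seed)
    n = len(X)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(n))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t = 1 if y[i] == 1 else -1  # SVM targets are +1 / -1
            margin = t * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            if margin < 1:
                # Sample violates the margin: hinge term contributes.
                w = [wj - lr * (wj / n - C * t * xj) for wj, xj in zip(w, X[i])]
                b += lr * C * t
            else:
                # Only the regulariser contributes.
                w = [wj - lr * wj / n for wj in w]
    return w, b


def predict(row, w, b, means, stds):
    """Return 1 for anomalous traffic, 0 for normal traffic."""
    z = standardize([row], means, stds)[0]
    score = sum(wj * zj for wj, zj in zip(w, z)) + b
    return 1 if score > 0 else 0


# Hypothetical per-token features: (requests per token, bandwidth).
traffic = [(100 + i % 7, 50 + i % 5) for i in range(48)]
traffic += [(400, 300), (420, 320)]  # two injected bursts

labels, means, stds = gaussian_labels(traffic)
X = standardize(traffic, means, stds)
w, b = train_linear_svm(X, labels, C=1.0)

print("anomalies labeled:", sum(labels))
print("burst token flagged:", predict((410, 310), w, b, means, stds))
print("typical token flagged:", predict((103, 52), w, b, means, stds))
```

In practice a library classifier such as scikit-learn's `SVC(kernel="linear", C=...)` would likely replace the hand-written trainer, and C would be chosen by comparing F1-score and false positive rate on held-out data across several candidate values.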