Biblio
Continuous and adaptive learning is an effective learning approach when dealing with highly dynamic and changing scenarios, where concept drift often happens. In a continuous, stream or adaptive learning setup, new measurements arrive continuously and there are no boundaries for learning, meaning that the learning model has to decide how and when to (re)learn from these new data constantly. We address the problem of adaptive and continual learning for network security, building dynamic models to detect network attacks in real network traffic. The combination of fast and big network measurements data with the re-training paradigm of adaptive learning imposes complex challenges in terms of data processing speed, which we tackle by relying on big data platforms for parallel stream processing. We build and benchmark different adaptive learning models on top of a novel big data analytics platform for network traffic monitoring and analysis tasks, and show that high speed-up computations (as high as × 6) can be achieved by parallelizing off-the-shelf stream learning approaches.
Accurate network traffic identification is an important basis for network traffic monitoring and data analysis, and is the key to improve the quality of user service. In this paper, through the analysis of two network traffic identification methods based on machine learning and deep packet inspection, a network traffic identification method based on machine learning and deep packet inspection is proposed. This method uses deep packet inspection technology to identify most network traffic, reduces the workload that needs to be identified by machine learning method, and deep packet inspection can identify specific application traffic, and improves the accuracy of identification. Machine learning method is used to assist in identifying network traffic with encryption and unknown features, which makes up for the disadvantage of deep packet inspection that can not identify new applications and encrypted traffic. Experiments show that this method can improve the identification rate of network traffic.