Information security is the process of protecting data from security breaches and attackers. An intrusion detection system (IDS) is a software framework that continuously monitors and analyzes network data to identify attacks, conventionally by means of traditional techniques. These traditional techniques are efficient on small data sets, but when they are applied to big data, analyzing the data properties takes a long time and becomes inefficient; designing a modern IDS therefore calls for big data technologies such as Apache Spark, Hadoop, and Flink. In this paper, we present the design of an IDS based on Apache Spark and classification algorithms, and we employ Chi-square as the feature selection method for selecting features from network security event data. We evaluate the performance of Logistic Regression, Decision Tree, and SVM with SGD in the design of the Apache Spark-based IDS, using AUROC and AUPR as metrics. We also tabulate the training and testing time of each algorithm. All experiments use the NSL-KDD dataset.
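To make the two building blocks named above more concrete, the following is a minimal sketch of Chi-square feature scoring and of the AUROC metric. It is written in plain Python rather than Apache Spark, so it illustrates the ideas only and is not the paper's implementation. The function names (`chi_square`, `select_top_k`, `auroc`), the toy columns, and the toy scores are all hypothetical and are not taken from NSL-KDD or from the reported experiments.

```python
from collections import Counter


def chi_square(feature, labels):
    """Pearson chi-square statistic of one categorical feature against class labels."""
    n = len(labels)
    observed = Counter(zip(feature, labels))  # joint counts (value, label)
    feature_totals = Counter(feature)         # marginal counts per feature value
    label_totals = Counter(labels)            # marginal counts per class
    stat = 0.0
    for value, value_count in feature_totals.items():
        for label, label_count in label_totals.items():
            expected = value_count * label_count / n
            stat += (observed[(value, label)] - expected) ** 2 / expected
    return stat


def select_top_k(columns, labels, k):
    """Return the names of the k features with the highest chi-square score."""
    scores = {name: chi_square(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


def auroc(scores, labels):
    """AUROC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in positives for q in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = attack, 0 = normal traffic.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_columns = {
    "feature_a": ["x", "x", "x", "y", "y", "y"],  # perfectly separates the classes
    "feature_b": ["p", "q", "p", "q", "p", "q"],  # only weakly related to the class
}
# Hypothetical classifier scores for the same six records.
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]

selected = select_top_k(toy_columns, toy_labels, k=1)
area = auroc(toy_scores, toy_labels)
```

In this toy setup `feature_a` receives the larger Chi-square score and is the one retained, and the AUROC is 8/9 because eight of the nine positive/negative score pairs are ordered correctly. AUPR, the second metric mentioned, is computed from the precision-recall curve instead and is not shown here.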