Biblio
Machine learning techniques help to understand underlying patterns in datasets to develop defense mechanisms against cyber attacks. Multilayer Perceptron (MLP) technique is a machine learning technique used in detecting attack vs. benign data. However, it is difficult to construct any effective model when there are imbalances in the dataset that prevent proper classification of attack samples in data. In this research, we use UGR'16 dataset to conduct data wrangling initially. This technique helps to prepare a test set from the original dataset to train the neural network model effectively. We experimented with a series of inputs of varying sizes (i.e. 10000, 50000, 1 million) to observe the performance of the MLP neural network model with distribution of features over accuracy. Later, we use Generative Adversarial Network (GAN) model that produces samples of different attack labels (e.g. blacklist, anomaly spam, ssh scan) for balancing the dataset. These samples are generated based on data from the UGR'16 dataset. Further experiments with MLP neural network model shows that a balanced attack sample dataset, made possible with GAN, produces more accurate results than an imbalanced one.
Genetic Algorithms are group of mathematical models in computational science by exciting evolution in AI techniques nowadays. These algorithms preserve critical information by applying data structure with simple chromosome recombination operators by encoding solution to a specific problem. Genetic algorithms they are optimizer, in which range of problems applied to it are quite broad. Genetic Algorithms with its global search includes basic principles like selection, crossover and mutation. Data structures, algorithms and human brain inspiration are found for classification of data and for learning which works using Neural Networks. Artificial Intelligence (AI) it is a field, where so many tasks performed naturally by a human. When AI conventional methods are used in a computer it was proved as a complicated task. Applying Neural Networks techniques will create an internal structure of rules by which a program can learn by examples, to classify different inputs than mining techniques. This paper proposes a phishing websites classifier using improved polynomial neural networks in genetic algorithm.