Visible to the public Biblio

Filters: Keyword is classifiers  [Clear All Filters]
2023-03-31
Premalatha, N., Sujatha, S..  2022.  An Optimization driven – Deep Belief Neural Network Model for Prediction of Employment Status after Graduation. 2022 First International Conference on Electrical, Electronics, Information and Communication Technologies (ICEEICT). :1–5.
Higher education management has problems producing 100% of graduates capable of responding to the needs of industry while industry also is struggling to find qualified graduates that responded to their needs in part because of the inefficient way of evaluating problems, as well as because of weaknesses in the evaluation of problem-solving capabilities. The objective of this paper is to propose an appropriate classification model to be used for predicting and evaluating the attributes of the data set of the student in order to meet the selection criteria required by the industries in the academic field. The dataset required for this analysis was obtained from a private firm and the execution was carried out using Chimp Optimization Algorithm (COA) based Deep Belief Neural Network (COA-DBNN) and the obtained results are compared with various classifiers such as Logistic Regression (LR), Decision Tree (DT) and Random Forest (RF). The proposed model outperforms other classifiers in terms of various performance metrics. This critical analysis will help the college management to make a better long-term plan for producing graduates who are skilled, knowledgeable and fulfill the industry needs as well.
2021-11-08
Bosaeed, Sahar, Katib, Iyad, Mehmood, Rashid.  2020.  A Fog-Augmented Machine Learning based SMS Spam Detection and Classification System. 2020 Fifth International Conference on Fog and Mobile Edge Computing (FMEC). :325–330.
Smart cities and societies are driving unprecedented technological and socioeconomic growth in everyday life albeit making us increasingly vulnerable to infinitely and incomprehensibly diverse threats. Short Message Service (SMS) spam is one such threat that can affect mobile security by propagating malware on mobile devices. A security breach could also cause a mobile device to send spam messages. Many works have focused on classifying incoming SMS messages. This paper proposes a tool to detect spam from outgoing SMS messages, although the work can be applied to both incoming and outgoing SMS messages. Specifically, we develop a system that comprises multiple machine learning (ML) based classifiers built by us using three classification methods – Naïve Bayes (NB), Support Vector Machine (SVM), and Naïve Bayes Multinomial (NBM)- and five preprocessing and feature extraction methods. The system is built to allow its execution in cloud, fog or edge layers, and is evaluated using 15 datasets built by 4 widely-used public SMS datasets. The system detects spam SMSs and gives recommendations on the spam filters and classifiers to be used based on user preferences including classification accuracy, True Negatives (TN), and computational resource requirements.
2021-02-10
Banerjee, R., Baksi, A., Singh, N., Bishnu, S. K..  2020.  Detection of XSS in web applications using Machine Learning Classifiers. 2020 4th International Conference on Electronics, Materials Engineering Nano-Technology (IEMENTech). :1—5.
Considering the amount of time we spend on the internet, web pages have evolved over a period of time with rapid progression and momentum. With such advancement, we find ourselves fronting a few hostile ideologies, breaching the security levels of webpages as such. The most hazardous of them all is XSS, known as Cross-Site Scripting, is one of the attacks which frequently occur in website-based applications. Cross-Site Scripting (XSS) attacks happen when malicious data enters a web application through an untrusted source. The spam attacks happen in the form of Wall posts, News feed, Message spam and mostly when a user is open to download content of webpages. This paper investigates the use of machine learning to build classifiers to allow the detection of XSS. Establishing our approach, we target the detection modus operandi of XSS attack via two features: URLs and JavaScript. To predict the level of XSS threat, we will be using four machine learning algorithms (SVM, KNN, Random forest and Logistic Regression). Proposing these classified algorithms, webpages will be branded as malicious or benign. After assessing and calculating the dataset features, we concluded that the Random Forest Classifier performed most accurately with the lowest False Positive Rate of 0.34. This precision will ensure a method much efficient to evaluate threatening XSS for the smooth functioning of the system.
2020-11-09
Kemp, C., Calvert, C., Khoshgoftaar, T..  2018.  Utilizing Netflow Data to Detect Slow Read Attacks. 2018 IEEE International Conference on Information Reuse and Integration (IRI). :108–116.
Attackers can leverage several techniques to compromise computer networks, ranging from sophisticated malware to DDoS (Distributed Denial of Service) attacks that target the application layer. Application layer DDoS attacks, such as Slow Read, are implemented with just enough traffic to tie up CPU or memory resources causing web and application servers to go offline. Such attacks can mimic legitimate network requests making them difficult to detect. They also utilize less volume than traditional DDoS attacks. These low volume attack methods can often go undetected by network security solutions until it is too late. In this paper, we explore the use of machine learners for detecting Slow Read DDoS attacks on web servers at the application layer. Our approach uses a generated dataset based upon Netflow data collected at the application layer on a live network environment. Our Netflow data uses the IP Flow Information Export (IPFIX) standard providing significant flexibility and features. These Netflow features can process and handle a growing amount of traffic and have worked well in our previous DDoS work detecting evasion techniques. Our generated dataset consists of real-world network data collected from a production network. We use eight different classifiers to build Slow Read attack detection models. Our wide selection of learners provides us with a more comprehensive analysis of Slow Read detection models. Experimental results show that the machine learners were quite successful in identifying the Slow Read attacks with a high detection and low false alarm rate. The experiment demonstrates that our chosen Netflow features are discriminative enough to detect such attacks accurately.
2020-02-10
Melo, Princess Marie B., Sison, Ariel M., Medina, Ruji P..  2019.  Enhanced TCP Sequence Number Steganography Using Dynamic Identifier. 2019 IEEE Eurasia Conference on IOT, Communication and Engineering (ECICE). :482–485.

Network steganography is a branch of steganography that hides information through packet header manipulation and uses protocols as carriers to hide secret information. Many techniques were already developed using the Transmission Control Protocol (TCP) headers. Among the schemes in hiding information in the TCP header, the Initial Sequence Number (ISN) field is the most difficult to be detected since this field can have arbitrary values within the requirements of the standard. In this paper, a more undetectable scheme is proposed by increasing the complexity of hiding data in the TCP ISN using dynamic identifiers. The experimental results have shown that using Bayes Net, the proposed scheme outperforms the existing scheme with a low detection accuracy of 0.52%.

Hasan, Musaab, Balbahaith, Zayed, Tarique, Mohammed.  2019.  Detection of SQL Injection Attacks: A Machine Learning Approach. 2019 International Conference on Electrical and Computing Technologies and Applications (ICECTA). :1–6.
With the rapid growth in online services, hacking (alternatively attacking) on online database applications has become a grave concern now. Attacks on online database application are being frequently reported. Among these attacks, the SQL injection attack is at the top of the list. The hackers alter the SQL query sent by the user and inject malicious code therein. Hence, they access the database and manipulate the data. It is reported in the literature that the traditional SQL injection detection algorithms fail to prevent this type of attack. In this paper, we propose a machine learning based heuristic algorithm to prevent the SQL injection attack. We use a dataset of 616 SQL statements to train and test 23 different machine learning classifiers. Among these classifiers, we select the best five classifiers based on their detection accuracy and develop a Graphical User Interface (GUI) application based on these five classifiers. We test our proposed algorithm and the results show that our algorithm is able to detect the SQL injection attack with a high accuracy (93.8%).
2019-11-26
Zabihimayvan, Mahdieh, Doran, Derek.  2019.  Fuzzy Rough Set Feature Selection to Enhance Phishing Attack Detection. 2019 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). :1-6.

Phishing as one of the most well-known cybercrime activities is a deception of online users to steal their personal or confidential information by impersonating a legitimate website. Several machine learning-based strategies have been proposed to detect phishing websites. These techniques are dependent on the features extracted from the website samples. However, few studies have actually considered efficient feature selection for detecting phishing attacks. In this work, we investigate an agreement on the definitive features which should be used in phishing detection. We apply Fuzzy Rough Set (FRS) theory as a tool to select most effective features from three benchmarked data sets. The selected features are fed into three often used classifiers for phishing detection. To evaluate the FRS feature selection in developing a generalizable phishing detection, the classifiers are trained by a separate out-of-sample data set of 14,000 website samples. The maximum F-measure gained by FRS feature selection is 95% using Random Forest classification. Also, there are 9 universal features selected by FRS over all the three data sets. The F-measure value using this universal feature set is approximately 93% which is a comparable result in contrast to the FRS performance. Since the universal feature set contains no features from third-part services, this finding implies that with no inquiry from external sources, we can gain a faster phishing detection which is also robust toward zero-day attacks.

2019-03-22
Duan, J., Zeng, Z., Oprea, A., Vasudevan, S..  2018.  Automated Generation and Selection of Interpretable Features for Enterprise Security. 2018 IEEE International Conference on Big Data (Big Data). :1258-1265.

We present an effective machine learning method for malicious activity detection in enterprise security logs. Our method involves feature engineering, or generating new features by applying operators on features of the raw data. We generate DNF formulas from raw features, extract Boolean functions from them, and leverage Fourier analysis to generate new parity features and rank them based on their highest Fourier coefficients. We demonstrate on real enterprise data sets that the engineered features enhance the performance of a wide range of classifiers and clustering algorithms. As compared to classification of raw data features, the engineered features achieve up to 50.6% improvement in malicious recall, while sacrificing no more than 0.47% in accuracy. We also observe better isolation of malicious clusters, when performing clustering on engineered features. In general, a small number of engineered features achieve higher performance than raw data features according to our metrics of interest. Our feature engineering method also retains interpretability, an important consideration in cyber security applications.

2018-09-12
Montieri, A., Ciuonzo, D., Aceto, G., Pescape, A..  2017.  Anonymity Services Tor, I2P, JonDonym: Classifying in the Dark. 2017 29th International Teletraffic Congress (ITC 29). 1:81–89.

Traffic classification, i.e. associating network traffic to the application that generated it, is an important tool for several tasks, spanning on different fields (security, management, traffic engineering, R&D). This process is challenged by applications that preserve Internet users' privacy by encrypting the communication content, and even more by anonymity tools, additionally hiding the source, the destination, and the nature of the communication. In this paper, leveraging a public dataset released in 2017, we provide (repeatable) classification results with the aim of investigating to what degree the specific anonymity tool (and the traffic it hides) can be identified, when compared to the traffic of the other considered anonymity tools, using machine learning approaches based on the sole statistical features. To this end, four classifiers are trained and tested on the dataset: (i) Naïve Bayes, (ii) Bayesian Network, (iii) C4.5, and (iv) Random Forest. Results show that the three considered anonymity networks (Tor, I2P, JonDonym) can be easily distinguished (with an accuracy of 99.99%), telling even the specific application generating the traffic (with an accuracy of 98.00%).

2017-03-08
Mishra, A., Kumar, K., Rai, S. N., Mittal, V. K..  2015.  Multi-stage face recognition for biometric access. 2015 Annual IEEE India Conference (INDICON). :1–6.

Protecting the privacy of user-identification data is fundamental to protect the information systems from attacks and vulnerabilities. Providing access to such data only to the limited and legitimate users is the key motivation for `Biometrics'. In `Biometric Systems' confirming a user's claim of his/her identity reliably, is more important than focusing on `what he/she really possesses' or `what he/she remembers'. In this paper the use of face image for biometric access is proposed using two multistage face recognition algorithms that employ biometric facial features to validate the user's claim. The proposed algorithms use standard algorithms and classifiers such as EigenFaces, PCA and LDA in stages. Performance evaluation of both proposed algorithms is carried out using two standard datasets, the Extended Yale database and AT&T database. Results using the proposed multi-stage algorithms are better than those using other standard algorithms. Current limitations and possible applications of the proposed algorithms are also discussed along, with further scope of making these robust to pose, illumination and noise variations.