Visible to the public Biblio

Filters: Keyword is classification accuracy  [Clear All Filters]
2020-10-26
Sethi, Kamalakanta, Kumar, Rahul, Sethi, Lingaraj, Bera, Padmalochan, Patra, Prashanta Kumar.  2019.  A Novel Machine Learning Based Malware Detection and Classification Framework. 2019 International Conference on Cyber Security and Protection of Digital Services (Cyber Security). :1–4.
As time progresses, new and complex malware types are being generated which causes a serious threat to computer systems. Due to this drastic increase in the number of malware samples, the signature-based malware detection techniques cannot provide accurate results. Different studies have demonstrated the proficiency of machine learning for the detection and classification of malware files. Further, the accuracy of these machine learning models can be improved by using feature selection algorithms to select the most essential features and reducing the size of the dataset which leads to lesser computations. In this paper, we have developed a machine learning based malware analysis framework for efficient and accurate malware detection and classification. We used Cuckoo sandbox for dynamic analysis which executes malware in an isolated environment and generates an analysis report based on the system activities during execution. Further, we propose a feature extraction and selection module which extracts features from the report and selects the most important features for ensuring high accuracy at minimum computation cost. Then, we employ different machine learning algorithms for accurate detection and fine-grained classification. Experimental results show that we got high detection and classification accuracy in comparison to the state-of-the-art approaches.
2020-08-13
Sadeghi, Koosha, Banerjee, Ayan, Gupta, Sandeep K. S..  2019.  An Analytical Framework for Security-Tuning of Artificial Intelligence Applications Under Attack. 2019 IEEE International Conference On Artificial Intelligence Testing (AITest). :111—118.
Machine Learning (ML) algorithms, as the core technology in Artificial Intelligence (AI) applications, such as self-driving vehicles, make important decisions by performing a variety of data classification or prediction tasks. Attacks on data or algorithms in AI applications can lead to misclassification or misprediction, which can fail the applications. For each dataset separately, the parameters of ML algorithms should be tuned to reach a desirable classification or prediction accuracy. Typically, ML experts tune the parameters empirically, which can be time consuming and does not guarantee the optimal result. To this end, some research suggests an analytical approach to tune the ML parameters for maximum accuracy. However, none of the works consider the ML performance under attack in their tuning process. This paper proposes an analytical framework for tuning the ML parameters to be secure against attacks, while keeping its accuracy high. The framework finds the optimal set of parameters by defining a novel objective function, which takes into account the test results of both ML accuracy and its security against attacks. For validating the framework, an AI application is implemented to recognize whether a subject's eyes are open or closed, by applying k-Nearest Neighbors (kNN) algorithm on her Electroencephalogram (EEG) signals. In this application, the number of neighbors (k) and the distance metric type, as the two main parameters of kNN, are chosen for tuning. The input data perturbation attack, as one of the most common attacks on ML algorithms, is used for testing the security of the application. Exhaustive search approach is used to solve the optimization problem. The experiment results show k = 43 and cosine distance metric is the optimal configuration of kNN for the EEG dataset, which leads to 83.75% classification accuracy and reduces the attack success rate to 5.21%.
2020-08-03
Dai, Haipeng, Liu, Alex X., Li, Zeshui, Wang, Wei, Zhang, Fengmin, Dong, Chao.  2019.  Recognizing Driver Talking Direction in Running Vehicles with a Smartphone. 2019 IEEE 16th International Conference on Mobile Ad Hoc and Sensor Systems (MASS). :10–18.
This paper addresses the fundamental problem of identifying driver talking directions using a single smartphone, which can help drivers by warning distraction of having conversations with passengers in a vehicle and enable safety enhancement. The basic idea of our system is to perform talking status and direction identification using two microphones on a smartphone. We first use the sound recorded by the two microphones to identify whether the driver is talking or not. If yes, we then extract the so-called channel fingerprint from the speech signal and classify it into one of three typical driver talking directions, namely, front, right and back, using a trained model obtained in advance. The key novelty of our scheme is the proposition of channel fingerprint which leverages the heavy multipath effects in the harsh in-vehicle environment and cancels the variability of human voice, both of which combine to invalidate traditional TDoA, DoA and fingerprint based sound source localization approaches. We conducted extensive experiments using two kinds of phones and two vehicles for four phone placements in three representative scenarios, and collected 23 hours voice data from 20 participants. The results show that our system can achieve 95.0% classification accuracy on average.
2020-06-22
Lv, Chaoxian, Li, Qianmu, Long, Huaqiu, Ren, Yumei, Ling, Fei.  2019.  A Differential Privacy Random Forest Method of Privacy Protection in Cloud. 2019 IEEE International Conference on Computational Science and Engineering (CSE) and IEEE International Conference on Embedded and Ubiquitous Computing (EUC). :470–475.
This paper proposes a new random forest classification algorithm based on differential privacy protection. In order to reduce the impact of differential privacy protection on the accuracy of random forest classification, a hybrid decision tree algorithm is proposed in this paper. The hybrid decision tree algorithm is applied to the construction of random forest, which balances the privacy and classification accuracy of the random forest algorithm based on differential privacy. Experiment results show that the random forest algorithm based on differential privacy can provide high privacy protection while ensuring high classification performance, achieving a balance between privacy and classification accuracy, and has practical application value.
2020-06-12
Li, Wenyue, Yin, Jihao, Han, Bingnan, Zhu, Hongmei.  2019.  Generative Adversarial Network with Folded Spectrum for Hyperspectral Image Classification. IGARSS 2019 - 2019 IEEE International Geoscience and Remote Sensing Symposium. :883—886.

Hyperspectral image (HSIs) with abundant spectral information but limited labeled dataset endows the rationality and necessity of semi-supervised spectral-based classification methods. Where, the utilizing approach of spectral information is significant to classification accuracy. In this paper, we propose a novel semi-supervised method based on generative adversarial network (GAN) with folded spectrum (FS-GAN). Specifically, the original spectral vector is folded to 2D square spectrum as input of GAN, which can generate spectral texture and provide larger receptive field over both adjacent and non-adjacent spectral bands for deep feature extraction. The generated fake folded spectrum, the labeled and unlabeled real folded spectrum are then fed to the discriminator for semi-supervised learning. A feature matching strategy is applied to prevent model collapse. Extensive experimental comparisons demonstrate the effectiveness of the proposed method.

2020-05-11
Cui, Zhicheng, Zhang, Muhan, Chen, Yixin.  2018.  Deep Embedding Logistic Regression. 2018 IEEE International Conference on Big Knowledge (ICBK). :176–183.
Logistic regression (LR) is used in many areas due to its simplicity and interpretability. While at the same time, those two properties limit its classification accuracy. Deep neural networks (DNNs), instead, achieve state-of-the-art performance in many domains. However, the nonlinearity and complexity of DNNs make it less interpretable. To balance interpretability and classification performance, we propose a novel nonlinear model, Deep Embedding Logistic Regression (DELR), which augments LR with a nonlinear dimension-wise feature embedding. In DELR, each feature embedding is learned through a deep and narrow neural network and LR is attached to decide feature importance. A compact and yet powerful model, DELR offers great interpretability: it can tell the importance of each input feature, yield meaningful embedding of categorical features, and extract actionable changes, making it attractive for tasks such as market analysis and clinical prediction.
2019-06-10
Roseline, S. A., Geetha, S..  2018.  Intelligent Malware Detection Using Oblique Random Forest Paradigm. 2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI). :330-336.

With the increase in the popularity of computerized online applications, the analysis, and detection of a growing number of newly discovered stealthy malware poses a significant challenge to the security community. Signature-based and behavior-based detection techniques are becoming inefficient in detecting new unknown malware. Machine learning solutions are employed to counter such intelligent malware and allow performing more comprehensive malware detection. This capability leads to an automatic analysis of malware behavior. The proposed oblique random forest ensemble learning technique is efficient for malware classification. The effectiveness of the proposed method is demonstrated with three malware classification datasets from various sources. The results are compared with other variants of decision tree learning models. The proposed system performs better than the existing system in terms of classification accuracy and false positive rate.