Visible to the public Biblio

Filters: Keyword is support vector machine  [Clear All Filters]
2020-01-20
Yihunie, Fekadu, Abdelfattah, Eman, Regmi, Amish.  2019.  Applying Machine Learning to Anomaly-Based Intrusion Detection Systems. 2019 IEEE Long Island Systems, Applications and Technology Conference (LISAT). :1–5.

The enormous growth of Internet-based traffic exposes corporate networks with a wide variety of vulnerabilities. Intrusive traffics are affecting the normal functionality of network's operation by consuming corporate resources and time. Efficient ways of identifying, protecting, and mitigating from intrusive incidents enhance productivity. As Intrusion Detection System (IDS) is hosted in the network and at the user machine level to oversee the malicious traffic in the network and at the individual computer, it is one of the critical components of a network and host security. Unsupervised anomaly traffic detection techniques are improving over time. This research aims to find an efficient classifier that detects anomaly traffic from NSL-KDD dataset with high accuracy level and minimal error rate by experimenting with five machine learning techniques. Five binary classifiers: Stochastic Gradient Decent, Random Forests, Logistic Regression, Support Vector Machine, and Sequential Model are tested and validated to produce the result. The outcome demonstrates that Random Forest Classifier outperforms the other four classifiers with and without applying the normalization process to the dataset.

Halimaa A., Anish, Sundarakantham, K..  2019.  Machine Learning Based Intrusion Detection System. 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI). :916–920.

In order to examine malicious activity that occurs in a network or a system, intrusion detection system is used. Intrusion Detection is software or a device that scans a system or a network for a distrustful activity. Due to the growing connectivity between computers, intrusion detection becomes vital to perform network security. Various machine learning techniques and statistical methodologies have been used to build different types of Intrusion Detection Systems to protect the networks. Performance of an Intrusion Detection is mainly depends on accuracy. Accuracy for Intrusion detection must be enhanced to reduce false alarms and to increase the detection rate. In order to improve the performance, different techniques have been used in recent works. Analyzing huge network traffic data is the main work of intrusion detection system. A well-organized classification methodology is required to overcome this issue. This issue is taken in proposed approach. Machine learning techniques like Support Vector Machine (SVM) and Naïve Bayes are applied. These techniques are well-known to solve the classification problems. For evaluation of intrusion detection system, NSL- KDD knowledge discovery Dataset is taken. The outcomes show that SVM works better than Naïve Bayes. To perform comparative analysis, effective classification methods like Support Vector Machine and Naive Bayes are taken, their accuracy and misclassification rate get calculated.

2019-12-30
Heydari, Mohammad, Mylonas, Alexios, Katos, Vasilios, Balaguer-Ballester, Emili, Tafreshi, Vahid Heydari Fami, Benkhelifa, Elhadj.  2019.  Uncertainty-Aware Authentication Model for Fog Computing in IoT. 2019 Fourth International Conference on Fog and Mobile Edge Computing (FMEC). :52–59.

Since the term “Fog Computing” has been coined by Cisco Systems in 2012, security and privacy issues of this promising paradigm are still open challenges. Among various security challenges, Access Control is a crucial concern for all cloud computing-like systems (e.g. Fog computing, Mobile edge computing) in the IoT era. Therefore, assigning the precise level of access in such an inherently scalable, heterogeneous and dynamic environment is not easy to perform. This work defines the uncertainty challenge for authentication phase of the access control in fog computing because on one hand fog has a number of characteristics that amplify uncertainty in authentication and on the other hand applying traditional access control models does not result in a flexible and resilient solution. Therefore, we have proposed a novel prediction model based on the extension of Attribute Based Access Control (ABAC) model. Our data-driven model is able to handle uncertainty in authentication. It is also able to consider the mobility of mobile edge devices in order to handle authentication. In doing so, we have built our model using and comparing four supervised classification algorithms namely as Decision Tree, Naïve Bayes, Logistic Regression and Support Vector Machine. Our model can achieve authentication performance with 88.14% accuracy using Logistic Regression.

2019-11-04
Ramachandran, Raji, Nidhin, R, Shogil, P P.  2018.  Anomaly Detection in Role Administered Relational Databases — A Novel Method. 2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI). :1017–1021.
A significant amount of attempt has been lately committed for the progress of Database Management Systems (DBMS) that ensures high assertion and high security. Common security measures for database like access control measures, validation, encryption technologies, etc are not sufficient enough to secure the data from all the threats. By using an anomaly detection system, we are able to enhance the security feature of the Database management system. We are taking an assumption that the database access control is role based. In this paper, a mechanism is proposed for finding the anomaly in database by using machine learning technique such as classification. The importance of providing anomaly detection technique to a Role-Based Access Control database is that it will help for the protection against the insider attacks. The experimentation results shows that the system is able to detect intrusion effectively with high accuracy and high F1-score.
2019-07-01
Perez, R. Lopez, Adamsky, F., Soua, R., Engel, T..  2018.  Machine Learning for Reliable Network Attack Detection in SCADA Systems. 2018 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/ 12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :633–638.

Critical Infrastructures (CIs) use Supervisory Control And Data Acquisition (SCADA) systems for remote control and monitoring. Sophisticated security measures are needed to address malicious intrusions, which are steadily increasing in number and variety due to the massive spread of connectivity and standardisation of open SCADA protocols. Traditional Intrusion Detection Systems (IDSs) cannot detect attacks that are not already present in their databases. Therefore, in this paper, we assess Machine Learning (ML) for intrusion detection in SCADA systems using a real data set collected from a gas pipeline system and provided by the Mississippi State University (MSU). The contribution of this paper is two-fold: 1) The evaluation of four techniques for missing data estimation and two techniques for data normalization, 2) The performances of Support Vector Machine (SVM), and Random Forest (RF) are assessed in terms of accuracy, precision, recall and F1score for intrusion detection. Two cases are differentiated: binary and categorical classifications. Our experiments reveal that RF detect intrusions effectively, with an F1score of respectively \textbackslashtextgreater 99%.

2019-06-10
Kornish, D., Geary, J., Sansing, V., Ezekiel, S., Pearlstein, L., Njilla, L..  2018.  Malware Classification Using Deep Convolutional Neural Networks. 2018 IEEE Applied Imagery Pattern Recognition Workshop (AIPR). :1-6.

In recent years, deep convolution neural networks (DCNNs) have won many contests in machine learning, object detection, and pattern recognition. Furthermore, deep learning techniques achieved exceptional performance in image classification, reaching accuracy levels beyond human capability. Malware variants from similar categories often contain similarities due to code reuse. Converting malware samples into images can cause these patterns to manifest as image features, which can be exploited for DCNN classification. Techniques for converting malware binaries into images for visualization and classification have been reported in the literature, and while these methods do reach a high level of classification accuracy on training datasets, they tend to be vulnerable to overfitting and perform poorly on previously unseen samples. In this paper, we explore and document a variety of techniques for representing malware binaries as images with the goal of discovering a format best suited for deep learning. We implement a database for malware binaries from several families, stored in hexadecimal format. These malware samples are converted into images using various approaches and are used to train a neural network to recognize visual patterns in the input and classify malware based on the feature vectors. Each image type is assessed using a variety of learning models, such as transfer learning with existing DCNN architectures and feature extraction for support vector machine classifier training. Each technique is evaluated in terms of classification accuracy, result consistency, and time per trial. Our preliminary results indicate that improved image representation has the potential to enable more effective classification of new malware.

2019-03-25
Shehu, Yahaya Isah, James, Anne, Palade, Vasile.  2018.  Detecting an Alteration in Biometric Fingerprint Databases. Proceedings of the 2Nd International Conference on Digital Signal Processing. :6–11.
Assuring the integrity of biometric fingerprint templates in fingerprint databases is of paramount importance. Fingerprint templates contain a set of fingerprint minutiae which are various points of interest in a fingerprint. Most times, it is assumed that the stored biometric fingerprint templates are well protected and, as such, researchers are more concerned with improving/developing biometric systems that will not suffer from an unacceptable rate of false alarms and/or missed detections. The introduction of forensic techniques into biometrics for biometric template manipulation detection is of great importance and little research has been carried in this area. This paper investigates possible forensic techniques that could be used for stored biometric fingerprint templates tampering detection. A Support Vector Machine (SVM) classification approach is used for this task. The original and tampered templates are used to train the SVM classifier. The fingerprint datasets from the Biometrics Ideal Test (BIT) [13] are used for training and testing the classifier. Our proposed approach detects alterations with an accuracy of 90.5%.
2019-03-22
Ali, M. A. M., Tahir, N. M..  2018.  Cancelable Biometrics Technique for Iris Recognition. 2018 IEEE Symposium on Computer Applications Industrial Electronics (ISCAIE). :434-437.

Iris recognition is one of the most reliable biometrics for identification purpose in terms of reliability and accuracy. Hence, in this research the integration of cancelable biometrics features for iris recognition using encryption and decryption non-invertible transformation is proposed. Here, the biometric data is protected via the proposed cancelable biometrics method. The experimental results showed that the recognition rate achieved is 99.9% using Bath-A dataset with a maximum decision criterion of 0.97 along with acceptable processing time.

2019-03-15
Kim, D., Shin, D., Shin, D..  2018.  Unauthorized Access Point Detection Using Machine Learning Algorithms for Information Protection. 2018 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/ 12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :1876-1878.

With the frequent use of Wi-Fi and hotspots that provide a wireless Internet environment, awareness and threats to wireless AP (Access Point) security are steadily increasing. Especially when using unauthorized APs in company, government and military facilities, there is a high possibility of being subjected to various viruses and hacking attacks. It is necessary to detect unauthorized Aps for protection of information. In this paper, we use RTT (Round Trip Time) value data set to detect authorized and unauthorized APs in wired / wireless integrated environment, analyze them using machine learning algorithms including SVM (Support Vector Machine), C4.5, KNN (K Nearest Neighbors) and MLP (Multilayer Perceptron). Overall, KNN shows the highest accuracy.

2019-03-06
Wang, Jiawen, Wang, Wai Ming, Tian, Zonggui, Li, Zhi.  2018.  Classification of Multiple Affective Attributes of Customer Reviews: Using Classical Machine Learning and Deep Learning. Proceedings of the 2Nd International Conference on Computer Science and Application Engineering. :94:1-94:5.

Affective1 engineering is a methodology of designing products by collecting customer affective needs and translating them into product designs. It usually begins with questionnaire surveys to collect customer affective demands and responses. However, this process is expensive, which can only be conducted periodically in a small scale. With the rapid development of e-commerce, a larger number of customer product reviews are available on the Internet. Many studies have been done using opinion mining and sentiment analysis. However, the existing studies focus on the polarity classification from a single perspective (such as positive and negative). The classification of multiple affective attributes receives less attention. In this paper, 3-class classifications of four different affective attributes (i.e. Soft-Hard, Appealing-Unappealing, Handy-Bulky, and Reliable-Shoddy) are performed by using two classical machine learning algorithms (i.e. Softmax regression and Support Vector Machine) and two deep learning methods (i.e. Restricted Boltzmann machines and Deep Belief Network) on an Amazon dataset. The results show that the accuracy of deep learning methods is above 90%, while the accuracy of classical machine learning methods is about 64%. This indicates that deep learning methods are significantly better than classical machine learning methods.

Xing, Z., Liu, L., Li, S., Liu, Y..  2018.  Analysis of Radiation Effects for Monitoring Circuit Based on Deep Belief Network and Support Vector Method. 2018 Prognostics and System Health Management Conference (PHM-Chongqing). :511-516.

The monitoring circuit is widely applied in radiation environment and it is of significance to study the circuit reliability with the radiation effects. In this paper, an intelligent analysis method based on Deep Belief Network (DBN) and Support Vector Method is proposed according to the radiation experiments analysis of the monitoring circuit. The Total Ionizing Dose (TID) of the monitoring circuit is used to identify the circuit degradation trend. Firstly, the output waveforms of the monitoring circuit are obtained by radiating with the different TID. Subsequently, the Deep Belief Network Model is trained to extract the features of the circuit signal. Finally, the Support Vector Machine (SVM) and Support Vector Regression (SVR) are applied to classify and predict the remaining useful life (RUL) of the monitoring circuit. According to the experimental results, the performance of DBN-SVM exceeds DBN method for feature extraction and classification, and SVR is effective for predicting the degradation.

2019-02-18
Wu, KuanTing, Chou, ShingHua, Chen, ShyhWei, Tsai, ChingTsorng, Yuan, ShyanMing.  2018.  Application of Machine Learning to Identify Counterfeit Website. Proceedings of the 20th International Conference on Information Integration and Web-based Applications & Services. :321–324.
Recent years the prevalence of fraudulent websites has become more severe than before. Fraudulent ecommerce websites that sell counterfeit goods not only cost financial damage to consumers but also have a great impact on Internet industry. Nowadays, there is not an effective way to confront these websites. In this paper, we look forward to achieving three goals: find the characteristics of counterfeit websites, train models for classifying ecommerce websites and provide a service to help consumers distinguish counterfeit websites from legitimate ones.
Oka, Daisuke, Balage, Don Hiroshan Lakmal, Motegi, Kazuhiro, Kobayashi, Yasuhiro, Shiraishi, Yoichi.  2018.  A Combination of Support Vector Machine and Heuristics in On-line Non-Destructive Inspection System. Proceedings of the 2018 International Conference on Machine Learning and Machine Intelligence. :45–49.
This paper deals with an on-line non-destructive inspection system by using hammering sounds based on the combination of support vector machine and a heuristic algorithm. In machine learning algorithms, the perfect performance is hard to attain and it is newly suggested that a heuristic algorithm redeeming this insufficiency is connected to the support vector machine as a post-process. The experimental results show that the combination of support vector machine and the heuristic algorithm attains 100% detection of defective pieces with 18.4% of erroneous determination of non-defective pieces within the upper limit of given processing time.
Zhu, Mengeheng, Shi, Hong.  2018.  A Novel Support Vector Machine Algorithm for Missing Data. Proceedings of the 2Nd International Conference on Innovation in Artificial Intelligence. :48–53.
Missing data problem often occurs in data analysis. The most common way to solve this problem is imputation. But imputation methods are only suitable for dealing with a low proportion of missing data, when assuming that missing data satisfies MCAR (Missing Completely at Random) or MAR (Missing at Random). In this paper, considering the reasons for missing data, we propose a novel support vector machine method using a new kernel function to solve the problem with a relatively large proportion of missing data. This method makes full use of observed data to reduce the error caused by filling a large number of missing values. We validate our method on 4 data sets from UCI Repository of Machine Learning. The accuracy, F-score, Kappa statistics and recall are used to evaluate the performance. Experimental results show that our method achieve significant improvement in terms of classification results compared with common imputation methods, even when the proportion of missing data is high.
Wu, Siyan, Tong, Xiaojun, Wang, Wei, Xin, Guodong, Wang, Bailing, Zhou, Qi.  2018.  Website Defacements Detection Based on Support Vector Machine Classification Method. Proceedings of the 2018 International Conference on Computing and Data Engineering. :62–66.
Website defacements can inflict significant harm on the website owner through the loss of reputation, the loss of money, or the leakage of information. Due to the complexity and diversity of all kinds of web application systems, especially a lack of necessary security maintenance, website defacements increased year by year. In this paper, we focus on detecting whether the website has been defaced by extracting website features and website embedded trojan features. We use three kinds of classification learning algorithms which include Gradient Boosting Decision Tree (GBDT), Random Forest (RF) and Support Vector Machine (SVM) to do the classification experiments, and experimental results show that Support Vector Machine classifier performed better than two other classifiers. It can achieve an overall accuracy of 95%-96% in detecting website defacements.
Xu, Bowen, Shirani, Amirreza, Lo, David, Alipour, Mohammad Amin.  2018.  Prediction of Relatedness in Stack Overflow: Deep Learning vs. SVM: A Reproducibility Study. Proceedings of the 12th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement. :21:1–21:10.
Background Xu et al. used a deep neural network (DNN) technique to classify the degree of relatedness between two knowledge units (question-answer threads) on Stack Overflow. More recently, extending Xu et al.'s work, Fu and Menzies proposed a simpler classification technique based on a fine-tuned support vector machine (SVM) that achieves similar performance but in a much shorter time. Thus, they suggested that researchers need to compare their sophisticated methods against simpler alternatives. Aim The aim of this work is to replicate the previous studies and further investigate the validity of Fu and Menzies' claim by evaluating the DNN- and SVM-based approaches on a larger dataset. We also compare the effectiveness of these two approaches against SimBow, a lightweight SVM-based method that was previously used for general community question-answering. Method We (1) collect a large dataset containing knowledge units from Stack Overflow, (2) show the value of the new dataset addressing shortcomings of the original one, (3) re-evaluate both the DNN-and SVM-based approaches on the new dataset, and (4) compare the performance of the two approaches against that of SimBow. Results We find that: (1) there are several limitations in the original dataset used in the previous studies, (2) effectiveness of both Xu et al.'s and Fu and Menzies' approaches (as measured using F1-score) drop sharply on the new dataset, (3) similar to the previous finding, performance of SVM-based approaches (Fu and Menzies' approach and SimBow) are slightly better than the DNN-based approach, (4) contrary to the previous findings, Fu and Menzies' approach runs much slower than DNN-based approach on the larger dataset - its runtime grows sharply with increase in dataset size, and (5) SimBow outperforms both Xu et al. and Fu and Menzies' approaches in terms of runtime. Conclusion We conclude that, for this task, simpler approaches based on SVM performs adequately well. We also illustrate the challenges brought by the increased size of the dataset and show the benefit of a lightweight SVM-based approach for this task.
Lu, Yunmei, Yan, Mingyuan, Han, Meng, Zhang, Qingliang, Zhang, Yanqing.  2018.  Privacy Preserving Multiclass Classification for Horizontally Distributed Data. Proceedings of the 19th Annual SIG Conference on Information Technology Education. :165–165.
With the advent of the era of big data, applying data mining techniques on assembling data from multiple parties (or sources) has become a leading trend. In this work, a Privacy Preserving Multiclass Classification (PPM2C) method is proposed. Experimental results show that PPM2C is workable and stable.
Dam, Khanh Huu The, Touili, Tayssir.  2018.  Learning Malware Using Generalized Graph Kernels. Proceedings of the 13th International Conference on Availability, Reliability and Security. :28:1–28:6.
Machine learning techniques were extensively applied to learn and detect malware. However, these techniques use often rough abstractions of programs. We propose in this work to use a more precise model for programs, namely extended API call graphs, where nodes correspond to API function calls, edges specify the execution order between the API functions, and edge labels indicate the dependence relation between API functions parameters. To learn such graphs, we propose to use Generalized Random Walk Graph Kernels (combined with Support Vector Machines). We implemented our techniques and obtained encouraging results for malware detection: 96.73% of detection rate with 0.73% of false alarms.
2019-02-13
Feng, Y., Akiyama, H., Lu, L., Sakurai, K..  2018.  Feature Selection for Machine Learning-Based Early Detection of Distributed Cyber Attacks. 2018 IEEE 16th Intl Conf on Dependable, Autonomic and Secure Computing, 16th Intl Conf on Pervasive Intelligence and Computing, 4th Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress(DASC/PiCom/DataCom/CyberSciTech). :173–180.

It is well known that distributed cyber attacks simultaneously launched from many hosts have caused the most serious problems in recent years including problems of privacy leakage and denial of services. Thus, how to detect those attacks at early stage has become an important and urgent topic in the cyber security community. For this purpose, recognizing C&C (Command & Control) communication between compromised bots and the C&C server becomes a crucially important issue, because C&C communication is in the preparation phase of distributed attacks. Although attack detection based on signature has been practically applied since long ago, it is well-known that it cannot efficiently deal with new kinds of attacks. In recent years, ML(Machine learning)-based detection methods have been studied widely. In those methods, feature selection is obviously very important to the detection performance. We once utilized up to 55 features to pick out C&C traffic in order to accomplish early detection of DDoS attacks. In this work, we try to answer the question that "Are all of those features really necessary?" We mainly investigate how the detection performance moves as the features are removed from those having lowest importance and we try to make it clear that what features should be payed attention for early detection of distributed attacks. We use honeypot data collected during the period from 2008 to 2013. SVM(Support Vector Machine) and PCA(Principal Component Analysis) are utilized for feature selection and SVM and RF(Random Forest) are for building the classifier. We find that the detection performance is generally getting better if more features are utilized. However, after the number of features has reached around 40, the detection performance will not change much even more features are used. It is also verified that, in some specific cases, more features do not always means a better detection performance. We also discuss 10 important features which have the biggest influence on classification.

Liu, Shigang, Zhang, Jun, Wang, Yu, Zhou, Wanlei, Xiang, Yang, Vel., Olivier De.  2018.  A Data-driven Attack Against Support Vectors of SVM. Proceedings of the 2018 on Asia Conference on Computer and Communications Security. :723–734.
Machine learning (ML) is commonly used in multiple disciplines and real-world applications, such as information retrieval, financial systems, health, biometrics and online social networks. However, their security profiles against deliberate attacks have not often been considered. Sophisticated adversaries can exploit specific vulnerabilities exposed by classical ML algorithms to deceive intelligent systems. It is emerging to perform a thorough security evaluation as well as potential attacks against the machine learning techniques before developing novel methods to guarantee that machine learning can be securely applied in adversarial setting. In this paper, an effective attack strategy for crafting foreign support vectors in order to attack a classic ML algorithm, the Support Vector Machine (SVM) has been proposed with mathematical proof. The new attack can minimize the margin around the decision boundary and maximize the hinge loss simultaneously. We evaluate the new attack in different real-world applications including social spam detection, Internet traffic classification and image recognition. Experimental results highlight that the security of classifiers can be worsened by poisoning a small group of support vectors.
Prakash, A., Priyadarshini, R..  2018.  An Intelligent Software defined Network Controller for preventing Distributed Denial of Service Attack. 2018 Second International Conference on Inventive Communication and Computational Technologies (ICICCT). :585–589.

Software Defined Network (SDN) architecture is a new and novel way of network management mechanism. In SDN, switches do not process the incoming packets like conventional network computing environment. They match for the incoming packets in the forwarding tables and if there is none it will be sent to the controller for processing which is the operating system of the SDN. A Distributed Denial of Service (DDoS) attack is a biggest threat to cyber security in SDN network. The attack will occur at the network layer or the application layer of the compromised systems that are connected to the network. In this paper a machine learning based intelligent method is proposed which can detect the incoming packets as infected or not. The different machine learning algorithms adopted for accomplishing the task are Naive Bayes, K-Nearest neighbor (KNN) and Support vector machine (SVM) to detect the anomalous behavior of the data traffic. These three algorithms are compared according to their performances and KNN is found to be the suitable one over other two. The performance measure is taken here is the detection rate of infected packets.

2019-02-08
Wang, M., Zhu, W., Yan, S., Wang, Q..  2018.  SoundAuth: Secure Zero-Effort Two-Factor Authentication Based on Audio Signals. 2018 IEEE Conference on Communications and Network Security (CNS). :1-9.

Two-factor authentication (2FA) popularly works by verifying something the user knows (a password) and something she possesses (a token, popularly instantiated with a smart phone). Conventional 2FA systems require extra interaction like typing a verification code, which is not very user-friendly. For improved user experience, recent work aims at zero-effort 2FA, in which a smart phone placed close to a computer (where the user enters her username/password into a browser to log into a server) automatically assists with the authentication. To prove her possession of the smart phone, the user needs to prove the phone is on the login spot, which reduces zero-effort 2FA to co-presence detection. In this paper, we propose SoundAuth, a secure zero-effort 2FA mechanism based on (two kinds of) ambient audio signals. SoundAuth looks for signs of proximity by having the browser and the smart phone compare both their surrounding sounds and certain unpredictable near-ultrasounds; if significant distinguishability is found, SoundAuth rejects the login request. For the ambient signals comparison, we regard it as a classification problem and employ a machine learning technique to analyze the audio signals. Experiments with real login attempts show that SoundAuth not only is comparable to existent schemes concerning utility, but also outperforms them in terms of resilience to attacks. SoundAuth can be easily deployed as it is readily supported by most smart phones and major browsers.

2019-01-21
Lu, L., Yu, J., Chen, Y., Liu, H., Zhu, Y., Liu, Y., Li, M..  2018.  LipPass: Lip Reading-based User Authentication on Smartphones Leveraging Acoustic Signals. IEEE INFOCOM 2018 - IEEE Conference on Computer Communications. :1466–1474.

To prevent users' privacy from leakage, more and more mobile devices employ biometric-based authentication approaches, such as fingerprint, face recognition, voiceprint authentications, etc., to enhance the privacy protection. However, these approaches are vulnerable to replay attacks. Although state-of-art solutions utilize liveness verification to combat the attacks, existing approaches are sensitive to ambient environments, such as ambient lights and surrounding audible noises. Towards this end, we explore liveness verification of user authentication leveraging users' lip movements, which are robust to noisy environments. In this paper, we propose a lip reading-based user authentication system, LipPass, which extracts unique behavioral characteristics of users' speaking lips leveraging build-in audio devices on smartphones for user authentication. We first investigate Doppler profiles of acoustic signals caused by users' speaking lips, and find that there are unique lip movement patterns for different individuals. To characterize the lip movements, we propose a deep learning-based method to extract efficient features from Doppler profiles, and employ Support Vector Machine and Support Vector Domain Description to construct binary classifiers and spoofer detectors for user identification and spoofer detection, respectively. Afterwards, we develop a binary tree-based authentication approach to accurately identify each individual leveraging these binary classifiers and spoofer detectors with respect to registered users. Through extensive experiments involving 48 volunteers in four real environments, LipPass can achieve 90.21% accuracy in user identification and 93.1% accuracy in spoofer detection.

2018-07-18
Terai, A., Abe, S., Kojima, S., Takano, Y., Koshijima, I..  2017.  Cyber-Attack Detection for Industrial Control System Monitoring with Support Vector Machine Based on Communication Profile. 2017 IEEE European Symposium on Security and Privacy Workshops (EuroS PW). :132–138.

Industrial control systems (ICS) used in industrial plants are vulnerable to cyber-attacks that can cause fatal damage to the plants. Intrusion detection systems (IDSs) monitor ICS network traffic and detect suspicious activities. However, many IDSs overlook sophisticated cyber-attacks because it is hard to make a complete database of cyber-attacks and distinguish operational anomalies when compared to an established baseline. In this paper, a discriminant model between normal and anomalous packets was constructed with a support vector machine (SVM) based on an ICS communication profile, which represents only packet intervals and length, and an IDS with the applied model is proposed. Furthermore, the proposed IDS was evaluated using penetration tests on our cyber security test bed. Although the IDS was constructed by the limited features (intervals and length) of packets, the IDS successfully detected cyber-attacks by monitoring the rate of predicted attacking packets.

2018-06-20
Searles, R., Xu, L., Killian, W., Vanderbruggen, T., Forren, T., Howe, J., Pearson, Z., Shannon, C., Simmons, J., Cavazos, J..  2017.  Parallelization of Machine Learning Applied to Call Graphs of Binaries for Malware Detection. 2017 25th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP). :69–77.

Malicious applications have become increasingly numerous. This demands adaptive, learning-based techniques for constructing malware detection engines, instead of the traditional manual-based strategies. Prior work in learning-based malware detection engines primarily focuses on dynamic trace analysis and byte-level n-grams. Our approach in this paper differs in that we use compiler intermediate representations, i.e., the callgraph representation of binaries. Using graph-based program representations for learning provides structure of the program, which can be used to learn more advanced patterns. We use the Shortest Path Graph Kernel (SPGK) to identify similarities between call graphs extracted from binaries. The output similarity matrix is fed into a Support Vector Machine (SVM) algorithm to construct highly-accurate models to predict whether a binary is malicious or not. However, SPGK is computationally expensive due to the size of the input graphs. Therefore, we evaluate different parallelization methods for CPUs and GPUs to speed up this kernel, allowing us to continuously construct up-to-date models in a timely manner. Our hybrid implementation, which leverages both CPU and GPU, yields the best performance, achieving up to a 14.2x improvement over our already optimized OpenMP version. We compared our generated graph-based models to previously state-of-the-art feature vector 2-gram and 3-gram models on a dataset consisting of over 22,000 binaries. We show that our classification accuracy using graphs is over 19% higher than either n-gram model and gives a false positive rate (FPR) of less than 0.1%. We are also able to consider large call graphs and dataset sizes because of the reduced execution time of our parallelized SPGK implementation.