Biblio

Found 789 results

Filters: Keyword is learning (artificial intelligence)
2019-03-15
Inoue, T., Hasegawa, K., Kobayashi, Y., Yanagisawa, M., Togawa, N..  2018.  Designing Subspecies of Hardware Trojans and Their Detection Using Neural Network Approach. 2018 IEEE 8th International Conference on Consumer Electronics - Berlin (ICCE-Berlin). :1-4.

Due to recent technological development, home appliances and electric devices are equipped with high-performance hardware devices. Since demand for hardware devices has increased, production has become internationalized to mass-produce hardware devices at low cost, and hardware vendors outsource their products to third-party vendors. Accordingly, malicious third-party vendors can easily insert malfunctions (also known as "hardware Trojans") into their products. In this paper, we design six kinds of hardware Trojans at the gate-level netlist and apply a neural-network (NN) based hardware-Trojan detection method to them. The designed hardware Trojans differ in their trigger circuits. In addition, we insert them into normal circuits and detect the hardware Trojans using a machine-learning-based hardware-Trojan detection method with neural networks. In our experiment, we trained the NN on Trojan-infected benchmarks and performed cross-validation to evaluate the learned NN. The experimental results demonstrate that the average TPR (True Positive Rate) is 72.9% and the average TNR (True Negative Rate) is 90.0%.
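
A minimal sketch of the kind of workflow the abstract describes: a small neural-network classifier over per-net features from a gate-level netlist, evaluated with cross-validation and reported as TPR/TNR. The synthetic feature vectors and the specific MLP layout are assumptions, not the paper's actual features or network.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import confusion_matrix

# Placeholder per-net feature vectors (e.g., fan-in, distance to flip-flops);
# real features would be extracted from the gate-level netlist.
rng = np.random.default_rng(0)
X = rng.random((2000, 11))
y = (rng.random(2000) < 0.05).astype(int)   # 1 = Trojan net, 0 = normal net
X[y == 1] += 0.4                            # give the Trojan class some separable signal

tprs, tnrs = [], []
for tr, te in StratifiedKFold(n_splits=5, shuffle=True, random_state=0).split(X, y):
    clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
    clf.fit(X[tr], y[tr])
    tn, fp, fn, tp = confusion_matrix(y[te], clf.predict(X[te])).ravel()
    tprs.append(tp / (tp + fn))
    tnrs.append(tn / (tn + fp))

print(f"average TPR = {np.mean(tprs):.3f}, average TNR = {np.mean(tnrs):.3f}")
```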

Ye, J., Yang, Y., Gong, Y., Hu, Y., Li, X..  2018.  Grey Zone in Pre-Silicon Hardware Trojan Detection. 2018 IEEE International Test Conference in Asia (ITC-Asia). :79-84.

Pre-silicon hardware Trojan detection has been studied for years. The most popular benchmark circuits are from the Trust-Hub. Their common feature is that the probability of activating the hardware Trojans is very low. This has led to a series of machine learning based hardware Trojan detection methods that try to find nets with a very low signal probability of being 0 or 1. On the other hand, it is considered that, if the probability of activating hardware Trojans is high, these hardware Trojans can easily be found through behaviour simulation or during functional test. This paper explores the "grey zone" between these two opposite scenarios: if the activation probability of a hardware Trojan is not low enough for machine learning to detect it and not high enough for behaviour simulation or functional test to find it, it can escape detection. Experiments show the existence of such hardware Trojans, and this paper suggests a new set of hardware Trojan benchmark circuits for future study.
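
The "grey zone" argument turns on how rarely a trigger condition is satisfied. The sketch below estimates a trigger's activation probability by Monte Carlo simulation; the 16-input AND trigger is an illustrative stand-in, not one of the Trust-Hub or proposed benchmark circuits.

```python
import numpy as np

rng = np.random.default_rng(1)
n_inputs, n_trials = 16, 1_000_000
inputs = rng.integers(0, 2, size=(n_trials, n_inputs), dtype=np.uint8)
fired = np.all(inputs == 1, axis=1)          # trigger fires only on the all-ones pattern
print(f"estimated activation probability: {fired.mean():.2e} "
      f"(ideal: {2.0**-n_inputs:.2e})")
```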

Kim, D., Shin, D., Shin, D..  2018.  Unauthorized Access Point Detection Using Machine Learning Algorithms for Information Protection. 2018 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/ 12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :1876-1878.

With the frequent use of Wi-Fi and hotspots that provide a wireless Internet environment, awareness of and threats to wireless AP (Access Point) security are steadily increasing. Especially when unauthorized APs are used in company, government, and military facilities, there is a high possibility of exposure to various viruses and hacking attacks. It is therefore necessary to detect unauthorized APs to protect information. In this paper, we use an RTT (Round Trip Time) data set to detect authorized and unauthorized APs in a wired/wireless integrated environment and analyze it using machine learning algorithms including SVM (Support Vector Machine), C4.5, KNN (K Nearest Neighbors), and MLP (Multilayer Perceptron). Overall, KNN shows the highest accuracy.
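
A rough sketch of the comparison described above, using scikit-learn implementations of the four classifiers on synthetic RTT-style features; the toy data and the use of CART as a stand-in for C4.5 are assumptions.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
# Toy RTT measurements (ms): unauthorized APs assumed to add an extra wireless hop.
rtt_auth = rng.normal(loc=2.0, scale=0.4, size=(500, 5))
rtt_unauth = rng.normal(loc=3.5, scale=0.7, size=(500, 5))
X = np.vstack([rtt_auth, rtt_unauth])
y = np.array([0] * 500 + [1] * 500)

models = {
    "SVM": SVC(kernel="rbf"),
    "C4.5 (CART stand-in)": DecisionTreeClassifier(),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "MLP": MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name:22s} accuracy: {scores.mean():.3f}")
```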

Lin, W., Lin, H., Wang, P., Wu, B., Tsai, J..  2018.  Using Convolutional Neural Networks to Network Intrusion Detection for Cyber Threats. 2018 IEEE International Conference on Applied System Invention (ICASI). :1107-1110.

In practice, defenders need a more efficient network detection approach that can quickly learn new network behavioural features for intrusion detection. In many applications, Deep Learning techniques have been confirmed to outperform classic approaches. Accordingly, this study focuses on network intrusion detection using convolutional neural networks (CNNs) based on LeNet-5 to classify network threats. The experimental results show that the prediction accuracy of intrusion detection rises to 99.65% when more than 10,000 samples are used. The overall accuracy rate is 97.53%.
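
A hedged sketch of a LeNet-5-style CNN of the kind the study builds on; reshaping flow bytes into a 32x32 single-channel "image" and the exact layer sizes are assumptions, since the abstract does not give architecture details.

```python
import torch
import torch.nn as nn

class LeNet5(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5), nn.Tanh(), nn.AvgPool2d(2),   # 32 -> 28 -> 14
            nn.Conv2d(6, 16, kernel_size=5), nn.Tanh(), nn.AvgPool2d(2),  # 14 -> 10 -> 5
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120), nn.Tanh(),
            nn.Linear(120, 84), nn.Tanh(),
            nn.Linear(84, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# A flow record padded/truncated to 1024 bytes can be viewed as a 1x32x32 grid.
model = LeNet5(num_classes=2)
dummy_batch = torch.rand(8, 1, 32, 32)
print(model(dummy_batch).shape)   # torch.Size([8, 2])
```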

Noor, U., Anwar, Z., Rashid, Z..  2018.  An Association Rule Mining-Based Framework for Profiling Regularities in Tactics Techniques and Procedures of Cyber Threat Actors. 2018 International Conference on Smart Computing and Electronic Enterprise (ICSCEE). :1-6.

Tactics, Techniques, and Procedures (TTPs) in the cyber domain are important threat information that describes the behavior and attack patterns of an adversary. Timely identification of associations between TTPs can lead to an effective strategy for diagnosing Cyber Threat Actors (CTAs) and their attack vectors. This study profiles the prevalence and regularities in the TTPs of CTAs. We developed a machine learning-based framework that takes Cyber Threat Intelligence (CTI) documents as input, selects the most prevalent TTPs with high information gain as features, and, based on them, mines interesting regularities between TTPs using Association Rule Mining (ARM). We evaluated the proposed framework with publicly available TTP-based CTI documents. The results show that 28 TTPs are more prevalent than the other TTPs. Our system identified 155 interesting association rules among the TTPs of CTAs. A summary of these rules is given to effectively investigate threats in the network.
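
A small illustration of the ARM step: mining one-to-one association rules over TTP occurrence sets with plain Python. The TTP names and thresholds are invented for the example, and the information-gain feature selection the framework applies beforehand is omitted.

```python
from itertools import permutations

# Each set lists the TTPs observed in one CTI document (toy data).
docs = [
    {"spearphishing", "powershell", "credential dumping"},
    {"spearphishing", "powershell", "exfiltration over C2"},
    {"powershell", "credential dumping", "exfiltration over C2"},
    {"spearphishing", "powershell", "credential dumping", "exfiltration over C2"},
]

def support(itemset):
    return sum(itemset <= d for d in docs) / len(docs)

ttps = sorted(set().union(*docs))
min_support, min_confidence = 0.5, 0.8

for a, b in permutations(ttps, 2):                      # candidate rules a -> b
    sup_ab = support({a, b})
    if sup_ab >= min_support:
        conf = sup_ab / support({a})
        if conf >= min_confidence:
            print(f"{a} -> {b}  (support={sup_ab:.2f}, confidence={conf:.2f})")
```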

Deliu, I., Leichter, C., Franke, K..  2018.  Collecting Cyber Threat Intelligence from Hacker Forums via a Two-Stage, Hybrid Process Using Support Vector Machines and Latent Dirichlet Allocation. 2018 IEEE International Conference on Big Data (Big Data). :5008-5013.

Traditional security controls, such as firewalls, anti-virus and IDS, are ill-equipped to help IT security and response teams keep pace with the rapid evolution of the cyber threat landscape. Cyber Threat Intelligence (CTI) can help remediate this problem by exploiting non-traditional information sources, such as hacker forums and "dark-web" social platforms. Security and response teams can use the collected intelligence to identify emerging threats. Unfortunately, when manual analysis is used to extract CTI from non-traditional sources, it is a time-consuming, error-prone and resource-intensive process. We address these issues by using a hybrid Machine Learning model that automatically searches through hacker forum posts, identifies the posts that are most relevant to cyber security and then clusters the relevant posts into estimations of the topics that the hackers are discussing. The first (identification) stage uses Support Vector Machines and the second (clustering) stage uses Latent Dirichlet Allocation. We tested our model, using data from an actual hacker forum, to automatically extract information about various threats such as leaked credentials, malicious proxy servers, malware that evades AV detection, etc. The results demonstrate that our method is an effective means for quickly extracting relevant and actionable intelligence that can be integrated with traditional security controls to increase their effectiveness.
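
A compact sketch of the two-stage pipeline: an SVM filters security-relevant posts, then LDA clusters the surviving posts into topics. The toy posts, labels, and parameter choices are assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.svm import LinearSVC
from sklearn.decomposition import LatentDirichletAllocation

posts = [
    "selling fresh dumps and leaked credentials",
    "new crypter bypasses AV detection",
    "anyone selling socks5 proxy lists?",
    "happy birthday to the forum admin!",
]
labels = [1, 1, 1, 0]          # 1 = security-relevant, 0 = irrelevant (toy labels)

# Stage 1: identification with an SVM over TF-IDF features.
tfidf = TfidfVectorizer()
svm = LinearSVC().fit(tfidf.fit_transform(posts), labels)
relevant = [p for p in posts if svm.predict(tfidf.transform([p]))[0] == 1]

# Stage 2: topic clustering of the relevant posts with LDA over term counts.
counts = CountVectorizer()
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topic_mix = lda.fit_transform(counts.fit_transform(relevant))
print(topic_mix.round(2))      # per-post topic proportions
```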

2019-03-06
Lin, Y., Liu, H., Xie, G., Zhang, Y..  2018.  Time Series Forecasting by Evolving Deep Belief Network with Negative Correlation Search. 2018 Chinese Automation Congress (CAC). :3839-3843.

The recently developed deep belief network (DBN) has been shown to be an effective methodology for solving time series forecasting problems. However, the performance of a DBN depends heavily on a reasonable setting of its hyperparameters. At present, random search, grid search and Bayesian optimization are the most common methods of hyperparameter optimization. As an alternative, a state-of-the-art derivative-free optimizer, negative correlation search (NCS), is adopted in this paper to decide the sizes of the DBN and the learning rates during the training process. A comparative analysis is performed between the proposed method and other popular techniques in a time series forecasting experiment based on two types of time series datasets. The experimental results statistically affirm the efficiency of the proposed model in obtaining better prediction results than conventional neural network models.
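
To illustrate the role of a derivative-free hyperparameter optimizer, the sketch below runs a plain random search (a stand-in for NCS) over the hidden-layer size and learning rate of a small forecaster; sklearn's MLPRegressor stands in for the DBN. Both substitutions are assumptions made to keep the example self-contained.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
series = np.sin(np.linspace(0, 40, 600)) + 0.1 * rng.standard_normal(600)
window = 8
X = np.array([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]
X_tr, X_te, y_tr, y_te = X[:500], X[500:], y[:500], y[500:]

best = (None, np.inf)
for _ in range(20):                      # candidate hyperparameter settings
    hidden = int(rng.integers(5, 60))
    lr = float(10 ** rng.uniform(-4, -1))
    model = MLPRegressor(hidden_layer_sizes=(hidden,), learning_rate_init=lr,
                         max_iter=500, random_state=0).fit(X_tr, y_tr)
    mse = mean_squared_error(y_te, model.predict(X_te))
    if mse < best[1]:
        best = ((hidden, lr), mse)

print("best (hidden size, learning rate):", best[0], "test MSE:", round(best[1], 4))
```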

Liu, Y., Wang, Y., Lombardi, F., Han, J..  2018.  An Energy-Efficient Stochastic Computational Deep Belief Network. 2018 Design, Automation Test in Europe Conference Exhibition (DATE). :1175-1178.

Deep neural networks (DNNs) are effective machine learning models to solve a large class of recognition problems, including the classification of nonlinearly separable patterns. The applications of DNNs are, however, limited by the large size and high energy consumption of the networks. Recently, stochastic computation (SC) has been considered to implement DNNs to reduce the hardware cost. However, it requires a large number of random number generators (RNGs) that lower the energy efficiency of the network. To overcome these limitations, we propose the design of an energy-efficient deep belief network (DBN) based on stochastic computation. An approximate SC activation unit (A-SCAU) is designed to implement different types of activation functions in the neurons. The A-SCAU is immune to signal correlations, so the RNGs can be shared among all neurons in the same layer with no accuracy loss. The area and energy of the proposed design are 5.27% and 3.31% (or 26.55% and 29.89%) of a 32-bit floating-point (or an 8-bit fixed-point) implementation. It is shown that the proposed SC-DBN design achieves a higher classification accuracy compared to the fixed-point implementation. The accuracy is only lower by 0.12% than the floating-point design at a similar computation speed, but with a significantly lower energy consumption.
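
A brief illustration of the stochastic-computing arithmetic that motivates the design: values are encoded as random bitstreams, so multiplication reduces to a single AND gate but every operand normally needs its own random stream. The A-SCAU activation unit itself is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4096                                  # bitstream length

def to_stream(p):
    """Encode a probability p in [0, 1] as a unipolar random bitstream."""
    return (rng.random(N) < p).astype(np.uint8)

a, b = 0.8, 0.5
sa, sb = to_stream(a), to_stream(b)
product_stream = sa & sb                  # multiplication is a single AND gate
print("expected a*b =", a * b, " SC estimate =", product_stream.mean())
```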

Hess, S., Satam, P., Ditzler, G., Hariri, S..  2018.  Malicious HTML File Prediction: A Detection and Classification Perspective with Noisy Data. 2018 IEEE/ACS 15th International Conference on Computer Systems and Applications (AICCSA). :1-7.

Cybersecurity plays a critical role in protecting sensitive information and the structural integrity of networked systems. As networked systems continue to expand in number as well as in complexity, so does the threat of malicious activity and the necessity for advanced cybersecurity solutions. Furthermore, both the quantity and quality of available data on malicious content, as well as the fact that malicious activity continuously evolves, make automated protection systems for this type of environment particularly challenging. Not only is data quality a concern, but the volume of data can be quite small for some of the classes. This creates a class imbalance in the data used to train a classifier; however, many classifiers are not well equipped to deal with class imbalance. One such example is detecting malicious HTML files from static features. Unfortunately, collecting malicious HTML files is extremely difficult, and the data can be quite noisy because HTML files are sometimes mislabeled. This paper evaluates a specific application that is afflicted by these modern cybersecurity challenges: detection of malicious HTML files. Previous work presented a general framework for malicious HTML file classification that we modify in this work to use a chi-squared (χ²) feature selection technique and the synthetic minority oversampling technique (SMOTE). We experiment with different classifiers (i.e., AdaBoost, GentleBoost, RobustBoost, RUSBoost, and Random Forest) and a pure detection model (i.e., Isolation Forest). We benchmark the different classifiers using SMOTE on a real dataset that contains a limited number of malicious files (40) with respect to the normal files (7,263). It was found that the modified framework performed better than the previous framework. However, additional evidence implies that algorithms which train on both the normal and malicious samples are likely overtraining to the malicious distribution. We demonstrate the likely overtraining by determining that a subset of the malicious files, while suspicious, did not come from a malicious source.
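
A hedged sketch of the modified framework's pre-processing chain: chi-squared feature selection followed by SMOTE oversampling in front of a forest classifier. The synthetic features are placeholders; only the 40-versus-7,263 class ratio is taken from the abstract.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from imblearn.over_sampling import SMOTE

rng = np.random.default_rng(0)
X = rng.random((7303, 50))                       # placeholder static HTML features (non-negative, as chi2 requires)
y = np.array([0] * 7263 + [1] * 40)              # 7,263 benign vs 40 malicious files
X[y == 1, :5] += 0.3                             # give the malicious class some signal

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, test_size=0.3, random_state=0)

selector = SelectKBest(chi2, k=20).fit(X_tr, y_tr)
X_tr_sel, X_te_sel = selector.transform(X_tr), selector.transform(X_te)

X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_tr_sel, y_tr)   # oversample the minority class
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_bal, y_bal)
print(classification_report(y_te, clf.predict(X_te_sel), digits=3))
```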

2019-03-04
Lin, F., Beadon, M., Dixit, H. D., Vunnam, G., Desai, A., Sankar, S..  2018.  Hardware Remediation at Scale. 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W). :14–17.
Large scale services have automated hardware remediation to maintain the infrastructure availability at a healthy level. In this paper, we share the current remediation flow at Facebook, and how it is being monitored. We discuss a class of hardware issues that are transient and typically have higher rates during heavy load. We describe how our remediation system was enhanced to be efficient in detecting this class of issues. As hardware and systems change in response to the advancement in technology and scale, we have also utilized machine learning frameworks for hardware remediation to handle the introduction of new hardware failure modes. We present an ML methodology that uses a set of predictive thresholds to monitor remediation efficiency over time. We also deploy a recommendation system based on natural language processing, which is used to recommend repair actions for efficient diagnosis and repair. We also describe current areas of research that will enable us to improve hardware availability further.
Diao, Y., Rosu, D..  2018.  Improving response accuracy for classification-based conversational IT services. NOMS 2018 - 2018 IEEE/IFIP Network Operations and Management Symposium. :1–15.
Conversational IT services are expected to reduce user wait times and improve overall customer satisfaction. Cloud-based solutions are readily available for enterprise subject matter experts (SMEs) to train user-question classifiers and build conversational services with little effort. However, methodologies that the SMEs can use to improve response accuracy and conversation quality are rarely stated and evaluated. In complex service scenarios such as software support, the scope of topics is typically large and the training samples are often limited. Thus, training the classifier on labeled samples of plain user utterances is not effective in most cases. In this paper, we identify several methods for improving classification quality and evaluate them in concrete training-set scenarios. In particular, a process-based methodology is described that builds on and refines service domain knowledge in order to develop a scalable solution for training accurate conversational services. Enterprises and service providers are continuously seeking new ways to improve the customer experience of working with IT systems, where user wait times and service resolution quality are critical business metrics. One of the latest trends is the use of conversational IT services. Customers can interact with a conversational service to express their questions in natural language, and the system can automatically return relevant answers or execute back-end processes for automated actions. Various text classification techniques have been developed and applied to understand the user questions and trigger the correct responses. For instance, in the context of IT software support, customers can use conversational systems to get answers about software product errors, licenses, or upgrade processes. While the potential benefits of building conversational services are huge, it is often difficult to effectively train classification models that cover the scope of realistically complex services well. In this paper, we propose a training methodology that addresses the limitations in both the scope of topics and the scarcity of the training set. We further evaluate the proposed methodology in a real service support scenario and share the lessons learned.
Aborisade, O., Anwar, M..  2018.  Classification for Authorship of Tweets by Comparing Logistic Regression and Naive Bayes Classifiers. 2018 IEEE International Conference on Information Reuse and Integration (IRI). :269–276.

At a time when all it takes to open a Twitter account is a mobile phone, the act of authenticating information encountered on social media becomes very complex, especially when we lack measures to verify digital identities in the first place. Because the platform supports anonymity, fake news generated by dubious sources has been observed to travel much faster and farther than real news. Hence, we need valid measures to identify the authors of misinformation to avert these consequences. Researchers have proposed different authorship attribution techniques to approach this kind of problem. However, because tweets are made up of only 280 characters, finding a suitable authorship attribution technique is a challenge. This research aims to classify the authors of tweets by comparing machine learning methods such as logistic regression and naive Bayes. The process comprises fetching tweets, pre-processing, feature extraction, and developing a machine learning model for classification. This paper illustrates the text classification process for authorship attribution using machine learning techniques. In total, 46,895 tweets were used as training and testing data, and unique features specific to Twitter were extracted. Several steps were performed in the pre-processing phase, including removal of short texts, removal of stop-words and punctuation, and tokenizing and stemming of texts. This approach transforms the pre-processed data into a set of feature vectors in Python. Logistic regression and naive Bayes algorithms were applied to the set of feature vectors for training and testing the classifier. The logistic-regression-based classifier gave the higher accuracy of 91.1%, compared to 89.8% for the naive Bayes classifier.
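
A minimal sketch of the comparison described above: TF-IDF features feeding logistic regression and naive Bayes classifiers. The toy tweets and authors are invented for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

tweets = [
    "just shipped a new build, coffee time", "debugging all night again",
    "coffee then code, same as every day", "the match last night was unreal",
    "what a goal in the final minute", "season tickets sorted for next year",
]
authors = ["dev", "dev", "dev", "fan", "fan", "fan"]

for name, clf in [("logistic regression", LogisticRegression(max_iter=1000)),
                  ("naive Bayes", MultinomialNB())]:
    pipe = make_pipeline(TfidfVectorizer(stop_words="english"), clf)
    acc = cross_val_score(pipe, tweets, authors, cv=3).mean()
    print(f"{name}: cross-validated accuracy {acc:.2f}")
```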

2019-02-25
Gupta, M., Bakliwal, A., Agarwal, S., Mehndiratta, P..  2018.  A Comparative Study of Spam SMS Detection Using Machine Learning Classifiers. 2018 Eleventh International Conference on Contemporary Computing (IC3). :1–7.
With technological advancements and the increase in content-based advertisement, the use of Short Message Service (SMS) on phones has grown to such a level that devices are sometimes flooded with spam SMS messages. These spam messages can also lead to the loss of private data. There are many content-based machine learning techniques that have proven to be effective in filtering spam emails. Modern-day researchers have used stylistic features of text messages to classify them as ham or spam. SMS spam detection can be greatly influenced by the presence of known words, phrases, abbreviations and idioms. This paper aims to compare different classification techniques on different datasets collected from previous research works, and to evaluate them on the basis of their accuracy, precision, recall and CAP curve. The comparison has been performed between traditional machine learning techniques and deep learning methods.
Karamollaoglu, H., Dogru, İ A., Dorterler, M..  2018.  Detection of Spam E-mails with Machine Learning Methods. 2018 Innovations in Intelligent Systems and Applications Conference (ASYU). :1–5.

E-mail is one of today's indispensable means of communication. The widespread use of e-mail has brought about some problems, the most important of which is spam (unwanted) e-mail, often composed of advertisements or offensive content and sent without the recipient's request. This study aims to analyze the content of e-mails written in Turkish using the Naive Bayes classifier and the Vector Space Model from among machine learning methods, in order to determine whether these e-mails are spam and to classify them. Both methods are subjected to different evaluation criteria and their performances are compared.

Popovac, M., Karanovic, M., Sladojevic, S., Arsenovic, M., Anderla, A..  2018.  Convolutional Neural Network Based SMS Spam Detection. 2018 26th Telecommunications Forum (TELFOR). :1–4.
SMS spam refers to undesired text messages. Machine learning methods for anti-spam filters have been noticeably effective in categorizing spam messages. The dataset used in this research is known as Tiago's dataset. A crucial step in the experiment was data preprocessing, which involved converting the text to lower case, tokenization, and removing stopwords. A Convolutional Neural Network was the proposed method for classification. The overall model accuracy was 98.4%. The obtained model can be used as a tool in many applications.
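
A hedged sketch of a 1-D convolutional network over token embeddings for SMS classification; the vocabulary size, embedding width, and filter settings are assumptions, since the abstract does not specify the architecture.

```python
import torch
import torch.nn as nn

class SmsCnn(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.conv = nn.Conv1d(embed_dim, 64, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveMaxPool1d(1)
        self.fc = nn.Linear(64, 2)                    # ham vs spam

    def forward(self, token_ids):                     # token_ids: (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)     # -> (batch, embed_dim, seq_len)
        x = torch.relu(self.conv(x))
        return self.fc(self.pool(x).squeeze(-1))

model = SmsCnn()
dummy = torch.randint(0, 5000, (4, 50))               # a batch of 4 tokenized messages
print(model(dummy).shape)                             # torch.Size([4, 2])
```
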
Peng, W., Huang, L., Jia, J., Ingram, E..  2018.  Enhancing the Naive Bayes Spam Filter Through Intelligent Text Modification Detection. 2018 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/ 12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :849–854.

Spam emails have been a chronic issue in computer security. They are very costly economically and extremely dangerous for computers and networks. Despite the emergence of social networks and other Internet-based information exchange venues, dependence on email communication has increased over the years, and this dependence has resulted in an urgent need to improve spam filters. Although many spam filters have been created to help prevent spam emails from entering a user's inbox, there is a lack of research focusing on text modifications. Currently, Naive Bayes is one of the most popular methods of spam classification because of its simplicity and efficiency. Naive Bayes is also very accurate; however, it is unable to correctly classify emails when they contain leetspeak or diacritics. Thus, we implemented a novel algorithm for enhancing the accuracy of the Naive Bayes spam filter so that it can detect text modifications and correctly classify the email as spam or ham. Our Python algorithm combines semantic-based, keyword-based, and machine learning algorithms to increase the accuracy of Naive Bayes compared to SpamAssassin by over two hundred percent. Additionally, we have discovered a relationship between the length of the email and the spam score, indicating that Bayesian poisoning, a controversial topic, is actually a real phenomenon utilized by spammers.
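
A small illustration of the idea of normalizing text modifications before Naive Bayes scoring. The leetspeak substitution table, diacritic stripping, and tiny training set are assumptions, not the paper's algorithm.

```python
import unicodedata
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

LEET = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t", "$": "s", "@": "a"})

def normalize(text):
    # Strip diacritics, then map common leetspeak characters back to letters.
    text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode()
    return text.lower().translate(LEET)

train = ["cheap meds buy now", "free prize claim today", "meeting moved to monday", "see you at lunch"]
labels = ["spam", "spam", "ham", "ham"]
clf = make_pipeline(CountVectorizer(), MultinomialNB()).fit([normalize(t) for t in train], labels)

print(clf.predict([normalize("fr33 pr1ze cl4im t0d4y")]))   # -> ['spam']
```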

Ali, S. S., Maqsood, J..  2018.  .Net library for SMS spam detection using machine learning: A cross platform solution. 2018 15th International Bhurban Conference on Applied Sciences and Technology (IBCAST). :470–476.

Short Message Service is nowadays the most used way of communicating in the electronic world. While much research exists on email spam detection, there is little insight into spam sent within SMS messages. This might be because the frequency of spam in these short messages is much lower than in emails. This paper presents different ways of analyzing spam for SMS and a new pre-processing approach to obtain an actual dataset of spam messages. This dataset was then used with different algorithms to find the best-performing one in terms of both accuracy and recall. The Random Forest algorithm was then implemented in a real-world application library written in C# for cross-platform .NET development. This library is capable of using a prebuilt model for classifying a new dataset as spam or ham.

Vishagini, V., Rajan, A. K..  2018.  An Improved Spam Detection Method with Weighted Support Vector Machine. 2018 International Conference on Data Science and Engineering (ICDSE). :1–5.
Email is the most popular method of exchanging messages over the Internet. One of the challenges facing email users is detecting the spam they receive. This can be addressed using different detection and filtering techniques. Machine learning algorithms, especially the Support Vector Machine (SVM), can play a vital role in spam detection. We propose the use of a weighted SVM for spam filtering, using weight variables obtained by the KFCM algorithm. The weight variables reflect the importance of different classes, and the misclassification of emails is reduced as the weight value grows. We evaluate the spam detection performance of SVM, WSVM with KPCM, and WSVM with KFCM. The UCI Repository SMS Spam Base dataset is used for our experimentation.
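
A short sketch of a weighted SVM in which each training message carries an importance weight; here simple class-based weights stand in for the KFCM membership values used in the paper, and the toy messages are assumptions.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC

msgs = ["win cash now", "free entry to prize draw", "lunch at noon?", "call me when you land"]
y = np.array([1, 1, 0, 0])                     # 1 = spam, 0 = ham

# Assumed stand-in for fuzzy-cluster membership: up-weight the spam class so
# misclassifying spam costs more during training.
weights = np.where(y == 1, 2.0, 1.0)

vec = TfidfVectorizer()
X = vec.fit_transform(msgs)
clf = SVC(kernel="linear").fit(X, y, sample_weight=weights)
print(clf.predict(vec.transform(["free cash prize"])))   # -> [1]
```
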
Lekshmi, M. B., Deepthi, V. R..  2018.  Spam Detection Framework for Online Reviews Using Hadoop's Computational Capability. 2018 International CET Conference on Control, Communication, and Computing (IC4). :436–440.
Nowadays, online reviews have become one of the vital elements customers rely on for online shopping. Organizations and individuals use this information to buy the right products and make business decisions. This has led spammers and unethical business people to create false reviews and promote their products to beat the competition. Spammers develop sophisticated systems to create bulk spam reviews on any website within hours. To tackle this problem, studies have been conducted to formulate effective ways of detecting spam reviews. Various spam detection methods have been introduced, most of which extract meaningful features from the text or use machine learning techniques. These approaches gave little importance to the type of extracted features and to the processing rate. NetSpam [1] defines a framework that can classify the review dataset based on spam features and map them to a spam detection procedure that performs better than previous works in predictive accuracy. In this work, a method is proposed that can improve the processing rate by applying a distributed approach to the review dataset using MapReduce. Parallel programming with MapReduce is used for processing big data in Hadoop. The solution involves parallelising the algorithm defined in NetSpam and defines a spam detection procedure with better predictive accuracy and processing rate.
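
A hedged sketch of the distribution idea, using PySpark's map primitive to score reviews in parallel; the toy scoring function stands in for the NetSpam feature computation, which is not reproduced here.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("review-spam-scoring").getOrCreate()
reviews = [
    ("r1", "Best product ever!!! Buy now!!!"),
    ("r2", "Worked fine for a month, then the battery degraded."),
    ("r3", "Amazing amazing amazing, five stars, buy buy buy"),
]

def spam_score(text):
    # Placeholder feature: exclamation marks plus repeated-word count.
    words = text.lower().split()
    repeats = len(words) - len(set(words))
    return text.count("!") + repeats

scored = (spark.sparkContext.parallelize(reviews)
          .map(lambda kv: (kv[0], spam_score(kv[1]))))     # runs in parallel per partition
print(scored.collect())
spark.stop()
```
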
2019-02-22
Hu, D., Wang, L., Jiang, W., Zheng, S., Li, B..  2018.  A Novel Image Steganography Method via Deep Convolutional Generative Adversarial Networks. IEEE Access. 6:38303-38314.

The security of image steganography is an important basis for evaluating steganography algorithms. Steganography has recently made great progress in the long-term confrontation with steganalysis. To improve the security of image steganography, steganography must have the ability to resist detection by steganalysis algorithms. Traditional embedding-based steganography embeds the secret information into the content of an image, which unavoidably leaves a trace of the modification that can be detected by increasingly advanced machine-learning-based steganalysis algorithms. The concept of steganography without embedding (SWE), which does not need to modify the data of the carrier image, emerged to evade detection by machine-learning-based steganalysis algorithms. In this paper, we propose a novel image SWE method based on deep convolutional generative adversarial networks. We map the secret information into a noise vector and use the trained generator neural network model to generate the carrier image based on the noise vector. No modification or embedding operations are required during the process of image generation, and the information contained in the image can be extracted successfully by another neural network, called the extractor, after training. The experimental results show that this method has the advantages of highly accurate information extraction and a strong ability to resist detection by state-of-the-art image steganalysis algorithms.
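
A small illustration of the information-to-noise mapping at the heart of an SWE scheme: secret bits select sub-intervals of the generator's latent vector, and the mapping is invertible before any image is generated. The interval coding shown is an assumption, and the DCGAN generator and extractor networks themselves are omitted.

```python
import numpy as np

def bits_to_noise(bits, rng):
    # Map bit 0 to a value in [-1, 0) and bit 1 to a value in [0, 1).
    offsets = rng.uniform(0.0, 1.0, size=len(bits))
    return np.where(np.array(bits) == 1, offsets, offsets - 1.0)

def noise_to_bits(z):
    return (z >= 0.0).astype(int).tolist()

rng = np.random.default_rng(0)
secret = [1, 0, 1, 1, 0, 0, 1, 0]
z = bits_to_noise(secret, rng)       # this z would be fed to the trained generator
print(noise_to_bits(z) == secret)    # True: the secret is recoverable from the noise vector
```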

Neal, T., Sundararajan, K., Woodard, D..  2018.  Exploiting Linguistic Style as a Cognitive Biometric for Continuous Verification. 2018 International Conference on Biometrics (ICB). :270-276.

This paper presents an assessment of continuous verification using linguistic style as a cognitive biometric. In stylometry, it is widely known that linguistic style is highly characteristic of authorship, with representations that capture authorial style at the character, lexical, syntactic, and semantic levels. In this work, we provide a contrast to previous efforts by implementing a one-class classification problem using Isolation Forests. Our approach demonstrates the usefulness of this classifier for accurately verifying the genuine user, and yields recognition accuracy exceeding 98% using very small training samples of 50- and 100-character blocks.
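
A minimal sketch of one-class verification with an Isolation Forest over character n-gram features of short text blocks; the sample texts and feature settings are invented for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import IsolationForest

genuine_blocks = [
    "honestly i think we should refactor this before friday",
    "i think the deadline is fine, honestly, if we split the work",
    "we should honestly just ship it and refactor later",
]
imposter_block = ["URGENT!!! wire the funds immediately per attached invoice"]

vec = TfidfVectorizer(analyzer="char", ngram_range=(2, 3))
X_genuine = vec.fit_transform(genuine_blocks).toarray()

clf = IsolationForest(random_state=0).fit(X_genuine)        # trained only on the genuine user
print(clf.predict(X_genuine))                               # mostly +1 (accepted as genuine)
print(clf.predict(vec.transform(imposter_block).toarray())) # likely -1 (flagged as anomalous)
```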

2019-02-13
Feng, Y., Akiyama, H., Lu, L., Sakurai, K..  2018.  Feature Selection for Machine Learning-Based Early Detection of Distributed Cyber Attacks. 2018 IEEE 16th Intl Conf on Dependable, Autonomic and Secure Computing, 16th Intl Conf on Pervasive Intelligence and Computing, 4th Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress (DASC/PiCom/DataCom/CyberSciTech). :173–180.

It is well known that distributed cyber attacks simultaneously launched from many hosts have caused the most serious problems in recent years, including privacy leakage and denial of service. Thus, how to detect those attacks at an early stage has become an important and urgent topic in the cyber security community. For this purpose, recognizing C&C (Command & Control) communication between compromised bots and the C&C server is crucially important, because C&C communication occurs in the preparation phase of distributed attacks. Although signature-based attack detection has long been applied in practice, it is well known that it cannot efficiently deal with new kinds of attacks. In recent years, ML (Machine Learning)-based detection methods have been studied widely. In those methods, feature selection is obviously very important to the detection performance. We previously utilized up to 55 features to pick out C&C traffic in order to accomplish early detection of DDoS attacks. In this work, we try to answer the question: are all of those features really necessary? We mainly investigate how the detection performance changes as features are removed, starting with those of lowest importance, and we try to clarify which features deserve attention for early detection of distributed attacks. We use honeypot data collected during the period from 2008 to 2013. SVM (Support Vector Machine) and PCA (Principal Component Analysis) are utilized for feature selection, and SVM and RF (Random Forest) are used for building the classifier. We find that detection performance generally improves as more features are utilized. However, after the number of features reaches around 40, the detection performance does not change much even when more features are used. It is also verified that, in some specific cases, more features do not always mean better detection performance. We also discuss the 10 important features which have the biggest influence on classification.
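
A rough sketch of the feature-reduction experiment: rank features by Random Forest importance, then measure accuracy as the least important ones are dropped. The synthetic 55-feature dataset is an assumption standing in for the honeypot traffic features.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.datasets import make_classification

# 55 features, matching the count mentioned in the abstract; contents are synthetic.
X, y = make_classification(n_samples=2000, n_features=55, n_informative=12, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
order = np.argsort(rf.feature_importances_)[::-1]      # most important features first

for k in (10, 20, 30, 40, 55):
    acc = cross_val_score(RandomForestClassifier(n_estimators=200, random_state=0),
                          X[:, order[:k]], y, cv=3).mean()
    print(f"top {k:2d} features -> accuracy {acc:.3f}")
```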

2019-02-08
Wang, M., Zhu, W., Yan, S., Wang, Q..  2018.  SoundAuth: Secure Zero-Effort Two-Factor Authentication Based on Audio Signals. 2018 IEEE Conference on Communications and Network Security (CNS). :1-9.

Two-factor authentication (2FA) popularly works by verifying something the user knows (a password) and something she possesses (a token, commonly instantiated with a smart phone). Conventional 2FA systems require extra interaction, such as typing a verification code, which is not very user-friendly. For improved user experience, recent work aims at zero-effort 2FA, in which a smart phone placed close to a computer (where the user enters her username/password into a browser to log into a server) automatically assists with the authentication. To prove her possession of the smart phone, the user needs to prove the phone is on the login spot, which reduces zero-effort 2FA to co-presence detection. In this paper, we propose SoundAuth, a secure zero-effort 2FA mechanism based on (two kinds of) ambient audio signals. SoundAuth looks for signs of proximity by having the browser and the smart phone compare both their surrounding sounds and certain unpredictable near-ultrasounds; if significant distinguishability is found, SoundAuth rejects the login request. We regard the comparison of ambient signals as a classification problem and employ a machine learning technique to analyze the audio signals. Experiments with real login attempts show that SoundAuth is not only comparable to existing schemes in terms of utility but also outperforms them in resilience to attacks. SoundAuth can be easily deployed as it is readily supported by most smart phones and major browsers.
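
One plausible co-presence feature of the kind such a classifier could use is the peak normalized cross-correlation between the browser's and the phone's ambient recordings; the synthetic signals below stand in for real microphone captures and are not SoundAuth's actual feature set.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000                                                      # 0.25 s of audio at 16 kHz
ambient = rng.standard_normal(n)                              # "room sound" seen by the browser
phone_rec = ambient + 0.2 * rng.standard_normal(n)            # phone in the same room, extra noise
remote_rec = rng.standard_normal(n)                           # attacker's phone in another room

def peak_xcorr(a, b):
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return np.max(np.correlate(a, b, mode="full")) / len(a)

print("same room :", round(peak_xcorr(ambient, phone_rec), 3))   # close to 1
print("other room:", round(peak_xcorr(ambient, remote_rec), 3))  # much smaller
```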

Yousefi, M., Mtetwa, N., Zhang, Y., Tianfield, H..  2018.  A Reinforcement Learning Approach for Attack Graph Analysis. 2018 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/ 12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :212-217.

The attack graph approach is a common tool for the analysis of network security. However, the analysis of attack graphs can be complicated and difficult depending on the attack graph size. This paper presents an approximate analysis approach for attack graphs based on Q-learning. First, we employ multi-host multi-stage vulnerability analysis (MulVAL) to generate an attack graph for a given network topology. Then we refine the attack graph and generate a simplified graph called a transition graph. Next, we use a Q-learning model to find possible attack routes that an attacker could use to compromise the security of the network. Finally, we evaluate the approach by applying it to a typical IT network scenario with specific services, network configurations, and vulnerabilities.
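
A toy illustration of tabular Q-learning on a tiny transition graph; the four-node graph and the reward for reaching the "admin" host are assumptions, not MulVAL output.

```python
import random

# state -> list of reachable next states (a simplified transition graph)
graph = {"internet": ["web"], "web": ["db", "internet"], "db": ["admin"], "admin": []}
GOAL = "admin"

Q = {(s, a): 0.0 for s, nxt in graph.items() for a in nxt}
alpha, gamma, eps = 0.5, 0.9, 0.2

for _ in range(2000):                                # training episodes
    s = "internet"
    while s != GOAL and graph[s]:
        actions = graph[s]
        a = random.choice(actions) if random.random() < eps else \
            max(actions, key=lambda x: Q[(s, x)])
        r = 1.0 if a == GOAL else 0.0                # reward only for reaching the goal host
        future = max((Q[(a, x)] for x in graph[a]), default=0.0)
        Q[(s, a)] += alpha * (r + gamma * future - Q[(s, a)])
        s = a

# Greedy attack route extracted from the learned Q-table:
s, route = "internet", ["internet"]
while s != GOAL:
    s = max(graph[s], key=lambda x: Q[(s, x)])
    route.append(s)
print(" -> ".join(route))    # internet -> web -> db -> admin
```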

Sisiaridis, D., Markowitch, O..  2018.  Reducing Data Complexity in Feature Extraction and Feature Selection for Big Data Security Analytics. 2018 1st International Conference on Data Intelligence and Security (ICDIS). :43-48.

Feature extraction and feature selection are the first tasks in the pre-processing of input logs for detecting cybersecurity threats and attacks using data mining techniques in the field of Artificial Intelligence. When it comes to the analysis of heterogeneous data derived from different sources, these tasks are found to be time-consuming and difficult to manage efficiently. In this paper, we present an approach for handling feature extraction and feature selection utilizing machine learning algorithms for security analytics of heterogeneous data derived from different network sensors. The approach is implemented in Apache Spark, using its Python API, named pyspark.
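
A hedged sketch of a pyspark feature extraction and selection pipeline over raw log lines; the toy log records, column names, and the hashing/chi-squared stages are assumptions, as the paper does not publish its exact pipeline.

```python
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import Tokenizer, HashingTF, ChiSqSelector

spark = SparkSession.builder.appName("security-analytics-features").getOrCreate()
logs = spark.createDataFrame(
    [("failed password for root from 10.0.0.5", 1.0),
     ("accepted publickey for alice from 10.0.0.9", 0.0),
     ("failed password for admin from 10.0.0.5", 1.0),
     ("session opened for user bob", 0.0)],
    ["log", "label"],
)

pipeline = Pipeline(stages=[
    Tokenizer(inputCol="log", outputCol="words"),                   # feature extraction
    HashingTF(inputCol="words", outputCol="raw", numFeatures=256),
    ChiSqSelector(numTopFeatures=10, featuresCol="raw",
                  labelCol="label", outputCol="selected"),          # feature selection
])
features = pipeline.fit(logs).transform(logs)
features.select("log", "selected").show(truncate=False)
spark.stop()
```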