Biblio

List
Filter

Found 2348 results

Filters: Keyword is privacy [Clear All Filters]

2018-08-23

Zhang, Kai, Liu, Chuanren, Zhang, Jie, Xiong, Hui, Xing, Eric, Ye, Jieping. 2017. Randomization or Condensation?: Linear-Cost Matrix Sketching Via Cascaded Compression Sampling Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. :615–623.

Matrix sketching is aimed at finding compact representations of a matrix while simultaneously preserving most of its properties, which is a fundamental building block in modern scientific computing. Randomized algorithms represent state-of-the-art and have attracted huge interest from the fields of machine learning, data mining, and theoretic computer science. However, it still requires the use of the entire input matrix in producing desired factorizations, which can be a major computational and memory bottleneck in truly large problems. In this paper, we uncover an interesting theoretic connection between matrix low-rank decomposition and lossy signal compression, based on which a cascaded compression sampling framework is devised to approximate an m-by-n matrix in only O(m+n) time and space. Indeed, the proposed method accesses only a small number of matrix rows and columns, which significantly improves the memory footprint. Meanwhile, by sequentially teaming two rounds of approximation procedures and upgrading the sampling strategy from a uniform probability to more sophisticated, encoding-orientated sampling, significant algorithmic boosting is achieved to uncover more granular structures in the data. Empirical results on a wide spectrum of real-world, large-scale matrices show that by taking only linear time and space, the accuracy of our method rivals those state-of-the-art randomized algorithms consuming a quadratic, O(mn), amount of resources.

Birch, G. C., Woo, B. L., LaCasse, C. F., Stubbs, J. J., Dagel, A. L.. 2017. Computational optical physical unclonable functions. 2017 International Carnahan Conference on Security Technology (ICCST). :1–6.

Physical unclonable functions (PUFs) are devices which are easily probed but difficult to predict. Optical PUFs have been discussed within the literature, with traditional optical PUFs typically using spatial light modulators, coherent illumination, and scattering volumes; however, these systems can be large, expensive, and difficult to maintain alignment in practical conditions. We propose and demonstrate a new kind of optical PUF based on computational imaging and compressive sensing to address these challenges with traditional optical PUFs. This work describes the design, simulation, and prototyping of this computational optical PUF (COPUF) that utilizes incoherent polychromatic illumination passing through an additively manufactured refracting optical polymer element. We demonstrate the ability to pass information through a COPUF using a variety of sampling methods, including the use of compressive sensing. The sensitivity of the COPUF system is also explored. We explore non-traditional PUF configurations enabled by the COPUF architecture. The double COPUF system, which employees two serially connected COPUFs, is proposed and analyzed as a means to authenticate and communicate between two entities that have previously agreed to communicate. This configuration enables estimation of a message inversion key without the calculation of individual COPUF inversion keys at any point in the PUF life cycle. Our results show that it is possible to construct inexpensive optical PUFs using computational imaging. This could lead to new uses of PUFs in places where electrical PUFs cannot be utilized effectively, as low cost tags and seals, and potentially as authenticating and communicating devices.

Li, Q., Xu, B., Li, S., Liu, Y., Cui, D.. 2017. Reconstruction of measurements in state estimation strategy against cyber attacks for cyber physical systems. 2017 36th Chinese Control Conference (CCC). :7571–7576.

To improve the resilience of state estimation strategy against cyber attacks, the Compressive Sensing (CS) is applied in reconstruction of incomplete measurements for cyber physical systems. First, observability analysis is used to decide the time to run the reconstruction and the damage level from attacks. In particular, the dictionary learning is proposed to form the over-completed dictionary by K-Singular Value Decomposition (K-SVD). Besides, due to the irregularity of incomplete measurements, sampling matrix is designed as the measurement matrix. Finally, the simulation experiments on 6-bus power system illustrate that the proposed method achieves the incomplete measurements reconstruction perfectly, which is better than the joint dictionary. When only 29% available measurements are left, the proposed method has generality for four kinds of recovery algorithms.

Lagunas, E., Rugini, L.. 2017. Performance of compressive sensing based energy detection. 2017 IEEE 28th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC). :1–5.

This paper investigates closed-form expressions to evaluate the performance of the Compressive Sensing (CS) based Energy Detector (ED). The conventional way to approximate the probability density function of the ED test statistic invokes the central limit theorem and considers the decision variable as Gaussian. This approach, however, provides good approximation only if the number of samples is large enough. This is not usually the case in CS framework, where the goal is to keep the sample size low. Moreover, working with a reduced number of measurements is of practical interest for general spectrum sensing in cognitive radio applications, where the sensing time should be sufficiently short since any time spent for sensing cannot be used for data transmission on the detected idle channels. In this paper, we make use of low-complexity approximations based on algebraic transformations of the one-dimensional Gaussian Q-function. More precisely, this paper provides new closed-form expressions for accurate evaluation of the CS-based ED performance as a function of the compressive ratio and the Signal-to-Noise Ratio (SNR). Simulation results demonstrate the increased accuracy of the proposed equations compared to existing works.

Xu, W., Yan, Z., Tian, Y., Cui, Y., Lin, J.. 2017. Detection with compressive measurements corrupted by sparse errors. 2017 9th International Conference on Wireless Communications and Signal Processing (WCSP). :1–5.

Compressed sensing can represent the sparse signal with a small number of measurements compared to Nyquist-rate samples. Considering the high-complexity of reconstruction algorithms in CS, recently compressive detection is proposed, which performs detection directly in compressive domain without reconstruction. Different from existing work that generally considers the measurements corrupted by dense noises, this paper studies the compressive detection problem when the measurements are corrupted by both dense noises and sparse errors. The sparse errors exist in many practical systems, such as the ones affected by impulse noise or narrowband interference. We derive the theoretical performance of compressive detection when the sparse error is either deterministic or random. The theoretical results are further verified by simulations.

Ming, X., Shu, T., Xianzhong, X.. 2017. An energy-efficient wireless image transmission method based on adaptive block compressive sensing and softcast. 2017 International Conference on Security, Pattern Analysis, and Cybernetics (SPAC). :712–717.

With the rapid and radical evolution of information and communication technology, energy consumption for wireless communication is growing at a staggering rate, especially for wireless multimedia communication. Recently, reducing energy consumption in wireless multimedia communication has attracted increasing attention. In this paper, we propose an energy-efficient wireless image transmission scheme based on adaptive block compressive sensing (ABCS) and SoftCast, which is called ABCS-SoftCast. In ABCS-SoftCast, the compression distortion and transmission distortion are considered in a joint manner, and the energy-distortion model is formulated for each image block. Then, the sampling rate (SR) and power allocation factors of each image block are optimized simultaneously. Comparing with conventional SoftCast scheme, experimental results demonstrate that the energy consumption can be greatly reduced even when the receiving image qualities are approximately the same.

Svetinovic, Davor. 2017. Blockchain Engineering for the Internet of Things: Systems Security Perspective. Proceedings of the 3rd ACM International Workshop on IoT Privacy, Trust, and Security. :1–1.

The Internet of Things (IoT) technology has a potential to bring the benefits of intelligently interconnecting not just computers and humans, but most of everyday things. IoT has a promise of opening significant business process improvement opportunities leading to economic growth and cost reductions. However, there are many challenges facing IoT, including significant scalability and security challenges due to the integration of potentially huge number of things into the network. Many of scalability and security issues stem from a centralized, primarily client/server, architecture of IoT systems and frameworks. Blockchain technology, as a relativelly new approach to decentralized computation and assets management and transfer, has a potential to help solve a number of scalability and security issues that IoT is facing, primarilly through the removal of centralized points of failure for such systems. As such, blockchain technology and IoT integration provides a promising direction and it has recently generated significant research interest, e.g., [4]. In this talk, we present our experiences based on our recent project in enhancing security and privacy in decentralized energy trading in smart grids using blockchain, multi-signatures and anonymous messaging streams [1], that has built upon our previous work on Bitcoin-based decentralized carbon emissions trading infrastructure model [2]. In particular, we present the blockchain systems security issues within the context of IoT security and privacy requirements [3]. This is done with the intention of producing an early integrated security model for blockchain-powered IoT systems [5]. The presentation is constrained to the discussion of the architecture-level requirements [6]. Finally, we will present the main opportunity loss if the integration ignores the full realization of the real-world asset transaction paradigm.

Dorri, Ali, Kanhere, Salil S., Jurdak, Raja. 2017. Towards an Optimized BlockChain for IoT. Proceedings of the Second International Conference on Internet-of-Things Design and Implementation. :173–178.

There has been increasing interest in adopting BlockChain (BC), that underpins the crypto-currency Bitcoin, in Internet of Things (IoT) for security and privacy. However, BCs are computationally expensive and involve high bandwidth overhead and delays, which are not suitable for most IoT devices. This paper proposes a lightweight BC-based architecture for IoT that virtually eliminates the overheads of classic BC, while maintaining most of its security and privacy benefits. IoT devices benefit from a private immutable ledger, that acts similar to BC but is managed centrally, to optimize energy consumption. High resource devices create an overlay network to implement a publicly accessible distributed BC that ensures end-to-end security and privacy. The proposed architecture uses distributed trust to reduce the block validation processing time. We explore our approach in a smart home setting as a representative case study for broader IoT applications. Qualitative evaluation of the architecture under common threat models highlights its effectiveness in providing security and privacy for IoT applications. Simulations demonstrate that our method decreases packet and processing overhead significantly compared to the BC implementation used in Bitcoin.

Karvelas, Nikolaos P., Senftleben, Marius, Katzenbeisser, Stefan. 2017. Microblogging in a Privacy-Preserving Way. Proceedings of the 12th International Conference on Availability, Reliability and Security. :48:1–48:6.

Microblogging is a popular activity within the spectrum of Online Social Networking (OSN), which allows users to quicky exchange short messages. Such systems can be based on mobile clients that exchange their group-encrypted messages utilizing local communications such as Bluetooth. Since however in such cases, users do not want to disclose their group memberships, and thus have to wait for other group members to appear in the proximity, the message spread can be slow to non-existent. In this paper, we solve this problem and facilitate a higher message spread by employing a server that stores the messages of multiple groups in an Oblivious RAM (ORAM) data structure. The server can be accessed by the clients on demand to read or write their group-encrypted messages. Thus our solution can be used to add access pattern privacy on top of existing microblogging peer-2-peer architectures, and using an ORAM is a promising candidate to use in the given application scenario.

2018-07-18

Terai, A., Abe, S., Kojima, S., Takano, Y., Koshijima, I.. 2017. Cyber-Attack Detection for Industrial Control System Monitoring with Support Vector Machine Based on Communication Profile. 2017 IEEE European Symposium on Security and Privacy Workshops (EuroS PW). :132–138.

Industrial control systems (ICS) used in industrial plants are vulnerable to cyber-attacks that can cause fatal damage to the plants. Intrusion detection systems (IDSs) monitor ICS network traffic and detect suspicious activities. However, many IDSs overlook sophisticated cyber-attacks because it is hard to make a complete database of cyber-attacks and distinguish operational anomalies when compared to an established baseline. In this paper, a discriminant model between normal and anomalous packets was constructed with a support vector machine (SVM) based on an ICS communication profile, which represents only packet intervals and length, and an IDS with the applied model is proposed. Furthermore, the proposed IDS was evaluated using penetration tests on our cyber security test bed. Although the IDS was constructed by the limited features (intervals and length) of packets, the IDS successfully detected cyber-attacks by monitoring the rate of predicted attacking packets.

2018-06-20

Lee, Y., Choi, S. S., Choi, J., Song, J.. 2017. A Lightweight Malware Classification Method Based on Detection Results of Anti-Virus Software. 2017 12th Asia Joint Conference on Information Security (AsiaJCIS). :5–9.

With the development of cyber threats on the Internet, the number of malware, especially unknown malware, is also dramatically increasing. Since all of malware cannot be analyzed by analysts, it is very important to find out new malware that should be analyzed by them. In order to cope with this issue, the existing approaches focused on malware classification using static or dynamic analysis results of malware. However, the static and the dynamic analyses themselves are also too costly and not easy to build the isolated, secure and Internet-like analysis environments such as sandbox. In this paper, we propose a lightweight malware classification method based on detection results of anti-virus software. Since the proposed method can reduce the volume of malware that should be analyzed by analysts, it can be used as a preprocess for in-depth analysis of malware. The experimental showed that the proposed method succeeded in classification of 1,000 malware samples into 187 unique groups. This means that 81% of the original malware samples do not need to analyze by analysts.

Chakraborty, S., Stokes, J. W., Xiao, L., Zhou, D., Marinescu, M., Thomas, A.. 2017. Hierarchical learning for automated malware classification. MILCOM 2017 - 2017 IEEE Military Communications Conference (MILCOM). :23–28.

Despite widespread use of commercial anti-virus products, the number of malicious files detected on home and corporate computers continues to increase at a significant rate. Recently, anti-virus companies have started investing in machine learning solutions to augment signatures manually designed by analysts. A malicious file's determination is often represented as a hierarchical structure consisting of a type (e.g. Worm, Backdoor), a platform (e.g. Win32, Win64), a family (e.g. Rbot, Rugrat) and a family variant (e.g. A, B). While there has been substantial research in automated malware classification, the aforementioned hierarchical structure, which can provide additional information to the classification models, has been ignored. In this paper, we propose the novel idea and study the performance of employing hierarchical learning algorithms for automated classification of malicious files. To the best of our knowledge, this is the first research effort which incorporates the hierarchical structure of the malware label in its automated classification and in the security domain, in general. It is important to note that our method does not require any additional effort by analysts because they typically assign these hierarchical labels today. Our empirical results on a real world, industrial-scale malware dataset of 3.6 million files demonstrate that incorporation of the label hierarchy achieves a significant reduction of 33.1% in the binary error rate as compared to a non-hierarchical classifier which is traditionally used in such problems.

Luo, J. S., Lo, D. C. T.. 2017. Binary malware image classification using machine learning with local binary pattern. 2017 IEEE International Conference on Big Data (Big Data). :4664–4667.

Malware classification is a critical part in the cyber-security. Traditional methodologies for the malware classification typically use static analysis and dynamic analysis to identify malware. In this paper, a malware classification methodology based on its binary image and extracting local binary pattern (LBP) features is proposed. First, malware images are reorganized into 3 by 3 grids which is mainly used to extract LBP feature. Second, the LBP is implemented on the malware images to extract features in that it is useful in pattern or texture classification. Finally, Tensorflow, a library for machine learning, is applied to classify malware images with the LBP feature. Performance comparison results among different classifiers with different image descriptors such as GIST, a spatial envelop, and the LBP demonstrate that our proposed approach outperforms others.

Kosmidis, Konstantinos, Kalloniatis, Christos. 2017. Machine Learning and Images for Malware Detection and Classification. Proceedings of the 21st Pan-Hellenic Conference on Informatics. :5:1–5:6.

Detecting malicious code with exact match on collected datasets is becoming a large-scale identification problem due to the existence of new malware variants. Being able to promptly and accurately identify new attacks enables security experts to respond effectively. My proposal is to develop an automated framework for identification of unknown vulnerabilities by leveraging current neural network techniques. This has a significant and immediate value for the security field, as current anti-virus software is typically able to recognize the malware type only after its infection, and preventive measures are limited. Artificial Intelligence plays a major role in automatic malware classification: numerous machine-learning methods, both supervised and unsupervised, have been researched to try classifying malware into families based on features acquired by static and dynamic analysis. The value of automated identification is clear, as feature engineering is both a time-consuming and time-sensitive task, with new malware studied while being observed in the wild.

Sethi, Kamalakanta, Chaudhary, Shankar Kumar, Tripathy, Bata Krishan, Bera, Padmalochan. 2017. A Novel Malware Analysis for Malware Detection and Classification Using Machine Learning Algorithms. Proceedings of the 10th International Conference on Security of Information and Networks. :107–113.

Nowadays, Malware has become a serious threat to the digitization of the world due to the emergence of various new and complex malware every day. Due to this, the traditional signature-based methods for detection of malware effectively becomes an obsolete method. The efficiency of the machine learning model in context to the detection of malware files has been proved by different researches and studies. In this paper, a framework has been developed to detect and classify different files (e.g exe, pdf, php, etc.) as benign and malicious using two level classifier namely, Macro (for detection of malware) and Micro (for classification of malware files as a Trojan, Spyware, Adware, etc.). Cuckoo Sandbox is used for generating static and dynamic analysis report by executing files in the virtual environment. In addition, a novel model is developed for extracting features based on static, behavioral and network analysis using analysis report generated by the Cuckoo Sandbox. Weka Framework is used to develop machine learning models by using training datasets. The experimental results using proposed framework shows high detection rate with an accuracy of 100% using J48 Decision tree model, 99% using SMO (Sequential Minimal Optimization) and 97% using Random Forest tree. It also shows effective classification rate with accuracy 100% using J48 Decision tree, 91% using SMO and 66% using Random Forest tree. These results are used for detecting and classifying unknown files as benign or malicious.

Wang, Qinglong, Guo, Wenbo, Zhang, Kaixuan, Ororbia, II, Alexander G., Xing, Xinyu, Liu, Xue, Giles, C. Lee. 2017. Adversary Resistant Deep Neural Networks with an Application to Malware Detection. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. :1145–1153.

Outside the highly publicized victories in the game of Go, there have been numerous successful applications of deep learning in the fields of information retrieval, computer vision, and speech recognition. In cybersecurity, an increasing number of companies have begun exploring the use of deep learning (DL) in a variety of security tasks with malware detection among the more popular. These companies claim that deep neural networks (DNNs) could help turn the tide in the war against malware infection. However, DNNs are vulnerable to adversarial samples, a shortcoming that plagues most, if not all, statistical and machine learning models. Recent research has demonstrated that those with malicious intent can easily circumvent deep learning-powered malware detection by exploiting this weakness. To address this problem, previous work developed defense mechanisms that are based on augmenting training data or enhancing model complexity. However, after analyzing DNN susceptibility to adversarial samples, we discover that the current defense mechanisms are limited and, more importantly, cannot provide theoretical guarantees of robustness against adversarial sampled-based attacks. As such, we propose a new adversary resistant technique that obstructs attackers from constructing impactful adversarial samples by randomly nullifying features within data vectors. Our proposed technique is evaluated on a real world dataset with 14,679 malware variants and 17,399 benign programs. We theoretically validate the robustness of our technique, and empirically show that our technique significantly boosts DNN robustness to adversarial samples while maintaining high accuracy in classification. To demonstrate the general applicability of our proposed method, we also conduct experiments using the MNIST and CIFAR-10 datasets, widely used in image recognition research.

Hassen, Mehadi, Chan, Philip K.. 2017. Scalable Function Call Graph-based Malware Classification. Proceedings of the Seventh ACM on Conference on Data and Application Security and Privacy. :239–248.

In an attempt to preserve the structural information in malware binaries during feature extraction, function call graph-based features have been used in various research works in malware classification. However, the approach usually employed when performing classification on these graphs, is based on computing graph similarity using computationally intensive techniques. Due to this, much of the previous work in this area incurred large performance overhead and does not scale well. In this paper, we propose a linear time function call graph (FCG) vector representation based on function clustering that has significant performance gains in addition to improved classification accuracy. We also show how this representation can enable using graph features together with other non-graph features.

Kebede, T. M., Djaneye-Boundjou, O., Narayanan, B. N., Ralescu, A., Kapp, D.. 2017. Classification of Malware programs using autoencoders based deep learning architecture and its application to the microsoft malware Classification challenge (BIG 2015) dataset. 2017 IEEE National Aerospace and Electronics Conference (NAECON). :70–75.

Distinguishing and classifying different types of malware is important to better understanding how they can infect computers and devices, the threat level they pose and how to protect against them. In this paper, a system for classifying malware programs is presented. The paper describes the architecture of the system and assesses its performance on a publicly available database (provided by Microsoft for the Microsoft Malware Classification Challenge BIG2015) to serve as a benchmark for future research efforts. First, the malicious programs are preprocessed such that they are visualized as gray scale images. We then make use of an architecture comprised of multiple layers (multiple levels of encoding) to carry out the classification process of those images/programs. We compare the performance of this approach against traditional machine learning and pattern recognition algorithms. Our experimental results show that the deep learning architecture yields a boost in performance over those conventional/standard algorithms. A hold-out validation analysis using the superior architecture shows an accuracy in the order of 99.15%.

Hassen, M., Carvalho, M. M., Chan, P. K.. 2017. Malware classification using static analysis based features. 2017 IEEE Symposium Series on Computational Intelligence (SSCI). :1–7.

Anti-virus vendors receive hundreds of thousands of malware to be analysed each day. Some are new malware while others are variations or evolutions of existing malware. Because analyzing each malware sample by hand is impossible, automated techniques to analyse and categorize incoming samples are needed. In this work, we explore various machine learning features extracted from malware samples through static analysis for classification of malware binaries into already known malware families. We present a new feature based on control statement shingling that has a comparable accuracy to ordinary opcode n-gram based features while requiring smaller dimensions. This, in turn, results in a shorter training time.

Pranamulia, R., Asnar, Y., Perdana, R. S.. 2017. Profile hidden Markov model for malware classification \#x2014; usage of system call sequence for malware classification. 2017 International Conference on Data and Software Engineering (ICoDSE). :1–5.

Malware technology makes it difficult for malware analyst to detect same malware files with different obfuscation technique. In this paper we are trying to tackle that problem by analyzing the sequence of system call from an executable file. Malware files which actually are the same should have almost identical or at least a similar sequence of system calls. In this paper, we are going to create a model for each malware class consists of malwares from different families based on its sequence of system calls. Method/algorithm that's used in this paper is profile hidden markov model which is a very well-known tool in the biological informatics field for comparing DNA and protein sequences. Malware classes that we are going to build are trojan and worm class. Accuracy for these classes are pretty high, it's above 90% with also a high false positive rate around 37%.

Ren, Z., Chen, G.. 2017. EntropyVis: Malware classification. 2017 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI). :1–6.

Malware writers often develop malware with automated measures, so the number of malware has increased dramatically. Automated measures tend to repeatedly use significant modules, which form the basis for identifying malware variants and discriminating malware families. Thus, we propose a novel visualization analysis method for researching malware similarity. This method converts malicious Windows Portable Executable (PE) files into local entropy images for observing internal features of malware, and then normalizes local entropy images into entropy pixel images for malware classification. We take advantage of the Jaccard index to measure similarities between entropy pixel images and the k-Nearest Neighbor (kNN) classification algorithm to assign entropy pixel images to different malware families. Preliminary experimental results show that our visualization method can discriminate malware families effectively.

Aslanyan, H., Avetisyan, A., Arutunian, M., Keropyan, G., Kurmangaleev, S., Vardanyan, V.. 2017. Scalable Framework for Accurate Binary Code Comparison. 2017 Ivannikov ISPRAS Open Conference (ISPRAS). :34–38.

Comparison of two binary files has many practical applications: the ability to detect programmatic changes between two versions, the ability to find old versions of statically linked libraries to prevent the use of well-known bugs, malware analysis, etc. In this article, a framework for comparison of binary files is presented. Framework uses IdaPro [1] disassembler and Binnavi [2] platform to recover structure of the target program and represent it as a call graph (CG). A program dependence graph (PDG) corresponds to each vertex of the CG. The proposed comparison algorithm consists of two main stages. At the first stage, several heuristics are applied to find the exact matches. Two functions are matched if at least one of the calculated heuristics is the same and unique in both binaries. At the second stage, backward and forward slicing is applied on matched vertices of CG to find further matches. According to empiric results heuristic method is effective and has high matching quality for unchanged or slightly modified functions. As a contradiction, to match heavily modified functions, binary code clone detection is used and it is based on finding maximum common subgraph for pair of PDGs. To achieve high performance on extensive binaries, the whole matching process is parallelized. The framework is tested on the number of real world libraries, such as python, openssh, openssl, libxml2, rsync, php, etc. Results show that in most cases more than 95% functions are truly matched. The tool is scalable due to parallelization of functions matching process and generation of PDGs and CGs.

Jiao, L., Yin, H., Guo, D., Lyu, Y.. 2017. Heterogeneous Malware Spread Process in Star Network. 2017 IEEE 37th International Conference on Distributed Computing Systems Workshops (ICDCSW). :265–269.

The heterogeneous SIS model for virus spread in any finite size graph characterizes the influence of factors of SIS model and could be analyzed by the extended N-Intertwined model introduced in [1]. We specifically focus on the heterogeneous virus spread in the star network in this paper. The epidemic threshold and the average meta-stable state fraction of infected nodes are derived for virus spread in the star network. Our results illustrate the effect of the factors of SIS model on the steady state infection.

Searles, R., Xu, L., Killian, W., Vanderbruggen, T., Forren, T., Howe, J., Pearson, Z., Shannon, C., Simmons, J., Cavazos, J.. 2017. Parallelization of Machine Learning Applied to Call Graphs of Binaries for Malware Detection. 2017 25th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP). :69–77.

Malicious applications have become increasingly numerous. This demands adaptive, learning-based techniques for constructing malware detection engines, instead of the traditional manual-based strategies. Prior work in learning-based malware detection engines primarily focuses on dynamic trace analysis and byte-level n-grams. Our approach in this paper differs in that we use compiler intermediate representations, i.e., the callgraph representation of binaries. Using graph-based program representations for learning provides structure of the program, which can be used to learn more advanced patterns. We use the Shortest Path Graph Kernel (SPGK) to identify similarities between call graphs extracted from binaries. The output similarity matrix is fed into a Support Vector Machine (SVM) algorithm to construct highly-accurate models to predict whether a binary is malicious or not. However, SPGK is computationally expensive due to the size of the input graphs. Therefore, we evaluate different parallelization methods for CPUs and GPUs to speed up this kernel, allowing us to continuously construct up-to-date models in a timely manner. Our hybrid implementation, which leverages both CPU and GPU, yields the best performance, achieving up to a 14.2x improvement over our already optimized OpenMP version. We compared our generated graph-based models to previously state-of-the-art feature vector 2-gram and 3-gram models on a dataset consisting of over 22,000 binaries. We show that our classification accuracy using graphs is over 19% higher than either n-gram model and gives a false positive rate (FPR) of less than 0.1%. We are also able to consider large call graphs and dataset sizes because of the reduced execution time of our parallelized SPGK implementation.

Acarali, D., Rajarajan, M., Komninos, N., Herwono, I.. 2017. Event graphs for the observation of botnet traffic. 2017 8th IEEE Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON). :628–634.

Botnets are a growing threat to the security of data and services on a global level. They exploit vulnerabilities in networks and host machines to harvest sensitive information, or make use of network resources such as memory or bandwidth in cyber-crime campaigns. Bot programs by nature are largely automated and systematic, and this is often used to detect them. In this paper, we extend upon existing work in this area by proposing a network event correlation method to produce graphs of flows generated by botnets, outlining the implementation and functionality of this approach. We also show how this method can be combined with statistical flow-based analysis to provide a descriptive chain of events, and test on public datasets with an overall success rate of 94.1%.