Biblio

Found 377 results

Filters: Keyword is invasive software
2021-02-23
Shah, A., Clachar, S., Minimair, M., Cook, D..  2020.  Building Multiclass Classification Baselines for Anomaly-based Network Intrusion Detection Systems. 2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA). :759—760.
This paper showcases multiclass classification baselines using different machine learning algorithms and neural networks for distinguishing legitimate network traffic from direct and obfuscated network intrusions. This research derives its baselines from the Advanced Security Network Metrics & Tunneling Obfuscations dataset. The dataset captured legitimate and obfuscated malicious TCP communications on selected vulnerable network services. The multiclass classification NIDS is able to distinguish obfuscated and direct network intrusions with up to 95% accuracy.
2021-02-16
Khoury, J., Nassar, M..  2020.  A Hybrid Game Theory and Reinforcement Learning Approach for Cyber-Physical Systems Security. NOMS 2020 - 2020 IEEE/IFIP Network Operations and Management Symposium. :1—9.
Cyber-Physical Systems (CPS) are monitored and controlled by Supervisory Control and Data Acquisition (SCADA) systems that use advanced computing, sensors, control systems, and communication networks. At first, CPS and SCADA systems were protected and secured by isolation. However, with recent industrial technology advances, the increased connectivity of CPSs and SCADA systems to enterprise networks has exposed them to new cybersecurity threats and made them a primary target for cyber-attacks with the potential of causing catastrophic economic, social, and environmental damage. Recent research focuses on new methodologies for risk modeling and assessment using game theory and reinforcement learning. This paper proposes to frame CPS security on two different levels, strategic and battlefield, by combining ideas from game theory and Multi-Agent Reinforcement Learning (MARL). The strategic level is modeled as an imperfect-information, extensive-form game. Here, the human administrator and the malware author decide on the strategies of defense and attack, respectively. At the battlefield level, strategies are implemented by machine learning agents that derive optimal policies for run-time decisions. The outcomes of these policies manifest as the utility at a higher level, where we aim to reach a Nash Equilibrium (NE) in favor of the defender. We simulate the scenario of a virus spreading in the context of a CPS network. We present experiments using the MiniCPS simulator and the OpenAI Gym toolkit and discuss the results.
Sumantra, I., Gandhi, S. Indira.  2020.  DDoS attack Detection and Mitigation in Software Defined Networks. 2020 International Conference on System, Computation, Automation and Networking (ICSCAN). :1—5.
This work aims to formulate an effective scheme that can detect and mitigate Distributed Denial of Service (DDoS) attacks in Software Defined Networks. DDoS attacks are among the most destructive attacks on the internet. Whenever you hear of a website being hacked, it has probably been the victim of a DDoS attack. A DDoS attack aims to disrupt the normal operation of a system by making services and resources unavailable to legitimate users, overloading the system with excessive superfluous traffic from distributed sources. The distributed set of compromised hosts that performs the attack is referred to as a botnet. Software Defined Networking (SDN), an emerging technology, offers a solution to reduce network management complexity. It separates the control plane from the data plane. This decoupling provides centralized control of the network with programmability and flexibility. This work harnesses the programmability and centralized control of SDN to measure the randomness of the network flow data. This statistical approach uses the source IP addresses in the network and various attributes of TCP flags and calculates entropy from them. The proposed technique can detect volume-based and application-based DDoS attacks such as TCP SYN flood, Ping flood and Slow HTTP attacks. The methodology is evaluated through emulation using Mininet, and the detection and mitigation strategies are implemented in the POX controller. The experimental results show that the proposed method improves performance evaluation parameters including the attack detection time, the delay to serve a legitimate request in the presence of an attacker, and overall CPU utilization.
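As an illustration of the entropy calculation this abstract mentions, here is a minimal sketch of Shannon entropy over the source IPs in one traffic window. It is not the authors' implementation: the function names, the sample windows, and the tolerance value are all hypothetical, and whether an attack raises or lowers entropy depends on the attack type (a single-source flood lowers source-IP entropy, while spoofed sources tend to raise it).

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def deviates_from_baseline(baseline, observed, tolerance):
    """True when the observed entropy differs from the baseline by more than tolerance."""
    return abs(observed - baseline) > tolerance


# Hypothetical windows of source IPs (assumed data, not from the paper).
normal_window = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"] * 5
flood_window = ["10.0.0.9"] * 19 + ["10.0.0.1"]

baseline = shannon_entropy(normal_window)  # four equally likely sources
observed = shannon_entropy(flood_window)   # one source dominates the window

# The tolerance of 1.0 bit is an arbitrary illustrative threshold.
flagged = deviates_from_baseline(baseline, observed, tolerance=1.0)
```

In practice the baseline and tolerance would have to be calibrated on traffic from the network being protected; the same function could be applied to TCP flag values instead of source IPs.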
2021-02-10
Aktepe, S., Varol, C., Shashidhar, N..  2020.  MiNo: The Chrome Web Browser Add-on Application to Block the Hidden Cryptocurrency Mining Activities. 2020 8th International Symposium on Digital Forensics and Security (ISDFS). :1—5.

Cryptocurrencies are digital currencies designed to replace regular cash, and they have increasingly become part of our daily lives, especially over the last couple of years. Mining cryptocurrencies is one of the popular ways to obtain them and make a profit from their unstable market values. This attracts attackers to use malware on internet users' computer resources, a practice known as cryptojacking, to mine cryptocurrencies. Cryptojacking has become a major issue in the internet world. To address it, we developed MiNo, a web browser add-on application that detects these malicious mining activities running without the user's permission or knowledge. This add-on provides security and efficiency for the computer resources of internet users. MiNo is designed and developed with double-layer protection, which puts it ahead of its competitors in the market.

Tanana, D..  2020.  Behavior-Based Detection of Cryptojacking Malware. 2020 Ural Symposium on Biomedical Engineering, Radioelectronics and Information Technology (USBEREIT). :0543—0545.
With the rise of cryptocurrency popularity and value, more and more cybercriminals seek to profit using this new technology. The most common ways to obtain illegitimate profit using cryptocurrencies are ransomware and cryptojacking, also known as malicious mining. While ransomware is a well-known and well-studied threat that is obvious by design, cryptojacking is often neglected because it is less harmful and much harder to detect. This article considers the question of cryptojacking detection. A brief history and definition of cryptojacking are given, as well as reasons for designing a custom detection technique. We also propose a complex detection technique based on the CPU load of an application, which can be applied to both browser-based and executable-type cryptojacking samples. A prototype detection program based on our technique was designed using a decision tree algorithm. The program was tested in a controlled virtual machine environment and achieved an 82% success rate against a selected number of cryptojacking samples. Finally, we discuss generalization of the proposed technique for future work.
2021-02-08
Zhang, J..  2020.  DeepMal: A CNN-LSTM Model for Malware Detection Based on Dynamic Semantic Behaviours. 2020 International Conference on Computer Information and Big Data Applications (CIBDA). :313–316.
Malware refers to any software accessing or being installed in a system without the authorisation of administrators. Various malware has been widely used by cyber-criminals to accomplish their evil intentions and goals. To combat the increasing amount of malicious programs and reduce the threat they pose, a novel deep learning framework is proposed that borrows NLP techniques and combines CNN and LSTM neurones to capture local spatial correlations and learn from sequential long-term dependencies. Hence, high-level abstractions and representations are automatically extracted for the malware classification task. The classification accuracy improves from 0.81 (the best result, obtained by Random Forest) to approximately 1.0.
2021-02-03
Ceron, J. M., Scholten, C., Pras, A., Santanna, J..  2020.  MikroTik Devices Landscape, Realistic Honeypots, and Automated Attack Classification. NOMS 2020 - 2020 IEEE/IFIP Network Operations and Management Symposium. :1—9.

In 2018, several malware campaigns targeted and succeeded in infecting millions of low-cost routers (e.g., the malware VPN-Filter, Navidade, and SonarDNS). These routers were then used for all sorts of cybercrime, from DDoS attacks to ransomware. MikroTik routers are a peculiar example of low-cost routers. They are used both to provide last-mile access to home users and in core network infrastructure. Half of the core routers used in one of the biggest Internet exchanges in the world are MikroTik devices. The problem is that the vulnerable firmware (RouterOS) used in home users' houses is also used in core networks. In this paper, we are the first to quantify the problem that infecting MikroTik devices would pose to the Internet. Based on more than 4 TB of data, we reveal more than 4 million MikroTik devices in the world. Then, we propose an easy-to-deploy MikroTik honeypot and collect more than 17 million packets, in 45 days, from sensors deployed in Australia, Brazil, China, India, the Netherlands, and the United States. Finally, we use the collected data from our honeypots to automatically classify and assess attacks tailored to MikroTik devices. All our source code and analysis are publicly available. We believe that our honeypots and our findings in this paper foster security improvements in MikroTik devices worldwide.

Devi, B. T., Shitharth, S., Jabbar, M. A..  2020.  An Appraisal over Intrusion Detection Systems in Cloud Computing Security Attacks. 2020 2nd International Conference on Innovative Mechanisms for Industry Applications (ICIMIA). :722—727.

Cloud computing provides many groundbreaking advantages over native computing servers, such as improved capacity and decreased costs, but it also carries many security issues. In this paper, we identify the feasible security attacks on cloud computing, including Wrapping, Browser Malware-Injection and Flooding attacks, as well as problems caused by accountability checking. We have also analyzed the honey pot attack and its procedural way of intruding into the system. Overall, this paper deals with the most common security breaches in cloud computing and then with the honey pot in particular, to analyze its intrusion method. Our main scope is to analyze overall security in the cloud and then take up a particular attack to deal with at a granular level. The honey pot is one such attack that is taken into account, and its intrusion policies are analyzed. A specific honey pot algorithm is planned as a future extension of this project.

2021-01-25
Zhang, J., Ji, X., Xu, W., Chen, Y.-C., Tang, Y., Qu, G..  2020.  MagView: A Distributed Magnetic Covert Channel via Video Encoding and Decoding. IEEE INFOCOM 2020 - IEEE Conference on Computer Communications. :357—366.

Air-gapped networks achieve security by using physical isolation to keep the computers and network from the Internet. However, magnetic covert channels based on CPU utilization have been proposed to help secret data escape the Faraday cage and the air gap. Despite the success of such covert channels, they suffer from the high risk of being detected by the transmitter computer and the challenge of installing malware on such a computer. In this paper, we propose MagView, a distributed magnetic covert channel, where sensitive information is embedded in other data such as video and can be transmitted over the air-gapped internal network. When any computer uses the data, such as by playing the video, the sensitive information will leak through the magnetic covert channel. The "separation" of information embedding and leaking, combined with the fact that the covert channel can be created on any computer, overcomes these limitations. We demonstrate that CPU utilization for video decoding can be effectively controlled by changing the video frame type and reducing the quantization parameter without video quality degradation. We prototype MagView and achieve up to 8.9 bps throughput with BER as low as 0.0057. Experiments under different environments are conducted to show the robustness of MagView. Limitations and possible countermeasures are also discussed.

Guri, M..  2020.  CD-LEAK: Leaking Secrets from Audioless Air-Gapped Computers Using Covert Acoustic Signals from CD/DVD Drives. 2020 IEEE 44th Annual Computers, Software, and Applications Conference (COMPSAC). :808—816.

Air-gapped networks are isolated from the Internet, since they store and process sensitive information. It has been shown that attackers can exfiltrate data from air-gapped networks by sending acoustic signals generated by computer speakers, however this type of covert channel relies on the existence of loudspeakers in the air-gapped environment. In this paper, we present CD-LEAK - a novel acoustic covert channel that works in constrained environments where loudspeakers are not available to the attacker. Malware installed on a compromised computer can maliciously generate acoustic signals via the optical CD/DVD drives. Binary information can then be modulated over the acoustic signals and be picked up by a nearby Internet connected receiver (e.g., a workstation, hidden microphone, smartphone, laptop, etc.). We examine CD/DVD drives and discuss their acoustical characteristics. We also present signal generation and detection, and data modulation and demodulation algorithms. Based on our proposed method, we developed a transmitter and receiver for PCs and smartphones, and provide the design and implementation details. We examine the channel and evaluate it on various optical drives. We also provide a set of countermeasures against this threat - which has been overlooked.

2021-01-22
Mani, G., Pasumarti, V., Bhargava, B., Vora, F. T., MacDonald, J., King, J., Kobes, J..  2020.  DeCrypto Pro: Deep Learning Based Cryptomining Malware Detection Using Performance Counters. 2020 IEEE International Conference on Autonomic Computing and Self-Organizing Systems (ACSOS). :109—118.
Autonomy in cybersystems depends on their ability to be self-aware by understanding the intent of services and applications that are running on those systems. In the case of mission-critical cybersystems that are deployed in dynamic and unpredictable environments, the newly integrated unknown applications or services can either be benign and essential for the mission or they can be cyberattacks. In some cases, these cyberattacks are evasive Advanced Persistent Threats (APTs) where the attackers remain undetected for reconnaissance in order to ascertain system features for an attack, e.g., Trojan Laziok. In other cases, the attackers can use the system only for computing, e.g., cryptomining malware. APTs such as cryptomining malware neither disrupt normal system functionalities nor trigger any warning signs because they simply perform bitwise and cryptographic operations as any other benign compression or encoding application. Thus, it is difficult for defense mechanisms such as antivirus applications to detect these attacks. In this paper, we propose an Operating Context profiling system based on deep neural networks, specifically Long Short-Term Memory (LSTM) networks, using Windows Performance Counters data for detecting these evasive cryptomining applications. In addition, we propose Deep Cryptomining Profiler (DeCrypto Pro), a detection system with a novel model selection framework containing a utility function that can select a classification model for behavior profiling from both the light-weight machine learning models (Random Forest and k-Nearest Neighbors) and a deep learning model (LSTM), depending on available computing resources. Given data from performance counters, we show that individual models perform with high accuracy and can be trained with limited training data. We also show that the DeCrypto Profiler framework reduces the use of computational resources and accurately detects cryptomining applications by selecting an appropriate model, given the constraints such as data sample size and system configuration.
2021-01-18
Naik, N., Jenkins, P., Savage, N., Yang, L., Naik, K., Song, J..  2020.  Embedding Fuzzy Rules with YARA Rules for Performance Optimisation of Malware Analysis. 2020 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). :1–7.
YARA rules utilises string or pattern matching to perform malware analysis and is one of the most effective methods in use today. However, its effectiveness is dependent on the quality and quantity of YARA rules employed in the analysis. This can be managed through the rule optimisation process, although, this may not necessarily guarantee effective utilisation of YARA rules and its generated findings during its execution phase, as the main focus of YARA rules is in determining whether to trigger a rule or not, for a suspect sample after examining its rule condition. YARA rule conditions are Boolean expressions, mostly focused on the binary outcome of the malware analysis, which may limit the optimised use of YARA rules and its findings despite generating significant information during the execution phase. Therefore, this paper proposes embedding fuzzy rules with YARA rules to optimise its performance during the execution phase. Fuzzy rules can manage imprecise and incomplete data and encompass a broad range of conditions, which may not be possible in Boolean logic. This embedding may be more advantageous when the YARA rules become more complex, resulting in multiple complex conditions, which may not be processed efficiently utilising Boolean expressions alone, thus compromising effective decision-making. This proposed embedded approach is applied on a collected malware corpus and is tested against the standard and enhanced YARA rules to demonstrate its success.
Naik, N., Jenkins, P., Savage, N., Yang, L., Boongoen, T., Iam-On, N..  2020.  Fuzzy-Import Hashing: A Malware Analysis Approach. 2020 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). :1–8.
Malware has remained a consistent threat since its emergence, growing into a plethora of types and in large numbers. In recent years, numerous new malware variants have enabled the identification of new attack surfaces and vectors, and have become a major challenge to security experts, driving the enhancement and development of new malware analysis techniques to contain the contagion. One of the preliminary steps of malware analysis is to remove the abundance of counterfeit malware samples from the large collection of suspicious samples. This process assists in the management of man and machine resources effectively in the analysis of both unknown and likely malware samples. Hashing techniques are one of the fastest and efficient techniques for performing this preliminary analysis such as fuzzy hashing and import hashing. However, both hashing methods have their limitations and they may not be effective on their own, instead the combination of two distinctive methods may assist in improving the detection accuracy and overall performance of the analysis. This paper proposes a Fuzzy-Import hashing technique which is the combination of fuzzy hashing and import hashing to improve the detection accuracy and overall performance of malware analysis. This proposed Fuzzy-Import hashing offers several benefits which are demonstrated through the experimentation performed on the collected malware samples and compared against stand-alone techniques of fuzzy hashing and import hashing.
2021-01-11
Rajapkar, A., Binnar, P., Kazi, F..  2020.  Design of Intrusion Prevention System for OT Networks Using Deep Neural Networks. 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT). :1–6.

Automation industries that use Supervisory Control and Data Acquisition (SCADA) systems are highly vulnerable to network threats. Systems that are air-gapped and isolated from the internet are highly affected by insider attacks such as spoofing, DoS and malware threats, which affect the confidentiality, integrity and availability of Operational Technology (OT) system elements and degrade their performance even when security measures are taken. In this paper, a behavior-based intrusion prevention system (IPS) is designed for OT networks. The proposed system is implemented on a SCADA test bed with two systems that replicate automation scenarios in industry. This paper describes 4 main classes of cyber-attacks against SCADA systems, with their subclasses, and the methodology, including the design of the IPS components, database creation, baselines and deployment of the system in the environment. The IPS identifies not only IT protocols but also the Industrial Control System (ICS) protocols Modbus and DNP3, with their inside communication fields, using deep packet inspection (DPI). The analytical results show 99.89% accuracy on binary classification and 97.95% accuracy on multiclass classification of different attack vectors performed on the network, with a low false positive rate. These results are also validated by actual deployment of the IPS in SCADA systems with the prevention of a DoS attack.

2020-12-14
Dong, D., Ye, Z., Su, J., Xie, S., Cao, Y., Kochan, R..  2020.  A Malware Detection Method Based on Improved Fireworks Algorithm and Support Vector Machine. 2020 IEEE 15th International Conference on Advanced Trends in Radioelectronics, Telecommunications and Computer Engineering (TCSET). :846–851.
The increase in malware has presented a serious threat to the security of computer systems in recent years. Traditional signature-based anti-virus systems are not able to detect metamorphic and previously unseen malware, which has inspired people to use machine learning methods such as Naive Bayes and Decision Tree to identify malicious executables. Among these methods, detecting malware with a Support Vector Machine (SVM) is one of the most effective approaches. However, the parameters of the SVM have a serious impact on its classification performance. In order to find the optimal parameter combination and avoid falling into a local optimum, many methods based on evolutionary algorithms have been proposed, including Particle Swarm Optimization (PSO), Genetic Algorithm (GA), Differential Evolution (DE) and others. But these algorithms still face the problem of being trapped in local solution spaces to different degrees. In this paper, an improved fireworks algorithm is presented and applied to search for the SVM parameters: penalty factor c and kernel function parameter g. To assess the performance of the proposed algorithm, numerical experiments are carried out and compared with some typical algorithms; the experimental results demonstrate that it outperforms the other algorithms.
2020-12-11
Phu, T. N., Hoang, L., Toan, N. N., Tho, N. Dai, Binh, N. N..  2019.  C500-CFG: A Novel Algorithm to Extract Control Flow-based Features for IoT Malware Detection. 2019 19th International Symposium on Communications and Information Technologies (ISCIT). :568—573.

The static characteristic extraction method based on Control flow-based features proposed by Ding is able to detect malicious code with higher accuracy than traditional Text-based methods. However, this method has to solve an NP-hard problem in a graph, so it is not feasible for large, high-complexity programs. We therefore propose the C500-CFG algorithm for Control flow-based features, based on the idea of dynamic programming, solving Ding's NP-hard problem in O(N^2) time complexity, where N is the number of basic blocks in the decompiled executable code. Our algorithm is more efficient and more effective in detecting malware than Ding's algorithm: fast processing time, allowing processing large files, using less memory and extracting more feature information. Applying our algorithm to IoT data sets gives outstanding results on 2 measures: Accuracy = 99.34%

Slawinski, M., Wortman, A..  2019.  Applications of Graph Integration to Function Comparison and Malware Classification. 2019 4th International Conference on System Reliability and Safety (ICSRS). :16—24.

We classify .NET files as either benign or malicious by examining directed graphs derived from the set of functions comprising the given file. Each graph is viewed probabilistically as a Markov chain where each node represents a code block of the corresponding function, and by computing the PageRank vector (Perron vector with transport), a probability measure can be defined over the nodes of the given graph. Each graph is vectorized by computing Lebesgue antiderivatives of hand-engineered functions defined on the vertex set of the given graph against the PageRank measure. Files are subsequently vectorized by aggregating the set of vectors corresponding to the set of graphs resulting from decompiling the given file. The result is a fast, intuitive, and easy-to-compute glass-box vectorization scheme, which can be leveraged for training a standalone classifier or to augment an existing feature space. We refer to this vectorization technique as PageRank Measure Integration Vectorization (PMIV). We demonstrate the efficacy of PMIV by training a vanilla random forest on 2.5 million samples of decompiled .NET, evenly split between benign and malicious, from our in-house corpus and compare this model to a baseline model which leverages a text-only feature space. The median time needed for decompilation and scoring was 24ms. Code available at https://github.com/gtownrocks/grafuple.
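To make the "PageRank measure" in this abstract concrete, the following is a minimal sketch that computes PageRank by power iteration on a small directed graph and then integrates a node function against the resulting measure, which on a finite vertex set reduces to a weighted sum. It is not the PMIV implementation from the paper: the example graph, the function names, the damping factor of 0.85, the uniform handling of dangling nodes, and the choice of out-degree as the node function are all assumptions made for illustration.

```python
def pagerank(adjacency, damping=0.85, iterations=100):
    """PageRank by power iteration.

    `adjacency` maps every node to a list of its successors; every
    successor must itself appear as a key. Returns a dict of node -> rank
    whose values sum to 1, so it can be read as a probability measure.
    """
    nodes = list(adjacency)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Teleportation ("transport") mass is spread uniformly.
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            successors = adjacency[v]
            if successors:
                share = damping * rank[v] / len(successors)
                for w in successors:
                    new_rank[w] += share
            else:
                # Dangling node: redistribute its mass uniformly (an assumption).
                share = damping * rank[v] / n
                for w in nodes:
                    new_rank[w] += share
        rank = new_rank
    return rank


def integrate(f, measure):
    """Integral of f over the vertex set against `measure`: sum of f(v) * mu(v)."""
    return sum(f(v) * mass for v, mass in measure.items())


# Hypothetical control-flow graph of one function (assumed, not from the paper).
cfg = {
    "entry": ["loop"],
    "loop": ["loop_body", "exit"],
    "loop_body": ["loop"],
    "exit": [],
}

measure = pagerank(cfg)
# One scalar feature: the expected out-degree under the PageRank measure.
expected_out_degree = integrate(lambda v: len(cfg[v]), measure)
```

A vectorization in the spirit of the abstract would compute several such integrals per graph, one for each hand-engineered node function, and aggregate them across the functions of a file.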

Payne, J., Kundu, A..  2019.  Towards Deep Federated Defenses Against Malware in Cloud Ecosystems. 2019 First IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications (TPS-ISA). :92—100.

In cloud computing environments with many virtual machines, containers, and other systems, an epidemic of malware can be crippling and highly threatening to business processes. In this vision paper, we introduce a hierarchical approach to performing malware detection and analysis using several recent advances in machine learning on graphs, hypergraphs, and natural language. We analyze individual systems and their logs, inspecting and understanding their behavior with attentional sequence models. Given a feature representation of each system's logs using this procedure, we construct an attributed network of the cloud with systems and other components as vertices and propose an analysis of malware with inductive graph and hypergraph learning models. With this foundation, we consider the multicloud case, in which multiple clouds with differing privacy requirements cooperate against the spread of malware, proposing the use of federated learning to perform inference and training while preserving privacy. Finally, we discuss several open problems that remain in defending cloud computing environments against malware related to designing robust ecosystems, identifying cloud-specific optimization problems for response strategy, action spaces for malware containment and eradication, and developing priors and transfer learning tasks for machine learning models in this area.

Huang, N., Xu, M., Zheng, N., Qiao, T., Choo, K. R..  2019.  Deep Android Malware Classification with API-Based Feature Graph. 2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/13th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :296—303.

The rapid growth of Android malware apps poses a great security threat to users, so it is important and urgent to detect Android malware effectively. Moreover, the growing amount of unknown malware and evasion techniques calls for novel detection methods. In this paper, we focus on API features and develop a novel method to detect Android malware. First, we propose a novel selection method for API features related to the malware class. However, such APIs also have legitimate uses in benign apps, causing a false positive (FP) problem (misclassifying benign apps as malware). Second, we further explore structural relationships between these APIs and map them to a matrix interpreted as the hand-refined API-based feature graph. Third, a CNN-based classifier is developed for API-based feature graph classification. Evaluations on a real-world dataset containing 3,697 malware apps and 3,312 benign apps demonstrate that the selected API features are effective for Android malware classification: just the top 20 APIs achieve a high F1 of 94.3% under a Random Forest classifier. When few API features are available, classification performance, including the FPR indicator, can be improved effectively by complementing them with our further work.

Ge, X., Pan, Y., Fan, Y., Fang, C..  2019.  AMDroid: Android Malware Detection Using Function Call Graphs. 2019 IEEE 19th International Conference on Software Quality, Reliability and Security Companion (QRS-C). :71—77.

With the rapid development of the mobile Internet, Android has become the most popular mobile operating system. Due to the open nature of Android, countless malicious applications are hidden among a large number of benign applications, which poses great threats to users. Most previous malware detection approaches mainly rely on features such as permissions, API calls, and opcode sequences. However, these approaches fail to capture the structural semantics of applications. In this paper, we propose AMDroid, which leverages function call graphs (FCGs) representing the behaviors of applications and applies graph kernels to automatically learn the structural semantics of applications from FCGs. We evaluate AMDroid on the Genome Project, and the experimental results show that AMDroid is effective at detecting Android malware, with 97.49% detection accuracy.

Abusnaina, A., Khormali, A., Alasmary, H., Park, J., Anwar, A., Mohaisen, A..  2019.  Adversarial Learning Attacks on Graph-based IoT Malware Detection Systems. 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS). :1296—1305.

IoT malware detection using control flow graph (CFG)-based features and deep learning networks is widely explored. The main goal of this study is to investigate the robustness of such models against adversarial learning. We designed two approaches to craft adversarial IoT software: off-the-shelf methods and the Graph Embedding and Augmentation (GEA) method. For the off-the-shelf adversarial learning attack methods, we examine eight different adversarial learning methods to force the model to misclassify. The GEA approach aims to preserve the functionality and practicality of the generated adversarial sample through a careful embedding of a benign sample into a malicious one. Intensive experiments are conducted to evaluate the performance of the proposed method, showing that off-the-shelf adversarial attack methods are able to achieve a misclassification rate of 100%. In addition, we observed that the GEA approach is able to misclassify all IoT malware samples as benign. The findings of this work highlight the essential need for more robust detection tools against adversarial learning, including features that are not easy to manipulate, unlike CFG-based features. The implications of the study are quite broad, since the approach challenged in this work is widely used for other applications using graphs.

Fan, M., Luo, X., Liu, J., Wang, M., Nong, C., Zheng, Q., Liu, T..  2019.  Graph Embedding Based Familial Analysis of Android Malware using Unsupervised Learning. 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). :771—782.

The rapid growth of Android malware has posed severe security threats to smartphone users. On the basis of the familial trait of Android malware observed by previous work, familial analysis is a promising way to help analysts better focus on the commonalities of malware samples within the same families, thus reducing the analytical workload and accelerating malware analysis. The majority of existing approaches rely on supervised learning and face three main challenges, i.e., low accuracy, low efficiency, and the lack of labeled datasets. To address these challenges, we first construct a fine-grained behavior model by abstracting the program semantics into a set of subgraphs. Then, we propose SRA, a novel feature that depicts the similarity relationships between the Structural Roles of sensitive API call nodes in subgraphs. An SRA is obtained based on graph embedding techniques and represented as a vector, so we can effectively reduce the high complexity of graph matching. After that, instead of training a classifier with labeled samples, we construct a malware link network based on SRAs and apply community detection algorithms to it to cluster the unlabeled samples into groups. We implement these ideas in a system called GefDroid that performs Graph embedding based familial analysis of AnDroid malware using unsupervised learning. Moreover, we conduct extensive experiments to evaluate GefDroid on three datasets with ground truth. The results show that GefDroid can achieve high agreement (0.707-0.883 in terms of NMI) between the clustering results and the ground truth. Furthermore, GefDroid requires only linear run-time overhead and takes around 8.6s to analyze a sample on average, which is considerably faster than the previous work.
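For readers unfamiliar with the NMI score quoted in this abstract, here is a minimal sketch of normalized mutual information between a ground-truth labeling and a clustering. The abstract does not say which normalization the authors used; this sketch assumes the arithmetic-mean form, NMI = 2 I(U;V) / (H(U) + H(V)), and the function names and sample labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a labeling."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two labelings of the same samples."""
    n = len(a)
    joint = Counter(zip(a, b))
    count_a, count_b = Counter(a), Counter(b)
    return sum(
        (n_ij / n) * math.log(n_ij * n / (count_a[i] * count_b[j]))
        for (i, j), n_ij in joint.items()
    )


def nmi(truth, predicted):
    """Normalized mutual information with arithmetic-mean normalization (assumed)."""
    if len(truth) != len(predicted):
        raise ValueError("labelings must cover the same samples")
    h_truth, h_pred = entropy(truth), entropy(predicted)
    if h_truth == 0.0 and h_pred == 0.0:
        return 1.0  # both labelings put every sample in one group
    return 2.0 * mutual_information(truth, predicted) / (h_truth + h_pred)


# Hypothetical family labels and cluster ids for four samples.
families = ["a", "a", "b", "b"]
same_grouping = nmi(families, [1, 1, 0, 0])       # cluster ids differ, grouping matches
unrelated_grouping = nmi(families, [0, 1, 0, 1])  # clusters cut across families
```

The score depends only on how samples are grouped, not on the cluster names, which is why it suits comparing unsupervised clusters with family labels.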

Wu, Y., Li, X., Zou, D., Yang, W., Zhang, X., Jin, H..  2019.  MalScan: Fast Market-Wide Mobile Malware Scanning by Social-Network Centrality Analysis. 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). :139—150.

Malware scanning of an app market is expected to be scalable and effective. However, existing approaches use either syntax-based features, which can be evaded by transformation attacks, or semantic-based features, which are usually extracted by performing expensive program analysis. Therefore, in this paper, we propose a lightweight graph-based approach to perform Android malware detection. Instead of traditional heavyweight static analysis, we treat function call graphs of apps as social networks and perform social-network-based centrality analysis to represent the semantic features of the graphs. Our key insight is that centrality provides a succinct and fault-tolerant representation of graph semantics, especially for graphs with a certain amount of inaccurate information (e.g., inaccurate call graphs). We implement a prototype system, MalScan, and evaluate it on datasets of 15,285 benign samples and 15,430 malicious samples. Experimental results show that MalScan is capable of detecting Android malware with up to 98% accuracy in under one second, which is more than 100 times faster than two state-of-the-art approaches, namely MaMaDroid and Drebin. We also demonstrate the feasibility of MalScan for market-wide malware scanning by performing a statistical study on over 3 million apps. Finally, in a corpus collected from the Google Play app market, MalScan identifies 18 zero-day malware samples, including samples that evade detection by existing tools.
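To make the centrality idea concrete, the sketch below treats a function call graph as an edge list and uses the degree centrality of a few sensitive API nodes as a feature vector. This is an illustration under stated assumptions, not MalScan's implementation: degree centrality stands in for whichever measures the authors use, and the call graph, the API list, and the function names are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Map each node to (in-degree + out-degree) / (n - 1) for a directed graph."""
    degree = defaultdict(int)
    nodes = set()
    for caller, callee in set(edges):  # ignore duplicate call edges
        nodes.update((caller, callee))
        degree[caller] += 1
        degree[callee] += 1
    n = len(nodes)
    if n <= 1:
        return {node: 0.0 for node in nodes}
    return {node: degree[node] / (n - 1) for node in nodes}


def feature_vector(edges, sensitive_apis):
    """Centrality of each sensitive API, or 0.0 if the app never calls it."""
    centrality = degree_centrality(edges)
    return [centrality.get(api, 0.0) for api in sensitive_apis]


# Hypothetical call graph: (caller, callee) pairs extracted from one app.
call_graph = [
    ("onCreate", "sendSms"),
    ("sendSms", "SmsManager.sendTextMessage"),
    ("onCreate", "TelephonyManager.getDeviceId"),
]
# Hypothetical, fixed ordering of sensitive APIs shared by all apps.
apis = [
    "SmsManager.sendTextMessage",
    "TelephonyManager.getDeviceId",
    "Runtime.exec",
]
features = feature_vector(call_graph, apis)
print(features)
```

Because every app is mapped onto the same ordered API list, the resulting vectors have a fixed length and can be fed to any standard classifier.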

2020-12-02
Malvankar, A., Payne, J., Budhraja, K. K., Kundu, A., Chari, S., Mohania, M..  2019.  Malware Containment in Cloud. 2019 First IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications (TPS-ISA). :221—227.

Malware is pervasive and poses serious threats to the normal operation of business processes in the cloud. Cloud computing environments typically have hundreds of hosts that are connected to each other, often with high-risk trust assumptions and/or protection mechanisms that are not difficult to break. Malware often exploits such weaknesses, as its immediate goal is often to spread itself to as many hosts as possible. Detecting this propagation is often difficult because the malware may reside in multiple components across the software or hardware stack. In this scenario, it is usually best to contain the malware to the smallest possible number of hosts, and it is also critical for system administration to resolve the issue in a timely manner. Furthermore, resolution often requires that several participants across different organizational teams scramble together to address the intrusion. In this vision paper, we define this problem in detail. We then present our vision of decentralized malware containment and the challenges and issues associated with this vision. The approach of containment involves detection and response using graph analytics coupled with a blockchain framework. We propose the use of a dominance frontier for profile nodes that must be involved in the containment process. Smart contracts are used to obtain consensus amongst the involved parties. The paper presents a basic implementation of this proposal. We further discuss some open problems related to our vision.

2020-11-30
Stokes, J. W., Agrawal, R., McDonald, G., Hausknecht, M..  2019.  ScriptNet: Neural Static Analysis for Malicious JavaScript Detection. MILCOM 2019 - 2019 IEEE Military Communications Conference (MILCOM). :1–8.

Malicious scripts are an important infection threat vector for computer users. For internet-scale processing, static analysis offers substantial computing efficiencies. We propose the ScriptNet system for neural malicious JavaScript detection, which is based on static analysis. We also propose a novel deep learning model, Pre-Informant Learning (PIL), which processes JavaScript files as byte sequences. Lower layers capture the sequential nature of these byte sequences, while higher layers classify the resulting embedding as malicious or benign. Unlike previously proposed solutions, our model variants are trained in an end-to-end fashion, allowing discriminative training even for the sequential processing layers. Evaluating this model on a large corpus of 212,408 JavaScript files indicates that the best performing PIL model offers a 98.10% true positive rate (TPR) for the first 60K byte subsequences and 81.66% for the full-length files, at a false positive rate (FPR) of 0.50%. Both models significantly outperform several baseline models. The best performing PIL model can successfully detect 92.02% of unknown malware samples in a hindsight experiment where the true labels of the malicious JavaScript files were not known when the model was trained.
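The results above are stated as a true positive rate at a fixed false positive rate. As a hedged illustration of how such an operating point can be read off a set of detector scores, the sketch below sweeps the decision threshold from the highest score downward and keeps the best TPR whose FPR stays within the budget. The function name, scores, and labels are hypothetical and are not taken from the paper.

```python
def tpr_at_fpr(scores, labels, max_fpr):
    """Best TPR over all score thresholds whose FPR does not exceed max_fpr.

    scores: higher means more likely malicious.
    labels: 1 for malicious, 0 for benign.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one malicious and one benign sample")

    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    best_tpr = 0.0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        if fp / negatives > max_fpr:
            break  # FPR only grows as the threshold is lowered further
        best_tpr = tp / positives
    return best_tpr


# Hypothetical detector outputs for eight files.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 0, 0]
print(tpr_at_fpr(scores, labels, max_fpr=0.0))
print(tpr_at_fpr(scores, labels, max_fpr=0.2))
```

With a budget as tight as 0.50%, the benign set has to be large enough for the FPR estimate to be meaningful: with fewer than 200 benign samples, a single false positive already exceeds the budget.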