Visible to the public Biblio

Filters: Keyword is graph embedding  [Clear All Filters]
2023-09-18
Wang, Rui, Zheng, Jun, Shi, Zhiwei, Tan, Yu'an.  2022.  Detecting Malware Using Graph Embedding and DNN. 2022 International Conference on Blockchain Technology and Information Security (ICBCTIS). :28—31.
Nowadays, the popularity of intelligent terminals makes malwares more and more serious. Among the many features of application, the call graph can accurately express the behavior of the application. The rapid development of graph neural network in recent years provides a new solution for the malicious analysis of application using call graphs as features. However, there are still problems such as low accuracy. This paper established a large-scale data set containing more than 40,000 samples and selected the class call graph, which was extracted from the application, as the feature and used the graph embedding combined with the deep neural network to detect the malware. The experimental results show that the accuracy of the detection model proposed in this paper is 97.7%; the precision is 96.6%; the recall is 96.8%; the F1-score is 96.4%, which is better than the existing detection model based on Markov chain and graph embedding detection model.
Oshio, Kei, Takada, Satoshi, Han, Chansu, Tanaka, Akira, Takeuchi, Jun'ichi.  2022.  Poster: Flexible Function Estimation of IoT Malware Using Graph Embedding Technique. 2022 IEEE Symposium on Computers and Communications (ISCC). :1—3.
Most IoT malware is variants generated by editing and reusing parts of the functions based on publicly available source codes. In our previous study, we proposed a method to estimate the functions of a specimen using the Function Call Sequence Graph (FCSG), which is a directed graph of execution sequence of function calls. In the FCSG-based method, the subgraph corresponding to a malware functionality is manually created and called a signature-FSCG. The specimens with the signature-FSCG are expected to have the corresponding functionality. However, this method cannot detect the specimens with a slightly different subgraph from the signature-FSCG. This paper found that these specimens were supposed to have the same functionality for a signature-FSCG. These specimens need more flexible signature matching, and we propose a graph embedding technique to realize it.
2023-08-25
Zheng, Chaofan, Hu, Wenhui, Li, Tianci, Liu, Xueyang, Zhang, Jinchan, Wang, Litian.  2022.  An Insider Threat Detection Method Based on Heterogeneous Graph Embedding. 2022 IEEE 8th Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS). :11—16.
Insider threats have high risk and concealment characteristics, which makes traditional anomaly detection methods less effective in insider threat detection. Existing detection methods ignore the logical relationship between user behaviors and the consistency of behavior sequences among homogeneous users, resulting in poor model effects. We propose an insider threat detection method based on internal user heterogeneous graph embedding. Firstly, according to the characteristics of CERT data, comprehensively consider the relationship between users, the time sequence, and logical relationship, and construct a heterogeneous graph. In the second step, according to the characteristics of heterogeneous graphs, the embedding learning of graph nodes is carried out according to random walk and Word2vec. Finally, we propose an Insider Threat Detection Design (ITDD) model which can map and the user behavior sequence information into a high-dimensional feature space. In the CERT r5.2 dataset, compared with a variety of traditional machine learning methods, the effect of our method is significantly better than the final result.
2023-03-17
Raj, Ankit, Somani, Sunil B..  2022.  Predicting Terror Attacks Using Neo4j Sandbox and Machine Learning Algorithms. 2022 6th International Conference On Computing, Communication, Control And Automation (ICCUBEA. :1–6.
Terrorism, and radicalization are major economic, political, and social issues faced by the world in today's era. The challenges that governments and citizens face in combating terrorism are growing by the day. Artificial intelligence, including machine learning and deep learning, has shown promising results in predicting terrorist attacks. In this paper, we attempted to build a machine learning model to predict terror activities using a global terrorism database in both relational and graphical forms. Using the Neo4j Sandbox, you can create a graph database from a relational database. We used the node2vec algorithm from Neo4j Sandbox's graph data science library to convert the high-dimensional graph to a low-dimensional vector form. In order to predict terror activities, seven machine learning models were used, and the performance parameters that were calculated were accuracy, precision, recall, and F1 score. According to our findings, the Logistic Regression model was the best performing model which was able to classify the dataset with an accuracy of 0.90, recall of 0.94 precision of 0.93, and an F1 score of 0.93.
ISSN: 2771-1358
2022-09-20
Pereira, Luiz Manella, Iyengar, S. S., Amini, M. Hadi.  2021.  On the Impact of the Embedding Process on Network Resilience Quantification. 2021 International Conference on Computational Science and Computational Intelligence (CSCI). :836—839.
Network resilience is crucial to ensure reliable and secure operation of critical infrastructures. Although graph theoretic methods have been developed to quantify the topological resilience of networks, i.e., measuring resilience with respect to connectivity, in this study we propose to use the tools from Topological Data Analysis (TDA), Algebraic Topology, and Optimal Transport (OT). In our prior work, we used these tools to create a resilience metric that bypassed the need to embed a network onto a space. We also hypothesized that embeddings could encode different information about a network and that different embeddings could result in different outcomes when computing resilience. In this paper we attempt to test this hypothesis. We will utilize the WEGL framework to compute the embedding for the considered network and compare the results against our prior work, which did not use an embedding process. To our knowledge, this is the first attempt to study the ramifications of choosing an embedding, thus providing a novel understanding into how to choose an embedding and whether such a choice matters when quantifying resilience.
2021-09-16
Ullman, Steven, Samtani, Sagar, Lazarine, Ben, Zhu, Hongyi, Ampel, Benjamin, Patton, Mark, Chen, Hsinchun.  2020.  Smart Vulnerability Assessment for Scientific Cyberinfrastructure: An Unsupervised Graph Embedding Approach. 2020 IEEE International Conference on Intelligence and Security Informatics (ISI). :1–6.
The accelerated growth of computing technologies has provided interdisciplinary teams a platform for producing innovative research at an unprecedented speed. Advanced scientific cyberinfrastructures, in particular, provide data storage, applications, software, and other resources to facilitate the development of critical scientific discoveries. Users of these environments often rely on custom developed virtual machine (VM) images that are comprised of a diverse array of open source applications. These can include vulnerabilities undetectable by conventional vulnerability scanners. This research aims to identify the installed applications, their vulnerabilities, and how they vary across images in scientific cyberinfrastructure. We propose a novel unsupervised graph embedding framework that captures relationships between applications, as well as vulnerabilities identified on corresponding GitHub repositories. This embedding is used to cluster images with similar applications and vulnerabilities. We evaluate cluster quality using Silhouette, Calinski-Harabasz, and Davies-Bouldin indices, and application vulnerabilities through inspection of selected clusters. Results reveal that images pertaining to genomics research in our research testbed are at greater risk of high-severity shell spawning and data validation vulnerabilities.
2021-02-22
Chen, T., Lin, T., Hong, Y.- P..  2020.  Gait Phase Segmentation Using Weighted Dynamic Time Warping and K-Nearest Neighbors Graph Embedding. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :1180–1184.
Gait phase segmentation is the process of identifying the start and end of different phases within a gait cycle. It is essential to many medical applications, such as disease diagnosis or rehabilitation. This work utilizes inertial measurement units (IMUs) mounted on the individual's foot to gather gait information and develops a gait phase segmentation method based on the collected signals. The proposed method utilizes a weighted dynamic time warping (DTW) algorithm to measure the distance between two different gait signals, and a k-nearest neighbors (kNN) algorithm to obtain the gait phase estimates. To reduce the complexity of the DTW-based kNN search, we propose a neural network-based graph embedding scheme that is able to map the IMU signals associated with each gait cycle into a distance-preserving low-dimensional representation while also producing a prediction on the k nearest neighbors of the test signal. Experiments are conducted on self-collected IMU gait signals to demonstrate the effectiveness of the proposed scheme.
2020-12-11
Abusnaina, A., Khormali, A., Alasmary, H., Park, J., Anwar, A., Mohaisen, A..  2019.  Adversarial Learning Attacks on Graph-based IoT Malware Detection Systems. 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS). :1296—1305.

IoT malware detection using control flow graph (CFG)-based features and deep learning networks are widely explored. The main goal of this study is to investigate the robustness of such models against adversarial learning. We designed two approaches to craft adversarial IoT software: off-the-shelf methods and Graph Embedding and Augmentation (GEA) method. In the off-the-shelf adversarial learning attack methods, we examine eight different adversarial learning methods to force the model to misclassification. The GEA approach aims to preserve the functionality and practicality of the generated adversarial sample through a careful embedding of a benign sample to a malicious one. Intensive experiments are conducted to evaluate the performance of the proposed method, showing that off-the-shelf adversarial attack methods are able to achieve a misclassification rate of 100%. In addition, we observed that the GEA approach is able to misclassify all IoT malware samples as benign. The findings of this work highlight the essential need for more robust detection tools against adversarial learning, including features that are not easy to manipulate, unlike CFG-based features. The implications of the study are quite broad, since the approach challenged in this work is widely used for other applications using graphs.

Fan, M., Luo, X., Liu, J., Wang, M., Nong, C., Zheng, Q., Liu, T..  2019.  Graph Embedding Based Familial Analysis of Android Malware using Unsupervised Learning. 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). :771—782.

The rapid growth of Android malware has posed severe security threats to smartphone users. On the basis of the familial trait of Android malware observed by previous work, the familial analysis is a promising way to help analysts better focus on the commonalities of malware samples within the same families, thus reducing the analytical workload and accelerating malware analysis. The majority of existing approaches rely on supervised learning and face three main challenges, i.e., low accuracy, low efficiency, and the lack of labeled dataset. To address these challenges, we first construct a fine-grained behavior model by abstracting the program semantics into a set of subgraphs. Then, we propose SRA, a novel feature that depicts the similarity relationships between the Structural Roles of sensitive API call nodes in subgraphs. An SRA is obtained based on graph embedding techniques and represented as a vector, thus we can effectively reduce the high complexity of graph matching. After that, instead of training a classifier with labeled samples, we construct malware link network based on SRAs and apply community detection algorithms on it to group the unlabeled samples into groups. We implement these ideas in a system called GefDroid that performs Graph embedding based familial analysis of AnDroid malware using unsupervised learning. Moreover, we conduct extensive experiments to evaluate GefDroid on three datasets with ground truth. The results show that GefDroid can achieve high agreements (0.707-0.883 in term of NMI) between the clustering results and the ground truth. Furthermore, GefDroid requires only linear run-time overhead and takes around 8.6s to analyze a sample on average, which is considerably faster than the previous work.

2020-02-10
Cheng, Xiao, Wang, Haoyu, Hua, Jiayi, Zhang, Miao, Xu, Guoai, Yi, Li, Sui, Yulei.  2019.  Static Detection of Control-Flow-Related Vulnerabilities Using Graph Embedding. 2019 24th International Conference on Engineering of Complex Computer Systems (ICECCS). :41–50.

Static vulnerability detection has shown its effectiveness in detecting well-defined low-level memory errors. However, high-level control-flow related (CFR) vulnerabilities, such as insufficient control flow management (CWE-691), business logic errors (CWE-840), and program behavioral problems (CWE-438), which are often caused by a wide variety of bad programming practices, posing a great challenge for existing general static analysis solutions. This paper presents a new deep-learning-based graph embedding approach to accurate detection of CFR vulnerabilities. Our approach makes a new attempt by applying a recent graph convolutional network to embed code fragments in a compact and low-dimensional representation that preserves high-level control-flow information of a vulnerable program. We have conducted our experiments using 8,368 real-world vulnerable programs by comparing our approach with several traditional static vulnerability detectors and state-of-the-art machine-learning-based approaches. The experimental results show the effectiveness of our approach in terms of both accuracy and recall. Our research has shed light on the promising direction of combining program analysis with deep learning techniques to address the general static analysis challenges.

2020-01-02
Talasila, Prasad, Kakrambe, Mihir, Rai, Anurag, Santy, Sebastin, Goveas, Neena, Deshpande, Bharat M..  2018.  BITS Darshini: A Modular, Concurrent Protocol Analyzer Workbench. Proceedings of the 19th International Conference on Distributed Computing and Networking. :54:1–54:10.
Network measurements are essential for troubleshooting and active management of networks. Protocol analysis of captured network packet traffic is an important passive network measurement technique used by researchers and network operations engineers. In this work, we present a measurement workbench tool named BITS Darshini (Darshini in short) to enable scientific network measurements. We have created Darshini as a modular, concurrent web application that stores experimental meta-data and allows users to specify protocol parse graphs. Darshini performs protocol analysis on a concurrent pipeline architecture, persists the analysis to a database and provides the analysis results via a REST API service. We formulate the problem of mapping protocol parse graph to a concurrent pipeline as a graph embedding problem. Our tool, Darshini, performs protocol analysis up to transport layer and is suitable for the study of small and medium-sized networks. Darshini enables secure collaboration and consultations with experts.
2019-06-10
Jiang, H., Turki, T., Wang, J. T. L..  2018.  DLGraph: Malware Detection Using Deep Learning and Graph Embedding. 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA). :1029-1033.

In this paper we present a new approach, named DLGraph, for malware detection using deep learning and graph embedding. DLGraph employs two stacked denoising autoencoders (SDAs) for representation learning, taking into consideration computer programs' function-call graphs and Windows application programming interface (API) calls. Given a program, we first use a graph embedding technique that maps the program's function-call graph to a vector in a low-dimensional feature space. One SDA in our deep learning model is used to learn a latent representation of the embedded vector of the function-call graph. The other SDA in our model is used to learn a latent representation of the given program's Windows API calls. The two learned latent representations are then merged to form a combined feature vector. Finally, we use softmax regression to classify the combined feature vector for predicting whether the given program is malware or not. Experimental results based on different datasets demonstrate the effectiveness of the proposed approach and its superiority over a related method.