Visible to the public Biblio

Filters: Keyword is Signature Matching  [Clear All Filters]
2023-09-18
Oshio, Kei, Takada, Satoshi, Han, Chansu, Tanaka, Akira, Takeuchi, Jun'ichi.  2022.  Poster: Flexible Function Estimation of IoT Malware Using Graph Embedding Technique. 2022 IEEE Symposium on Computers and Communications (ISCC). :1—3.
Most IoT malware is variants generated by editing and reusing parts of the functions based on publicly available source codes. In our previous study, we proposed a method to estimate the functions of a specimen using the Function Call Sequence Graph (FCSG), which is a directed graph of execution sequence of function calls. In the FCSG-based method, the subgraph corresponding to a malware functionality is manually created and called a signature-FSCG. The specimens with the signature-FSCG are expected to have the corresponding functionality. However, this method cannot detect the specimens with a slightly different subgraph from the signature-FSCG. This paper found that these specimens were supposed to have the same functionality for a signature-FSCG. These specimens need more flexible signature matching, and we propose a graph embedding technique to realize it.
2022-02-07
Acharya, Jatin, Chuadhary, Anshul, Chhabria, Anish, Jangale, Smita.  2021.  Detecting Malware, Malicious URLs and Virus Using Machine Learning and Signature Matching. 2021 2nd International Conference for Emerging Technology (INCET). :1–5.
Nowadays most of our data is stored on an electronic device. The risk of that device getting infected by Viruses, Malware, Worms, Trojan, Ransomware, or any unwanted invader has increased a lot these days. This is mainly because of easy access to the internet. Viruses and malware have evolved over time so identification of these files has become difficult. Not only by viruses and malware your device can be attacked by a click on forged URLs. Our proposed solution for this problem uses machine learning techniques and signature matching techniques. The main aim of our solution is to identify the malicious programs/URLs and act upon them. The core idea in identifying the malware is selecting the key features from the Portable Executable file headers using these features we trained a random forest model. This RF model will be used for scanning a file and determining if that file is malicious or not. For identification of the virus, we are using the signature matching technique which is used to match the MD5 hash of the file with the virus signature database containing the MD5 hash of the identified viruses and their families. To distinguish between benign and illegitimate URLs there is a logistic regression model used. The regression model uses a tokenizer for feature extraction from the URL that is to be classified. The tokenizer separates all the domains, sub-domains and separates the URLs on every `/'. Then a TfidfVectorizer (Term Frequency - Inverse Document Frequency) is used to convert the text into a weighted value. These values are used to predict if the URL is safe to visit or not. On the integration of all three modules, the final application will provide full system protection against malicious software.