Biblio

Found 213 results

Filters: Keyword is static analysis
2022-08-12
Liu, Kui, Koyuncu, Anil, Kim, Dongsun, Bissyandé, Tegawendé F.  2019.  AVATAR: Fixing Semantic Bugs with Fix Patterns of Static Analysis Violations. 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER). :1–12.
Fix pattern-based patch generation is a promising direction in Automated Program Repair (APR). Notably, it has been demonstrated to produce more acceptable and correct patches than the patches obtained with mutation operators through genetic programming. The performance of pattern-based APR systems, however, depends on the fix ingredients mined from fix changes in development histories. Unfortunately, collecting a reliable set of bug fixes in repositories can be challenging. In this paper, we propose to investigate the possibility in an APR scenario of leveraging code changes that address violations by static bug detection tools. To that end, we build the AVATAR APR system, which exploits fix patterns of static analysis violations as ingredients for patch generation. Evaluated on the Defects4J benchmark, we show that, assuming a perfect localization of faults, AVATAR can generate correct patches to fix 34/39 bugs. We further find that AVATAR yields performance metrics that are comparable to that of the closely-related approaches in the literature. While AVATAR outperforms many of the state-of-the-art pattern-based APR systems, it is mostly complementary to current approaches. Overall, our study highlights the relevance of static bug finding tools as indirect contributors of fix ingredients for addressing code defects identified with functional test cases.
Pathak, Abhishek, Sivakumar, Kaarthik, Haque, Mazhar, Ganesan, Prasanna.  2019.  Multi-Cluster Visualization and Live Reporting of Static Analysis Security Testing (SAST) Warnings. 2019 IEEE Cybersecurity Development (SecDev). :145–145.
This short paper discusses a case study of multi-cluster visualization of Static Analysis Security Testing (SAST) warnings in large clusters catering to a majority of Cisco products, in hierarchical organizational and checker views. This serves as a one-stop shop for real-time visualization of Static Analysis (SA) warning trends, charts, and report downloads, and for effectively addressing the potential security weaknesses detected. Presently, leading SAST tools such as Coverity, CodeSonar, and Klocwork do not provide inter-cluster or enterprise-wide visualization to effectively address SA warnings.
Sachidananda, Vinay, Bhairav, Suhas, Ghosh, Nirnay, Elovici, Yuval.  2019.  PIT: A Probe Into Internet of Things by Comprehensive Security Analysis. 2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/13th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :522–529.
One of the major issues hindering widespread and seamless adoption of the Internet of Things (IoT) is security. IoT devices are vulnerable and susceptible to attacks, as became evident from a series of recent large-scale distributed denial-of-service (DDoS) attacks that led to substantial business and financial losses. Furthermore, there is a lack of a comprehensive security analysis framework for finding vulnerabilities in IoT. In this paper, we present a modular, adaptable and tunable framework, called PIT, to probe IoT systems at different layers of design and implementation. PIT consists of several security analysis engines, viz., penetration testing, fuzzing, static analysis, and dynamic analysis, together with an exploitation engine, to discover multiple IoT vulnerabilities. We also develop a novel grey-box fuzzer, called Applica, as a part of the fuzzing engine to overcome the limitations of present-day fuzzers. The proposed framework has been evaluated on a real-world IoT testbed comprising state-of-the-art devices. We discovered several network and system-level vulnerabilities such as Buffer Overflow, Denial-of-Service, SQL Injection, etc., and successfully exploited them to demonstrate the presence of security loopholes in the IoT devices.
Berman, Maxwell, Adams, Stephen, Sherburne, Tim, Fleming, Cody, Beling, Peter.  2019.  Active Learning to Improve Static Analysis. 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA). :1322–1327.
Static analysis tools are programs that run on source code prior to its compilation to binary executables and attempt to find flaws or defects in the code during the early stages of development. If left unresolved, these flaws could pose security risks. While numerous static analysis tools exist, there is no single tool that is optimal. Therefore, many static analysis tools are often used to analyze code. Further, some of the alerts generated by the static analysis tools are low-priority or false alarms. Machine learning algorithms have been developed to distinguish between true alerts and false alarms; however, significant man-hours need to be dedicated to labeling data sets for training. This study investigates the use of active learning to reduce the number of labeled alerts needed to adequately train a classifier. The numerical experiments demonstrate that a query-by-committee active learning algorithm can be utilized to significantly reduce the number of labeled alerts needed to achieve performance similar to that of a classifier trained on a data set of nearly 60,000 labeled alerts.
2022-07-29
Chen, Keren, Zheng, Nan, Cai, Qiyuan, Li, Yinan, Lin, Changyong, Li, Yuanfei.  2021.  Cyber-Physical Power System Vulnerability Analysis Based on Complex Network Theory. 2021 6th Asia Conference on Power and Electrical Engineering (ACPEE). :482–486.
This paper applies complex network theory to the vulnerability assessment of the cyber-physical power system. The influence of the power system statistics upon the system vulnerability is studied based on complex network theory. The electrical betweenness is defined to suitably describe the power system characteristics. Real power systems are used as examples to analyze the distribution of the degree and betweenness of the power system as a complex network. The topology model of the cyber-physical power system is formed, and static analysis is applied to study the structural vulnerability of the cyber-physical power system. The IEEE 300 bus test system is selected to verify the model.
2022-07-28
[Anonymous].  2021.  An Automated Pipeline for Privacy Leak Analysis of Android Applications. 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE). :1048–1050.
We propose an automated pipeline for analyzing privacy leaks in Android applications. By using a combination of dynamic and static analysis, we validate the results of each against the other to improve accuracy. Compared to the state-of-the-art approaches, we not only capture the network traffic for analysis, but also look into the data flows inside the application. We particularly focus on the privacy leakage caused by third-party services and high-risk permissions. The proposed automated approach will combine taint analysis, permission analysis, network traffic analysis, and dynamic function tracing during run-time to identify private information leaks. We further implement an automatic validation and complementation process to reduce false positives. A small-scale experiment has been conducted on 30 Android applications and a large-scale experiment on more than 10,000 Android applications is in progress.
Ruohonen, Jukka, Hjerppe, Kalle, Rindell, Kalle.  2021.  A Large-Scale Security-Oriented Static Analysis of Python Packages in PyPI. 2021 18th International Conference on Privacy, Security and Trust (PST). :1–10.
Different security issues are a common problem for open source packages archived to and delivered through software ecosystems. These often manifest themselves as software weaknesses that may lead to concrete software vulnerabilities. This paper examines various security issues in Python packages with static analysis. The dataset is based on a snapshot of all packages stored to the Python Package Index (PyPI). In total, over 197 thousand packages and over 749 thousand security issues are covered. Even under the constraints imposed by static analysis, (a) the results indicate prevalence of security issues; at least one issue is present for about 46% of the Python packages. In terms of the issue types, (b) exception handling and different code injections have been the most common issues. The subprocess module stands out in this regard. Reflecting the generally small size of the packages, (c) software size metrics do not predict well the amount of issues revealed through static analysis. With these results and the accompanying discussion, the paper contributes to the field of large-scale empirical studies for better understanding security problems in software ecosystems.
Wang, Jingjing, Huang, Minhuan, Nie, Yuanping, Li, Jin.  2021.  Static Analysis of Source Code Vulnerability Using Machine Learning Techniques: A Survey. 2021 4th International Conference on Artificial Intelligence and Big Data (ICAIBD). :76–86.
With the rapid increase of practical problem complexity and code scale, the threat to software security is increasingly serious. Consequently, it is crucial to pay attention to the analysis of software source code vulnerability in the development stage and take efficient measures to detect the vulnerability as soon as possible. Machine learning techniques have made remarkable achievements in various fields. However, the application of machine learning in the domain of vulnerability static analysis is still in its infancy and the characteristics and performance of diverse methods are quite different. In this survey, we focus on a source code-oriented static vulnerability analysis method using machine learning techniques. We review the studies on source code vulnerability analysis based on machine learning in the past decade. We systematically summarize the development trends and different technical characteristics in this field from the perspectives of the intermediate representation of source code and vulnerability prediction model and put forward several feasible research directions in the future according to the limitations of the current approaches.
Obert, James, Loffredo, Tim.  2021.  Efficient Binary Static Code Data Flow Analysis Using Unsupervised Learning. 2021 4th International Conference on Artificial Intelligence for Industries (AI4I). :89–90.
The ever-increasing need to ensure that code is reliably, efficiently and safely constructed has fueled the evolution of popular static binary code analysis tools. In identifying potential coding flaws in binaries, tools such as IDA Pro are used to disassemble the binaries into an opcode/assembly language format in support of manual static code analysis. Because of the highly manual and resource-intensive nature involved with analyzing large binaries, the probability of overlooking potential coding irregularities and inefficiencies is quite high. In this paper, a light-weight, unsupervised data flow methodology is described which uses highly-correlated data flow graphs (CDFGs) to identify coding irregularities such that analysis time and required computing resources are minimized. Such analysis accuracy and efficiency gains are achieved by using a combination of graph analysis and unsupervised machine learning techniques which allow an analyst to focus on the most statistically significant flow patterns while performing binary static code analysis.
Qian, Tiantian, Yang, Shengchun, Wang, Shenghe, Pan, Dong, Geng, Jian, Wang, Ke.  2021.  Static Security Analysis of Source-Side High Uncertainty Power Grid Based on Deep Learning. 2021 China International Conference on Electricity Distribution (CICED). :973–975.
As a large amount of renewable energy is injected into the power grid, the source side of the power grid becomes extremely uncertain. Traditional static safety analysis methods based on pure physical models can no longer quickly and reliably give analysis results. Therefore, this paper proposes a deep learning-based static security analytical method. First, the static security assessment index of the power grid under the N-1 principle is proposed. Secondly, a neural network model and its input and output data for static safety analysis problems are designed. Finally, the validity of the proposed method was verified by IEEE grid data. Experiments show that the proposed method can quickly and accurately give the static security analysis results of the source-side high uncertainty grid.
Özgür, Berkecan, Dogru, Ibrahim Alper, Uçtu, Göksel, Alkan, Mustafa.  2021.  A Suggested Model for Mobile Application Penetration Test Framework. 2021 International Conference on Information Security and Cryptology (ISCTURKEY). :18–21.
Along with technological developments in the mobile environment, mobile devices are used in many areas like banking, social media and communication. The common characteristic of applications in these fields is that they contain personal or financial information of users. These types of applications are developed for the Android or iOS operating systems and have become the target of attackers. To detect weaknesses, security analysts perform mobile penetration tests using security analysis tools. These analysis tools have advantages and disadvantages relative to each other. Some tools prioritize static or dynamic analysis, while others do not include these types of tests. Within the scope of the current model, we aim to gather security analysis tools under a penetration testing framework and to combine their analysis results with a data fusion algorithm. With the suggested model, security analysts will be able to use these types of analysis tools while also benefiting from fusion algorithms fed by the tools' outputs.
Ami, Amit Seal, Kafle, Kaushal, Nadkarni, Adwait, Poshyvanyk, Denys, Moran, Kevin.  2021.  µSE: Mutation-Based Evaluation of Security-Focused Static Analysis Tools for Android. 2021 IEEE/ACM 43rd International Conference on Software Engineering: Companion Proceedings (ICSE-Companion). :53–56.
This demo paper presents the technical details and usage scenarios of μSE: a mutation-based tool for evaluating security-focused static analysis tools for Android. Mutation testing is generally used by software practitioners to assess the robustness of a given test-suite. However, we leverage this technique to systematically evaluate static analysis tools and uncover and document soundness issues. μSE's analysis has found 25 previously undocumented flaws in static data leak detection tools for Android. μSE offers four mutation schemes, namely Reachability, Complex-reachability, TaintSink, and ScopeSink, which determine the locations of seeded mutants. Furthermore, the user can extend μSE by customizing the API calls targeted by the mutation analysis. μSE is also practical, as it makes use of filtering techniques based on compilation and execution criteria that reduce the number of ineffective mutations.
Iqbal, Younis, Sindhu, Muddassar Azam, Arif, Muhammad Hassan, Javed, Muhammad Amir.  2021.  Enhancement in Buffer Overflow (BOF) Detection Capability of Cppcheck Static Analysis Tool. 2021 International Conference on Cyber Warfare and Security (ICCWS). :112–117.
Buffer overflow (BOF) vulnerability is one of the most dangerous security vulnerabilities, which can be exploited by unwanted users. This vulnerability can be detected by both static and dynamic analysis techniques. Dynamic analysis requires executing the program and checking its behavior against specifications, while static analysis examines the source code for security vulnerabilities without executing it. Although many open source and commercial security analysis tools employ static and dynamic methods, there is still a margin for improvement in the BOF vulnerability detection capability of these tools. We propose an enhancement to the Cppcheck tool for statically detecting BOF vulnerability using data flow analysis in C programs. We have used the Juliet Test Suite to test our approach. We selected the two best tools cited in the literature for BOF detection (i.e. Frama-C and Splint) to compare the performance and accuracy of our approach. In the experiments, our proposed approach achieved a Youden Index of 0.45, while Frama-C scored only 0.1 and Splint scored -0.47. These results show that our technique performs better than both the Frama-C and Splint static analysis tools.
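For readers unfamiliar with the metric, the Youden Index reported in this abstract is conventionally defined as J = sensitivity + specificity - 1, so it ranges from -1 to 1 and a negative value means the detector does worse than chance. The following is a minimal Python sketch of that standard definition; the function name and the confusion-matrix counts are hypothetical and are not taken from the paper.

```python
def youden_index(tp: int, fn: int, tn: int, fp: int) -> float:
    """Return Youden's J = sensitivity + specificity - 1."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity + specificity - 1.0


# Hypothetical counts, for illustration only (not results from the paper).
good_tool = youden_index(tp=8, fn=2, tn=7, fp=3)
poor_tool = youden_index(tp=2, fn=8, tn=3, fp=7)
```

With these made-up counts the first tool scores about 0.5 and the second about -0.5, which illustrates how a tool that mostly mislabels test cases ends up with a negative score.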

2022-07-14
Ayub, Md. Ahsan, Siraj, Ambareen.  2021.  Similarity Analysis of Ransomware based on Portable Executable (PE) File Metadata. 2021 IEEE Symposium Series on Computational Intelligence (SSCI). :1–6.
Threats posed by ransomware are rapidly increasing, and its cost on both national and global scales is becoming significantly high, as evidenced by recent events. Ransomware carries out an irreversible process, where it encrypts victims' digital assets to seek financial compensation. Adversaries utilize different means to gain initial access to the target machines, such as phishing emails, vulnerable public-facing software, Remote Desktop Protocol (RDP), brute-force attacks, and stolen accounts. To combat these threats of ransomware, this paper aims to help researchers gain a better understanding of ransomware application profiles through static analysis, where we identify a list of suspicious indicators and similarities among 727 active ransomware samples. We start with generating portable executable (PE) metadata for all the studied samples. With our domain knowledge and exploratory data analysis tasks, we introduce some of the suspicious indicators of the structure of ransomware files. We reduce the dimensionality of the generated dataset by using the Principal Component Analysis (PCA) technique and discover clusters by applying the KMeans algorithm. This motivates us to utilize the one-class classification algorithms on the generated dataset. As a result, the algorithms learn the common data boundary in the structure of our studied ransomware samples, and thereby, we achieve the data-driven similarities. We use the findings to evaluate the trained classifiers with the test samples and observe that the Local Outlier Factor (LOF) performs better on all the selected feature spaces compared to the One-Class SVM and the Isolation Forest algorithms.
2022-05-19
Shimchik, N. V., Ignatyev, V. N., Belevantsev, A. A.  2021.  Improving Accuracy and Completeness of Source Code Static Taint Analysis. 2021 Ivannikov ISPRAS Open Conference (ISPRAS). :61–68.
Static analysis is a general name for various methods of examining a program without actually executing it. In particular, it is widely used to discover errors and vulnerabilities in software. Taint analysis usually denotes the process of checking the flow of user-provided data in the program in order to find potential vulnerabilities. It can be performed either statically or dynamically. In the paper we evaluate several improvements for the static taint analyzer Irbis [1], which is based on a special case of the interprocedural graph reachability problem, the so-called IFDS problem, originally proposed by Reps et al. [2]. The analyzer is currently being developed at the Ivannikov Institute for System Programming of the Russian Academy of Sciences (ISP RAS). The evaluation is based on several real projects with known vulnerabilities and a subset of the Juliet Test Suite for C/C++ [3]. The chosen subset consists of more than 5 thousand tests for 11 different CWEs.
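To make the idea of static taint analysis concrete, the following Python sketch propagates taint from user-controlled sources through assignments until a fixed point is reached, then reports which sinks receive tainted data. It is a deliberately simplified, flow-insensitive and intraprocedural illustration under assumed data structures; it is not the IFDS algorithm and not how Irbis is implemented, and all names (`tainted_variables`, `find_leaks`, the toy program) are hypothetical.

```python
from typing import List, Set, Tuple

# Each statement is (target, operands): `target` is computed from `operands`.
Statement = Tuple[str, List[str]]


def tainted_variables(statements: List[Statement], sources: Set[str]) -> Set[str]:
    """Propagate taint through assignments until nothing changes."""
    tainted = set(sources)
    changed = True
    while changed:
        changed = False
        for target, operands in statements:
            if target not in tainted and any(op in tainted for op in operands):
                tainted.add(target)
                changed = True
    return tainted


def find_leaks(statements: List[Statement], sources: Set[str], sinks: Set[str]) -> List[str]:
    """Return the sinks that can receive data derived from a source."""
    tainted = tainted_variables(statements, sources)
    return sorted(sink for sink in sinks if sink in tainted)


# Toy program: `query` is built from user input, `log_line` is not.
program = [
    ("name", ["user_input"]),
    ("query", ["prefix", "name"]),
    ("log_line", ["timestamp"]),
]
leaks = find_leaks(program, sources={"user_input"}, sinks={"query", "log_line"})
```

Because the sketch ignores statement order, calling context, and sanitizers, it over-approximates; analyzers of the kind described in the abstract add those dimensions to improve precision.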

Chen, Xiarun, Li, Qien, Yang, Zhou, Liu, Yongzhi, Shi, Shaosen, Xie, Chenglin, Wen, Weiping.  2021.  VulChecker: Achieving More Effective Taint Analysis by Identifying Sanitizers Automatically. 2021 IEEE 20th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). :774–782.
The automatic detection of vulnerabilities in Web applications using taint analysis is a hot topic. However, existing taint analysis methods for sanitizer identification are too simple to find available taint transmission chains effectively. These methods generally use pre-constructed dictionaries or simple keywords to identify sanitizers and usually suffer from many false positives and false negatives, which has a considerable impact on the final result of the taint analysis. To solve this, we summarise and classify the commonly used sanitizers in Web applications and propose an identification method based on semantic analysis. Our method can accurately and completely identify the sanitizers in the target Web applications through static analysis. Specifically, we analyse the natural semantics and program semantics of existing sanitizers and use semantic analysis to find more sanitizers in Web applications. Besides, we implemented the method prototype in PHP and achieved a vulnerability detection tool called VulChecker. Then, we experimented with some popular open-source CMS frameworks. The results show that VulChecker can accurately identify more sanitizers. In terms of vulnerability detection, VulChecker also has a lower false positive rate and a higher detection rate than existing methods. Finally, we used VulChecker to analyse the latest PHP applications. We identified several new suspicious taint data propagation chains. Before the paper was completed, we had identified four unreported vulnerabilities. In general, these results show that our approach is highly effective in improving vulnerability detection based on taint analysis.
Zhang, Xueling, Wang, Xiaoyin, Slavin, Rocky, Niu, Jianwei.  2021.  ConDySTA: Context-Aware Dynamic Supplement to Static Taint Analysis. 2021 IEEE Symposium on Security and Privacy (SP). :796–812.
Static taint analyses are widely applied techniques to detect taint flows in software systems. Although they are theoretically conservative and designed to detect all possible taint flows, static taint analyses almost always exhibit false negatives due to a variety of implementation limitations. Dynamic programming language features, inaccessible code, and the usage of multiple programming languages in a software project are some of the major causes. To alleviate this problem, we developed a novel approach, DySTA, which uses dynamic taint analysis results as additional sources for static taint analysis. However, naïvely adding sources causes static analysis to lose context sensitivity and thus produce false positives. Thus, we developed a hybrid context matching algorithm and corresponding tool, ConDySTA, to preserve context sensitivity in DySTA. We applied REPRODROID [1], a comprehensive benchmarking framework for Android analysis tools, to evaluate ConDySTA. The results show that across 28 apps (1) ConDySTA was able to detect 12 out of 28 taint flows which were not detected by any of the six state-of-the-art static taint analyses considered in ReproDroid, and (2) ConDySTA reported no false positives, whereas nine were reported by DySTA alone. We further applied ConDySTA and FlowDroid to 100 top Android apps from Google Play, and ConDySTA was able to detect 39 additional taint flows (besides 281 taint flows found by FlowDroid) while preserving the context sensitivity of FlowDroid.
Piskachev, Goran, Krishnamurthy, Ranjith, Bodden, Eric.  2021.  SecuCheck: Engineering configurable taint analysis for software developers. 2021 IEEE 21st International Working Conference on Source Code Analysis and Manipulation (SCAM). :24–29.
Due to its ability to detect many frequently occurring security vulnerabilities, taint analysis is one of the core static analyses used by many static application security testing (SAST) tools. Previous studies have identified issues that software developers face with SAST tools. This paper reports on our experience in building a configurable taint analysis tool, named SecuCheck, that runs in multiple integrated development environments. SecuCheck is built on top of multiple existing components and comes with a Java-internal domain-specific language fluentTQL for specifying taint-flows, designed for software developers. We evaluate the applicability of SecuCheck in detecting eleven taint-style vulnerabilities in microbench programs and three real-world Java applications with known vulnerabilities. Empirically, we identify factors that impact the runtime of SecuCheck.
Aljubory, Nawaf, Khammas, Ban Mohammed.  2021.  Hybrid Evolutionary Approach in Feature Vector for Ransomware Detection. 2021 International Conference on Intelligent Technology, System and Service for Internet of Everything (ITSS-IoE). :1–6.
Ransomware is one of the most serious threats and constitutes a significant challenge in the cybersecurity field. Cybercriminals use this attack to encrypt the victim's files or infect the victim's devices and then demand a ransom in exchange for restoring access to these files and devices. The escalating threat of ransomware to thousands of individuals and companies creates an urgent need for a system capable of proactively detecting and preventing ransomware. In this research, a new approach is proposed to detect and classify ransomware based on three machine learning algorithms (Random Forest, Support Vector Machines, and Naïve Bayes). The feature set was extracted directly from the raw bytes of samples using a static analysis technique to improve the detection speed. To offer the best detection accuracy, CF-NCF (Class Frequency - Non-Class Frequency) has been utilized to generate feature vectors. The proposed approach can differentiate between ransomware and goodware files with a detection accuracy of up to 98.33 percent.

2022-05-12
Şengül, Özkan, Özkılıçaslan, Hasan, Arda, Emrecan, Yavanoğlu, Uraz, Dogru, Ibrahim Alper, Selçuk, Ali Aydın.  2021.  Implementing a Method for Docker Image Security. 2021 International Conference on Information Security and Cryptology (ISCTURKEY). :34–39.
Containers that can be easily created, transported and scaled with the use of container-based virtualization technologies work better than classical virtualization technologies and provide efficient resource usage. The Docker platform is one of the most widely used solutions among container-based virtualization technologies. The OS-level virtualization of the Docker platform and the container’s use of the host operating system kernel may cause security problems. In this study, a method including static and dynamic analysis has been proposed to ensure Docker image and container security. In the static analysis phase of the method, the packages of the images are scanned for vulnerabilities and malware. In the dynamic analysis phase, Docker containers are run for a certain period of time; after open port scanning, network traffic is analyzed with Snort3. Seven Docker images are analyzed and the results are shared.
2022-05-10
Pereira, José D'Abruzzo, Antunes, João Henggeler, Vieira, Marco.  2021.  On Building a Vulnerability Dataset with Static Information from the Source Code. 2021 10th Latin-American Symposium on Dependable Computing (LADC). :1–2.
Software vulnerabilities are weaknesses in software systems that can have serious consequences when exploited. Examples of side effects include unauthorized authentication, data breaches, and financial losses. Due to the nature of the software industry, companies are increasingly pressured to deploy software as quickly as possible, leading to a large number of undetected software vulnerabilities. Static code analysis, with the support of Static Analysis Tools (SATs), can generate security alerts that highlight potential vulnerabilities in an application's source code. Software Metrics (SMs) have also been used to predict software vulnerabilities, usually with the support of Machine Learning (ML) classification algorithms. Several datasets are available to support the development of improved software vulnerability detection techniques. However, they suffer from the same issues: they are either outdated or use a single type of information. In this paper, we present a methodology for collecting software vulnerabilities from known vulnerability databases and enhancing them with static information (namely SAT alerts and SMs). The proposed methodology aims to define a mechanism capable of more easily updating the collected data.

2022-04-01
Pereira, José D'Abruzzo, Campos, João R., Vieira, Marco.  2021.  Machine Learning to Combine Static Analysis Alerts with Software Metrics to Detect Security Vulnerabilities: An Empirical Study. 2021 17th European Dependable Computing Conference (EDCC). :1–8.
Software developers can use diverse techniques and tools to reduce the number of vulnerabilities, but the effectiveness of existing solutions in real projects is questionable. For example, Static Analysis Tools (SATs) report potential vulnerabilities by analyzing code patterns, and Software Metrics (SMs) can be used to predict vulnerabilities based on high-level characteristics of the code. In theory, both approaches can be applied from the early stages of the development process, but it is well known that they fail to detect critical vulnerabilities and raise a large number of false alarms. This paper studies the hypothesis of using Machine Learning (ML) to combine alerts from SATs with SMs to predict vulnerabilities in a large software project (under development for many years). In practice, we use four ML algorithms, alerts from two SATs, and a large number of SMs to predict whether a source code file is vulnerable or not (binary classification) and to predict the vulnerability category (multiclass classification). Results show that one can achieve either high precision or high recall, but not both at the same time. To understand the reason, we analyze and compare snippets of source code, demonstrating that vulnerable and non-vulnerable files share similar characteristics, making it hard to distinguish vulnerable from non-vulnerable code based on SAT alerts and SMs.

2022-02-24
Zhou, Andy, Sultana, Kazi Zakia, Samanthula, Bharath K.  2021.  Investigating the Changes in Software Metrics after Vulnerability Is Fixed. 2021 IEEE International Conference on Big Data (Big Data). :5658–5663.
Preventing software vulnerabilities while writing code is one of the most effective ways of avoiding cyber attacks on any developed system. Although developers follow some standard guiding principles for ensuring secure code, the code can still have security bottlenecks and be compromised by an attacker. Therefore, assessing software security while developing code can help developers in writing vulnerability-free code. Researchers have already focused on metrics-based and text-mining-based software vulnerability prediction models. The metrics-based models showed higher precision in predicting vulnerabilities, although the recall rate is low. In addition, current research did not investigate the impact of individual software metrics on the occurrences of vulnerabilities. The main objective of this paper is to track the changes in every software metric after the developer fixes a particular vulnerability. The results of our research will potentially motivate further research on building more accurate vulnerability prediction models based on the appropriate software metrics. In particular, we have compared a total of 250 files from Apache Tomcat and Apache CXF. These files were extracted from the Apache database and were chosen because Apache released these files as vulnerable in their publicly available security advisories. Using a static analysis tool, metrics of the targeted vulnerable files and relevant fixed files (files where vulnerable code is removed by the developers) were extracted and compared. We show that eight of the 40 metrics have an average increase of 2% from vulnerable to fixed files. These metrics include CountDeclClass, CountDeclClassMethod, CountDeclClassVariable, CountDeclInstanceVariable, CountDeclMethodDefault, CountLineCode, MaxCyclomaticStrict, MaxNesting. This study will help developers to assess software security through utilizing software metrics in secure coding practices.
2022-02-07
Kita, Kouhei, Uda, Ryuya.  2021.  Malware Subspecies Detection Method by Suffix Arrays and Machine Learning. 2021 55th Annual Conference on Information Sciences and Systems (CISS). :1–6.
Malware such as a metamorphic virus changes its code and cannot be detected by pattern matching. Such malware can be detected by surface analysis, dynamic analysis or static analysis. We focused on surface analysis since neither virtual environments nor high-level engineering is required. A representative method in surface analysis is n-gram with machine learning. However, important features are sometimes cut off by n-gram since n is not variable in some existing methods. Hence, scores of malware detection methods are not perfect. Moreover, creating n-gram features takes a long time when comparing files. Furthermore, in some n-gram methods, invisible malware can be created when the methods are known to attackers. Therefore, we proposed a new malware subspecies detection method using suffix arrays and machine learning. We evaluated the method with four real malware subspecies families and succeeded in classifying them with almost 100% accuracy.
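As background on the data structure named in this abstract, a suffix array lists the starting offsets of all suffixes of a byte string in sorted order, which makes repeated substrings of any length easy to find, unlike fixed-length n-grams. The Python sketch below only illustrates that property with a naive construction; the abstract does not describe how the authors derive features, so the function names and the example are assumptions for illustration.

```python
from typing import List


def build_suffix_array(data: bytes) -> List[int]:
    """Naive construction: sort suffix start offsets by the suffix contents."""
    return sorted(range(len(data)), key=lambda i: data[i:])


def longest_repeated_substring(data: bytes) -> bytes:
    """Longest substring occurring at least twice (overlaps allowed).

    Any repeated substring is a common prefix of two suffixes that are
    adjacent in sorted order, so only neighbours need to be compared.
    """
    sa = build_suffix_array(data)
    best = b""
    for a, b in zip(sa, sa[1:]):
        n = 0
        while a + n < len(data) and b + n < len(data) and data[a + n] == data[b + n]:
            n += 1
        if n > len(best):
            best = data[a:a + n]
    return best


suffix_array = build_suffix_array(b"banana")
repeat = longest_repeated_substring(b"banana")
```

The naive sort is quadratic or worse on large inputs; practical tools would use a dedicated linear or near-linear construction algorithm.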
Gülmez, Sibel, Sogukpinar, Ibrahim.  2021.  Graph-Based Malware Detection Using Opcode Sequences. 2021 9th International Symposium on Digital Forensics and Security (ISDFS). :1–5.
The impact of malware on IT (information technology) systems grows day by day. Its volume, complexity, and cost increase rapidly. While researchers are developing new and better detection algorithms, attackers are also evolving malware to defeat the current detection techniques. Therefore malware detection becomes one of the most challenging tasks in cyber security. To increase the performance of the detection techniques, researchers benefit from different approaches. But some of them might cost a lot both in time and hardware resources. This situation calls for fast and cheap detection methods. In this context, static analysis provides these benefits, but it is important to keep detection accuracy high while reducing resource consumption. Opcodes (operational codes) are commonly used in static analysis, but sometimes feature extraction from opcodes might be difficult since an opcode sequence might have a great length. Furthermore, most malware developers use obfuscation and encryption techniques to avoid detection methods based on static analysis. This kind of malware is called packed malware and, according to common belief, packed malware should be either unpacked or analyzed dynamically in order to detect it. In this study, a graph-based malware detection method has been proposed to overcome these problems. The proposed method relies on obtaining the opcode graph of every executable file in the dataset and using these graphs for feature extraction. In this way, the proposed method reaches up to 98% detection accuracy. In addition to the accuracy rate, the proposed method makes it possible to detect packed malware without the need for unpacking or dynamic analysis.
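One common way to turn an opcode sequence into a graph is to treat each opcode as a node and each consecutive pair as a weighted edge, which yields a fixed-size representation regardless of sequence length. The Python sketch below shows that formulation; the abstract does not specify the authors' exact graph construction, so the edge weighting, the function name, and the sample sequence are assumptions for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def opcode_graph(opcodes: List[str]) -> Dict[Edge, float]:
    """Map each (src, dst) opcode transition to its probability given src."""
    counts = Counter(zip(opcodes, opcodes[1:]))  # consecutive opcode pairs
    out_totals: Dict[str, int] = defaultdict(int)
    for (src, _dst), count in counts.items():
        out_totals[src] += count
    return {edge: count / out_totals[edge[0]] for edge, count in counts.items()}


# Hypothetical disassembly fragment, for illustration only.
sequence = ["push", "mov", "call", "mov", "pop", "ret"]
graph = opcode_graph(sequence)
```

The resulting edge weights can be flattened into a feature vector over a fixed opcode vocabulary and passed to a classifier.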