Visible to the public Biblio

Filters: Author is Wu, Xiaoxue  [Clear All Filters]
2023-05-12
Bo, Lili, Meng, Xing, Sun, Xiaobing, Xia, Jingli, Wu, Xiaoxue.  2022.  A Comprehensive Analysis of NVD Concurrency Vulnerabilities. 2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS). :9–18.

Concurrency vulnerabilities caused by synchronization problems will occur in the execution of multi-threaded programs, and the emergence of concurrency vulnerabilities often cause great threats to the system. Once the concurrency vulnerabilities are exploited, the system will suffer various attacks, seriously affecting its availability, confidentiality and security. In this paper, we extract 839 concurrency vulnerabilities from Common Vulnerabilities and Exposures (CVE), and conduct a comprehensive analysis of the trend, classifications, causes, severity, and impact. Finally, we obtained some findings: 1) From 1999 to 2021, the number of concurrency vulnerabilities disclosures show an overall upward trend. 2) In the distribution of concurrency vulnerability, race condition accounts for the largest proportion. 3) The overall severity of concurrency vulnerabilities is medium risk. 4) The number of concurrency vulnerabilities that can be exploited for local access and network access is almost equal, and nearly half of the concurrency vulnerabilities (377/839) can be accessed remotely. 5) The access complexity of 571 concurrency vulnerabilities is medium, and the number of concurrency vulnerabilities with high or low access complexity is almost equal. The results obtained through the empirical study can provide more support and guidance for research in the field of concurrency vulnerabilities.

ISSN: 2693-9177

2022-05-10
Zheng, Wei, Abdallah Semasaba, Abubakar Omari, Wu, Xiaoxue, Agyemang, Samuel Akwasi, Liu, Tao, Ge, Yuan.  2021.  Representation vs. Model: What Matters Most for Source Code Vulnerability Detection. 2021 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). :647–653.
Vulnerabilities in the source code of software are critical issues in the realm of software engineering. Coping with vulnerabilities in software source code is becoming more challenging due to several aspects of complexity and volume. Deep learning has gained popularity throughout the years as a means of addressing such issues. In this paper, we propose an evaluation of vulnerability detection performance on source code representations and evaluate how Machine Learning (ML) strategies can improve them. The structure of our experiment consists of 3 Deep Neural Networks (DNNs) in conjunction with five different source code representations; Abstract Syntax Trees (ASTs), Code Gadgets (CGs), Semantics-based Vulnerability Candidates (SeVCs), Lexed Code Representations (LCRs), and Composite Code Representations (CCRs). Experimental results show that employing different ML strategies in conjunction with the base model structure influences the performance results to a varying degree. However, ML-based techniques suffer from poor performance on class imbalance handling when used in conjunction with source code representations for software vulnerability detection.
2021-05-18
Zheng, Wei, Gao, Jialiang, Wu, Xiaoxue, Xun, Yuxing, Liu, Guoliang, Chen, Xiang.  2020.  An Empirical Study of High-Impact Factors for Machine Learning-Based Vulnerability Detection. 2020 IEEE 2nd International Workshop on Intelligent Bug Fixing (IBF). :26–34.
Ahstract-Vulnerability detection is an important topic of software engineering. To improve the effectiveness and efficiency of vulnerability detection, many traditional machine learning-based and deep learning-based vulnerability detection methods have been proposed. However, the impact of different factors on vulnerability detection is unknown. For example, classification models and vectorization methods can directly affect the detection results and code replacement can affect the features of vulnerability detection. We conduct a comparative study to evaluate the impact of different classification algorithms, vectorization methods and user-defined variables and functions name replacement. In this paper, we collected three different vulnerability code datasets. These datasets correspond to different types of vulnerabilities and have different proportions of source code. Besides, we extract and analyze the features of vulnerability code datasets to explain some experimental results. Our findings from the experimental results can be summarized as follows: (i) the performance of using deep learning is better than using traditional machine learning and BLSTM can achieve the best performance. (ii) CountVectorizer can improve the performance of traditional machine learning. (iii) Different vulnerability types and different code sources will generate different features. We use the Random Forest algorithm to generate the features of vulnerability code datasets. These generated features include system-related functions, syntax keywords, and user-defined names. (iv) Datasets without user-defined variables and functions name replacement will achieve better vulnerability detection results.