Biblio

Filters: Keyword is large scale
2023-02-13
Wu, Yueming, Zou, Deqing, Dou, Shihan, Yang, Wei, Xu, Duo, Jin, Hai.  2022.  VulCNN: An Image-inspired Scalable Vulnerability Detection System. 2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE). :2365–2376.
Since deep learning (DL) can automatically learn features from source code, it has been widely used to detect source code vulnerability. To achieve scalable vulnerability scanning, some prior studies intend to process the source code directly by treating them as text. To achieve accurate vulnerability detection, other approaches consider distilling the program semantics into graph representations and using them to detect vulnerability. In practice, text-based techniques are scalable but not accurate due to the lack of program semantics. Graph-based methods are accurate but not scalable since graph analysis is typically time-consuming. In this paper, we aim to achieve both scalability and accuracy on scanning large-scale source code vulnerabilities. Inspired by existing DL-based image classification which has the ability to analyze millions of images accurately, we prefer to use these techniques to accomplish our purpose. Specifically, we propose a novel idea that can efficiently convert the source code of a function into an image while preserving the program details. We implement VulCNN and evaluate it on a dataset of 13,687 vulnerable functions and 26,970 non-vulnerable functions. Experimental results report that VulCNN can achieve better accuracy than eight state-of-the-art vulnerability detectors (i.e., Checkmarx, FlawFinder, RATS, TokenCNN, VulDeePecker, SySeVR, VulDeeLocator, and Devign). As for scalability, VulCNN is about four times faster than VulDeePecker and SySeVR, about 15 times faster than VulDeeLocator, and about six times faster than Devign. Furthermore, we conduct a case study on more than 25 million lines of code and the result indicates that VulCNN can detect large-scale vulnerability. Through the scanning reports, we finally discover 73 vulnerabilities that are not reported in NVD.
2019-05-01
Jiang, Yikun, Xie, Wei, Tang, Yong.  2018.  Detecting Authentication-Bypass Flaws in a Large Scale of IoT Embedded Web Servers. Proceedings of the 8th International Conference on Communication and Network Security. :56–63.

With the rapid development of network and communication technologies, everything is able to be connected to the Internet. IoT devices, which include home routers, IP cameras, wireless printers and so on, are crucial parts in building pervasive and ubiquitous networks. As the number of IoT devices around the world increases, the security issues become more and more serious. To handle the security issues and protect IoT devices from being compromised, the firmware of devices needs to be strengthened by discovering and repairing vulnerabilities. Current vulnerability detection tools can only help strengthen traditional software; these tools are not practical enough for IoT device firmware because of the peculiarities of firmware structure and embedded device architecture. Therefore, a new vulnerability detection framework is required for analyzing IoT device firmware. This paper reviews related work on vulnerability detection in IoT firmware, and proposes and implements a framework to automatically detect authentication-bypass flaws in Linux-based firmware at large scale. The proposed framework is evaluated on a data set of 2351 firmware images from several target vendors. The evaluation shows that it is capable of performing large-scale, automated analysis of firmware, and the analysis finds 1 known and 10 unknown authentication-bypass flaws.

2018-05-30
Chen, Yi, You, Wei, Lee, Yeonjoon, Chen, Kai, Wang, XiaoFeng, Zou, Wei.  2017.  Mass Discovery of Android Traffic Imprints Through Instantiated Partial Execution. Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. :815–828.
Monitoring network behaviors of mobile applications, controlling their resource access and detecting potentially harmful apps are becoming increasingly important for security protection within today's organizations, ISPs and carriers. For this purpose, apps need to be identified from their communication, based upon their individual traffic signatures (called imprints in our research). Creating imprints for a large number of apps is nontrivial, due to the challenges in comprehensively analyzing their network activities at a large scale, for millions of apps on today's rapidly-growing app marketplaces. Prior research relies on automatic exploration of an app's user interfaces (UIs) to trigger its network activities, which is less likely to scale given the cost of the operation (at least 5 minutes per app) and its effectiveness (limited coverage of an app's behaviors). In this paper, we present Tiger (Traffic Imprint Generator), a novel technique that makes comprehensive app imprint generation possible at a massive scale. At the center of Tiger is a unique instantiated slicing technique, which aggressively prunes the program slice extracted from the app's network-related code by evaluating each variable's impact on possible network invariants, and removing those unlikely to contribute by assigning them concrete values. In this way, Tiger avoids exploring a large number of program paths unrelated to the app's identifiable traffic, thereby reducing the cost of the code analysis by more than one order of magnitude, in comparison with the conventional slicing and execution approach. Our experiments show that Tiger is capable of recovering an app's full network activities within 18 seconds, achieving over 98% coverage of its identifiable packets and 0.742% false detection rate on app identification. Further running the technique on over 200,000 real-world Android apps (including 78.23% potentially harmful apps) leads to the discovery of surprising new types of traffic invariants, including fake device information, hardcoded time values, session IDs and credentials, as well as complicated trigger conditions for an app's network activities, such as human involvement, Intent trigger and server-side instructions. Our findings demonstrate that many network activities cannot easily be invoked through automatic UI exploration, and that code-analysis-based approaches present a promising alternative.
2017-09-26
Papadopoulos, Georgios Z., Gallais, Antoine, Schreiner, Guillaume, Noël, Thomas.  2016.  Importance of Repeatable Setups for Reproducible Experimental Results in IoT. Proceedings of the 13th ACM Symposium on Performance Evaluation of Wireless Ad Hoc, Sensor, & Ubiquitous Networks. :51–59.

Performance analysis of newly designed solutions is essential for efficient Internet of Things and Wireless Sensor Network (WSN) deployments. Simulation and experimental evaluation practices are vital steps in the development process of protocols and applications for wireless technologies. Nowadays, new solutions can be tested at a very large scale over both simulators and testbeds. In this paper, we first discuss the importance of repeatable experimental setups for reproducible performance evaluation results. To this end, we present FIT IoT-LAB, a very large-scale experimental testbed consisting of 2769 low-power wireless devices and 127 mobile robots. We then demonstrate, through a number of experiments conducted on the FIT IoT-LAB testbed, how to conduct meaningful experiments under real-world conditions. Finally, we discuss to what extent results obtained from experiments could be considered scientific, i.e., reproducible by the community.