Biblio

Filters: Keyword is program testing
Westland, T., Niu, N., Jha, R., Kapp, D., Kebede, T..  2020.  Relating the Empirical Foundations of Attack Generation and Vulnerability Discovery. 2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI). :37–44.
Automatically generating exploits for attacks receives much attention in security testing and auditing. However, little is known about the continuous effect of automatic attack generation and detection. In this paper, we develop an analytic model to understand the cost-benefit tradeoffs in light of the process of vulnerability discovery. We develop a three-phased model, suggesting that the cumulative malware detection has a productive period before the rate of gain flattens. As the detection mechanisms co-evolve, the gain will likely increase. We evaluate our analytic model by using an anti-virus tool to detect the thousands of Trojans automatically created. The anti-virus scanning results over five months show the validity of the model and point out future research directions.
Staicu, C.-A., Torp, M. T., Schäfer, M., Møller, A., Pradel, M..  2020.  Extracting Taint Specifications for JavaScript Libraries. 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE). :198–209.

Modern JavaScript applications extensively depend on third-party libraries. Especially for the Node.js platform, vulnerabilities can have severe consequences to the security of applications, resulting in, e.g., cross-site scripting and command injection attacks. Existing static analysis tools that have been developed to automatically detect such issues are either too coarse-grained, looking only at package dependency structure while ignoring dataflow, or rely on manually written taint specifications for the most popular libraries to ensure analysis scalability. In this work, we propose a technique for automatically extracting taint specifications for JavaScript libraries, based on a dynamic analysis that leverages the existing test suites of the libraries and their available clients in the npm repository. Due to the dynamic nature of JavaScript, mapping observations from dynamic analysis to taint specifications that fit into a static analysis is non-trivial. Our main insight is that this challenge can be addressed by a combination of an access path mechanism that identifies entry and exit points, and the use of membranes around the libraries of interest. We show that our approach is effective at inferring useful taint specifications at scale. Our prototype tool automatically extracts 146 additional taint sinks and 7 840 propagation summaries spanning 1 393 npm modules. By integrating the extracted specifications into a commercial, state-of-the-art static analysis, 136 new alerts are produced, many of which correspond to likely security vulnerabilities. Moreover, many important specifications that were originally manually written are among the ones that our tool can now extract automatically.

Simon, L., Verma, A..  2020.  Improving Fuzzing through Controlled Compilation. 2020 IEEE European Symposium on Security and Privacy (EuroS&P). :34–52.
We observe that operations performed by standard compilers harm fuzzing because the optimizations and the Intermediate Representation (IR) lead to transformations that improve execution speed at the expense of fuzzing. To remedy this problem, we propose 'controlled compilation', a set of techniques to automatically refactor a program's source code and cherry-pick beneficial compiler optimizations to improve fuzzing. We design, implement and evaluate controlled compilation by building a new toolchain with Clang/LLVM. We perform an evaluation on 10 open source projects and compare the results of AFL to state-of-the-art grey-box fuzzers and concolic fuzzers. We show that when programs are compiled with this new toolchain, AFL covers 30% new code on average and finds 21 additional bugs in real world programs. Our study reveals that controlled compilation often covers more code and finds more bugs than state-of-the-art fuzzing techniques, without the need to write a fuzzer from scratch or resort to advanced techniques. We identify two main reasons to explain why. First, it has proven difficult for researchers to appropriately configure existing fuzzers such as AFL. To address this problem, we provide guidelines and new LLVM passes to help automate AFL's configuration. This will enable researchers to perform a fairer comparison with AFL. Second, we find that current coverage-based evaluation measures (e.g. the total number of visited lines, edges or BBs) are inadequate because they lose valuable information such as which parts of a program a fuzzer actually visits and how consistently it does so. Coverage is considered a useful metric to evaluate a fuzzer's performance and devise a fuzzing strategy. However, the lack of a standard methodology for evaluating coverage remains a problem. To address this, we propose a rigorous evaluation methodology based on 'qualitative coverage'. Qualitative coverage uniquely identifies each program line to help understand which lines are commonly visited by different fuzzers vs. which lines are visited only by a particular fuzzer. Throughout our study, we show the benefits of this new evaluation methodology. For example, we provide valuable insights into the consistency of fuzzers, i.e. their ability to cover the same code or find the same bug across multiple independent runs. Overall, our evaluation methodology based on qualitative coverage helps to understand if a fuzzer performs better, worse, or is complementary to another fuzzer. This helps security practitioners adjust their fuzzing strategies.
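As an illustration of the qualitative-coverage idea described in the abstract above, the following minimal Python sketch compares which lines different fuzzers visit rather than how many. The function names, the (file, line) identifiers and the consistency ratio are assumptions made for this sketch; they are not the authors' tooling or their exact metric.

```python
from typing import Dict, List, Set, Tuple

# Assumption: a program line is uniquely identified by (file name, line number).
Line = Tuple[str, int]


def compare_coverage(coverage: Dict[str, Set[Line]]) -> Dict[str, Set[Line]]:
    """Split covered lines into those shared by all fuzzers and those unique to one."""
    all_sets = list(coverage.values())
    report = {"common": set.intersection(*all_sets) if all_sets else set()}
    for name, lines in coverage.items():
        # Union of the lines covered by every other fuzzer.
        others = set().union(*(s for n, s in coverage.items() if n != name))
        report[f"only_{name}"] = lines - others
    return report


def consistency(runs: List[Set]) -> float:
    """Share of all lines ever covered that were covered in every run (hypothetical measure)."""
    union = set().union(*runs)
    if not union:
        return 1.0
    return len(set.intersection(*runs)) / len(union)


# Hypothetical coverage data for two fuzzers on one source file.
example = {
    "afl": {("a.c", 1), ("a.c", 2), ("a.c", 3)},
    "other": {("a.c", 2), ("a.c", 3), ("a.c", 9)},
}
report = compare_coverage(example)
print(sorted(report["common"]), sorted(report["only_afl"]), sorted(report["only_other"]))
```

Two fuzzers with the same line count can still differ in `only_...` sets, which is the information a purely quantitative measure would lose.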
Mashhadi, M. J., Hemmati, H..  2020.  Hybrid Deep Neural Networks to Infer State Models of Black-Box Systems. 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE). :299–311.
Inferring a behavior model of a running software system is quite useful for several automated software engineering tasks, such as program comprehension, anomaly detection, and testing. Most existing dynamic model inference techniques are white-box, i.e., they require source code to be instrumented to get run-time traces. However, in many systems, instrumenting the entire source code is not possible (e.g., when using black-box third-party libraries) or might be very costly. Unfortunately, most black-box techniques that detect states over time are either univariate, or make assumptions on the data distribution, or have limited power for learning over a long period of past behavior. To overcome the above issues, in this paper, we propose a hybrid deep neural network that accepts as input a set of time series, one per input/output signal of the system, and applies a set of convolutional and recurrent layers to learn the non-linear correlations between signals and the patterns over time. We have applied our approach to a real UAV auto-pilot solution from our industry partner with half a million lines of C code. We ran 888 random recent system-level test cases and inferred states over time. Our comparison with several traditional time series change point detection techniques showed that our approach improves their performance by up to 102%, in terms of finding state change points, measured by F1 score. We also showed that our state classification algorithm provides on average 90.45% F1 score, which improves traditional classification algorithms by up to 17%.
Golagha, M., Pretschner, A., Briand, L. C..  2020.  Can We Predict the Quality of Spectrum-based Fault Localization? 2020 IEEE 13th International Conference on Software Testing, Validation and Verification (ICST). :4–15.
Fault localization and repair are time-consuming and tedious. There is a significant and growing need for automated techniques to support such tasks. Despite significant progress in this area, existing fault localization techniques are not widely applied in practice yet and their effectiveness varies greatly from case to case. Existing work suggests new algorithms and ideas as well as adjustments to the test suites to improve the effectiveness of automated fault localization. However, important questions remain open: Why is the effectiveness of these techniques so unpredictable? What are the factors that influence the effectiveness of fault localization? Can we accurately predict fault localization effectiveness? In this paper, we try to answer these questions by collecting 70 static, dynamic, test suite, and fault-related metrics that we hypothesize are related to effectiveness. Our analysis shows that a combination of only a few static, dynamic, and test metrics enables the construction of a prediction model with excellent discrimination power between levels of effectiveness (eight metrics yielding an AUC of .86; fifteen metrics yielding an AUC of .88). The model hence yields a practically useful confidence factor that can be used to assess the potential effectiveness of fault localization. Given that the metrics are the most influential metrics explaining the effectiveness of fault localization, they can also be used as a guide for corrective actions on code and test suites leading to more effective fault localization.
Li, Y., Yang, Y., Yu, X., Yang, T., Dong, L., Wang, W..  2020.  IoT-APIScanner: Detecting API Unauthorized Access Vulnerabilities of IoT Platform. 2020 29th International Conference on Computer Communications and Networks (ICCCN). :1–5.

The Internet of Things enables interaction between IoT devices and users through the cloud. The cloud provides services such as account monitoring, device management, and device control. As the center of the IoT platform, the cloud provides services to IoT devices and IoT applications through APIs. Therefore, the permission verification of the API is essential. However, we found that some APIs are unverified, which allows unauthorized users to access cloud resources or control devices; this could threaten the security of the devices and the cloud. To check for unauthorized access to the API, we developed IoT-APIScanner, a framework to check the permission verification of the cloud API. Through observation, we found there is a large amount of interactive information between the IoT application and the cloud, which includes the APIs and related parameters, so we can extract them by analyzing the code of the IoT application and use them for mutating API test cases. Through these test cases, we can effectively check the permissions of the API. In our research, we extracted a total of 5 platform APIs. Among them, the proportion of APIs without permission verification reached 13.3%. Our research shows that attackers could use the API without permission verification to obtain user privacy or control of devices.

Maksutov, A. A., Dmitriev, S. O., Lysenkov, V. I., Valter, D. A..  2018.  Mobile bootloader with security features. 2018 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus). :335–338.
Modern mobile operating systems store a lot of excessive information that can be used against the device's owner or organization, such as call history or various system logs. This article describes a universal way of preventing any mobile operating system or application from saving its data in the device's internal storage without reducing its functionality. The goal of this work is to create software that solves the described problem and works at the bootloading stage. A general algorithm of the designed software, along with its main solutions and requirements, is presented in this paper. Hardware requirements, software testing results and general applications of this software are also listed.
Su, H., Halak, B., Zwolinski, M..  2019.  Two-Stage Architectures for Resilient Lightweight PUFs. 2019 IEEE 4th International Verification and Security Workshop (IVSW). :19–24.
The following topics are dealt with: Internet of Things; invasive software; security of data; program testing; reverse engineering; product codes; binary codes; decoding; maximum likelihood decoding; field programmable gate arrays.
[Anonymous].  2018.  Cloud-based Labs and Programming Assignments in Networking and Cybersecurity Courses. 2018 IEEE Frontiers in Education Conference (FIE). :1–9.

This is a full paper for innovative practice. Building a private cloud or using a public cloud is now feasible at many institutions. This paper presents the innovative design of cloud-based labs and programming assignments for a networking course and a cybersecurity course, and our experiences of innovatively using the private cloud at our institution to support these learning activities. The instructor's observations and student survey data show that our approach benefits learning and teaching. This approach makes it possible and secure to develop some learning activities that otherwise would not be allowed on physical servers. It enables the instructor to support students' desire to develop programs in their preferred programming languages. It allows students to debug and test their programs on the same platform to be used by the instructor for testing and grading. The instructor does not need to spend extra time administering the computing environments. A majority (88% or more) of the students agree that working on those learning activities in the private cloud not only helps them achieve the course learning objectives, but also prepares them for their future careers.

Švábenský, V., Vykopal, J..  2018.  Gathering Insights from Teenagers’ Hacking Experience with Authentic Cybersecurity Tools. 2018 IEEE Frontiers in Education Conference (FIE). :1–4.

This Work-In-Progress Paper for the Innovative Practice Category presents a novel experiment in active learning of cybersecurity. We introduced a new workshop on hacking for an existing science-popularizing program at our university. The workshop participants, 28 teenagers, played a cybersecurity game designed for training undergraduates and professionals in penetration testing. Unlike in learning environments that are simplified for young learners, the game features a realistic virtual network infrastructure. This allows exploring security tools in an authentic scenario, which is complemented by a background story. Our research aim is to examine how young players approach using cybersecurity tools by interacting with the professional game. A preliminary analysis of the game session showed several challenges that the workshop participants faced. Nevertheless, they reported learning about security tools and exploits, and 61% of them reported wanting to learn more about cybersecurity after the workshop. Our results support the notion that young learners should be allowed more hands-on experience with security topics, both in formal education and informal extracurricular events.

Sultana, K. Z., Williams, B. J., Bosu, A..  2018.  A Comparison of Nano-Patterns vs. Software Metrics in Vulnerability Prediction. 2018 25th Asia-Pacific Software Engineering Conference (APSEC). :355–364.

Context: Software security is an imperative aspect of software quality. Early detection of vulnerable code during development can better ensure the security of the codebase and minimize testing efforts. Although traditional software metrics are used for early detection of vulnerabilities, they do not clearly address the granularity level of the issue to precisely pinpoint vulnerabilities. The goal of this study is to employ method-level traceable patterns (nano-patterns) in vulnerability prediction and empirically compare their performance with traditional software metrics. The concept of nano-patterns is similar to design patterns, but these constructs can be automatically recognized and extracted from source code. If nano-patterns can better predict vulnerable methods compared to software metrics, they can be used in developing vulnerability prediction models with better accuracy. Aims: This study explores the performance of method-level patterns in vulnerability prediction. We also compare them with method-level software metrics. Method: We studied vulnerabilities reported for two major releases of Apache Tomcat (6 and 7), Apache CXF, and two stand-alone Java web applications. We used three machine learning techniques to predict vulnerabilities using nano-patterns as features. We applied the same techniques using method-level software metrics as features and compared their performance with nano-patterns. Results: We found that nano-patterns show lower false negative rates for classifying vulnerable methods (for Tomcat 6, 21% vs 34.7%) and therefore, have higher recall in predicting vulnerable code than the software metrics used. On the other hand, software metrics show higher precision than nano-patterns (79.4% vs 76.6%). Conclusion: In summary, we suggest developers use nano-patterns as features for vulnerability prediction to augment existing approaches as these code constructs outperform standard metrics in terms of prediction recall.
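The step from "lower false negative rates" to "higher recall" in the abstract above follows from the definition recall = TP / (TP + FN) = 1 - FNR. The short Python sketch below works through the Tomcat 6 figures quoted in the abstract; the function name is an illustrative assumption, and the derived recall values are arithmetic consequences of the quoted rates rather than numbers reported by the authors.

```python
def recall_from_fnr(false_negative_rate: float) -> float:
    """Recall is the complement of the false negative rate: TP/(TP+FN) = 1 - FN/(TP+FN)."""
    return 1.0 - false_negative_rate


# False negative rates quoted in the abstract for Tomcat 6.
nano_pattern_recall = recall_from_fnr(0.21)      # nano-patterns as features
software_metric_recall = recall_from_fnr(0.347)  # software metrics as features

print(f"nano-patterns:    {nano_pattern_recall:.1%}")
print(f"software metrics: {software_metric_recall:.1%}")
```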

Huang, B., Zhang, P..  2018.  Software Runtime Accumulative Testing. 2018 12th International Conference on Reliability, Maintainability, and Safety (ICRMS). :218–222.

The "aging" phenomenon occurs after the long-term running of software, with the fault rate rising and running efficiency dropping. As there is no corresponding testing type for this phenomenon among conventional software tests, "software runtime accumulative testing" is proposed. Through analyzing several examples of software aging causing serious accidents, software is placed in the system environment required for running and the occurrence mechanism of software aging is analyzed. In addition, corresponding testing contents and recommended testing methods are designed with regard to all factors causing software aging, and the testing process and key points of testing requirement analysis for carrying out runtime accumulative testing are summarized, thereby providing a method and guidance for carrying out "software runtime accumulative testing" in software engineering.

Zong, P., Wang, Y., Xie, F..  2018.  Embedded Software Fault Prediction Based on Back Propagation Neural Network. 2018 IEEE International Conference on Software Quality, Reliability and Security Companion (QRS-C). :553–558.

Predicting software faults before software testing activities can help distribute time and resources rationally. Software metrics are used for software fault prediction due to their close relationship with software faults. Thanks to their non-linear fitting ability, neural networks are increasingly used in prediction models. We first filter the metric set of the embedded software by statistical methods to reduce the dimensions of the model input. Then we build a back propagation neural network with a simple structure but good performance and apply it to two practical embedded software projects. The verification results show that the model has a good ability to predict software faults.

Chamarthi, R., Reddy, A. P..  2018.  Empirical Methodology of Testing Using FMEA and Quality Metrics. 2018 International Conference on Inventive Research in Computing Applications (ICIRCA). :85–90.

Testing, which is an indispensable part of software engineering, is itself an art and a science that emerged as a discipline over time. If testing finds defects, testers reduce risk by raising awareness of the defects and of solutions for dealing with them before release. If testing does not find any defects, it gives assurance that the system functions correctly under certain conditions. To guarantee that enough testing has been done, major risk areas need to be tested. We have to identify the risks, and analyse and control them. We need to categorize the risk items to decide the extent of testing to be covered. Also, the implementation of structured metrics is lagging in software testing. Efficient metrics are necessary to evaluate and manage the testing process and to make testing a part of the engineering discipline. This paper proposes the usage of risk-based testing using the FMEA technique and provides an ideal set of metrics that helps ensure an effective testing process.

Huang, S., Chen, Q., Chen, Z., Chen, L., Liu, J., Yang, S..  2019.  A Test Cases Generation Technique Based on an Adversarial Samples Generation Algorithm for Image Classification Deep Neural Networks. 2019 IEEE 19th International Conference on Software Quality, Reliability and Security Companion (QRS-C). :520–521.

Widely applied in various fields, deep learning (DL) is becoming the key driving force in industry. Although it has achieved great success in artificial intelligence tasks, it has defects, as traditional software does, and a failure can cause unpredictable accidents and losses. In this paper, we propose a test case generation technique based on an adversarial sample generation algorithm for image classification deep neural networks (DNNs), which can generate a large number of good test cases for testing DNNs, especially when test cases are insufficient. We briefly introduce our method and implement the framework. We conduct experiments on some classic DNN models and datasets. We further evaluate the test set by using a coverage metric based on states of the DNN.

Ping, C., Jun-Zhe, Z..  2019.  Research on Intelligent Evaluation Method of Transient Analysis Software Function Test. 2019 International Conference on Advances in Construction Machinery and Vehicle Engineering (ICACMVE). :58–61.

In a transient distributed cloud computing environment, software is vulnerable to attack, which threatens software functional completeness, so it is necessary to carry out functional testing. In order to solve the problem of high overhead and high complexity of unsupervised test methods, an intelligent evaluation method for transient analysis software function testing based on an active deep learning algorithm is proposed. Firstly, the active deep learning mathematical model of the transient analysis software function test is constructed by using the association rule mining method, and the correlation dimension characteristics of software function failure are analyzed. Then the reliability of the software is measured by the spectral density distribution method of software functional completeness. The intelligent evaluation model of transient analysis software function testing is established in the transient distributed cloud computing environment, and the function testing and intelligent reliability evaluation are realized. Finally, the performance of the transient analysis software is verified by a simulation experiment. The results show that, when this method is used to carry out the function test of the transient analysis software, the accuracy of software functional integrity positioning is high and the intelligent evaluation of the function testing has good self-adaptability. It ensures the safe and reliable operation of the software.

Zhang, Z., Xie, X..  2019.  On the Investigation of Essential Diversities for Deep Learning Testing Criteria. 2019 IEEE 19th International Conference on Software Quality, Reliability and Security (QRS). :394–405.

In recent years, more and more testing criteria for deep learning systems have been proposed to ensure system robustness and reliability. These criteria were defined based on different perspectives of diversity. However, there is no comprehensive investigation of which diversities are most essential for a testing criterion for deep learning systems to consider. Therefore, in this paper, we conduct an empirical study to investigate the relation between test diversities and erroneous behaviors of deep learning models. We define five metrics to reflect diversities in neuron activities, and leverage metamorphic testing to detect erroneous behaviors. We investigate the correlation between metrics and erroneous behaviors. We also go a step further to measure the quality of test suites under the guidance of the defined metrics. Our results provide comprehensive insights into the essential diversities that testing criteria need in order to exhibit good fault detection ability.
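The abstract above relies on metamorphic testing to detect erroneous behaviors without labelled expected outputs. The minimal Python sketch below shows the general idea under stated assumptions: `toy_classifier` is a stand-in threshold model, and the "slight brightening must not change the label" relation is a hypothetical metamorphic relation chosen for illustration, not one taken from the paper.

```python
from typing import Callable, List, Sequence


def toy_classifier(pixels: Sequence[float]) -> str:
    """Stand-in model (assumption): label an image by its mean intensity."""
    return "bright" if sum(pixels) / len(pixels) > 0.5 else "dark"


def brighten(pixels: Sequence[float], delta: float = 0.01) -> List[float]:
    """Follow-up input: a small perturbation that should preserve the label."""
    return [min(1.0, p + delta) for p in pixels]


def metamorphic_violations(
    model: Callable[[Sequence[float]], str],
    inputs: List[Sequence[float]],
    transform: Callable[[Sequence[float]], Sequence[float]],
) -> List[Sequence[float]]:
    """Return the inputs whose prediction changes under the transform."""
    return [x for x in inputs if model(x) != model(transform(x))]


images = [[0.1, 0.2], [0.9, 0.8], [0.495, 0.5]]
# Only the input close to the decision threshold flips its label.
print(metamorphic_violations(toy_classifier, images, brighten))
```

No ground-truth label is needed: a violation is flagged purely because the source and follow-up predictions disagree.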

Hamad, R. M. H., Fayoumi, M. Al.  2019.  Scalable Quality and Testing Lab (SQTL): Mission-Critical Applications Testing. 2019 International Conference on Computer and Information Sciences (ICCIS). :1–7.

Currently, the complexity of software quality and testing is increasing exponentially, with a huge number of challenges arising, especially when testing mission-critical applications in banking and other critical domains, or when dealing with new technology trends using decentralized and nonintegrated testing tools. From practical experience, software testing has become costlier and more effort-intensive, with unlimited scope. This thesis promotes the Scalable Quality and Testing Lab (SQTL), a centralized quality and testing platform that integrates powerful manual, automation and business intelligence tools. SQTL helps quality engineers (QE) effectively organize, manage and control all testing activities in one centralized lab, starting with creating test cases, then executing different testing types such as web, security and others, and finally analyzing and displaying the results of all testing activities in an interactive dashboard, which allows QE to forecast new bugs, especially those related to security. The centralized SQTL aims to empower QE during the testing cycle, help them achieve a greater level of software quality with minimum time, effort and cost, and decrease the defect density metric.

Zhao, Xinghan, Gao, Xiangfei.  2018.  An AI Software Test Method Based on Scene Deductive Approach. 2018 IEEE International Conference on Software Quality, Reliability and Security Companion (QRS-C). :14–20.
Artificial intelligence (AI) software has high algorithmic complexity, the scale and dimension of its input and output parameters are high, and its test oracle is not explicit. These features make the design of test cases difficult. This paper proposes an AI software testing method based on a scene deductive approach. It models the input and output parameters and the environment, uses a random algorithm to generate the inputs of the test cases, then uses the deductive algorithm to run the software tests automatically, and uses test assertions to verify the results. After describing the theory, this paper uses an intelligent tracking car as an example to illustrate the application of this method and the problems needing attention. In the end, the paper describes the shortcomings of this method and future research directions.
Zaman, Tarannum Shaila, Han, Xue, Yu, Tingting.  2019.  SCMiner: Localizing System-Level Concurrency Faults from Large System Call Traces. 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). :515–526.

Localizing concurrency faults that occur in production is hard because (1) detailed field data, such as user input, file content and interleaving schedule, may not be available to developers to reproduce the failure; (2) it is often impractical to assume the availability of multiple failing executions to localize the faults using existing techniques; (3) it is challenging to search for buggy locations in an application given limited runtime data; and (4) concurrency failures at the system level often involve multiple processes or event handlers (e.g., software signals), which cannot be handled by existing tools for diagnosing intra-process (thread-level) failures. To address these problems, we present SCMiner, a practical online bug diagnosis tool to help developers understand how a system-level concurrency fault happens based on the logs collected by the default system audit tools. SCMiner achieves online bug diagnosis to obviate the need for offline bug reproduction. SCMiner does not require code instrumentation on the production system or rely on the assumption of the availability of multiple failing executions. Specifically, after the system call traces are collected, SCMiner uses data mining and statistical anomaly detection techniques to identify the failure-inducing system call sequences. It then maps each abnormal sequence to specific application functions. We have conducted an empirical study on 19 real-world benchmarks. The results show that SCMiner is both effective and efficient at localizing system-level concurrency faults.

Lv, Chengcheng, Zhang, Long, Zeng, Fanping, Zhang, Jian.  2019.  Adaptive Random Testing for XSS Vulnerability. 2019 26th Asia-Pacific Software Engineering Conference (APSEC). :63–69.
XSS is one of the common vulnerabilities in web applications. Many black-box testing tools collect a large number of payloads and traverse them to find a payload that can be successfully injected, but they are not very efficient. Previous research has paid little attention to how to improve the efficiency of black-box testing for detecting XSS vulnerabilities. To improve the efficiency of testing, we develop an XSS testing tool. It collects 6128 payloads and uses a headless browser to detect XSS vulnerabilities. The tool can discover XSS vulnerabilities quickly with the ART (Adaptive Random Testing) method. We conduct an experiment using 3 extensively adopted open source vulnerable benchmarks and 2 actual websites to evaluate the ART method. The experimental results indicate that the ART method can effectively improve the fuzzing method by more than 27.1% in reducing the number of attempts before accomplishing a successful injection.
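To make the ART idea in the abstract above concrete, here is a minimal Python sketch of fixed-size-candidate-set adaptive random testing over a payload list: each step draws a few random candidates and tries the one farthest from everything already tried. The string distance based on `difflib.SequenceMatcher`, the `try_payload` callback (standing in for the headless-browser check) and the candidate-set size are all assumptions of this sketch; the abstract does not specify the tool's actual distance metric or selection strategy.

```python
import random
from difflib import SequenceMatcher
from typing import Callable, List, Optional, Tuple


def distance(a: str, b: str) -> float:
    """Dissimilarity in [0, 1]; an assumed stand-in for a payload distance metric."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def art_search(
    payloads: List[str],
    try_payload: Callable[[str], bool],
    candidates_per_step: int = 10,
    seed: int = 0,
) -> Tuple[Optional[str], int]:
    """Return (first successful payload or None, number of attempts made)."""
    rng = random.Random(seed)
    remaining = list(payloads)
    executed: List[str] = []
    while remaining:
        if not executed:
            choice = rng.choice(remaining)  # the first test is purely random
        else:
            pool = rng.sample(remaining, min(candidates_per_step, len(remaining)))
            # Pick the candidate whose nearest already-tried payload is farthest away.
            choice = max(pool, key=lambda p: min(distance(p, e) for e in executed))
        remaining.remove(choice)
        executed.append(choice)
        if try_payload(choice):
            return choice, len(executed)
    return None, len(executed)
```

A caller would pass the collected payloads and a function that injects one payload and reports whether it executed; spreading attempts across dissimilar payloads is what is expected to reduce the number of tries before the first success.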
Simos, Dimitris E., Garn, Bernhard, Zivanovic, Jovan, Leithner, Manuel.  2019.  Practical Combinatorial Testing for XSS Detection using Locally Optimized Attack Models. 2019 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW). :122–130.
In this paper, we present a combinatorial testing methodology for automated black-box security testing of complex web applications. The focus of our work is the identification of Cross-site Scripting (XSS) vulnerabilities. We introduce a new modelling scheme for test case generation of XSS attack vectors consisting of locally optimized attack models. The modelling approach takes into account the response and behavior of the web application and is particularly efficient when used in conjunction with combinatorial testing. In addition to the modelling scheme, we present a research prototype of a security testing tool called XSSInjector, which executes attack vectors generated from our methodology against web applications. The tool also employs a newly developed test oracle for detecting XSS, which allows us to precisely identify whether injected JavaScript is actually executed and thus eliminate false positives. Our testing methodology is sufficiently generic to be applied to any web application that returns HTML code. We describe the foundations of our approach and validate it via an extensive case study using a verification framework and real world web applications. In particular, we have found several new critical vulnerabilities in popular forum software, library management systems and gallery packages.
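Combinatorial testing, as used in the abstract above, builds a small test suite in which every pair of parameter values appears together at least once instead of enumerating every combination. The Python sketch below is a simple greedy pairwise generator; the attack-model parameters and values are hypothetical placeholders, and the greedy search is a generic technique, not the paper's locally optimized attack models or the XSSInjector implementation.

```python
from itertools import combinations, product
from typing import Dict, List, Sequence


def pairs_of(test: Sequence) -> set:
    """All ((position, value), (position, value)) pairs covered by one test vector."""
    return set(combinations(enumerate(test), 2))


def pairwise_suite(parameters: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Greedily pick full combinations until every value pair is covered."""
    names = list(parameters)
    all_tests = list(product(*(parameters[n] for n in names)))
    uncovered = set().union(*(pairs_of(t) for t in all_tests))
    suite = []
    while uncovered:
        # Choose the combination that covers the most still-uncovered pairs.
        best = max(all_tests, key=lambda t: len(pairs_of(t) & uncovered))
        suite.append(dict(zip(names, best)))
        uncovered -= pairs_of(best)
    return suite


# Hypothetical attack-vector model: each parameter is one slot of the vector.
attack_model = {
    "tag": ["<script>", "<img>"],
    "event": ["onerror", "onload"],
    "quote": ["'", '"'],
    "payload": ["alert(1)", "confirm(1)"],
}
suite = pairwise_suite(attack_model)
print(len(suite), "tests instead of", 2 ** 4)
```

The exhaustive scan over all combinations keeps the sketch short but only suits small models; dedicated covering-array tools would be needed for larger ones.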
Patel, Keyur.  2019.  A Survey on Vulnerability Assessment Penetration Testing for Secure Communication. 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI). :320–325.
As technology grows rapidly, the development of systems and software is becoming more complex. For this reason, software and web applications are becoming more vulnerable. In the last two decades, the use of internet applications and security hacking activities have been at the center of attention. The biggest challenge for organizations is how to secure their web applications against rapidly increasing cyber threats, because they cannot compromise the security of their sensitive information. Vulnerability Assessment and Penetration Testing techniques may help organizations to find security loopholes. A weakness can become an asset for the attacker if the organization is not aware of it. Vulnerability Assessment and Penetration Testing helps an organization to close security loopholes and determine whether its security arrangements are working as per defined policies. To cover the tracks and mitigate the threats, it is necessary to install security patches. This paper surveys current vulnerabilities, how those vulnerabilities are determined, the methodology used for the determination, and the tools used to determine the vulnerabilities in order to secure organizations from cyber threats.
Mohammadi, Mahmoud, Chu, Bill, Richter Lipford, Heather.  2019.  Automated Repair of Cross-Site Scripting Vulnerabilities through Unit Testing. 2019 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW). :370–377.
Many web applications are vulnerable to Cross Site Scripting (XSS) attacks, enabling attackers to steal sensitive information and commit fraud. Much research in this area has focused on detecting vulnerable web pages using static and dynamic program analysis. The best practice to prevent XSS vulnerabilities is to encode untrusted dynamic content. However, a common programming error is the use of the wrong type of encoder to sanitize untrusted data, leaving the application vulnerable. We propose a new approach that can automatically fix this common type of XSS vulnerability in many situations. This approach is integrated into the software maintenance life cycle through unit testing. Vulnerable code is refactored to reflect the suggested encoder and then verified using an attack evaluating mechanism to find a proper repair. Evaluation of this approach has been conducted on an open source medical record application with over 200 web pages written in JSP.
Chen, Yuqi, Poskitt, Christopher M., Sun, Jun.  2018.  Learning from Mutants: Using Code Mutation to Learn and Monitor Invariants of a Cyber-Physical System. 2018 IEEE Symposium on Security and Privacy (SP). :648–660.
Cyber-physical systems (CPS) consist of sensors, actuators, and controllers all communicating over a network; if any subset becomes compromised, an attacker could cause significant damage. With access to data logs and a model of the CPS, the physical effects of an attack could potentially be detected before any damage is done. Manually building a model that is accurate enough in practice, however, is extremely difficult. In this paper, we propose a novel approach for constructing models of CPS automatically, by applying supervised machine learning to data traces obtained after systematically seeding their software components with faults ("mutants"). We demonstrate the efficacy of this approach on the simulator of a real-world water purification plant, presenting a framework that automatically generates mutants, collects data traces, and learns an SVM-based model. Using cross-validation and statistical model checking, we show that the learnt model characterises an invariant physical property of the system. Furthermore, we demonstrate the usefulness of the invariant by subjecting the system to 55 network and code-modification attacks, and showing that it can detect 85% of them from the data logs generated at runtime.