Biblio | CPS-VO

AKCENGİZ, Ziya, Aslan, Melis, Karabayır, Özgür, Doğanaksoy, Ali, Uğuz, Muhiddin, Sulak, Fatih. 2020. Statistical Randomness Tests of Long Sequences by Dynamic Partitioning. 2020 International Conference on Information Security and Cryptology (ISCTURKEY). :68—74.

Random numbers have a wide usage in the area of cryptography. In practice, pseudo random number generators are used in place of true random number generators, as regeneration of them may be required. Therefore because of generation methods of pseudo random number sequences, statistical randomness tests have a vital importance. In this paper, a randomness test suite is specified for long binary sequences. In literature, there are many randomness tests and test suites. However, in most of them, to apply randomness test, long sequences are partitioned into a certain fixed length and the collection of short sequences obtained is evaluated instead. In this paper, instead of partitioning a long sequence into fixed length subsequences, a concept of dynamic partitioning is introduced in accordance with the random variable in consideration. Then statistical methods are applied. The suggested suite, containing four statistical tests: Collision Tests, Weight Test, Linear Complexity Test and Index Coincidence Test, all of them work with the idea of dynamic partitioning. Besides the adaptation of this approach to randomness tests, the index coincidence test is another contribution of this work. The distribution function and the application of all tests are given in the paper.

Staicu, C.-A., Torp, M. T., Schäfer, M., Møller, A., Pradel, M.. 2020. Extracting Taint Specifications for JavaScript Libraries. 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE). :198—209.

Modern JavaScript applications extensively depend on third-party libraries. Especially for the Node.js platform, vulnerabilities can have severe consequences to the security of applications, resulting in, e.g., cross-site scripting and command injection attacks. Existing static analysis tools that have been developed to automatically detect such issues are either too coarse-grained, looking only at package dependency structure while ignoring dataflow, or rely on manually written taint specifications for the most popular libraries to ensure analysis scalability. In this work, we propose a technique for automatically extracting taint specifications for JavaScript libraries, based on a dynamic analysis that leverages the existing test suites of the libraries and their available clients in the npm repository. Due to the dynamic nature of JavaScript, mapping observations from dynamic analysis to taint specifications that fit into a static analysis is non-trivial. Our main insight is that this challenge can be addressed by a combination of an access path mechanism that identifies entry and exit points, and the use of membranes around the libraries of interest. We show that our approach is effective at inferring useful taint specifications at scale. Our prototype tool automatically extracts 146 additional taint sinks and 7 840 propagation summaries spanning 1 393 npm modules. By integrating the extracted specifications into a commercial, state-of-the-art static analysis, 136 new alerts are produced, many of which correspond to likely security vulnerabilities. Moreover, many important specifications that were originally manually written are among the ones that our tool can now extract automatically.

Zhang, Z., Xie, X.. 2019. On the Investigation of Essential Diversities for Deep Learning Testing Criteria. 2019 IEEE 19th International Conference on Software Quality, Reliability and Security (QRS). :394–405.

Recent years, more and more testing criteria for deep learning systems has been proposed to ensure system robustness and reliability. These criteria were defined based on different perspectives of diversity. However, there lacks comprehensive investigation on what are the most essential diversities that should be considered by a testing criteria for deep learning systems. Therefore, in this paper, we conduct an empirical study to investigate the relation between test diversities and erroneous behaviors of deep learning models. We define five metrics to reflect diversities in neuron activities, and leverage metamorphic testing to detect erroneous behaviors. We investigate the correlation between metrics and erroneous behaviors. We also go further step to measure the quality of test suites under the guidance of defined metrics. Our results provided comprehensive insights on the essential diversities for testing criteria to exhibit good fault detection ability.

Hoole, Alexander M., Traore, Issa, Delaitre, Aurelien, de Oliveira, Charles. 2016. Improving Vulnerability Detection Measurement: [Test Suites and Software Security Assurance]. Proceedings of the 20th International Conference on Evaluation and Assessment in Software Engineering. :27:1–27:10.

The Software Assurance Metrics and Tool Evaluation (SAMATE) project at the National Institute of Standards and Technology (NIST) has created the Software Assurance Reference Dataset (SARD) to provide researchers and software security assurance tool developers with a set of known security flaws. As part of an empirical evaluation of a runtime monitoring framework, two test suites were executed and monitored, revealing deficiencies which led to a collaboration with the NIST SAMATE team to provide replacements. Test Suites 45 and 46 are analyzed, discussed, and updated to improve accuracy, consistency, preciseness, and automation. Empirical results show metrics such as recall, precision, and F-Measure are all impacted by invalid base assumptions regarding the test suites.

Hoole, Alexander M., Traore, Issa, Delaitre, Aurelien, de Oliveira, Charles. 2016. Improving Vulnerability Detection Measurement: [Test Suites and Software Security Assurance]. Proceedings of the 20th International Conference on Evaluation and Assessment in Software Engineering. :27:1–27:10.

The Software Assurance Metrics and Tool Evaluation (SAMATE) project at the National Institute of Standards and Technology (NIST) has created the Software Assurance Reference Dataset (SARD) to provide researchers and software security assurance tool developers with a set of known security flaws. As part of an empirical evaluation of a runtime monitoring framework, two test suites were executed and monitored, revealing deficiencies which led to a collaboration with the NIST SAMATE team to provide replacements. Test Suites 45 and 46 are analyzed, discussed, and updated to improve accuracy, consistency, preciseness, and automation. Empirical results show metrics such as recall, precision, and F-Measure are all impacted by invalid base assumptions regarding the test suites.