Biblio
The Software Assurance Metrics and Tool Evaluation (SAMATE) project at the National Institute of Standards and Technology (NIST) has created the Software Assurance Reference Dataset (SARD) to provide researchers and software security assurance tool developers with a set of known security flaws. As part of an empirical evaluation of a runtime monitoring framework, two test suites were executed and monitored, revealing deficiencies which led to a collaboration with the NIST SAMATE team to provide replacements. Test Suites 45 and 46 are analyzed, discussed, and updated to improve accuracy, consistency, preciseness, and automation. Empirical results show metrics such as recall, precision, and F-Measure are all impacted by invalid base assumptions regarding the test suites.
The Software Assurance Metrics and Tool Evaluation (SAMATE) project at the National Institute of Standards and Technology (NIST) has created the Software Assurance Reference Dataset (SARD) to provide researchers and software security assurance tool developers with a set of known security flaws. As part of an empirical evaluation of a runtime monitoring framework, two test suites were executed and monitored, revealing deficiencies which led to a collaboration with the NIST SAMATE team to provide replacements. Test Suites 45 and 46 are analyzed, discussed, and updated to improve accuracy, consistency, preciseness, and automation. Empirical results show metrics such as recall, precision, and F-Measure are all impacted by invalid base assumptions regarding the test suites.