Test4Deep: An Effective White-Box Testing for Deep Neural Networks

Submitted by grigby1 on Tue, 02/18/2020 - 11:52am

Title	Test4Deep: An Effective White-Box Testing for Deep Neural Networks
Publication Type	Conference Paper
Year of Publication	2019
Authors	Yu, Jing; Fu, Yao; Zheng, Yanan; Wang, Zheng; Ye, Xiaojun
Conference Name	2019 IEEE International Conference on Computational Science and Engineering (CSE) and IEEE International Conference on Embedded and Ubiquitous Computing (EUC)
Keywords	Biological neural networks, composability, Deep Learning testing, deep neural networks, DeepXplore, diversity, DNNs, generated test inputs, learning (artificial intelligence), machine learning, Metrics, multiple DNNs, neural nets, neuron coverage, Neurons, Optimization, Prediction algorithms, program testing, pubcrawl, resilience, Resiliency, Software, test cases, Test4Deep, Testing, white box cryptography, white box testing, white-box testing framework
Abstract	Current testing for Deep Neural Networks (DNNs) focuses on the quantity of test cases but ignores their diversity. To the best of our knowledge, DeepXplore is the first white-box framework for Deep Learning testing; it triggers differential behaviors between multiple DNNs and increases neuron coverage to improve diversity. Because it relies on multiple DNNs, it faces two problems: (1) the framework is not suited to testing a single DNN, and (2) if all DNNs make incorrect predictions simultaneously, DeepXplore cannot generate test cases. This paper presents Test4Deep, a white-box testing framework based on a single DNN. Test4Deep avoids the pitfalls of relying on multiple DNNs by inducing inconsistencies between the predicted labels of original inputs and those of generated test inputs. Meanwhile, Test4Deep improves neuron coverage to capture more diversity by attempting to activate more inactivated neurons. The proposed method was evaluated on three popular datasets with nine DNNs. Compared with DeepXplore, Test4Deep produced on average 4.59% (maximum 10.49%) more test cases, all of which revealed errors and faults in the DNNs. These test cases achieved a 19.57% larger diversity increment and a 25.88% increment in neuron coverage. Test4Deep can further be used to improve the accuracy of DNNs by 5.72% on average (maximum 7.0%).
URL	http://dx.doi.org/10.1109/CSE/EUC.2019.00013
DOI	10.1109/CSE/EUC.2019.00013
Citation Key	yu_test4deep_2019
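
The abstract relies on two ideas: neuron coverage and inconsistency between the predicted label of an original input and that of its generated counterpart. The sketch below illustrates both on toy data. It follows the common DeepXplore-style reading of neuron coverage (the fraction of neurons whose activation exceeds a threshold for at least one test input); the abstract does not state the exact definition or threshold Test4Deep uses, so the function names, the threshold value, and the precomputed activation lists are all assumptions made for illustration.

```python
from typing import List, Sequence


def neuron_coverage(activations: Sequence[Sequence[float]],
                    threshold: float = 0.0) -> float:
    """Fraction of neurons activated above `threshold` by at least one input.

    activations[i][j] is the (hypothetical, precomputed) output of
    neuron j on test input i.
    """
    if not activations or not activations[0]:
        return 0.0
    n_neurons = len(activations[0])
    covered = [False] * n_neurons
    for row in activations:
        for j, value in enumerate(row):
            if value > threshold:
                covered[j] = True
    return sum(covered) / n_neurons


def label_flips(original_labels: Sequence[int],
                generated_labels: Sequence[int]) -> List[int]:
    """Indices where the generated input's predicted label differs
    from the predicted label of the original input it was derived from."""
    return [i for i, (orig, gen) in enumerate(zip(original_labels, generated_labels))
            if orig != gen]


# Toy data: 3 inputs, 4 neurons (values are made up).
acts = [
    [0.9, 0.0, 0.1, 0.0],
    [0.2, 0.0, 0.7, 0.0],
    [0.0, 0.0, 0.3, 0.6],
]
print(neuron_coverage(acts, threshold=0.5))  # neurons 0, 2 and 3 fire -> 0.75
print(label_flips([3, 1, 7], [3, 4, 7]))     # only input 1 changes label -> [1]
```

In this toy run, neuron 1 never exceeds the threshold, so a coverage-guided generator would try to produce inputs that activate it, while the label-flip check marks which generated inputs qualify as candidate test cases for a single DNN.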
