Title | Fidelity: Towards Measuring the Trustworthiness of Neural Network Classification |
Publication Type | Conference Paper |
Year of Publication | 2019 |
Authors | Yang, Z. |
Conference Name | 2019 IEEE Conference on Dependable and Secure Computing (DSC) |
Keywords | adversarial attack detection, adversarial examples, adversarial settings, composability, Computational modeling, learning (artificial intelligence), machine learning, machine learning model, neural nets, neural network classification, neural network system, Neural networks, pattern classification, Perturbation methods, pubcrawl, security of data, security-critical tasks, Sociology, Statistics, Task Analysis, Trusted Computing, trustworthiness |
Abstract | With the increasing performance of neural networks on many security-critical tasks, the security concerns of machine learning have become increasingly prominent. Recent studies have shown that neural networks are vulnerable to adversarial examples: carefully crafted inputs, formed by adding negligible perturbations to legitimate samples, can mislead a neural network into producing adversary-selected outputs while humans still classify them correctly. Therefore, we need an additional measure of the trustworthiness of a machine learning model's results, especially in adversarial settings. In this paper, we analyse the root cause of adversarial examples and propose a new property of machine learning models, namely fidelity, to describe the gap between what a model learns and the ground truth learned by humans. One of its benefits is detecting adversarial attacks. We formally define fidelity and propose a novel approach to quantify it. We evaluate the quantification of fidelity in adversarial settings on two neural networks. The study shows that incorporating fidelity enables a neural network system to detect adversarial examples with a true positive rate of 97.7% and a false positive rate of 1.67% on one of the studied neural networks. |
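Note | The abstract does not specify how the adversarial examples were crafted or how fidelity is quantified, so the sketch below is only an illustration of the "negligible perturbation" attacks such a detector must flag, using the well-known FGSM method rather than the paper's own procedure; the model, tensors, and epsilon value are hypothetical placeholders. |

```python
# Illustrative sketch (not the paper's fidelity quantification): craft an FGSM
# adversarial example, i.e. a small signed-gradient perturbation of a legitimate
# input that can flip a classifier's prediction. Model and data are stand-ins.
import torch
import torch.nn as nn

def fgsm_example(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
                 eps: float = 0.03) -> torch.Tensor:
    """Take one signed-gradient step of size eps to increase the loss on (x, y)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x_adv), y)
    loss.backward()
    # Perturb toward higher loss, then clamp back to the valid pixel range.
    return (x_adv + eps * x_adv.grad.sign()).clamp(0.0, 1.0).detach()

if __name__ == "__main__":
    # Tiny stand-in classifier for 28x28 grayscale inputs (MNIST-sized images).
    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
    x = torch.rand(1, 1, 28, 28)   # a random "legitimate" sample
    y = torch.tensor([3])          # its assumed ground-truth label
    x_adv = fgsm_example(model, x, y)
    print("perturbation L_inf norm:", (x_adv - x).abs().max().item())
    print("clean prediction:", model(x).argmax(1).item(),
          "adversarial prediction:", model(x_adv).argmax(1).item())
```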
DOI | 10.1109/DSC47296.2019.8937572 |
Citation Key | yang_fidelity_2019 |