Title: Quantifying Dataset Properties for Systematic Artificial Neural Network Classifier Verification

Author(s): Alex White, Darryl Hond, Hamid Asgari

Publication Event: Proceedings of the Twenty-eighth Safety-Critical Systems Symposium, York, UK

Publication Date: 2020-02-11

Resource URL: https://scsc.uk/r1172.pdf

Abstract:

Autonomous systems make use of a suite of algorithms for understanding the environment in which they are deployed. These algorithms typically solve one or more classic problems, such as classification, prediction and detection. Solving these problems is a key step towards making independent decisions in order to accomplish a set of objectives. Artificial neural networks (ANNs) are one such class of algorithms, which have shown great promise in view of their apparent ability to learn the complicated patterns underlying high-dimensional data. The decision boundary approximated by such networks is highly non-linear and difficult to interpret, which is particularly problematic in cases where these decisions can compromise the safety of either the system itself or people. Furthermore, the choice of data used to train and test the network can have a dramatic impact on performance (e.g. misclassification) and consequently safety. In this paper, we introduce a novel measure for quantifying the difference between the datasets used for training ANN-based object classification algorithms and the test datasets used for verifying and evaluating classifier performance. This measure allows performance metrics to be placed into context by characterizing the test datasets employed for evaluation. A system requirement could specify the permitted form of the functional relationship between ANN classifier performance and the dissimilarity between training and test datasets. The novel measure is empirically assessed using publicly available datasets.
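
The abstract does not define the measure itself, so the following is only a minimal illustrative sketch of the general idea: quantifying how far a test dataset sits from a training dataset, so that a reported accuracy can be read in the context of that dissimilarity. The choice of mean nearest-neighbour Euclidean distance over raw feature vectors is an assumption made here for illustration, not the measure introduced in the paper.

```python
# Illustrative sketch only: a stand-in train/test dissimilarity score,
# NOT the novel measure proposed in the paper. Assumes raw feature
# vectors and Euclidean nearest-neighbour distance.
import numpy as np
from sklearn.neighbors import NearestNeighbors


def train_test_dissimilarity(x_train: np.ndarray, x_test: np.ndarray) -> float:
    """Mean distance from each test sample to its nearest training sample."""
    nn = NearestNeighbors(n_neighbors=1).fit(x_train)
    distances, _ = nn.kneighbors(x_test)
    return float(distances.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_train = rng.normal(0.0, 1.0, size=(1000, 16))
    x_test = rng.normal(0.5, 1.0, size=(200, 16))  # deliberately shifted test set
    score = train_test_dissimilarity(x_train, x_test)
    # Classifier accuracy would then be reported alongside this score, so that
    # performance can be expressed as a function of train/test mismatch.
    print(f"train/test dissimilarity: {score:.3f}")
```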