
Automated Whitebox Testing of Deep Learning Systems
Explore DeepXplore, an automated whitebox testing approach for deep learning systems, addressing the failures in existing testing methods to uncover erroneous behaviors for rare inputs. The proposed solutions include adversarial testing, neuron coverage measurement, and optimization for best test inputs.
Presentation Transcript
DeepXplore: Automated Whitebox Testing of Deep Learning Systems. Kexin Pei, Junfeng Yang, Yinzhi Cao, Suman Jana
Motivation: Deep learning (DL) techniques are now deployed in safety-critical and time-critical domains such as self-driving cars and malware detection. Existing DL testing fails to expose erroneous behaviors for rare inputs. Two examples: a Google self-driving car crashed into a bus that "should" have yielded, and a Tesla car in autopilot did not recognize a trailer as an obstacle due to its "white color over bright sky" and "high ride height".
Proposed solution (I): Adversarial testing. Start with an existing image, then add minor changes that would fool the DL models but not the human eye.
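The idea of a small, targeted change can be sketched on a toy model. The example below assumes a hypothetical three-pixel linear classifier (the weights, bias, and image are made up for illustration) and nudges every pixel by at most `epsilon` in the direction that pushes the score across the decision boundary, in the style of the fast gradient sign method. It is a minimal illustration, not the procedure used by DeepXplore.

```python
# Toy adversarial perturbation on a hypothetical linear classifier.
# All numbers below are illustrative assumptions, not values from the slides.

def score(weights, bias, pixels):
    """Linear score: positive means class 1, otherwise class 0."""
    return sum(w * p for w, p in zip(weights, pixels)) + bias

def predict(weights, bias, pixels):
    return 1 if score(weights, bias, pixels) > 0 else 0

def sign(value):
    return (value > 0) - (value < 0)

def perturb(weights, pixels, epsilon, current_label):
    """Move each pixel by at most epsilon against the current label.

    For a linear model the gradient of the score with respect to a pixel
    is simply that pixel's weight, so sign(weight) gives the direction.
    """
    direction = -1 if current_label == 1 else 1
    return [
        min(1.0, max(0.0, p + direction * epsilon * sign(w)))  # keep pixels in [0, 1]
        for w, p in zip(weights, pixels)
    ]

WEIGHTS = [0.5, -0.3, 0.8]
BIAS = -0.35
image = [0.4, 0.6, 0.5]

label = predict(WEIGHTS, BIAS, image)
adversarial = perturb(WEIGHTS, image, epsilon=0.05, current_label=label)
print("original label:", label)
print("label after perturbation:", predict(WEIGHTS, BIAS, adversarial))
```

Each pixel moves by only 0.05 on a 0 to 1 scale, yet the predicted class changes, which is the behavior adversarial testing looks for.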
Proposed solution (II): Use neuron coverage to measure the parts of a DL system exercised by test inputs, since code coverage does not work for DNNs. Run multiple DL systems over the same images to detect odd behaviors, which are the most likely to be incorrect (Condorcet's jury theorem).
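One plausible reading of neuron coverage is the fraction of neurons whose activation exceeds a threshold for at least one test input. The sketch below computes that ratio for a tiny hand-written network; the network layout, the ReLU activation, the threshold of 0, and all function names are assumptions made for illustration, and the sketch skips any per-layer scaling of activations that a real tool might apply.

```python
# Minimal neuron-coverage sketch on a hypothetical two-layer network.
# A network is a list of (weights, biases) pairs, one pair per layer.

def relu(z):
    return max(0.0, z)

def layer_forward(inputs, weights, biases):
    """Compute the ReLU activations of one fully connected layer."""
    return [
        relu(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def activated_neurons(network, x, threshold=0.0):
    """Return the set of (layer, neuron) pairs activated by input x."""
    active = set()
    for layer_index, (weights, biases) in enumerate(network):
        x = layer_forward(x, weights, biases)
        active.update(
            (layer_index, i) for i, a in enumerate(x) if a > threshold
        )
    return active

def neuron_coverage(network, test_inputs, threshold=0.0):
    """Fraction of all neurons activated by at least one test input."""
    total = sum(len(biases) for _, biases in network)
    covered = set()
    for x in test_inputs:
        covered |= activated_neurons(network, x, threshold)
    return len(covered) / total

NETWORK = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),  # hidden layer, 2 neurons
    ([[1.0, 1.0]], [0.0]),                     # output layer, 1 neuron
]

print(neuron_coverage(NETWORK, [[1.0, 0.0]]))              # one input
print(neuron_coverage(NETWORK, [[1.0, 0.0], [0.0, 1.0]]))  # two inputs
```

With only the first input, one hidden neuron is never activated; adding the second input exercises it, so coverage rises. This mirrors the role statement or branch coverage plays for conventional code.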
Proposed solution (III): The best test inputs for a DL system trigger many differential behaviors and achieve high neuron coverage. Selecting them can be represented as a joint optimization problem, which can be solved with gradient-based search techniques.
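A joint objective of this kind can be sketched as a weighted sum of a "disagreement" term and a "coverage" term, maximized by gradient ascent on the input. Everything in the example below is an assumption for illustration: the two toy models, the hidden neuron, the weights `lam1` and `lam2`, and the use of finite-difference gradients in place of backpropagation. It shows the shape of the search, not the actual DeepXplore algorithm.

```python
import math

# Hypothetical smooth toy models; each returns a score in (0, 1).
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def model_a(x):
    return sigmoid(2.0 * x[0] - 1.0 * x[1])

def model_b(x):
    return sigmoid(1.0 * x[0] - 2.0 * x[1])

def hidden_neuron(x):
    """A neuron that is inactive (output < 0.5) on the seed input."""
    return sigmoid(3.0 * (x[0] + x[1] - 1.5))

def objective(x, lam1=1.0, lam2=0.5):
    """Reward disagreement between the models plus neuron activation."""
    return model_a(x) - lam1 * model_b(x) + lam2 * hidden_neuron(x)

def numeric_gradient(f, x, h=1e-5):
    """Central finite-difference estimate of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

def search(seed_input, steps=100, step_size=0.1):
    """Gradient ascent on the objective, keeping inputs in [0, 1]."""
    x = list(seed_input)
    for _ in range(steps):
        grad = numeric_gradient(objective, x)
        x = [min(1.0, max(0.0, xi + step_size * gi)) for xi, gi in zip(x, grad)]
    return x

seed_input = [0.5, 0.5]
found = search(seed_input)
print("objective before:", objective(seed_input))
print("objective after: ", objective(found))
print("hidden neuron active:", hidden_neuron(found) > 0.5)
```

Clamping to the range 0 to 1 stands in for a domain constraint such as valid pixel values; a real system would compute gradients through the networks directly rather than by finite differences.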
DL systems: They include at least one Deep Neural Network (DNN) component. DNN components learn their rules directly from data. A DNN's rules are mostly unknown to its developers.
DNN architecture: Multiple layers of neurons, consisting of an input layer, one or more hidden layers, and an output layer.
The neuron: An individual computing unit (a mathematical function). It takes multiple inputs $I_1, I_2, \dots$ with distinct weights $w_1, w_2, \dots$. The output is a function of the weighted sum of the inputs, $O = f\left(\sum_i w_i I_i\right)$, where $f$ is often a step function.
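The neuron described above can be written in a few lines. The sketch below assumes a step activation with a threshold of 0 and uses made-up inputs and weights; the function names are illustrative.

```python
def step(z, threshold=0.0):
    """Step activation: 1.0 when the weighted sum reaches the threshold."""
    return 1.0 if z >= threshold else 0.0

def neuron_output(inputs, weights, activation=step):
    """Apply the activation to the weighted sum of the inputs."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return activation(weighted_sum)

# Weighted sum is 0.4 * 1.0 + (-0.2) * 0.5 = 0.3, so the neuron fires.
print(neuron_output([1.0, 0.5], [0.4, -0.2]))
# Weighted sum is -1.0 * 1.0 + 0.5 * 1.0 = -0.5, so it does not.
print(neuron_output([1.0, 1.0], [-1.0, 0.5]))
```

Passing a different `activation` (for example a sigmoid) changes the neuron's response without changing the weighted-sum logic.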
How the layers work together: Each layer transforms the information contained in its input into a higher-level representation of the data.
Limitations of existing DNN testing (I): Low test coverage. There is no attempt to cover the rules. The standard procedure is to divide the whole data set randomly into a training part and a testing part. Tests sometimes include adversarial inputs, but that is not enough.
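The standard procedure mentioned above can be sketched as a seeded shuffle followed by a cut. The function name, the 20% test fraction, and the seed are assumptions for illustration. Because the test part is drawn from the same data as the training part, inputs that are rare in the data set are equally rare in the test set, which is one way to see why this procedure alone gives low coverage.

```python
import random

def split_dataset(samples, test_fraction=0.2, seed=0):
    """Randomly split samples into (training part, testing part)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

train_part, test_part = split_dataset(range(10), test_fraction=0.2)
print(len(train_part), "training samples,", len(test_part), "testing samples")
```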
Limitations of existing DNN testing (II): The problems with low-coverage DNN tests are the same as those of low-coverage tests of conventional software. The software is not tested for rare conditions, so some behaviors of the DNN are left unexplored. For example, what if a nose is detected and its dominant color is red?