Publication Type

Journal Article

Version

acceptedVersion

Publication Date

7-2023

Abstract

Deep learning (DL) has been applied in many applications. Meanwhile, the quality of DL systems is becoming a major concern. To evaluate the quality of DL systems, a number of DL testing techniques have been proposed. To generate test cases, a set of initial seed inputs is required. Existing testing techniques usually construct the seed corpus by randomly selecting inputs from the training or test dataset. To date, there has been no study on how initial seed inputs affect the performance of DL testing or on how to construct an optimal seed corpus. To fill this gap, we conduct the first systematic study to evaluate the impact of seed selection strategies on DL testing. Specifically, considering three popular goals of DL testing (i.e., coverage, failure detection, and robustness), we develop five seed selection strategies, including three based on single-objective optimization (SOO) and two based on multi-objective optimization (MOO). We evaluate these strategies on seven testing tools. Our results demonstrate that the selection of initial seed inputs greatly affects testing performance. SOO-based selection can construct the seed corpus that best boosts DL testing with respect to the specific testing goal. MOO-based selection strategies construct seed corpora that achieve balanced improvement on multiple objectives.

Keywords

Deep learning testing, Seed selection, Coverage, Robustness

Discipline

OS and Networks | Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

ACM Transactions on Software Engineering and Methodology

First Page

1

Last Page

33

ISSN

1049-331X

Identifier

10.1145/3607190

Publisher

Association for Computing Machinery (ACM)

Copyright Owner and License

Authors

Additional URL

https://doi.org/10.1145/3607190
