Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

12-2023

Abstract

Deep learning (DL) models are trained on sampled data, where the distribution of training data differs from that of real-world data (i.e., the distribution shift), which reduces the model's robustness. Various testing techniques have been proposed, including distribution-unaware and distribution-aware methods. However, distribution-unaware testing lacks effectiveness by not explicitly considering the distribution of test cases and may generate redundant errors (within same distribution). Distribution-aware testing techniques primarily focus on generating test cases that follow the training distribution, missing out-of-distribution data that may also be valid and should be considered in the testing process. In this paper, we propose a novel distribution-guided approach for generating valid test cases with diverse distributions, which can better evaluate the model's robustness (i.e., generating hard-to-detect errors) and enhance the model's robustness (i.e., enriching training data). Unlike existing testing techniques that optimize individual test cases, DistXplore optimizes test suites that represent specific distributions. To evaluate and enhance the model's robustness, we design two metrics: distribution difference, which maximizes the similarity in distribution between two different classes of data to generate hard-to-detect errors, and distribution diversity, which increase the distribution diversity of generated test cases for enhancing the model's robustness. To evaluate the effectiveness of DistXplore in model evaluation and enhancement, we compare DistXplore with 14 state-of-the-art baselines on 10 models across 4 datasets. The evaluation results show that DisXplore not only detects a larger number of errors (e.g., 2×+ on average). Furthermore, DistXplore achieves a higher improvement in empirical robustness (e.g., 5.2% more accuracy improvement than the baselines on average).

Keywords

Deep learning, Neural networks, Software defect analysis, Software testing and debugging

Discipline

Artificial Intelligence and Robotics

Research Areas

Cybersecurity; Intelligent Systems and Optimization

Publication

Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, San Francisco, CA, United States of America, December 3-9, 2023

First Page

Last Page

ISBN

9798400703270

Identifier

10.1145/3611643.3616266

Publisher

Association for Computing Machinery

City or Country

New York, NY, United States of America

Citation

WANG, Longtian; XIE, Xiaofei; DU, Xiaoning; TIAN, Meng; GUO, Qing; YANG, Zheng; and SHEN, Chao. DistXplore: Distribution-guided testing for evaluating and enhancing deep learning systems. (2023). Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, San Francisco, CA, United States of America, December 3-9, 2023. 68-80.
Available at: https://ink.library.smu.edu.sg/sis_research/8516

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

https://doi.org/10.1145/3611643.3616266

Download

Included in

Artificial Intelligence and Robotics Commons

COinS

Research Collection School Of Computing and Information Systems

DistXplore: Distribution-guided testing for evaluating and enhancing deep learning systems

Publication Type

Version

Publication Date

Abstract

Keywords

Discipline

Research Areas

Publication

First Page

Last Page

ISBN

Identifier

Publisher

City or Country

Citation

Creative Commons License

Additional URL

Included in

Search

Links

Browse

Links

Research Collection School Of Computing and Information Systems

DistXplore: Distribution-guided testing for evaluating and enhancing deep learning systems

Author

Publication Type

Version

Publication Date

Abstract

Keywords

Discipline

Research Areas

Publication

First Page

Last Page

ISBN

Identifier

Publisher

City or Country

Citation

Creative Commons License

Additional URL

Included in

Share

Search

Links

Browse

Links