Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

8-2021

Abstract

Developers need to perform adequate testing to ensure the quality of Automatic Speech Recognition (ASR) systems. However, manually collecting required test cases is tedious and time-consuming. Our recent work proposes CrossASR, a differential testing method for ASR systems. This method first utilizes Text-to-Speech (TTS) to generate audios from texts automatically and then feed these audios into different ASR systems for cross-referencing to uncover failed test cases. It also leverages a failure estimator to find failing test cases more efficiently. Such a method is inherently self-improvable: the performance can increase by leveraging more advanced TTS and ASR systems. So, in this accompanying tool demo paper, we further engineer CrossASR and propose CrossASR++, an easy-to-use ASR testing tool that can be conveniently extended to incorporate different TTS and ASR systems, and failure estimators. We also make CrossASR++ chunk texts from a given corpus dynamically and enable the estimator to work in a more effective and flexible way. We demonstrate that the new features can help CrossASR++ discover more failed test cases. Using the same TTS and ASR systems, CrossASR++ can uncover 26.2% more failed test cases for 4 ASRs than the original tool. Moreover, by simply adding one more ASR for cross-referencing, we can increase the number of failed test cases uncovered for each of the 4 ASR systems by 25.07%, 39.63%, 20.95% and 8.17% respectively. We also extend CrossASR++ with 5 additional failure estimators. Compared to worst estimator, the best one can discover 10.41% more failed test cases within the same amount of time. The demo video for CrossASR++ can be viewed at https://youtu.be/ddRk-f0QV-g and the source code can be found at https://github.com/soarsmu/CrossASRplus.

Keywords

automatic speech recognition, cross-referencing, test case generation, text-to-speech

Discipline

Artificial Intelligence and Robotics | Databases and Information Systems

Research Areas

Data Science and Engineering; Intelligent Systems and Optimization

Publication

29th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE '21)

First Page

1575

Last Page

1579

ISBN

9781450385626

Identifier

10.1145/3468264.3473124

Publisher

Association for Computing Machinery

City or Country

Athens, Greece

Share

COinS