Publication Type
Journal Article
Version
acceptedVersion
Publication Date
1-2022
Abstract
In this article, we investigate the task of normalizing transcribed texts in Vietnamese Automatic Speech Recognition (ASR) systems in order to improve user readability and the performance of downstream tasks. This task usually consists of two main sub-tasks: predicting and inserting punctuation (i.e., period, comma); and detecting and standardizing named entities (i.e., numbers, person names) from spoken forms to their appropriate written forms. To achieve these goals, we introduce a complete corpus of 87,700 sentences and investigate conditional joint learning approaches that globally optimize the two sub-tasks simultaneously. The experimental results are promising. Overall, the proposed architecture outperformed the conventional architecture, which trains individual models on the two sub-tasks separately. The joint models are further improved when integrated with the surrounding contexts (SCs). Specifically, using the best model, we obtained F1 scores of 81.13% on the first sub-task and 94.41% on the second sub-task.
Keywords
ASR, named entity recognition, post-processing, punctuator, text normalization, transformer-based joint learning models
Discipline
Numerical Analysis and Scientific Computing | South and Southeast Asian Languages and Societies | Theory and Algorithms
Publication
Cybernetics and Systems
First Page
1
Last Page
18
ISSN
0196-9722
Identifier
10.1080/01969722.2022.2145654
Publisher
Taylor and Francis Group
Citation
BUI, The Viet; LUONG, Tho Chi; and TRAN, Oanh Thi.
Transformer-based joint learning approach for text normalization in Vietnamese Automatic Speech Recognition Systems. (2022). Cybernetics and Systems. 1-18.
Available at: https://ink.library.smu.edu.sg/sis_research/7591
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1080/01969722.2022.2145654
Included in
Numerical Analysis and Scientific Computing Commons, South and Southeast Asian Languages and Societies Commons, Theory and Algorithms Commons