Publication Type
Conference Paper
Version
acceptedVersion
Publication Date
12-2005
Abstract
This paper describes the process of creating a grapheme-to-phoneme (G2P) converter for Standard Malay (SM). A fundamental step in building text-to-speech (TTS) and automatic speech recognition (ASR) engines is to build a good G2P system that can automatically generate accurate phonemic representations for words. Our goal is to generate phonemes that reflect real speech, thereby facilitating more accurate phoneme alignment with actual waveforms (obtained from voice-data collection) while keeping human intervention to a minimum. Here we discuss the key areas in SM that require considerable phonemic alterations, including letter elisions, consonant insertions, and multiple ways of uttering a letter or digraph, all of which any good G2P system for SM should address. We also discuss the application of these rules to two corpora and examine the generated phonemes, both to measure accuracy and to refine the rules further.
Keywords
Speech recognition, Speech synthesis, Grapheme-to-Phoneme, Malay language
Discipline
Physical Sciences and Mathematics
Research Areas
Operations Management
Publication
COCOSDA Jakarta Conference, December 2005
City or Country
Jakarta, Indonesia
Citation
LI, Haizhou; Aljunied, Mahani; and Teoh, Boon Seong.
A Grapheme to Phoneme Converter for Standard Malay. (2005). COCOSDA Jakarta Conference, December 2005.
Available at: https://ink.library.smu.edu.sg/lkcsb_research/2781
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.