Publication Type

Conference Paper

Version

acceptedVersion

Publication Date

12-2005

Abstract

This paper describes the process of creating a grapheme-to-phoneme (G2P) converter for Standard Malay (SM). A fundamental step in building text-to-speech (TTS) and automatic speech recognition (ASR) engines is to build a good G2P system that can automatically generate accurate phonemic representations for words. Our goal is to generate phonemes that reflect real speech, thereby facilitating more accurate phoneme alignment with actual waveforms (obtained from voice-data collection) while keeping human intervention to a minimum. Here we discuss the key areas in SM that require considerable phonemic alterations, including letter elisions, consonant insertions, and multiple ways of uttering a letter or digraph: areas that any good G2P system for SM should address. The application of these rules to two corpora will also be discussed, and the generated phonemes examined both to measure accuracy and to guide further rule refinements.
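To make the three kinds of alteration named in the abstract concrete, the following is a minimal Python sketch of a rule-based G2P pass. It is not the paper's rule set: the rule tables (`DIGRAPHS`, `SINGLE`, `GLIDES`), the function name `g2p`, and the choice of rules (digraph mapping, glide insertion between certain vowel pairs, and elision of "h" between unlike vowels) are illustrative assumptions based on commonly described Malay pronunciation patterns.

```python
# Illustrative rule tables only; these are assumptions, not the paper's rules.
DIGRAPHS = {"ng": "ŋ", "ny": "ɲ", "sy": "ʃ", "kh": "x"}   # two letters, one phoneme
SINGLE = {"c": "tʃ", "j": "dʒ", "y": "j"}                  # letters whose phoneme differs
GLIDES = {("i", "a"): "j", ("u", "a"): "w"}                # consonant insertion between vowels
VOWELS = set("aeiou")


def g2p(word):
    """Return a list of phoneme symbols for one word (hypothetical rules)."""
    word = word.lower()
    phonemes = []
    i = 0
    while i < len(word):
        # Rule 1: a digraph maps to a single phoneme.
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            phonemes.append(DIGRAPHS[pair])
            i += 2
            continue

        ch = word[i]
        prev = word[i - 1] if i > 0 else ""
        nxt = word[i + 1] if i + 1 < len(word) else ""

        # Rule 2 (assumed elision): drop "h" between two different vowels.
        if ch == "h" and prev in VOWELS and nxt in VOWELS and prev != nxt:
            i += 1
            continue

        phonemes.append(SINGLE.get(ch, ch))

        # Rule 3 (assumed insertion): add a glide between certain vowel pairs.
        if (ch, nxt) in GLIDES:
            phonemes.append(GLIDES[(ch, nxt)])
        i += 1
    return phonemes


if __name__ == "__main__":
    for w in ["nyanyi", "dia", "tahu", "cuci"]:
        print(w, g2p(w))
```

A sketch like this does not handle letters with several possible pronunciations (for example, choosing between two readings of "e"); that would typically need a lexicon or context-dependent rules, which is the kind of refinement the abstract says is driven by examining generated phonemes against corpora.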

Keywords

Speech recognition, Speech synthesis, Grapheme-to-Phoneme, Malay language

Discipline

Physical Sciences and Mathematics

Research Areas

Operations Management

Publication

COCOSDA Jakarta Conference, December 2005

City or Country

Jakarta, Indonesia
