Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
12-2024
Abstract
Advanced natural language processing (NLP) models are increasingly applied in music composition and performance, particularly in generating vocal melodies and simulating singing voices. While NLP techniques have been successfully used to analyze vocal performance data, providing insights into performance quality and style, the automatic transcription of vocal performances into sheet music remains a complex challenge. Existing tools for manual transcription are often insufficient due to the intricate dynamics of vocal expression. This study addresses the challenge of automating the conversion of vocal performances into sheet music using a combination of novel techniques, including large language models (LLMs). We present a method that successfully translates vocal audio input into display-ready sheet music. Our findings highlight the strengths and limitations of various approaches, particularly in the transcription of a cappella performances into notes and lyrics. This research contributes to the expanding field of NLP-driven music analysis and showcases the potential of these models to revolutionize vocal transcription in the future.
Keywords
Natural Language Processing, Vocal Performance, Automatic Music Transcription (AMT), Large Language Models, Machine Learning, A Cappella, Lyric Transcription, Sheet Music
Discipline
Artificial Intelligence and Robotics
Research Areas
Intelligent Systems and Optimization
Publication
Proceedings of the 2024 IEEE International Conference on Data Mining
ISBN
979-8-3315-0668-1
Identifier
10.1109/ICDMW65004.2024.00063
Publisher
IEEE
City or Country
Piscataway, NJ, USA
Citation
JIANG, Jinjing; TEO, Nicole; PEN, Haibo; and WANG, Zhaoxia.
Converting vocal performances into sheet music leveraging large language models. (2024). Proceedings of the 2024 IEEE International Conference on Data Mining.
Available at: https://ink.library.smu.edu.sg/sis_research/10664
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1109/ICDMW65004.2024.00063