Publication Type
Conference Paper
Version
publishedVersion
Publication Date
12-2024
Abstract
We propose C-MELT, a novel framework for multimodal self-supervised learning of electrocardiogram (ECG) and text encoders. C-MELT pre-trains a contrastive-enhanced masked auto-encoder architecture on paired ECG-text data. It combines the generative strengths of masked auto-encoding with improved discriminative capabilities to enable robust cross-modal alignment. This is accomplished through a carefully designed model, loss functions, and a novel negative sampling strategy. Our preliminary experiments demonstrate significant performance improvements of up to 12% in downstream cardiac arrhythmia classification and patient identification tasks. These findings demonstrate C-MELT's capacity to extract rich, clinically relevant features from ECG-text pairs, paving the way for more accurate and efficient cardiac diagnoses in real-world healthcare settings.
Discipline
Programming Languages and Compilers
Research Areas
Intelligent Systems and Optimization
Publication
The first NeurIPS workshop on Time Series in the Age of Large Models, Vancouver, 2024 December 15
Publisher
Emerald
City or Country
Vancouver, Canada
Citation
PHAM, Hung Manh; SAEED, Aaqib; and MA, Dong.
Revisiting masked auto-encoders for ECG-language representation learning. (2024). The first NeurIPS workshop on Time Series in the Age of Large Models, Vancouver, 2024 December 15.
Available at: https://ink.library.smu.edu.sg/sis_research/9938
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://openreview.net/forum?id=RWarJNYh1D&referrer=%5Bthe%20profile%20of%20Dong%20Ma%5D(%2Fprofile%3Fid%3D~Dong_Ma5)