Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
6-2025
Abstract
Algorithmic detection of facial palsy offers the potential to improve current clinical practice, which typically relies on labor-intensive and subjective assessments by clinicians. In this paper, we present a multimodal fusion-based deep learning model that uses an MLP mixer-based model to process unstructured data (i.e., RGB images or images with facial line segments) and a feed-forward neural network to process structured data (i.e., facial landmark coordinates, features of facial expressions, or handcrafted features) for detecting facial palsy. We then contribute a study analyzing the effect of different data modalities and the benefits of a multimodal fusion-based approach, using videos of 20 facial palsy patients and 20 healthy subjects. Our multimodal fusion model achieved an F1 score of 96.00, significantly higher than that of a feed-forward neural network trained on handcrafted features alone (82.80 F1) or an MLP mixer-based model trained on raw RGB images (89.00 F1).
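Note: this record does not include implementation details beyond the abstract. The following is a minimal, hypothetical PyTorch sketch of the kind of architecture the abstract describes: an MLP mixer-style branch for RGB images, a feed-forward branch for structured feature vectors (e.g., 68 facial landmarks as 136 coordinates), and a head that fuses the two branches. All layer sizes, the mixer depth, and the concatenation-based fusion are illustrative assumptions, not the authors' implementation.

# Hypothetical sketch of a multimodal fusion model of the kind described
# in the abstract: an MLP-Mixer branch for images plus a feed-forward
# branch for structured features (e.g., facial landmark coordinates).
# All sizes, names, and the concatenation-based fusion are assumptions.
import torch
import torch.nn as nn


class MixerBlock(nn.Module):
    """One MLP-Mixer block: token-mixing MLP, then channel-mixing MLP."""
    def __init__(self, num_patches, dim, token_hidden=256, channel_hidden=512):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.token_mlp = nn.Sequential(
            nn.Linear(num_patches, token_hidden), nn.GELU(),
            nn.Linear(token_hidden, num_patches))
        self.norm2 = nn.LayerNorm(dim)
        self.channel_mlp = nn.Sequential(
            nn.Linear(dim, channel_hidden), nn.GELU(),
            nn.Linear(channel_hidden, dim))

    def forward(self, x):                          # x: (B, num_patches, dim)
        y = self.norm1(x).transpose(1, 2)          # mix across patches
        x = x + self.token_mlp(y).transpose(1, 2)
        x = x + self.channel_mlp(self.norm2(x))    # mix across channels
        return x


class FusionModel(nn.Module):
    def __init__(self, image_size=224, patch_size=16, dim=128, depth=4,
                 structured_dim=136, num_classes=2):
        super().__init__()
        num_patches = (image_size // patch_size) ** 2
        # Image branch: patch embedding followed by MLP-Mixer blocks.
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch_size,
                                     stride=patch_size)
        self.mixer = nn.Sequential(
            *[MixerBlock(num_patches, dim) for _ in range(depth)])
        # Structured branch: plain feed-forward network over feature vectors.
        self.ffn = nn.Sequential(nn.Linear(structured_dim, 64), nn.ReLU(),
                                 nn.Linear(64, 64), nn.ReLU())
        # Fusion head: concatenate pooled image features with structured ones.
        self.head = nn.Linear(dim + 64, num_classes)

    def forward(self, image, structured):
        x = self.patch_embed(image)                # (B, dim, H/p, W/p)
        x = x.flatten(2).transpose(1, 2)           # (B, num_patches, dim)
        x = self.mixer(x).mean(dim=1)              # average pool over patches
        z = self.ffn(structured)
        return self.head(torch.cat([x, z], dim=-1))


# Example: a batch of 4 RGB frames plus 68 (x, y) landmarks per frame.
model = FusionModel()
logits = model(torch.randn(4, 3, 224, 224), torch.randn(4, 136))
print(logits.shape)  # torch.Size([4, 2])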
Keywords
Machine Learning, Computer Vision, Multimodal Fusion, Facial Analysis
Discipline
OS and Networks | Theory and Algorithms
Research Areas
Intelligent Systems and Optimization
Areas of Excellence
Sustainability
Publication
Proceedings of the 29th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2025, Sydney, Australia, June 10-13
First Page
197
Last Page
208
Identifier
10.1007/978-981-96-8298-0_16
Publisher
Springer
City or Country
Cham
Citation
OO, Heng Yim Nicole; LEE, Min Hun; and LIM, Jeong Hoon. A multimodal fusion model leveraging MLP Mixer and handcrafted features-based deep learning networks for facial palsy detection. (2025). Proceedings of the 29th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2025, Sydney, Australia, June 10-13. 197-208.
Available at: https://ink.library.smu.edu.sg/sis_research/10713
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1007/978-981-96-8298-0_16