Publication Type
Conference Paper
Version
acceptedVersion
Publication Date
8-2024
Abstract
Algorithmic detection of facial palsy offers the potential to improve current practice, which usually involves labor-intensive and subjective assessment by clinicians. In this paper, we present a multimodal fusion-based deep learning model that uses unstructured data (i.e., an image frame with facial line segments) and structured data (i.e., features of facial expressions) to detect facial palsy. We then contribute a study, using videos of 21 facial palsy patients, that analyzes the effect of different data modalities and the benefits of a multimodal fusion-based approach. Our experimental results show that, among the data modalities considered (unstructured data: RGB images and images of facial line segments; structured data: coordinates of facial landmarks and features of facial expressions), the feed-forward neural network using features of facial expressions achieved the highest precision of 76.22, while the ResNet-based model using images of facial line segments achieved the highest recall of 83.47. When we leveraged both images of facial line segments and features of facial expressions, our multimodal fusion-based deep learning model slightly improved the precision to 77.05 at the expense of a decrease in recall.
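The fusion idea described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not state how the two branches are combined, so concatenation-based fusion is assumed here, and every name, layer size, and weight (`fuse_and_classify`, `IMG_DIM`, `EXPR_DIM`, `HID`, the random values) is hypothetical. The image embedding is a random stand-in for the output of a ResNet-style branch applied to a line-segment image.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (untrained, illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def fuse_and_classify(image_embedding, expression_features, params):
    # Structured branch: feed-forward projection of the expression features.
    hidden = relu(linear(expression_features, *params["expr"]))
    # Fusion step (assumed): concatenate the two modality representations.
    fused = list(image_embedding) + hidden
    # Classification head: one logit mapped to a probability of facial palsy.
    (logit,) = linear(fused, *params["head"])
    return sigmoid(logit)


rng = random.Random(0)
IMG_DIM, EXPR_DIM, HID = 8, 5, 4  # hypothetical sizes
params = {
    "expr": init_layer(HID, EXPR_DIM, rng),
    "head": init_layer(1, IMG_DIM + HID, rng),
}
# Stand-in for features from a ResNet-style image branch.
image_embedding = [rng.random() for _ in range(IMG_DIM)]
# Stand-in for structured facial-expression features.
expression_features = [rng.random() for _ in range(EXPR_DIM)]

prob = fuse_and_classify(image_embedding, expression_features, params)
print(f"predicted probability of facial palsy: {prob:.3f}")
```

With untrained weights the output carries no clinical meaning. The sketch only shows where the two modalities meet before a shared classification head.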
Keywords
Machine Learning, Computer Vision, Multimodal Fusion, Facial Analysis
Discipline
Artificial Intelligence and Robotics | Graphics and Human Computer Interfaces
Research Areas
Intelligent Systems and Optimization
Areas of Excellence
Digital transformation
Publication
IJCAI 2024 Explainable Artificial Intelligence (XAI) Workshop, Virtual Conference, August 15
Publisher
Emerald
City or Country
Virtual Conference
Citation
OO, Heng Yim Nicole; LEE, Min Hun; and LIM, J. H.
Exploring a multimodal fusion-based deep learning network for detecting facial palsy. (2024). IJCAI 2024 Explainable Artificial Intelligence (XAI) Workshop, Virtual Conference, August 15.
Available at: https://ink.library.smu.edu.sg/sis_research/9958
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Included in
Artificial Intelligence and Robotics Commons, Graphics and Human Computer Interfaces Commons