Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
12-2013
Abstract
In this paper, we propose a method for video-based human emotion recognition. For each video clip, all frames are represented as an image set, which is modeled as a linear subspace and embedded in a Grassmannian manifold. After feature extraction, a class-specific One-to-Rest Partial Least Squares (PLS) classifier is learned on the video and audio data separately to distinguish each class from the other, confusable ones. Finally, an optimal fusion of the classifiers learned from both modalities (video and audio) is conducted at the decision level. Our method is evaluated on the Emotion Recognition In The Wild Challenge (EmotiW 2013). Experimental results on both the validation set and the blind test set are presented for comparison. The final accuracy achieved on the test set outperforms the baseline by 26%.
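The image-set and fusion steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the SVD-based subspace construction, the projection kernel, and the weighted-sum fusion are common choices assumed here for illustration, and the names `subspace_basis`, `projection_kernel`, and `fuse_scores` are hypothetical. The PLS classifiers themselves are omitted.

```python
import numpy as np


def subspace_basis(frames, dim):
    """Return an orthonormal basis (d x dim) for the subspace of an image set.

    frames: array of shape (n_frames, d), one feature vector per frame.
    Assumption: the subspace is taken from the top left singular vectors.
    """
    # Frames become columns; the leading left singular vectors span the set.
    u, _, _ = np.linalg.svd(frames.T, full_matrices=False)
    return u[:, :dim]


def projection_kernel(y1, y2):
    """Projection kernel ||Y1^T Y2||_F^2 between two Grassmannian points.

    Assumption: one standard Grassmannian kernel, not confirmed by the abstract.
    """
    return np.linalg.norm(y1.T @ y2, "fro") ** 2


def fuse_scores(video_scores, audio_scores, weight):
    """Decision-level fusion as a convex combination of per-class scores.

    Assumption: the weight would be tuned on a validation set.
    """
    return weight * video_scores + (1.0 - weight) * audio_scores


rng = np.random.default_rng(0)
clip_a = rng.normal(size=(20, 50))  # 20 frames, 50-dim features (synthetic)
clip_b = rng.normal(size=(30, 50))
basis_a = subspace_basis(clip_a, dim=5)
basis_b = subspace_basis(clip_b, dim=5)
similarity = projection_kernel(basis_a, basis_b)
fused = fuse_scores(np.array([1.0, 0.0]), np.array([0.0, 1.0]), weight=0.75)
```

In this sketch each clip becomes a single point on the manifold regardless of its frame count, so clips of different lengths can be compared through the kernel, and the fused scores can be passed to an argmax to pick a class.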
Keywords
emotion recognition; EmotiW 2013 challenge; Grassmannian manifolds; partial least squares regression
Discipline
Databases and Information Systems | Graphics and Human Computer Interfaces
Research Areas
Data Science and Engineering
Publication
Proceedings of the 15th ACM International Conference on Multimodal Interaction, Sydney, Australia, December 9-13, 2013
First Page
525
Last Page
530
ISBN
9781450321297
Identifier
10.1145/2522848.2531738
Publisher
ACM
City or Country
New York
Citation
1
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.