Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

5-2014

Abstract

Spatio-temporal interest point (STIP) based methods have shown promising results for human action classification. However, state-of-the-art works typically use bag-of-visual-words (BoVW), which captures the statistical distribution of features but ignores their inherent structural relationships. To address this problem, a descriptor, the directional pair-wise feature (DPF), is proposed to encode the mutual direction information between pairs of words, with the aim of adding spatial discriminative power to BoVW. First, STIP features are extracted and classified into a set of labeled words. Then, in each frame, a DPF is constructed for every pair of words with different labels, according to their assigned directional vector. Finally, the DPFs are quantized into a probability histogram that represents the human action. The proposed method is evaluated on two challenging datasets, Rochester and UT-Interaction, and results with chi-squared kernel SVM classifiers confirm that it classifies human actions with high accuracy.
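
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the directional vector is assigned or quantized, so this sketch assumes the vector points from the lower-labeled word to the higher-labeled word and that its angle is binned uniformly. The names `dpf_histogram`, `chi2_kernel`, `num_bins`, and `gamma` are hypothetical, and the exponential chi-squared kernel is one common form, not necessarily the one used in the paper.

```python
import math
from collections import defaultdict
from itertools import combinations


def dpf_histogram(points, num_words, num_bins=8):
    """Build a normalized pairwise-direction histogram.

    points: iterable of (frame, x, y, label) tuples, where label is the
    visual-word index (0 <= label < num_words) already assigned to a STIP.
    """
    by_frame = defaultdict(list)
    for frame, x, y, label in points:
        by_frame[frame].append((x, y, label))

    hist = [0.0] * (num_words * num_words * num_bins)
    bin_width = 2 * math.pi / num_bins

    for pts in by_frame.values():
        for p, q in combinations(pts, 2):
            if p[2] == q[2]:
                continue  # only pairs of words with different labels
            # Assumption: orient the vector from the lower to the higher label.
            src, dst = (p, q) if p[2] < q[2] else (q, p)
            angle = math.atan2(dst[1] - src[1], dst[0] - src[0]) % (2 * math.pi)
            # Clamp guards against a rounding result equal to 2*pi.
            b = min(int(angle / bin_width), num_bins - 1)
            hist[(src[2] * num_words + dst[2]) * num_bins + b] += 1.0

    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


def chi2_kernel(h, g, gamma=1.0):
    """Exponential chi-squared kernel between two histograms (assumed form)."""
    d = sum((a - b) ** 2 / (a + b) for a, b in zip(h, g) if a + b > 0)
    return math.exp(-gamma * d)
```

A kernel matrix built with `chi2_kernel` over per-video histograms could then be passed to any SVM implementation that accepts precomputed kernels.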

Keywords

Human action recognition, bag-of-words, co-occurrence

Discipline

Computer Engineering | Software Engineering

Research Areas

Data Science and Engineering

Publication

Proceedings of the 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014), Florence, May 5-9

First Page

1235

Last Page

1239

Identifier

10.1109/ICASSP.2014.6853794

City or Country

Florence

Additional URL

https://doi.org/10.1109/ICASSP.2014.6853794
