Egocentric temporal action proposals
Publication Type
Journal Article
Publication Date
11-2017
Abstract
We present an approach to localize generic actions in egocentric videos, called temporal action proposals (TAPs), for accelerating the action recognition step. An egocentric TAP refers to a sequence of frames that may contain a generic action performed by the wearer of a head-mounted camera, e.g., taking a knife, spreading jam, pouring milk, or cutting carrots. Inspired by object proposals, this paper aims at generating a small number of TAPs, thereby replacing the popular sliding window strategy, for localizing all action events in the input video. To this end, we first propose to temporally segment the input video into action atoms, which are the smallest units that may contain an action. We then apply a hierarchical clustering algorithm with several egocentric cues to generate TAPs. Finally, we propose two actionness networks to score the likelihood of each TAP containing an action. The top ranked candidates are returned as output TAPs. Experimental results show that the proposed TAP detection framework performs significantly better than relevant approaches for egocentric action detection.
Keywords
Atom optics, Optical imaging, Proposals, Temporal action proposals, Video processing, Videos
Discipline
Databases and Information Systems
Research Areas
Information Systems and Management
Publication
IEEE Transactions on Image Processing
Volume
27
Issue
2
First Page
764
Last Page
777
ISSN
1057-7149
Identifier
10.1109/TIP.2017.2772904
Publisher
Institute of Electrical and Electronics Engineers
Citation
HUANG, Shao; WANG, Weiqiang; HE, Shengfeng; and LAU, Rynson W. H.
Egocentric temporal action proposals. (2017). IEEE Transactions on Image Processing. 27, (2), 764-777.
Available at: https://ink.library.smu.edu.sg/sis_research/8378
Additional URL
https://doi.org/10.1109/TIP.2017.2772904