Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
6-2020
Abstract
Temporal repetition counting aims to estimate the number of cycles of a given repetitive action. Existing deep learning methods assume that repetitive actions are performed at a fixed time scale, an assumption that does not hold for the complex repetitive actions found in real life. In this paper, we tailor a context-aware and scale-insensitive framework to tackle the challenges in repetition counting caused by unknown and diverse cycle lengths. Our approach combines two key insights: (1) Cycle lengths of different actions are unpredictable and therefore require large-scale searching, but once a coarse cycle length is determined, the variation between repetitions can be handled by regression. (2) Determining the cycle length cannot rely on a short fragment of video alone; it requires contextual understanding. The first insight is implemented by a coarse-to-fine cycle refinement method. It avoids the heavy computation of exhaustively searching all cycle lengths in the video and instead propagates the coarse prediction for further refinement in a hierarchical manner. For the second insight, we propose a bidirectional cycle length estimation method for context-aware prediction. It is a regression network that takes two consecutive coarse cycles as input and predicts the locations of the previous and next repetitive cycles. To benefit training and evaluation in temporal repetition counting, we construct a new benchmark, the largest to date, which contains 526 videos with diverse repetitive actions. Extensive experiments show that the proposed network, trained on a single dataset, outperforms state-of-the-art methods on several benchmarks, indicating that the proposed framework is general enough to capture repetition patterns across domains.
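The coarse-to-fine idea described in the abstract can be illustrated with a toy sketch. This is not the authors' method, which uses a regression network on video; it swaps in a simple self-similarity score on a 1D signal and assumes the true cycle length lies within a known search range. All function names and parameters here (`mismatch`, `coarse_to_fine_cycle_length`, `coarse_step`, and so on) are hypothetical and chosen for illustration only.

```python
import math


def mismatch(signal, lag):
    """Mean squared difference between the signal and itself shifted by `lag`.

    A stand-in score (an assumption, not from the paper): it is small when
    `lag` is close to the true cycle length.
    """
    pairs = zip(signal[:-lag], signal[lag:])
    return sum((a - b) ** 2 for a, b in pairs) / (len(signal) - lag)


def coarse_to_fine_cycle_length(signal, min_len, max_len, coarse_step=8):
    """Search cycle lengths on a coarse grid, then refine around the best one.

    Only a handful of candidates are scored at each level, instead of
    scoring every length between `min_len` and `max_len`.
    """
    step = coarse_step
    # Coarse pass: score candidates spaced `step` apart.
    best = min(range(min_len, max_len + 1, step),
               key=lambda lag: mismatch(signal, lag))
    # Refinement passes: halve the step and look only near the current best.
    while step > 1:
        step //= 2
        nearby = [c for c in (best - step, best, best + step)
                  if min_len <= c <= max_len]
        best = min(nearby, key=lambda lag: mismatch(signal, lag))
    return best


def count_repetitions(signal, cycle_length):
    """Estimate the number of cycles as total length over cycle length."""
    return round(len(signal) / cycle_length)


# Synthetic example: a sine wave with a period of 25 samples, 500 samples long.
signal = [math.sin(2 * math.pi * t / 25) for t in range(500)]
cycle = coarse_to_fine_cycle_length(signal, min_len=10, max_len=40)
print(cycle, count_repetitions(signal, cycle))
```

The search range is kept narrow on purpose: a plain self-similarity score is equally low at integer multiples of the true cycle length, so a wider range would need an extra rule to prefer the shortest matching length.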
Keywords
Coarse to fine, Context-Aware, Contextual understanding, Cycle length, Learning methods, Number of cycles, Refinement methods, State-of-the-art methods
Discipline
Databases and Information Systems
Research Areas
Information Systems and Management
Publication
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, Online, June 14-19
First Page
667
Last Page
675
Identifier
10.1109/CVPR42600.2020.00075
Publisher
IEEE
City or Country
New Jersey
Citation
ZHANG, Huaidong; XU, Xuemiao; HAN, Guoqiang; and HE, Shengfeng.
Context-aware and scale-insensitive temporal repetition counting. (2020). Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, Online, June 14-19. 667-675.
Available at: https://ink.library.smu.edu.sg/sis_research/8523
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1109/CVPR42600.2020.00075