Research Collection Yong Pung How School Of Law

Learning unsupervised video object segmentation through visual attention

Wenguan WANG
Hongmei SONG
Shuyang ZHAO
Jianbing SHEN
Sanyuan ZHAO
Steven C. H. HOI, Singapore Management University
Haibin LING

Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

6-2019

Abstract

This paper conducts a systematic study of the role of visual attention in Unsupervised Video Object Segmentation (UVOS) tasks. By carefully annotating three popular video segmentation datasets (DAVIS, Youtube-Objects and SegTrack V2) with dynamic eye-tracking data in the UVOS setting, we quantitatively verify, for the first time, the high consistency of visual attention behavior among human observers, and find a strong correlation between human attention and explicit primary object judgements during dynamic, task-driven viewing. These novel observations provide in-depth insight into the underlying rationale behind UVOS. Inspired by these findings, we decouple UVOS into two sub-tasks: UVOS-driven Dynamic Visual Attention Prediction (DVAP) in the spatiotemporal domain, and Attention-Guided Object Segmentation (AGOS) in the spatial domain. Our UVOS solution has three major merits: 1) modular training that avoids expensive video segmentation annotations, instead using more affordable dynamic fixation data to train the initial video attention module and existing fixation-segmentation paired static image data to train the subsequent segmentation module; 2) comprehensive foreground understanding through multi-source learning; and 3) additional interpretability from the biologically-inspired and assessable attention. Experiments on popular benchmarks show that, even without using expensive video object mask annotations, our model achieves compelling performance in comparison with state-of-the-art methods.

Keywords

Segmentation, Grouping and Shape, Image and Video Synthesis

Discipline

Databases and Information Systems

Research Areas

Data Science and Engineering

Publication

2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR): June 15-20, Long Beach, CA: Proceedings

First Page

3059

Last Page

3069

ISBN

9781728132938

Identifier

10.1109/CVPR.2019.00318

Publisher

IEEE

City or Country

Piscataway, NJ

Citation

WANG, Wenguan; SONG, Hongmei; ZHAO, Shuyang; SHEN, Jianbing; ZHAO, Sanyuan; HOI, Steven C. H.; and LING, Haibin. Learning unsupervised video object segmentation through visual attention. (2019). 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR): June 15-20, Long Beach, CA: Proceedings. 3059-3069.
Available at: https://ink.library.smu.edu.sg/sol_research/3162

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

https://doi.org/10.1109/CVPR.2019.00318

Included in

Databases and Information Systems Commons
