Deep pixel-level matching via attention for video co-segmentation

Publication Type

Journal Article

Publication Date

3-2020

Abstract

In video object co-segmentation, methods based on patch-level matching are widely used to extract the similarity between video frames. However, these methods reduce the precision of pixel localization and can therefore easily lead to pixel misclassification, degrading the accuracy of the segmentation results. To address this problem, we propose a framework based on deep neural networks and equipped with a new attention module designed for pixel-level matching, which segments the object across video frames. In this attention module, a pixel-level matching step compares the feature value of each pixel in one input frame with that of each pixel in the other input frame to compute the similarity between the two frames. A feature fusion step then efficiently fuses each frame's feature maps with this similarity information to generate dense attention features. Finally, an up-sampling step uses these dense attention features to refine the feature maps and obtain high-quality segmentation results. The ObMiC and DAVIS 2016 datasets were used to train and test our framework. Experimental results show that our framework achieves higher accuracy than other video segmentation methods that perform well in common-information extraction.
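
For intuition, the following is a minimal sketch (not the authors' implementation) of co-attention-style pixel-level matching between two frames' feature maps, written in PyTorch. The module name, the 1×1 fusion convolution, and all tensor names are illustrative assumptions; the paper's actual module may differ in normalization, fusion, and up-sampling details.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PixelMatchingAttention(nn.Module):
    """Illustrative pixel-level matching between two frames (assumed design)."""

    def __init__(self, channels: int):
        super().__init__()
        # 1x1 conv to fuse a frame's own features with its cross-frame summary
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor):
        # feat_a, feat_b: (N, C, H, W) backbone features of two input frames
        n, c, h, w = feat_a.shape
        a = feat_a.view(n, c, h * w)                 # (N, C, HW)
        b = feat_b.view(n, c, h * w)                 # (N, C, HW)

        # Affinity matrix: similarity of every pixel in A with every pixel in B
        affinity = torch.bmm(a.transpose(1, 2), b)   # (N, HW_a, HW_b)

        # For each pixel of one frame, a distribution over the other frame's pixels
        att_a = F.softmax(affinity, dim=2)           # A attends to B
        att_b = F.softmax(affinity, dim=1)           # B attends to A

        # Per-pixel summaries of the other frame's features
        summary_a = torch.bmm(b, att_a.transpose(1, 2)).view(n, c, h, w)
        summary_b = torch.bmm(a, att_b).view(n, c, h, w)

        # Fuse original features with cross-frame summaries
        # to form dense attention features for the decoder/up-sampling stage
        out_a = self.fuse(torch.cat([feat_a, summary_a], dim=1))
        out_b = self.fuse(torch.cat([feat_b, summary_b], dim=1))
        return out_a, out_b

# Example usage with hypothetical 1/8-resolution feature maps:
fa = torch.randn(2, 256, 30, 40)
fb = torch.randn(2, 256, 30, 40)
oa, ob = PixelMatchingAttention(256)(fa, fb)  # each (2, 256, 30, 40)
```

Because the affinity matrix is computed between individual pixels rather than patches, each location receives its own similarity distribution, which is the property the abstract credits for avoiding the localization loss of patch-level matching.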

Keywords

video co-segmentation, pixel-level matching, attention

Discipline

Information Security

Research Areas

Information Systems and Management

Publication

Applied Sciences

Volume

10

Issue

6

ISSN

2076-3417

Identifier

10.3390/app10061948

Publisher

MDPI

Additional URL

https://doi.org/10.3390/app10061948
