Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

6-2022

Abstract

Extracting class activation maps (CAM) is arguably the most standard step of generating pseudo masks for weakly supervised semantic segmentation (WSSS). Yet, we find that the crux of the unsatisfactory pseudo masks is the binary cross-entropy loss (BCE) widely used in CAM. Specifically, due to the sum-over-class pooling nature of BCE, each pixel in CAM may be responsive to multiple classes co-occurring in the same receptive field. To this end, we introduce an embarrassingly simple yet surprisingly effective method: reactivating the converged CAM with BCE by using softmax cross-entropy loss (SCE), dubbed ReCAM. Given an image, we use CAM to extract the feature pixels of each single class, and use them with the class label to learn another fully-connected layer (after the backbone) with SCE. Once converged, we extract ReCAM in the same way as in CAM.
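The steps named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names (`cam`, `class_feature`, `bce_loss`, `sce_loss`), the flattened list-of-pixels feature layout, and the toy numbers are all assumptions made for illustration. It shows how a CAM for one class might be computed from backbone features, how that CAM could weight the features into a single-class feature vector, and how the multi-label BCE differs from the single-label SCE that would train the extra fully-connected layer.

```python
import math

def cam(features, class_weights):
    """Toy CAM: features is a list of pixels, each a list of K channel values;
    class_weights holds the K classifier weights of one class (assumed layout).
    Returns one activation per pixel, ReLU-ed and scaled to [0, 1]."""
    raw = [max(0.0, sum(w * f for w, f in zip(class_weights, px)))
           for px in features]
    peak = max(raw)
    return [r / peak if peak > 0 else 0.0 for r in raw]

def class_feature(features, cam_map):
    """CAM-weighted average of the pixel features: a single-class feature."""
    n, k = len(features), len(features[0])
    return [sum(m * px[c] for m, px in zip(cam_map, features)) / n
            for c in range(k)]

def bce_loss(logits, labels):
    """Multi-label binary cross-entropy: an independent sigmoid per class."""
    total = 0.0
    for z, y in zip(logits, labels):
        p = 1.0 / (1.0 + math.exp(-z))
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(logits)

def sce_loss(logits, target):
    """Softmax cross-entropy: classes compete for one target label."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

# Toy example: 3 pixels, K = 2 channels, 2 classes (hypothetical values).
features = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
fc_weights = [[1.0, 0.0], [0.0, 1.0]]      # one weight row per class
cam_class0 = cam(features, fc_weights[0])
feat_class0 = class_feature(features, cam_class0)
# Logits of a second FC layer on the single-class feature (weights reused
# here purely to keep the sketch short).
logits = [sum(w * f for w, f in zip(row, feat_class0)) for row in fc_weights]
loss = sce_loss(logits, target=0)
```

In this sketch the SCE term is computed once per class present in the image, each time with that class's own feature vector and label, whereas BCE scores all classes of the image from one shared feature.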

Keywords

class activation maps, weakly supervised learning, semantic segmentation

Discipline

Databases and Information Systems | Graphics and Human Computer Interfaces

Research Areas

Data Science and Engineering

Publication

Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, June 18-24

First Page

969

Last Page

978

ISBN

9781665469463

Identifier

10.1109/CVPR52688.2022.00104

Publisher

IEEE

City or Country

New Orleans, Louisiana

Additional URL

http://doi.org/10.1109/CVPR52688.2022.00104
