Research Collection School Of Computing and Information Systems

On the sampling of web images for learning visual concept classifiers

Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

7-2010

Abstract

Visual concept learning often requires a large set of training images. In practice, however, acquiring noise-free training labels with sufficient positive examples is always expensive. A plausible solution for training data collection is to sample the widely available user-tagged images from social media websites. Given the general belief that the probability of correct tagging is higher than that of incorrect tagging, such a solution often sounds feasible, though it is not without challenges. First, user tags can be subjective and, to a certain extent, ambiguous. For instance, an image tagged with “whales” may simply be a picture of an ocean museum. Learning the concept “whales” with such training samples will not be effective. Second, user tags can be overly abbreviated. For instance, an image of the concept “wedding” may be tagged with “love” or simply the couple’s names. As a result, crawling sufficient positive training examples is difficult. This paper empirically studies the impact of exploiting tagged images for concept learning, investigating how the quality of pseudo training images affects concept detection performance. In addition, we propose a simple approach, named semantic field, for predicting the relevance between a target concept and the tag list associated with an image. Specifically, the relevance is determined through concept-tag co-occurrence by exploring external sources such as WordNet and Wikipedia. The proposed approach is shown to be effective in selecting pseudo training examples, exhibiting better performance in concept learning than other approaches such as those based on keyword sampling and tag voting.
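
The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea of ranking images by how related their tag list is to a target concept. The similarity table, the averaging rule, and the function names (`tag_similarity`, `semantic_field_score`, `sample_pseudo_positives`) are assumptions made for illustration, not the authors' method; in the paper the concept-tag relatedness is derived from co-occurrence in external sources such as WordNet and Wikipedia.

```python
from typing import Dict, List, Tuple

# Hypothetical concept-tag relatedness values in [0, 1]. In practice these
# would be estimated from co-occurrence in external sources; the numbers
# here are made up for illustration only.
SIMILARITY: Dict[Tuple[str, str], float] = {
    ("whales", "ocean"): 0.6,
    ("whales", "sea"): 0.5,
    ("whales", "museum"): 0.1,
}


def tag_similarity(concept: str, tag: str) -> float:
    """Look up the relatedness of a tag to the concept (symmetric lookup)."""
    if concept == tag:
        return 1.0
    return SIMILARITY.get((concept, tag), SIMILARITY.get((tag, concept), 0.0))


def semantic_field_score(concept: str, tags: List[str]) -> float:
    """Average relatedness of all tags to the concept (assumed aggregation)."""
    if not tags:
        return 0.0
    return sum(tag_similarity(concept, t) for t in tags) / len(tags)


def sample_pseudo_positives(
    concept: str, images: Dict[str, List[str]], top_k: int
) -> List[str]:
    """Return the top_k image ids whose tag lists score highest for the concept."""
    ranked = sorted(
        images.items(),
        key=lambda item: semantic_field_score(concept, item[1]),
        reverse=True,
    )
    return [image_id for image_id, _ in ranked[:top_k]]
```

Under these assumed values, an image tagged "whales, ocean, sea" would rank above one tagged "whales, museum, tickets", even though both carry the keyword "whales". This is the kind of distinction that plain keyword sampling cannot make.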

Keywords

Concept detection, Sampling, Web images

Discipline

Data Storage Systems | Graphics and Human Computer Interfaces

Research Areas

Intelligent Systems and Optimization

Publication

Proceedings of the ACM International Conference on Image and Video Retrieval, ACM-CIVR 2010, Xi’an, China, July 5-7

First Page

50

Last Page

57

ISBN

9781450301176

Identifier

10.1145/1816041.1816051

Publisher

ACM

City or Country

Xi'an, China

Citation

ZHU, Shiai; WANG, Gang; NGO, Chong-wah; and JIANG, Yu-Gang. On the sampling of web images for learning visual concept classifiers. (2010). Proceedings of the ACM International Conference on Image and Video Retrieval, ACM-CIVR 2010, Xi’an, China, July 5-7. 50-57.
Available at: https://ink.library.smu.edu.sg/sis_research/6479

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Included in

Data Storage Systems Commons, Graphics and Human Computer Interfaces Commons
