Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

9-2007

Abstract

Based on keypoints extracted as salient image patches, an image can be described as a “bag of visual words,” and this representation has been used in scene classification. The choice of dimension, selection, and weighting of visual words in this representation is crucial to classification performance but has not been thoroughly studied in previous work. Given the analogy between this representation and the bag-of-words representation of text documents, we apply techniques used in text categorization, including term weighting, stop word removal, and feature selection, to generate image representations that differ in the dimension, selection, and weighting of visual words. The impact of these representation choices on scene classification is studied through extensive experiments on the TRECVID and PASCAL collections. This study provides an empirical basis for designing visual-word representations that are likely to produce superior classification performance.

Keywords

Bag-of-visual-words, Keypoint, Local interest point, Scene classification

Discipline

Data Storage Systems | Graphics and Human Computer Interfaces

Research Areas

Intelligent Systems and Optimization

Publication

Proceedings of the international workshop on Workshop on multimedia information retrieval: MIR07, Augsburg, Bavaria, September 28-29

First Page

197

Last Page

206

ISBN

9781595937780

Identifier

10.1145/1290082.1290111

Publisher

ACM

City or Country

Augsburg, Bavaria
