Publication Type
Journal Article
Version
publishedVersion
Publication Date
7-2010
Abstract
The Bag-of-Words (BoW) model is a promising image representation technique for image categorization and annotation tasks. One critical limitation of existing BoW models is that much semantic information is lost during codebook generation, an important step of BoW. This is because the codebook is often obtained simply by clustering visual features in Euclidean space. However, visual features related to the same semantics may not form clusters in Euclidean space, primarily because of the semantic gap between low-level features and high-level semantics. In this paper, we propose a novel scheme to learn optimized BoW models, which aims to map semantically related features to the same visual words. In particular, we consider the distance between semantically identical features as a measurement of the semantic gap, and attempt to learn an optimized codebook by minimizing this gap, aiming to achieve the minimal loss of semantics. We refer to this novel kind of codebook as a semantics-preserving codebook (SPC) and to the corresponding model as the Semantics-Preserving Bag-of-Words (SPBoW) model. Extensive experiments on image annotation and object detection tasks with public testbeds from MIT's LabelMe and PASCAL VOC challenge databases show that the proposed SPC learning scheme is effective for optimizing the codebook generation process, and that the SPBoW model greatly enhances the performance of the existing BoW model.
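For orientation, the conventional codebook step that the abstract criticizes (clustering visual features in Euclidean space, then counting visual words per image) might be sketched as below. This is only a minimal illustration of the baseline BoW pipeline under stated assumptions, not the SPC learning scheme proposed in the paper: the function names (`build_codebook`, `bow_histogram`, and the helpers), the use of plain k-means, and the toy 2-D feature vectors are all hypothetical choices made for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(feature, codebook):
    """Index of the visual word (cluster center) closest to the feature."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(feature, codebook[k]))


def build_codebook(features, num_words, iterations=20, seed=0):
    """Plain k-means (Lloyd's algorithm) over local feature vectors.

    Assumption: this stands in for the generic Euclidean clustering step;
    it uses no semantic labels at all.
    """
    rng = random.Random(seed)
    codebook = [list(f) for f in rng.sample(features, num_words)]
    for _ in range(iterations):
        clusters = [[] for _ in range(num_words)]
        for f in features:
            clusters[nearest_word(f, codebook)].append(f)
        for k, members in enumerate(clusters):
            if members:  # keep the old center if a cluster is empty
                dim = len(members[0])
                codebook[k] = [sum(m[d] for m in members) / len(members)
                               for d in range(dim)]
    return codebook


def bow_histogram(image_features, codebook):
    """Normalized histogram of visual-word counts for one image."""
    counts = [0] * len(codebook)
    for f in image_features:
        counts[nearest_word(f, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


# Toy data (hypothetical): two well-separated groups of 2-D "features".
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
            (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
codebook = build_codebook(features, num_words=2)
hist = bow_histogram([(0.05, 0.05), (5.0, 5.05), (4.9, 5.0)], codebook)
```

Because the words here are defined purely by Euclidean proximity, two features that depict the same concept but lie far apart in feature space would fall into different bins, which is the loss of semantics the abstract sets out to reduce.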
Keywords
Image retrieval, Research and development, Image storage, Image segmentation, Image representation, Loss measurement, Particle measurements, Object detection, Testing, Image databases
Discipline
Computer Sciences | Databases and Information Systems
Research Areas
Data Science and Engineering
Publication
IEEE Transactions on Image Processing
Volume
19
Issue
7
First Page
1908
Last Page
1920
ISSN
1057-7149
Identifier
10.1109/TIP.2010.2045169
Publisher
IEEE
Citation
WU, Lei; HOI, Steven C. H.; and YU, Nenghai.
Semantics-Preserving Bag-of-Words Models and Applications. (2010). IEEE Transactions on Image Processing. 19, (7), 1908-1920.
Available at: https://ink.library.smu.edu.sg/sis_research/2309
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1109/TIP.2010.2045169