Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
11-2014
Abstract
Learning effective feature representations and similarity measures is crucial to the retrieval performance of a content-based image retrieval (CBIR) system. Despite decades of extensive research, it remains one of the most challenging open problems and considerably hinders the success of real-world CBIR systems. The key challenge has been attributed to the well-known "semantic gap" between the low-level image pixels captured by machines and the high-level semantic concepts perceived by humans. Among various techniques, machine learning has been actively investigated as a possible direction for bridging the semantic gap in the long term. Inspired by the recent successes of deep learning techniques in computer vision and other applications, in this paper we attempt to address an open problem: whether deep learning offers hope for bridging the semantic gap in CBIR, and how much improvement in CBIR tasks can be achieved by exploring state-of-the-art deep learning techniques for learning feature representations and similarity measures. Specifically, we investigate a framework of deep learning applied to CBIR tasks through an extensive set of empirical studies, examining a state-of-the-art deep learning method (convolutional neural networks) under varied settings. From our empirical studies, we find some encouraging results and summarize some important insights for future research.
Keywords
content-based image retrieval, convolutional neural networks, deep learning, feature representation
Discipline
Databases and Information Systems
Research Areas
Data Science and Engineering
Publication
MM '14: Proceedings of the 22nd ACM International Conference on Multimedia: November 3-7, 2014, Orlando
First Page
157
Last Page
166
ISBN
9781450330633
Identifier
10.1145/2647868.2654948
Publisher
ACM
City or Country
New York
Citation
WAN, Ji; WANG, Dayong; HOI, Steven C. H.; WU, Pengcheng; ZHU, Jianke; ZHANG, Yongdong; and LI, Jintao.
Deep learning for content-based image retrieval: A comprehensive study. (2014). MM '14: Proceedings of the 22nd ACM International Conference on Multimedia: November 3-7, 2014, Orlando. 157-166.
Available at: https://ink.library.smu.edu.sg/sis_research/2320
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1145/2647868.2654948