Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

11-2014

Abstract

One of the fundamental problems in image search is to rank image documents according to a given textual query. We address two limitations of existing image search engines in this paper. First, there is no straightforward way of comparing textual keywords with visual image content. Image search engines therefore depend heavily on the surrounding text, which is often noisy or too sparse to accurately describe the image content. Second, ranking functions are trained on query-image pairs labeled by human labelers, which makes annotation intellectually expensive and difficult to scale up. We demonstrate that these two fundamental challenges can be mitigated by jointly exploring subspace learning and the use of click-through data. The former aims to create a latent subspace in which information from the originally incomparable views (i.e., textual and visual views) can be compared, while the latter exploits the widely available and freely accessible click-through data (i.e., “crowdsourced” human intelligence) for query understanding. Specifically, we investigate a series of click-through-based subspace learning (CSL) techniques for image search. We conduct experiments on the MSR-Bing Grand Challenge, and the final evaluation achieves DCG@25 = 0.47225. Moreover, the feature dimension is reduced by several orders of magnitude (e.g., from thousands to tens).

Keywords

Click-through data, DNN image representation, Image search, Subspace learning

Discipline

Databases and Information Systems | Data Storage Systems | Graphics and Human Computer Interfaces

Research Areas

Intelligent Systems and Optimization

Publication

Proceedings of the 22nd ACM international conference on Multimedia, MM 2014, Orlando, Florida, November 3-7

First Page

233

Last Page

236

ISBN

9781450330633

Identifier

10.1145/2647868.2656404

Publisher

ACM

City or Country

Orlando
