Publication Type

Journal Article

Publication Date

2010

Abstract

Learning a good distance metric plays a vital role in many multimedia retrieval and data mining tasks. For example, a typical content-based image retrieval (CBIR) system often relies on an effective distance metric to measure similarity between any two images. Conventional CBIR systems that simply adopt the Euclidean distance metric often fail to return satisfactory results, mainly due to the well-known semantic gap challenge. In this article, we present a novel framework of Semi-Supervised Distance Metric Learning for learning effective distance metrics by exploring the historical relevance feedback log data of a CBIR system and utilizing unlabeled data when log data are limited and noisy. We formulate the learning problem as a convex optimization task and then present a new technique, named “Laplacian Regularized Metric Learning” (LRML). Two efficient algorithms are then proposed to solve the LRML task. Further, we apply the proposed technique to two applications. One direct application is Collaborative Image Retrieval (CIR), which aims to explore the CBIR log data to improve the retrieval performance of CBIR systems. The other application is Collaborative Image Clustering (CIC), which aims to explore the CBIR log data to enhance the clustering performance of image pattern clustering tasks. We conduct an extensive evaluation to compare the proposed LRML method with a number of competing methods, including 2 standard metrics, 3 unsupervised metrics, and 4 supervised metrics with side information. Encouraging results validate the effectiveness of the proposed technique.

Keywords

Distance metric learning, content-based image retrieval, multimedia data clustering

Discipline

Theory and Algorithms

Publication

ACM Transactions on Multimedia Computing, Communications and Applications (TOMCCAP)

Volume

6

Issue

3

ISSN

1551-6857

Identifier

10.1145/1823746.1823752
