Publication Type

Conference Proceeding Article

Publication Date



Automated face annotation aims to automatically detect human faces from a photo and further name the faces with the corresponding human names. In this paper, we tackle this open problem by investigating a search-based face annotation (SBFA) paradigm for mining large amounts of web facial images freely available on the WWW. Given a query facial image for annotation, the idea of SBFA is to first search for top-n similar facial images from a web facial image database and then exploit these top-ranked similar facial images and their weak labels for naming the query facial image. To fully mine those information, this paper proposes a novel framework of Learning to Name Faces (L2NF) – a unified multimodal learning approach for search-based face annotation, which consists of the following major components: (i) we enhance the weak labels of top-ranked similar images by exploiting the “label smoothness" assumption; (ii) we construct the multimodal representations of a facial image by extracting different types of features; (iii) we optimize the distance measure for each type of features using distance metric learning techniques; and finally (iv) we learn the optimal combination of multiple modalities for annotation through a learning to rank scheme. We conduct a set of extensive empirical studies on two real-world facial image databases, in which encouraging results show that the proposed algorithms significantly boost the naming accuracy of search-based face annotation task.


Computer Sciences | Databases and Information Systems

Research Areas

Data Management and Analytics


Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval: July 28 - August 1, 2013, Dublin, Ireland

First Page


Last Page








City or Country

New York

Additional URL