Publication Type

Conference Proceeding Article

Version

Postprint

Publication Date

10-2015

Abstract

Relative similarity learning, an important learning scheme for information retrieval, aims to learn a bi-linear similarity function from a collection of labeled instance-pairs, such that the learned function assigns a high similarity value to a similar instance-pair and a low value to a dissimilar pair. Existing algorithms usually assume that the labels of all pairs in the data stream are available for learning. However, this is not always realistic in practice, since the number of possible pairs is quadratic in the number of instances in the database, and manually labeling the pairs can be very costly and time consuming. To overcome this limitation, we propose a novel framework of active online similarity learning. Specifically, we propose two new algorithms: (i) PAAS: Passive-Aggressive Active Similarity learning; and (ii) CWAS: Confidence-Weighted Active Similarity learning, and we prove their mistake bounds in theory. We have conducted extensive experiments on a variety of real-world data sets, and the encouraging results validate the empirical effectiveness of the proposed algorithms.
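
For readers unfamiliar with the setting, the following is a minimal illustrative sketch of one round of active online learning of a bi-linear similarity function. It is not the PAAS or CWAS algorithm from the paper, whose details are not given in this record. It assumes a score of the form x1^T M x2, a pair label y in {+1, -1}, a hinge loss with unit margin, a standard passive-aggressive step size, and a margin-based randomized query rule with a hypothetical smoothing parameter `delta`. All function and parameter names are chosen for illustration only.

```python
import random


def bilinear_score(M, x1, x2):
    """Return x1^T M x2, with M stored as a list of rows."""
    return sum(x1[i] * M[i][j] * x2[j]
               for i in range(len(x1)) for j in range(len(x2)))


def active_pa_step(M, x1, x2, get_label, delta=1.0, rng=random):
    """Process one instance-pair; return True if its label was queried.

    Assumptions (not taken from the paper): the label is requested with
    probability delta / (delta + |score|), so low-confidence pairs are
    queried more often, and M is updated in place with a
    passive-aggressive step on the hinge loss max(0, 1 - y * score).
    """
    score = bilinear_score(M, x1, x2)

    # Randomized query decision: skip the (costly) label most of the
    # time when the current score is far from zero.
    if rng.random() >= delta / (delta + abs(score)):
        return False

    y = get_label()  # +1 for a similar pair, -1 for a dissimilar pair
    loss = max(0.0, 1.0 - y * score)

    # Squared Frobenius norm of the outer product x1 x2^T.
    norm_sq = sum(a * a for a in x1) * sum(b * b for b in x2)

    if loss > 0.0 and norm_sq > 0.0:
        tau = loss / norm_sq  # smallest step that removes the loss
        for i in range(len(x1)):
            for j in range(len(x2)):
                M[i][j] += tau * y * x1[i] * x2[j]
    return True
```

Passing the label as a callable (`get_label`) reflects the motivation in the abstract: the labeling cost is paid only for the pairs the learner decides to query.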

Keywords

machine learning, data streams, online learning

Discipline

Databases and Information Systems

Research Areas

Data Management and Analytics

Publication

CIKM 2015: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management: Melbourne, Australia, October 19-23, 2015

First Page

1181

Last Page

1190

ISBN

9781450337946

Identifier

10.1145/2806416.2806464

Publisher

ACM

City or Country

New York

Creative Commons License

Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License

Additional URL

http://doi.org/10.1145/2806416.2806464
