Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

7-2011

Abstract

Most studies of online learning measure the performance of a learner by classification accuracy, which is inappropriate for applications where the data are unevenly distributed among different classes. We address this limitation by developing online learning algorithms for maximizing the Area Under the ROC Curve (AUC), a metric that is widely used for measuring classification performance on imbalanced data distributions. The key challenge of online AUC maximization is that it needs to optimize a pairwise loss between two instances from different classes. This is in contrast to the classical setup of online learning, where the overall loss is a sum of losses over individual training examples. We address this challenge by exploiting the reservoir sampling technique, and present two algorithms for online AUC maximization with theoretical performance guarantees. Extensive experimental studies confirm the effectiveness and the efficiency of the proposed algorithms for maximizing AUC.
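
The two ingredients named in the abstract, a pairwise criterion over instances from different classes and reservoir sampling over a stream, can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names `reservoir_update` and `pairwise_auc`, the buffer capacity, and the tie-handling convention are all assumptions made for illustration. The sketch uses the standard reservoir sampling procedure (Algorithm R) and the usual empirical AUC, computed as the fraction of correctly ordered positive/negative pairs.

```python
import random


def reservoir_update(buffer, item, seen, capacity, rng=random):
    """Standard reservoir sampling (Algorithm R), shown for illustration.

    `seen` is the number of stream items observed so far, including `item`.
    After each call, `buffer` holds a uniform random sample of at most
    `capacity` items from the stream seen so far.
    """
    if len(buffer) < capacity:
        buffer.append(item)
    else:
        j = rng.randrange(seen)  # uniform integer in [0, seen)
        if j < capacity:
            buffer[j] = item


def pairwise_auc(pos_scores, neg_scores):
    """Empirical AUC: fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (a common convention, assumed here).
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one score from each class")
    correct = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos_scores) * len(neg_scores))


# Keep a fixed-size sample of a stream of 100 hypothetical items.
sample = []
rng = random.Random(0)
for t, x in enumerate(range(100), start=1):
    reservoir_update(sample, x, seen=t, capacity=5, rng=rng)

# Three of the four positive/negative pairs are ordered correctly.
auc = pairwise_auc([0.9, 0.8], [0.1, 0.85])
```

Because the AUC depends on pairs rather than on single examples, an online learner cannot evaluate it from the current example alone; a bounded buffer of past examples, such as the one kept by `reservoir_update`, is one way to keep instances of the opposite class available for pairing.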

Discipline

Computer Sciences | Databases and Information Systems | Theory and Algorithms

Research Areas

Data Science and Engineering

Publication

Proceedings of the 28th International Conference on Machine Learning ICML 2011: Bellevue, WA, June 28 - July 2

First Page

233

Last Page

240

ISBN

9781450306195

Publisher

International Machine Learning Society

City or Country

Madison, WI

Copyright Owner and License

Authors

Additional URL

https://www.icml-2011.org/papers/198_icmlpaper.pdf
