Publication Type

Conference Proceeding Article

Publication Date

7-2011

Abstract

Most studies of online learning measure the performance of a learner by classification accuracy, which is inappropriate for applications where the data are unevenly distributed among different classes. We address this limitation by developing an online learning algorithm for maximizing the Area Under the ROC Curve (AUC), a metric that is widely used for measuring classification performance on imbalanced data distributions. The key challenge of online AUC maximization is that it needs to optimize a pairwise loss between two instances from different classes. This is in contrast to the classical setup of online learning, where the overall loss is a sum of losses over individual training examples. We address this challenge by exploiting the reservoir sampling technique, and present two algorithms for online AUC maximization with theoretical performance guarantees. Extensive experimental studies confirm the effectiveness and the efficiency of the proposed algorithms for maximizing AUC.
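
To illustrate the idea the abstract describes, the following is a minimal sketch, not the paper's algorithms. It assumes a linear scorer, a pairwise hinge loss, and one fixed-size buffer per class maintained by classic reservoir sampling. All names (`reservoir_update`, `pairwise_hinge`, `online_auc_sketch`) and parameters (`capacity`, `lr`) are hypothetical choices made for this illustration.

```python
import random


def reservoir_update(buffer, item, seen, capacity, rng=random):
    """One step of classic reservoir sampling.

    `seen` is the number of items observed so far, including `item`.
    After the call, `buffer` holds a uniform sample of at most
    `capacity` of the items seen so far.
    """
    if len(buffer) < capacity:
        buffer.append(item)
    else:
        j = rng.randrange(seen)  # uniform integer in [0, seen)
        if j < capacity:
            buffer[j] = item


def pairwise_hinge(w, x_pos, x_neg):
    """Hinge loss on a (positive, negative) pair for a linear scorer w."""
    margin = sum(wi * (p - n) for wi, p, n in zip(w, x_pos, x_neg))
    return max(0.0, 1.0 - margin)


def online_auc_sketch(stream, dim, capacity=10, lr=0.1, seed=0):
    """Assumed simplification: pair each arriving example with the
    buffered examples of the opposite class and take a subgradient
    step on every pair that violates the margin."""
    rng = random.Random(seed)
    w = [0.0] * dim
    buffers = {1: [], -1: []}  # one reservoir per class label
    seen = {1: 0, -1: 0}
    for x, y in stream:  # y is +1 or -1
        for x_other in buffers[-y]:
            x_pos, x_neg = (x, x_other) if y == 1 else (x_other, x)
            if pairwise_hinge(w, x_pos, x_neg) > 0.0:
                w = [wi + lr * (p - n) for wi, p, n in zip(w, x_pos, x_neg)]
        seen[y] += 1
        reservoir_update(buffers[y], x, seen[y], capacity, rng)
    return w
```

The buffers keep memory bounded regardless of stream length, which is why a sampling step of this kind makes a pairwise loss tractable in the online setting. The update rule and step size here are placeholders; the paper should be consulted for the actual algorithms and their guarantees.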

Discipline

Computer Sciences | Databases and Information Systems

Research Areas

Data Management and Analytics

Publication

Proceedings of the Twenty-eighth International Conference on Machine Learning: Bellevue, Washington, USA, June 28 - July 2, 2011

ISBN

9781450306195

Publisher

International Machine Learning Society

City or Country

Madison

Creative Commons License

Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License

Additional URL

http://www.icml-2011.org/papers/198_icmlpaper.pdf
