Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

11-2014

Abstract

We investigate online active learning techniques for classification tasks in data stream mining applications. Unlike traditional learning approaches (either batch or online learning), which typically request the class label of every incoming instance, online active learning queries only a subset of informative incoming instances to update the classification model, with the aim of maximizing classification performance while using minimal human labeling effort over the entire online stream mining task. In this paper, we present a new family of algorithms for online active learning, called Passive-Aggressive Active (PAA) learning algorithms, obtained by adapting the popular Passive-Aggressive algorithms to an online active learning setting. Unlike the conventional Perceptron-based approach, which uses only the misclassified instances to update the model, the proposed PAA learning algorithms update the classifier not only on misclassified instances but also on correctly classified examples with low prediction confidence. We theoretically analyse the mistake bounds of the proposed algorithms and conduct extensive experiments to examine their empirical performance; the encouraging results show clear advantages of our algorithms over the baselines.
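The idea in the abstract can be illustrated with a minimal Python sketch of one learning round. It is not the authors' reference implementation. It assumes a linear classifier, the standard Passive-Aggressive hinge-loss update, and a margin-based randomized query rule (query with probability delta / (delta + |margin|)). The function name `paa_step`, the `get_label` callback, the parameters `delta` and `C`, and the variant labels are all hypothetical names chosen for illustration.

```python
import random


def paa_step(w, x, get_label, delta=1.0, C=1.0, variant="PAA-I", rng=random):
    """One round of an active Passive-Aggressive learner (illustrative sketch).

    w         -- current weight vector (list of floats)
    x         -- incoming instance (list of floats)
    get_label -- callback returning the true label (+1 or -1); called only
                 when the learner decides to query (assumed interface)
    Returns (new_w, predicted_label, queried).
    """
    margin = sum(wi * xi for wi, xi in zip(w, x))
    y_hat = 1 if margin >= 0 else -1

    # Assumed query rule: low-confidence (small |margin|) instances are
    # queried with higher probability.
    if rng.random() >= delta / (delta + abs(margin)):
        return w, y_hat, False  # label not requested, model unchanged

    y = get_label()
    # Hinge loss is positive for misclassified instances AND for correctly
    # classified ones whose margin is below 1, so both trigger an update.
    loss = max(0.0, 1.0 - y * margin)
    sq_norm = sum(xi * xi for xi in x)
    if loss == 0.0 or sq_norm == 0.0:
        return w, y_hat, True  # passive: confident and correct

    # Standard Passive-Aggressive step sizes (variant labels are assumptions).
    if variant == "PAA":
        tau = loss / sq_norm
    elif variant == "PAA-I":
        tau = min(C, loss / sq_norm)
    else:  # "PAA-II"
        tau = loss / (sq_norm + 1.0 / (2.0 * C))

    new_w = [wi + tau * y * xi for wi, xi in zip(w, x)]
    return new_w, y_hat, True
```

In this sketch, an instance with margin 0.5 and true label +1 is classified correctly yet still moves the weights, because its hinge loss is 0.5. A Perceptron-style learner would leave the model unchanged in that case, which is the contrast the abstract draws.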

Keywords

Online Learning, Data Stream, Active Learning, Passive-Aggressive

Discipline

Databases and Information Systems | Numerical Analysis and Scientific Computing

Research Areas

Data Science and Engineering

Publication

JMLR: Workshop and Conference Proceedings: Asian Conference on Machine Learning (ACML 2014), Nha Trang City, Vietnam, 26-28 November 2014

Volume

39

First Page

266

Last Page

282

Publisher

JMLR

City or Country

Cambridge, MA

Copyright Owner and License

Authors

Comments

Best Runner-Up Paper Award, 26-28 November 2014

Additional URL

http://jmlr.org/proceedings/papers/v39/lu14.pdf
