Publication Type

Journal Article

Version

acceptedVersion

Publication Date

9-2016

Abstract

Online learning plays an important role in many big data mining problems because of its high efficiency and scalability. In the literature, many online learning algorithms using gradient information have been applied to solve online classification problems. Recently, more effective second-order algorithms have been proposed, where the correlation between the features is utilized to improve the learning efficiency. Among them, Confidence-Weighted (CW) learning algorithms are very effective, which assume that the classification model is drawn from a Gaussian distribution, which enables the model to be effectively updated with the second-order information of the data stream. Despite being studied actively, these CW algorithms cannot handle nonseparable datasets and noisy datasets very well. In this article, we propose a family of Soft Confidence-Weighted (SCW) learning algorithms for both binary classification and multiclass classification tasks, which is the first family of online classification algorithms that enjoys four salient properties simultaneously: (1) large margin training, (2) confidence weighting, (3) capability to handle nonseparable data, and (4) adaptive margin. Our experimental results show that the proposed SCW algorithms significantly outperform the original CW algorithm. When comparing with a variety of state-of-the-art algorithms (including AROW, NAROW, and NHERD), we found that SCW in general achieves better or at least comparable predictive performance, but enjoys considerably better efficiency advantage (i.e., using a smaller number of updates and lower time cost). To facilitate future research, we release all the datasets and source code to the public at http://libol.stevenhoi.org/.

Keywords

machine learning, online learning

Discipline

Computer Sciences | Databases and Information Systems | Theory and Algorithms

Research Areas

Data Science and Engineering

Publication

ACM Transactions on Intelligent Systems and Technology

Volume

8

Issue

1

First Page

1

Last Page

32

ISSN

2157-6904

Identifier

10.1145/2932193

Publisher

Association for Computing Machinery (ACM)

Copyright Owner and License

Authors

Additional URL

https://doi.org/10.1145/2932193
