Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

11-2016

Abstract

Learning from data streams has been an important open research problem in the era of big data analytics. This paper investigates supervised machine learning techniques for mining data streams with application to online anomaly detection. Unlike conventional machine learning tasks, machine learning from data streams for online anomaly detection has several challenges: (i) data arriving sequentially and increasing rapidly; (ii) highly class-imbalanced distributions; and (iii) complex anomaly patterns that could evolve dynamically. To tackle these challenges, we propose a novel Cost-Sensitive Online Multiple Kernel Classification (CSOMKC) scheme for comprehensively mining data streams and demonstrate its application to online anomaly detection. Specifically, CSOMKC learns a kernel-based cost-sensitive prediction model for imbalanced data streams in a sequential or online learning fashion, in which a pool of multiple diverse kernels is dynamically explored. The optimal kernel predictor and the multiple kernel combination are learnt together, and class imbalance issues are addressed simultaneously. We give both theoretical and extensive empirical analysis of the proposed algorithms.

Keywords

Cost-Sensitive Learning, Online Learning, Multiple Kernel Learning

Discipline

Databases and Information Systems

Research Areas

Data Science and Engineering

Publication

JMLR: Workshop and Conference Proceedings: 8th Asian Conference on Machine Learning: Hamilton, New Zealand, 2016 November 16-18

Volume

63

First Page

65

Last Page

80

ISSN

1532-4435

Publisher

JMLR

City or Country

Cambridge, MA

Copyright Owner and License

Authors

Additional URL

https://proceedings.mlr.press/v63/sahoo56.pdf
