Publication Type

Journal Article

Version

acceptedVersion

Publication Date

8-2014

Abstract

We study the problem of online multitask learning for solving multiple related classification tasks in parallel, aiming at classifying every sequence of data received by each task accurately and efficiently. One practical example of online multitask learning is micro-blog sentiment detection for a group of users, which classifies the micro-blog posts generated by each user into emotional or non-emotional categories. This particular online learning task is challenging for a number of reasons. First, to meet the critical requirements of online applications, a highly efficient and scalable classification solution that can make immediate predictions with low learning cost is needed. This requirement leaves conventional batch learning algorithms out of consideration. Second, classical classification methods, whether batch or online, often encounter a dilemma when applied to a group of tasks: on one hand, a single classification model trained on the entire collection of data from all tasks may fail to capture the characteristics of each individual task; on the other hand, a model trained independently on each individual task may suffer from insufficient training data. To overcome these challenges, we propose a collaborative online multitask learning method, which learns a global model over the entire data of all tasks. At the same time, individual models for multiple related tasks are jointly inferred by leveraging the global model through a collaborative online learning approach. We illustrate the efficacy of the proposed technique on a synthetic dataset. We also evaluate it on three real-life problems: spam email filtering, bioinformatics data classification, and micro-blog sentiment detection. Experimental results show that our method is effective and scalable at the online classification of multiple related tasks.

Keywords

Artificial intelligence, Data mining, Machine learning, classification, learning systems, multitask learning, online learning

Discipline

Computer Sciences | Databases and Information Systems

Research Areas

Data Science and Engineering

Publication

IEEE Transactions on Knowledge and Data Engineering (TKDE)

Volume

26

Issue

8

First Page

1866

Last Page

1876

ISSN

1041-4347

Identifier

10.1109/TKDE.2013.139

Publisher

IEEE

Copyright Owner and License

Authors

Additional URL

https://doi.org/10.1109/TKDE.2013.139
