Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

6-2006

Abstract

The goal of active learning is to select the most informative examples for manual labeling. Most of the previous studies in active learning have focused on selecting a single unlabeled example in each iteration. This could be inefficient since the classification model has to be retrained for every labeled example. In this paper, we present a framework for "batch mode active learning" that applies the Fisher information matrix to select a number of informative examples simultaneously. The key computational challenge is how to efficiently identify the subset of unlabeled examples that can result in the largest reduction in the Fisher information. To resolve this challenge, we propose an efficient greedy algorithm that is based on the property of submodular functions. Our empirical studies with five UCI datasets and one real-world medical image classification task show that the proposed batch mode active learning algorithm is more effective than the state-of-the-art algorithms for active learning.

Keywords

Batch mode active learning, Greedy algorithm, Manual labeling, Iterative methods, Medical imaging

Discipline

Computer Sciences | Databases and Information Systems | Medicine and Health Sciences

Research Areas

Data Science and Engineering

Publication

ICML '06: Proceedings of the 23rd International Conference on Machine Learning: Pittsburgh, PA, June 25-29

First Page

417

Last Page

424

ISBN

9781595933836

Identifier

10.1145/1143844.1143897

Publisher

ACM

City or Country

New York

Copyright Owner and License

Authors

Additional URL

https://doi.org/10.1145/1143844.1143897
