Research Collection School Of Computing and Information Systems

Blocking reduction strategies in hierarchical text classification

Ee Peng LIM, Singapore Management University
Aixin SUN, Nanyang Technological University
Wee-Keong NG, Nanyang Technological University
Jaideep SRIVASTAVA, University of Minnesota - Twin Cities

Publication Type

Journal Article

Version

acceptedVersion

Publication Date

10-2004

Abstract

One common approach to hierarchical text classification associates classifiers with nodes in the category tree and classifies text documents in a top-down manner. Classification methods using this top-down approach scale well and cope with changes to the category tree. However, all of these methods suffer from blocking: a document wrongly rejected by a classifier at a higher level is never passed to the classifiers at lower levels. We propose a classifier-centric performance measure, known as the blocking factor, to determine the extent of blocking. We also propose three methods to address the blocking problem, namely threshold reduction, restricted voting, and extended multiplicative. Our experiments using support vector machine (SVM) classifiers on the Reuters collection show that all three methods reduce blocking and improve classification accuracy, and that the restricted voting method delivers the best performance.
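
The blocking effect described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the `Node` class, the toy `keyword_scorer` (a stand-in for the SVM classifiers used in the paper), the category names, the thresholds, and the `threshold_shift` parameter. The shift only conveys the general idea of lowering the thresholds of higher-level classifiers so that more documents reach the lower levels; the paper's actual threshold reduction, restricted voting, and extended multiplicative methods are not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """A category-tree node with a hypothetical local classifier."""
    name: str
    score: Callable[[str], float]  # maps a document to a confidence score
    threshold: float = 0.0         # the document is accepted if score >= threshold
    children: List["Node"] = field(default_factory=list)


def keyword_scorer(keywords):
    """Toy scorer (an assumption): fraction of tokens that are in `keywords`."""
    def score(doc: str) -> float:
        tokens = doc.lower().split()
        return sum(t in keywords for t in tokens) / max(len(tokens), 1)
    return score


def classify_top_down(doc: str, node: Node, threshold_shift: float = 0.0) -> List[str]:
    """Return the categories below `node` that accept `doc`.

    A child's subtree is visited only if the child itself accepts the document,
    so a wrong rejection at an internal node blocks all of its descendants.
    `threshold_shift` lowers the threshold of internal nodes only (illustrative).
    """
    accepted = []
    for child in node.children:
        cut = child.threshold - (threshold_shift if child.children else 0.0)
        if child.score(doc) >= cut:
            accepted.append(child.name)
            accepted.extend(classify_top_down(doc, child, threshold_shift))
    return accepted


# Hypothetical two-level tree: root -> grain -> wheat.
wheat = Node("wheat", keyword_scorer({"wheat"}), threshold=0.1)
grain = Node("grain", keyword_scorer({"grain", "crop", "harvest"}),
             threshold=0.2, children=[wheat])
root = Node("root", lambda doc: 1.0, children=[grain])

doc = "wheat exports rise as harvest ends"

# The "grain" classifier scores 1/6 < 0.2, so the document never reaches "wheat".
print(classify_top_down(doc, root))                        # []
# Lowering the internal threshold to 0.15 lets the document through.
print(classify_top_down(doc, root, threshold_shift=0.05))  # ['grain', 'wheat']
```

In this toy tree the "wheat" classifier would have accepted the document on its own; it is the rejection at "grain" that blocks it, which is the situation the blocking factor is meant to quantify.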

Keywords

Data mining, text mining, classification

Discipline

Databases and Information Systems | Numerical Analysis and Scientific Computing

Research Areas

Data Science and Engineering

Publication

IEEE Transactions on Knowledge and Data Engineering

Volume

16

Issue

10

First Page

1305

Last Page

1308

ISSN

1041-4347

Identifier

10.1109/TKDE.2004.50

Publisher

IEEE

Citation

LIM, Ee Peng; SUN, Aixin; NG, Wee-Keong; and SRIVASTAVA, Jaideep. Blocking reduction strategies in hierarchical text classification. (2004). IEEE Transactions on Knowledge and Data Engineering. 16, (10), 1305-1308.
Available at: https://ink.library.smu.edu.sg/sis_research/124

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

https://doi.org/10.1109/TKDE.2004.50

Included in

Databases and Information Systems Commons, Numerical Analysis and Scientific Computing Commons
