Research Collection School Of Computing and Information Systems

CrowdTC: Crowd-powered learning for text classification

Publication Type

Journal Article

Version

acceptedVersion

Publication Date

2-2022

Abstract

Text classification is a fundamental task in content analysis. Nowadays, deep learning has demonstrated promising performance in text classification compared with shallow models. However, almost all the existing models do not take advantage of the wisdom of human beings to help text classification. Human beings are more intelligent and capable than machine learning models in terms of understanding and capturing the implicit semantic information from text. In this article, we try to take guidance from human beings to classify text. We propose Crowd-powered learning for Text Classification (CrowdTC for short). We design and post the questions on a crowdsourcing platform to extract keywords in text. Sampling and clustering techniques are utilized to reduce the cost of crowdsourcing. Also, we present an attention-based neural network and a hybrid neural network to incorporate the extracted keywords as human guidance into deep neural networks. Extensive experiments on public datasets confirm that CrowdTC improves the text classification accuracy of neural networks by using the crowd-powered keyword guidance.

Keywords

Text classification, crowdsourcing, keyword extraction, neural networks

Discipline

Databases and Information Systems | Numerical Analysis and Scientific Computing

Research Areas

Data Science and Engineering

Publication

ACM Transactions on Knowledge Discovery from Data

Volume

16

Issue

1

First Page

15:1

Last Page

15:23

ISSN

1556-4681

Identifier

10.1145/3457216

Publisher

Association for Computing Machinery (ACM)

Citation

YANG, Keyu; GAO, Yunjun; LIANG, Lei; BIAN, Song; CHEN, Lu; and ZHENG, Baihua. CrowdTC: Crowd-powered learning for text classification. (2022). ACM Transactions on Knowledge Discovery from Data. 16, (1), 15:1-15:23.
Available at: https://ink.library.smu.edu.sg/sis_research/7149

Copyright Owner and License

Publisher

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Additional URL

https://doi.org/10.1145/3457216

Included in

Databases and Information Systems Commons, Numerical Analysis and Scientific Computing Commons
