Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

7-2020

Abstract

This paper proposes Value Selection (VS) as an alternative to current mainstream preprocessing methods. Unlike feature selection, which removes features, and instance selection, which removes instances, value selection removes individual values (with respect to each feature) from the dataset, with two goals: reducing the model size and preserving its accuracy. Two probabilistic methods based on an information-theoretic metric are proposed: PVS and P+VS. Extensive experiments are conducted on benchmark datasets of various sizes, and the results are compared with existing preprocessing methods, including feature selection, feature transformation, and instance selection. The experimental results show that value selection can balance accuracy against model size reduction.

Keywords

preprocessing, data mining, value selection, model size reduction, entropy, information theory

Discipline

Databases and Information Systems | Theory and Algorithms

Research Areas

Data Science and Engineering

Publication

Proceedings of the 21st IEEE International Conference on Mobile Data Management

Identifier

10.1109/MDM48529.2020.00037

Publisher

IEEE

City or Country

Versailles, France

Citation

NJOO, Gunarto Sindoro; ZHENG, Baihua; HSU, Kuo-Wei; and PENG, Wen-Chih. Probabilistic Value Selection for Space Efficient Model. (2020). Proceedings of the 21st IEEE International Conference on Mobile Data Management.
Available at: https://ink.library.smu.edu.sg/sis_research/5264

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

https://doi.org/10.1109/MDM48529.2020.00037

Included in

Databases and Information Systems Commons, Theory and Algorithms Commons

Research Collection School Of Computing and Information Systems

Probabilistic Value Selection for Space Efficient Model
