Publication Type
Journal Article
Version
acceptedVersion
Publication Date
11-2021
Abstract
To improve the classification performance of support vector machines (SVMs) on imbalanced datasets, cost-sensitive learning methods have been proposed, e.g., DEC (Different Error Costs) and FSVM-CIL (Fuzzy SVM for Class Imbalance Learning). They relocate the hyperplane by adjusting the costs associated with misclassifying samples. However, the error costs are determined either empirically or by performing an exhaustive search in the parameter space. Neither strategy can guarantee effectiveness and efficiency simultaneously. In this paper, we propose ATEC, a solution that can efficiently find a preferable hyperplane by automatically tuning the error cost for between-class samples. ATEC distinguishes itself from all existing parameter tuning strategies by two main features: (1) it can evaluate how effective an error cost is in terms of classification accuracy; and (2) it changes the error cost in the right direction if it is not effective. Extensive experiments show that, compared with the state-of-the-art methods, SVMs equipped with ATEC not only obtain comparable improvements in terms of the F1 score of the minority class, the area under the precision-recall curve (AUC-PR), and the area under the ROC curve (AUC-ROC), but also outperform the grid-search parameter tuning strategy by two orders of magnitude in terms of training time when a high F1 score is required.
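For readers unfamiliar with the error costs the abstract refers to, the soft-margin objective commonly associated with DEC can be sketched as follows. This is the standard textbook form, not a formula taken from this record; the symbols $C^{+}$, $C^{-}$, $\xi_i$, and $\phi$ are assumed notation, with the minority class labelled $+1$.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \frac{1}{2}\lVert w\rVert^{2}
  + C^{+}\sum_{i:\,y_i=+1}\xi_i
  + C^{-}\sum_{i:\,y_i=-1}\xi_i \\
\text{s.t.}\quad & y_i\bigl(w^{\top}\phi(x_i)+b\bigr)\ge 1-\xi_i,
  \quad \xi_i\ge 0,\quad i=1,\dots,n
\end{aligned}
```

Under this formulation, choosing $C^{+} > C^{-}$ penalizes misclassified minority samples more heavily, which tends to shift the hyperplane toward the majority class. The ratio between the two costs is the kind of parameter that is otherwise set empirically or by grid search.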
Keywords
Support vector machines, Tuning, Optimization, Kernel, Training, Standards, Fans
Discipline
Databases and Information Systems | Theory and Algorithms
Research Areas
Data Science and Engineering
Publication
IEEE Transactions on Knowledge and Data Engineering
Volume
33
Issue
11
First Page
3550
Last Page
3567
ISSN
1041-4347
Identifier
10.1109/TKDE.2020.2974949
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Citation
CAO, Bin; LIU, Yuqi; HOU, Chenyu; FAN, Jing; ZHENG, Baihua; and JIN, Jianwei.
Expediting the accuracy-improving process of SVMs for class imbalance learning. (2021). IEEE Transactions on Knowledge and Data Engineering. 33, (11), 3550-3567.
Available at: https://ink.library.smu.edu.sg/sis_research/5097
Copyright Owner and License
LARC and Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1109/TKDE.2020.2974949