Distributed representation learning with skip-gram model for trained random forests
Publication Type
Journal Article
Publication Date
9-2023
Abstract
The random forest family has been extensively studied due to its wide applications in machine learning and data analytics. However, the representation abilities of forests have not been fully explored. The existing forest representation is mainly based on feature hashing over the indices of leaf nodes. Feature hashing typically disregards the information carried by tree structures, i.e., the relationships between leaf nodes, and its visualisation abilities are also limited. In contrast, the Skip-Gram model has been widely explored for word and node embedding due to its excellent representation ability. This paper proposes distributed representation learning for trained forests (DRL-TF), which extracts the co-occurrence relationships between samples and tree structures and further boosts the representation abilities of the trained forest using the Skip-Gram model. Experimental results demonstrate that the proposed DRL-TF outperforms challenging baselines. To the best of the authors' knowledge, the visualisation provided by DRL-TF is the first tool for analysing trained forests. The code is available at: https://github.com/machao199271/DRL-TF.
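As a rough illustration of the idea summarised above (and not the authors' DRL-TF implementation; see the repository linked in the abstract for that), the sketch below assumes scikit-learn and gensim: each sample is mapped to the leaves it reaches in a trained forest, and those leaf sequences are fed to a Skip-Gram model so that leaf embeddings reflect co-occurrence rather than hashed indices. The dataset, hyperparameters, token naming, and mean-pooling step are all illustrative assumptions.

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from gensim.models import Word2Vec

    X, y = load_iris(return_X_y=True)

    # Train the forest whose structure we want to represent.
    forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

    # apply() gives, per sample, the leaf index reached in every tree:
    # an array of shape (n_samples, n_trees).
    leaves = forest.apply(X)

    # One "sentence" per sample; prefix leaf ids with the tree index so
    # leaves from different trees become distinct tokens.
    sentences = [[f"t{t}_l{leaf}" for t, leaf in enumerate(row)] for row in leaves]

    # Skip-Gram (sg=1) embeds leaf tokens by their co-occurrence across
    # samples, i.e. the relational information that feature hashing discards.
    model = Word2Vec(sentences, vector_size=32, window=5, sg=1, min_count=1, seed=0)

    # One possible sample-level representation: the mean of its leaf embeddings.
    sample_vecs = np.stack([
        np.mean([model.wv[tok] for tok in sent], axis=0) for sent in sentences
    ])
    print(sample_vecs.shape)  # (150, 32)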
Keywords
Distributed representation learning, random forest, co-occurrence relationship, Skip-Gram, feature hashing
Discipline
Databases and Information Systems | Graphics and Human Computer Interfaces
Research Areas
Intelligent Systems and Optimization
Publication
Neurocomputing
Volume
551
First Page
1
Last Page
12
ISSN
0925-2312
Identifier
10.1016/j.neucom.2023.126434
Publisher
Elsevier
Citation
MA, Chao; WANG, Tianjun; ZHANG, Le; CAO, Zhiguang; HUANG, Yue; and DING, Xinghao.
Distributed representation learning with skip-gram model for trained random forests. (2023). Neurocomputing. 551, 1-12.
Available at: https://ink.library.smu.edu.sg/sis_research/8220
Copyright Owner and License
Authors
Additional URL
https://doi.org/10.1016/j.neucom.2023.126434