Distributed representation learning with skip-gram model for trained random forests
Publication Type
Journal Article
Publication Date
9-2023
Abstract
The random forest family has been extensively studied due to its wide applications in machine learning and data analytics. However, the representation abilities of forests have not been fully explored. The existing forest representation is mainly based on feature hashing over the indices of leaf nodes. Feature hashing typically disregards the information carried by tree structures, i.e., the relationships between leaf nodes, and its visualisation abilities are also limited. In contrast, the Skip-Gram model has been widely explored for word and node embedding due to its excellent representation ability. This paper proposes distributed representation learning for trained forests (DRL-TF), which extracts the co-occurrence relationships between samples and tree structures and further boosts the representation abilities of the trained forest using the Skip-Gram model. Experimental results demonstrate that the proposed DRL-TF outperforms challenging baselines. To the best of the authors' knowledge, the visualisation provided by DRL-TF is the first tool for analysing trained forests. The code is available at: https://github.com/machao199271/DRL-TF.
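As a rough illustration of the idea summarised above (and not the authors' DRL-TF implementation; see the repository linked in the abstract for that), the sketch below assumes scikit-learn and gensim: each sample is mapped to the leaves it reaches in a trained forest, and those leaf sequences are fed to a Skip-Gram model so that leaf embeddings reflect co-occurrence rather than hashed indices. The dataset, hyperparameters, token naming, and mean-pooling step are all illustrative assumptions.

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from gensim.models import Word2Vec

    X, y = load_iris(return_X_y=True)

    # Train the forest whose structure we want to represent.
    forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

    # apply() gives, per sample, the leaf index reached in every tree:
    # an array of shape (n_samples, n_trees).
    leaves = forest.apply(X)

    # One "sentence" per sample; prefix leaf ids with the tree index so
    # leaves from different trees become distinct tokens.
    sentences = [[f"t{t}_l{leaf}" for t, leaf in enumerate(row)] for row in leaves]

    # Skip-Gram (sg=1) embeds leaf tokens by their co-occurrence across
    # samples, i.e. the relational information that feature hashing discards.
    model = Word2Vec(sentences, vector_size=32, window=5, sg=1, min_count=1, seed=0)

    # One possible sample-level representation: the mean of its leaf embeddings.
    sample_vecs = np.stack([
        np.mean([model.wv[tok] for tok in sent], axis=0) for sent in sentences
    ])
    print(sample_vecs.shape)  # (150, 32)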
Keywords
Distributed representation learning, random forest, co-occurrence relationship, Skip-Gram, feature hashing
Discipline
Databases and Information Systems | Graphics and Human Computer Interfaces
Research Areas
Intelligent Systems and Optimization
Publication
Neurocomputing
Volume
551
First Page
1
Last Page
12
ISSN
0925-2312
Identifier
10.1016/j.neucom.2023.126434
Publisher
Elsevier
Citation
MA, Chao; WANG, Tianjun; ZHANG, Le; CAO, Zhiguang; HUANG, Yue; and DING, Xinghao.
Distributed representation learning with skip-gram model for trained random forests. (2023). Neurocomputing. 551, 1-12.
Available at: https://ink.library.smu.edu.sg/sis_research/8220
Copyright Owner and License
Authors
Additional URL
https://doi.org/10.1016/j.neucom.2023.126434