Publication Type
Journal Article
Version
publishedVersion
Publication Date
3-2021
Abstract
Scene labeling or parsing aims to assign pixelwise semantic labels to an input image. Existing CNN-based models cannot leverage the label dependencies, while RNN-based models predict labels within the local context. In this paper, we propose a fast LSTM scene labeling network via structural inference. A minimum spanning tree is used to build the image structure for constructing semantic relationships. This structure allows efficient generation of direct parent-child dependencies for arbitrary levels of superpixels, and thus structural relationships can be learned with LSTM. In particular, we propose a bi-directional recurrent network to model the information flow along the parent-child path. In this way, the recurrent units in both coarse and fine levels can mutually transfer the global and local context information in the entire image structure. The proposed network is extremely fast, and it is 2.5x faster than the state-of-the-art RNN-based models. Extensive experiments demonstrate that the proposed method provides a significant improvement in learning the label dependencies, and it outperforms state-of-the-art methods on different benchmarks. (C) 2021 Elsevier B.V. All rights reserved.
Keywords
LSTM, Structural inference, Scene labeling
Discipline
Information Security
Research Areas
Information Systems and Management
Publication
Neurocomputing
Volume
442
First Page
317
Last Page
326
ISSN
0925-2312
Identifier
10.1016/j.neucom.2020.12.134
Publisher
Elsevier
Citation
ZHANG, Huaidong; HAN, Chu; ZHANG, Xiaodan; DU, Yong; XU, Xuemiao; HAN, Guoqiang; QIN, Jing; and Shengfeng HE.
Fast scene labeling via structural inference. (2021). Neurocomputing. 442, 317-326.
Available at: https://ink.library.smu.edu.sg/sis_research/7838
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1016/j.neucom.2020.12.134