Publication Type

Journal Article

Version

acceptedVersion

Publication Date

12-2025

Abstract

Scene context prediction, which seeks to infer unknown contextual information from isolated object properties, is currently limited by a predominant reliance on pixel-wise supervision that overlooks real-world context priors. To address this, we present ContX, a context-prior-driven, coarse-to-fine model that integrates explicit linguistic-contextual knowledge in two key ways. First, it introduces a linguistically guided context bank, leveraging linguistic-statistical contextual data to guide the plausibility of segmentation shapes and foster meaningful inter-class contextual interactions. Second, ContX enhances contextual comprehension by correlating layouts with linguistic descriptions, improving layout perception through a multi-modal strategy. Comprehensive experiments demonstrate ContX's superiority and versatility, outperforming current state-of-the-art methods in both qualitative and quantitative assessments. The code is available at https://github.com/liangjingxin4747/ContX.

Keywords

Generative adversarial network, Layout prediction, Prior knowledge, Scene context, Scene understanding

Discipline

Artificial Intelligence and Robotics | Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

Pattern Recognition

Volume

168

First Page

1

Last Page

13

ISSN

0031-3203

Identifier

10.1016/j.patcog.2025.111852

Publisher

Elsevier

Copyright Owner and License

Authors

Additional URL

https://doi.org/10.1016/j.patcog.2025.111852