Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
12-2025
Abstract
Infrared-visible image fusion aims to integrate complementary information from two modalities to generate images with enriched semantic content. However, existing methods often neglect two critical aspects: the design of a local–global feature enhancement architecture and spatial alignment. To address these challenges, we propose Channel Selective and Spatial Alignment Fusion (CSSA-Fusion), a novel framework composed of two synergistic modules. The first is a selective channel and redundancy suppression module, which introduces a dual-branch selective channel attention mechanism to jointly capture local saliency and global channel importance for enhanced feature representation, and an informativeness–redundancy separation strategy to suppress redundant information while preserving discriminative features. The second is a directional feature processing module, consisting of a mechanism that decouples and recombines modality-specific and common representations to mitigate mutual interference, and a spatial alignment module that performs geometric alignment via horizontal and vertical coordinate decomposition to correct spatial discrepancies between modalities. Extensive experiments on benchmark datasets demonstrate that CSSA-Fusion consistently outperforms state-of-the-art deep learning methods on multiple quality metrics. The fused images exhibit superior visual quality with well-preserved textures and enhanced semantic details.
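The abstract gives no implementation details for "horizontal and vertical coordinate decomposition", so the following is only a minimal illustrative sketch of one common way such a decomposition is done: pooling a feature map separately along its rows and columns and using the two direction-aware summaries as gates. The function names (directional_pool, coordinate_gate), the sigmoid gating, and the absence of learned weights are all assumptions made for illustration. This is not the CSSA-Fusion module, and it does not perform geometric warping.

```python
import math


def sigmoid(x):
    """Logistic function used here as a simple gate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def directional_pool(feature):
    """Pool a C x H x W feature map (nested lists) along each spatial axis.

    Returns (row_means, col_means), where row_means[c][i] is the mean of
    row i over the width and col_means[c][j] is the mean of column j over
    the height. Hypothetical helper, not taken from the paper.
    """
    row_means, col_means = [], []
    for channel in feature:
        height, width = len(channel), len(channel[0])
        row_means.append([sum(row) / width for row in channel])
        col_means.append(
            [sum(channel[i][j] for i in range(height)) / height
             for j in range(width)]
        )
    return row_means, col_means


def coordinate_gate(feature):
    """Reweight each position by its row gate and its column gate.

    Assumption: gates are plain sigmoids of the pooled values; a real
    network would pass the pooled vectors through learned layers first.
    """
    row_means, col_means = directional_pool(feature)
    gated = []
    for c, channel in enumerate(feature):
        gate_h = [sigmoid(v) for v in row_means[c]]  # one gate per row
        gate_w = [sigmoid(v) for v in col_means[c]]  # one gate per column
        gated.append(
            [[channel[i][j] * gate_h[i] * gate_w[j]
              for j in range(len(gate_w))]
             for i in range(len(gate_h))]
        )
    return gated


# Toy single-channel 2 x 2 feature map.
toy = [[[1.0, 3.0], [5.0, 7.0]]]
rows, cols = directional_pool(toy)
gated = coordinate_gate(toy)
```

For the toy input, rows holds the per-row means and cols the per-column means of the single channel, and gated has the same C x H x W shape as the input.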
Keywords
Multimodal Imaging, Image Fusion, Image Quality, Deep Learning
Discipline
Graphics and Human Computer Interfaces
Research Areas
Intelligent Systems and Optimization
Areas of Excellence
Digital transformation
Publication
MMAsia ’25: Proceedings of the 7th ACM International Conference on Multimedia in Asia, Kuala Lumpur, Malaysia, December 9-12
First Page
1
Last Page
7
ISBN
9798400720055
Identifier
10.1145/3743093.3770964
Publisher
ACM
City or Country
New York
Citation
LI, Zhen; ZENG, Zhi; XIAO, Zhongrui; WEN, Ming; ZHANG, Zhiyuan; and TIAN, Yibin.
CSSA-Fusion: Channel selective and spatial alignment infrared-visible image fusion. (2025). MMAsia ’25: Proceedings of the 7th ACM International Conference on Multimedia in Asia, Kuala Lumpur, Malaysia, December 9-12. 1-7.
Available at: https://ink.library.smu.edu.sg/sis_research/10728
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1145/3743093.3770964