Publication Type

Journal Article

Version

acceptedVersion

Publication Date

3-2026

Abstract

Accurate correspondence extraction between distinctive pixel-wise and point-wise features is critical for image-to-point cloud (I2P) registration. Recent efforts leveraging Transformers for I2P feature representation have shown promise, primarily by first capturing intra-modality global contextual dependencies via self-attention and then learning cross-modality correlations via cross-attention. The strength of vanilla Transformers lies in modeling global cross-modality feature correlations; however, such mechanisms often struggle with the structural disparity between dense image pixels and sparse 3D points, hindering the establishment of fine-grained correspondences. Moreover, global attention may introduce ambiguity, as interactions with many inconsistent intra-modality regions can degrade feature distinctiveness. To address these limitations, we propose CylindFormer, a novel Cylindrical Transformer designed to establish accurate and reliable correspondences for efficient and robust I2P registration. Specifically, to overcome the inherent structural discrepancy, our method leverages cylindrical projection of 3D points onto the image plane to define spatially aware clusters, enabling local feature aggregation of image pixels. These fused features are then aligned with 3D point features through adaptive attention to strengthen cross-modality correlations. In addition, CylindFormer introduces a cylindrical self-attention mechanism that explicitly learns intra-modality global structural consistency, effectively mitigating feature ambiguity. Extensive experiments on diverse indoor and outdoor benchmarks demonstrate the efficacy of CylindFormer. On the challenging RGB-D Scenes V2 dataset, our method improves the inlier ratio by 10.1∼16.3 percentage points and the registration recall by 1.3∼14.6 points, while achieving more than 13× faster pose estimation and reducing model parameters to less than one-ninth of those of the state-of-the-art method.
The source code will be released at https://github.com/jtw220/CylindFormer soon.
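The cylindrical projection underlying the spatially aware clusters can be illustrated with a minimal sketch: each 3D point is mapped to azimuth and inclination angles and then discretized into pixel-like bins, so that points landing in the same bin can be grouped with nearby image pixels. The function name, bin resolutions, and all details below are illustrative assumptions, not CylindFormer's released implementation.

```python
import numpy as np

def cylindrical_project(points, h_res=0.01, v_res=0.01):
    """Map 3D points (N, 3) to cylindrical-image bin coordinates.

    Illustrative sketch only (not CylindFormer's code): each point is
    converted to (azimuth, inclination) angles on a cylinder around the
    z-axis, then discretized into integer bins at the given angular
    resolutions (radians per bin).
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    azimuth = np.arctan2(y, x)                # angle around the cylinder axis
    r = np.sqrt(x**2 + y**2)                  # horizontal distance to the axis
    inclination = np.arctan2(z, r)            # elevation angle
    u = ((azimuth + np.pi) / h_res).astype(int)           # horizontal bin index
    v = ((inclination + np.pi / 2) / v_res).astype(int)   # vertical bin index
    return np.stack([u, v], axis=1)
```

Points that fall into the same (u, v) bin would be treated as one spatially aware cluster, giving a 2D grid on which pixel features can be locally aggregated before cross-modality alignment.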

Keywords

Cylindrical Transformer, Fine-grained correspondences, Image-to-Point cloud registration

Discipline

Graphics and Human Computer Interfaces | Numerical Analysis and Scientific Computing

Research Areas

Software and Cyber-Physical Systems

Publication

International Journal of Computer Vision

Volume

134

Issue

4

First Page

1

Last Page

21

ISSN

0920-5691

Identifier

10.1007/s11263-026-02747-w

Publisher

Springer

Copyright Owner and License

Authors

Additional URL

https://doi.org/10.1007/s11263-026-02747-w
