Research Collection School Of Computing and Information Systems

Diffuse3D: Wide-angle 3D photography via bilateral diffusion

Yutao JIANG, South China University of Technology
Yang ZHOU, South China University of Technology
Yuan LIANG, South China University of Technology
Wenxi LIU, Fuzhou University
Jianbo JIAO, University of Birmingham
Yuhui QUAN, South China University of Technology
Shengfeng HE, Singapore Management University

Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

10-2023

Abstract

This paper aims to resolve the challenging problem of wide-angle novel view synthesis from a single image, a.k.a. wide-angle 3D photography. Existing approaches rely on local context and treat all of it equally when inpainting occluded RGB and depth regions. As a result, they fail to handle large-region occlusion (i.e., observing from an extreme angle), and foreground layers may blend into the inpainted background. To address these issues, we propose Diffuse3D, which employs a pre-trained diffusion model for global synthesis while amending the model to activate depth-aware inference. Our key insight is to alter the convolution mechanism in the denoising process. We inject depth information into the denoising convolution operation with bilateral kernels, i.e., a depth kernel and a spatial kernel, to consider layered correlations among pixels. In this way, foreground regions are ignored in background inpainting and only pixels close in depth are leveraged. In addition, we propose a global-local balancing approach to maximize both global and local contextual understanding. Extensive experiments demonstrate that our approach outperforms state-of-the-art methods in novel view synthesis, especially in wide-angle scenarios. More importantly, our method does not require any training and is a plug-and-play module that can be integrated with any diffusion model. Our code can be found at https://github.com/yutaojiang1/Diffuse3D.
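
The bilateral idea in the abstract, a convolution whose taps are modulated by a spatial kernel and a depth kernel, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that). The function name `bilateral_depth_conv`, the Gaussian form of both kernels, the parameters `sigma_s` and `sigma_d`, and the normalization of the combined kernel are all assumptions made for illustration, and the sketch works on a single-channel map rather than inside a diffusion model's denoising network.

```python
import numpy as np

def bilateral_depth_conv(feat, depth, weight, sigma_s=1.0, sigma_d=0.1):
    """Hypothetical depth-aware convolution on a single-channel map.

    feat   : (H, W) feature map
    depth  : (H, W) depth map aligned with feat
    weight : (k, k) ordinary convolution weights, k odd
    Each tap is scaled by an assumed Gaussian spatial kernel and an
    assumed Gaussian depth kernel centred on the output pixel's depth.
    """
    k = weight.shape[0]
    r = k // 2
    H, W = feat.shape
    ys, xs = np.mgrid[-r:r + 1, -r:r + 1]
    spatial = np.exp(-(xs ** 2 + ys ** 2) / (2.0 * sigma_s ** 2))
    fpad = np.pad(feat, r, mode="edge")
    dpad = np.pad(depth, r, mode="edge")
    out = np.zeros((H, W), dtype=float)
    for i in range(H):
        for j in range(W):
            fpatch = fpad[i:i + k, j:j + k]
            dpatch = dpad[i:i + k, j:j + k]
            # Depth kernel: near 1 for neighbours at a similar depth,
            # near 0 for neighbours on a different depth layer.
            dker = np.exp(-((dpatch - depth[i, j]) ** 2) / (2.0 * sigma_d ** 2))
            mod = spatial * dker
            mod = mod / mod.sum()  # centre tap is 1, so the sum is never 0
            out[i, j] = np.sum(weight * mod * fpatch)
    return out

# Toy two-layer scene: left half is foreground, right half is background.
feat = np.zeros((4, 6))
feat[:, :3] = 10.0
depth = np.ones((4, 6))
depth[:, :3] = 0.0
box = np.ones((3, 3))

depth_aware = bilateral_depth_conv(feat, depth, box, sigma_d=0.01)
depth_blind = bilateral_depth_conv(feat, depth, box, sigma_d=1e6)
```

With a small `sigma_d`, pixels on the other depth layer receive almost no weight, so foreground values do not leak across the layer boundary. With a very large `sigma_d`, the depth kernel is nearly constant and the operation reduces to a spatially weighted average that mixes the two layers.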

Keywords

Diffusion model, wide-angle 3D photography, Diffuse3D

Discipline

Computer Sciences | Graphics and Human Computer Interfaces

Research Areas

Software and Cyber-Physical Systems

Publication

2023 IEEE/CVF International Conference on Computer Vision (ICCV): Paris, October 1-6: Proceedings

First Page

8998

Last Page

9008

ISBN

9798350307184

Identifier

10.1109/ICCV51070.2023.00826

Publisher

IEEE

City or Country

Piscataway, NJ

Citation

JIANG, Yutao; ZHOU, Yang; LIANG, Yuan; LIU, Wenxi; JIAO, Jianbo; QUAN, Yuhui; and HE, Shengfeng. Diffuse3D: Wide-angle 3D photography via bilateral diffusion. (2023). 2023 IEEE/CVF International Conference on Computer Vision (ICCV): Paris, October 1-6: Proceedings. 8998-9008.
Available at: https://ink.library.smu.edu.sg/sis_research/8558

Copyright Owner and License

Authors

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

https://doi.org/10.1109/ICCV51070.2023.00826

Included in

Graphics and Human Computer Interfaces Commons
