Publication Type

Journal Article

Version

acceptedVersion

Publication Date

12-2024

Abstract

3D neural rendering enables photo-realistic reconstruction of a specific scene by encoding discontinuous inputs into a neural representation. Despite the remarkable rendering results, storing the network parameters is neither transmission-friendly nor easily extended to metaverse applications. In this paper, we propose an invertible neural rendering approach that enables generating an interactive 3D model from a single image (i.e., 3D Snapshot). Our idea is to distill a pre-trained neural rendering model (e.g., NeRF) into a visualizable image form that can then be easily inverted back to a neural network. To this end, we first present a neural image distillation method to optimize three neural planes for representing the original neural rendering model. However, this representation is noisy and visually meaningless. We thus propose a dynamic invertible neural network to embed this noisy representation into a plausible image representation of the scene. We demonstrate promising reconstruction quality, both quantitatively and qualitatively, in comparisons with the original neural rendering model and with video-based invertible methods. In addition, our method can store dozens of NeRFs with a compact restoration network (5 MB), and embedding each 3D scene takes up only 160 KB of storage. More importantly, our approach is the first solution that allows embedding a neural rendering model into image representations, which enables applications like creating an interactive 3D model from a printed image in the metaverse.

Keywords

Three Dimensional Displays, Rendering Computer Graphics, Solid Modeling, Image Reconstruction, Image Color Analysis, Neural Networks, Metaverse, Invertible Image Processing, Neural Representations, Single Image, Neural Coding, 3D Snapshots, Neural Network, Dynamic Network, Image Representation, Reconstruction Quality, 3D Scene, Compact Network, Scene Representation, Dynamic Neural Network, Neural Image, Loss Function, Data Storage, Model Size, Wavelet Transform, Spatial Domain, Spatial Coordinates, Volume Density, Short Video, Image Embedding, View Synthesis, Steganography, Noisy Images, Dynamic Update, Least Significant Bit, View Direction, Spherical Harmonics, Intermediate Representation, Half Of The Channel

Discipline

Artificial Intelligence and Robotics | Graphics and Human Computer Interfaces

Research Areas

Intelligent Systems and Optimization

Areas of Excellence

Digital transformation

Publication

IEEE Transactions on Pattern Analysis and Machine Intelligence

Volume

46

Issue

12

First Page

11524

Last Page

11531

ISSN

0162-8828

Identifier

10.1109/TPAMI.2024.3411051

Publisher

Institute of Electrical and Electronics Engineers

Additional URL

https://doi.org/10.1109/TPAMI.2024.3411051
