Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
10-2023
Abstract
GAN inversion is indispensable for applying the powerful editability of GAN to real images. However, existing methods invert video frames individually, often leading to undesired inconsistent results over time. In this paper, we propose a unified recurrent framework, named Recurrent vIdeo GAN Inversion and eDiting (RIGID), to explicitly and simultaneously enforce temporally coherent GAN inversion and facial editing of real videos. Our approach models the temporal relations between current and previous frames from three aspects. To enable a faithful real video reconstruction, we first maximize the inversion fidelity and consistency by learning a temporal compensated latent code. Second, we observe that the incoherent noise lies in the high-frequency domain and can be disentangled from the latent space. Third, to remove the inconsistency after attribute manipulation, we propose an in-between frame composition constraint such that an arbitrary frame must be a direct composite of its neighboring frames. Our unified framework learns the inherent coherence between input frames in an end-to-end manner, and therefore it is agnostic to a specific attribute and can be applied to arbitrary editing of the same video without re-training. Extensive experiments demonstrate that RIGID outperforms state-of-the-art methods qualitatively and quantitatively in both inversion and editing tasks. The deliverables can be found at https://cnnlstm.github.io/RIGID.
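To make the in-between frame composition constraint described above concrete, the following is a minimal, hedged sketch (not the authors' code): it assumes the constraint can be expressed as penalizing the distance between each frame and a linear blend of its two temporal neighbors. The function name, the blend weight alpha, and the L1 distance are illustrative assumptions.

```python
# Illustrative sketch only, not the RIGID implementation: a plausible form of the
# in-between frame composition constraint, where each edited frame is encouraged
# to be a direct composite of its neighboring frames.
import torch
import torch.nn.functional as F


def in_between_composition_loss(frames: torch.Tensor, alpha: float = 0.5) -> torch.Tensor:
    """frames: (T, C, H, W) edited video frames, T >= 3.

    Penalizes the distance between each middle frame and a convex combination
    of its temporal neighbors, encouraging temporal coherence after editing.
    `alpha` and the L1 distance are assumptions for illustration.
    """
    prev_f, cur_f, next_f = frames[:-2], frames[1:-1], frames[2:]
    composite = alpha * prev_f + (1.0 - alpha) * next_f  # assumed linear blend
    return F.l1_loss(cur_f, composite)


if __name__ == "__main__":
    # Toy usage on random frames; in practice such a term would be added to the
    # editing objective alongside reconstruction/fidelity losses.
    video = torch.rand(8, 3, 256, 256, requires_grad=True)
    loss = in_between_composition_loss(video)
    loss.backward()
    print(float(loss))
```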
Discipline
Computer Sciences | Graphics and Human Computer Interfaces
Research Areas
Software and Cyber-Physical Systems
Publication
2023 IEEE/CVF International Conference on Computer Vision (ICCV): Paris, October 2-6: Proceedings
First Page
13645
Last Page
13655
ISBN
9798350307184
Identifier
10.1109/ICCV51070.2023.01259
Publisher
IEEE Computer Society
City or Country
Washington, DC
Citation
XU, Yangyang; HE, Shengfeng; WONG, Kwan-Yee K.; and LUO, Ping.
RIGID: Recurrent GAN inversion and editing of real face videos. (2023). 2023 IEEE/CVF International Conference on Computer Vision (ICCV): Paris, October 2-6: Proceedings. 13645-13655.
Available at: https://ink.library.smu.edu.sg/sis_research/8534
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1109/ICCV51070.2023.01259