Publication Type

Journal Article

Version

acceptedVersion

Publication Date

3-2026

Abstract

Recent advancements in text-guided diffusion models have enabled powerful image manipulation capabilities. However, balancing reconstruction fidelity and editability for real images remains a significant challenge. In this work, we introduce Editing Inversion (EditInv), a novel framework that inverts and edits real images for specific editing tasks by optimizing specific prompt embeddings within the extended space. By leveraging distinct embeddings across different U-Net layers and time steps, EditInv seamlessly integrates inversion and editing through reciprocal optimization, ensuring both high fidelity and precise editability. This hierarchical editing mechanism classifies tasks into structure, appearance, and global edits, optimizing only those embeddings that are unaffected by the current editing task. Extensive experiments on benchmark datasets demonstrate EditInv’s superior performance over existing methods, delivering both quantitative and qualitative improvements while showcasing its versatility with a few-step diffusion model.

Keywords

Image editing, diffusion inversion, disentanglement

Discipline

Graphics and Human Computer Interfaces | Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

International Journal of Computer Vision

Volume

134

Issue

4

First Page

1

Last Page

18

ISSN

0920-5691

Identifier

10.1007/s11263-025-02691-1

Publisher

Springer

Copyright Owner and License

Authors

Additional URL

https://doi.org/10.1007/s11263-025-02691-1
