Publication Type
Journal Article
Version
acceptedVersion
Publication Date
4-2025
Abstract
In this paper, we propose a novel translation model, UniTranslator, for transforming representations between visually distinct domains under conditions of limited training data and significant visual differences. The main idea behind our approach is to leverage the domain-neutral capabilities of CLIP as a bridging mechanism, while using a separate module to extract abstract, domain-agnostic semantics from the embeddings of both the source and target realms. Fusing these abstract semantics with target-specific semantics results in a transformed embedding within the CLIP space. To bridge the gap between the disparate spaces of CLIP and StyleGAN, we introduce a new non-linear mapper, the CLIP2P mapper. Using CLIP embeddings, this module is tailored to approximate the latent distribution of StyleGAN's latent space, effectively acting as a connector between the two spaces. The proposed UniTranslator is versatile and capable of performing various tasks, including style mixing, stylization, and translation, even in visually challenging scenarios across different domains. Notably, UniTranslator generates high-quality translations that exhibit domain relevance, diversity, and improved image quality. UniTranslator surpasses the performance of existing general-purpose models and performs well against specialized models in representative tasks.
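The pipeline sketched in the abstract can be read as: encode source and target images into CLIP space, extract abstract domain-agnostic semantics, fuse them with target-specific semantics, and map the fused CLIP embedding into StyleGAN's latent space via the CLIP2P mapper. Below is a minimal PyTorch sketch of that flow; the module architectures, dimensions, the simple linear fusion, and the dummy generator are all hypothetical stand-ins for illustration, not the paper's actual implementation.

```python
# Minimal sketch of the UniTranslator-style flow described in the abstract.
# All names, dimensions, and the fusion scheme below are assumptions; the
# paper's actual semantic extractor and CLIP2P mapper may differ.
import torch
import torch.nn as nn

CLIP_DIM = 512    # dimensionality of CLIP image embeddings
LATENT_DIM = 512  # dimensionality of the StyleGAN latent code

class SemanticExtractor(nn.Module):
    """Hypothetical module extracting abstract, domain-agnostic semantics
    from a CLIP embedding."""
    def __init__(self, dim: int = CLIP_DIM):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, clip_emb: torch.Tensor) -> torch.Tensor:
        return self.net(clip_emb)

class CLIP2PMapper(nn.Module):
    """Non-linear mapper from CLIP space to a StyleGAN latent code,
    standing in for the paper's CLIP2P mapper."""
    def __init__(self, in_dim: int = CLIP_DIM, out_dim: int = LATENT_DIM):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 1024), nn.LeakyReLU(0.2),
            nn.Linear(1024, 1024), nn.LeakyReLU(0.2),
            nn.Linear(1024, out_dim),
        )

    def forward(self, clip_emb: torch.Tensor) -> torch.Tensor:
        return self.net(clip_emb)

def translate(source_clip, target_clip, extractor, mapper, generator, alpha=0.5):
    """Fuse the source's domain-agnostic semantics with target-specific
    semantics, then map the fused CLIP embedding into StyleGAN's latent space."""
    abstract_sem = extractor(source_clip)                     # domain-agnostic content
    fused = alpha * abstract_sem + (1 - alpha) * target_clip  # assumed linear fusion
    latent = mapper(fused)                                    # CLIP space -> StyleGAN latent
    return generator(latent)

if __name__ == "__main__":
    extractor, mapper = SemanticExtractor(), CLIP2PMapper()
    generator = nn.Linear(LATENT_DIM, 3 * 16 * 16)  # dummy stand-in for a StyleGAN generator
    src = torch.randn(1, CLIP_DIM)                  # stand-in CLIP embedding of a source image
    tgt = torch.randn(1, CLIP_DIM)                  # stand-in CLIP embedding of a target exemplar
    out = translate(src, tgt, extractor, mapper, generator)
    print(out.shape)                                # torch.Size([1, 768])
```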
Keywords
Translation, Visualization, Semantics, Data models, Correlation, Codes, Image quality, Training data, Training, Oceans, Generative adversarial networks, image-to-image translation, GAN embedding
Discipline
Graphics and Human Computer Interfaces | Software Engineering
Research Areas
Software and Cyber-Physical Systems
Publication
IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume
47
Issue
4
First Page
2865
Last Page
2881
ISSN
0162-8828
Identifier
10.1109/TPAMI.2025.3530099
Publisher
Institute of Electrical and Electronics Engineers
Citation
DU, Yong; ZHAN, Jiahui; LI, Xinzhe; DONG, Junyu; CHEN, Sheng; YANG, Ming-Hsuan; and HE, Shengfeng.
One-for-All: Towards universal domain translation with a single StyleGAN. (2025). IEEE Transactions on Pattern Analysis and Machine Intelligence. 47, (4), 2865-2881.
Available at: https://ink.library.smu.edu.sg/sis_research/10512
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1109/TPAMI.2025.3530099