MixSA: Training-free reference-based sketch extraction via Mixture-of-Self-Attention

Publication Type

Journal Article

Publication Date

9-2025

Abstract

Current sketch extraction methods either require extensive training or fail to capture a wide range of artistic styles, limiting their practical applicability and versatility. We introduce Mixture-of-Self-Attention (MixSA), a training-free sketch extraction method that leverages strong diffusion priors for enhanced sketch perception. At its core, MixSA employs a mixture-of-self-attention technique, which manipulates self-attention layers by substituting the keys and values with those from reference sketches. This allows for the seamless integration of brushstroke elements into initial outline images, offering precise control over texture density and enabling interpolation between styles to create novel, unseen styles. By aligning brushstroke styles with the texture and contours of colored images, particularly in late decoder layers handling local textures, MixSA addresses the common issue of color averaging by adjusting initial outlines. Evaluated with various perceptual metrics, MixSA demonstrates superior performance in sketch quality, flexibility, and applicability. This approach not only overcomes the limitations of existing methods but also empowers users to generate diverse, high-fidelity sketches that more accurately reflect a wide range of artistic expressions.
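The core operation the abstract describes, substituting self-attention keys and values with those from a reference sketch and interpolating between styles, can be sketched in a few lines. This is a minimal NumPy illustration under stated assumptions, not the paper's implementation: the function names and the blend parameter `alpha` are hypothetical, and mixing is done here by interpolating the attention outputs of the content-keyed and reference-keyed branches.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention: softmax(QK^T / sqrt(d)) V.
    d = q.shape[-1]
    weights = softmax(q @ k.T / np.sqrt(d))
    return weights @ v

def mix_self_attention(q, k_cont, v_cont, k_ref, v_ref, alpha=0.5):
    # Blend ordinary self-attention (keys/values from the content image)
    # with reference-keyed attention (keys/values from the reference sketch).
    # alpha = 0 reproduces plain self-attention; alpha = 1 fully substitutes
    # the reference keys/values; intermediate values interpolate styles.
    out_cont = attention(q, k_cont, v_cont)
    out_ref = attention(q, k_ref, v_ref)
    return (1 - alpha) * out_cont + alpha * out_ref
```

In this toy form, varying `alpha` per decoder layer would mirror the paper's observation that late decoder layers govern local texture, so the reference's brushstroke style can be injected there while earlier layers keep the content's outlines.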

Keywords

Sketch extraction, image representations, image generation, image-to-image translation

Discipline

Graphics and Human Computer Interfaces | Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

IEEE Transactions on Visualization and Computer Graphics

Volume

31

Issue

9

First Page

6208

Last Page

6222

ISSN

1077-2626

Identifier

10.1109/TVCG.2024.3502395

Publisher

Institute of Electrical and Electronics Engineers

Additional URL

https://doi.org/10.1109/TVCG.2024.3502395
