Publication Type

Journal Article

Version

acceptedVersion

Publication Date

11-2025

Abstract

In free-hand sketch recognition, state-of-the-art methods often struggle to extract spatial features from sketches with sparse distributions, which are characterized by significant blank regions devoid of informative content. To address this challenge, we introduce a novel framework for sketch recognition, termed Sketch-SparseNet. This framework incorporates an advanced convolutional component: the Sketch-Driven Dilated Deformable Block (SD3B). This component excels at extracting spatial features and accurately recognizing free-hand sketches with sparse distributions. The SD3B component innovatively bridges gaps in the blank areas of sketches by establishing spatial relationships among disconnected stroke points through adaptive reshaping of convolution kernels. These kernels are deformable, dilatable, and dynamically positioned relative to the sketch strokes, ensuring the preservation of spatial information from sketch points. Consequently, Sketch-SparseNet extracts a more accurate and compact representation of spatial features, enhancing sketch recognition performance. Additionally, we introduce the SmoothAlign loss function, which minimizes the disparity between the output features of parallel SD3B and CNNs, facilitating effective feature fusion. Extensive evaluations on the QuickDraw-414k and TU-Berlin datasets highlight our method’s state-of-the-art performance, achieving accuracies of 79.52% and 85.78%, respectively. To our knowledge, this work represents the first application of a sparse convolution framework that substantially alleviates the adverse effects of sparse sketch points. The code is available at https://github.com/kingbackyang/Sketch-SparseNet.

Keywords

Sketch recognition, Sketch-SparseNet, Sketch-Driven Dilated Deformable Block, Point clouds, QuickDraw-414k, TU-Berlin

Discipline

Graphics and Human Computer Interfaces | Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

Pattern Recognition

Volume

167

First Page

1

Last Page

12

ISSN

0031-3203

Identifier

10.1016/j.patcog.2025.111682

Publisher

Elsevier

Copyright Owner and License

Authors

Additional URL

https://doi.org/10.1016/j.patcog.2025.111682
