Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

8-2025

Abstract

Few-shot learning (FSL) addresses the challenge of classifying novel classes with limited training samples. While some methods leverage semantic knowledge from smaller-scale models to mitigate data scarcity, these approaches often introduce noise and bias due to the data's inherent simplicity. In this paper, we propose a novel framework, Synergistic Knowledge Transfer (SYNTRANS), which effectively transfers diverse and complementary knowledge from large multimodal models to empower the off-the-shelf few-shot learner. Specifically, SYNTRANS employs CLIP as a robust teacher and uses a few-shot vision encoder as a weak student, distilling semantic-aligned visual knowledge via an unsupervised proxy task. Subsequently, a training-free synergistic knowledge mining module facilitates collaboration among large multimodal models to extract high-quality semantic knowledge. Building upon this, a visual-semantic bridging module enables bi-directional knowledge transfer between visual and semantic spaces, transforming explicit visual and implicit semantic knowledge into category-specific classifier weights. Finally, SYNTRANS introduces a visual weight generator and a semantic weight reconstructor to adaptively construct optimal multimodal FSL classifiers. Experimental results on four FSL datasets demonstrate that SYNTRANS, even when paired with a simple few-shot vision encoder, significantly outperforms current state-of-the-art methods.

Discipline

Artificial Intelligence and Robotics

Areas of Excellence

Digital transformation

Publication

IJCAI '25: Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, Montreal, Canada, August 16-22

First Page

6227

Last Page

6235

Identifier

10.24963/ijcai.2025/693

Publisher

ACM

City or Country

New York

Additional URL

https://doi.org/10.24963/ijcai.2025/693
