Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
10-2025
Abstract
Recent advancements in CLIP-based out-of-distribution (OOD) detection have shown promising results via regularization on prompt tuning, leveraging background features extracted from a few in-distribution (ID) samples as proxies for OOD features. However, these methods suffer from an inherent limitation: a lack of diversity in the extracted OOD features from the few-shot ID data. To address this issue, we propose to leverage external datasets as auxiliary outlier data (i.e., pseudo OOD samples) to extract rich, diverse OOD features, with the features from not only background regions but also foreground object regions, thereby supporting more discriminative prompt tuning for OOD detection. We further introduce Auxiliary Prompt Tuning (APT), a novel framework that can be used as a plug-in module to enable existing prompt tuning-based methods to utilize the auxiliary data for more accurate OOD detection. There are two key challenges of utilizing those auxiliary data in prompt tuning, including I) foreground-background decomposition of unlabeled auxiliary data with diverse outlying objects and II) optimization of foreground OOD features. APT tackles challenge I with an adaptive logit-based Kullback–Leibler divergence method and challenge II by constructing foreground-background pairs for each foreground region to enable effective exploitation of foreground OOD features. Extensive experiments on standard and hard OOD benchmarks show that APT achieves state-of-the-art performance, obtaining significant improvements in challenging scenarios, e.g., hard OOD and 1-shot detection.
Discipline
Graphics and Human Computer Interfaces | Programming Languages and Compilers
Research Areas
Intelligent Systems and Optimization
Areas of Excellence
Digital transformation
Publication
Proceedings of the 2025 IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, Hawaii, October 19-23
First Page
4776
Last Page
4785
City or Country
Honolulu, Hawai'i
Citation
MIAO, Wenjun; PANG, Guansong; WANG, Zihan; ZHENG, Jin; and BAI, Xiao.
Auxiliary prompt tuning of vision‑language models for few‑shot out‑of‑distribution detection. (2025). Proceedings of the 2025 IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, Hawaii, October 19-23. 4776-4785.
Available at: https://ink.library.smu.edu.sg/sis_research/10933
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://iccv.thecvf.com/virtual/2025/poster/2246
Included in
Graphics and Human Computer Interfaces Commons, Programming Languages and Compilers Commons