Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

3-2025

Abstract

As vision-language models advance, addressing the Zero-Shot Learning (ZSL) problem in the open world becomes increasingly crucial. Specifically, a robust model must handle three types of samples during inference: seen classes with visual and semantic information provided in training, unseen classes with only the semantic information in training, and unknown samples with no prior information from training. Existing methods either handle seen and unseen classes together (ZSL) or seen and unknown classes (known as Open-Set Recognition, OSR). However, none addresses the simultaneous handling of all three, which we term Open-Set Zero-Shot Learning (OZSL). To address this problem, we propose a two-stage approach for OZSL that recognizes seen, unseen, and unknown samples. The first stage classifies samples as either seen or not, while the second stage distinguishes unseen from unknown. Furthermore, we introduce a cross-stage knowledge transfer mechanism that leverages semantic relationships between seen and unseen classes to enhance learning in the second stage. Extensive experiments demonstrate the efficacy of the proposed approach compared to naïvely combining existing ZSL and OSR methods. The code is available at https://github.com/smufang/OZSL.
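The two-stage inference described in the abstract can be sketched as follows. This is a minimal illustrative outline, not the authors' implementation (see the linked repository for that); the function names, score dictionaries, and thresholds are all hypothetical.

```python
def ozsl_predict(x, seen_classifier, seen_threshold,
                 unseen_classifier, unseen_threshold):
    """Hypothetical sketch of two-stage OZSL inference.

    Classifiers map an input to {class_name: score}; thresholds decide
    when a sample falls outside the classifier's label set.
    """
    # Stage 1: decide whether the sample belongs to a seen class.
    seen_scores = seen_classifier(x)
    best_seen = max(seen_scores, key=seen_scores.get)
    if seen_scores[best_seen] >= seen_threshold:
        return ("seen", best_seen)

    # Stage 2: among non-seen samples, separate unseen classes
    # (recognizable via semantic descriptions) from unknowns.
    unseen_scores = unseen_classifier(x)
    best_unseen = max(unseen_scores, key=unseen_scores.get)
    if unseen_scores[best_unseen] >= unseen_threshold:
        return ("unseen", best_unseen)
    return ("unknown", None)
```

The abstract's cross-stage knowledge transfer (using seen-unseen semantic relationships to train the second stage) is omitted here, as it concerns training rather than this inference-time decision rule.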

Discipline

Artificial Intelligence and Robotics

Areas of Excellence

Digital transformation

Publication

Proceedings of the 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Tucson, USA, February 26 - March 6

First Page

6868

Last Page

6878

Identifier

10.1109/WACV61041.2025.00668

City or Country

Tucson, Arizona, United States

Additional URL

https://doi.org/10.1109/WACV61041.2025.00668
