Publication Type
Journal Article
Version
acceptedVersion
Publication Date
8-2025
Abstract
This article investigates the problem of continual learning (CL) of vision-language models (VLMs) in open domains, where models are required to perform continual updating and inference on a stream of datasets from diverse seen and unseen domains with novel classes. Such a capability is crucial for various applications in open environments, e.g., AI assistants, autonomous driving systems, and robotics. Current CL studies mostly focus on closed-set scenarios in a single domain with known classes. Large pretrained VLMs such as CLIP have showcased exceptional zero-shot recognition capabilities, and several recent studies have leveraged the unique characteristics of VLMs to mitigate catastrophic forgetting in CL. However, they primarily focus on closed-set CL on single-domain datasets. Open-domain CL of large VLMs is significantly more challenging due to 1) large class correlations and domain gaps across the datasets and 2) the forgetting of both the zero-shot knowledge in the pretrained VLMs and the knowledge learned from the newly adapted datasets. In this work, we introduce a novel approach, termed CoLeCLIP, which learns an open-domain CL model based on CLIP. It addresses these challenges through joint learning of a set of task prompts and a cross-domain class vocabulary. Extensive experiments on 11 domain datasets show that CoLeCLIP achieves new state-of-the-art performance for open-domain CL under both task-incremental learning (TIL) and class-incremental learning (CIL) settings.
Discipline
Artificial Intelligence and Robotics | OS and Networks
Research Areas
Intelligent Systems and Optimization
Areas of Excellence
Digital transformation
Publication
IEEE Transactions on Neural Networks and Learning Systems
Volume
36
Issue
8
First Page
15137
Last Page
15151
ISSN
2162-237X
Identifier
10.1109/TNNLS.2025.3547882
Publisher
Institute of Electrical and Electronics Engineers
Citation
LI, Yukun; PANG, Guansong; SUO, Wei; CHEN, Chenchen; XI, Yuling; LIU, Lingqiao; CHEN, Hao; LIANG, Guoqiang; and WANG, Peng.
CoLeCLIP: Open-domain continual learning via joint task prompt and vocabulary learning. (2025). IEEE Transactions on Neural Networks and Learning Systems. 36, (8), 15137-15151.
Available at: https://ink.library.smu.edu.sg/sis_research/10398
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1109/TNNLS.2025.3547882