Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
10-2023
Abstract
We tackle the data scarcity challenge in few-shot point cloud recognition of 3D objects by using a joint prediction from a conventional 3D model and a well-pretrained 2D model. Surprisingly, such an ensemble, though seemingly trivial, has hardly been shown to be effective in recent 2D-3D models. We find that the crux is the less effective training on the “joint hard samples”, for which the two models make high-confidence predictions on different wrong labels, implying that the 2D and 3D models do not collaborate well. To this end, our proposed invariant training strategy, called INVJOINT, not only places more emphasis on the hard samples during training, but also seeks invariance between the conflicting, ambiguous 2D and 3D predictions. INVJOINT can learn more collaborative 2D and 3D representations for a better ensemble. Extensive experiments on 3D shape classification with the widely adopted ModelNet10/40, ScanObjectNN, and Toys4K datasets, and on shape retrieval with ShapeNet-Core, validate the superiority of INVJOINT.
Discipline
Databases and Information Systems
Research Areas
Data Science and Engineering
Publication
Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, October 2-6
First Page
14463
Last Page
14474
Publisher
IEEE
City or Country
Piscataway, NJ
Citation
YI, Xuanyu; DENG, Jiajun; SUN, Qianru; HUA, Xian-Sheng; LIM, Joo-Hwee; and ZHANG, Hanwang.
Invariant training 2D-3D joint hard samples for few-shot point cloud recognition. (2023). Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, October 2-6. 14463-14474.
Available at: https://ink.library.smu.edu.sg/sis_research/8389
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.