Research Collection School Of Computing and Information Systems

InceptionNeXt: When Inception meets ConvNeXt

Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

6-2024

Abstract

Inspired by the long-range modeling ability of ViTs, large-kernel convolutions have recently been widely studied and adopted to enlarge the receptive field and improve model performance, as in the remarkable work ConvNeXt, which employs 7×7 depthwise convolution. Although such a depthwise operator consumes only a few FLOPs, it largely harms model efficiency on powerful computing devices due to its high memory access costs. For example, ConvNeXt-T has similar FLOPs to ResNet-50 but achieves only about 60% of its throughput when trained on A100 GPUs with full precision. Although reducing the kernel size of ConvNeXt can improve speed, it results in significant performance degradation, which poses a challenging problem: how to speed up large-kernel-based CNN models while preserving their performance. To tackle this issue, inspired by Inceptions, we propose to decompose large-kernel depthwise convolution into four parallel branches along the channel dimension, i.e., a small square kernel, two orthogonal band kernels, and an identity mapping. With this new Inception depthwise convolution, we build a series of networks, namely InceptionNeXt, which not only enjoy high throughput but also maintain competitive performance. For instance, InceptionNeXt-T achieves 1.6× higher training throughput than ConvNeXt-T and attains a 0.2% top-1 accuracy improvement on ImageNet-1K. We anticipate that InceptionNeXt can serve as an economical baseline for future architecture design to reduce carbon footprint.
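The four-branch decomposition described in the abstract can be illustrated with a small weight-count sketch in plain Python. The abstract does not give the kernel sizes or the channel split, so the values used here (a 3×3 square kernel, 1×11 and 11×1 band kernels, one eighth of the channels per convolution branch, and a 96-channel layer) are assumptions chosen for illustration, and the function names are hypothetical. It is a rough comparison under those assumptions, not the authors' implementation.

```python
def split_channels(channels, conv_ratio=0.125):
    """Split channels into (identity, square, band_w, band_h) group sizes.

    conv_ratio is an assumed fraction of channels given to EACH
    convolution branch; the remaining channels pass through unchanged.
    """
    conv_ch = int(channels * conv_ratio)
    identity_ch = channels - 3 * conv_ch
    return identity_ch, conv_ch, conv_ch, conv_ch


def depthwise_weights(channels, kernel_h, kernel_w):
    """Weight count of a depthwise convolution (one kernel per channel, no bias)."""
    return channels * kernel_h * kernel_w


def inception_depthwise_weights(channels, square=3, band=11, conv_ratio=0.125):
    """Weight count of the four-branch decomposition (assumed kernel sizes)."""
    _identity_ch, sq_ch, bw_ch, bh_ch = split_channels(channels, conv_ratio)
    return (
        depthwise_weights(sq_ch, square, square)  # small square kernel
        + depthwise_weights(bw_ch, 1, band)       # horizontal band kernel
        + depthwise_weights(bh_ch, band, 1)       # vertical band kernel
        # the identity branch contributes no weights
    )


channels = 96  # hypothetical layer width
baseline = depthwise_weights(channels, 7, 7)       # 7x7 depthwise on all channels
decomposed = inception_depthwise_weights(channels)

print(split_channels(channels))  # (60, 12, 12, 12)
print(baseline, decomposed)      # 4704 372
```

Under these assumed settings most channels take the identity path, so far fewer channels are touched by a convolution at all, which is in line with the abstract's point that memory access rather than FLOPs limits throughput.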

Keywords

CNN, convolution, efficient neural networks

Discipline

Graphics and Human Computer Interfaces

Research Areas

Intelligent Systems and Optimization

Areas of Excellence

Digital transformation

Publication

2024 IEEE/CVF International Conference on Computer Vision and Pattern Recognition (CVPR): Seattle, June 17-21: Proceedings

First Page

Last Page

ISBN

9798350353006

Identifier

10.1109/CVPR52733.2024.00542

Publisher

IEEE

City or Country

Piscataway, NJ

Citation

YU, Weihao; ZHOU, Pan; YAN, Shuicheng; and WANG, Xinchao. InceptionNeXt: When Inception meets ConvNeXt. (2024). 2024 IEEE/CVF International Conference on Computer Vision and Pattern Recognition (CVPR): Seattle, June 17-21: Proceedings. 1-12.
Available at: https://ink.library.smu.edu.sg/sis_research/8981

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

https://doi.org/10.1109/CVPR52733.2024.00542

Included in

Graphics and Human Computer Interfaces Commons
