Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
6-2020
Abstract
We investigate the compression of deep neural networks by quantizing their weights and activations into multiple binary bases, known as multi-bit networks (MBNs), which accelerate inference and reduce storage for deployment on low-resource mobile and embedded platforms. We propose Adaptive Loss-aware Quantization (ALQ), a new MBN quantization pipeline that achieves an average bitwidth below one bit without notable loss in inference accuracy. Unlike previous MBN quantization solutions, which train a quantizer by minimizing the error in reconstructing full-precision weights, ALQ directly minimizes the quantization-induced error on the loss function, involving neither gradient approximation nor full-precision maintenance. ALQ also exploits strategies including adaptive bitwidth, smooth bitwidth reduction, and iterative trained quantization to allow a smaller network size without loss in accuracy. Experimental results on popular image datasets show that ALQ outperforms state-of-the-art compressed networks in terms of both storage and accuracy.
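To illustrate what "quantizing weights into multiple binary bases" means, the sketch below approximates a weight vector as a sum of scaled {-1, +1} vectors. It is not ALQ: it uses a simple greedy residual scheme that minimizes reconstruction error, which is the style of earlier MBN quantizers the abstract contrasts ALQ against. The function names (`greedy_multibit_quantize`, `reconstruct`, `squared_error`) and the example weights are hypothetical, chosen only for illustration.

```python
def greedy_multibit_quantize(weights, num_bits):
    """Approximate `weights` as sum_i alpha_i * b_i, with each b_i in {-1, +1}^n.

    Greedy residual scheme (an assumption for illustration, not the ALQ method):
    each step binarizes the current residual and subtracts the scaled basis.
    """
    residual = list(weights)
    alphas, bases = [], []
    for _ in range(num_bits):
        # Binary basis: sign of the current residual (zero maps to +1).
        basis = [1 if r >= 0 else -1 for r in residual]
        # For a sign basis, the least-squares scale is the mean magnitude.
        alpha = sum(abs(r) for r in residual) / len(residual)
        residual = [r - alpha * b for r, b in zip(residual, basis)]
        alphas.append(alpha)
        bases.append(basis)
    return alphas, bases


def reconstruct(alphas, bases):
    """Rebuild the approximated weights from the scales and binary bases."""
    n = len(bases[0])
    return [sum(a * b[j] for a, b in zip(alphas, bases)) for j in range(n)]


def squared_error(x, y):
    """Sum of squared differences between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


if __name__ == "__main__":
    w = [0.9, -0.4, 0.1, -0.7]  # made-up example weights
    for bits in (1, 2, 3):
        alphas, bases = greedy_multibit_quantize(w, bits)
        err = squared_error(w, reconstruct(alphas, bases))
        print(bits, "bit(s): reconstruction error =", err)
```

In this representation each weight costs `num_bits` bits plus a small number of floating-point scales shared across the group, which is where the storage saving comes from. With this greedy scheme the reconstruction error does not increase as bases are added.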
Keywords
Quantization (signal), Optimization, Neural networks, Adaptive systems, Microprocessors, Training, Tensile stress
Discipline
Databases and Information Systems | Numerical Analysis and Scientific Computing
Research Areas
Software and Cyber-Physical Systems
Publication
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition: June 16-18, Seattle, WA: Proceedings
First Page
7988
Last Page
7997
ISBN
9781728171692
Identifier
10.1109/CVPR42600.2020.00801
Publisher
IEEE Computer Society
City or Country
Los Alamitos, CA
Citation
QU, Zhongnan; ZHOU, Zimu; CHENG, Yun; and THIELE, Lothar.
Adaptive loss-aware quantization for multi-bit networks. (2020). 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition: June 16-18, Seattle, WA: Proceedings. 7988-7997.
Available at: https://ink.library.smu.edu.sg/sis_research/5251
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1109/CVPR42600.2020.00801
Included in
Databases and Information Systems Commons, Numerical Analysis and Scientific Computing Commons