Research Collection School Of Computing and Information Systems

Rethinking pruning for accelerating deep inference at the edge

Dawei GAO, Beijing University of Aeronautics and Astronautics (Beihang University)
Xiaoxi HE, ETH Zurich
Zimu ZHOU, Singapore Management University
Yongxin TONG, Beijing University of Aeronautics and Astronautics (Beihang University)
Ke XU, Beijing University of Aeronautics and Astronautics (Beihang University)
Lothar THIELE, ETH Zurich

Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

8-2020

Abstract

There is a growing trend to deploy deep neural networks at the edge for high-accuracy, real-time data mining and user interaction. Applications such as speech recognition and language understanding often apply a deep neural network to encode an input sequence and then use a decoder to generate the output sequence. A promising technique to accelerate these applications on resource-constrained devices is network pruning, which compresses the deep neural network without a severe drop in inference accuracy. However, we observe that although existing network pruning algorithms are effective at speeding up the deep neural network that precedes the decoder, they lead to a dramatic slowdown of the subsequent decoding and may not always reduce the overall latency of the entire application. To rectify these drawbacks, we propose entropy-based pruning, a new regularizer that can be seamlessly integrated into existing network pruning algorithms. Our key theoretical insight is that reducing the information entropy of the deep neural network outputs decreases the upper bound of the subsequent decoding search space. We validate our solution with two state-of-the-art network pruning algorithms on two model architectures. Experimental results show that, compared with existing network pruning algorithms, our entropy-based pruning method notably suppresses and even eliminates the increase in decoding time, and achieves shorter overall latency with only negligible extra accuracy loss in the applications.
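
The abstract does not state the exact form of the regularizer, so the following is only a minimal sketch of the general idea, under stated assumptions: the mean Shannon entropy of the network's per-step output distributions is added to the task loss with a weight `lam`, and the decoder is assumed to keep only labels whose probability is within a relative threshold of the best label. All function names, the weight `lam`, and the threshold `beam` are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_output_entropy(step_logits):
    """Average entropy of the output distribution over all sequence steps."""
    return sum(entropy(softmax(l)) for l in step_logits) / len(step_logits)


def regularized_loss(task_loss, step_logits, lam=0.1):
    """Assumed form: task loss plus lam times the mean output entropy."""
    return task_loss + lam * mean_output_entropy(step_logits)


def candidates_within_beam(probs, beam=0.01):
    """Count labels a threshold-pruned decoder would keep at one step.

    Assumption: a label survives if its probability is at least
    beam * (probability of the best label).
    """
    best = max(probs)
    return sum(1 for p in probs if p >= beam * best)


# A peaked (low-entropy) output versus a flat (high-entropy) output.
peaked = softmax([8.0, 0.0, 0.0, 0.0])
flat = softmax([1.0, 0.9, 0.8, 0.7])

print(entropy(peaked) < entropy(flat))                   # lower entropy for the peaked output
print(candidates_within_beam(peaked), candidates_within_beam(flat))
```

In this toy setting the peaked distribution leaves fewer candidates for the decoder to expand at each step, which is the intuition behind the abstract's claim that lower output entropy shrinks the decoding search space. The actual bound and training procedure are given in the paper, not here.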

Keywords

Deep Learning, Sequence Labelling, Network Pruning, Automatic Speech Recognition, Named Entity Recognition

Discipline

Databases and Information Systems | Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, San Diego, CA, August 22-27

First Page

155

Last Page

164

ISBN

9781450379984

Identifier

10.1145/3394486.3403058

Publisher

ACM

City or Country

New York

Citation

GAO, Dawei; HE, Xiaoxi; ZHOU, Zimu; TONG, Yongxin; XU, Ke; and THIELE, Lothar. Rethinking pruning for accelerating deep inference at the edge. (2020). KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, San Diego, CA, August 22-27. 155-164.
Available at: https://ink.library.smu.edu.sg/sis_research/5292

Copyright Owner and License

Publisher

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Additional URL

https://doi.org/10.1145/3394486.3403058

Included in

Databases and Information Systems Commons, Software Engineering Commons