On-device deep multi-task inference via multi-task zipping

Xiaoxi HE
Xu WANG
Zimu ZHOU, Singapore Management University
Jiahang WU
Zheng YANG
Lothar THIELE

Abstract

Future mobile devices are anticipated to perceive, understand and react to the world on their own by running multiple correlated deep neural networks locally on-device. Yet the complexity of these deep models needs to be trimmed down both within-model and cross-model to fit in mobile storage and memory. Previous studies squeeze the redundancy within a single model. In this work, we aim to reduce the redundancy across multiple models. We propose Multi-Task Zipping (MTZ), a framework to automatically merge correlated, pre-trained deep neural networks for cross-model compression. Central in MTZ is a layer-wise neuron sharing and incoming weight updating scheme that induces a minimal change in the error function. MTZ inherits information from each model and demands light retraining to re-boost the accuracy of individual tasks. MTZ supports typical network layers (fully-connected, convolutional and residual) and applies to inference tasks with different input domains. Evaluations show that MTZ can fully merge the hidden layers of two VGG-16 networks with a 3.18% increase in the test error averaged on ImageNet for object classification and CelebA for facial attribute classification, or share 39.61% parameters between the two networks with