Publication Type

Conference Proceeding Article

Version

Accepted version

Publication Date

12-2018

Abstract

As an incremental-gradient algorithm, the hybrid stochastic gradient descent (HSGD) enjoys the merits of both stochastic and full gradient methods for finite-sum optimization problems. However, the existing rate-of-convergence analysis for HSGD is made under with-replacement sampling (WRS) and is restricted to convex problems. It is not clear whether HSGD still carries these advantages under the common practice of without-replacement sampling (WoRS) for non-convex problems. In this paper, we affirmatively answer this open question by showing that under WoRS and for both convex and non-convex problems, it is still possible for HSGD (with constant step-size) to match full gradient descent in rate of convergence, while maintaining a sample-size-independent incremental first-order oracle complexity comparable to that of stochastic gradient descent. For a special class of finite-sum problems with linear prediction models, our convergence results can be further improved in some cases. Extensive numerical results confirm our theoretical affirmation and demonstrate the favorable efficiency of WoRS-based HSGD.
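The sketch below is a minimal, illustrative Python example of a hybrid stochastic gradient method with without-replacement sampling on a finite-sum least-squares problem. It is not the paper's exact algorithm: the geometric batch-growth schedule, the step size, and the least-squares objective are assumptions chosen to show the general idea of interpolating between stochastic and full gradient steps under WoRS with a constant step-size.

```python
# Illustrative sketch (not the paper's exact HSGD update rule): a hybrid
# scheme where mini-batches are drawn without replacement from a fresh
# permutation each epoch and the batch size grows toward the full gradient.
import numpy as np


def hsgd_wors(X, y, step=0.1, epochs=20, batch0=4, growth=2, seed=0):
    """Constant-step-size hybrid SGD on the finite-sum least-squares
    objective f(w) = (1/n) * sum_i 0.5 * (x_i^T w - y_i)^2."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    batch = batch0
    for _ in range(epochs):
        perm = rng.permutation(n)              # WoRS: each index used once per epoch
        for start in range(0, n, batch):
            idx = perm[start:start + batch]
            Xb, yb = X[idx], y[idx]
            grad = Xb.T @ (Xb @ w - yb) / len(idx)   # mini-batch gradient
            w -= step * grad
        batch = min(n, batch * growth)         # hybrid: batch grows toward full gradient
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((200, 5))
    w_true = rng.standard_normal(5)
    y = X @ w_true + 0.01 * rng.standard_normal(200)
    w_hat = hsgd_wors(X, y)
    print("parameter error:", np.linalg.norm(w_hat - w_true))
```

Drawing indices from a per-epoch permutation (rather than sampling with replacement) is what distinguishes WoRS in this sketch; the growing batch size is one common way a hybrid method trades stochastic-gradient cheapness for full-gradient accuracy as training proceeds.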

Discipline

OS and Networks

Research Areas

Intelligent Systems and Optimization

Areas of Excellence

Digital transformation

Publication

Proceedings of the 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada, December 2-8, 2018

First Page

1

Last Page

10

Publisher

NeurIPS

City or Country

Montréal, Canada

Additional URL

https://papers.nips.cc/paper_files/paper/2018/hash/67e103b0761e60683e83c559be18d40c-Abstract.html
