Publication Type
Journal Article
Version
publishedVersion
Publication Date
7-2019
Abstract
Gradient-based Monte Carlo sampling algorithms, like Langevin dynamics and Hamiltonian Monte Carlo, are important methods for Bayesian inference. In large-scale settings, full gradients are not affordable, so stochastic gradients evaluated on mini-batches are used as a replacement. In order to reduce the high variance of noisy stochastic gradients, Dubey et al. (in: Advances in neural information processing systems, pp 1154–1162, 2016) applied the standard variance reduction technique to stochastic gradient Langevin dynamics and obtained both theoretical and experimental improvements. In this paper, we apply variance reduction techniques to Hamiltonian Monte Carlo and achieve better theoretical convergence results than variance-reduced Langevin dynamics. Moreover, we apply a symmetric splitting scheme in our variance-reduced Hamiltonian Monte Carlo algorithms to further improve the theoretical results. The experimental results are also consistent with the theoretical results: as our experiments show, variance-reduced Hamiltonian Monte Carlo demonstrates better performance than variance-reduced Langevin dynamics on Bayesian regression and classification tasks with real-world datasets.
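A minimal sketch of the idea summarized above, combining an SVRG-style control variate with a standard stochastic gradient HMC update on a toy Bayesian linear regression problem. The model, the hyperparameters (step size, friction, epoch length, batch size), and helper names such as grad_U_minibatch are illustrative assumptions, not the paper's exact algorithm (which additionally uses a symmetric splitting integrator).

```python
import numpy as np

# Sketch: SVRG-style variance-reduced stochastic gradient HMC (SGHMC)
# on a toy Bayesian linear regression posterior. All hyperparameters
# below are illustrative, not the settings used in the paper.

rng = np.random.default_rng(0)

# Toy data: y = X w_true + noise, with a Gaussian prior on w.
N, d = 1000, 5
X = rng.normal(size=(N, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.5 * rng.normal(size=N)
prior_prec = 1.0   # precision of the N(0, I) prior
noise_prec = 4.0   # observation precision 1 / 0.5**2

def grad_U_minibatch(w, idx):
    """Mini-batch gradient of the negative log-posterior U(w),
    rescaled to the full-data magnitude."""
    Xb, yb = X[idx], y[idx]
    grad_lik = noise_prec * Xb.T @ (Xb @ w - yb) * (N / len(idx))
    return grad_lik + prior_prec * w

def grad_U_full(w):
    return grad_U_minibatch(w, np.arange(N))

def svrg_sghmc(num_epochs=20, m=50, b=32, eta=1e-5, alpha=0.1):
    """Variance-reduced SGHMC: one full gradient per epoch serves as
    the SVRG control variate for the mini-batch gradients."""
    w = np.zeros(d)
    v = np.zeros(d)
    samples = []
    for _ in range(num_epochs):
        w_snap = w.copy()             # snapshot parameter
        g_snap = grad_U_full(w_snap)  # full gradient at the snapshot
        for _ in range(m):
            idx = rng.choice(N, size=b, replace=False)
            # Variance-reduced stochastic gradient (control variate).
            g = grad_U_minibatch(w, idx) - grad_U_minibatch(w_snap, idx) + g_snap
            # SGHMC momentum update with friction alpha and injected noise.
            v = (1.0 - alpha) * v - eta * g + np.sqrt(2.0 * alpha * eta) * rng.normal(size=d)
            w = w + v
            samples.append(w.copy())
    return np.array(samples)

samples = svrg_sghmc()
print("posterior mean estimate:", samples[len(samples) // 2:].mean(axis=0))
print("true weights:           ", w_true)
```

Discarding the first half of the chain as burn-in, the sample mean should land close to w_true; the control variate keeps the per-step gradient noise small even with a batch size of 32 out of 1000 points.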
Keywords
Hamiltonian Monte Carlo, Variance reduction, Bayesian inference
Discipline
Databases and Information Systems
Research Areas
Data Science and Engineering; Intelligent Systems and Optimization
Publication
Machine Learning
Volume
108
Issue
8-9
First Page
1701
Last Page
1727
ISSN
0885-6125
Identifier
10.1007/s10994-019-05825-y
Publisher
Springer
Citation
LI, Zhize; ZHANG, Tianyi; CHENG, Shuyu; ZHU, Jun; and LI, Jian.
Stochastic gradient Hamiltonian Monte Carlo with variance reduction for Bayesian inference. (2019). Machine Learning. 108, (8-9), 1701-1727.
Available at: https://ink.library.smu.edu.sg/sis_research/8689
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1007/s10994-019-05825-y