Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
12-2020
Abstract
We investigate the piecewise-stationary combinatorial semi-bandit problem. Compared to the original combinatorial semi-bandit problem, our setting assumes the reward distributions of base arms may change in a piecewise-stationary manner at unknown time steps. We propose an algorithm, GLR-CUCB, which incorporates an efficient combinatorial semi-bandit algorithm, CUCB, with an almost parameter-free change-point detector, the Generalized Likelihood Ratio Test (GLRT). Our analysis shows that the regret of GLR-CUCB is upper bounded by O(√NKT logT), where N is the number of piecewise-stationary segments, K is the number of base arms, and T is the number of time steps. As a complement, we also derive a nearly matching regret lower bound on the order of Ω(√NKT), for both piecewise-stationary multi-armed bandits and combinatorial semi-bandits, using information-theoretic techniques and judiciously constructed piecewisestationary bandit instances. Our lower bound is tighter than the best available regret lower bound, which is Ω(√T). Numerical experiments on both synthetic and real-world datasets demonstrate the superiority of GLR-CUCB compared to other state-of-the-art algorithms.
Discipline
Databases and Information Systems | Theory and Algorithms
Research Areas
Data Science and Engineering
Publication
Thirty-Third AAAI Conference on Artificial Intelligence: AAAI-20: New York, February 7-12: Proceedings
First Page
1
Last Page
9
ISBN
9781577358350
Identifier
10.1609/aaai.v34i04.6176
Publisher
AAAI Press
City or Country
Menlo Park, CA
Citation
ZHOU, Huozhi; WANG, Lingda; VARSHNEY, Lav N.; and LIM, Ee-Peng.
A near-optimal change-detection based algorithm for piecewise-stationary combinatorial semi-bandits. (2020). Thirty-Third AAAI Conference on Artificial Intelligence: AAAI-20: New York, February 7-12: Proceedings. 1-9.
Available at: https://ink.library.smu.edu.sg/sis_research/4968
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1609/aaai.v34i04.6176