Research Collection School Of Computing and Information Systems

A near-optimal change-detection based algorithm for piecewise-stationary combinatorial semi-bandits

Huozhi ZHOU, University of Illinois at Urbana-Champaign
Lingda WANG, University of Illinois at Urbana-Champaign
Lav N. VARSHNEY, University of Illinois at Urbana-Champaign
Ee-Peng LIM, Singapore Management UniversityFollow

Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

12-2020

Abstract

We investigate the piecewise-stationary combinatorial semi-bandit problem. Compared to the original combinatorial semi-bandit problem, our setting assumes the reward distributions of base arms may change in a piecewise-stationary manner at unknown time steps. We propose an algorithm, GLR-CUCB, which incorporates an eﬃcient combinatorial semi-bandit algorithm, CUCB, with an almost parameter-free change-point detector, the Generalized Likelihood Ratio Test (GLRT). Our analysis shows that the regret of GLR-CUCB is upper bounded by O(√NKT logT), where N is the number of piecewise-stationary segments, K is the number of base arms, and T is the number of time steps. As a complement, we also derive a nearly matching regret lower bound on the order of Ω(√NKT), for both piecewise-stationary multi-armed bandits and combinatorial semi-bandits, using information-theoretic techniques and judiciously constructed piecewisestationary bandit instances. Our lower bound is tighter than the best available regret lower bound, which is Ω(√T). Numerical experiments on both synthetic and real-world datasets demonstrate the superiority of GLR-CUCB compared to other state-of-the-art algorithms.

Discipline

Databases and Information Systems | Theory and Algorithms

Research Areas

Data Science and Engineering

Publication

Thirty-Third AAAI Conference on Artificial Intelligence: AAAI-20: New York, February 7-12: Proceedings

First Page

Last Page

ISBN

9781577358350

Identifier

10.1609/aaai.v34i04.6176

Publisher

AAAI Press

City or Country

Menlo Park, CA

Citation

ZHOU, Huozhi; WANG, Lingda; VARSHNEY, Lav N.; and LIM, Ee-Peng. A near-optimal change-detection based algorithm for piecewise-stationary combinatorial semi-bandits. (2020). Thirty-Third AAAI Conference on Artificial Intelligence: AAAI-20: New York, February 7-12: Proceedings. 1-9.
Available at: https://ink.library.smu.edu.sg/sis_research/4968

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

https://doi.org/10.1609/aaai.v34i04.6176

Download

Find it in your library

Included in

Databases and Information Systems Commons, Theory and Algorithms Commons

COinS

Research Collection School Of Computing and Information Systems

A near-optimal change-detection based algorithm for piecewise-stationary combinatorial semi-bandits

Publication Type

Version

Publication Date

Abstract

Discipline

Research Areas

Publication

First Page

Last Page

ISBN

Identifier

Publisher

City or Country

Citation

Creative Commons License

Additional URL

Included in

Search

Links

Browse

Links

Research Collection School Of Computing and Information Systems

A near-optimal change-detection based algorithm for piecewise-stationary combinatorial semi-bandits

Author

Publication Type

Version

Publication Date

Abstract

Discipline

Research Areas

Publication

First Page

Last Page

ISBN

Identifier

Publisher

City or Country

Citation

Creative Commons License

Additional URL

Included in

Share

Search

Links

Browse

Links