Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

10-2021

Abstract

In this paper, we introduce a non-stationary and context-free Multi-Armed Bandit (MAB) problem and a novel algorithm (which we refer to as BMAB) to solve it. The problem is context-free in the sense that no side information about users or items is needed. We work in a continuous-time setting where each timestamp corresponds to a visit by a user and a corresponding recommendation decision. The main novelty is that we model the reward distribution as a consequence of variations in the intensity of the activity, and thereby help resolve the exploration/exploitation dilemma by exploring the temporal dynamics of the audience. To achieve this, we assume that the recommendation procedure can be split into two different states: the loyal state and the curious state. We identify the current state by modelling the events as a mixture of two Poisson processes, one for each of the possible states. We further assume that the loyal audience is associated with a single stationary reward distribution, whereas each bursty period comes with its own reward distribution. We test our algorithm and compare it to several baselines in two strands of experiments: synthetic data simulations and real-world datasets. The results demonstrate that BMAB is competitive with state-of-the-art methods.
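
The two-state idea in the abstract can be illustrated with a minimal sketch. This is not the authors' BMAB algorithm, whose details the abstract does not give. It rests on several assumptions made only for illustration: the two Poisson rates are known in advance, the state is inferred from a single inter-arrival gap, rewards are binary, and Thompson sampling with Beta posteriors stands in for the bandit policy. All names (`classify_state`, `BetaArms`, `BurstAwareBandit`) are hypothetical.

```python
import math
import random


def classify_state(gap, rate_loyal, rate_curious, prior_curious=0.5):
    """Label one inter-arrival gap as 'loyal' or 'curious'.

    Inter-arrival times of a Poisson process with rate r follow an
    Exponential(r) law, so we compare the two prior-weighted densities.
    Assumption: both rates are known; a real system would estimate them.
    """
    like_loyal = (1.0 - prior_curious) * rate_loyal * math.exp(-rate_loyal * gap)
    like_curious = prior_curious * rate_curious * math.exp(-rate_curious * gap)
    return "curious" if like_curious > like_loyal else "loyal"


class BetaArms:
    """One Beta(1, 1) posterior per arm for Bernoulli Thompson sampling."""

    def __init__(self, n_arms):
        self.alpha = [1.0] * n_arms
        self.beta = [1.0] * n_arms

    def select(self, rng):
        # Draw one sample per arm and play the arm with the largest draw.
        draws = [rng.betavariate(a, b) for a, b in zip(self.alpha, self.beta)]
        return max(range(len(draws)), key=draws.__getitem__)

    def update(self, arm, reward):
        # reward is assumed to be 0 or 1.
        self.alpha[arm] += reward
        self.beta[arm] += 1 - reward


class BurstAwareBandit:
    """Illustrative bandit: persistent loyal posterior, fresh one per burst."""

    def __init__(self, n_arms, rate_loyal, rate_curious, seed=0):
        self.n_arms = n_arms
        self.rate_loyal = rate_loyal
        self.rate_curious = rate_curious
        self.rng = random.Random(seed)
        self.loyal = BetaArms(n_arms)  # single stationary reward model
        self.burst = BetaArms(n_arms)  # replaced whenever a new burst starts
        self.state = "loyal"
        self.last_time = None

    def observe_visit(self, timestamp):
        """Update the inferred state from the gap since the previous visit."""
        if self.last_time is not None:
            gap = timestamp - self.last_time
            new_state = classify_state(gap, self.rate_loyal, self.rate_curious)
            if new_state == "curious" and self.state == "loyal":
                # A new bursty period gets its own reward distribution.
                self.burst = BetaArms(self.n_arms)
            self.state = new_state
        self.last_time = timestamp
        return self.state

    def _active(self):
        return self.burst if self.state == "curious" else self.loyal

    def recommend(self):
        return self._active().select(self.rng)

    def feedback(self, arm, reward):
        self._active().update(arm, reward)
```

In use, each visit would call `observe_visit(timestamp)`, then `recommend()`, then `feedback(arm, reward)` once the click or non-click is known. Classifying on a single gap is noisy; smoothing over a window of recent gaps would be a natural refinement, though whether the paper does this is not stated in the abstract.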

Keywords

Recommender Systems, Reinforcement Learning, Online learning, Poisson processes, Time Series Analysis, bursty methods, audience dynamics

Discipline

Artificial Intelligence and Robotics | Numerical Analysis and Scientific Computing

Research Areas

Intelligent Systems and Optimization

Publication

RecSys '21: Proceedings of the 15th ACM Conference on Recommender Systems, September 27 - October 1, Amsterdam

First Page

292

Last Page

301

ISBN

9781450384582

Identifier

10.1145/3460231.3474250

Publisher

ACM

City or Country

New York

Citation

ALVES, Rodrigo; LEDENT, Antoine; and KLOFT, Marius. Burst-induced Multi-Armed Bandit for learning recommendation. (2021). RecSys '21: Proceedings of the 15th ACM Conference on Recommender Systems, September 27 - October 1, Amsterdam. 292-301.
Available at: https://ink.library.smu.edu.sg/sis_research/7209

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Comments

Video of the presentation available at https://www.youtube.com/watch?v=PS8FTUIdfAQ

Additional URL

https://doi.org/10.1145/3460231.3474250

Included in

Artificial Intelligence and Robotics Commons, Numerical Analysis and Scientific Computing Commons

Research Collection School Of Computing and Information Systems

Burst-induced Multi-Armed Bandit for learning recommendation
