Bayesian Online Model Selection
About
Online model selection in Bayesian bandits poses a fundamental exploration challenge: for an unknown environment instance drawn from the prior distribution, how can we adaptively explore multiple bandit learners and compete with the best one in terms of performance? We address this problem by introducing a novel Bayesian algorithm for online model selection in stochastic bandits. We establish an oracle-best guarantee on the Bayesian regret that depends on the number of base learners, the regret coefficient of the optimal base learner, and the time horizon. We further validate our algorithm through experiments across various stochastic bandit settings, demonstrating that its performance is competitive with that of the best base learner.
Speakers

Yuke Zhang
Advised by Professor Aguêmon Yves Atchadé, Yuke is broadly interested in the statistical foundations of deep learning from a Bayesian perspective. Specifically, Yuke focuses on online learning with high-dimensional sparse spatio-temporal data, with an emphasis on algorithms such as Thompson Sampling.

Aida Afshar
Aida Afshar is a PhD student at Boston University’s Faculty of Computing and Data Sciences. Previously, she received her bachelor's degree in Mathematics with a minor in Computer Science from Sharif University of Technology. Her primary research interests are sequential decision-making and the statistical foundations of continual learning. Learn more about Aida here: aidaafshar.github.io