Journal of Applied Mathematics and Stochastic Analysis
Volume 12 (1999), Issue 2, Pages 151-160

Exact solution of the Bellman equation for a β-discounted reward in a two-armed bandit with switching arms

Doncho S. Donchev

Higher Institute of Food and Flavor Industries, 26, Maritza str., Plovdiv 4002, Bulgaria

Received 1 May 1998; Revised 1 October 1998

Copyright © 1999 Doncho S. Donchev. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


We consider the symmetric Poissonian two-armed bandit problem. For the case of switching arms, only one of which generates a reward, we explicitly solve the Bellman equation for a β-discounted reward and prove that a myopic policy is optimal.