[1]
Agrawal R. (1995) Sample mean based index policies with O(log n) regret for the multi-armed bandit problem. Advances in Applied Probability 27, 1054-1078.
[2]
Burnetas A., Katehakis M. (1996) Optimal adaptive policies for sequential allocation problems. Advances in Applied Mathematics 17, 122-142.
[3]
Ishikida T., Varaiya P. (1994) Multi-armed bandit problem revisited. Journal of Optimization Theory and Applications 83, 113-154.
[4]
Lai T., Robbins H. (1985) Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics 6, 4-22.
[5]
Yakowitz S., Lowe W. (1991) Nonparametric bandit methods. Annals of Operations Research 28, 297-312.