24 entries in total
[8]
Linear least-squares algorithms for temporal difference learning[J]. Steven J. Bradtke, Andrew G. Barto. Machine Learning. 1996 (1)
[9]
Reinforcement learning with replacing eligibility traces[J]. Satinder P. Singh, Richard S. Sutton. Machine Learning. 1996 (1)
[10]
Incremental multi-step Q-learning[J]. Jing Peng, Ronald J. Williams. Machine Learning. 1996 (1)