[1]
Barto A.G., Bradtke S.J., Singh S.P., Learning to Act Using Real-Time Dynamic Programming, Artificial Intelligence, 72, pp. 81-138, (1995)
[2]
Barto A.G., Sutton R., Reinforcement Learning, (1997)
[3]
Bertsekas D.P., Tsitsiklis J.N., Neuro-Dynamic Programming, (1996)
[4]
Glover F., Taillard E., De Werra D., A User's Guide to Tabu Search, Annals of Operations Research, 41, pp. 3-28, (1993)
[5]
Pattipati K.R., Alexandridis M.G., Application of Heuristic Search and Information Theory to Sequential Fault Diagnosis, IEEE Transactions on Systems, Man, and Cybernetics, 20, pp. 872-887, (1990)
[6]
Tesauro G., Galperin G.R., On-Line Policy Improvement Using Monte Carlo Search, (1996)