TARGET-LEVEL CRITERION IN MARKOV DECISION-PROCESSES

Cited by: 38

Authors:

BOUAKIZ, M

KEBIR, Y

Affiliations:

[1] LOYOLA UNIV,DEPT MANAGEMENT SCI,CHICAGO,IL 60611

[2] LOYOLA UNIV,DEPT MATH SCI,CHICAGO,IL 60611

Source:

JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS | 1995 / Vol. 86 / No. 1

Keywords:

MARKOV DECISION PROCESSES; TARGET-LEVEL CRITERION; FIXED POINTS; DYNAMIC PROGRAMMING; SUCCESSIVE APPROXIMATIONS;

DOI:

10.1007/BF02193458

Chinese Library Classification (CLC) number:

C93 [Management Science]; O22 [Operations Research];

Discipline classification code:

070105 ; 12 ; 1201 ; 1202 ; 120202 ;

Abstract:

The Markov decision process is studied under the maximization of the probability that total discounted rewards exceed a target level. We focus on and study the dynamic programming equations of the model. We give various properties of the optimal return operator and, for the infinite planning-horizon model, we characterize the optimal value function as a maximal fixed point of the previous operator. Various turnpike results relating the finite and infinite-horizon models are also given.
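To make the criterion concrete, here is a minimal sketch of a finite-horizon dynamic programming recursion for the target-level problem. It is not taken from the paper: the two-state MDP, its rewards, the discount factor `BETA`, and the strict inequality used for "exceed" are all illustrative assumptions. The idea is that the target becomes part of the state: after collecting reward r, the remaining target x is rescaled to (x - r) / BETA for the next stage.

```python
from functools import lru_cache

# Hypothetical two-state, two-action MDP (all numbers are illustrative).
# P[s][a] is a list of (next_state, probability); R[s][a] is the one-step reward.
P = {
    0: {0: [(0, 1.0)], 1: [(0, 0.5), (1, 0.5)]},
    1: {0: [(1, 1.0)], 1: [(0, 1.0)]},
}
R = {0: {0: 1.0, 1: 0.0}, 1: {0: 4.0, 1: 0.0}}
BETA = 0.5  # assumed discount factor


@lru_cache(maxsize=None)
def value(n, s, x):
    """Maximal probability that the n-stage discounted reward from state s exceeds x."""
    if n == 0:
        # No stages left: the total is 0, which exceeds x only if x < 0 (assumed strict).
        return 1.0 if x < 0 else 0.0
    best = 0.0
    for a in P[s]:
        # Remaining target after earning R[s][a], rescaled by the discount factor.
        x_next = (x - R[s][a]) / BETA
        q = sum(p * value(n - 1, j, x_next) for j, p in P[s][a])
        best = max(best, q)
    return best


for target in (1.0, 1.75, 2.5):
    print(target, value(2, 0, target))
```

In this toy model the safe action in state 0 earns at most 1.5 over two stages, so for a target of 1.75 the only way to succeed is the risky action, which reaches the high-reward state with probability 0.5. The value is nonincreasing in the target, as expected for a probability of exceeding it.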

Citation

Pages: 1-15

Page count: 15

13 references in total

[1] DISCOUNTED MDP - DISTRIBUTION-FUNCTIONS AND EXPONENTIAL UTILITY MAXIMIZATION [J].