Markov Decision Processes in Artificial Intelligence
Olivier Sigaud, Olivier Buffet

Markov Decision Processes (MDPs) are a mathematical framework for modeling sequential decision problems under uncertainty, as well as reinforcement learning problems. Written by experts in the field, this book provides a global view of current research on MDPs in artificial intelligence. It begins with an introductory presentation of the fundamental aspects of MDPs (planning in MDPs, reinforcement learning, partially observable MDPs, Markov games, and the use of non-classical criteria), then presents more advanced research trends in the field, illustrated with concrete real-life applications.
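To illustrate the framework the book surveys, here is a minimal value-iteration sketch for a toy MDP. The two-state, two-action transition matrix, rewards, and discount factor below are invented for illustration and are not taken from the book.

```python
import numpy as np

n_states, n_actions = 2, 2
gamma = 0.9  # discount factor (illustrative choice)

# P[a][s, s'] = probability of moving from s to s' under action a
P = np.array([
    [[0.8, 0.2], [0.1, 0.9]],   # action 0
    [[0.5, 0.5], [0.3, 0.7]],   # action 1
])
# R[s, a] = expected immediate reward for taking action a in state s
R = np.array([
    [1.0, 0.0],
    [0.0, 2.0],
])

V = np.zeros(n_states)
for _ in range(1000):
    # Bellman optimality backup:
    # V(s) <- max_a [ R(s, a) + gamma * sum_s' P(s'|s, a) V(s') ]
    Q = R + gamma * np.einsum('ast,t->sa', P, V)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        V = V_new
        break
    V = V_new

# Greedy policy with respect to the converged value function
policy = Q.argmax(axis=1)
```

Because the Bellman backup is a gamma-contraction, the loop converges to the optimal value function; the greedy policy extracted from it is then optimal for this toy problem.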
Contents
Reinforcement Learning
Approximate Dynamic Programming
Factored Markov Decision Processes
Policy-Gradient Algorithms
Online Resolution Techniques
Partially Observable Markov Decision Processes
Stochastic Games
DEC-MDP/POMDP
Non-Standard Criteria
Online Learning for Micro-Object Manipulation
Autonomous Helicopter Searching for a Landing Area in an Uncertain Environment
Resource Consumption Control for an Autonomous Robot
Operations Planning
Index