Multiagent Planning with Factored MDPs (2002)by C.E. Guestrin, D. Koller, and R. Parr
Abstract:
We present a new, principled and efficient planning algorithm for
cooperative multi-agent dynamic systems. A striking feature of our method
is that the coordination and communication between the agents is not
imposed, but derived directly from the system dynamics and function
approximation architecture. We view the entire multi-agent system as a
single large Markov decision process (MDP), which we assume canb e
represented in a factored way using a dynamic Bayesian network (DBN). THe
actionspace of the resulting MDP is the joint action space of the entire
set of agents. Our approach is based on the use of factored linear value
functions as an approximation to the joint value function. This
factorization of the value function allows the agents to coordinate their
actions at runtime using a natural message passing scheme. We provide
asimple and efficient method for determining such an approximate value
function by solving a single linear program, whose size is determined by the
interaction between the value function structure and the DBN, avoiding t he
exponential blowup in the state and action space. We show that our approach
compares favorably with approaches based on reward sharing. We alsoshow
that our algorithm is an efficient alternative to more complicated
algorithms even in the single agent case.
Download Information
|
C.E. Guestrin, D. Koller, and R. Parr (2002). "Multiagent Planning with Factored MDPs." Advances in Neural Information Processing Systems (NIPS 2001) (pp. 1523-1530).
|
|
Bibtex citation
@inproceedings{Guestrin+al:NIPS01,
author = "C.E. Guestrin and D. Koller and R. Parr",
booktitle = {Advances in Neural Information Processing Systems (NIPS 2001)},
title = "Multiagent Planning with Factored {MDP}s",
address = "Vancouver, Canada",
pages = "1523--1530",
year = "2002",
}
full list
|