References
- 1
-
S. Singh and D. Bertsekas.
Reinforcement learning for dynamic channel allocation in cellular
telephone systems.
In Michael C. Mozer, Michael I. Jordan, and Thomas Petsche, editors,
Advances in Neural Information Processing Systems, volume 9, page 974.
The MIT Press, 1997.
- 2
-
R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour.
Policy gradient methods for reinforcement learning with function
approximation.
In S. A. Solla, T. K. Leen, and K.-R. Müller, editors, Advances
in Neural Information Processing Systems, volume 12. The MIT Press, 2000.
- 3
-
G. Tesauro.
Neurogammon wins computer olympiad.
Neural Computation, 1(3):321-323, 1989.
- 4
-
V. R. Konda and J. N. Tsitsiklis.
Actor-critic algorithms.
In S. A. Solla, T. K. Leen, and K.-R. Müller, editors,
Advances in Neural Information Processing Systems, volume 12. The MIT
Press, 2000.
Dirk Ormoneit
Tue Sep 5 16:37:23 PDT 2000