
A parallel framework for Bayesian reinforcement learning

Pages 7–23 | Received 01 Sep 2013, Accepted 19 Nov 2013, Published online: 13 Mar 2014

References

  • Baird, L. (1995, July 9–12). Residual algorithms: Reinforcement learning with function approximation. Proceedings of the 12th international conference on machine learning, Tahoe City, CA.
  • Chu, C.-T., Kim, S. K., Lin, Y.-A., Yu, Y. Y., Bradski, G., Ng, A. Y., et al. (2007). Map-reduce for machine learning on multicore. In B. Schölkopf, J. C. Platt, & T. Hoffman (Eds.), Advances in neural information processing systems (pp. 281–288). Cambridge, MA: MIT Press.
  • Dearden, R., Friedman, N., & Andre, D. (1999, July). Model based Bayesian exploration. Proceedings of the fifteenth conference on uncertainty in artificial intelligence (pp. 150–159). Stockholm, Sweden.
  • Doshi, P., Goodwin, R., Akkiraju, R., & Verma, K. (2005). Dynamic workflow composition using Markov decision processes. International Journal of Web Services Research, 2, 1–17. doi: 10.4018/jwsr.2005010101
  • Dutreilh, X., Rivierre, N., Moreau, A., Malenfant, J., & Truck, I. (2010). From data center resource allocation to control theory and back. In S. S. Yau & L.-J. Zhang (Eds.), 2010 IEEE 3rd international conference on cloud computing (CLOUD) (pp. 410–417). Miami, FL: IEEE.
  • Friedman, N., & Singer, Y. (1999). Efficient Bayesian parameter estimation in large discrete domains. In Advances in neural information processing systems 11: Proceedings of the 1998 conference (p. 417). Cambridge, MA: MIT Press. Retrieved from http://books.google.co.in/books?id=bMuzXPzlkG0C&printsec=frontcover&source=gbs_ge_summary_r&cad=0#v=onepage&q&f=false
  • Grounds, M., & Kudenko, D. (2008, May 12–16). Parallel reinforcement learning with linear function approximation. In Adaptive agents and multi-agent systems III: Adaptation and multi-agent learning. Estoril, Portugal.
  • Grounds, M., & Kudenko, D. (2009). Learning shaping rewards in model-based reinforcement learning. Proceedings of the AAMAS 2009 workshop on adaptive learning agents, Budapest, Hungary.
  • Kaelbling, L. P., Littman, M. L., & Moore, A. W. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4, 237–285.
  • Kawaguchi, K., & Araya, M. (2013). A greedy approximation of Bayesian reinforcement learning with probably optimistic transition model. Proceedings of the AAMAS 2013 workshop on adaptive learning agents, Saint Paul, MN.
  • Kretchmar, R. M. (2002, July 14–18). Parallel reinforcement learning. The 6th world conference on systemics, cybernetics, and informatics, Orlando, FL.
  • Kushida, M., Takahashi, K., Ueda, H., & Miyahara, T. (2006, December). A comparative study of parallel reinforcement learning methods with a PC cluster system. Proceedings of the IEEE/WIC/ACM international conference on intelligent agent technology (pp. 18–22). Hong Kong, China.
  • Li, Y., & Schuurmans, D. (2012). MapReduce for parallel reinforcement learning. In S. Sanner & M. Hutter (Eds.), Recent advances in reinforcement learning (pp. 309–320). Berlin, Heidelberg: Springer.
  • Littman, M. L. (1994, July 10–13). Markov games as a framework for multi-agent reinforcement learning. Proceedings of the 11th international conference on machine learning, New Brunswick, NJ.
  • Melo, F. S., Meyn, S. P., & Ribeiro, M. I. (2008, July 5–9). An analysis of reinforcement learning with function approximation. Proceedings of the 25th international conference on machine learning, Helsinki, Finland.
  • Nau, D., Ghallab, M., & Traverso, P. (2004). Automated planning: Theory & practice. San Francisco, CA: Morgan Kaufmann.
  • Poupart, P., Vlassis, N., Hoey, J., & Regan, K. (2006). An analytic solution to discrete Bayesian reinforcement learning. In W. W. Cohen & A. Moore (Eds.), Proceedings of the 23rd international conference on machine learning (pp. 697–704). New York, NY: ACM.
  • Russell, S. J., Norvig, P., Canny, J. F., Malik, J. M., & Edwards, D. D. (1995). Artificial intelligence: A modern approach (Vol. 2). Englewood Cliffs, NJ: Prentice Hall.
  • Spiegelhalter, D. J., Dawid, A. P., Lauritzen, S. L., & Cowell, R. G. (1993). Bayesian analysis in expert systems. Statistical Science, 8, 219–247. doi: 10.1214/ss/1177010888
  • Strens, M. (2000). A Bayesian framework for reinforcement learning. Proceedings of the 17th international conference on machine learning (pp. 943–950). Stanford, CA.
  • Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction (1st ed.). Cambridge, MA: MIT Press.
  • Sutton, R. S., McAllester, D. A., Singh, S. P., & Mansour, Y. (1999). Policy gradient methods for reinforcement learning with function approximation. In Advances in neural information processing systems (Vol. 12, pp. 1057–1063). Denver, CO. Retrieved from http://webdocs.cs.ualberta.ca/sutton/papers/SMSM-NIPS99.pdf
  • Watkins, C. (1989). Learning from delayed rewards (Doctoral dissertation). University of Cambridge, Cambridge, England.
  • Zinkevich, M., Weimer, M., Smola, A., & Li, L. (2010). Parallelized stochastic gradient descent. In Advances in neural information processing systems (Vol. 23, pp. 2595–2603).
