Computers and Computing

Solving Partially Observable Environments with Universal Search Using Dataflow Graph-Based Programming Model

References

  • L. P. Kaelbling, M. L. Littman, and A. R. Cassandra, “Planning and acting in partially observable stochastic domains,” Artif. Intell., Vol. 101, no. 1–2, pp. 99–134, 1998. DOI: 10.1016/S0004-3702(98)00023-X.
  • A. R. Cassandra, “A survey of POMDP applications,” in Working Notes of AAAI 1998 Fall Symposium on Planning with Partially Observable Markov Decision Processes. Vol. 1724, Oct. 1998. Orlando, Florida.
  • T. Glasmachers, and J. Schmidhuber, “Optimal direct policy search,” in The Fourth Conference on Artificial General Intelligence, Mountain View, CA, Aug. 2011, pp. 52–61.
  • L. A. Levin, “Universal sequential search problems,” Probl. Peredaci Inf., Vol. 9, pp. 115–6, 1973. Translated in Problems of Information Transmission 9, pp. 265–6.
  • M. Li, and P. Vitányi, An Introduction to Kolmogorov Complexity and Its Applications. Vol. 3, New York: Springer Science & Business Media, 2008.
  • R. J. Solomonoff, “A formal theory of inductive inference. Part I,” Inf. Control, Vol. 7, no. 1, pp. 1–22, 1964. DOI: 10.1016/S0019-9958(64)90223-2.
  • R. J. Solomonoff, “A formal theory of inductive inference. Part II,” Inf. Control, Vol. 7, no. 2, pp. 224–54, 1964. DOI: 10.1016/S0019-9958(64)90131-7.
  • W. M. Johnston, J. R. Hanna, and R. J. Millar, “Advances in dataflow programming languages,” ACM Comput. Surveys (CSUR), Vol. 36, no. 1, pp. 1–34, 2004. DOI: 10.1145/1013208.1013209.
  • T. Akidau, et al., “The dataflow model: A practical approach to balancing correctness, latency, and cost in massive-scale, unbounded, out-of-order data processing,” 2015.
  • J. Brandt, W. Reisig, and U. Leser, “Computation semantics of the functional scientific workflow language cuneiform,” J. Funct. Program., Vol. 27, 2017. DOI: 10.1017/S0956796817000119.
  • M. Looks, and B. Goertzel, “Program representation for general intelligence,” in 2nd Conference on AGI, Arlington, Virginia, May 2009, pp. 146–151.
  • M. Hutter, “A gentle introduction to the universal algorithmic agent AIXI,” Technical report, Manno-Lugano, Switzerland: IDSIA, 2003. Available: https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.11.8797&rep=rep1&type=pdf
  • J. Schmidhuber, “Gödel machines: Fully self-referential optimal universal self-improvers,” Artif. Gen. Intell., pp. 199–226, 2007. DOI: 10.1007/978-3-540-68677-4_7.
  • B. R. Steunebrink, and J. Schmidhuber, “Towards an actual Gödel machine implementation: A lesson in self-reflective systems,” in Theoretical Foundations of Artificial General Intelligence, P. Wang, and B. Goertzel, Ed. Paris: Atlantis Press, 2012, pp. 173–95.
  • B. R. Steunebrink, K. R. Thórisson, and J. Schmidhuber, “Growing recursive self-improvers,” in International Conference on Artificial General Intelligence. Cham: Springer, Jul. 2016, pp. 129–39.
  • J. Schmidhuber, “Optimal ordered problem solver,” Mach. Learn., Vol. 54, no. 3, pp. 211–54, 2004. DOI: 10.1023/B:MACH.0000015880.99707.b2.
  • E. Özkural, “Gigamachine: Incremental machine learning on desktop computers,” arXiv preprint arXiv:1709.03413, 2017.
  • S. K. Paul, P. Gupta, and P. Bhaumik, “Learning to solve single variable linear equations by universal search with probabilistic program graphs,” in IBICA, Kochi, India, Dec. 2018, pp. 310–320.
  • S. K. Paul, and P. Bhaumik, “A reinforcement learning agent based on genetic programming and universal search,” in 4th ICICCS, IEEE, Madurai, India, May 2020, pp. 122–128.
  • D. Silver, and J. Veness, “Monte-Carlo planning in large POMDPs,” in NeurIPS, Vancouver, Canada, 2010, pp. 2164–2172.
  • N. Ye, A. Somani, D. Hsu, and W. S. Lee, “DESPOT: Online POMDP planning with regularization,” J. Artif. Intell. Res., Vol. 58, pp. 231–66, 2017. DOI: 10.1613/jair.5328.
  • Z. N. Sunberg, and M. J. Kochenderfer, “Online algorithms for POMDPs with continuous state, action, and observation spaces,” in 28th International Conference on Automated Planning and Scheduling, Delft, The Netherlands, Jun. 2018.
  • H. Kurniawati, and V. Yadav, “An online POMDP solver for uncertainty planning in dynamic environment,” in Robotics Research, Springer Tracts in Advanced Robotics, Vol. 114, M. Inaba, and P. Corke, Ed. Cham: Springer, 2016, pp. 611–629.
  • V. Mnih, et al., “Human-level control through deep reinforcement learning,” Nature, Vol. 518, no. 7540, pp. 529–33, 2015. DOI: 10.1038/nature14236.
  • M. Hausknecht, and P. Stone, “Deep recurrent Q-learning for partially observable MDPs,” in 2015 AAAI Fall Symposium Series, Arlington, Virginia, Sep. 2015.
  • T. P. Le, N. A. Vien, and T. Chung, “A deep hierarchical reinforcement learning algorithm in partially observable Markov decision processes,” IEEE Access, Vol. 6, pp. 49089–102, 2018. DOI: 10.1109/ACCESS.2018.2854283.
  • L. Meng, R. Gorbet, and D. Kulić, “Memory-based deep reinforcement learning for POMDP,” arXiv preprint arXiv:2102.12344, 2021.
  • R. S. Bird, and P. L. Wadler, An Introduction to Functional Programming. Hoboken, NJ: Prentice Hall, 1988.
  • P. Wadler, “Comprehending monads,” in Proceedings of the 1990 ACM Conference on LISP and Functional Programming, May 1990, pp. 61–78.
  • J. Hughes, “Generalising monads to arrows,” Sci. Comput. Program., Vol. 37, no. 1–3, pp. 67–111, 2000. DOI: 10.1016/S0167-6423(99)00023-4.
  • J. Schmidhuber, J. Zhao, and M. Wiering, “Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self improvement,” Mach. Learn., Vol. 28, no. 1, pp. 105–30, 1997. DOI: 10.1023/A:1007383707642.
  • P. D. Grünwald, and P. M. Vitányi, “Algorithmic information theory,” in Handbook of the Philosophy of Information, P. Adriaans, and J. van Benthem, Ed. Amsterdam, Netherlands: Elsevier, 2008, pp. 281–320.
  • T. M. Cover, and J. A. Thomas, Elements of Information Theory. Hoboken, NJ: John Wiley & Sons, 2012.
  • S. K. Paul, and P. Bhaumik, “A fast universal search by equivalent program pruning,” in ICACCI, Jaipur, India, Sep. 2016, pp. 454–460.
  • J. Veness, K. S. Ng, M. Hutter, W. Uther, and D. Silver, “A Monte-Carlo AIXI approximation,” J. Artif. Intell. Res., Vol. 40, pp. 95–142, 2011. DOI: 10.1613/jair.3125.
