References
- L. P. Kaelbling, M. L. Littman, and A. R. Cassandra, “Planning and acting in partially observable stochastic domains,” Artif. Intell., Vol. 101, no. 1–2, pp. 99–134, 1998. DOI: 10.1016/S0004-3702(98)00023-X.
- A. R. Cassandra, “A survey of POMDP applications,” in Working Notes of AAAI 1998 Fall Symposium on Planning with Partially Observable Markov Decision Processes. Vol. 1724, Oct. 1998. Orlando, Florida.
- T. Glasmachers, and J. Schmidhuber, “Optimal direct policy search,” in The Fourth Conference on Artificial General Intelligence, Mountain View, CA, Aug. 2011, pp. 52–61.
- L. A. Levin, “Universal sequential search problems,” Probl. Peredaci Inf., Vol. 9, pp. 115–16, 1973. Translated in Problems of Information Transmission, Vol. 9, pp. 265–66.
- M. Li, and P. Vitányi, An Introduction to Kolmogorov Complexity and Its Applications. Vol. 3, New York: Springer Science & Business Media, 2008.
- R. J. Solomonoff, “A formal theory of inductive inference. Part I,” Inf. Control, Vol. 7, no. 1, pp. 1–22, 1964. DOI: 10.1016/S0019-9958(64)90223-2.
- R. J. Solomonoff, “A formal theory of inductive inference. Part II,” Inf. Control, Vol. 7, no. 2, pp. 224–54, 1964. DOI: 10.1016/S0019-9958(64)90131-7.
- W. M. Johnston, J. R. Hanna, and R. J. Millar, “Advances in dataflow programming languages,” ACM Comput. Surveys (CSUR), Vol. 36, no. 1, pp. 1–34, 2004. DOI: 10.1145/1013208.1013209.
- T. Akidau, et al., “The dataflow model: A practical approach to balancing correctness, latency, and cost in massive-scale, unbounded, out-of-order data processing,” Proc. VLDB Endow., Vol. 8, no. 12, pp. 1792–803, 2015.
- J. Brandt, W. Reisig, and U. Leser, “Computation semantics of the functional scientific workflow language cuneiform,” J. Funct. Program., Vol. 27, 2017. DOI: 10.1017/S0956796817000119.
- M. Looks, and B. Goertzel, “Program representation for general intelligence,” in 2nd Conference on AGI, Arlington, Virginia, May 2009, pp. 146–51.
- M. Hutter, “A gentle introduction to the universal algorithmic agent AIXI,” Technical report, Manno-Lugano, Switzerland: IDSIA, 2003. Available: https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.11.8797&rep=rep1&type=pdf
- J. Schmidhuber, “Gödel machines: Fully self-referential optimal universal self-improvers,” Artif. Gen. Intell., pp. 199–226, 2007. DOI: 10.1007/978-3-540-68677-4_7.
- B. R. Steunebrink, and J. Schmidhuber, “Towards an actual Gödel machine implementation: A lesson in self-reflective systems,” in Theoretical Foundations of Artificial General Intelligence, P. Wang, and B. Goertzel, Ed. Paris: Atlantis Press, 2012, pp. 173–95.
- B. R. Steunebrink, K. R. Thórisson, and J. Schmidhuber, “Growing recursive self-improvers,” in International Conference on Artificial General Intelligence. Cham: Springer, Jul. 2016, pp. 129–39.
- J. Schmidhuber, “Optimal ordered problem solver,” Mach. Learn., Vol. 54, no. 3, pp. 211–54, 2004. DOI: 10.1023/B:MACH.0000015880.99707.b2.
- E. Özkural, “Gigamachine: Incremental machine learning on desktop computers.” arXiv preprint arXiv:1709.03413, 2017.
- S. K. Paul, P. Gupta, and P. Bhaumik, “Learning to solve single variable linear equations by universal search with probabilistic program graphs,” in IBICA, Kochi, India, Dec. 2018, pp. 310–20.
- S. K. Paul, and P. Bhaumik, “A reinforcement learning agent based on genetic programming and universal search,” in 4th ICICCS, IEEE, Madurai, India, May 2020, pp. 122–28.
- D. Silver, and J. Veness, “Monte-Carlo planning in large POMDPs,” in NeurIPS, Vancouver, Canada, 2010, pp. 2164–72.
- N. Ye, A. Somani, D. Hsu, and W. S. Lee, “DESPOT: Online POMDP planning with regularization,” J. Artif. Intell. Res., Vol. 58, pp. 231–66, 2017. DOI: 10.1613/jair.5328.
- Z. N. Sunberg, and M. J. Kochenderfer, “Online algorithms for POMDPs with continuous state, action, and observation spaces,” in 28th International Conference on Automated Planning and Scheduling, Delft, The Netherlands, Jun. 2018.
- H. Kurniawati, and V. Yadav, “An online POMDP solver for uncertainty planning in dynamic environment,” in Robotics Research, Springer Tracts in Advanced Robotics, Vol. 114, M. Inaba, and P. Corke, Ed. Cham: Springer, 2016, pp. 611–29.
- V. Mnih, et al., “Human-level control through deep reinforcement learning,” Nature, Vol. 518, no. 7540, pp. 529–33, 2015. DOI: 10.1038/nature14236.
- M. Hausknecht, and P. Stone, “Deep recurrent q-learning for partially observable MDPs,” in 2015 AAAI Fall Symposium Series, Arlington, Virginia, Sep. 2015.
- T. P. Le, N. A. Vien, and T. Chung, “A deep hierarchical reinforcement learning algorithm in partially observable Markov decision processes,” IEEE Access, Vol. 6, pp. 49089–102, 2018. DOI: 10.1109/ACCESS.2018.2854283.
- L. Meng, R. Gorbet, and D. Kulić, “Memory-based deep reinforcement learning for POMDP,” arXiv preprint arXiv:2102.12344, 2021.
- R. S. Bird, and P. L. Wadler, An Introduction to Functional Programming. Hoboken, NJ: Prentice Hall, 1988.
- P. Wadler, “Comprehending monads,” in Proceedings of the 1990 ACM Conference on LISP and Functional Programming, May, 1990, pp. 61–78.
- J. Hughes, “Generalising monads to arrows,” Sci. Comput. Program., Vol. 37, no. 1–3, pp. 67–111, 2000. DOI: 10.1016/S0167-6423(99)00023-4.
- J. Schmidhuber, J. Zhao, and M. Wiering, “Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self improvement,” Mach. Learn., Vol. 28, no. 1, pp. 105–30, 1997. DOI: 10.1023/A:1007383707642.
- P. D. Grünwald, and P. M. Vitányi, “Algorithmic information theory,” in Handbook of the Philosophy of Information, P. Adriaans, and J. van Benthem, Ed. Amsterdam, Netherlands: Elsevier, 2008, pp. 281–320.
- T. M. Cover, and J. A. Thomas, Elements of Information Theory. Hoboken, NJ: John Wiley & Sons, 2012.
- S. K. Paul, and P. Bhaumik, “A fast universal search by equivalent program pruning,” in ICACCI, Jaipur, India, Sep. 2016, pp. 454–60.
- J. Veness, K. S. Ng, M. Hutter, W. Uther, and D. Silver, “A Monte-Carlo AIXI approximation,” J. Artif. Intell. Res., Vol. 40, pp. 95–142, 2011. DOI: 10.1613/jair.3125.