Research Article

A knowledge-driven layered inverse reinforcement learning approach for recognizing human intents

Pages 1015-1044 | Received 13 Feb 2019, Accepted 05 Jan 2020, Published online: 04 Feb 2020

References

  • Abbeel, P., & Ng, A. Y. (2004). Apprenticeship learning via inverse reinforcement learning. Proceedings of the twenty-first international conference on machine learning (pp. 1). New York, NY, USA: ACM.
  • Andersen, G., Vrancx, P., & Bou-Ammar, H. (2018). Learning high-level representations from demonstrations. Retrieved from https://arxiv.org/pdf/1802.06604.pdf
  • Arora, S., & Doshi, P. (2018). A survey of inverse reinforcement learning: Challenges, methods and progress. Retrieved from https://arxiv.org/abs/1806.06877
  • Babeş-Vroman, M., Marivate, V., Subramanian, K., & Littman, M. (2011). Apprenticeship learning about multiple intentions. Proceedings of the 28th international conference on international conference on machine learning (pp. 897–904). USA: Omnipress.
  • Bhattacharyya, R., Bhuyan, Z., & Hazarika, S. M. (2017). O-PRO: An ontology for object affordance reasoning. In A. Basu, S. Das, P. Horain, & S. Bhattacharya (Eds.), Intelligent human computer interaction: 8th international conference, IHCI 2016, Pilani, India, December 12–13, 2016, proceedings (pp. 39–50). Cham: Springer International Publishing.
  • Bhattacharyya, R., & Hazarika, S. M. (2018). Object affordance driven inverse reinforcement learning through conceptual abstraction and advice. Paladyn, Journal of Behavioral Robotics, 9(1), 277–294.
  • Bonet, B., & Pearl, J. (2002). Qualitative MDPs and POMDPs: An order-of-magnitude approximation. Proceedings of the eighteenth conference on uncertainty in artificial intelligence (pp. 61–68). San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.
  • Braga, A. P., & Araújo, A. F. (2003). A topological reinforcement learning agent for navigation. Neural Computing & Applications, 12(3–4), 220–236.
  • Braga, A. P. D. S., & Araújo, A. F. (2006). Influence zones: A strategy to enhance reinforcement learning. Neurocomputing, 70(1–3), 21–34.
  • Choi, J., & Kim, K.-E. (2012). Nonparametric Bayesian inverse reinforcement learning for multiple reward functions. In F. Pereira, C. J. C. Burges, L. Bottou, & K. Q. Weinberger (Eds.), Advances in neural information processing systems 25 (pp. 305–313). United States of America: Curran Associates, Inc.
  • Clementini, E., Di Felice, P., & Hernández, D. (1997). Qualitative representation of positional information. Artificial Intelligence, 95(2), 317–356.
  • Cohn, A. G., & Hazarika, S. M. (2001). Qualitative spatial representation and reasoning: An overview. Fundamenta Informaticae, 46(1–2), 1–29.
  • Cohn, A. G., Renz, J., & Sridhar, M. (2012). Thinking inside the box: A comprehensive spatial representation for video analysis. Proceedings of the thirteenth international conference on principles of knowledge representation and reasoning (pp. 588–592). Rome, Italy: AAAI Press.
  • Cruz, F., Magg, S., Weber, C., & Wermter, S. (2016). Training agents with interactive reinforcement learning and contextual affordances. IEEE Transactions on Cognitive and Developmental Systems, 8(4), 271–284.
  • Cruz, F., Parisi, G. I., & Wermter, S. (2018). Multi-modal feedback for affordance-driven interactive reinforcement learning. 2018 international joint conference on neural networks (pp. 1–8). Rio de Janeiro, Brazil: IEEE.
  • Dai, P., Strehl, A. L., & Goldsmith, J. (2008). Expediting RL by using graphical structures. Proceedings of the 7th international joint conference on autonomous agents and multiagent systems - volume 3 (pp. 1325–1328). Richland, SC: International Foundation for Autonomous Agents and Multiagent Systems.
  • Davis, E., & Marcus, G. (2016). The scope and limits of simulation in automated reasoning. Artificial Intelligence, 233, 60–72.
  • Dayan, P., & Hinton, G. E. (1993). Feudal reinforcement learning. In S. J. Hanson, J. D. Cowan, & C. L. Giles (Eds.), Advances in neural information processing systems 5 (pp. 271–278). Colorado, USA: Morgan-Kaufmann.
  • Deng, Z., Mi, J., Han, D., Huang, R., Xiong, X., & Zhang, J. (2017). Hierarchical robot learning for physical collaboration between humans and robots. 2017 IEEE international conference on robotics and biomimetics (ROBIO) (pp. 750–755). Macau, China: IEEE.
  • Dimitrakakis, C., & Rothkopf, C. A. (2012). Bayesian multitask inverse reinforcement learning. In S. Sanner & M. Hutter (Eds.), Recent advances in reinforcement learning (pp. 273–284). Berlin, Heidelberg: Springer Berlin Heidelberg.
  • Falomir, Z., Jiménez-Ruiz, E., Museros, L., & Escrig, M. T. (2009). An ontology for qualitative description of images. Spatial and Temporal Reasoning for Ambient Intelligence Systems, 21–31.
  • Frommberger, L. (2010). Qualitative spatial abstraction in reinforcement learning. Germany: Springer Science & Business Media.
  • Garnelo, M., Arulkumaran, K., & Shanahan, M. (2016). Towards deep symbolic reinforcement learning. Retrieved from https://arxiv.org/abs/1609.05518
  • Hafez, M. B., & Loo, C. K. (2015). Topological q-learning with internally guided exploration for mobile robot navigation. Neural Computing and Applications, 26(8), 1939–1954.
  • Horton, T. E., Chakraborty, A., & Amant, R. S. (2012). Affordances for robots: A brief survey. AVANT. Pismo Awangardy Filozoficzno-Naukowej, 2, 70–84.
  • James, M. (2013). BURLAP (Brown-UMBC Reinforcement Learning and Planning). Retrieved from http://burlap.cs.brown.edu/index.html
  • Kok, S., Sumner, M., Richardson, M., Singla, P., Poon, H., & Domingos, P. (2006). The Alchemy system for statistical relational AI (Technical report). Seattle, WA: Department of Computer Science and Engineering, University of Washington.
  • Kolter, J. Z., Abbeel, P., & Ng, A. Y. (2007). Hierarchical apprenticeship learning, with application to quadruped locomotion. Proceedings of the 20th international conference on neural information processing systems (pp. 769–776). USA: Curran Associates Inc. Retrieved from http://dl.acm.org/citation.cfm?id=2981562.2981659
  • Köpf, F., Inga, J., Rothfuß, S., Flad, M., & Hohmann, S. (2017). Inverse reinforcement learning for identification in linear-quadratic dynamic games. IFAC-PapersOnLine, 50(1), 14902–14908.
  • Koppula, H. S., Gupta, R., & Saxena, A. (2013). Learning human activities and object affordances from RGB-D videos. The International Journal of Robotics Research, 32(8), 951–970.
  • Krishnan, S., Garg, A., Liaw, R., Thananjeyan, B., Miller, L., Pokorny, F. T., & Goldberg, K. (2019). SWIRL: A sequential windowed inverse reinforcement learning algorithm for robot tasks with delayed rewards. The International Journal of Robotics Research, 38(2–3), 126–145.
  • Kulkarni, T. D., Narasimhan, K. R., Saeedi, A., & Tenenbaum, J. B. (2016). Hierarchical deep reinforcement learning: Integrating temporal abstraction and intrinsic motivation. Proceedings of the 30th international conference on neural information processing systems (pp. 3682–3690). USA: Curran Associates Inc.
  • Kunapuli, G., Odom, P., Shavlik, J. W., & Natarajan, S. (2013). Guiding autonomous agents to better behaviors through human advice. 2013 IEEE 13th international conference on data mining (pp. 409–418). Dallas, Texas, USA: IEEE.
  • Lake, B. M., Salakhutdinov, R., & Tenenbaum, J. B. (2015). Human-level concept learning through probabilistic program induction. Science, 350(6266), 1332–1338.
  • Lang, T., & Toussaint, M. (2010). Planning with noisy probabilistic relational rules. Journal of Artificial Intelligence Research, 39, 1–49.
  • Lopes, M., Melo, F., & Montesano, L. (2009). Active learning for reward estimation in inverse reinforcement learning. In W. Buntine, M. Grobelnik, D. Mladenić, & J. Shawe-Taylor (Eds.), Machine learning and knowledge discovery in databases (pp. 31–46). Berlin, Heidelberg: Springer Berlin Heidelberg.
  • Lopes, M., Melo, F. S., & Montesano, L. (2007). Affordance-based imitation learning in robots. 2007 IEEE/RSJ international conference on intelligent robots and systems (pp. 1015–1021). San Diego, CA, USA: IEEE.
  • Mazumder, S., Liu, B., Wang, S., Zhu, Y., Liu, L., & Li, J. (2018). Action permissibility in deep reinforcement learning and application to autonomous driving. Retrieved from https://www.kdd.org/kdd2018/files/deep-learning-day/DLDay18paper41.pdf
  • Metelli, A. M., Pirotta, M., & Restelli, M. (2017). Compatible reward inverse reinforcement learning. In I. Guyon, U. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. V. N. Vishwanathan, & R. Garnett (Eds.), Advances in neural information processing systems 30 (pp. 2050–2059). United States: Curran Associates, Inc.
  • Min, H., Yi, C., Luo, R., Zhu, J., & Bi, S. (2016). Affordance research in developmental robotics: A survey. IEEE Transactions on Cognitive and Developmental Systems, 8(4), 237–255.
  • Monfort, M., Liu, A., & Ziebart, B. D. (2015). Intent prediction and trajectory forecasting via predictive inverse linear-quadratic regulation. Proceedings of the twenty-ninth AAAI conference on artificial intelligence (pp. 3672–3678). Austin, Texas: AAAI Press.
  • Odom, P., & Natarajan, S. (2016). Active advice seeking for inverse reinforcement learning. Proceedings of the 2016 international conference on autonomous agents & multiagent systems (pp. 512–520). Richland, SC: International Foundation for Autonomous Agents and Multiagent Systems. Retrieved from http://dl.acm.org/citation.cfm?id=2936924.2936998
  • Ramachandran, D., & Amir, E. (2007). Bayesian inverse reinforcement learning. Proceedings of the 20th international joint conference on artificial intelligence (pp. 2586–2591). Hyderabad, India: Morgan Kaufmann Publishers Inc.
  • Reyes, A., Sucar, L. E., Morales, E., & Ibarguengoytia, P. (2005). Learning qualitative Markov decision processes. NIPS 2005 workshop on machine learning based robotics in unstructured environments. British Columbia, Canada.
  • Richardson, M., & Domingos, P. (2006). Markov logic networks. Machine Learning, 62(1–2), 107–136.
  • Rothkopf, C. A., & Ballard, D. H. (2013, August). Modular inverse reinforcement learning for visuomotor behavior. Biological Cybernetics, 107(4), 477–490.
  • Rothkopf, C. A., & Dimitrakakis, C. (2011). Preference elicitation and inverse reinforcement learning. In D. Gunopulos, T. Hofmann, D. Malerba, & M. Vazirgiannis (Eds.), Machine learning and knowledge discovery in databases (pp. 34–48). Berlin, Heidelberg: Springer Berlin Heidelberg.
  • Savarimuthu, T. R., Buch, A. G., Schlette, C., Wantia, N., Roßmann, J., Martínez, D., et al. (2018). Teaching a robot the semantics of assembly tasks. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 48(5), 670–692.
  • Shukla, N., He, Y., Chen, F., & Zhu, S.-C. (2017, November 13–15). Learning human utility from video demonstrations for deductive planning in robotics. In S. Levine, V. Vanhoucke, & K. Goldberg (Eds.), Proceedings of the 1st annual conference on robot learning (Vol. 78, pp. 448–457). California: PMLR.
  • Telgerdi, F., Khalilian, A., & Pouyan, A. A. (2014). Qualitative reinforcement learning to accelerate finding an optimal policy. 2014 4th international conference on computer and knowledge engineering (ICCKE) (pp. 575–580). Mashhad, Iran: IEEE.
  • Thippur, A., Burbridge, C., Kunze, L., Alberti, M., Folkesson, J., Jensfelt, P., & Hawes, N. (2015). A comparison of qualitative and metric spatial relation models for scene understanding. Proceedings of the twenty-ninth AAAI conference on artificial intelligence (pp. 1632–1640). Austin, Texas: AAAI Press.
  • Tran, S. D., & Davis, L. S. (2008). Event modeling and recognition using Markov logic networks. In D. Forsyth, P. Torr, & A. Zisserman (Eds.), Computer vision – ECCV 2008 (pp. 610–623). Berlin, Heidelberg: Springer Berlin Heidelberg.
  • van Otterlo, M. (2012). Solving relational and first-order logical Markov decision processes: A survey. In M. Wiering & M. van Otterlo (Eds.), Reinforcement learning: State-of-the-art (pp. 253–292). Berlin, Heidelberg: Springer Berlin Heidelberg.
  • van Otterlo, M. (2009). The logic of adaptive behavior. Amsterdam: IOS Press.
  • Wen, M., Papusha, I., & Topcu, U. (2017). Learning from demonstrations with high-level side information. Proceedings of the twenty-sixth international joint conference on artificial intelligence, IJCAI-17 (pp. 3055–3061). Melbourne, Australia: AAAI Press.
  • Zeng, Y., Xu, K., Yin, Q., Qin, L., Zha, Y., & Yeoh, W. (2018, February 2–7). Inverse reinforcement learning based human behavior modeling for goal recognition in dynamic local network interdiction. The workshops of the thirty-second AAAI conference on artificial intelligence (pp. 646–653). New Orleans, Louisiana, USA. Retrieved from https://aaai.org/ocs/index.php/WS/AAAIW18/paper/view/16162
  • Zhang, S. (2015). Parameterized modular inverse reinforcement learning (Unpublished doctoral dissertation). The University of Texas at Austin.
  • Zhifei, S., & Joo, E. M. (2012). A review of inverse reinforcement learning theory and recent advances. 2012 IEEE congress on evolutionary computation (pp. 1–8). Australia: IEEE.
  • Ziebart, B. D., Maas, A., Bagnell, J. A., & Dey, A. K. (2008). Maximum entropy inverse reinforcement learning. Proceedings of the 23rd national conference on artificial intelligence (pp. 1433–1438). Chicago, Illinois: AAAI Press.
