ABSTRACT
Goal-driven autonomy is an agent model for managing a dynamic environment by reasoning about current and potential goals while planning and acting. Since unexpected events and conditions may cause an agent’s goals and plans to become invalid or infeasible, an agent with goal-driven autonomy should monitor the environment against its expectations. Designed for dynamic, open, and partially observable environments, such an agent can create new goals or change its existing goals as needed. We present a formalisation of expectations for agents operating in these kinds of environments. Our formalisation includes situations where agents have the capability to sense the environment with some associated costs. We examine agent choices and behaviour in these domains and evaluate multiple approaches for selecting a subset of the agent’s sensing actions to execute. The contributions of this work are (1) a specification of different approaches to generating expectations; (2) a formalisation of the autonomy problem that minimises sensing costs; (3) a complexity analysis of the problem; (4) new algorithms for deciding which sensing actions to perform; and (5) empirical results demonstrating the benefit and cost of these approaches.
Acknowledgments
This research was supported by the Office of Naval Research under grant N00014-18-1-2009, by the Air Force Office of Scientific Research under grant FA2386-17-1-4063, and by the National Science Foundation under grants 1909879, 1217888 and 1849131. We also thank the anonymous reviewers for their comments. The views, opinions and findings expressed are those of the authors and should not be interpreted as representing the official views or policies of Navatek LLC, the Department of Defense, or the U.S. Government.
Disclosure statement
No potential conflict of interest was reported by the authors.
Notes
1. For further details, see the overview of GDA in Klenk et al. (2013).
2. Another way to think about frequency is as the percentage of total sensing that occurs per executed action. With a frequency of 1, 100% of sensing occurs with every action; with a frequency of 2, 50%; with a frequency of 5, 20%; and so on. An infinite frequency means that 0% of sensing occurs per action, and sensing instead occurs only when the goal is believed to be true. The frequency that minimises total sensing cost depends on how dynamic the environment is: more dynamic domains warrant more frequent sensing, that is, a smaller frequency value.
3. Explanation is the process of determining the causes that led to an observed discrepancy. We direct the interested reader to prior work on explanation in GDA systems, such as Molineaux et al. (2012).
4. Goal formulation is the process by which new goals are generated dynamically. For examples in GDA systems, see Johnson et al. (2018).
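
The arithmetic in Note 2 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the authors' implementation: the function names `sensing_fraction` and `sensing_steps` are hypothetical, and `sensing_steps` assumes one possible reading of the note, namely that a frequency of k means sensing after every k-th executed action. The goal check that triggers sensing under an infinite frequency is outside the scope of the sketch.

```python
import math
from typing import List


def sensing_fraction(frequency: float) -> float:
    """Fraction of total sensing that occurs per executed action (Note 2).

    A frequency of 1 gives 1.0 (100%), 2 gives 0.5, 5 gives 0.2,
    and an infinite frequency gives 0.0.
    """
    if frequency < 1:
        raise ValueError("frequency must be at least 1")
    if math.isinf(frequency):
        return 0.0
    return 1.0 / frequency


def sensing_steps(plan_length: int, frequency: float) -> List[int]:
    """1-based indices of the actions after which sensing is performed.

    Assumption (not stated in the source): a frequency of k means the
    agent senses after every k-th action. With an infinite frequency no
    per-action sensing is scheduled; sensing when the goal is believed
    to be true would be handled elsewhere.
    """
    if frequency < 1:
        raise ValueError("frequency must be at least 1")
    if math.isinf(frequency):
        return []
    step = int(frequency)
    return list(range(step, plan_length + 1, step))
```

For a ten-action plan, `sensing_steps(10, 5)` schedules sensing after the fifth and tenth actions, while `sensing_steps(10, 1)` schedules it after every action, which matches the intuition that more dynamic domains call for a smaller frequency value.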