References
- P. Auer , N. Cesa-Bianchi , and P. Fischer , Finite-time analysis of the multiarmed bandit problem , Mach. Learn. 47 (2002), pp. 235–256.
- S. Bubeck and N. Cesa-Bianchi , Regret analysis of stochastic and nonstochastic multi-armed bandit problems , Found. Trends Mach. Learn. 5 (2012), pp. 1–122.
- T.L. Lai and H. Robbins , Asymptotically efficient adaptive allocation rules for the multiarmed bandit problem with switching cost , Adv. Appl. Math. 6 (1985), pp. 4–22.
- K. Oyo , M. Ichino , and T. Takahashi , Cognitive validity of a causal value function with loose symmetry and its effectiveness for N-armed bandit problems , Trans. Jpn. Soc. Artif. Intell. 30(2) (2015), pp. 403–416.
- E. Rubin , Figure and ground , in Readings in Perception , D.C. Beardslee and M. Wertheimer , eds., D. Van Nostrand, Princeton (NJ), 1958, pp. 194–209.
- S. Shinohara , R. Taguchi , K. Katsurada , and T. Nitta , A model of belief formation based on causality and application to N-armed bandit problem , Trans. Jpn. Soc. Artif. Intell. 22(1) (2007), pp. 58–68.
- H.A. Simon . Rational choice and the structure of the environment , Psychol. Rev. 63 (1956), pp. 129–138.
- T. Takahashi , M. Nakano , and S. Shinohara , Cognitive symmetry: Illogical but rational biases , Symmet: Cult. Sci. 21 (2010), pp. 275–294.
- D. Uragami , T. Takahashi , and Y. Matsuo , Cognitively inspired reinforcement learning architecture and its application to giant-swing motion control , BioSystems 116 (2014), pp. 1–9.