
A reinforcement learning model of joy, distress, hope and fear

Pages 215-233 | Received 05 Sep 2014, Accepted 25 Feb 2015, Published online: 23 Apr 2015

Figures & data

Figure 1. The maze used in the experiments. Each square is a state. The agent starts at S and the reward can (initially) be found at R. For the simulations that test fear extinction and the effect of expectation of return, the end of all non-goal arms in the maze can contain a randomly placed punishment (see text).

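To make the setup concrete, here is a minimal sketch of such a maze as a tabular MDP. The concrete arm layout and the reward/punishment magnitudes (+1/−1) are illustrative assumptions; the caption fixes only the start state S, the goal R, and the optionally punished ends of the non-goal arms.

```python
import random

# Sketch of the Figure 1 maze as a tabular MDP. Layout and magnitudes are
# assumptions; only S, R and the punished non-goal arm ends come from the text.
ARMS = {                        # arm name -> states along the arm, last = end
    "goal":  ["g1", "g2", "R"],
    "left":  ["l1", "l2", "l_end"],
    "right": ["r1", "r2", "r_end"],
}
START = "S"

def make_reward_fn(punish=False):
    """Return a reward function over states; if punish=True, one randomly
    chosen non-goal arm end carries a punishment, as in the fear simulations."""
    non_goal_ends = [arm[-1] for name, arm in ARMS.items() if name != "goal"]
    punished = random.choice(non_goal_ends) if punish else None

    def reward(state):
        if state == "R":
            return 1.0          # goal reward (magnitude assumed)
        if state == punished:
            return -1.0         # randomly placed punishment (magnitude assumed)
        return 0.0
    return reward
```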

Table 1. Control and varied values of different parameters used in the simulations.

Figure 2. Intensity of joy/distress for a single agent, observed in the first 2000 steps. Later intensity of joy is strongly reduced compared to the intensity resulting from the first goal encounter (spike at t=100).

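This shape is characteristic of a temporal-difference error: the first goal encounter is entirely unexpected, so the error (and the joy spike) is large, and as the state values converge the error shrinks toward zero. Below is a minimal sketch of that reading, assuming joy and distress are the positive and negative parts of the TD error; the paper's exact intensity formula may differ.

```python
def td_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """One tabular TD(0) update on the state-value table V;
    returns the TD error delta = r + gamma*V(s') - V(s)."""
    delta = r + gamma * V[s_next] - V[s]
    V[s] += alpha * delta
    return delta

def joy_distress(delta):
    """Sketch: joy as the positive part of the TD error, distress as the
    negative part (assumed mapping, for illustration only)."""
    return max(delta, 0.0), max(-delta, 0.0)
```

On the first goal visit delta is roughly the full reward, giving the large spike at t=100; once V(s) approximates r + gamma*V(s'), delta is near zero and later joy is strongly reduced, as in the figure.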

Figure 3. Intensity of joy/distress, mean over 50 agents, observed in the first 2000 steps. The noisy signal in the first 1000 steps reflects the fact that this is a non-smoothed average of only 50 agents, with each agent producing a spike of joy for the first goal encounter.


Figure 4. Intensity of hope (V+(s)), mean over 50 agents, observed in the first 2000 steps. Over the course of learning, the hope signal grows to reflect the anticipation of the goal reward, while in the first several hundred steps virtually no hope is present.

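The notation V+(s) suggests that hope is read off as the positive part of the learned state value, which would explain why hope is absent before value has propagated back from the goal. A one-line sketch under that assumption:

```python
def hope(V, s):
    """Sketch: hope as the positive part of the state value, V+(s).
    Early in learning V is ~0 everywhere, so hope is absent; as value
    propagates back from the goal, hope grows (assumed reading)."""
    return max(V[s], 0.0)
```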

Figure 5. Left: intensity of fear, mean over 500 agents, 5000 steps, for four conditions: H/L (high-hope/low-hope), and N/P (no punishment versus punished arms). Fear extinguishes over time, and low-hope agents (Bellman update) generate more fear in the presence of punishment than high-hope agents (MAXa update). Right: zoom-in of the left panel for the first 500 steps.

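One way to read the H/L contrast is as two bootstrap targets: a "low-hope" Bellman update toward the value of the sampled successor, versus a "high-hope" MAXa update toward the best action's value (Q-learning style), which is more optimistic and therefore discounts occasional punishment. The sketch below shows both targets, plus a fear signal as the magnitude of negative anticipated value; these are assumed readings of the caption, not the paper's exact definitions.

```python
def bellman_target(V, r, s_next, gamma=0.9):
    """'Low-hope' target: bootstrap on the sampled successor's value."""
    return r + gamma * V[s_next]

def maxa_target(Q, r, s_next, gamma=0.9):
    """'High-hope' MAXa target: bootstrap on the best action in s'
    (Q-learning style), which stays optimistic under punishment."""
    return r + gamma * max(Q[s_next].values())

def fear(V, s):
    """Sketch: fear as the magnitude of negative anticipated value;
    as punished arms are learned and avoided, V recovers toward zero
    and fear extinguishes, matching Figure 5 (assumed mapping)."""
    return max(-V[s], 0.0)
```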

Figure 6. Intensity of hope, mean over 500 agents, 5000 steps, for four conditions: H/L (high-hope/low-hope), and N/P (no punishment versus punished arms). Low-hope agents (Bellman update) perform worse in the presence of punished non-goal maze arms, while high-hope (MAXa) agents perform better in the presence of such punishment.


Figure 7. State value interpreted as the agent's “experience”, mean over 500 agents, 500 steps, for four conditions: H/L (high-hope/low-hope), and N/P (no punishment versus punished arms).


Figure 8. Intensity of joy, mean over 50 agents. The left panel shows the difference between a probability of 0.1 (first run) versus 0.25 (second run) of failing an action. The right panel shows the difference between returning the agent to its starting position (first run) versus relocating the reward (second run).

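For the left panel, the relevant manipulation is transition noise. A hedged sketch of that manipulation, assuming a failed action simply leaves the agent where it is (the caption fixes only the failure probabilities, 0.1 and 0.25):

```python
import random

def noisy_transition(step_fn, state, action, p_fail=0.1):
    """Apply the deterministic transition step_fn(state, action), except
    that with probability p_fail the action fails. Treating failure as a
    no-op is an assumption; the caption specifies only p_fail."""
    if random.random() < p_fail:
        return state               # failed action: assumed to be a no-op
    return step_fn(state, action)
```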
