Abstract
Many theories propose that top-down attentional signals control processing in sensory cortices by modulating neural activity. But who controls the controller? Here we investigate how a biologically plausible neural reinforcement learning scheme can create higher order representations and top-down attentional signals. The learning scheme trains neural networks using two factors that gate Hebbian plasticity: (1) an attentional feedback signal from the response-selection stage to earlier processing levels; and (2) a globally available neuromodulator that encodes the reward prediction error. We demonstrate how the neural network learns to direct attention to one of two coloured stimuli that are arranged in a rank-order. Like monkeys trained on this task, the network develops units that are tuned to the rank-order of the colours and it generalizes this newly learned rule to previously unseen colour combinations. These results provide new insight into how individuals can learn to control attention as a function of reward contingency.
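The two gating factors described above can be illustrated with a minimal sketch of a generic three-factor Hebbian update. This is an assumption-laden illustration, not the authors' actual learning rule: the abstract does not state the update equation, and the function name `gated_hebbian_update`, the learning rate `lr`, and the multiplicative form of the gating are all hypothetical choices made here for clarity.

```python
def gated_hebbian_update(weights, pre, feedback, rpe, lr=0.1):
    """Return new weights after one gated Hebbian step (illustrative only).

    Assumed rule: w[i][j] += lr * rpe * pre[i] * feedback[j], where
      pre[i]      is the activity of presynaptic unit i,
      feedback[j] is the attentional feedback reaching postsynaptic unit j,
      rpe         is a global reward prediction error (one scalar for all synapses).
    """
    return [
        [w_ij + lr * rpe * pre_i * fb_j for w_ij, fb_j in zip(row, feedback)]
        for row, pre_i in zip(weights, pre)
    ]


# Toy example: two input units, two downstream units.
weights = [[0.0, 0.0], [0.0, 0.0]]
pre = [1.0, 0.0]        # only the first input unit is active
feedback = [0.0, 1.0]   # only the second downstream unit receives feedback
rpe = 0.5               # outcome was better than predicted

updated = gated_hebbian_update(weights, pre, feedback, rpe)
```

In this sketch only the synapse whose presynaptic unit was active and whose postsynaptic unit received feedback changes; a negative `rpe` would weaken that same synapse instead of strengthening it.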
No potential conflict of interest was reported by the authors.
The work was supported by an NWO-EW grant [grant number 612.066.826], an NWO-VICI grant, a Brain and Cognition grant [grant number 433-09-208], the European Union Seventh Framework Program (project 269921 “BrainScaleS” and PITN-GA-2011-290011 “ABC”) and an ERC advanced grant [grant number 339490].