Abstract
Evaluation of both immediate and future outcomes of one's actions is a critical requirement for intelligent behavior. Using functional magnetic resonance imaging (fMRI), we investigated brain mechanisms for reward prediction at different time scales in a Markov decision task. When human subjects learned actions on the basis of immediate rewards, significant activity was seen in the lateral orbitofrontal cortex and the striatum. When subjects learned to act in order to obtain large future rewards while incurring small immediate losses, the dorsolateral prefrontal cortex, inferior parietal cortex, dorsal raphe nucleus and cerebellum were also activated. Computational modelβbased regression analysis using the predicted future rewards and prediction errors estimated from subjects' performance data revealed graded maps of time scale within the insula and the striatum: ventroanterior regions were involved in predicting immediate rewards and dorsoposterior regions were involved in predicting future rewards. These results suggest differential involvement of the cortico-basal ganglia loops in reward prediction at different time scales.
This is a preview of subscription content, access via your institution
Access options
Subscribe to this journal
Receive 12 print issues and online access
$259.00 per year
only $21.58 per issue
Buy this article
- Purchase on SpringerLink
- Instant access to the full article PDF.
USD 39.95
Prices may be subject to local taxes which are calculated during checkout
Similar content being viewed by others
Neural substrates of reward anticipation and outcome in schizophrenia: a meta-analysis of fMRI findings in the monetary incentive delay task
Simulating the impact of white matter connectivity on processing time scales using brain network models
Neural inhibition as implemented by an actor-critic model involves the human dorsal striatum and ventral tegmental area
References
Bechara, A., Damasio, H. & Damasio, A.R. Emotion, decision making and the orbitofrontal cortex. Cereb. Cortex 10, 295β307 (2000).
Mobini, S. et al. Effects of lesions of the orbitofrontal cortex on sensitivity to delayed and probabilistic reinforcement. Psychopharmacology (Berl.) 160, 290β298 (2002).
Cardinal, R.N., Pennicott, D.R., Sugathapala, C.L., Robbins, T.W. & Everitt, B.J. Impulsive choice induced in rats by lesions of the nucleus accumbens core. Science 292, 2499β2501 (2001).
Rogers, R.D. et al. Dissociable deficits in the decision-making cognition of chronic amphetamine abusers, opiate abusers, patients with focal damage to prefrontal cortex, and tryptophan-depleted normal volunteers: evidence for monoaminergic mechanisms. Neuropsychopharmacology 20, 322β339 (1999).
Evenden, J.L. & Ryan, C.N. The pharmacology of impulsive behaviour in rats: the effects of drugs on response choice with varying delays of reinforcement. Psychopharmacology (Berl.) 128, 161β170 (1996).
Mobini, S., Chiang, T.J., Ho, M.Y., Bradshaw, C.M. & Szabadi, E. Effects of central 5-hydroxytryptamine depletion on sensitivity to delayed and probabilistic reinforcement. Psychopharmacology (Berl.) 152, 390β397 (2000).
Doya, K. Metalearning and neuromodulation. Neural Net. 15, 495β506 (2002).
Berns, G.S., McClure, S.M., Pagnoni, G. & Montague, P.R. Predictability modulates human brain response to reward. J. Neurosci. 21, 2793β2798 (2001).
Breiter, H.C., Aharon, I., Kahneman, D., Dale, A. & Shizgal, P. Functional imaging of neural responses to expectancy and experience of monetary gains and losses. Neuron 30, 619β639 (2001).
O'Doherty, J.P., Deichmann, R., Critchley, H.D. & Dolan, R.J. Neural responses during anticipation of a primary taste reward. Neuron 33, 815β826 (2002).
O'Doherty, J.P., Dayan, P., Friston, K., Critchley, H. & Dolan, R.J. Temporal difference models and reward-related learning in the human brain. Neuron 38, 329β337 (2003).
Sutton, R.S. & Barto, A.G. Reinforcement Learning (MIT Press, Cambridge, Massachusetts, 1998).
Houk, J.C., Adams, J.L. & Barto, A.G. in Models of Information Processing in the Basal Ganglia (eds. Houk, J.C., Davis, J.L. & Beiser, D.G.) 249β270 (MIT Press, Cambridge, Massachusetts, 1995).
Schultz, W., Dayan, P. & Montague, P.R. A neural substrate of prediction and reward. Science 275, 1593β1599 (1997).
Doya, K. Complementary roles of basal ganglia and cerebellum in learning and motor control. Curr. Opin. Neurobiol. 10, 732β739 (2000).
McClure, S.M., Berns, G.S. & Montague, P.R. Temporal prediction errors in a passive learning task activate human striatum. Neuron 38, 339β346 (2003).
Mesulam, M.M. & Mufson, E.J. Insula of the old world monkey. III: Efferent cortical output and comments on function. J. Comp. Neurol. 212, 38β52 (1982).
Cavada, C., Company, T., Tejedor, J., Cruz-Rizzolo, R.J. & Reinoso-Suarez, F. The anatomical connections of the macaque monkey orbitofrontal cortex. Cereb. Cortex 10, 220β242 (2000).
Chikama, M., McFarland, N.R., Amaral, D.G. & Haber, S.N. Insular cortical projections to functional regions of the striatum correlate with cortical cytoarchitectonic organization in the primate. J. Neurosci. 17, 9686β9705 (1997).
Balleine, B.W. & Dickinson, A. The effect of lesions of the insular cortex on instrumental conditioning: evidence for a role in incentive memory. J. Neurosci. 20, 8954β8964 (2000).
Knutson, B., Fong, G.W., Bennett, S.M., Adams, C.M. & Hommer, D. A region of mesial prefrontal cortex tracks monetarily rewarding outcomes: characterization with rapid event-related fMRI. Neuroimage 18, 263β272 (2003).
Ullsperger, M. & von Cramon, D.Y. Error monitoring using external feedback: specific roles of the habenular complex, the reward system, and the cingulate motor area revealed by functional magnetic resonance imaging. J. Neurosci. 23, 4308β4314 (2003).
O'Doherty, J., Critchley, H., Deichmann, R. & Dolan, R.J. Dissociating valence of outcome from behavioral control in human orbital and ventral prefrontal cortices. J. Neurosci. 23, 7931β7939 (2003).
Koepp, M.J. et al. Evidence for striatal dopamine release during a video game. Nature 393, 266β268 (1998).
Elliott, R., Friston, K.J. & Dolan, R.J. Dissociable neural responses in human reward systems. J. Neurosci. 20, 6159β6165 (2000).
Knutson, B., Adams, C.M., Fong, G.W. & Hommer, D. Anticipation of increasing monetary reward selectively recruits nucleus accumbens. J. Neurosci. 21, RC159 (2001).
Pagnoni, G., Zink, C.F., Montague, P.R. & Berns, G.S. Activity in human ventral striatum locked to errors of reward prediction. Nat. Neurosci. 5, 97β98 (2002).
Elliott, R., Newman, J.L., Longe, O.A. & Deakin, J.F. Differential response patterns in the striatum and orbitofrontal cortex to financial reward in humans: a parametric functional magnetic resonance imaging study. J. Neurosci. 23, 303β307 (2003).
Haruno, M. et al. A neural correlate of reward-based behavioral learning in caudate nucleus: a functional magnetic resonance imaging study of a stochastic decision task. J. Neurosci. 24, 1660β1665 (2004).
Reynolds, J.N. & Wickens, J.R. Dopamine-dependent plasticity of corticostriatal synapses. Neural Net. 15, 507β521 (2002).
Tremblay, L. & Schultz, W. Reward-related neuronal activity during go-nogo task performance in primate orbitofrontal cortex. J. Neurophysiol. 83, 1864β1876 (2000).
Critchley, H.D., Mathias, C.J. & Dolan, R.J. Neural activity in the human brain relating to uncertainty and arousal during anticipation. Neuron 29, 537β545 (2001).
Rogers, R.D. et al. Choosing between small, likely rewards and large, unlikely rewards activates inferior and orbital prefrontal cortex. J. Neurosci. 19, 9029β9038 (1999).
Rolls, E.T. The orbitofrontal cortex and reward. Cereb. Cortex 10, 284β294 (2000).
Hanakawa, T. et al. The role of rostral Brodmann area 6 in mental-operation tasks: an integrative neuroimaging approach. Cereb. Cortex 12, 1157β1170 (2002).
Owen, A.M., Doyon, J., Petrides, M. & Evans, A.C. Planning and spatial working memory: a positron emission tomography study in humans. Eur. J. Neurosci. 8, 353β364 (1996).
Baker, S.C. et al. Neural systems engaged by planning: a PET study of the Tower of London task. Neuropsychologia 34, 515β526 (1996).
Middleton, F.A. & Strick, P.L. Basal ganglia and cerebellar loops: motor and cognitive circuits. Brain Res. Brain Res. Rev. 31, 236β250 (2000).
Haber, S.N., Kunishio, K., Mizobuchi, M. & Lynd-Balta, E. The orbital and medial prefrontal circuit through the primate basal ganglia. J. Neurosci. 15, 4851β4867 (1995).
Eagle, D.M., Humby, T., Dunnett, S.B. & Robbins, T.W. Effects of regional striatal lesions on motor, motivational, and executive aspects of progressive-ratio performance in rats. Behav. Neurosci. 113, 718β731 (1999).
Pears, A., Parkinson, J.A., Hopewell, L., Everitt, B.J. & Roberts, A.C. Lesions of the orbitofrontal but not medial prefrontal cortex disrupt conditioned reinforcement in primates. J. Neurosci. 23, 11189β11201 (2003).
Hikosaka, O. et al. Parallel neural networks for learning sequential procedures. Trends Neurosci. 22, 464β471 (1999).
Mijnster, M.J. et al. Regional and cellular distribution of serotonin 5-hydroxytryptamine2a receptor mRNA in the nucleus accumbens, olfactory tubercle, and caudate putamen of the rat. J. Comp. Neurol. 389, 1β11 (1997).
Compan, V., Segu, L., Buhot, M.C. & Daszuta, A. Selective increases in serotonin 5-HT1B/1D and 5-HT2A/2C binding sites in adult rat basal ganglia following lesions of serotonergic neurons. Brain Res. 793, 103β111 (1998).
Celada, P., Puig, M.V., Casanovas, J.M., Guillazo, G. & Artigas, F. Control of dorsal raphe serotonergic neurons by the medial prefrontal cortex: involvement of serotonin-1A, GABA(A), and glutamate receptors. J. Neurosci. 21, 9917β9929 (2001).
Martin-Ruiz, R. et al. Control of serotonergic function in medial prefrontal cortex by serotonin-2A receptors through a glutamate-dependent mechanism. J. Neurosci. 21, 9856β9866 (2001).
Hikosaka, K. & Watanabe, M. Delay activity of orbital and lateral prefrontal neurons of the monkey varying with different rewards. Cereb. Cortex 10, 263β271 (2000).
Shidara, M. & Richmond, B.J. Anterior cingulate: single neuronal signals related to degree of reward expectancy. Science 296, 1709β1711 (2002).
Matsumoto, K., Suzuki, W. & Tanaka, K. Neuronal correlates of goal-based motor selection in the prefrontal cortex. Science 301, 229β232 (2003).
Acknowledgements
We thank K. Samejima, N. Schweighofer, M. Haruno, H. Imamizu, S. Higuchi, T. Yoshioka, T. Chaminade and M. Kawato for helpful discussions and technical advice. This research was funded by 'Creating the Brain,' Core Research for Evolutional Science and Technology (CREST), Japan Science and Technology Agency.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Figure 1 (download PDF )
An example of the time series of the explanatory variables for one subject. (PDF 34 kb)
Supplementary Figure 2 (download PDF )
A schematic diagram of the brain areas involved in reward prediction at different time scales. The dotted lines indicate a cortico-cortico connection, and the green arrows indicate the serotonergic pathways from the dorsal raphe. The 'limbic loop' (including lateral OFC and ventral striatum) is involved in short-term reward prediction. The 'cognitive and motor loops' (including DLPFC, PMd and dorsal striatum) are involved in long-term reward prediction. Ventroanterior-to-dorsoposterior topographical projections from the insula to the striatum are involved in short-to-long-term reward prediction (rainbow-colored arrow). The mPFC and dorsal raphe, which are reciprocally connected, may regulate these loops by cortico-cortical and cortico-striatal projections from mPFC and serotonergic projections from dorsal raphe. SNr, substantia nigra pars reticulate. (PDF 415 kb)
Supplementary Table 1 (download PDF )
Areas significantly activated in the block-design analysis. (PDF 11 kb)
Supplementary Table 2 (download PDF )
Areas with significant correlation with reward prediction V(t) estimated with different discount factors Ξ³. (PDF 15 kb)
Supplementary Table 3 (download PDF )
Voxels with significant correlation with reward prediction error Ξ΄(t) estimated with different discount factors Ξ³. (PDF 15 kb)
Rights and permissions
About this article
Cite this article
Tanaka, S., Doya, K., Okada, G. et al. Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops. Nat Neurosci 7, 887β893 (2004). https://doi.org/10.1038/nn1279
Received:
Accepted:
Published:
Issue date:
DOI: https://doi.org/10.1038/nn1279
Share this article
Anyone you share the following link with will be able to read this content:
Sorry, a shareable link is not currently available for this article.
Provided by the Springer Nature SharedIt content-sharing initiative
