BOLD Responses Reflecting Dopaminergic Signals in the Human Ventral Tegmental Area

American Association for the Advancement of Science (AAAS) - Vol. 319, No. 5867 - pp. 1264-1267 - 2008
Kimberlee D’Ardenne1,2,3,4, Samuel M. McClure1,3,4, Leigh E. Nystrom1,3,4, Jonathan D. Cohen1,3,4
1Center for the Study of Brain, Mind, and Behavior, Princeton University, Princeton, NJ 08544 USA
2Department of Chemistry, Princeton University, Princeton, NJ 08544 USA
3Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA 15260, USA
4Department of Psychology, Princeton University, Princeton, NJ 08544 USA

Abstract

Current theories hypothesize that dopamine neuronal firing encodes reward prediction errors. Although studies in nonhuman species provide direct support for this theory, functional magnetic resonance imaging (fMRI) studies in humans have focused on brain areas targeted by dopamine neurons [ventral striatum (VStr)] rather than on brainstem dopaminergic nuclei [ventral tegmental area (VTA) and substantia nigra]. We used fMRI tailored to directly image the brainstem. When primary rewards were used in an experiment, the VTA blood oxygen level–dependent (BOLD) response reflected a positive reward prediction error, whereas the VStr encoded positive and negative reward prediction errors. When monetary gains and losses were used, VTA BOLD responses reflected positive reward prediction errors modulated by the probability of winning. We detected no significant VTA BOLD response to nonrewarding events.


References

10.1523/JNEUROSCI.16-05-01936.1996

10.1126/science.275.5306.1593

G. Paxinos, X. Huang, Atlas of the Human Brainstem (Academic Press, San Diego, CA, 1995).

Several recent studies (5–18) report findings from the SN and VTA with spatial resolution no better than 21 mm³, which is inadequate for imaging the VTA and yields only a small number of measurements in the SN.

10.1038/nature02581

B. H. Schott et al., Learn. Mem. 11, 383 (2004).

B. C. Wittmann et al., Neuron 45, 459 (2005).

P. Dunckley et al., J. Neurosci. 25, 7333 (2005).

V. Menon, D. J. Levitin, Neuroimage 28, 175 (2005).

J. P. O'Doherty, T. W. Buchanan, B. Seymour, R. J. Dolan, Neuron 49, 157 (2006).

B. H. Schott et al., J. Neurosci. 26, 1407 (2006).

J. C. Dreher, P. Kohn, K. F. Berman, Cereb. Cortex 16, 561 (2006).

N. Bunzeck, E. Duzel, Neuron 51, 369 (2006).

M. Fairhurst, K. Wiech, P. Dunckley, I. Tracey, Pain 128, 101 (2007).

P. N. Tobler, P. C. Fletcher, E. T. Bullmore, W. Schultz, Neuron 54, 167 (2007).

B. H. Schott et al., Brain 130, 2412 (2007).

B. C. Wittmann, N. Bunzeck, R. J. Dolan, E. Duzel, Neuroimage 38, 194 (2007).

N. Bunzeck et al., Cereb. Cortex 17, 2940 (2007).

K. A. Schneider, M. C. Richter, S. Kastner, J. Neurosci. 24, 8975 (2004).

K. A. Schneider, S. Kastner, J. Neurophysiol. 94, 2491 (2005).

A. R. Guimaraes et al., Hum. Brain Mapp. 6, 33 (1998).

H. Oikawa, M. Sasaki, Y. Tamakawa, S. Ehara, K. Tohyama, AJNR Am. J. Neuroradiol. 23, 1747 (2002).

T. Eckert et al., Neuroimage 21, 229 (2004).

V. Napadow, R. Dhond, D. Kennedy, K. K. Hui, N. Makris, Neuroimage 32, 1113 (2006).

10.1038/1124

10.1126/science.1077349

The reward prediction error term depends on time through a value function, $V(t)$, that gives the reward expected into the infinite future. It is the time derivative of $V(t)$, $dV(t)/dt$, that gives the reward expected at time $t$ and is hypothesized to be communicated to midbrain dopamine nuclei (1).

10.1038/35084005

A. Viswanathan, R. D. Freeman, Nat. Neurosci. 10, 1308 (2007).

The reward prediction error signal could also be generated by recurrent collaterals in the VTA.

10.1016/j.neuron.2005.05.020

H. M. Bayer, B. Lau, P. W. Glimcher, J. Neurophysiol. 98, 1428 (2007).

10.1038/nn802

10.1006/nimg.2000.0593

10.1016/S0896-6273(03)00154-5

10.1016/S0896-6273(03)00169-7

10.1126/science.1094285

The VTA and SN include a variety of neuron types, not just dopaminergic neurons, and the BOLD response measured from these nuclei presumably reflects the composite activity of these neurons. Nevertheless, as discussed further on, our findings suggest that the BOLD responses we observed reflected a dominant, if not exclusive, influence of dopamine neuron activity. In addition, the methods we used were not optimized for detecting responses in the SN or VStr. Because of the cardiac-gated functional data acquisition, image acquisition time was necessarily less than the time between heartbeats, which limited the number of slices we were able to acquire (see supporting online material for details). Slices were placed so as to optimize our coverage of the VTA. The VTA is small, and thus, even with the number of slices we were limited to by the participant's heart rate, we were able to record from the entire region. However, we were only able to record from portions of the striatum (Fig. 3A) and SN. Given this limited coverage of the striatum and SN, our measurements from these structures are likely to have been underpowered.

A recent study reports an equal BOLD-fMRI response in the VStr to delivery of juice and water (40). Consequently, we included both juice and water trials in our study to keep participants as interested as possible. Juice and water trials were randomized across all scanning runs, and results are collapsed across both trial types.

S. M. McClure, K. M. Ericson, D. I. Laibson, G. Loewenstein, J. D. Cohen, J. Neurosci. 27, 5796 (2007).

In addition to regressors for reward prediction errors, we also included regressors for the display of the visual cue and for $dV/dt$. No brain regions showed a significant response to the display of the cue during and after training [compare to (37)]; however, the visual cortex was not imaged in this experiment because of slice positioning (Fig. 1C). Additionally, no brain regions showed a significant response to $dV/dt$.

Mean event-related BOLD responses were calculated for the receipt of expected and unexpected rewards (see supporting online material for details on impulse response function generation and for plots of time course data). The areas under the response curves were calculated from time t = 3 s to 7 s after reward receipt and correlated across subjects in the VTA and VStr. This segment of the response curves was selected because it corresponds to the time points of the peak in the BOLD response.
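The area-and-correlate step described in this note can be sketched as follows. This is a minimal illustration only, not the authors' analysis code: the once-per-second sampling, the trapezoidal rule, the function names `window_area` and `pearson_r`, and all numbers in the example are assumptions made up for demonstration.

```python
from math import sqrt


def window_area(times, values, start=3.0, stop=7.0):
    """Trapezoidal area under a sampled curve, using samples with start <= t <= stop."""
    pts = [(t, v) for t, v in zip(times, values) if start <= t <= stop]
    return sum((t1 - t0) * (v0 + v1) / 2.0
               for (t0, v0), (t1, v1) in zip(pts, pts[1:]))


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical per-subject response curves (arbitrary units), one sample per second.
times = [float(t) for t in range(10)]
vta_curves = [
    [0, 0, 1, 2, 3, 4, 3, 2, 1, 0],
    [0, 0, 0, 1, 2, 2, 2, 1, 0, 0],
    [0, 1, 2, 3, 5, 6, 5, 3, 1, 0],
]
vstr_curves = [
    [0, 0, 1, 1, 2, 3, 2, 1, 0, 0],
    [0, 0, 0, 1, 1, 2, 1, 1, 0, 0],
    [0, 1, 1, 2, 4, 4, 4, 2, 1, 0],
]

# One area per subject and region, then a correlation across subjects.
vta_areas = [window_area(times, curve) for curve in vta_curves]
vstr_areas = [window_area(times, curve) for curve in vstr_curves]
print(pearson_r(vta_areas, vstr_areas))
```

Restricting the integral to the 3 s to 7 s window means only the samples around the response peak contribute to each subject's summary value.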

In temporal difference reinforcement learning, the reward prediction error is given by $\delta(t) = r(t) + E\{r(t+1) \mid S_{t+1}\} - E\{r(t) \mid S_t\}$, where $\delta(t)$ is the reward prediction error, $r(t)$ is the reward value at time $t$, and $E\{r(t) \mid S_t\}$ is the expected value of reward given the history of stimuli up to time $t$, which is termed $S_t$ (1, 2). When the stimulus is shown, the BOLD response should be proportional to $\delta(t) = E\{r(t+1) \mid S_{t+1}\}$, because we assume that expected value is constant or zero between the events. We represented the BOLD response to the display of the first number as not varying in magnitude, because the probability of winning is unknown until participants press a button indicating their decision. It is important to note that when reward $r(t)$ is less than 1 (i.e., when the participant loses \$1), predictions for adverse outcomes can be made. Rewriting the prediction error equation in terms of reward value and probability of reward, with $p(r(t) \mid S_t)$ the probability of winning, gives $\delta(t) = r(t) - [1 \times p(r(t) \mid S_t) - 1 \times (1 - p(r(t) \mid S_t))] = r(t) - 2p(r(t) \mid S_t) + 1$. For wins ($r(t) = 1$), $\delta(t) = 2(1 - p(r(t) \mid S_t))$ and decreases linearly with the probability of winning. For losses ($r(t) = -1$), $\delta(t) = -2p(r(t) \mid S_t)$, whose magnitude increases linearly with the probability of winning.
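The outcome-time prediction error in this note can be checked numerically with a short sketch. It assumes, as the note implies, that a win is worth +1 and a loss −1, and that no further reward is expected after the outcome; the function names are hypothetical. Under these assumptions the loss-trial value comes out negative, with magnitude twice the probability of winning.

```python
def expected_reward(p_win, win=1.0, loss=-1.0):
    """Expected outcome value given the probability of winning (assumed +1 / -1 payoffs)."""
    return win * p_win + loss * (1.0 - p_win)


def prediction_error(reward, p_win):
    """Outcome-time prediction error: received reward minus expected reward.

    Assumes no further reward is expected after the outcome, so the
    future-value term of the temporal difference error is zero.
    """
    return reward - expected_reward(p_win)


# Illustrative probabilities only; these are not the values used in the experiment.
for p in (0.25, 0.5, 0.75):
    win_error = prediction_error(1.0, p)    # equals 2 * (1 - p)
    loss_error = prediction_error(-1.0, p)  # equals -2 * p
    print(p, win_error, loss_error)
```

The win-trial error shrinks as winning becomes more likely, while the loss-trial error grows more negative, matching the linear dependence on the probability of winning described in the note.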

N. D. Daw, S. Kakade, P. Dayan, Neural Netw. 15, 603 (2002).

10.1038/379449a0

10.1126/science.1093360

D. L. Cameron, M. W. Wessendorf, J. T. Williams, Neuroscience 77, 155 (1997).

T. S. Braver, J. D. Cohen, in Control of Cognitive Processes: Attention and Performance XVIII, S. Monsell, J. Driver, Eds. (MIT Press, Cambridge, MA, 2000) pp. 713–737.

10.1146/annurev.neuro.28.061604.135709

H. M. Duvernoy, The Human Brain Stem and Cerebellum: Surface, Structure, Vascularization, Three Dimensional Sectional Anatomy and MRI (Springer, New York, 1995).

We thank V. Napadow for access to the brainstem normalization algorithm ahead of its publication and for guidance with data acquisition. We thank C. L. Buck, K. Lowenberg, and E. Barkley-Levenson for help with participant recruitment and scanning. We also thank R. Tengi for helping manage the large amount of disk space necessary to accomplish data analysis. This work was supported by NIH grants P50 MH062196 (J.D.C.), T32 MH065214 (J.D.C.), and F32 MH072141 (S.M.M.).