Electrophysiological measures reveal the role of anterior cingulate cortex in learning from unreliable feedback

Springer Science and Business Media LLC - Volume 18 - Pages 949-963 - 2018
Peng Li [1,2], Weiwei Peng [1], Hong Li [1,2,3], Clay B. Holroyd [4]
[1] Brain Function and Psychological Science Research Center, Shenzhen University, Shenzhen, China
[2] Shenzhen Key Laboratory of Affective and Social Cognitive Science, Shenzhen University, Shenzhen, China
[3] Center for Language and Brain, Shenzhen Institute of Neuroscience, Shenzhen, China
[4] Department of Psychology, University of Victoria, Victoria, Canada

Abstract

Although a growing number of studies have investigated the neural mechanisms of reinforcement learning, it remains unclear how the brain responds to feedback that is unreliable. A recent theory proposes that the reward positivity (RewP) component of the event-related brain potential (ERP) and frontal midline theta (FMT) power reflect separate feedback-related processing functions of anterior cingulate cortex (ACC). In the present study, the electroencephalogram (EEG) was recorded from participants as they engaged in a time estimation task in which feedback reliability was manipulated across conditions. After each response, they received a cue indicating that the following feedback stimulus was 100%, 75%, or 50% reliable. The results showed that participants’ time estimates adjusted linearly according to the feedback reliability. Moreover, presentation of the cue indicating 100% reliability elicited a larger RewP-like ERP component than the other cues did, whereas feedback presentation elicited a RewP of approximately equal amplitude in all three reliability conditions. By contrast, FMT power elicited by negative feedback decreased linearly from the 100% condition to the 75% and 50% conditions, and only FMT power predicted behavioral adjustments on the following trials. In addition, an analysis of beta power and cross-frequency coupling (CFC) of beta power with FMT phase suggested that beta-FMT communication modulated motor areas for the purpose of adjusting behavior. We interpret these findings in terms of the hierarchical reinforcement learning account of ACC, in which the RewP and FMT are proposed to reflect reward processing and control functions of ACC, respectively.
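The reliability manipulation described in the abstract can be illustrated with a short simulation. This is only a sketch under stated assumptions, not the authors' procedure: it assumes that "X% reliable" means the feedback matches the true outcome on X% of trials and is inverted otherwise, and the function names `deliver_feedback` and `agreement_rate`, as well as the 50% hit rate, are hypothetical choices made for illustration.

```python
import random


def deliver_feedback(true_outcome: bool, reliability: float, rng: random.Random) -> bool:
    """Return the feedback shown to a simulated participant.

    Assumption (not taken from the source): with probability `reliability`
    the feedback matches the true outcome; otherwise it is inverted.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be in [0, 1]")
    veridical = rng.random() < reliability
    return true_outcome if veridical else not true_outcome


def agreement_rate(reliability: float, n_trials: int = 10_000, seed: int = 0) -> float:
    """Fraction of simulated trials on which feedback matches the true outcome."""
    rng = random.Random(seed)
    matches = 0
    for _ in range(n_trials):
        # Hypothetical 50% hit rate for the time estimate on each trial.
        true_outcome = rng.random() < 0.5
        shown = deliver_feedback(true_outcome, reliability, rng)
        matches += shown == true_outcome
    return matches / n_trials


if __name__ == "__main__":
    # The three cue conditions named in the abstract.
    for r in (1.0, 0.75, 0.5):
        print(f"reliability={r:.2f} agreement={agreement_rate(r):.3f}")
```

Under this assumption, feedback in the 50% condition carries no information about the true outcome, which is one way to read why a learner might weight it least when adjusting the next time estimate.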
