Pupil dilation and response slowing distinguish deliberate explorative choices in the probabilistic learning task

Springer Science and Business Media LLC - Volume 22 - Pages 1108-1129 - 2022
Galina L. Kozunova¹, Ksenia E. Sayfulina¹, Andrey O. Prokofyev¹, Vladimir A. Medvedev¹, Anna M. Rytikova¹, Tatiana A. Stroganova¹, Boris V. Chernyshev¹
¹ Center for Neurocognitive Research (MEG-Center), Moscow State University of Psychology and Education, Moscow, Russia

Abstract

This study examined whether pupil size and response time distinguish directed exploration from random exploration and exploitation. Eighty-nine participants performed a two-choice probabilistic learning task while their pupil size and response time were continuously recorded. Using linear mixed-effects model (LMM) analysis, we estimated differences in pupil size and response time between advantageous and disadvantageous choices as a function of learning success, i.e., whether or not a participant had learned the probabilistic contingency between choices and their outcomes. We proposed that before the true value of each choice became known to a decision-maker, both advantageous and disadvantageous choices represented random exploration of two options with equally uncertain outcomes, whereas the same choices after learning manifested exploitation and directed exploration strategies, respectively. We found that disadvantageous choices were associated with increases in both response time and pupil size, but only after the participants had learned the choice-reward contingencies. For pupil size, this effect was strongly amplified for disadvantageous choices that immediately followed gains, as compared with losses, on the preceding choice. Pupil size modulations were evident during the behavioral choice rather than during the pretrial baseline. These findings suggest that occasional disadvantageous choices, which violate the acquired internal utility model, represent directed exploration. This exploratory strategy shifts choice priorities in favor of information seeking, and its autonomic and behavioral concomitants are mainly driven by the conflict between the behavioral plan of the intended exploratory choice and its strong alternative, which has already proven to be more rewarding.
