Pupil dilation and response slowing distinguish deliberate explorative choices in the probabilistic learning task
Abstract
This study examined whether pupil size and response time distinguish directed exploration from random exploration and exploitation. Eighty-nine participants performed a two-choice probabilistic learning task while their pupil size and response time were continuously recorded. Using linear mixed-effects model (LMM) analysis, we estimated differences in pupil size and response time between advantageous and disadvantageous choices as a function of learning success, i.e., whether or not a participant had learned the probabilistic contingency between choices and their outcomes. We proposed that before the true value of each choice became known to a decision-maker, both advantageous and disadvantageous choices represented random exploration of two options with equally uncertain outcomes, whereas the same choices after learning manifested exploitation and directed exploration, respectively. We found that disadvantageous choices were associated with increases in both response time and pupil size, but only after the participants had learned the choice-reward contingencies. For pupil size, this effect was strongly amplified for those disadvantageous choices that immediately followed gains, as compared to losses, in the preceding choice. Pupil size modulations were evident during the behavioral choice rather than during the pretrial baseline. These findings suggest that occasional disadvantageous choices, which violate the acquired internal utility model, represent directed exploration. This exploratory strategy shifts choice priorities in favor of information seeking, and its autonomic and behavioral concomitants are mainly driven by the conflict between the behavioral plan of the intended exploratory choice and its strong alternative, which has already proven to be more rewarding.