Visual novelty, curiosity, and intrinsic reward in machine learning and the brain

Current Opinion in Neurobiology - Tập 58 - Trang 167-174 - 2019
Andrew Jaegle1, Vahid Mehrpour1, Nicole Rust1
1Department of Psychology, University of Pennsylvania, United States

Tài liệu tham khảo

Sutton, 2017 Lee, 2012, Neural basis of reinforcement learning and decision making, Annu Rev Neurosci, 35, 287, 10.1146/annurev-neuro-062111-150512 Silver, 2017, Mastering the game of Go without human knowledge, Nature, 550, 354, 10.1038/nature24270 Mnih, 2015, Human-level control through deep reinforcement learning, Nature, 518, 529, 10.1038/nature14236 Gottlieb, 2018, Towards a neuroscience of active sampling and curiosity, Nat Rev Neurosci, 1 Reynolds, 2015, Infant visual attention and object recognition, Behav Brain Res, 285, 34, 10.1016/j.bbr.2015.01.015 Glickman, 1966, Curiosity in zoo animals, Behaviour, 151, 10.1163/156853966X00074 Hall, 2018, Curious creatures: a multi-taxa investigation of responses to novelty in a zoo environment, PeerJ, 6, 10.7717/peerj.4454 Schultz, 1998, Predictive reward signal of dopamine neurons, J Neurophysiol, 80, 1, 10.1152/jn.1998.80.1.1 Kakade, 2002, Dopamine: generalization and bonuses, Neural Networks, 15, 549, 10.1016/S0893-6080(02)00048-5 Bellman, 1954, The theory of dynamic programming, Bull Am Math Soc, 60, 503, 10.1090/S0002-9904-1954-09848-8 Strehl, 2008, An analysis of model-based interval estimation for Markov decision processes, J Comput Syst Sci, 74, 1309, 10.1016/j.jcss.2007.08.009 Kolter, 2009, Near-Bayesian exploration in polynomial time, 513 Lai, 1985, Asymptotically efficient adaptive allocation rules, Adv Appl Math, 6, 4, 10.1016/0196-8858(85)90002-8 Strehl, 2005, A theoretical analysis of model-based interval estimation, 856 Bellemare, 2013, The arcade learning environment: an evaluation platform for general agents, J Artif Intell Res, 47, 253, 10.1613/jair.3912 Goodfellow, 2016, vol 1 LeCun, 2015, Deep learning, Nature, 521, 436, 10.1038/nature14539 Tang, 2017, #Exploration: a study of count-based exploration for deep reinforcement learning, 2753 Abel, 2016, Exploratory gradient boosting for reinforcement learning in complex domains Savinov, 2019, Episodic curiosity through reachability Anselmi, 2016, Unsupervised learning of invariant representations, Theor Comput Sci, 633, 112, 10.1016/j.tcs.2015.06.048 Bellemare, 2016, Unifying count-based exploration and intrinsic motivation, 1471 Martin, 2017, Count-based exploration in feature space for reinforcement learning Ostrovski, 2017, Count-based exploration with neural density models Cover, 2006 Schmidhuber, 2009, Driven by compression progress: a simple principle explains essential aspects of subjective beauty, novelty, surprise, interestingness, attention, curiosity, creativity, art, science, music, jokes, J SICE, 48, 21 Singh, 2004, Intrinsically motivated reinforcement learning, 1281 Yang, 2016, Theoretical perspectives on active sensing, Curr Opin Behav Sci, 11, 100, 10.1016/j.cobeha.2016.06.009 Houthooft, 2016, VIME: variational information maximizing exploration, 1109 Mohamed, 2015, Variational information maximisation for intrinsically motivated reinforcement learning, 1 Sorg, 2010, Variance-based rewards for approximate Bayesian reinforcement learning, 564 Burda, 2019, Exploration by random network distillation Burda, 2019, Large-scale study of curiosity-driven learning Haber, 2018, Learning to play with intrinsically-motivated self-aware agents Pathak, 2017, Curiosity-driven exploration by self-supervised prediction Gittins, 2011 Gittins, 1979, Bandit processes and dynamic allocation indices, J R Stat Soc Ser B, 41, 148 Russo, 2018, A Tutorial on Thompson Sampling, Found Trends Mach Learn, 11, 1, 10.1561/2200000070 Thompson, 1933, On the likelihood that one unknown probability exceeds another in view of the evidence of two samples, Biometrika, 25, 285, 10.1093/biomet/25.3-4.285 Standing, 1973, Learning 10,000 pictures, Q J Exp Psychol, 25, 207, 10.1080/14640747308400340 Brady, 2008, Visual long-term memory has a massive storage capacity for object details, Proc Natl Acad Sci U S A, 105, 14325, 10.1073/pnas.0803390105 Bogacz, 2003, Comparison of computational models of familiarity discrimination in the perirhinal cortex, Hippocampus, 13, 494, 10.1002/hipo.10093 Brown, 2015, In search of a recognition memory engram, Neurosci Biobehav Rev, 50, 12, 10.1016/j.neubiorev.2014.09.016 Fahy, 1993, Neuronal activity related to visual recognition memory: long-term memory and the encoding of recency and familiarity information in the primate anterior and medial inferior temporal and rhinal cortex, Exp Brain Res, 96, 457, 10.1007/BF00234113 Li, 1993, The representation of stimulus familiarity in anterior inferior temporal cortex, J Neurophysiol, 69, 1918, 10.1152/jn.1993.69.6.1918 Xiang, 1998, Differential neuronal encoding of novelty, familiarity and recency in regions of the anterior temporal lobe, Neuropharmacology, 37, 657, 10.1016/S0028-3908(98)00030-6 Desimone, 1996, Neural mechanisms for visual memory and their role in attention, Proc Natl Acad Sci U S A, 93, 13494, 10.1073/pnas.93.24.13494 Meyer, 2018, Single-exposure visual memory judgments are reflected in inferotemporal cortex, eLife, 7, 10.7554/eLife.32259 DiCarlo, 2012, How does the brain solve visual object recognition?, Neuron, 73, 415, 10.1016/j.neuron.2012.01.010 Grill-Spector, 2006, Repetition and the brain: neural models of stimulus-specific effects, Trends Cogn Sci, 10, 14, 10.1016/j.tics.2005.11.006 McMahon, 2007, Repetition suppression in monkey inferotemporal cortex: relation to behavioral priming, J Neurophysiol, 97, 3532, 10.1152/jn.01042.2006 Vogels, 2016, Sources of adaptation of inferior temporal cortical responses, Cortex, 80, 185, 10.1016/j.cortex.2015.08.024 Zhou, 2018, Compressive temporal summation in human visual cortex, J Neurosci, 38, 691, 10.1523/JNEUROSCI.1724-17.2017 Lim, 2015, Inferring learning rules from distributions of firing rates in cortical neurons, Nat Neurosci, 18, 1804, 10.1038/nn.4158 Summerfield, 2008, Neural repetition suppression reflects fulfilled perceptual expectations, Nat Neurosci, 11, 1004, 10.1038/nn.2163 Grotheer, 2014, Repetition probability effects depend on prior experiences, J Neurosci, 34, 6640, 10.1523/JNEUROSCI.5326-13.2014 Yildirim, 2020, Physical object representations for perception and cognition Hong, 2016, Explicit information for category-orthogonal object properties increases along the ventral stream, Nat Neurosci, 19, 613, 10.1038/nn.4247 Sawamura, 2006, Selectivity of neuronal adaptation does not match response selectivity: a single-cell study of the FMRI adaptation paradigm, Neuron, 49, 307, 10.1016/j.neuron.2005.11.028 De Baene, 2010, Effects of adaptation on the stimulus selectivity of macaque inferior temporal spiking activity and local field potentials, Cereb Cortex, 20, 2145, 10.1093/cercor/bhp277