Neurally plausible mechanisms for learning selective and invariant representations

The Journal of Mathematical Neuroscience - Volume 10 - Pages 1-15 - 2020
Fabio Anselmi1,2,3, Ankit Patel1,4, Lorenzo Rosasco2
1Center for Neuroscience and Artificial Intelligence, Department of Neuroscience, Baylor College of Medicine, Houston, USA
2Laboratory for Computational and Statistical Learning (LCSL), Istituto Italiano di Tecnologia, Genova, Genova, Italy
3Center for Brains, Minds, and Machines (CBMM), Massachusetts Institute of Technology, Cambridge, USA
4Department of Electrical & Computer Engineering, Rice University, Houston, USA

Abstract

Coding for visual stimuli in the ventral stream is known to be invariant to nuisance transformations that preserve object identity. Indeed, much recent theoretical and experimental work suggests that the main challenge for the visual cortex is to build up such nuisance-invariant representations. Recently, artificial convolutional networks have succeeded both in learning such invariance properties and, surprisingly, in predicting cortical responses in macaque and mouse visual cortex with unprecedented accuracy. However, some of the key ingredients that enable this success, namely supervised learning and the backpropagation algorithm, are neurally implausible. This makes it difficult to relate advances in understanding convolutional networks to the brain. In contrast, many existing neurally plausible theories of invariant representations in the brain involve unsupervised learning and have been strongly tied to specific plasticity rules. To close this gap, we study an instantiation of a simple-complex cell model and show, for a broad class of unsupervised learning rules (including Hebbian learning), that object representations can be learned that are invariant to nuisance transformations belonging to a finite orthogonal group. These findings may have implications for developing neurally plausible theories and models of how the visual cortex, or artificial neural networks, build selectivity for discriminating objects and invariance to real-world nuisance transformations.
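The invariance mechanism the abstract alludes to can be illustrated with a minimal NumPy sketch (not the paper's actual model). Simple cells compute dot products between a stimulus and a template transformed by every element of a finite orthogonal group; a complex cell then pools over that orbit. Here the group is taken to be cyclic shifts, a finite orthogonal group on R^d, and all function and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
template = rng.standard_normal(d)

def simple_cells(x, t):
    # Simple-cell responses <x, g.t> for every group element g
    # (here g ranges over the d cyclic shifts of the template).
    return np.array([x @ np.roll(t, k) for k in range(d)])

def complex_cell(x, t):
    # Complex-cell pooling (here: max) over the group orbit.
    # Transforming x by any g just permutes the simple-cell
    # responses, so the pooled value is group-invariant.
    return simple_cells(x, t).max()

x = rng.standard_normal(d)
gx = np.roll(x, 3)  # transformed (shifted) stimulus

assert np.isclose(complex_cell(x, template), complex_cell(gx, template))
```

Any permutation-invariant pooling function (mean, max, or higher moments) gives the same invariance property; selectivity between different objects comes from using multiple templates.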
