Path classification by stochastic linear recurrent neural networks

Wiebke Bartolomaeus¹, Youness Boutaib¹, Sandra Nestler²,³, Holger Rauhut¹
¹ Chair for Mathematics of Information Processing, RWTH Aachen University, Aachen, Germany
² RWTH Aachen University, Aachen, Germany
³ Institute of Neuroscience and Medicine (INM-6) and Institute for Advanced Simulation (IAS-6) and JARA-Institute “Brain Structure-Function Relationships” (INM-10), Jülich Research Centre, Jülich, Germany

Abstract

We investigate the functioning of a classifying biological neural network from the perspective of statistical learning theory, modelled, in a simplified setting, as a continuous-time stochastic recurrent neural network (RNN) with the identity activation function. In the purely stochastic (robust) regime, we give a generalisation error bound that holds with high probability, showing that the empirical risk minimiser performs nearly as well as the best-in-class hypothesis. We show that RNNs retain a partial signature of the paths they are fed as the sole information exploited for training and classification tasks. We argue that these RNNs are easy to train and robust, and we support these observations with numerical experiments on both synthetic and real data. We also exhibit a trade-off phenomenon between accuracy and robustness.
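To make the model described above concrete, the sketch below simulates a discretised version of such a continuous-time stochastic linear RNN: the hidden state evolves linearly (identity activation) while being driven by the increments of an input path plus Brownian noise, and a linear readout of the terminal state produces the class label. This is a minimal illustration under an Euler-discretisation assumption; the matrices `W` and `U`, the noise level `sigma`, and the readout `w` are hypothetical names chosen here, not the authors' implementation, and in practice the readout would be trained by empirical risk minimisation.

```python
import numpy as np

rng = np.random.default_rng(0)

n_neurons, path_dim = 100, 2
dt, n_steps = 1e-2, 500

# Hypothetical parameters for illustration (not the paper's choices).
W = rng.normal(scale=1.0 / np.sqrt(n_neurons), size=(n_neurons, n_neurons))  # recurrent weights
U = rng.normal(size=(n_neurons, path_dim))                                   # input weights
sigma = 0.1                                                                  # driving-noise level

def terminal_state(x):
    """Euler scheme for dh = W h dt + U dx + sigma dB along a discretised path x.

    x has shape (n_steps + 1, path_dim); returns the hidden state at the final time.
    """
    h = np.zeros(n_neurons)
    for t in range(n_steps):
        dx = x[t + 1] - x[t]                            # increment of the input path
        dB = np.sqrt(dt) * rng.normal(size=n_neurons)   # Brownian increment
        h = h + dt * (W @ h) + U @ dx + sigma * dB      # identity activation: purely linear update
    return h

# Toy input path (a Brownian-like trajectory) and a linear readout classifier.
x = np.cumsum(np.sqrt(dt) * rng.normal(size=(n_steps + 1, path_dim)), axis=0)
w = rng.normal(size=n_neurons)                          # untrained readout, for illustration only
label = np.sign(w @ terminal_state(x))
```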
