Deep tensor networks with matrix product operators

Springer Science and Business Media LLC - Volume 4 - Pages 1-12 - 2022
Bojan Žunkovič1
1Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia

Abstract

We introduce deep tensor networks, which are exponentially wide neural networks based on the tensor network representation of the weight matrices. We evaluate the proposed method on the image classification (MNIST, FashionMNIST) and sequence prediction (cellular automata) tasks. In the image classification case, deep tensor networks improve our matrix product state baselines and achieve a 0.49% error rate on MNIST and an 8.3% error rate on FashionMNIST. In the sequence prediction case, we demonstrate an exponential improvement in the number of parameters compared to one-layer tensor network methods. In both cases, we discuss the non-uniform and the uniform tensor network models and show that the latter generalise well to different input sizes.
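The central idea stated in the abstract, representing an exponentially large weight matrix as a matrix product operator (MPO) so that the dense matrix is never materialised, can be illustrated with a short contraction. The sketch below is not the paper's implementation; the core shapes, the random initialisation, the pixel-wise feature map, and the names `random_mpo` and `mpo_bilinear` are illustrative assumptions. It evaluates a bilinear form ⟨y|W|x⟩ for product (rank-one) inputs and outputs, showing that the cost stays polynomial in the number of sites even though W formally acts on a d^N-dimensional space.

```python
import numpy as np

def random_mpo(n_sites, d_in, d_out, bond_dim, seed=0):
    """Return MPO cores A_k with shape (D_left, d_in, d_out, D_right)."""
    rng = np.random.default_rng(seed)
    cores = []
    for k in range(n_sites):
        d_left = 1 if k == 0 else bond_dim
        d_right = 1 if k == n_sites - 1 else bond_dim
        cores.append(rng.normal(scale=0.1, size=(d_left, d_in, d_out, d_right)))
    return cores

def mpo_bilinear(cores, x_feats, y_feats):
    """Evaluate <y| W |x> for product vectors without ever building W,
    whose dense shape would be (d_out**n_sites, d_in**n_sites)."""
    env = np.ones(1)  # left boundary environment, shape (1,)
    for A, x_k, y_k in zip(cores, x_feats, y_feats):
        # Contract the physical input/output legs of core A: (D_l, d_in, d_out, D_r)
        transfer = np.einsum('liod,i,o->ld', A, x_k, y_k)  # shape (D_l, D_r)
        env = env @ transfer
    return env.item()

# Toy usage: 8 "pixels", each encoded as a 2-dimensional local feature vector
# (a hypothetical cos/sin encoding, as commonly used in tensor-network classifiers).
n_sites, d_in, d_out, bond_dim = 8, 2, 2, 4
cores = random_mpo(n_sites, d_in, d_out, bond_dim)
x = [np.array([np.cos(np.pi * p / 2), np.sin(np.pi * p / 2)])
     for p in np.linspace(0.0, 1.0, n_sites)]
y = [np.ones(d_out) / np.sqrt(d_out)] * n_sites  # readout product vector
print(mpo_bilinear(cores, x, y))
```

In this sketch the bond dimension controls the trade-off between expressiveness and parameter count: the MPO stores roughly N·d_in·d_out·D² numbers instead of the (d_in·d_out)^N entries of the dense weight matrix, which is what makes the "exponentially wide" layers tractable.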
