Deep tensor networks with matrix product operators
Abstract
We introduce deep tensor networks, which are exponentially wide neural networks based on the tensor network representation of the weight matrices. We evaluate the proposed method on image classification (MNIST, FashionMNIST) and sequence prediction (cellular automata) tasks. In the image classification case, deep tensor networks improve on our matrix product state baselines and achieve a 0.49% error rate on MNIST and an 8.3% error rate on FashionMNIST. In the sequence prediction case, we demonstrate an exponential improvement in the number of parameters compared to one-layer tensor network methods. In both cases, we discuss non-uniform and uniform tensor network models and show that the latter generalises well to different input sizes.
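To make the "tensor network representation of the weight matrices" concrete, the sketch below builds a weight matrix as a matrix product operator (MPO) and contracts it back into dense form to compare parameter counts. This is a generic MPO construction under assumed names and sizes (random_mpo, mpo_to_dense, N, d_in, d_out, D are illustrative), not the architecture or code of this paper.

```python
# Minimal sketch (not the paper's implementation): a layer weight matrix stored
# as an MPO of N small cores instead of a dense (d_out**N, d_in**N) matrix.
import numpy as np

def random_mpo(N, d_in, d_out, D):
    """Return N MPO cores with shapes (D_left, d_out, d_in, D_right)."""
    cores = []
    for k in range(N):
        D_l = 1 if k == 0 else D
        D_r = 1 if k == N - 1 else D
        cores.append(np.random.randn(D_l, d_out, d_in, D_r) / np.sqrt(D * d_in))
    return cores

def mpo_to_dense(cores):
    """Contract the cores into the full (d_out**N, d_in**N) weight matrix."""
    W = cores[0]                                   # shape (1, d_out, d_in, D)
    for core in cores[1:]:
        # Sum over the shared bond index, then merge the output and input legs.
        W = np.einsum('aijb,bklc->aikjlc', W, core)
        a, i, k, j, l, c = W.shape
        W = W.reshape(a, i * k, j * l, c)
    return W[0, :, :, 0]                           # boundary bonds have size 1

N, d_in, d_out, D = 8, 2, 2, 4
cores = random_mpo(N, d_in, d_out, D)
W = mpo_to_dense(cores)
print(W.shape)                                     # (256, 256): 65536 dense entries
print(sum(c.size for c in cores))                  # 416 MPO parameters
```

For these sizes the MPO stores 416 parameters, while the equivalent dense matrix has 65,536 entries; in practice such MPO layers are typically trained and applied without ever forming the dense matrix.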