Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network
Abstract
Keywords
References
Lin, 2017, Criticality in formal languages and statistical physics, Entropy, 19, 299, 10.3390/e19070299
Sanger, 1989, Optimal unsupervised learning in a single-layer linear feedforward neural network, Neural Netw., 2, 459, 10.1016/0893-6080(89)90044-0
Sherstinsky, 1994
Liao, 2016, Bridging the gaps between residual learning, recurrent neural networks and visual cortex, CoRR, abs/1604.03640
Haber, 2017, Stable architectures for deep neural networks, Inverse Problems, 34, 10.1088/1361-6420/aa9a90
Lu, 2018, Beyond finite layer neural networks: Bridging deep architectures and numerical differential equations, vol. 80, 3282
Ruthotto, 2018, Deep neural networks motivated by partial differential equations, CoRR, abs/1804.04272
Sherstinsky, 2018
Sherstinsky, 2018, Deriving the recurrent neural network definition and RNN unrolling using signal processing, vol. 31
Ciccone, 2018, NAIS-Net: Stable deep networks from non-autonomous differential equations, 3029
Chang, 2018, Reversible architectures for arbitrarily deep residual neural networks, 2811
Bo Chang, Minmin Chen, Eldad Haber, Ed H. Chi, AntisymmetricRNN: A dynamical system view on recurrent neural networks. In: International Conference on Learning Representations, 2019.
Chen, 2018, Neural ordinary differential equations, 6572
Rubanova, 2019
Greff, 2015, LSTM: A search space Odyssey, CoRR, abs/1503.04069
Graves, 2005, Framewise phoneme classification with bidirectional LSTM and other neural network architectures, Neural Netw., 18, 602, 10.1016/j.neunet.2005.06.042
A. Graves, J. Schmidhuber, Framewise phoneme classification with bidirectional LSTM networks. In: Proc. Int. Joint Conf. on Neural Networks IJCNN 2005, 2005.
Graves, 2008
Sutskever, 2011, Generating text with recurrent neural networks, 1017
Martin Sundermeyer, Ralf Schlüter, Hermann Ney, LSTM neural networks for language modeling. In: Interspeech, 2012, 194–197.
Graves, 2013, Generating sequences with recurrent neural networks, CoRR, abs/1308.0850
Sutskever, 2014, Sequence to sequence learning with neural networks, 3104
Sak, 2014, Long short-term memory recurrent neural network architectures for large scale acoustic modeling, 338
Lipton, 2015, A critical review of recurrent neural networks for sequence learning, CoRR, abs/1506.00019
Karpathy, 2015
Olah, 2015
Palangi, 2015, Deep sentence embedding using the long short term memory network: Analysis and application to information retrieval, CoRR, abs/1502.06922
Kannan, 2016, Smart reply: Automated response suggestion for email, CoRR, abs/1606.04870
Zhou, 2016, Deep recurrent models with fast-forward connections for neural machine translation, CoRR, abs/1606.04199
Renvoisé, 2017
Chen, 2017
Mallya, 2017
Mallya, 2017
Mallya, 2017
Jayasiri, 2017
Salehinejad, 2018, Recent advances in recurrent neural networks, CoRR, abs/1801.01078
Strogatz, 1994
Wang, 2017, A new concept using LSTM neural networks for dynamic system identification, 5324
Hopfield, 1984, Neurons with graded response have collective computational properties like those of two-state neurons, Proc. Natl. Acad. Sci., 81, 3088, 10.1073/pnas.81.10.3088
Grossberg, 1988, Nonlinear neural networks: Principles, mechanisms, and architectures, Neural Netw., 1, 17, 10.1016/0893-6080(88)90021-4
Sherstinsky, 1996, M-lattice: from morphogenesis to image processing, IEEE Trans. Image Process., 5, 1137, 10.1109/83.502393
Yuliya Kyrychko, Stephen Hogan, On the use of delay equations in engineering applications, 16 (2010) 943–960.
Metropolis, 1953, Equation of state calculations by fast computing machines, J. Chem. Phys., 21, 1087, 10.1063/1.1699114
Sherstinsky, 1998, On stability and equilibria of the M-Lattice, IEEE Trans. Circuits Syst. I, 45, 408, 10.1109/81.669063
Ostroverkhyi, 2010
Jordan, 1986
Pineda, 1987, Generalization of backpropagation to recurrent neural networks, Phys. Rev. Lett., 59, 2229, 10.1103/PhysRevLett.59.2229
Fernando L. Pineda, 1987, Generalization of backpropagation to recurrent and higher order neural networks, 602
Pearlmutter, 1989, Learning state space trajectories in recurrent neural networks, Neural Comput., 1, 263, 10.1162/neco.1989.1.2.263
Barak A. Pearlmutter, 1990
de Vries, 1991, A theory for neural networks with time delays, 162
Chua, 1988, Cellular neural networks: Applications, IEEE Trans. Circuits Syst., 35, 1273, 10.1109/31.7601
Lloyd N. Trefethen, Finite Difference and Spectral Methods for Ordinary and Partial Differential Equations. unpublished text, Cambridge, MA, 1996.
Oppenheim, 1989
Vinyals, 2013
Sherstinsky, 1996, On the efficiency of the orthogonal least squares training method for radial basis function networks, IEEE Trans. Neural Netw., 7, 195, 10.1109/72.478404
Bose, 1956
Hochreiter, 2001, Gradient flow in recurrent nets: the difficulty of learning long-term dependencies
Razvan Pascanu, Tomas Mikolov, Yoshua Bengio, On the difficulty of training recurrent neural networks. In: International Conference on Machine Learning, 2013, pp. 1310–1318.
Werbos, 1988, Generalization of backpropagation with application to a recurrent gas market model, Neural Netw., 1, 10.1016/0893-6080(88)90007-X
Werbos, 1990, Backpropagation through time: what does it do and how to do it, vol. 78, 1550
Sutskever, 2012
Pascanu, 2014
Rumelhart, 1985
1986
Minsky, 1990
Williams, 1989, A learning algorithm for continually running fully recurrent neural networks, Neural Comput., 1, 270, 10.1162/neco.1989.1.2.270
Schuster, 1997, Bidirectional recurrent neural networks, IEEE Trans. Signal Process., 45, 2673, 10.1109/78.650093
Levy, 2018, Long short-term memory as a dynamically computed element-wise weighted sum, CoRR, abs/1805.03716
Gers, 2001
Rabiner, 1971, Techniques for designing finite-duration impulse-response digital filters, IEEE Trans. Commun. Technol., 19, 188, 10.1109/TCOM.1971.1090625
McClellan, 1973, A computer program for designing optimum FIR linear phase digital filters, IEEE Trans. Audio Electroacoust., 21, 506, 10.1109/TAU.1973.1162525
Yamamoto, 2003, Optimizing FIR approximation for discrete-time IIR filters, IEEE Signal Process. Lett., 10, 273, 10.1109/LSP.2003.815615
Zaremba, 2014