On the vanishing and exploding gradient problem in Gated Recurrent Units

IFAC-PapersOnLine, Volume 53, Issue 2, Pages 1243-1248, 2020
Alexander Rehmer¹, Andreas Kroll¹
¹ Department of Measurement and Control, Institute for System Analytics and Control, Faculty of Mechanical Engineering, University of Kassel, Germany

Abstract

Keywords


References

Cho, K. et al. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 8, 1724–1734.

Goodfellow, 2016

Gringard, M. and Kroll, A. (2016). On the systematic analysis of the impact of the parametrization of standard test signals. In IEEE Symposium Series of Computational Intelligence 2016. IEEE, Athens, Greece.

Hochreiter (1997). Long short-term memory. Neural Computation, 9, 1735. doi:10.1162/neco.1997.9.8.1735.

Jordan, I.D., Sokol, P.A., and Park, I.M. (2019). Gated recurrent units viewed through the lens of continuous time dynamical systems. arXiv preprint arXiv:1906.01005.

Kingma, D. and Ba, J. (2015). Adam: A method for stochastic optimization. In 3rd International Conference on Learning Representations (ICLR 2015).

Nelles, 2001

Pascanu, R., Mikolov, T., and Bengio, Y. (2012). Understanding the exploding gradient problem. CoRR, abs/1211.5063.

Rehmer, A. and Kroll, A. (2019). On using gated recurrent units for nonlinear system identification. In Preprints of the 18th European Control Conference (ECC), 2504–2509. IFAC, Naples, Italy.