LSTM: A Search Space Odyssey

IEEE Transactions on Neural Networks and Learning Systems - Vol. 28, No. 10 - pp. 2222-2232 - 2017
Klaus Greff1, Rupesh K. Srivastava1, Jan Koutník1, Bas R. Steunebrink1, Jürgen Schmidhuber1
1[Istituto Dalle Molle di studi sull’Intelligenza Artificiale, Scuola universitaria professionale della Svizzera italiana, Manno, Switzerland]

Abstract

Keywords


References

10.1145/1143844.1143891

10.1109/ICDAR.2005.132

chung, 2014, Empirical evaluation of gated recurrent neural networks on sequence modeling

10.3115/v1/D14-1179

otte, 2014, Dynamic cortex memory: Enhancing recurrent neural networks for gradient-based sequence learning, Artificial Neural Networks and Machine Learning, 8681, 1

jozefowicz, 2015, An empirical exploration of recurrent network architectures, Proc Int Conf Mach Learn (ICML), 2342

halberstadt, 1998, Heterogeneous acoustic measurements and multiple classifiers for speech recognition

graves, 2008, Supervised sequence labelling with recurrent neural networks

mermelstein, 1976, Distance measures for speech recognition: Psychological and instrumental, Pattern Recognition and Artificial Intelligence, 374

crystal, 2011, Dictionary of Linguistics and Phonetics, 30

fan, 2014, TTS synthesis with bidirectional LSTM based recurrent neural networks, Proc INTERSPEECH, 1964

allan, 2005, Harmonising chorales by probabilistic inference, Advances in neural information processing systems, 17, 25

sønderby, 2014, Protein secondary structure prediction with long short term memory networks

10.1109/ICASSP.2014.6853982

10.21236/ADA623249

hochreiter, 1995, Long short-term memory

10.1162/neco.1997.9.8.1735

10.2307/2281072

10.1287/moor.6.1.19

bergstra, 2012, Random search for hyper-parameter optimization, J Mach Learn Res, 13, 281

hutter, 2014, An efficient approach for assessing hyperparameter importance, Proc 31st Int Conf Mach Learn, 754

10.1162/neco.2007.19.3.757

pham, 2013, Dropout improves recurrent neural networks for handwriting recognition

gers, 2002, DEKF-LSTM, Proc European Symp Artificial Neural Networks (ESANN), 369

10.1109/TPAMI.2008.137

graves, 2013, Generating Sequences with Recurrent Neural Networks

bayer, 2009, Evolving memory cell structures for sequence learning, Artificial Neural Networks—ICANN, 755

10.1109/ICFHR.2014.54

luong, 2014, Addressing the rare word problem in neural machine translation

zaremba, 2014, Recurrent Neural Network Regularization

hochreiter, 2001, Gradient flow in recurrent nets: The difficulty of learning long-term dependencies, A Field Guide to Dynamical Recurrent Networks

sak, 2014, Long short-term memory recurrent neural network architectures for large scale acoustic modeling, Proc Annu Conf Int Speech Commun Assoc (Interspeech), 338

hochreiter, 1991, Untersuchungen zu dynamischen neuronalen Netzen

10.1198/106186007X237892

10.1016/j.neunet.2005.06.042

graves, 2008, Unconstrained on-line handwriting recognition with recurrent neural networks, Proc Adv Neural Inf Process Syst, 577

10.1109/IJCNN.2000.861302

gers, 1999, Learning to forget: Continual prediction with LSTM, Proc 9th Int Conf Artif Neur Netw (ICANN'99), 2, 850, 10.1049/cp:19991218

sutskever, 2013, On the importance of initialization and momentum in deep learning, J Mach Learn Res, 23, 1139

williams, 1989, Complexity of exact gradient computation algorithms for recurrent neural networks

boulanger-lewandowski, 2012, Modeling temporal dependencies in high-dimensional sequences: Application to polyphonic music generation and transcription, Proc 29th Int Conf Mach Learn, 1159

robinson, 1987, The utility driven dynamic error propagation network

hutter, 2011, Sequential model-based optimization for general algorithm configuration, Proc LION, 507

garofolo, 1993, DARPA TIMIT acoustic-phonetic continuous speech corpus CD-ROM. NIST speech disc 1-1.1

snoek, 2012, Practical Bayesian optimization of machine learning algorithms, Advances in Neural Information Processing Systems 25, 2951

10.1016/0893-6080(88)90007-X