Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups

IEEE Signal Processing Magazine - Volume 29, Issue 6 - Pages 82-97 - 2012

Geoffrey E. Hinton¹, Li Deng², Dong Yu³, George E. Dahl¹, Abdelrahman Mohamed⁴, Navdeep Jaitly¹, Andrew Senior⁵, Vincent Vanhoucke⁶, Patrick Nguyen⁶, Tara N. Sainath⁷, Brian Kingsbury⁸

¹Computer Science, Univ. Toronto, Toronto, Canada

²Department of Electrical and Computer Engineering, University of Waterloo, ONT, Canada

³Microsoft Research, Redmond, Washington, USA

⁴Department of Computer Science, University of Toronto, Toronto, M5S 3G4 Canada

⁵Google Inc., USA

⁶Research, Google, Mountain View, California, USA

⁷IBM Thomas J. Watson Research Center, USA

⁸Electrical engineering, Michigan State University, East Lansing, USA


