Identification of related languages from spoken data: Moving from off-line to on-line scenario
References
NIST Language Recognition Evaluations, 2020. http://nist.gov/itl/iad/mig/lre.cfm, online (accessed: 2020-05-20).
LRE, 2015. The 2015 NIST language recognition evaluation plan (LRE15).
LRE, 2017. NIST 2017 language recognition evaluation plan.
Abdullah, B. M., Avgustinova, T., Möbius, B., Klakow, D., 2020. Cross-domain adaptation of spoken language identification for related languages: the curious case of Slavic languages. arXiv:2008.00545.
Cai, 2019, Utterance-level end-to-end language identification using attention-based CNN-BLSTM, 5991
Cai, 2018, Insights into end-to-end learning scheme for language identification, 5209
Cai, 2018, A novel learnable dictionary encoding layer for end-to-end language identification, 5189
Cai, 2018, Exploring the encoding layer and loss function in end-to-end speaker and language recognition system, 74
Caseiro, 1998, Spoken language identification using the SpeechDat corpus, 1
Dahl, 2012, Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition, IEEE Trans. Audio Speech Lang. Process., 20, 30, 10.1109/TASL.2011.2134090
Dehak, 2011, Language recognition via i-vectors and dimensionality reduction, 857
D’Haro, 2014, Extended phone log-likelihood ratio features and acoustic-based i-vectors for language recognition, 5342
Fer, 2015, Multilingual bottleneck features for language recognition, 389
Fer, 2017, Multilingually trained bottleneck features in spoken language recognition, Comput. Speech Lang., 46, 252, 10.1016/j.csl.2017.06.008
Fernando, 2017, Bidirectional modelling for short duration language identification, 2809
Ferrer, 2016, Study of senone-based deep neural network approaches for spoken language recognition, IEEE/ACM Trans. Audio Speech Lang. Process., 24, 105, 10.1109/TASLP.2015.2496226
Garcia-Romero, 2016, Stacked long-term TDNN for spoken language recognition, 3226
Gauvain, 2004, Language recognition using phone lattices, 1283
Gelly, 2017, Spoken language identification using LSTM-based angular proximity, 2566
Gelly, 2016, A divide-and-conquer approach for language identification based on recurrent neural networks, 3231
Geng, 2016, End-to-end language identification using attention-based recurrent neural networks, 2944
Geng, 2016, Gating recurrent enhanced memory neural networks on language identification, 3280
Gonzalez, 2011, Language recognition in ivectors space, 861
Gonzalez-Dominguez, 2014, Automatic language identification using long short-term memory recurrent neural networks, 2155
Griol, 2020, A data-driven approach to spoken dialog segmentation, Neurocomputing, 391, 292, 10.1016/j.neucom.2019.02.072
Jin, 2018, LID-senones and their statistics for language identification, IEEE/ACM Trans. Audio Speech Lang. Process., 26, 171, 10.1109/TASLP.2017.2766023
Li, 2007, A vector space modeling approach to spoken language identification, IEEE Trans. Audio Speech Lang. Process., 15, 271, 10.1109/TASL.2006.876860
Li, 2013, Spoken language recognition: from fundamentals to practice, Proc. IEEE, 101, 1136, 10.1109/JPROC.2012.2237151
Lim, 2010, Real-time spoken language identification and recognition for speech-to-speech translation, 307
Liu, 2017, A survey of deep neural network architectures and their applications, Neurocomputing, 234, 11, 10.1016/j.neucom.2016.12.038
Lopez, 2018, End-to-end versus embedding neural networks for language recognition in mismatched conditions, 112
Lopez-Moreno, 2014, Automatic language identification using deep neural networks, 5337
Lozano-Diez, 2018, DNN based embeddings for language recognition, 5184
Lozano-Diez, 2015, An end-to-end approach to language identification in short utterances using convolutional neural networks, 403
Malek, 2019, On practical aspects of multi-condition training based on augmentation for reverberation-/noise-robust speech recognition, 251
Malek, 2018, Robust recognition of conversational telephone speech via multi-condition training and data augmentation, 324
Masumura, 2017, Parallel phonetically aware DNNs and LSTM-RNNs for frame-by-frame discriminative modeling of spoken language identification, 5260
Mateju, 2019, An approach to online speaker change point detection using DNNs and WFSTs, 649
Mateju, 2017, Speech activity detection in online broadcast transcription using deep neural networks and weighted finite state transducers, 5460
Mateju, 2018, Using deep neural networks for identification of Slavic languages from acoustic signal, 1803
McLaren, 2016, Exploring the role of phonetic bottleneck features for speaker and language recognition, 5575
Miao, 2019, A new time-frequency attention mechanism for TDNN and CNN-LSTM-TDNN, with application to language identification, 4080
Mingote, 2019, Language recognition using triplet neural networks, 4025
Nouza, 2016, ASR for South Slavic languages developed in almost automated way, 3868
Okamoto, 2017, Reducing latency for language identification based on large-vocabulary continuous speech recognition, Acoust. Sci. Technol., 38, 38, 10.1250/ast.38.38
Padi, 2019, Attention based hybrid i-vector BLSTM model for language recognition, 1263
Padi, 2019, End-to-end language recognition using attention based hierarchical gated recurrent unit models, 5966
Pesan, 2016, Sequence summarizing neural networks for spoken language recognition, 3285
Povey, 2011, The Kaldi speech recognition toolkit, 1
Rasanen, 2009, An improved speech segmentation quality measure: the R-value, 1851
Richardson, 2015, Deep neural network approaches to speaker and language recognition, IEEE Signal Process. Lett., 22, 1671, 10.1109/LSP.2015.2420092
Richardson, 2015, A unified deep neural network for speaker and language recognition, 1146
Singer, 2012, The MITLL NIST LRE 2011 language recognition system, 209
Siniscalchi, 2014, An artificial neural network approach to automatic speech processing, Neurocomputing, 140, 326, 10.1016/j.neucom.2014.03.005
Snyder, 2018, Spoken language recognition using x-vectors, 105
Snyder, 2018, X-vectors: robust DNN embeddings for speaker recognition, 5329
Song, 2015, Deep bottleneck network based i-vector representation for language identification, 398
V., 2016, An investigation of deep neural network architectures for language recognition in Indian languages, 2930
Wan, 2019, Tuplemax loss for language identification, 5976
Zazo, 2016, Evaluation of an LSTM-RNN system in different NIST language recognition frameworks, 231
Zhang, 2015, Feedforward sequential memory networks: A new structure to learn long-term dependency, CoRR
Zissman, 1996, Comparison of four approaches to automatic language identification of telephone speech, IEEE Trans. Speech Audio Process., 4, 31, 10.1109/TSA.1996.481450