NEC-TT System for Mixed-Bandwidth and Multi-Domain Speaker Recognition
References
Alam, 2018, Speaker verification in mismatched conditions with frustratingly easy domain adaptation, 176
Anguera, 2012, Speaker diarization: A review of recent research, IEEE Transactions on Audio, Speech, and Language Processing, 20, 356, 10.1109/TASL.2011.2125954
Bengio, 2000, A neural probabilistic language model, 932
Bhattacharya, 2017, Deep speaker embeddings for short-duration speaker verification, 1517
Bonastre, 2015, Forensic speaker recognition: mirages and reality, 255
Brümmer, 2014, Unsupervised domain adaptation for i-vector speaker recognition, 260
Chowdhury, 2017, Attention-based models for text-dependent speaker verification, arXiv preprint arXiv:1710.10470
Chung, 2018, VoxCeleb2: Deep speaker recognition, 1086
Cui, 2015, Data augmentation for deep neural network acoustic modeling, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 23, 1469, 10.1109/TASLP.2015.2438544
Ferrer, 2011, Promoting robustness for speaker modeling in the community: the PRISM evaluation set
Garcia-Romero, 2014, Supervised domain adaptation for i-vector based speaker recognition, 4047
Hansen, 2015, Speaker recognition by machines and humans: a tutorial review, IEEE Signal Processing Magazine, 32, 74, 10.1109/MSP.2015.2462851
Hinton, 2012, Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups, IEEE Signal Processing Magazine, 29, 82, 10.1109/MSP.2012.2205597
Ioffe, 2006, Probabilistic linear discriminant analysis, 531
Jones, 2017, Call My Net corpus: A multilingual corpus for evaluation of speaker recognition technology, 2621
Kenny, 2010, Bayesian speaker verification with heavy-tailed priors
Kinnunen, 2010, An overview of text-independent speaker recognition: from features to supervectors, Speech Communication, 52, 12, 10.1016/j.specom.2009.08.009
Kinoshita, 2016, A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research, EURASIP Journal on Advances in Signal Processing, 2016, 7, 10.1186/s13634-016-0306-6
Ko, 2017, A study on data augmentation of reverberant speech for robust speech recognition, 5220
Lee, 2019, I4U submission to NIST SRE 2018: Leveraging from a decade of shared experiences, 1497
Lee, 2013, Speaker verification makes its debut in smartphone, IEEE Signal Processing Society Speech and Language Technical Committee Newsletter
Lee, 2019, The CORAL+ algorithm for unsupervised domain adaptation of PLDA, 5821
Lee, 2018, The NEC-TT speaker verification system for SRE18, NIST SRE 2018 Workshop
Lee, 2019, The NEC-TT 2018 speaker verification system, 4355
Li, 2012, Improving wideband speech recognition using mixed-bandwidth training data in CD-DNN-HMM, 131
Li, 2015, DNN-based speech bandwidth expansion and its application to adding high-frequency missing features for automatic speech recognition of narrowband speech, 2575
McLaren, 2016, The Speakers in the Wild (SITW) speaker recognition database, 818, 10.21437/Interspeech.2016-1129
Mikolov, 2013, Distributed representations of words and phrases and their compositionality, 3111
Nagrani, 2017, VoxCeleb: A large-scale speaker identification dataset, 2616
Nidadavolu, 2018, Investigation on bandwidth extension for speaker recognition, 1111
National Institute of Standards, 2018, NIST 2018 Speaker Recognition Evaluation Plan, NIST SRE
Okabe, 2018, Attentive statistics pooling for deep speaker embedding, 2252
Peddinti, 2015, A time delay neural network architecture for efficient modeling of long temporal contexts, 3214
Prince, 2007, Probabilistic linear discriminant analysis for inferences about identity, 1
Schroff, 2015, FaceNet: A unified embedding for face recognition and clustering, 815
Sell, 2014, Speaker diarization with PLDA i-vector scoring and unsupervised calibration, 413
Silnova, 2018, Fast variational Bayes for heavy-tailed PLDA applied to i-vectors and x-vectors, 72
Snyder, 2015, MUSAN: a music, speech, and noise corpus
Snyder, 2017, Deep neural network embeddings for text-independent speaker verification, 999
Snyder, 2018, X-vectors: Robust DNN embeddings for speaker recognition, 5329
Snyder, 2016, Deep neural network-based speaker embeddings for end-to-end speaker verification, 165
SoX – Sound eXchange http://sox.sourceforge.net/.
Strang, 2019
Sun, 2016, Return of frustratingly easy domain adaptation, 2058
Tracey, 2018, VAST: A corpus of video annotation for speech technologies, 4318
Variani, 2014, Deep neural networks for small footprint text-dependent speaker verification, 4052
Vaswani, 2017, Attention is all you need, 5998
Villalba, 2019, State-of-the-art speaker recognition for telephone and video speech: the JHU-MIT submission for NIST SRE18, 1488
Villalba, 2018, The JHU-MIT system description for NIST SRE18, NIST SRE 2018 Workshop
Wang, 2018, Attention mechanism in speaker recognition: What does it learn in deep speaker embedding?, 1052
Wang, 2017, What does the speaker embedding encode?, 1497
Yamamoto, 2019, Speaker augmentation and bandwidth extension for deep speaker embedding, 406
Zeinali, 2019, How to improve your speaker embeddings extractor in generic toolkits, 6141
Zhang, 2017, End-to-end text-independent speaker verification with triplet loss on short utterances, 1487