Adjustable deterministic pseudonymization of speech
References
Almaadeed, 2016, Text-independent speaker identification using vowel formants, J. Signal Process. Syst., 82, 345, 10.1007/s11265-015-1005-5
Ardila, 2019
Boersma, P., Weenink, D., Praat: Doing Phonetics by Computer (Computer program). Version 6.1.06.
Christensen, 2018
De Jong, 2009, Praat script to detect syllable nuclei and measure speech rate automatically, Behav. Res. Methods, 41, 385, 10.3758/BRM.41.2.385
Dromey, 2013, Assessing correlations between lingual movements and formants, Speech Commun., 55, 315, 10.1016/j.specom.2012.09.001
Eyben, 2016, The Geneva minimalistic acoustic parameter set (GeMAPS) for voice research and affective computing, IEEE Trans. Affect. Comput., 7, 190, 10.1109/TAFFC.2015.2457417
Fang, 2019, Speaker anonymization using X-vector and neural waveform models, 155
Finck, 2020, They who must not be identified—distinguishing personal from non-personal data under the GDPR, Int. Data Privacy Law, 10, 11, 10.1093/idpl/ipz026
Fradette, 2003, Conventional and robust paired and independent-samples t tests: Type I error and power rates, J. Modern Appl. Statist. Methods, 2, 481, 10.22237/jmasm/1067646120
Harper, 2017, Quantifying labial, palatal, and pharyngeal contributions to third formant lowering in American English /r/, J. Acoust. Soc. Am., 142, 10.1121/1.5014445
Kent, 1989, Relationships between speech intelligibility and the slope of second-formant transitions in dysarthric subjects, Clinical Linguist. & Phonet., 3, 347, 10.3109/02699208908985295
Korshunov, 2017, Presentation attack detection in voice biometrics
Kucur Ergunay, 2015, On the vulnerability of speaker verification to realistic voice spoofing, 1
Kung, 2018, A compressive privacy approach to generalized information bottleneck and privacy funnel problems, J. Franklin Inst. B, 355, 1846, 10.1016/j.jfranklin.2017.07.002
Lammert, 2015, On short-time estimation of vocal tract length from formant frequencies, PLOS ONE, 10, 10.1371/journal.pone.0132193
Lee, 1988, On robust linear prediction of speech, IEEE Trans. Acoust. Speech Signal Process., 36, 642, 10.1109/29.1574
Lee, 2015, Relationships between formant frequencies of sustained vowels and tongue contours measured by ultrasonography, Am. J. Speech-Lang. Pathol., 24, 739, 10.1044/2015_AJSLP-14-0063
Mawalim, 2020, X-vector singular value modification and statistical-based decomposition with ensemble regression modeling for speaker anonymization system, 1703
McKell, 2016
Moulines, 1990, Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones, Speech Commun., 9, 453, 10.1016/0167-6393(90)90021-Z
Ning, 2019, A review of deep learning based speech synthesis, Appl. Sci., 9, 4050, 10.3390/app9194050
O’Shaughnessy, 2000, Speaker recognition, 437
Panayotov, 2015, Librispeech: an ASR corpus based on public domain audio books, 5206
Patino, 2020
Patino, 2020
Povey, 2016, Purely sequence-trained neural networks for ASR based on lattice-free MMI, 2751
R: A Language and Environment for Statistical Computing. http://www.R-project.org/.
Ribeiro, 2018
Richardson, 2017, Discrimination and identification of a third formant frequency cue to place of articulation by young children and adults, Lang. Speech, 60, 27, 10.1177/0023830915625680
Rubinstein, 2016, Anonymization and risk, Wash. Law Rev., 91, 59
Rudzicz, 2012, The TORGO database of acoustic and articulatory speech from speakers with dysarthria, Lang. Res. Eval., 46, 523, 10.1007/s10579-011-9145-0
Sapir, 2010, Formant centralization ratio: A proposal for a new acoustic measure of dysarthric speech, J. Speech, Lang., Hear. Res., 114, 10.1044/1092-4388(2009/08-0184)
Sapir, 2007, Effects of intensive voice treatment (the Lee Silverman Voice Treatment [LSVT]) on vowel articulation in dysarthric individuals with idiopathic Parkinson disease: Acoustic and perceptual findings, J. Speech, Lang., Hear. Res., 899, 10.1044/1092-4388(2007/064)
Snyder, 2018, X-vectors: Robust DNN embeddings for speaker recognition, 5329
Soldo, 2012, Synthetic references for template-based ASR using posterior features
Soldo, 2011, Posterior features for template-based ASR
van Son, R.J.J.H., Pseudonymize Speech, [Online; accessed 10th May 2020]. https://robvanson.github.io/PseudonymizeSpeech/.
van Son, 2020
van Son, 2020
van Son, 2020
van Son, 2018, Vowel space as a tool to evaluate articulation problems, 357
Srivastava, 2020, Evaluating voice conversion-based privacy protection against informed attackers
Stalla-Bourdillon, 2017, Anonymous data v. Personal data – a false debate: An EU perspective on anonymization, pseudonymization and personal data, Wis. Int. Law J., 34, 39
Tomashenko, 2020, The VoicePrivacy 2020 Challenge
Tomashenko, N., et al., The VoicePrivacy 2020 Challenge Evaluation Plan, [Online; accessed 1st April 2020]. https://www.voiceprivacychallenge.org/docs/VoicePrivacy_2020_Eval_Plan_v1_2.pdf.
Tomashenko, 2020, Introducing the VoicePrivacy initiative, 1693
Tomashenko, 2021
Tomashenko, 2021
Ullmann, 2015, Objective speech intelligibility assessment through comparison of phoneme class conditional probability sequences, 4924
Van Son, 2001, The IFA corpus: a phonemically segmented Dutch "open source" speech database, 2051
Wang, X., et al., The VoicePrivacy 2020 Challenge Subjective evaluation-1. https://www.voiceprivacychallenge.org/docs/6_Subjective_evaluation_1_naturalness_intelligibility_speaker_verifiability_X_Wang.pdf. Accessed on 25.05.2021.
Yamagishi, 2019
Zhang, 2017, Advanced data exploitation in speech analysis: An overview, IEEE Signal Process. Mag., 34, 107, 10.1109/MSP.2017.2699358