Analysis of speech production real-time MRI
References
Arens, 2001, Magnetic resonance imaging of the upper airway structure of children with obstructive sleep apnea syndrome, Am. J. Respir. Crit. Care Med., 164, 698, 10.1164/ajrccm.164.4.2101127
Asadiabadi, 2017, Vocal tract airway tissue boundary tracking for rtMRI using shape and appearance priors, 636
Atal, 1983, Efficient coding of LPC parameters by temporal decomposition, 8, 81
Badin, 1998, A three-dimensional linear articulatory model based on MRI data
Bae, 2011, Real-time magnetic resonance imaging of velopharyngeal activities with simultaneous speech recordings, Cleft Palate-Craniofac. J., 48, 695, 10.1597/09-158
Baer, 1991, Analysis of vocal tract shape and dimensions using magnetic resonance imaging: vowels, J. Acoust. Soc. Am., 90, 799, 10.1121/1.401949
Beautemps, 1995, Deriving vocal-tract area functions from midsagittal profiles and formant frequencies: a new model for vowels and fricative consonants based on experimental data, Speech Commun., 16, 27, 10.1016/0167-6393(94)00045-C
Beer, 2004, Dynamic near-real-time magnetic resonance imaging for analyzing the velopharyngeal closure in comparison with videofluoroscopy, J. Magn. Reson. Imaging, 20, 791, 10.1002/jmri.20197
Birkholz, 2013, Modeling consonant-vowel coarticulation for articulatory speech synthesis, PLoS One, 8, e60603, 10.1371/journal.pone.0060603
Birkholz, 2006, Vocal tract model adaptation using magnetic resonance imaging, 493
Bresch, 2010, Statistical multi-stream modeling of real-time MRI articulatory speech data
Bresch, 2009, Region segmentation in the frequency domain applied to upper airway real-time magnetic resonance images, IEEE Trans. Med. Imaging, 28, 323, 10.1109/TMI.2008.928920
Browman, 1995, Dynamics and articulatory phonology, 175
Burdumy, 2016
Byrd, 2003, The elastic phrase: modeling the dynamics of boundary-adjacent lengthening, J. Phon., 31, 149
Byrd, 2009, Timing effects of syllable structure and stress on nasals: a real-time MRI examination, J. Phon., 37, 97
Carey, 2017, Vocal tract images reveal neural representations of sensorimotor transformation during speech imitation, Cereb. Cortex, 27, 3064, 10.1093/cercor/bhx056
Carignan, 2013, The role of the pharynx and tongue in enhancement of vowel nasalization: a real-time MRI investigation of French nasal vowels, 3042
Carignan, 2015, A real-time MRI investigation of the role of lingual and pharyngeal articulation in the production of the nasal vowel system of French, J. Phon., 50, 34
Chi, 2011, Identification of craniofacial risk factors for obstructive sleep apnoea using three-dimensional MRI, Eur. Respir. J., 38, 348, 10.1183/09031936.00119210
Cootes, 1995, Active shape models - their training and application, Comput. Vis. Image Underst., 61, 38, 10.1006/cviu.1995.1004
Cootes, 2001, Active appearance models, IEEE Trans. Pattern Anal. Mach. Intell., 23, 681, 10.1109/34.927467
Delvaux, 2002, French nasal vowels: acoustic and articulatory properties, 53
Demolin, 1997, Coarticulation and articulatory compensations studied by dynamic MRI
Demolin, 2002, Real-time MRI and articulatory coordination in speech, C. R. Biol., 325, 547, 10.1016/S1631-0691(02)01458-0
Demolin, 2000, Real time MRI and articulatory coordinations in vowels, 86
Deng, 1998, A dynamic, feature-based approach to the interface between phonology and phonetics for speech modeling and recognition, Speech Commun., 24, 299, 10.1016/S0167-6393(98)00023-5
Deng, 1997, Production models as a structural basis for automatic speech recognition, Speech Commun., 22, 93, 10.1016/S0167-6393(97)00018-6
Ding, 2010, Convex and semi-nonnegative matrix factorizations, IEEE Trans. Pattern Anal. Mach. Intell., 32, 45, 10.1109/TPAMI.2008.277
Drissi, 2011, Feasibility of dynamic MRI for evaluating velopharyngeal insufficiency in children, Eur. Radiol., 21, 1462, 10.1007/s00330-011-2069-7
Echternach, 2016, Morphometric differences of vocal tract articulators in different loudness conditions in singing, PLoS One, 11, e0153792, 10.1371/journal.pone.0153792
Eide, 1996, A parametric approach to vocal tract length normalization, 1, 346
Engwall, 2003, A revisit to the application of MRI to the analysis of speech production – testing our assumptions, 43
Engwall, 2004, From real-time MRI to 3D tongue movements
Engwall, 1999, Collecting and analysing two and three-dimensional MRI data for Swedish, KTH STL-QPSR, 3, 011
Eryildirim, 2011, A guided approach for automatic segmentation and modeling of the vocal tract in MRI images, 61
Fitch, 1999, Morphology and development of the human vocal tract: a study using magnetic resonance imaging, J. Acoust. Soc. Am., 106, 1511, 10.1121/1.427148
Frankel, 2001, ASR-articulatory speech recognition
Freitas, 2016, Comparison of Cartesian and non-Cartesian real-time MRI sequences at 1.5 T to assess velar motion and velopharyngeal closure during speech, PLoS One, 11, e0153322, 10.1371/journal.pone.0153322
Fu, 2017, High-frame-rate full-vocal-tract 3D dynamic speech imaging, Magnet. Reson. Med., 77, 1619, 10.1002/mrm.26248
Fu, 2015, High-resolution dynamic speech imaging with joint low-rank and sparsity constraints, Magnet. Reson. Med., 73, 1820, 10.1002/mrm.25302
Ghosh, 2011, Automatic speech recognition using articulatory features from subject-independent acoustic-to-articulatory inversion, J. Acoust. Soc. Am., 130, EL251, 10.1121/1.3634122
Ghosh, 2011, A subject-independent acoustic-to-articulatory inversion, 4624
Greenwood, 1992, Measurements of vocal tract shapes using magnetic resonance imaging, IEE Proc. I – Commun. Speech Vis., 139, 553, 10.1049/ip-i-2.1992.0074
Hagedorn, 2014, Characterizing post-glossectomy speech using real-time MRI, 170
Hagedorn, 2011, Automatic analysis of singleton and geminate consonant articulation using real-time magnetic resonance imaging, 409
Hagedorn, 2017, Characterizing articulation in apraxic speech using real-time magnetic resonance imaging, J. Speech Lang. Hear. Res., 60, 877, 10.1044/2016_JSLHR-S-15-0112
Hardcastle, 1972, The use of electropalatography in phonetic research, Phonetica, 25, 197, 10.1159/000259382
Harshman, 1977, Factor analysis of tongue shapes, J. Acoust. Soc. Am., 62, 693, 10.1121/1.381581
Hart, 2010, A neural basis for motor primitives in the spinal cord, J. Neurosci., 30, 1322, 10.1523/JNEUROSCI.5894-08.2010
Heinz, 1964, On the derivation of area functions and acoustic spectra from cinéradiographic films of speech, J. Acoust. Soc. Am., 36, 1037, 10.1121/1.2143313
Hewer, 2014, A hybrid approach to 3D tongue modeling from vocal tract MRI using unsupervised image segmentation and mesh deformation, 418
Iltis, 2015, High-speed real-time magnetic resonance imaging of fast tongue movements in elite horn players, Quant. Imaging Med. Surg., 5, 374
Israel, 2012, Emphatic segments and emphasis spread in Lebanese Arabic: a real-time magnetic resonance imaging study
Jolliffe, 2002
Jung, 1996, Deriving gestural scores from articulator-movement records using weighted temporal decomposition, IEEE Trans. Speech Audio Process., 4, 2, 10.1109/TSA.1996.481448
Kass, 1988, Snakes: active contour models, Int. J. Comput. Vis., 1, 321, 10.1007/BF00133570
Katsamanis, 2011, Validating RT-MRI based articulatory representations via articulatory recognition
Kessler, 2015, The emerging science of quantitative imaging biomarkers: terminology and definitions for scientific studies and regulatory submissions, Stat. Methods Med. Res., 24, 9, 10.1177/0962280214537333
Kim, 2014, Enhanced airway-tissue boundary segmentation for real-time magnetic resonance imaging data, 222
Kim, 2012, Improved imaging of lingual articulation using real-time multislice MRI, J. Magn. Reson. Imaging, 35, 943, 10.1002/jmri.23510
Kröger, 2009, Articulatory synthesis of speech and singing: state of the art and suggestions for future research, Vol. 5398, 306
Kröger, 2007, A gesture-based concept for speech movement control in articulatory speech synthesis, 174
Labrunie, 2016, Tracking contours of orofacial articulators from real-time MRI of speech, 470
Ladefoged, 1971, Direct measurement of the vocal tract, J. Acoust. Soc. Am., 49, 104, 10.1121/1.1975547
Lammert, 2013, Statistical methods for estimation of direct and differential kinematics of the vocal tract, Speech Commun., 55, 147, 10.1016/j.specom.2012.08.001
Lammert, 2011, Automatic identification of stable modes and fluctuations in a repetitive task using real-time MRI
Lammert, 2013, Interspeaker variability in hard palate morphology and vowel production, J. Speech Lang. Hear. Res., 56, S1924, 10.1044/1092-4388(2013/12-0211)
Lammert, 2010, Data-driven analysis of realtime vocal tract MRI using correlated image regions, 1572
Lammert, 2015, On short-time estimation of vocal tract length from formant frequencies, PLoS One, 10, e0132193, 10.1371/journal.pone.0132193
Lee, 2003, Variational inference and learning for segmental switching state space models of hidden speech dynamics, 1, I
Lee, 1998, A frequency warping approach to speaker normalization, IEEE Trans. Speech Audio Process., 6, 49, 10.1109/89.650310
Lee, 2015, Systematic variation in the articulation of the Korean liquid across prosodic positions
Li, 2010, Distance regularized level set evolution and its application to image segmentation, IEEE Trans. Image Process., 19, 3243, 10.1109/TIP.2010.2069690
Li, 2016, Speaker verification based on the fusion of speech acoustics and inverted articulatory signals, Comput. Speech Lang., 36, 196, 10.1016/j.csl.2015.05.003
Li, 2006, The relationships among various nonnegative matrix factorization methods for clustering, 362
Ling, 2013, Articulatory control of HMM-based parametric speech synthesis using feature-space-switched multiple regression, IEEE Trans. Audio Speech Lang. Proc., 21, 207, 10.1109/TASL.2012.2215600
Ling, 2009, Integrating articulatory features into HMM-based parametric speech synthesis, IEEE Trans. Audio Speech Lang. Proc., 17, 1171, 10.1109/TASL.2009.2014796
Lingala, 2016, Recommendations for real-time speech MRI, J. Magn. Reson. Imaging, 43, 28, 10.1002/jmri.24997
Harandi, 2015, 3D segmentation of the tongue in MRI: a minimally interactive model-based approach, Comput. Methods Biomech. Biomed. Eng. Imaging Vis., 3, 178, 10.1080/21681163.2013.864958
Ma, 2004, Target-directed mixture dynamic models for spontaneous speech recognition, IEEE Trans. Speech Audio Process., 12, 47, 10.1109/TSA.2003.818074
Mády, 2003, Consonant articulation in glossectomee speech evaluated by dynamic MRI, 3233
Maeda, 1979, An articulatory model of the tongue based on a statistical analysis, J. Acoust. Soc. Am., 65, S22, 10.1121/1.2017158
Maeda, 1990, Compensatory articulation during speech: evidence from the analysis and synthesis of vocal-tract shapes using an articulatory model, Speech Prod. Speech Model., NATO ASI Series, 55, 131
McDermott, 2006, Production-oriented models for speech recognition, IEICE Trans. Inf. Syst., 89, 1006, 10.1093/ietisy/e89-d.3.1006
McGowan, 1994, Knowledge from speech production used in speech technology: articulatory synthesis, Haskins Laboratories Status Report on Speech Research, SR-117/118, 25
Mermelstein, 1973, Articulatory model for the study of speech production, J. Acoust. Soc. Am., 53, 1070, 10.1121/1.1913427
Metze, 2002, A flexible stream architecture for ASR using articulatory features
Mussa-Ivaldi, 1999, Motor primitives, force-fields and the equilibrium point theory, 392
Narayanan, 2004, An approach to real-time magnetic resonance imaging for speech production, J. Acoust. Soc. Am., 115, 1771, 10.1121/1.1652588
Narayanan, 2014, Real-time magnetic resonance imaging and electromagnetic articulography database for speech production research (TC), J. Acoust. Soc. Am., 136, 1307, 10.1121/1.4890284
Niebergall, 2013, Real-time MRI of speaking at a resolution of 33 ms: undersampled radial flash with nonlinear inverse reconstruction, Magnet. Reson. Med., 69, 477, 10.1002/mrm.24276
Öhman, 1967, Numerical model of coarticulation, J. Acoust. Soc. Am., 41, 310, 10.1121/1.1910340
Olthoff, 2014, On the physiology of normal swallowing as revealed by magnetic resonance imaging in real time, Gastroenterol. Res. Pract., 2014, 1, 10.1155/2014/493174
Ostry, 1996, Coarticulation of jaw movements in speech production: is context sensitivity in speech kinematics centrally planned?, J. Neurosci., 16, 1570, 10.1523/JNEUROSCI.16-04-01570.1996
Perkell, 1992, Electromagnetic midsagittal articulometer systems for transducing speech articulatory movements, J. Acoust. Soc. Am., 92, 3078, 10.1121/1.404204
Prasad, 2016, Information theoretic optimal vocal tract region selection from real time magnetic resonance images for broad phonetic class recognition, Comput. Speech Lang., 39, 108, 10.1016/j.csl.2016.03.003
Proctor, 2013, Paralinguistic mechanisms of production in human beatboxing: a real-time magnetic resonance imaging study, J. Acoust. Soc. Am., 133, 1043, 10.1121/1.4773865
Proctor, 2009, Articulatory comparison of Tamil liquids and stops using real-time magnetic resonance imaging, J. Acoust. Soc. Am., 125, 2568, 10.1121/1.4783732
Proctor, 2013, Velic coordination in French nasals: a realtime magnetic resonance imaging study, 577
Proctor, 2010, Temporal analysis of articulatory speech errors using direct image analysis of real-time magnetic resonance imaging, J. Acoust. Soc. Am., 128, 2289, 10.1121/1.3508036
Proctor, 2015, Articulation of English vowels in running speech: a real-time MRI study
Proctor, 2012, Articulatory bases of English liquids, 285
Proctor, 2016, Lingual consonant production in Khoekhoe: a real-time MRI study, 337
Proctor, 2010, Rapid semi-automatic segmentation of real-time magnetic resonance images for parametric vocal tract analysis, 1576
Proctor, 2012, Articulation of Mandarin sibilants: a multi-plane realtime MRI study
Raeesy, 2013, Automatic segmentation of vocal tract MR images, 1328
Rahim, 1993, On the use of neural networks in articulatory speech synthesis, J. Acoust. Soc. Am., 93, 1109, 10.1121/1.405559
Ramanarayanan, 2009, Analysis of pausing behavior in spontaneous speech using real-time magnetic resonance imaging of articulation, J. Acoust. Soc. Am., 126, EL160, 10.1121/1.3213452
Ramanarayanan, 2010, Investigating articulatory setting - pauses, ready position, and rest - using real-time MRI
Ramanarayanan, 2012, Exploiting speech production information for automatic speech and speaker modeling and recognition-possibilities and new opportunities, 1
Ramanarayanan, 2013, An investigation of articulatory setting using real-time magnetic resonance imaging, J. Acoust. Soc. Am., 134, 510, 10.1121/1.4807639
Ramanarayanan, 2013, Spatio-temporal articulatory movement primitives during speech production: extraction, interpretation, and validation, J. Acoust. Soc. Am., 134, 1378, 10.1121/1.4812765
Ramanarayanan, 2011, Automatic data-driven learning of articulatory primitives from real-time MRI data using convolutive NMF with sparseness constraints
Ramanarayanan, 2014, Are articulatory settings mechanically advantageous for speech motor control?, PLoS One, 9, 1, 10.1371/journal.pone.0104168
Ramanarayanan, 2016, Directly data-derived articulatory gesture-like representations retain discriminatory information about phone categories, Comput. Speech Lang., 36, 330, 10.1016/j.csl.2015.03.004
Rose, 1996, The potential role of speech production models in automatic speech recognition, J. Acoust. Soc. Am., 99, 1699, 10.1121/1.414679
Sagar, 2014, Feasibility study to assess clinical applications of 3-T cine MRI coupled with synchronous audio recording during speech in evaluation of velopharyngeal insufficiency in children, Pediatric Radiol., 45, 217, 10.1007/s00247-014-3141-7
Sampaio, 2017, Vocal tract morphology using real-time magnetic resonance imaging, 359
Scott, 2012, Towards clinical assessment of velopharyngeal closure using MRI: evaluation of real-time MRI sequences at 1.5 and 3T, Br. J. Radiol., 85, 1083, 10.1259/bjr/32938996
Scott, 2013, Adaptive averaging applied to dynamic imaging of the soft palate, Magnet. Reson. Med., 70, 865, 10.1002/mrm.24503
Shosted, 2012, Using magnetic resonance to image the pharynx during Arabic speech: static and dynamic aspects, 2182
Silva, 2015, Unsupervised segmentation of the vocal tract from real-time MRI sequences, Comput. Speech Lang., 33, 25, 10.1016/j.csl.2014.12.003
Silva, 2016, Quantitative systematic analysis of vocal tract data, Comput. Speech Lang., 36, 307, 10.1016/j.csl.2015.05.004
Silva, 2013, Segmentation and analysis of vocal tract from midsagittal real-time MRI
Singh, 2008, A unified view of matrix factorization models, 358
Smith, 2014, Complex tongue shaping in lateral liquid production without constriction-based goals, 413
Sosnik, 2004, When practice leads to co-articulation: the evolution of geometrically defined movement primitives, Exp. Brain Res., 156, 422, 10.1007/s00221-003-1799-4
Stone, 1995, A head and transducer support system for making ultrasound images of tongue/jaw movement, J. Acoust. Soc. Am., 98, 3107, 10.1121/1.413799
Stone, 2001, Modeling tongue surface contours from cine-MRI images, J. Speech Lang. Hear. Res., 44, 1026, 10.1044/1092-4388(2001/081)
Strang, 2006
Subtelny, 1972, Cineradiographic study of sibilants, Folia Phoniatr., 24, 30, 10.1159/000263541
Sutton, 2010, Faster dynamic imaging of speech with field inhomogeneity corrected spiral fast low angle shot (FLASH) at 3T, J. Magn. Reson. Imaging, 32, 1228, 10.1002/jmri.22369
Teixeira, 2012, Real-time MRI for Portuguese, 306
Tiede, 2000, Contrasts in speech articulation observed in sitting and supine conditions, 25
Tilsen, 2016, Anticipatory posturing of the vocal tract reveals dissociation of speech movement plans from linguistic units, PLoS One, 11, e0146813, 10.1371/journal.pone.0146813
Toda, 2004, Mapping from articulatory movements to vocal tract spectrum with Gaussian mixture model for articulatory speech synthesis
Töger, 2016, Sensitivity of quantitative RT-MRI metrics of vocal tract dynamics to image reconstruction settings, 165
Vaz, 2016, Convex hull convolutive non-negative matrix factorization for uncovering temporal patterns in multivariate time-series data, 963
Vijay Kumar, 2012, Assessment of swallowing and its disorders: a dynamic MRI study, Eur. J. Radiol., 82, 215, 10.1016/j.ejrad.2012.09.010
Vorperian, 2005, Development of vocal tract length during early childhood – a magnetic resonance imaging study, J. Acoust. Soc. Am., 117, 338, 10.1121/1.1835958
Welch, 2002, A novel volumetric magnetic resonance imaging paradigm to study upper airway anatomy, Sleep, 25, 532, 10.1093/sleep/25.5.530
Welling, 2002, Speaker adaptive modeling by vocal tract normalization, IEEE Trans. Speech Audio Process., 10, 415, 10.1109/TSA.2002.803435
Westbury, 1990, X-ray microbeam speech production database, J. Acoust. Soc. Am., 88, S56, 10.1121/1.2029064
Whalen, 2005, The Haskins optically corrected ultrasound system (Hocus), J. Speech Lang. Hear. Res., 48, 543, 10.1044/1092-4388(2005/037)
Wrench, 2000, A multi-channel/multi-speaker articulatory database for continuous speech recognition research
Yehia, 1997, A parametric three-dimensional model of the vocal-tract based on MRI data, 3, 1619
Zhang, 2016, Extraction of tongue contour in real-time magnetic resonance imaging sequences, 937
Zhang, 2012, Real-time magnetic resonance imaging of normal swallowing, J. Magn. Reson. Imaging, 35, 1372, 10.1002/jmri.23591
Zu, 2013, Evaluation of swallow function after tongue cancer treatment using real-time magnetic resonance imaging, JAMA Otolaryngol. Head Neck Surg., 139, 1312, 10.1001/jamaoto.2013.5444