A deep neural network based correction scheme for improved air-tissue boundary prediction in real-time magnetic resonance imaging video

Computer Speech & Language - Volume 66 - Page 101160 - 2021
Renuka Mannem (1), Prasanta Kumar Ghosh (1)
(1) Electrical Engineering, Indian Institute of Science (IISc), Bangalore 560012, Karnataka, India
