Improving the automatic segmentation of subtitles through conditional random field

Speech Communication - Volume 88 - Pages 83-95 - 2017

Aitor Álvarez¹, Carlos-D. Martínez-Hinarejos², Haritz Arzelus¹, Marina Balenciaga¹, Arantza del Pozo¹

¹Human Speech and Language Technology Group, Vicomtech-IK4, San Sebastian, Spain

²Pattern Recognition and Human Language Technologies Research Center, Universitat Politècnica de València, Spain

