TER-Plus: paraphrase, semantic, and alignment enhancements to Translation Edit Rate

Matthew Snover1, Nitin Madnani2, Bonnie J. Dorr2, Richard Schwartz3
1Laboratory for Computational Linguistics and Information Processing, Institute for Advanced Computer Studies, University of Maryland, College Park, USA
2Laboratory for Computational Linguistics and Information Processing, Institute for Advanced Computer Studies, University of Maryland, College Park, USA
3BBN Technologies, Cambridge, USA

Abstract

Keywords


References

Banerjee S, Lavie A (2005) METEOR: an automatic metric for MT evaluation with improved correlation with human judgments. In: Proceedings of the ACL 2005 workshop on intrinsic and extrinsic evaluation measures for MT and/or summarization, pp 228–231

Bannard C, Callison-Burch C (2005) Paraphrasing with bilingual parallel corpora. In: Proceedings of the 43rd annual meeting of the association for computational linguistics (ACL 2005). Ann Arbor, Michigan, pp 597–604

Fellbaum C (1998) WordNet: an electronic lexical database. MIT Press. http://www.cogsci.princeton.edu/wn Accessed 7 Sep 2000

Kauchak D, Barzilay R (2006) Paraphrasing for automatic evaluation. In: Proceedings of the human language technology conference of the North American chapter of the ACL, pp 455–462

Lavie A, Sagae K, Jayaraman S (2004) The significance of recall in automatic metrics for MT evaluation. In: Proceedings of the 6th conference of the association for machine translation in the Americas, pp 134–143

Leusch G, Ueffing N, Ney H (2006) CDER: efficient MT evaluation using block movements. In: Proceedings of the 11th conference of the European chapter of the association for computational linguistics, pp 241–248

Lita LV, Rogati M, Lavie A (2005) BLANC: learning evaluation metrics for MT. In: Proceedings of human language technology conference and conference on empirical methods in natural language processing (HLT/EMNLP). Vancouver, BC, pp 740–747

Lopresti D, Tomkins A (1997) Block edit models for approximate string matching. Theor Comput Sci 181(1): 159–179

Madnani N, Resnik P, Dorr BJ, Schwartz R (2008) Are multiple reference translations necessary? Investigating the value of paraphrased reference translations in parameter optimization. In: Proceedings of the eighth conference of the association for machine translation in the Americas, pp 143–152

Niessen S, Och F, Leusch G, Ney H (2000) An evaluation tool for machine translation: fast evaluation for MT research. In: Proceedings of the 2nd international conference on language resources and evaluation, pp 39–45

Papineni K, Roukos S, Ward T, Zhu W-J (2002) Bleu: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting of the association for computational linguistics, pp 311–318

Porter MF (1980) An algorithm for suffix stripping. Program 14(3): 130–137

Przybocki M, Peterson K, Bronsart S (2008) Official results of the NIST 2008 “Metrics for MAchine TRanslation” Challenge (MetricsMATR08). http://nist.gov/speech/tests/metricsmatr/2008/results/

Rosti A-V, Matsoukas S, Schwartz R (2007) Improved word-level system combination for machine translation. In: Proceedings of the 45th annual meeting of the association of computational linguistics. Prague, Czech Republic, pp 312–319

Snover M, Dorr B, Schwartz R, Micciulla L, Makhoul J (2006) A study of translation edit rate with targeted human annotation. In: Proceedings of association for machine translation in the Americas, pp 223–231

Snover M, Madnani N, Dorr B, Schwartz R (2009) Fluency, adequacy, or HTER? Exploring different human judgments with a tunable MT metric. In: Proceedings of the fourth workshop on statistical machine translation. Association for Computational Linguistics, Athens, Greece, pp 259–268

Zhou L, Lin C-Y, Hovy E (2006) Re-evaluating machine translation results with paraphrase support. In: Proceedings of the 2006 conference on empirical methods in natural language processing (EMNLP 2006), pp 77–84