Cross-lingual sentiment transfer with limited resources

Machine Translation - Volume 32, Issue 1-2 - Pages 143-165 - 2018
Mohammad Sadegh Rasooli (1), Noura Farra (1), Axinia Radeva (1), Tao Yu (2), Kathleen McKeown (1)
(1) Department of Computer Science, Columbia University, New York, NY 10027, USA
(2) Department of Computer Science, Yale University, New Haven, USA

References

Abdul-Mageed M, Diab MT (2011) Subjectivity and sentiment annotation of modern standard Arabic newswire. In: Proceedings of the 5th linguistic annotation workshop, Association for Computational Linguistics, pp 110–118

Ammar W, Mulcaire G, Tsvetkov Y, Lample G, Dyer C, Smith NA (2016) Massively multilingual word embeddings. arXiv:1602.01925

Baccianella S, Esuli A, Sebastiani F (2010) SentiWordNet 3.0: an enhanced lexical resource for sentiment analysis and opinion mining. LREC 10:2200–2204

Balahur A, Turchi M (2014) Comparative experiments using supervised learning and machine translation for multilingual sentiment analysis. Comput Speech Lang 28(1):56–75

Berg-Kirkpatrick T, Burkett D, Klein D (2012) An empirical investigation of statistical significance in NLP. In: Proceedings of the 2012 joint conference on empirical methods in natural language processing and computational natural language learning, Association for Computational Linguistics, pp 995–1005. http://aclweb.org/anthology/D12-1091

Brooke J, Tofiloski M, Taboada M (2009) Cross-linguistic sentiment analysis: from English to Spanish. In: RANLP, pp 50–54

Chang PC, Galley M, Manning CD (2008) Optimizing Chinese word segmentation for machine translation performance. In: Proceedings of the third workshop on statistical machine translation (StatMT ’08), Association for Computational Linguistics, Stroudsburg, PA, pp 224–232. http://dl.acm.org/citation.cfm?id=1626394.1626430

Chen X, Sun Y, Athiwaratkun B, Cardie C, Weinberger K (2016) Adversarial deep averaging networks for cross-lingual sentiment classification. arXiv:1606.01614

Christodoulopoulos C, Steedman M (2014) A massively parallel corpus: the Bible in 100 languages. Language resources and evaluation, pp 1–21

Duh K, Fujino A, Nagata M (2011) Is machine translation ripe for cross-lingual sentiment classification? In: Proceedings of the 49th annual meeting of the Association for Computational Linguistics: human language technologies: short papers—volume 2, Association for Computational Linguistics (HLT ’11), Stroudsburg, PA, pp 429–433. http://dl.acm.org/citation.cfm?id=2002736.2002823

Fan RE, Chang KW, Hsieh CJ, Wang XR, Lin CJ (2008) LIBLINEAR: a library for large linear classification. J Mach Learn Res 9:1871–1874

Faruqui M, Dyer C (2014) Improving vector space word representations using multilingual correlation. In: Proceedings of the 14th conference of the European chapter of the Association for Computational Linguistics, Association for Computational Linguistics, Gothenburg, pp 462–471. http://www.aclweb.org/anthology/E14-1049

Gouws S, Søgaard A (2015) Simple task-specific bilingual word embeddings. In: Proceedings of the 2015 conference of the North American chapter of the Association for Computational Linguistics: human language technologies, Association for Computational Linguistics, Denver, Colorado, pp 1386–1390. http://www.aclweb.org/anthology/N15-1157

Hermann KM, Blunsom P (2013) Multilingual distributed representations without word alignment. arXiv:1312.6173v4

Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780

Hosseini P, Ahmadian Ramaki A, Maleki H, Anvari M, Mirroshandel SA (2015) SentiPers: a sentiment analysis corpus for Persian

Hu M, Liu B (2004) Mining and summarizing customer reviews. In: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, pp 168–177

Joshi A, Balamurali A, Bhattacharyya P (2010) A fall-back strategy for sentiment analysis in Hindi: a case study. In: Proceedings of the 8th ICON

Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. CoRR abs/1412.6980. arXiv:1412.6980v9

Koehn P (2005) Europarl: a parallel corpus for statistical machine translation. MT Summit 5:79–86

Liu B (2012) Sentiment analysis and opinion mining. Synth Lect Hum Lang Technol 5(1):1–167

Meng X, Wei F, Liu X, Zhou M, Xu G, Wang H (2012) Cross-lingual mixture model for sentiment classification. In: Proceedings of the 50th annual meeting of the Association for Computational Linguistics (volume 1: long papers), Association for Computational Linguistics, Jeju Island, pp 572–581. http://www.aclweb.org/anthology/P12-1060

Mihalcea R, Banea C, Wiebe J (2007) Learning multilingual subjective language via cross-lingual projections. In: Proceedings of the 45th annual meeting of the Association of Computational Linguistics, Association for Computational Linguistics, Prague, Czech Republic, pp 976–983. http://www.aclweb.org/anthology/P07-1123

Mozetič I, Grčar M, Smailović J (2016) Twitter sentiment for 15 European languages. Slovenian language resource repository CLARIN.SI. http://hdl.handle.net/11356/1054

Mukund S, Srihari RK (2010) A vector space model for subjectivity classification in Urdu aided by co-training. In: Proceedings of the 23rd international conference on computational linguistics: posters, Association for Computational Linguistics, pp 860–868

Nair V, Hinton GE (2010) Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th international conference on machine learning (ICML-10), pp 807–814

Neubig G, Dyer C, Goldberg Y, Matthews A, Ammar W, Anastasopoulos A, Ballesteros M, Chiang D, Clothiaux D, Cohn T, et al (2017) DyNet: the dynamic neural network toolkit. arXiv:1701.03980

Och FJ, Ney H (2003) A systematic comparison of various statistical alignment models. Comput Linguist 29(1):19–51

Pasha A, Al-Badrashiny M, Diab MT, El Kholy A, Eskander R, Habash N, Pooleery M, Rambow O, Roth R (2014) MADAMIRA: a fast, comprehensive tool for morphological analysis and disambiguation of Arabic. LREC 14:1094–1101

Rasooli MS, Collins M (2017) Cross-lingual syntactic transfer with limited resources. Trans Assoc Comput Linguist 5:279–293. https://transacl.org/ojs/index.php/tacl/article/view/922

Rosenthal S, Farra N, Nakov P (2017) SemEval-2017 task 4: sentiment analysis in Twitter. In: Proceedings of the 11th international workshop on semantic evaluation (SemEval-2017), Association for Computational Linguistics, Vancouver, pp 502–518. http://www.aclweb.org/anthology/S17-2088

Salameh M, Mohammad SM, Kiritchenko S (2015) Sentiment after translation: a case-study on Arabic social media posts. In: Proceedings of the 2015 conference of the North American chapter of the Association for Computational Linguistics: human language technologies, pp 767–777

Socher R, Perelygin A, Wu JY, Chuang J, Manning CD, Ng AY, Potts C, et al (2013) Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of the conference on empirical methods in natural language processing (EMNLP), pp 1631–1642

Stratos K, Kim DK, Collins M, Hsu D (2014) A spectral algorithm for learning class-based n-gram models of natural language. In: Proceedings of the association for uncertainty in artificial intelligence

Täckström O, McDonald R, Uszkoreit J (2012) Cross-lingual word clusters for direct transfer of linguistic structure. In: Proceedings of the 2012 conference of the North American chapter of the Association for Computational Linguistics: human language technologies, Association for Computational Linguistics, pp 477–487

Vulić I, Moens MF (2016) Bilingual distributed word representations from document-aligned comparable data. J Artif Intell Res 55:953–994

Wan X (2008) Using bilingual knowledge and ensemble techniques for unsupervised Chinese sentiment analysis. In: Proceedings of the 2008 conference on empirical methods in natural language processing, Association for Computational Linguistics, Honolulu, Hawaii, pp 553–561. http://www.aclweb.org/anthology/D08-1058

Wang S, Manning CD (2012) Baselines and bigrams: simple, good sentiment and topic classification. In: Proceedings of the 50th annual meeting of the Association for Computational Linguistics: short papers-volume 2, Association for Computational Linguistics, pp 90–94

Wick M, Kanani P, Pocock A (2015) Minimally-constrained multilingual embeddings via artificial code-switching. In: Workshop on transfer and multi-task learning: trends and new perspectives, Montreal, Canada

Wilson T, Wiebe J, Hoffmann P (2005) Recognizing contextual polarity in phrase-level sentiment analysis. In: Proceedings of the conference on human language technology and empirical methods in natural language processing, Association for Computational Linguistics, pp 347–354

Yu T, Hidey C, Rambow O, McKeown K (2017) Leveraging sparse and dense feature combinations for sentiment classification. arXiv:1708.03940

Zhang R, Lee H, Radev D (2016) Dependency sensitive convolutional neural networks for modeling sentences and documents. arXiv:1611.02361

Zhou G, He T, Zhao J (2014) Bridging the language gap: learning distributed semantics for cross-lingual sentiment classification. In: Zong C, Nie JY, Zhao D, Feng Y (eds) Nat Lang Process Chin Comput. Springer, Berlin, pp 138–149

Zhou H, Chen L, Shi F, Huang D (2015) Learning bilingual sentiment word embeddings for cross-language sentiment classification. In: Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 1: long papers), Association for Computational Linguistics, Beijing, China, pp 430–440. http://www.aclweb.org/anthology/P15-1042

Zhou X, Wan X, Xiao J (2016a) Attention-based LSTM network for cross-lingual sentiment classification. In: Proceedings of the 2016 conference on empirical methods in natural language processing, Association for Computational Linguistics, Austin, Texas, pp 247–256. https://aclweb.org/anthology/D16-1024

Zhou X, Wan X, Xiao J (2016b) Cross-lingual sentiment classification with bilingual document representation learning. In: Proceedings of the 54th annual meeting of the Association for Computational Linguistics (volume 1: long papers), Association for Computational Linguistics, Berlin, pp 1403–1412. http://www.aclweb.org/anthology/P16-1133