Minimum redundancy and maximum relevance for single and multi-document Arabic text summarization

Houda Oufaida1, Omar Nouali2, Philippe Blache3
1Ecole Nationale Supérieure d’Informatique (ESI), Algiers, Algeria
2Research Center on Scientific and Technical Information (Cerist), Algiers, Algeria
3Aix Marseille Université, CNRS, LPL UMR 7309, 13604 Aix en Provence, France

References

Al-Sanie, W., Touir, A., Mathkour, H. (2005). Towards a suitable rhetorical representation for Arabic text summarization. In Kotsis, G., Taniar, D., Bressan, S., Ibrahim, I.K., Mokhtar, S. (Eds.), iiWAS, vol. 196, Austrian Computer Society, pp. 535–542.
Alguliev, 2011, MCMR: maximum coverage and minimum redundant text summarization model, Expert Syst. Appl., 38, 14514, 10.1016/j.eswa.2011.05.033
Alguliev, 2013, MR&MR-sum: maximum relevance and minimum redundancy document summarization model, Int J Inf Technol Decis Making, 12, 361, 10.1142/S0219622013500156
Attia, 2007, Arabic tokenization system, 65
Azmi, 2012, A text summarizer for Arabic, Comput. Speech Lang., 26, 260, 10.1016/j.csl.2012.01.002
Barzilay, 2005, Sentence fusion for multidocument news summarization, Comput. Linguist., 31, 297, 10.1162/089120105774321091
Boudin, 2011, A graph-based approach to cross-language multi-document summarization, Polibits, 43, 113, 10.17562/PB-43-16
Ding, C., Peng, H. (2003). Minimum redundancy feature selection from microarray gene expression data. Int. J. Bioinform. Comput. Biol., pp. 523–529.
Douzidia, F.S., Lapalme, G. (2004). Lakhas, an Arabic summarization system. Proceedings of the Document Understanding Conference (DUC2004).
Edmundson, 1969, New methods in automatic extracting, J. ACM, 16, 264, 10.1145/321510.321519
El-Haj, M., Kruschwitz, U., Fox, C. (2010). Using mechanical Turk to create a corpus of Arabic summaries. Proceedings of the Seventh Conference on International Language Resources and Evaluation.
El-Haj, M., Kruschwitz, U., Fox, C. (2011a). Experimenting with automatic text summarisation for Arabic. In Vetulani, Z. (Ed.), Human Language Technology. Challenges for Computer Science and Linguistics, Springer, Berlin Heidelberg, pp. 490–499.
El-Haj, M., Kruschwitz, U., Fox, C. (2011b). University of Essex at the TAC 2011 multilingual summarisation pilot. Proceedings of the Text Analysis Conference (TAC2011).
Evans, D.K., Mckeown, K., Klavans, J.L. (2005). Similarity-based Multilingual Multi-Document Summarization. IEEE Transactions on Information Theory, p. 49.
Farghaly, A., Shaalan, K. (2009). Arabic natural language processing: challenges and solutions. ACM Transactions on Asian Language Information Processing, 8(4), 14:1–14:22.
Giannakopoulos, 2013
Giannakopoulos, G., El-Haj, M., Favre, B., Litvak, M., Steinberger, J., Varma, V. (2011). TAC 2011 MultiLing pilot overview. Proceedings of the Text Analysis Conference (TAC).
Giannakopoulos, G., Karkaletsis, V. (2010). Summarization system evaluation variations based on n-gram graphs. Proceedings of the Text Analysis Conference (TAC2010).
Habash, 2005, Arabic tokenization, part-of-speech tagging and morphological disambiguation in one fell swoop, 573
Khoja, 1999
Lin, C. (2004). Rouge: A Package for Automatic Evaluation of Summaries. In Moens, M., Szpakowicz, S. (Eds.), Text Summarization Branches Out: Proceedings of the ACL-04 Workshop, Association for Computational Linguistics, pp. 74–81.
Liu, H., Liu, P., Wei, H., Li, L. (2011). The CIST summarization system at TAC 2011. Proceedings of the Text Analysis Conference (TAC2011).
Luhn, 1958, The automatic creation of literature abstracts, IBM J. Res. Dev., 2, 159, 10.1147/rd.22.0159
Mâaloul, M.H., Keskes, I., Hadrich Belguith, L., Blache, P. (2010). Automatic summarization of Arabic texts based on RST Technique. In Proceedings of the 12th International Conference on Enterprise Information Systems (ICEIS'2010), vol. 2, Portugal, pp. 434–437.
Mann, 1988, Rhetorical structure theory: toward a functional theory of text organization, Text – Interdiscip. J. Study Discourse, 8, 10.1515/text.1.1988.8.3.243
Marcu, 2000
Marcu, D.C. (1998). The rhetorical parsing, summarization, and generation of natural language texts. University of Toronto, Toronto, Ontario, Canada.
Mathkour, 2008, Parsing Arabic texts using rhetorical structure theory, J. Comput. Sci., 4, 713, 10.3844/jcssp.2008.713.720
Monroe, W., Green, S., Manning, C.D. (2014). Word Segmentation of Informal Arabic with Domain Adaptation. ACL, Short Papers.
Peng, 2005, Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy, IEEE Trans. Pattern Anal. Mach. Intell., 27, 1226, 10.1109/TPAMI.2005.159
Radev, 2004, Centroid-based summarization of multiple documents, Inf. Process. Manage., 40, 919, 10.1016/j.ipm.2003.10.006
Radev, 1998, Generating natural language summaries from multiple on-line sources, Comput. Linguist., 24, 470
Schlesinger, 2008, Arabic/English multi-document summarization with CLASSY—the past and the future, 568
Sobh, I., Darwish, N., Fayek, M. (2006). An optimized dual classification system for Arabic extractive generic text summarization. Proceedings of the 7th Conf. on Language Engineering, ESLEC, 149–154.
Wan, 2010, Cross-language document summarization based on machine translation quality prediction, 917