Recent automatic text summarization techniques: a survey

Artificial Intelligence Review - Volume 47 - Pages 1-66 - 2016
Mahak Gambhir, Vishal Gupta
University Institute of Engineering and Technology, Panjab University, Chandigarh, India

Abstract

With information on every topic available in abundance on the internet, condensing the important information into a summary would benefit many users. Hence, there is growing interest in the research community in developing new approaches to summarize text automatically. An automatic text summarization system generates a summary, i.e. a short text that includes all the important information of the document. Since the advent of text summarization in the 1950s, researchers have been trying to improve summary generation techniques so that machine-generated summaries match human-made ones. Summaries can be generated through extractive as well as abstractive methods. Abstractive methods are highly complex because they require extensive natural language processing. Therefore, the research community is focusing more on extractive summarization, trying to achieve more coherent and meaningful summaries. Over the past decade, several extractive approaches that implement a number of machine learning and optimization techniques have been developed for automatic summary generation. This paper presents a comprehensive survey of recent extractive text summarization approaches developed in the last decade. Their needs are identified, and their advantages and disadvantages are listed in a comparative manner. A few abstractive and multilingual text summarization approaches are also covered. Summary evaluation is another challenging issue in this research field. Therefore, both intrinsic and extrinsic methods of summary evaluation are described in detail, along with text summarization evaluation conferences and workshops. Furthermore, evaluation results of extractive summarization approaches are presented on some shared DUC datasets. Finally, the paper concludes with a discussion of useful future directions that can help researchers identify areas where further research is needed.
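
To make the extractive idea concrete, the following is a minimal sketch of a frequency-based sentence extractor. It is an illustration only, not a method taken from the survey: the function names (`split_sentences`, `tokenize`, `extractive_summary`), the tiny stopword list, and the average-word-frequency score are all assumptions chosen for brevity.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "and", "or", "to",
             "is", "are", "it", "for", "that", "this", "as", "with", "by"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def extractive_summary(text, max_sentences=2):
    """Select the highest-scoring sentences and return them in document order."""
    sentences = split_sentences(text)
    # Word frequencies over the whole document act as a crude importance signal.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices by score; ties keep their original order.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])  # restore document order
    return " ".join(sentences[i] for i in chosen)


if __name__ == "__main__":
    doc = "Cats sleep a lot. Cats chase mice. The weather was mild."
    print(extractive_summary(doc, max_sentences=2))
```

The output consists only of sentences copied verbatim from the input, which is what distinguishes extractive from abstractive summarization. A practical system would add better sentence splitting, redundancy control, and a learned or optimized scoring function.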
