GenRE: generative multi-turn question answering with contrastive learning for entity–relation extraction

Complex & Intelligent Systems - pp. 1-15 - 2024
Lulu Wang1,2, Kai Yu3, Aishan Wumaier1,2, Peng Zhang4, Tuergen Yibulayin1,2, Xi Wu3, Jibing Gong3, Maihemuti Maimaiti1,2
1School of Computer Science and Technology, Xinjiang University, Urumqi, China
2Xinjiang Laboratory of Multi-Language Information Technology, Xinjiang University, Urumqi, China
3School of Information Science and Engineering, Yanshan University, Qinhuangdao, China
4Beijing Zhipu Huazhang Technology Co., Ltd., Beijing, China

Abstract

Extractive approaches have been the mainstream paradigm for overlapping entity–relation extraction. However, they are limited by inherent methodological flaws and struggle with three issues: hierarchically dependent entity–relations, implicit entity–relations, and entity normalization. Recent advances have proposed an effective alternative based on generative language models, which casts entity–relation extraction as a sequence-to-sequence text generation task. Inspired by the observation that humans learn by getting to the bottom of things, we propose a novel framework, GenRE: Generative multi-turn question answering with contrastive learning for entity–relation extraction. Specifically, a template-based question-prompt generation module is first designed to produce questions that are answered over successive turns. We then formulate entity–relation extraction as a generative question answering task built on a general language model, rather than as span-based machine reading comprehension. Meanwhile, a contrastive learning strategy is introduced during fine-tuning, adding negative samples to mitigate the exposure bias inherent in generative models. Extensive experiments demonstrate that GenRE performs competitively on two public datasets and a custom dataset, highlighting its superiority in entity normalization and implicit entity–relation extraction. (The code is available at https://github.com/lovelyllwang/GenRE ).
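As a rough illustration of the two mechanisms the abstract names, template-based question prompts answered in turns and a contrastive fine-tuning loss over negative samples, the following is a minimal sketch. The template strings, function names, and the InfoNCE-style form of the loss are assumptions for illustration, not the paper's actual implementation:

```python
import math

# Hypothetical per-turn question templates (illustrative only; the paper's
# templates are defined in its question-prompt generation module).
TEMPLATES = {
    "entity": "Which {etype} entities are mentioned? Context: {context}",
    "relation": "What is the {rel} of {head}? Context: {context}",
}

def build_prompt(turn, **slots):
    """Fill the template for the given turn to produce a question prompt."""
    return TEMPLATES[turn].format(**slots)

def contrastive_loss(pos_score, neg_scores, temperature=1.0):
    """InfoNCE-style loss over sequence log-likelihoods: pushes the gold
    answer's score above the scores of corrupted (negative) answers."""
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - pos_score / temperature  # -log softmax of the positive
```

During fine-tuning, `neg_scores` would be the model's log-likelihoods of perturbed answer sequences; the loss approaches zero as the gold answer's score dominates the negatives, which is how exposing the model to its own plausible-but-wrong outputs counteracts exposure bias.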
