Boosting court judgment prediction and explanation using legal entities

Irene Benedetto¹, Alkis Koudounas¹, Lorenzo Vaiani¹, Eliana Pastor¹, Luca Cagliero¹, Francesco Tarasconi², Elena Baralis¹
¹ Department of Computer and Control Engineering, Politecnico di Torino, Corso Castelfidardo, 39, 10129, Turin, Italy
² MAIZE SRL, Via San Quintino, 31, 10121, Turin, Italy

Abstract

Keywords

