TempCourt: evaluation of temporal taggers on a new corpus of court decisions

María Navas Loro, Erwin Filtz, Víctor Rodríguez-Doncel, Axel Polleres, Sabrina Kirrane

Publication: Scientific journal, Journal article, peer-review

Abstract

The extraction and processing of temporal expressions (TEs) in textual documents have been extensively studied in several domains; however, for the legal domain it remains an open challenge. This is possibly due to the scarcity of corpora in the domain and the particularities found in legal documents that are highlighted in this paper. Considering the pivotal role played by temporal information when it comes to analyzing legal cases, this paper presents TempCourt, a corpus of 30 legal documents from the European Court of Human Rights, the European Court of Justice, and the United States Supreme Court with manually annotated TEs. The corpus contains two different temporal annotation sets that adhere to the TimeML standard, the first one capturing all TEs and the second dedicated to TEs that are relevant for the case under judgment (thus excluding dates of previous court decisions). The proposed gold standards are subsequently used to compare ten state-of-the-art cross-domain temporal taggers, and to identify not only the limitations of cross-domain temporal taggers but also limitations of the TimeML standard when applied to legal documents. Finally, the paper identifies the need for dedicated resources and the adaptation of existing tools, and specific annotation guidelines that can be adapted to different types of legal documents.
Original language: English
Pages (from-to): E34
Journal: The Knowledge Engineering Review
Volume: 34
Publication status: Published - 2019
