Luis Adrián Cabrera-Diego

CL
3 papers
1 citation
Novelty: 8%
AI Score: 23

3 Papers

CL · Dec 31, 2025
Classifying long legal documents using short random chunks

Luis Adrián Cabrera-Diego

Classifying legal documents is challenging: besides their specialized vocabulary, they can be very long. Feeding full documents to a Transformer-based model for classification can therefore be impossible, expensive, or slow. We present a legal document classifier based on DeBERTa V3 and an LSTM that takes as input a collection of 48 randomly selected short chunks (at most 128 tokens each). We also present its deployment pipeline, built on Temporal, a durable execution solution, which gives us a reliable and robust processing workflow. The best model achieved a weighted F-score of 0.898, and the pipeline running on CPU had a median processing time of 498 seconds per 100 files.
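The chunk-selection idea in this abstract can be illustrated with a minimal sketch. It is not the author's implementation: the function name `sample_chunks`, the fixed-grid windowing, the use of pre-split tokens instead of the DeBERTa V3 subword tokenizer, and keeping the sampled chunks in document order are all assumptions made for illustration. Only the numbers 48 and 128 come from the abstract.

```python
import random


def sample_chunks(tokens, num_chunks=48, max_len=128, seed=None):
    """Return up to `num_chunks` random windows of at most `max_len` tokens.

    Hypothetical helper: the abstract only states that 48 randomly selected
    chunks of at most 128 tokens are used, not how they are drawn.
    """
    rng = random.Random(seed)
    # Assumption: cut the document into consecutive, non-overlapping windows.
    windows = [tokens[i:i + max_len] for i in range(0, len(tokens), max_len)]
    if len(windows) <= num_chunks:
        # Short documents are kept whole.
        return windows
    # Sample window indices without replacement; sorting keeps document order.
    picked = sorted(rng.sample(range(len(windows)), num_chunks))
    return [windows[i] for i in picked]


# Usage: a synthetic 10,000-token document yields 79 windows, of which 48 are kept.
document = [f"tok{i}" for i in range(10_000)]
chunks = sample_chunks(document, seed=0)
print(len(chunks), max(len(c) for c in chunks))  # prints: 48 128
```

In a setup like the one described, each chunk would then be encoded separately and the sequence of chunk representations passed to the LSTM. That part is omitted here because the abstract gives no architectural details beyond the component names.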

IR · Feb 21, 2017
Algorithmes de classification et d'optimisation: participation du LIA/ADOC à DEFT'14

Luis Adrián Cabrera-Diego, Stéphane Huet, Bassam Jabaian et al.

This year, the DEFT campaign (Défi Fouilles de Textes) includes a task that aims to identify the session in which articles from previous TALN conferences were presented. We describe the three statistical systems developed at LIA/ADOC for this task. Fusing these systems yields a micro-precision of 0.76 on the test corpus.

CL · Feb 21, 2017
Systèmes du LIA à DEFT'13

Xavier Bost, Ilaria Brunetti, Luis Adrián Cabrera-Diego et al.

The 2013 Défi de Fouille de Textes (DEFT) campaign addresses two types of language analysis tasks, document classification and information extraction, in the specialized domain of cooking recipes. We present the systems that the LIA used in DEFT 2013. Our systems show interesting results despite the complexity of the proposed tasks.