CL, Jan 12, 2021

Neural Contract Element Extraction Revisited: Letters from Sesame Street

Ilias Chalkidis, Manos Fergadiotis, Prodromos Malakasiotis, Ion Androutsopoulos

arXiv:2101.04355v22.018 citations

Originality Synthesis-oriented

AI Analysis

This work addresses contract element extraction for legal document analysis. It is incremental: it revisits existing methods and refines them with task-specific insights.

The study compares several neural architectures and embeddings for contract element extraction. It finds that LSTM-based encoders outperform dilated CNNs, Transformers, and BERT, and that domain-specific Word2Vec embeddings beat generic pre-trained GloVe embeddings. Transformers struggle in this highly context-sensitive task, which the authors attribute to their lack of recurrence.

We investigate contract element extraction. We show that LSTM-based encoders perform better than dilated CNNs, Transformers, and BERT in this task. We also find that domain-specific WORD2VEC embeddings outperform generic pre-trained GLOVE embeddings. Morpho-syntactic features in the form of POS tag and token shape embeddings, as well as context-aware ELMO embeddings, do not improve performance. Several of these observations contradict choices or findings of previous work on contract element extraction and generic sequence labeling tasks, indicating that contract element extraction requires careful task-specific choices. Analyzing the results of (i) plain TRANSFORMER-based and (ii) BERT-based models, we find that in the examined task, where the entities are highly context-sensitive, the lack of recurrency in TRANSFORMERs greatly affects their performance.

