CL · Jun 24, 2023

Can GPT-4 Support Analysis of Textual Data in Tasks Requiring Highly Specialized Domain Expertise?

Jaromir Savelka, Kevin D. Ashley, Morgan A. Gray, Hannes Westermann, Huihui Xu

CMU

arXiv:2306.13906v111.9127 citations · h-index: 25

Originality: Incremental advance

AI Analysis

This work addresses the problem of automating complex text analysis for researchers and practitioners in specialized domains such as law, though it is incremental in that it applies existing AI to new tasks.

The study evaluated GPT-4's ability to analyze textual data requiring highly specialized domain expertise, specifically in interpreting legal concepts from court opinions, and found it performs on par with well-trained law student annotators, with batch predictions offering significant cost reductions at a minor performance decrease.

We evaluated the capability of generative pre-trained transformers (GPT-4) in analysis of textual data in tasks that require highly specialized domain expertise. Specifically, we focused on the task of analyzing court opinions to interpret legal concepts. We found that GPT-4, prompted with annotation guidelines, performs on par with well-trained law student annotators. We observed that, with a relatively minor decrease in performance, GPT-4 can perform batch predictions, leading to significant cost reductions. However, employing chain-of-thought prompting did not lead to noticeably improved performance on this task. Further, we demonstrated how to analyze GPT-4's predictions to identify and mitigate deficiencies in annotation guidelines, and subsequently improve the performance of the model. Finally, we observed that the model is quite brittle, as small formatting-related changes in the prompt had a high impact on the predictions. These findings can be leveraged by researchers and practitioners who engage in semantic/pragmatic annotations of texts in the context of tasks requiring highly specialized domain expertise.
