CL Sep 11, 2025

Reading Between the Lines: Classifying Resume Seniority with Large Language Models

Matan Cohen, Shira Shani, Eden Menahem, Yehudit Aperstein, Alexander Apartsin

arXiv:2509.09229v1

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of automating seniority assessment in hiring to reduce bias from self-promotional language, though it appears incremental as it applies existing LLM methods to a new dataset.

The study addresses the problem of classifying candidate seniority from resumes, a task complicated by overstated experience and ambiguous language. It evaluates large language models (LLMs) on a hybrid dataset of real and synthetic resumes and reports promising directions for AI-driven evaluation systems.

Accurately assessing candidate seniority from resumes is a critical yet challenging task, complicated by the prevalence of overstated experience and ambiguous self-presentation. In this study, we investigate the effectiveness of large language models (LLMs), including fine-tuned BERT architectures, for automating seniority classification in resumes. To rigorously evaluate model performance, we introduce a hybrid dataset comprising both real-world resumes and synthetically generated hard examples designed to simulate exaggerated qualifications and understated seniority. Using this dataset, we evaluate how well LLMs detect subtle linguistic cues associated with seniority inflation and implicit expertise. Our findings highlight promising directions for enhancing AI-driven candidate evaluation systems and mitigating bias introduced by self-promotional language. The dataset is available for the research community at https://bit.ly/4mcTovt
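
The hybrid-dataset evaluation described in the abstract can be illustrated with a small sketch that scores a classifier separately on real resumes and on synthetic hard examples. Everything here is an assumption made for illustration and is not taken from the paper: the label names (`junior`, `mid`, `senior`), the `source` field, the toy resumes, and the keyword baseline that stands in for a fine-tuned model.

```python
from collections import defaultdict


def accuracy_by_subset(examples, predict):
    """Return accuracy per data source (e.g. 'real' vs 'synthetic')."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for ex in examples:
        total[ex["source"]] += 1
        if predict(ex["text"]) == ex["label"]:
            correct[ex["source"]] += 1
    return {src: correct[src] / total[src] for src in total}


def keyword_baseline(text):
    """Toy stand-in for a model: it trusts self-described titles."""
    lowered = text.lower()
    if "senior" in lowered or "lead" in lowered:
        return "senior"
    if "junior" in lowered or "intern" in lowered:
        return "junior"
    return "mid"


# Hypothetical examples; labels and texts are invented for this sketch.
examples = [
    {"text": "Senior engineer, 10 years leading platform teams",
     "label": "senior", "source": "real"},
    {"text": "Junior developer, recent graduate",
     "label": "junior", "source": "real"},
    # Inflated wording: claims seniority that the experience does not support.
    {"text": "Senior visionary rockstar; completed one internship",
     "label": "junior", "source": "synthetic"},
    # Understated wording: deep experience with no senior title.
    {"text": "Developer. Designed and ran the company's core billing system for 12 years",
     "label": "senior", "source": "synthetic"},
]

scores = accuracy_by_subset(examples, keyword_baseline)
print(scores)
```

Reporting the two subsets separately shows how a model that relies on surface keywords can look accurate on ordinary resumes while failing on inflated or understated ones, which is the gap a hybrid dataset of this kind is meant to expose.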
