CL LG GN Nov 28, 2025

Standard Occupation Classifier -- A Natural Language Processing Approach

arXiv:2511.23057v1
Originality: Synthesis-oriented
AI Analysis

The paper provides a tool for analyzing labor market demand from job ads, but the contribution is incremental: it applies existing NLP methods to a specific domain.

The paper tackles the problem of automatically classifying job advertisements into standard occupation codes using natural language processing. An ensemble model combining BERT and a neural network achieves up to 61% accuracy for the fourth tier of SOC and 72% for the third tier.

Standard Occupational Classifiers (SOC) are systems used to categorize and classify different types of jobs and occupations based on their similarities in terms of job duties, skills, and qualifications. Integrating these facets with big data from job advertisements offers the prospect of investigating labour demand that is specific to various occupations. This project investigates the use of recent developments in natural language processing to construct a classifier capable of assigning an occupation code to a given job advertisement. We develop various classifiers for both UK ONS SOC and US O*NET SOC, using different language models. We find that an ensemble model, which combines Google BERT and a neural network classifier while considering job title, description, and skills, achieves the highest prediction accuracy. Specifically, the ensemble model exhibits a classification accuracy of up to 61% for the lower (or fourth) tier of SOC, and 72% for the third tier of SOC. This model could provide up-to-date, accurate information on the evolution of the labour market using job advertisements.
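
The abstract reports accuracy at two tiers of the SOC hierarchy. As a minimal sketch of how such tiered figures can be computed, the snippet below scores predictions at a coarser tier by comparing only the leading digits of each code. It assumes codes are stored as fixed-width digit strings in which the first three digits of a four-digit code identify its third-tier group, as in UK ONS SOC. The function name `tier_accuracy` and the sample codes are hypothetical and are not taken from the paper, which may evaluate its tiers differently.

```python
def tier_accuracy(true_codes, predicted_codes, digits):
    """Return the share of predictions whose first `digits` characters
    match the true code (hypothetical helper, not from the paper)."""
    if len(true_codes) != len(predicted_codes):
        raise ValueError("true and predicted code lists must have equal length")
    if not true_codes:
        raise ValueError("at least one code is required")
    # Truncating a 4-digit code to 3 digits maps it to its parent group,
    # so a prediction counts as correct if it lands in the same parent.
    hits = sum(t[:digits] == p[:digits] for t, p in zip(true_codes, predicted_codes))
    return hits / len(true_codes)


# Invented codes for illustration only; these are not data from the paper.
true_codes = ["2136", "2136", "3545", "1132"]
predicted_codes = ["2136", "2139", "3545", "4159"]

fourth_tier = tier_accuracy(true_codes, predicted_codes, digits=4)
third_tier = tier_accuracy(true_codes, predicted_codes, digits=3)
print(f"fourth tier: {fourth_tier:.2f}, third tier: {third_tier:.2f}")
```

Under this scoring, accuracy at a coarser tier can never be lower than at a finer one, because any exact four-digit match is also a three-digit match. That is consistent with the third-tier figure in the abstract exceeding the fourth-tier one.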
