CL, IR · Sep 14, 2021

conSultantBERT: Fine-tuned Siamese Sentence-BERT for Matching Jobs and Job Seekers

Dor Lavi, Volodymyr Medentsiy, David Graus

arXiv:2109.06501v12.840 citations

Originality: Synthesis-oriented

AI Analysis

This paper addresses job matching for staffing consultants, but its contribution is incremental: it applies an existing method, Sentence-BERT fine-tuning, to a specific domain with new data.

The paper tackles the problem of matching jobs and job seekers by constructing embeddings from noisy, heterogeneous, and cross-lingual textual data in vacancies and resumes. The resulting fine-tuned Siamese Sentence-BERT model significantly outperforms baselines built on TF-IDF-weighted feature vectors and BERT embeddings.

In this paper we focus on constructing useful embeddings of textual information in vacancies and resumes, which we aim to incorporate as features into job-to-job-seeker matching models alongside other features. We describe our task, in which noisy data from parsed resumes, the heterogeneous nature of the different data sources, and cross-linguality and multilinguality present domain-specific challenges. We address these challenges by fine-tuning a Siamese Sentence-BERT (SBERT) model, which we call conSultantBERT, using a large-scale, real-world, high-quality dataset of over 270,000 resume-vacancy pairs labeled by our staffing consultants. We show that our fine-tuned model significantly outperforms unsupervised and supervised baselines that rely on TF-IDF-weighted feature vectors and BERT embeddings. In addition, we find that our model successfully matches cross-lingual and multilingual textual content.
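To make the comparison concrete, here is a minimal sketch of the kind of unsupervised TF-IDF baseline the abstract mentions: a resume and each vacancy are mapped to TF-IDF-weighted vectors and ranked by cosine similarity. This is not the authors' implementation. The tokenizer, the smoothed IDF formula, the helper names (`tokenize`, `fit_idf`, `embed`, `cosine`), and the toy texts are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def fit_idf(documents):
    """Compute smoothed inverse document frequencies over a small corpus."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(tokenize(doc)))
    # Assumed smoothing: log((1 + N) / (1 + df)) + 1, which is always positive.
    return {term: math.log((1 + n_docs) / (1 + df)) + 1.0
            for term, df in doc_freq.items()}


def embed(text, idf):
    """Map a text to a sparse TF-IDF vector (dict of term -> weight)."""
    counts = Counter(tokenize(text))
    return {term: count * idf[term]
            for term, count in counts.items() if term in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy data, not taken from the paper's dataset.
resume = "python developer with machine learning experience"
vacancies = [
    "senior python developer machine learning",
    "warehouse forklift operator night shift",
]

idf = fit_idf([resume] + vacancies)
resume_vec = embed(resume, idf)
scores = [cosine(resume_vec, embed(vacancy, idf)) for vacancy in vacancies]

# Index of the vacancy most similar to the resume.
best = max(range(len(vacancies)), key=lambda i: scores[i])
print(best, scores)
```

A Siamese model keeps the same scoring step but replaces `embed` with a single learned encoder applied to both the resume and the vacancy. Fine-tuning that encoder on labeled pairs is what lets it match texts that share meaning without sharing tokens, which a token-overlap baseline like this one cannot do.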
