CL | Jul 29, 2025

Multilingual JobBERT for Cross-Lingual Job Title Matching

Jens-Joris Decorte, Matthias De Lange, Jeroen Van Hautte

arXiv:2507.21609v1 | 3 citations | h-index: 4 | Has Code | CLEF

Originality: Incremental advance

AI Analysis

The paper addresses job title matching across languages for multilingual labor market applications, but the advance is incremental because it extends an existing model, JobBERT-V2, rather than introducing a new approach.

The paper tackles cross-lingual job title matching by introducing JobBERT-V3, which extends a monolingual model to support English, German, Spanish, and Chinese using synthetic translations and a dataset of over 21 million job titles, achieving consistent performance on the TalentCLEF 2025 benchmark.

We introduce JobBERT-V3, a contrastive learning-based model for cross-lingual job title matching. Building on the state-of-the-art monolingual JobBERT-V2, our approach extends support to English, German, Spanish, and Chinese by leveraging synthetic translations and a balanced multilingual dataset of over 21 million job titles. The model retains the efficiency-focused architecture of its predecessor while enabling robust alignment across languages without requiring task-specific supervision. Extensive evaluations on the TalentCLEF 2025 benchmark demonstrate that JobBERT-V3 outperforms strong multilingual baselines and achieves consistent performance across both monolingual and cross-lingual settings. While not the primary focus, we also show that the model can be effectively used to rank relevant skills for a given job title, demonstrating its broader applicability in multilingual labor market intelligence. The model is publicly available: https://huggingface.co/TechWolf/JobBERT-v3.
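The matching step described above can be illustrated with a minimal sketch. It assumes job titles have already been encoded into fixed-length vectors and that candidates are ranked by cosine similarity, a common choice for contrastively trained encoders; the abstract does not state the exact scoring function. The 3-dimensional vectors and the helper names `cosine` and `rank_candidates` are hypothetical stand-ins, not outputs or APIs of JobBERT-V3.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_candidates(query_vec, candidates):
    """Return (label, score) pairs sorted by descending similarity to the query."""
    scored = [(label, cosine(query_vec, vec)) for label, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy embeddings; a real encoder would produce much longer vectors.
query = [0.9, 0.1, 0.0]  # stands in for the English title "Software Engineer"
candidates = {
    "Softwareentwickler": [0.8, 0.2, 0.1],  # German: software developer
    "Enfermera": [0.0, 0.1, 0.9],           # Spanish: nurse
    "数据分析师": [0.5, 0.7, 0.1],            # Chinese: data analyst
}

ranking = rank_candidates(query, candidates)
for label, score in ranking:
    print(f"{label}: {score:.3f}")
```

Because the query and the candidates share one embedding space, the same ranking routine would apply to skills instead of job titles, which is how the skill-ranking use mentioned in the abstract could be realized under these assumptions.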
