David Graus

h-index 8

6 papers

69 citations

Novelty 23%

AI Score 19

Ranked #187,398 of 194,257 authors (top 96%); #30,181 in CL (top 98%)

6 Papers

2.0 LG Sep 11, 2023

Career Path Recommendations for Long-term Income Maximization: A Reinforcement Learning Approach

Spyros Avlonitis, Dor Lavi, Masoud Mansoury et al.

This study explores the potential of reinforcement learning algorithms to enhance career planning processes. Leveraging data from Randstad The Netherlands, the study simulates the Dutch job market and develops strategies to optimize employees' long-term income. By formulating career planning as a Markov Decision Process (MDP) and utilizing machine learning algorithms such as Sarsa, Q-Learning, and A2C, we learn optimal policies that recommend career paths with high-income occupations and industries. The results demonstrate significant improvements in employees' income trajectories, with RL models, particularly Q-Learning and Sarsa, achieving an average increase of 5% compared to observed career paths. The study acknowledges limitations, including narrow job filtering, simplifications in the environment formulation, and assumptions regarding employment continuity and zero application costs. Future research can explore additional objectives beyond income optimization and address these limitations to further enhance career planning processes.
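The MDP formulation in the abstract above can be illustrated with a minimal tabular Q-learning sketch. Everything in it is an assumption made for illustration: the three occupations, their incomes, the allowed transitions, and the hyperparameters are made up and are not taken from the Randstad data or from the paper's environment. States are occupations, an action is the occupation to hold next, and the reward is that occupation's income.

```python
import random

# Hypothetical toy job market (made-up numbers, not from the paper).
INCOME = {"clerk": 30, "analyst": 45, "manager": 60}
# Hypothetical allowed moves; staying in the current occupation is always allowed.
MOVES = {
    "clerk": ["clerk", "analyst"],
    "analyst": ["analyst", "manager"],
    "manager": ["manager"],
}


def q_learning(episodes=5000, horizon=10, alpha=0.1, gamma=0.9, seed=0):
    """Tabular Q-learning with a uniformly random behaviour policy.

    Q-learning is off-policy, so random exploration is enough for this toy
    problem; the paper's actual exploration scheme is not stated in the abstract.
    """
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in MOVES for a in MOVES[s]}
    for _ in range(episodes):
        state = "clerk"
        for _ in range(horizon):
            action = rng.choice(MOVES[state])  # action = next occupation
            reward = INCOME[action]            # income earned in that occupation
            best_next = max(q[(action, a)] for a in MOVES[action])
            # Standard Q-learning update toward reward + discounted best next value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = action
    return q


def greedy_path(q, start="clerk", steps=3):
    """Follow the greedy policy implied by the learned Q-table."""
    path = [start]
    for _ in range(steps):
        s = path[-1]
        path.append(max(MOVES[s], key=lambda a: q[(s, a)]))
    return path


q_table = q_learning()
recommended = greedy_path(q_table)
```

In this toy setting the greedy policy recommends moving up as soon as a higher-income occupation is reachable. Sarsa would differ only in bootstrapping from the action actually taken next rather than from the best available one.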

0.9 CL Aug 31, 2023

Enhancing PLM Performance on Labour Market Tasks via Instruction-based Finetuning and Prompt-tuning with Rules

Jarno Vrolijk, David Graus

The increased digitization of the labour market has given researchers, educators, and companies the means to analyze and better understand the labour market. However, labour market resources, although available in high volumes, tend to be unstructured, and as such, research towards methodologies for the identification, linking, and extraction of entities becomes more and more important. Against the backdrop of this quest for better labour market representations, resource constraints and the unavailability of large-scale annotated data cause a reliance on human domain experts. We demonstrate the effectiveness of prompt-based tuning of pre-trained language models (PLM) in labour market specific applications. Our results indicate that cost-efficient methods such as PTR and instruction tuning without exemplars can significantly increase the performance of PLMs on downstream labour market applications without introducing additional model layers, manual annotations, and data augmentation.

2.8 CL Sep 14, 2021

conSultantBERT: Fine-tuned Siamese Sentence-BERT for Matching Jobs and Job Seekers

Dor Lavi, Volodymyr Medentsiy, David Graus

In this paper we focus on constructing useful embeddings of textual information in vacancies and resumes, which we aim to incorporate as features into job-to-job-seeker matching models alongside other features. We explain our task, where noisy data from parsed resumes, the heterogeneous nature of the different sources of data, and crosslinguality and multilinguality present domain-specific challenges. We address these challenges by fine-tuning a Siamese Sentence-BERT (SBERT) model, which we call conSultantBERT, using a large-scale, real-world, and high-quality dataset of over 270,000 resume-vacancy pairs labeled by our staffing consultants. We show how our fine-tuned model significantly outperforms unsupervised and supervised baselines that rely on TF-IDF-weighted feature vectors and BERT embeddings. In addition, we find our model successfully matches cross-lingual and multilingual textual content.

8.5 IR Sep 6, 2021

Job Posting-Enriched Knowledge Graph for Skills-based Matching

Maurits de Groot, Jelle Schutte, David Graus

The labor market is constantly evolving. Occupations are changing, being added, or disappearing to fit the needs of today's market. In recent years the pace of this change has accelerated, due to factors such as globalization, digitization, and the shift to working from home. Different factors are relevant when selecting employment, e.g., cultural fit, compensation, provided degree of freedom. To successfully fulfill an occupation the gap between required (by the job) and possessed (by the job seeker) skills needs to be as small as possible. Decreasing this skill-gap improves the fit between a job candidate and occupation. In this paper we propose a custom-built Skills & Occupation Knowledge Graph (KG) that fits the above described dynamic nature of the labor market, by leveraging existing skills and occupation taxonomies enriched with external job posting data. We leverage this KG and explore several applications for skills-based matching of jobs to job seekers. First, we study link prediction as a means to quantify relevance of skills to occupations, which can help in prioritizing learning and development of employees. Next, we study node similarity methods and shortest path algorithms for career pathfinding. Finally, we leverage a term weighting method for identifying which skills are most "distinctive" for different (types of) occupations.
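The career pathfinding idea mentioned in the abstract above can be sketched with a breadth-first shortest-path search over an occupation graph. This is only an illustration under stated assumptions: the occupations, their skills, and the rule "two occupations are linked if they share at least one skill" are hypothetical and do not reflect the paper's knowledge graph or its edge definitions.

```python
from collections import deque

# Hypothetical occupation -> required skills mapping (illustrative only).
OCCUPATION_SKILLS = {
    "data entry clerk": {"spreadsheets", "typing"},
    "reporting analyst": {"spreadsheets", "sql"},
    "data engineer": {"sql", "python"},
    "ml engineer": {"python", "statistics"},
}


def build_graph(occupation_skills):
    """Link two occupations when they share at least one skill (an assumption)."""
    graph = {occupation: set() for occupation in occupation_skills}
    for a, skills_a in occupation_skills.items():
        for b, skills_b in occupation_skills.items():
            if a != b and skills_a & skills_b:
                graph[a].add(b)
    return graph


def shortest_career_path(graph, start, goal):
    """Breadth-first search: returns a path with the fewest occupation changes."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in sorted(graph[path[-1]]):  # sorted for deterministic output
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None  # goal not reachable from start


graph = build_graph(OCCUPATION_SKILLS)
career_path = shortest_career_path(graph, "data entry clerk", "ml engineer")
```

Each hop in the returned path is a move between occupations that overlap in skills, so the path can be read as a sequence of small skill gaps rather than one large one.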

2.0 IR Feb 22, 2021

Entities of Interest

David Graus

In the era of big data, we continuously - and at times unknowingly - leave behind digital traces, by browsing, sharing, posting, liking, searching, watching, and listening to online content. When aggregated, these digital traces can provide powerful insights into the behavior, preferences, activities, and traits of people. While many have raised privacy concerns around the use of aggregated digital traces, it has undisputedly brought us many advances, from the search engines that learn from their users and enable our access to unforeseen amounts of data, knowledge, and information, to, e.g., the discovery of previously unknown adverse drug reactions from search engine logs. Whether in online services, journalism, digital forensics, law, or research, we increasingly set out to explore large amounts of digital traces to discover new information. Consider, for instance, the Enron scandal, Hillary Clinton's email controversy, or the Panama papers: cases that revolve around analyzing, searching, investigating, exploring, and turning upside down large amounts of digital traces to gain new insights, knowledge, and information. This discovery task is at its core about "finding evidence of activity in the real world." This dissertation revolves around discovery in digital traces, and sits at the intersection of Information Retrieval, Natural Language Processing, and applied Machine Learning. We propose computational methods that aim to support the exploration and sense-making process of large collections of digital traces. We focus on textual traces, e.g., emails and social media streams, and address two aspects that are central to discovery in digital traces.

3.3 LG Feb 12, 2020

Improving automated segmentation of radio shows with audio embeddings

Oberon Berlage, Klaus-Michael Lux, David Graus

Audio features have been proven useful for increasing the performance of automated topic segmentation systems. This study explores the novel task of using audio embeddings for automated, topically coherent segmentation of radio shows. We created three different audio embedding generators using multi-class classification tasks on three datasets from different domains. We evaluate topic segmentation performance of the audio embeddings and compare it against a text-only baseline. We find that a set-up including audio embeddings generated through a non-speech sound event classification task significantly outperforms our text-only baseline by 32.3% in F1-measure. In addition, we find that different classification tasks yield audio embeddings that vary in segmentation performance.