CL · Feb 28, 2023
H-AES: Towards Automated Essay Scoring for Hindi
Shubhankar Singh, Anirudh Pupneja, Shivaansh Mital et al.
The use of Natural Language Processing (NLP) for Automated Essay Scoring (AES) has been well explored in the English language, with benchmark models exhibiting performance comparable to human scorers. However, AES in Hindi and other low-resource languages remains unexplored. In this study, we reproduce and compare state-of-the-art methods for AES in the Hindi domain. We employ classical feature-based Machine Learning (ML) and advanced end-to-end models, including LSTM networks and a fine-tuned Transformer architecture, and derive results comparable to those in the English language domain. Hindi, being a low-resource language, lacks a dedicated essay-scoring corpus. We therefore train and evaluate our models using translated English essays and empirically measure their performance on our own small-scale, real-world Hindi corpus. We follow this up with an in-depth analysis of the prompt-specific behavior of the different language models implemented.
LG · Nov 26, 2019
Text2FaceGAN: Face Generation from Fine Grained Textual Descriptions
Osaid Rehman Nasir, Shailesh Kumar Jha, Manraj Singh Grover et al.
Powerful generative adversarial networks (GANs) have been developed to automatically synthesize realistic images from text. However, most existing tasks are limited to generating simple images such as flowers from captions. In this work, we extend this problem to the less addressed domain of face generation from fine-grained textual descriptions of faces, e.g., "A person has curly hair, oval face, and mustache". We are motivated by the potential of automated face generation to impact and assist critical tasks such as criminal face reconstruction. Since current datasets for the task are either very small or do not contain captions, we generate captions for images in the CelebA dataset by creating an algorithm to automatically convert a list of attributes to a set of captions. We then model the highly multi-modal problem of text-to-face generation as learning the conditional distribution of faces (conditioned on text) in the same latent space. We utilize the current state-of-the-art GAN (DC-GAN with GAN-CLS loss) for learning conditional multi-modality. The presence of more fine-grained details and the variable length of the captions make the problem easier for a user but more difficult to handle than other text-to-image tasks. We flipped the labels for real and fake images and added noise in the discriminator. Generated images for diverse textual descriptions show promising results. Finally, we show that the widely used Inception Score is not a good metric for evaluating the performance of generative models used for synthesizing faces from text.
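The attribute-to-caption step described above can be illustrated with a minimal sketch. This is not the authors' algorithm: the phrase table, the sentence template, and the function names `attributes_to_caption` and `attributes_to_captions` are assumptions made for illustration. Only the attribute names (for example `Wavy_Hair`, `Oval_Face`, `Mustache`) come from the CelebA attribute list.

```python
import random

# Assumed mapping from CelebA attribute names to caption phrases.
# The attribute names are real CelebA labels; the phrasing is hypothetical.
PHRASES = {
    "Wavy_Hair": "wavy hair",
    "Black_Hair": "black hair",
    "Oval_Face": "an oval face",
    "High_Cheekbones": "high cheekbones",
    "Mustache": "a mustache",
    "Eyeglasses": "eyeglasses",
}


def join_phrases(phrases):
    """Join phrases into an English list: 'a', 'a and b', 'a, b, and c'."""
    if len(phrases) == 1:
        return phrases[0]
    if len(phrases) == 2:
        return f"{phrases[0]} and {phrases[1]}"
    return ", ".join(phrases[:-1]) + ", and " + phrases[-1]


def attributes_to_caption(active):
    """Turn a list of active attribute names into one caption."""
    phrases = [PHRASES[a] for a in active if a in PHRASES]
    if not phrases:
        return "A person."
    return f"A person has {join_phrases(phrases)}."


def attributes_to_captions(active, n=3, seed=0):
    """Produce up to n distinct captions by shuffling the attribute order."""
    rng = random.Random(seed)
    known = [a for a in active if a in PHRASES]
    captions = []
    for _ in range(20 * n):  # bounded retries; few orderings may exist
        order = known[:]
        rng.shuffle(order)
        caption = attributes_to_caption(order)
        if caption not in captions:
            captions.append(caption)
        if len(captions) == n:
            break
    return captions
```

Calling `attributes_to_caption(["Wavy_Hair", "Oval_Face", "Mustache"])` yields a single sentence in the style of the example in the abstract, while `attributes_to_captions` gives several variable-order captions for the same image.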
IR · Sep 30, 2019
End-to-End Resume Parsing and Finding Candidates for a Job Description using BERT
Vedant Bhatia, Prateek Rawat, Ajit Kumar et al.
The ever-increasing number of applications to job positions presents a challenge for employers to find suitable candidates manually. We present an end-to-end solution for ranking candidates based on their suitability to a job description. We accomplish this in two stages. First, we build a resume parser which extracts complete information from candidate resumes. This parser is made available to the public in the form of a web application. Second, we use BERT sentence pair classification to rank candidates based on their suitability to the job description. To approximate the job description, we use the description of past job experiences by a candidate as mentioned in their resume. Our dataset comprises resumes in LinkedIn format and general non-LinkedIn formats. We parse the LinkedIn resumes with 100% accuracy and establish a strong baseline of 73% accuracy for candidate suitability.
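The second stage, scoring each (job description, candidate experience) pair and sorting by score, can be sketched as follows. This is only an illustration of the ranking step under stated assumptions: the paper uses BERT sentence pair classification as the scorer, whereas the `jaccard_score` function here is a simple token-overlap placeholder standing in for it, and the names `rank_candidates` and `score_pair` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def jaccard_score(job_description: str, experience: str) -> float:
    """Placeholder pair scorer: token-set overlap between the two texts.

    This is NOT the BERT sentence pair classifier from the paper; it only
    stands in for any function that maps a text pair to a suitability score.
    """
    a = set(job_description.lower().split())
    b = set(experience.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_candidates(
    job_description: str,
    candidates: Dict[str, str],
    score_pair: Callable[[str, str], float] = jaccard_score,
) -> List[Tuple[str, float]]:
    """Score every candidate against the job description, best first."""
    scored = [
        (name, score_pair(job_description, experience))
        for name, experience in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Passing the scorer as a parameter keeps the ranking logic independent of the model, so a trained pair classifier could be substituted for the placeholder without changing `rank_candidates`.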