NJUST-KMG at TRAC-2024 Tasks 1 and 2: Offline Harm Potential Identification
This work addresses the challenge of detecting harmful content in multilingual social media for safety applications, but it is incremental as it builds on existing methods.
The paper tackled the problem of identifying offline harm potential from social media comments in Indian languages, achieving second place in two tracks with F1 scores of 0.73 and 0.96.
This report provides a detailed description of the method we proposed for TRAC-2024 Offline Harm Potential Identification, which comprises two sub-tasks. The task used a dataset of social media comments in several Indian languages, carefully annotated by expert judges to capture their nuanced implications for offline harm. Participants were asked to design algorithms that accurately assess the likelihood of harm in given situations and identify the most likely target(s) of offline harm. Our approach ranked second in two separate tracks, with F1 values of 0.73 and 0.96 respectively. Our method principally involved selecting pretrained models for finetuning, incorporating contrastive learning techniques, and ensembling the resulting models to produce predictions on the test set.
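The final ensembling step can be illustrated with a minimal sketch. The abstract does not say how the model outputs were combined, so the example below assumes soft voting (averaging per-class probabilities across models and taking the argmax); the function name `soft_vote` and the probability values are hypothetical and serve only as an illustration.

```python
from typing import List, Sequence


def soft_vote(model_probs: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Average per-class probabilities across models, then take the argmax.

    model_probs[m][i][c] is model m's probability that example i has class c.
    Soft voting is an assumption here, not the scheme confirmed by the report.
    """
    if not model_probs:
        raise ValueError("need at least one model")
    n_models = len(model_probs)
    n_examples = len(model_probs[0])
    predictions = []
    for i in range(n_examples):
        n_classes = len(model_probs[0][i])
        # Mean probability of each class over all models for example i.
        avg = [
            sum(model_probs[m][i][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        # Index of the class with the highest averaged probability.
        predictions.append(max(range(n_classes), key=avg.__getitem__))
    return predictions


# Hypothetical outputs of three fine-tuned models on two comments, three classes.
probs = [
    [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]],
    [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]],
    [[0.2, 0.7, 0.1], [0.2, 0.3, 0.5]],
]
labels = soft_vote(probs)
print(labels)  # [1, 2]
```

For the first comment, two of the three hypothetical models prefer class 0, yet the averaged probabilities favour class 1. This is the practical difference between soft voting and a simple majority vote: a single confident model can outweigh two marginal ones.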