Weak Links in LinkedIn: Enhancing Fake Profile Detection in the Age of LLMs
This work addresses a critical security issue for social media platforms and their users by hardening detection against AI-generated fake profiles, though the contribution is incremental because it builds on existing methods.
The study examines how vulnerable LinkedIn fake profile detectors are to LLM-generated profiles. Existing detectors failed against such profiles, with a 42-52% false accept rate; the proposed GPT-assisted adversarial training reduced this to 1-7% without harming false reject rates.
Large Language Models (LLMs) have made it easier to create realistic fake profiles on platforms like LinkedIn. This poses a significant risk for text-based fake profile detectors. In this study, we evaluate the robustness of existing detectors against LLM-generated profiles. While highly effective in detecting manually created fake profiles (False Accept Rate: 6-7%), the existing detectors fail to identify GPT-generated profiles (False Accept Rate: 42-52%). We propose GPT-assisted adversarial training as a countermeasure, restoring the False Accept Rate to 1-7% without impacting the False Reject Rate (0.5-2%). Ablation studies revealed that detectors trained on combined numerical and textual embeddings exhibit the highest robustness, followed by those using numerical-only embeddings, and lastly those using textual-only embeddings. A complementary analysis of how well prompt-based GPT-4 Turbo and human evaluators perform at this task affirms the need for robust automated detectors such as the one proposed in this study.
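
To make the two reported metrics concrete, the following is a minimal sketch of how a False Accept Rate and a False Reject Rate could be computed from a detector's predictions. It assumes the usual convention, which matches the abstract's usage: a false accept is a fake profile that the detector lets through as legitimate, and a false reject is a legitimate profile flagged as fake. The function name `far_frr` and the label encoding (1 = fake, 0 = legitimate) are illustrative choices, not taken from the study.

```python
from typing import Sequence, Tuple


def far_frr(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (false_accept_rate, false_reject_rate).

    Assumed label encoding (hypothetical, for illustration only):
    1 = fake profile, 0 = legitimate profile.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    fake_total = sum(1 for t in y_true if t == 1)
    legit_total = sum(1 for t in y_true if t == 0)

    # False accept: a fake profile predicted as legitimate.
    false_accepts = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # False reject: a legitimate profile predicted as fake.
    false_rejects = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    # Guard against empty classes to avoid division by zero.
    far = false_accepts / fake_total if fake_total else 0.0
    frr = false_rejects / legit_total if legit_total else 0.0
    return far, frr


if __name__ == "__main__":
    # Toy data: 4 fake profiles followed by 4 legitimate ones.
    truth = [1, 1, 1, 1, 0, 0, 0, 0]
    preds = [1, 1, 0, 0, 0, 0, 0, 1]
    far, frr = far_frr(truth, preds)
    print(f"FAR = {far:.2f}, FRR = {frr:.2f}")
```

Under this convention, a False Accept Rate of 42-52% means that roughly half of the GPT-generated fake profiles in the evaluation were accepted as legitimate.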