Beyond Imperfect Alternatives with Rulemapping: A Neuro-Symbolic Case Study on Online Hate Speech
For legal decision-making in high-volume settings such as content moderation, the neuro-symbolic Rulemapping method offers a verifiable and auditable alternative to purely neural systems.
This paper shows that a neuro-symbolic approach (Rulemapping) improves precision from 0.34-0.49 to 0.80-0.86 while maintaining high recall (0.82-0.89) for hate speech classification under German law, preventing LLMs from conflating offensiveness with illegality.
Automating legal reasoning forces a choice between imperfect alternatives: symbolic systems offer transparency but struggle with ambiguity, whereas neural systems handle natural language flexibly but lack verifiability. This paper investigates whether a hybrid, neuro-symbolic approach can reconcile this trade-off. We evaluate this architecture in the domain of online content moderation, which serves as a proxy for high-volume legal decision-making such as mass administrative proceedings. In these settings, operators must assess thousands of cases daily under strict legal standards. Specifically, we examine whether constraining large language models (LLMs) within deterministic symbolic scaffolds improves statute-grounded illegality assessment while preventing "scope drift", in which LLMs conflate moral offensiveness with illegality under the statute. We test the neuro-symbolic variant of Rulemapping, a visual logic-tree method that operationalises the classic legal syllogism, on online hate-speech classification under §130(1) of the German Criminal Code. Across diverse LLMs, Rulemapping maintains high recall (0.82-0.89) while achieving precision of 0.80-0.86, compared to 0.34-0.49 for unconstrained prompting. Expert-authored symbolic scaffolds thus enable robust legal automation aligned with regulatory requirements for auditability and verifiable decision-making.
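To make the idea of a deterministic symbolic scaffold concrete, the following is a minimal sketch and not the authors' implementation. It assumes a logic tree whose leaves are narrow yes/no questions answered by a pluggable judge (a stand-in for an LLM call) and whose inner nodes combine those answers with fixed AND/OR logic. All names (`Leaf`, `AllOf`, `AnyOf`, `evaluate`, `ILLUSTRATIVE_TREE`) are hypothetical, and the tree is a simplified paraphrase of the elements of §130(1), not the expert-authored rule map evaluated in the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple, Union

# A leaf judge answers one narrow yes/no question about a text.
# Assumption: in a real system this would wrap an LLM call.
LeafJudge = Callable[[str, str], bool]


@dataclass(frozen=True)
class Leaf:
    key: str       # identifier used in the audit trace
    question: str  # the narrow question posed to the judge


@dataclass(frozen=True)
class AllOf:
    key: str
    children: Tuple["Node", ...]  # all children must hold (AND)


@dataclass(frozen=True)
class AnyOf:
    key: str
    children: Tuple["Node", ...]  # at least one child must hold (OR)


Node = Union[Leaf, AllOf, AnyOf]


def evaluate(node: Node, text: str, judge: LeafJudge, trace: Dict[str, bool]) -> bool:
    """Evaluate the tree deterministically and record every node's result."""
    if isinstance(node, Leaf):
        result = judge(node.question, text)
    else:
        # Evaluate every child without short-circuiting so the trace is complete.
        results = [evaluate(child, text, judge, trace) for child in node.children]
        result = all(results) if isinstance(node, AllOf) else any(results)
    trace[node.key] = result
    return result


# Illustrative only: a simplified paraphrase of the statutory elements.
ILLUSTRATIVE_TREE = AllOf("illegal", (
    Leaf("protected_target",
         "Does the text target a group, or a person because they belong to it, "
         "defined by nationality, race, religion, or ethnic origin?"),
    AnyOf("prohibited_act", (
        Leaf("incites_hatred", "Does the text incite hatred against that target?"),
        Leaf("calls_for_measures",
             "Does the text call for violent or arbitrary measures against that target?"),
        Leaf("attacks_dignity",
             "Does the text attack the target's human dignity by insulting, "
             "maligning, or defaming it?"),
    )),
    Leaf("disturbs_peace",
         "Is the statement suited to disturbing the public peace?"),
))


if __name__ == "__main__":
    # Stub judge for demonstration: answers "no" to every question.
    trace: Dict[str, bool] = {}
    verdict = evaluate(ILLUSTRATIVE_TREE, "some post", lambda q, t: False, trace)
    print(verdict, trace)
```

In this sketch the judge only ever sees one element-level question at a time, and the final verdict follows from the tree alone. A post that is merely offensive fails the statutory elements and is not flagged, which is the behaviour the scaffold is meant to enforce against scope drift, and the trace records which element decided the outcome.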