CL, LG · Nov 17, 2022

Style Classification of Rabbinic Literature for Detection of Lost Midrash Tanhuma Material

arXiv:2211.09710v3297 citations · h-index: 46
AI Analysis

This work addresses a specific challenge for scholars of rabbinic literature by providing a computational tool to aid in textual analysis and recovery of lost material, representing an incremental advance in applying NLP to historical texts.

The paper tackles the problem of determining the origin of passages in complex rabbinic texts by proposing a style classification system using NLP for Hebrew, and demonstrates its application to detect lost material from the Tanhuma-Yelammedenu midrash genre preserved in later anthologies.

Midrash collections are complex rabbinic works that consist of text in multiple languages, which evolved through long processes of unstable oral and written transmission. Determining the origin of a given passage in such a compilation is not always straightforward and is often a matter of dispute among scholars, yet it is essential for scholars' understanding of the passage and its relationship to other texts in the rabbinic corpus. To help solve this problem, we propose a system for classification of rabbinic literature based on its style, leveraging recent advances in natural language processing for Hebrew texts. Additionally, we demonstrate how this method can be applied to uncover lost material from a specific midrash genre, Tanḥuma-Yelammedenu, that has been preserved in later anthologies.
