Automatic Detection of Reuses and Citations in Literary Texts
This work addresses the need for efficient computational tools in literary studies to analyze intertextual references, although it appears incremental, since it applies existing computer science techniques to a specific domain.
The paper tackles the automatic detection and exploration of networks of textual reuses, such as paraphrases and citations, in classical literature, and presents a software tool that has already produced significant results in this domain.
For more than forty years now, modern theories of literature (Compagnon, 1979) have insisted on the role of paraphrases, rewritings, citations, reciprocal borrowings and mutual contributions of all kinds. The notions of intertextuality, transtextuality and hypertextuality/hypotextuality were introduced in the seventies and eighties to approach these phenomena. The careful analysis of these references is of particular interest in evaluating the distance that an author deliberately takes from his or her masters. Phoebus is a collaborative project that brings together computer scientists from the University Pierre and Marie Curie (LIP6-UPMC) and the literary teams of Paris-Sorbonne University, with the aim of developing efficient tools for literary studies that take advantage of modern computer science techniques. In this context, we have developed a piece of software that automatically detects and explores networks of textual reuses in classical literature. This paper describes the principles on which this program is based, the significant results that have already been obtained and the prospects for the near future.