Biclustering Readings and Manuscripts via Non-negative Matrix Factorization, with Application to the Text of Jude
The work addresses contamination and co-dependence issues in manuscript analysis for textual scholars, but it is incremental in that it applies an existing method to a new domain.
The paper tackles the problem of grouping witnesses into families in text-critical practice. It introduces non-negative matrix factorization (NMF) as an unsupervised method for clustering manuscripts and readings simultaneously, applies it to the New Testament epistle of Jude, and shows that the resulting clusters correspond to human-identified textual families.
The text-critical practice of grouping witnesses into families or texttypes often faces two obstacles: contamination in the manuscript tradition, and co-dependence in identifying characteristic readings and manuscripts. We introduce non-negative matrix factorization (NMF) as a simple, unsupervised, and efficient way to cluster large numbers of manuscripts and readings simultaneously while summarizing contamination using an easy-to-interpret mixture model. We apply this method to an extensive collation of the New Testament epistle of Jude and show that the resulting clusters correspond to human-identified textual families from existing research.
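
To make the idea concrete, the following is a minimal sketch of NMF-based biclustering on a toy collation, not the authors' implementation. It assumes a binary manuscript-by-reading matrix (1 if the manuscript attests the reading), the standard Lee-Seung multiplicative updates for the squared-error objective, and a rank of 2. The toy data and the names `nmf`, `mixtures`, and `collation` are all hypothetical. The rows of `h` group readings, the rows of `w` group manuscripts, and normalizing each row of `w` gives a rough mixture profile that hints at contamination.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def nmf(x, k, iters=500, seed=0, eps=1e-9):
    """Factor x (m x n, non-negative) as w (m x k) times h (k x n).

    Uses Lee-Seung multiplicative updates, which keep every entry
    non-negative when started from a positive random initialization.
    """
    rng = random.Random(seed)
    m, n = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    h = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(iters):
        # Update h: h <- h * (w^T x) / (w^T w h)
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(k)]
        # Update w: w <- w * (x h^T) / (w h h^T)
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(m)]
    return w, h


def mixtures(w, h):
    """Convert manuscript loadings into mixture proportions per cluster.

    Each row of h is first rescaled to sum to 1 (moving the scale into w),
    so that the loadings of different clusters are comparable.
    """
    scale = [sum(row) for row in h]
    scaled = [[wij * s for wij, s in zip(row, scale)] for row in w]
    return [[v / (sum(row) or 1.0) for v in row] for row in scaled]


# Hypothetical collation: 4 variation units with readings A and B each,
# so the columns are u1A, u1B, u2A, u2B, u3A, u3B, u4A, u4B.
collation = [
    [1, 0, 1, 0, 1, 0, 1, 0],  # manuscript 0: reads A throughout
    [1, 0, 1, 0, 1, 0, 1, 0],  # manuscript 1: reads A throughout
    [0, 1, 0, 1, 0, 1, 0, 1],  # manuscript 2: reads B throughout
    [0, 1, 0, 1, 0, 1, 0, 1],  # manuscript 3: reads B throughout
    [1, 0, 1, 0, 0, 1, 0, 1],  # manuscript 4: mixed (A, A, B, B)
]

w, h = nmf(collation, k=2)
mix = mixtures(w, h)
for idx, row in enumerate(mix):
    print(idx, [round(v, 2) for v in row])
```

In this sketch, a manuscript whose mixture row is concentrated on one cluster behaves like a clean family member, while a row spread across clusters suggests mixture. The rank `k` is chosen by hand here; how to select it for a real collation is outside the scope of this example.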