Computational Analysis of Semantic Connections Between Herman Melville Reading and Writing
This work provides a computational framework to support source and influence studies in literary scholarship, but the contribution is incremental, as it applies existing methods to a new domain.
This study addresses the problem of identifying potential literary influences on Herman Melville's writings by computationally analyzing the semantic similarity between his works and the texts he read. It uses BERTScore to find alignments without fixed thresholds. The results show that the method captures expert-identified similarities and highlights additional passages for further study.
This study investigates the potential influence of Herman Melville's reading on his own writings through computational semantic similarity analysis. Using documented records of books known to have been owned or read by Melville, we compare selected passages from his works with texts from his library. The methodology involves segmenting texts at both the sentence level and the non-overlapping 5-gram level, followed by similarity computation using BERTScore. Rather than applying fixed thresholds to determine reuse, we interpret precision, recall, and F1 scores as indicators of possible semantic alignment that may suggest literary influence. Experimental results demonstrate that the approach captures expert-identified instances of similarity and highlights additional passages warranting further qualitative examination. The findings suggest that semantic similarity methods provide a useful computational framework for supporting source and influence studies in literary scholarship.
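The segmentation and scoring steps described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names are hypothetical, the sentence splitter is deliberately naive, and `toy_embed` is a stand-in (character-bigram counts) for the contextual BERT token embeddings that BERTScore actually uses. Only the greedy matching that yields precision, recall, and F1 follows the BERTScore scheme.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def non_overlapping_ngrams(text, n=5):
    """Split text into consecutive, non-overlapping chunks of n words.
    The last chunk may be shorter than n."""
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(0, len(words), n)]


def toy_embed(token):
    """ASSUMPTION: stand-in embedding built from character-bigram counts.
    A real BERTScore run would use contextual BERT token vectors instead."""
    padded = f"#{token.lower()}#"
    return Counter(padded[i:i + 2] for i in range(len(padded) - 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def greedy_prf(candidate, reference, embed=toy_embed):
    """BERTScore-style greedy matching over token embeddings.
    Precision: each candidate token takes its best match in the reference.
    Recall: each reference token takes its best match in the candidate."""
    cand = [embed(t) for t in candidate.split()]
    ref = [embed(t) for t in reference.split()]
    if not cand or not ref:
        return 0.0, 0.0, 0.0
    precision = sum(max(cosine(c, r) for r in ref) for c in cand) / len(cand)
    recall = sum(max(cosine(r, c) for c in cand) for r in ref) / len(ref)
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Compare every 5-gram of one passage with every 5-gram of another and
    # report the scores as indicators, without applying a cut-off.
    melville = "the great white whale rose slowly from the deep dark sea"
    source = "a mighty whale rose from the dark deep waters of the sea"
    for a in non_overlapping_ngrams(melville):
        for b in non_overlapping_ngrams(source):
            p, r, f = greedy_prf(a, b)
            print(f"{a!r} vs {b!r}: P={p:.2f} R={r:.2f} F1={f:.2f}")
```

With real embeddings, the same three scores could instead be obtained from the `bert_score` package, for example through `bert_score.score(cands, refs, lang="en")`, which returns precision, recall, and F1 tensors. Keeping all three values, rather than reducing them to a yes/no decision, mirrors the threshold-free interpretation described in the abstract.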