Two Kinds of Recall
This work revisits how recall is evaluated in machine learning. It argues that a single recall figure conflates two distinct properties, diversity and exhaustiveness, and that evaluations should measure both.
The paper challenges the assumption that pattern-based models are better at precision and learning-based models are better at recall. It distinguishes two kinds of recall, d-recall (diversity) and e-recall (exhaustiveness), and shows experimentally that neural methods are significantly better at d-recall, while pattern-based methods are sometimes still substantially better at e-recall.
It is an established assumption that pattern-based models are good at precision, while learning-based models are better at recall. But is that really the case? I argue that there are two kinds of recall: d-recall, reflecting diversity, and e-recall, reflecting exhaustiveness. I demonstrate through experiments that while neural methods are indeed significantly better at d-recall, pattern-based methods are sometimes still substantially better at e-recall. Ideal methods should aim for both kinds, and this ideal should in turn be reflected in our evaluations.
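
To make the distinction concrete, here is a toy sketch of one possible way to operationalize the two notions. It is an illustrative assumption, not the definitions used in the experiments: the abstract gives no formulas, and the function names `d_recall` and `e_recall`, the grouping of gold instances by surface form, and the toy data are all hypothetical. Here d-recall is read as the fraction of distinct forms for which at least one instance is retrieved, and e-recall as the fraction of instances retrieved among the forms a system covers at all.

```python
# Toy sketch (assumed formulation, not taken from the paper).
# `gold` maps each gold instance id to a label for the way it is expressed.
# `predicted` is the set of instance ids a system retrieved.

def covered_forms(gold, predicted):
    """Forms for which the system retrieved at least one gold instance."""
    return {form for inst, form in gold.items() if inst in predicted}

def d_recall(gold, predicted):
    """Diversity: share of distinct gold forms the system reaches at all."""
    all_forms = set(gold.values())
    return len(covered_forms(gold, predicted)) / len(all_forms)

def e_recall(gold, predicted):
    """Exhaustiveness: share of instances retrieved within covered forms."""
    covered = covered_forms(gold, predicted)
    in_scope = [inst for inst, form in gold.items() if form in covered]
    if not in_scope:
        return 0.0
    hits = sum(1 for inst in in_scope if inst in predicted)
    return hits / len(in_scope)

# Hypothetical data: four instances of form A, one each of forms B and C.
gold = {"a1": "A", "a2": "A", "a3": "A", "a4": "A", "b1": "B", "c1": "C"}

pattern_like = {"a1", "a2", "a3", "a4"}  # exhaustive on one form only
neural_like = {"a1", "b1", "c1"}         # touches every form, misses instances

for name, pred in [("pattern-like", pattern_like), ("neural-like", neural_like)]:
    print(name, round(d_recall(gold, pred), 2), round(e_recall(gold, pred), 2))
```

On this made-up data the pattern-like system scores low on diversity but perfectly on exhaustiveness, and the neural-like system shows the reverse, which is the kind of trade-off a single recall number would hide.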