RaTE: a Reproducible automatic Taxonomy Evaluation by Filling the Gap
This work addresses the inefficiency of manual evaluation for researchers in knowledge representation, though the contribution is incremental because it builds on existing taxonomy construction methods.
The authors tackle the lack of automatic evaluation for taxonomy construction by proposing RaTE, a label-free scoring method that relies on a large pre-trained language model. In tests on seven Yelp-domain taxonomies, RaTE scores correlated well with human judgments and decreased when a taxonomy was artificially degraded.
Taxonomies are an essential knowledge representation, yet most studies on automatic taxonomy construction (ATC) resort to manual evaluation to score proposed algorithms. We argue that automatic taxonomy evaluation (ATE) is just as important as taxonomy construction. We propose RaTE, an automatic label-free taxonomy scoring procedure, which relies on a large pre-trained language model. We apply our evaluation procedure to three state-of-the-art ATC algorithms, with which we built seven taxonomies from the Yelp domain, and show that 1) RaTE correlates well with human judgments and 2) artificially degrading a taxonomy leads to a lower RaTE score.
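The two validation checks named in the abstract can be illustrated with a minimal sketch. It does not implement RaTE scoring itself, which the abstract does not specify. Several parts are assumptions made for illustration: Spearman rank correlation as the agreement measure (the abstract says only that RaTE "correlates well"), random re-parenting of edges as the degradation strategy, and the function names `spearman` and `degrade`, which are hypothetical.

```python
import random


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def degrade(edges, fraction, rng):
    """Copy (parent, child) edges, re-attaching a fraction of children
    to a randomly chosen different parent (assumed noise model)."""
    parents = sorted({p for p, _ in edges})
    noisy = list(edges)
    k = round(fraction * len(noisy))
    for idx in rng.sample(range(len(noisy)), k):
        parent, child = noisy[idx]
        choices = [p for p in parents if p != parent and p != child]
        if choices:
            noisy[idx] = (rng.choice(choices), child)
    return noisy


# Toy taxonomy with made-up edges, not data from the paper.
toy_edges = [("food", "pizza"), ("food", "sushi"),
             ("drinks", "beer"), ("drinks", "wine")]
half_degraded = degrade(toy_edges, 0.5, random.Random(0))
```

Under these assumptions, check 1 would call `spearman(automatic_scores, human_scores)` on one score per taxonomy, and check 2 would score `degrade(edges, f, rng)` for increasing `f` and verify that the automatic score does not increase.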