SEMA: an Extended Semantic Evaluation Metric for AMR
This work addresses a specific evaluation bottleneck for researchers and developers in natural language processing working with AMR parsing, offering an incremental improvement over the existing smatch metric.
The authors address the problem of evaluating Abstract Meaning Representation (AMR) parsers by introducing SEMA, an extended metric that targets drawbacks of the existing smatch metric, such as the self-relation it creates on the root and its failure to account for the dependence between elements of the AMR structure. Comparing both metrics on the output of four well-known parsers, they argue that SEMA is more refined, robust, fairer, and faster than smatch.
Abstract Meaning Representation (AMR) is a recently designed semantic representation language intended to capture the meaning of a sentence, which may be represented as a single-rooted directed acyclic graph with labeled nodes and edges. The automatic evaluation of this structure plays an important role in the development of better systems, as well as in semantic annotation. Although one metric, smatch, is available, it has some drawbacks. For instance, smatch creates a self-relation on the root of the graph, has weights for different error types, and does not take into account the dependence of the elements in the AMR structure. Because of these drawbacks, smatch masks several problems of AMR parsers and distorts the evaluation of the AMRs. In view of this, we introduce an extended metric to evaluate AMR parsers that deals with the drawbacks of the smatch metric. Finally, we compare both metrics using four well-known AMR parsers, and we argue that our metric is more refined, robust, fairer, and faster than smatch.
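To make the root self-relation issue concrete, the following is a minimal sketch of how one extra relation on the root can shift a triple-overlap F-score. It is neither the smatch nor the SEMA implementation. It assumes that the variables of the two graphs are already aligned (no alignment search is performed), that every triple carries equal weight, and that the root self-relation can be modeled as a single extra triple. The names `triple_f1` and `with_root_self_relation`, the `TOP` label, and the example graphs are hypothetical and chosen only for illustration.

```python
from typing import Set, Tuple

# An AMR graph as a set of (source, relation, target) triples.
Triple = Tuple[str, str, str]


def triple_f1(pred: Set[Triple], gold: Set[Triple]) -> float:
    """F1 over exactly matching triples, assuming variables are already aligned."""
    matched = len(pred & gold)
    if matched == 0:
        return 0.0
    precision = matched / len(pred)
    recall = matched / len(gold)
    return 2 * precision * recall / (precision + recall)


def with_root_self_relation(graph: Set[Triple], root: str) -> Set[Triple]:
    """Add one extra triple on the root (hypothetical modeling of a root self-relation)."""
    return graph | {(root, "TOP", root)}


# Gold AMR for "The boy wants to go":
# (w / want-01 :ARG0 (b / boy) :ARG1 (g / go-01 :ARG0 b))
gold = {
    ("w", "instance", "want-01"),
    ("b", "instance", "boy"),
    ("g", "instance", "go-01"),
    ("w", "ARG0", "b"),
    ("w", "ARG1", "g"),
    ("g", "ARG0", "b"),
}

# Hypothetical parser output: wrong concept for g, and the re-entrant edge is missing.
pred = {
    ("w", "instance", "want-01"),
    ("b", "instance", "boy"),
    ("g", "instance", "walk-01"),
    ("w", "ARG0", "b"),
    ("w", "ARG1", "g"),
}

plain = triple_f1(pred, gold)
rooted = triple_f1(with_root_self_relation(pred, "w"),
                   with_root_self_relation(gold, "w"))

print(f"F1 without root self-relation: {plain:.4f}")
print(f"F1 with root self-relation:    {rooted:.4f}")
```

In this toy case the extra root triple matches in both graphs, so it raises the score even though the parser output has not improved. This is one way in which a root self-relation could mask parser errors, under the assumptions stated above.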