Specificity measures and reference
This work addresses the challenge of evaluating referring expression generation algorithms in domains with non-crisp properties, though it appears incremental as it focuses on empirical validation of existing measures.
The paper tackles the problem of predicting human accuracy in resolving referring expressions that involve gradual properties, and finds that certain fuzzy measures of referential success can predict this accuracy.
In this paper we empirically study the validity of measures of referential success for referring expressions involving gradual properties. More specifically, we study how well several measures of referential success predict whether a user chooses the right object, given a referring expression. Experimental results indicate that certain fuzzy measures of success are able to predict human accuracy in reference resolution. Such measures are therefore suitable for estimating the success or failure of a referring expression produced by a generation algorithm, especially when the properties in a domain cannot be assumed to have crisp denotations.
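To make the idea of a fuzzy measure of referential success concrete, the following is a minimal sketch under stated assumptions. It is not one of the measures evaluated in the paper. It assumes that a referring expression assigns each object in a scene a membership degree in [0, 1], and it scores success as the margin by which the intended object's degree exceeds that of the best competing object. The function name `referential_success`, the object names, and the degrees are all hypothetical.

```python
from typing import Dict


def referential_success(memberships: Dict[str, float], target: str) -> float:
    """Illustrative (assumed) measure, not taken from the paper.

    Returns the margin by which the target's membership degree exceeds
    the highest degree among all other objects, clipped below at 0.
    """
    target_degree = memberships[target]
    # Highest degree among distractors; 0.0 if the target is the only object.
    best_distractor = max(
        (deg for obj, deg in memberships.items() if obj != target),
        default=0.0,
    )
    return max(0.0, target_degree - best_distractor)


# Hypothetical degrees to which each object fits an expression such as
# "the large dog"; these values are made up for illustration.
scene = {"dog_a": 0.9, "dog_b": 0.4, "dog_c": 0.1}

print(referential_success(scene, "dog_a"))  # clear margin over distractors
print(referential_success(scene, "dog_b"))  # a distractor fits better
```

In the crisp case this sketch gives 1 when the expression fits only the intended object fully and 0 when some other object fits at least as well. Whether a measure of this kind tracks human accuracy is the type of question the study examines empirically.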