Automatic Design of Semantic Similarity Ensembles Using Grammatical Evolution
This work addresses the need for more accurate semantic similarity models in NLP tasks such as document analysis. Its contribution is incremental, since it builds on existing ensemble approaches.
The paper tackles the problem that individual semantic similarity measures perform inconsistently across datasets. It proposes an automated strategy that uses grammatical evolution to construct ensembles, and reports higher accuracy than existing ensemble techniques on standard benchmarks.
Semantic similarity measures are a key component in natural language processing tasks such as document analysis, requirement matching, and user input interpretation. However, the performance of individual measures varies considerably across datasets. To address this, ensemble approaches that combine multiple measures are often employed. This paper presents an automated strategy based on grammatical evolution for constructing semantic similarity ensembles. The method evolves aggregation functions that maximize correlation with human-labeled similarity scores. Experiments on standard benchmark datasets demonstrate that the proposed approach outperforms existing ensemble techniques in terms of accuracy. The results confirm the effectiveness of grammatical evolution in designing adaptive and accurate similarity models. The source code that illustrates our approach can be downloaded from https://github.com/jorge-martinez-gil/sesige.
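The idea of evolving aggregation functions that maximize correlation with human-labeled scores can be illustrated with a minimal sketch. It is not the authors' implementation (that is in the repository linked above). The grammar, the operators (max, min, avg), the mutation-only search loop, and the toy scores below are all assumptions chosen for illustration. The sketch shows the core mapping of grammatical evolution: each integer codon, taken modulo the number of available productions, selects how the next non-terminal is expanded.

```python
import math
import random

# Hypothetical toy grammar: an ensemble is either a binary aggregation of two
# sub-ensembles or one of three base similarity measures m[0], m[1], m[2].
GRAMMAR = {
    "<expr>": [["<op>", "(", "<expr>", ",", "<expr>", ")"], ["<m>"]],
    "<op>": [["max"], ["min"], ["avg"]],
    "<m>": [["m[0]"], ["m[1]"], ["m[2]"]],
}


def decode(genome, max_depth=4):
    """Map an integer genome to an expression string (codon % number of choices)."""
    pos = 0

    def expand(symbol, depth):
        nonlocal pos
        if symbol not in GRAMMAR:
            return symbol  # terminal symbol, emitted unchanged
        choices = GRAMMAR[symbol]
        if symbol == "<expr>" and depth >= max_depth:
            choices = [["<m>"]]  # force a base measure so the derivation terminates
        codon = genome[pos % len(genome)]  # wrap around when codons run out
        pos += 1
        production = choices[codon % len(choices)]
        return "".join(expand(s, depth + 1) for s in production)

    return expand("<expr>", 0)


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def fitness(expr, measure_scores, human_scores):
    """Correlation between the ensemble's outputs and the human judgments."""
    env = {"__builtins__": {}, "max": max, "min": min,
           "avg": lambda a, b: (a + b) / 2}
    outputs = [eval(expr, env, {"m": m}) for m in measure_scores]
    return pearson(outputs, human_scores)


def evolve(measure_scores, human_scores, generations=30, pop_size=20,
           genome_len=12, seed=0):
    """Truncation selection plus single-codon mutation (a deliberately simple loop)."""
    rng = random.Random(seed)

    def score(genome):
        return fitness(decode(genome), measure_scores, human_scores)

    population = [[rng.randrange(256) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]  # the best half survives unchanged
        children = []
        for parent in parents:
            child = parent[:]
            child[rng.randrange(genome_len)] = rng.randrange(256)
            children.append(child)
        population = parents + children
    best = max(population, key=score)
    return decode(best), score(best)


# Made-up data: each row holds three measures' scores for one text pair.
MEASURES = [(0.9, 0.7, 0.8), (0.2, 0.4, 0.1), (0.6, 0.5, 0.7),
            (0.3, 0.1, 0.2), (0.8, 0.9, 0.6)]
HUMAN = [0.85, 0.20, 0.60, 0.15, 0.80]  # made-up human similarity labels

best_expr, best_fit = evolve(MEASURES, HUMAN)
print(best_expr, round(best_fit, 3))
```

Because the best half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next. A fuller treatment would add crossover, a richer grammar (for example weighted sums), and evaluation on held-out pairs to guard against overfitting to the training labels.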