Optimal Embedding Calibration for Symbolic Music Similarity
This work addresses the challenge of evaluating music similarity, for researchers and practitioners alike, by providing an incremental improvement in embedding calibration methods.
The paper tackles the problem of evaluating music similarity without expensive human labels by using composer information to construct labels automatically, and it identifies a combination of embedding calibration methods that achieves better metrics than the baseline methods.
In natural language processing (NLP), the semantic similarity task requires large-scale, high-quality human-annotated labels for fine-tuning or evaluation. In music similarity, by contrast, such labels are expensive to collect and depend heavily on the annotator's artistic preferences. Recent research has demonstrated that embedding calibration techniques can greatly improve the semantic similarity performance of a pre-trained language model without fine-tuning. However, it remains unknown which calibration method is best and how much performance improvement can be achieved. To address these issues, we propose using composer information to construct labels for automatically evaluating music similarity. Under this paradigm, we identify the optimal combination of embedding calibration methods, which achieves better metrics than the baseline methods.
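As a rough illustration of the evaluation paradigm described above, the following sketch builds pairwise labels from composer names and scores an embedding space before and after calibration. The abstract does not name the calibration methods that were compared, so whitening is used here purely as a stand-in for one commonly used technique. The pairwise AUC metric, the function names (`whiten`, `composer_pair_labels`, `pairwise_auc`), and the toy data are likewise assumptions made for illustration, not the paper's implementation.

```python
import numpy as np


def whiten(embeddings):
    """Calibrate embeddings by centering them and whitening their covariance.

    Assumption: whitening stands in for "embedding calibration"; the source
    does not say which calibration methods were actually used.
    """
    mu = embeddings.mean(axis=0, keepdims=True)
    centered = embeddings - mu
    cov = np.cov(centered, rowvar=False)
    u, s, _ = np.linalg.svd(cov)
    # Scale each eigenvector column by 1/sqrt(eigenvalue): u @ diag(s ** -0.5).
    w = u / np.sqrt(s + 1e-8)
    return centered @ w


def cosine_matrix(x):
    """Return the matrix of pairwise cosine similarities between rows of x."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    return x @ x.T


def composer_pair_labels(composers):
    """Label a pair of pieces as similar (True) when both share a composer."""
    c = np.asarray(composers)
    return c[:, None] == c[None, :]


def pairwise_auc(embeddings, composers):
    """Probability that a same-composer pair outscores a different-composer pair.

    Hypothetical metric chosen for this sketch; ties count as half.
    """
    sims = cosine_matrix(embeddings)
    labels = composer_pair_labels(composers)
    upper = np.triu_indices(len(composers), k=1)  # each unordered pair once
    s, y = sims[upper], labels[upper]
    pos, neg = s[y], s[~y]
    diff = pos[:, None] - neg[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())


if __name__ == "__main__":
    # Toy data only: random vectors with a composer-specific offset.
    rng = np.random.default_rng(0)
    composers = ["bach"] * 20 + ["chopin"] * 20
    offsets = np.where(np.arange(40)[:, None] < 20, 0.5, -0.5)
    embeddings = rng.normal(size=(40, 4)) + offsets + 3.0  # shared bias term
    print("raw AUC:     ", pairwise_auc(embeddings, composers))
    print("whitened AUC:", pairwise_auc(whiten(embeddings), composers))
```

Because the labels come from metadata rather than from annotators, the same scoring function can be applied to any number of calibration variants, which is what makes a search over combinations feasible without human judgments.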