SEED: Towards More Accurate Semantic Evaluation for Visual Brain Decoding
This work addresses the need for more accurate evaluation methods in visual brain decoding research. The contribution is incremental: it builds on existing metrics and models.
The authors tackle the problem of evaluating semantic decoding in visual brain decoding models by introducing SEED, a new metric that integrates three complementary semantic similarity measures. SEED aligns better with human judgments than existing metrics, and it reveals that even state-of-the-art models lose crucial information despite high scores on current metrics.
We present SEED (\textbf{Se}mantic \textbf{E}valuation for Visual Brain \textbf{D}ecoding), a novel metric for evaluating the semantic decoding performance of visual brain decoding models. It integrates three complementary metrics, each capturing a different aspect of semantic similarity between images. Using carefully crowd-sourced human judgment data, we demonstrate that SEED achieves the highest alignment with human evaluations, outperforming other widely used metrics. Through the evaluation of existing visual brain decoding models, we further reveal that crucial information is often lost in translation, even in state-of-the-art models that achieve near-perfect scores on existing metrics. To facilitate further research, we open-source the human judgment data, encouraging the development of more advanced evaluation methods for brain decoding models. Additionally, we propose a novel loss function designed to enhance semantic decoding performance by leveraging the order of pairwise cosine similarity in CLIP image embeddings. This loss function is compatible with various existing methods and consistently improves their semantic decoding performance when used for training, with respect to both existing metrics and SEED.
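
The abstract does not give the form of the proposed loss, so the following is only a minimal sketch of one way a loss could use the order of pairwise cosine similarities. It is not the authors' formulation. The function names, the hinge penalty, and the `margin` parameter are all assumptions made for illustration. For each anchor sample in a batch, the sketch penalizes predicted embeddings whenever they rank two other samples in a different order than the target embeddings do. In the paper's setting the targets would be CLIP image embeddings.

```python
import math
from itertools import permutations


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(embeddings):
    """All pairwise cosine similarities within one batch of embeddings."""
    return [[cosine(u, v) for v in embeddings] for u in embeddings]


def order_preserving_loss(pred, target, margin=0.0):
    """Hypothetical hinge loss on the ordering of pairwise similarities.

    pred   -- predicted embeddings (e.g. decoded from brain activity)
    target -- reference embeddings (e.g. CLIP image embeddings)
    margin -- assumed hyperparameter; 0.0 only requires the right order
    """
    s_pred = similarity_matrix(pred)
    s_tgt = similarity_matrix(target)
    n = len(target)
    total, count = 0.0, 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for j, k in permutations(others, 2):
            # Only pairs that the target strictly orders contribute.
            if s_tgt[i][j] > s_tgt[i][k]:
                gap = s_pred[i][j] - s_pred[i][k]
                total += max(0.0, margin - gap)
                count += 1
    return total / count if count else 0.0


# Toy batch of three 2-D "embeddings" (illustrative values only).
target = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
same_order = order_preserving_loss(target, target)
swapped = order_preserving_loss([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], target)
```

In this toy batch, predictions identical to the targets incur no penalty, while swapping the second and third predicted embeddings breaks one of the target orderings and yields a positive loss. A practical version would operate on batched tensors in a differentiable framework and would typically be added to a model's existing training objective.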