CL AI LG · Apr 4, 2019

Robust Evaluation of Language-Brain Encoding Experiments

Lisa Beinborn, Samira Abnar, Rochelle Choenni

arXiv:1904.02547v12.718 citations · Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses inconsistent evaluation practices in language-brain encoding research, a problem for researchers in computational neuroscience and AI. Its contribution is incremental: standardized evaluation methods that make results easier to compare.

The paper tackles the lack of standardization in evaluating language-brain encoding experiments. It runs consistent tests across multiple fMRI datasets, analyzes how sensitive the evaluation measures are to randomized data and to voxel selection, and provides a public framework to improve transparency and reproducibility.

Language-brain encoding experiments evaluate the ability of language models to predict brain responses elicited by language stimuli. The evaluation scenarios for this task have not yet been standardized, which makes it difficult to compare and interpret results. We perform a series of evaluation experiments with a consistent encoding setup and compute the results for multiple fMRI datasets. In addition, we test the sensitivity of the evaluation measures to randomized data and analyze the effect of voxel selection methods. Our experimental framework is publicly available to make modelling decisions more transparent and support reproducibility for future comparisons.
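To make the setup described in the abstract concrete, here is a minimal sketch of an encoding experiment on synthetic data. It is not the authors' framework: the ridge mapping, the voxel-wise Pearson correlation, the shuffled-pairing control, and the training-set voxel selection are all illustrative assumptions, and the names `fit_ridge` and `voxelwise_pearson` are hypothetical helpers.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def voxelwise_pearson(Y_true, Y_pred):
    """Pearson correlation between measured and predicted responses, per voxel."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return (yt * yp).sum(axis=0) / np.where(denom == 0, 1.0, denom)

rng = np.random.default_rng(0)
n_train, n_test, n_dims, n_voxels = 200, 50, 20, 30

# Synthetic stand-ins (assumption): X plays the role of language model
# representations of the stimuli, Y the fMRI responses they elicit.
X = rng.standard_normal((n_train + n_test, n_dims))
W_true = rng.standard_normal((n_dims, n_voxels))
Y = X @ W_true + rng.standard_normal((n_train + n_test, n_voxels))

X_train, X_test = X[:n_train], X[n_train:]
Y_train, Y_test = Y[:n_train], Y[n_train:]

# Encoding model: learn a linear map from stimulus features to voxel responses.
W = fit_ridge(X_train, Y_train)
r_real = voxelwise_pearson(Y_test, X_test @ W)

# Randomized control: break the stimulus-response pairing during training.
# A trustworthy evaluation measure should score this near chance.
perm = rng.permutation(n_train)
W_rand = fit_ridge(X_train[perm], Y_train)
r_rand = voxelwise_pearson(Y_test, X_test @ W_rand)

# Voxel selection (one possible scheme): keep the 10 voxels that are best
# predicted on the training data, so the test data is not used for selection.
r_train = voxelwise_pearson(Y_train, X_train @ W)
selected = np.argsort(r_train)[-10:]

print("mean r, real pairing:    ", r_real.mean())
print("mean r, shuffled pairing:", r_rand.mean())
print("mean r, selected voxels: ", r_real[selected].mean())
```

Comparing the score under real and shuffled pairings is one simple way to check whether an evaluation measure separates signal from chance. Selecting voxels on the training split avoids reusing test data for both selection and scoring.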
