LG | DS | ML | Mar 1, 2024

Scalable Learning of Item Response Theory Models

arXiv:2403.00680v2 | 7 citations | h-index: 10 | AISTATS
AI Analysis

This work addresses scalability issues in psychometric and machine learning applications where both examinees and items are numerous, though it is incremental as it adapts existing coreset methods to IRT models.

The paper tackles the challenge of scaling Item Response Theory (IRT) models for large datasets, such as those in global assessments or machine learning contexts, by developing coresets based on logistic regression approximations to enable efficient training algorithms.

Item Response Theory (IRT) models aim to assess latent abilities of $n$ examinees along with latent difficulty characteristics of $m$ test items from categorical data that indicates the quality of their corresponding answers. Classical psychometric assessments are based on a relatively small number of examinees and items, say a class of $200$ students solving an exam comprising $10$ problems. More recent global large scale assessments such as PISA, or internet studies, may lead to significantly increased numbers of participants. Additionally, in the context of Machine Learning where algorithms take the role of examinees and data analysis problems take the role of items, both $n$ and $m$ may become very large, challenging the efficiency and scalability of computations. To learn the latent variables in IRT models from large data, we leverage the similarity of these models to logistic regression, which can be approximated accurately using small weighted subsets called coresets. We develop coresets for their use in alternating IRT training algorithms, facilitating scalable learning from large data.
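The link to logistic regression mentioned above can be illustrated with a small sketch. It assumes a two-parameter logistic (2PL) parametrization, $P(Y_{ij}=1) = \sigma(a_j \theta_i - b_j)$, which the abstract does not specify. With the abilities $\theta_i$ held fixed, estimating $(a_j, b_j)$ for one item is then a logistic regression over examinees with features $(\theta_i, -1)$, and it can be fitted on a small weighted subset instead of all $n$ examinees. The subset here is drawn uniformly with weights $n/k$ purely as a stand-in; it is not the coreset construction developed in the paper, and all function names (`fit_item`, `weighted_logloss`) are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def weighted_logloss(beta, X, y, w):
    """Weighted negative log-likelihood of a logistic regression."""
    z = X @ beta
    # -[y log p + (1 - y) log(1 - p)] = log(1 + exp(z)) - y * z
    return np.sum(w * (np.logaddexp(0.0, z) - y * z))

def fit_item(theta, y, w, steps=2000, lr=0.5):
    """Estimate (a, b) for one item with abilities theta held fixed.

    Assumed 2PL model: logit_i = a * theta_i - b, i.e. a logistic
    regression with features (theta_i, -1) and weights w.
    """
    X = np.column_stack([theta, -np.ones_like(theta)])
    beta = np.array([1.0, 0.0])  # initial guess for (a, b)
    for _ in range(steps):
        grad = X.T @ (w * (sigmoid(X @ beta) - y)) / w.sum()
        beta -= lr * grad  # plain gradient descent on the weighted loss
    return beta

# Simulate responses of n examinees to a single item (illustrative values).
rng = np.random.default_rng(0)
n, k = 5000, 500
theta = rng.normal(size=n)
a_true, b_true = 1.2, 0.3
y = (rng.random(n) < sigmoid(a_true * theta - b_true)).astype(float)

# Fit on all examinees, then on a uniformly sampled weighted subset.
beta_full = fit_item(theta, y, np.ones(n))
idx = rng.choice(n, size=k, replace=False)
beta_sub = fit_item(theta[idx], y[idx], np.full(k, n / k))

# Compare the full loss with its weighted-subset estimate at the same point.
X_full = np.column_stack([theta, -np.ones(n)])
loss_full = weighted_logloss(beta_full, X_full, y, np.ones(n))
loss_sub = weighted_logloss(beta_full, X_full[idx], y[idx], np.full(k, n / k))
print(beta_full, beta_sub, loss_full, loss_sub)
```

In an alternating scheme, the symmetric step would hold the item parameters fixed and update each ability $\theta_i$ from that examinee's responses across items, with the two steps repeated in turn. A coreset would replace the uniform subset with one chosen so that the weighted loss provably approximates the full loss.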

Code Implementations: 1 repo