LG | CL | CR | Feb 13, 2023

Machine Learning Model Attribution Challenge

arXiv:2302.06716v3 | 6 citations | h-index: 15
Originality: Synthesis-oriented
AI Analysis

This paper addresses model provenance and accountability in machine learning, particularly for LLMs, but its contribution is incremental because it focuses on a specific challenge setting.

The paper tackles the problem of attributing fine-tuned large language models to their publicly available base models using only textual outputs. Manual approaches based on output similarities and public documentation proved most successful in the challenge.

We present the findings of the Machine Learning Model Attribution Challenge. Fine-tuned machine learning models may derive from other trained models without obvious attribution characteristics. In this challenge, participants identify the publicly available base models that underlie a set of anonymous, fine-tuned large language models (LLMs) using only the textual output of the models. Contestants aim to correctly attribute as many fine-tuned models as possible, with ties broken in favor of contestants whose solutions use fewer calls to the fine-tuned models' API. The most successful approaches were manual: participants observed similarities between model outputs and developed attribution heuristics based on public documentation of the base models. Several teams also submitted automated, statistical solutions.
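The ranking rule described in the abstract (most correct attributions first, ties broken by fewer API calls) can be sketched as a two-key sort. This is a minimal illustration, not the organizers' scoring code: the `Submission` type, the `rank_submissions` function, and the team names and numbers are all hypothetical and do not come from the challenge results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    """Hypothetical record of one contestant's entry (not from the paper)."""
    team: str       # illustrative team label
    correct: int    # number of fine-tuned models attributed correctly
    api_calls: int  # total calls made to the fine-tuned models' API


def rank_submissions(submissions):
    """Return submissions best-first.

    Primary key: more correct attributions (negated for descending order).
    Tie-break: fewer API calls (ascending order).
    """
    return sorted(submissions, key=lambda s: (-s.correct, s.api_calls))


if __name__ == "__main__":
    # Made-up entries, used only to show the tie-break in action.
    entries = [
        Submission("team_a", correct=7, api_calls=1200),
        Submission("team_b", correct=7, api_calls=300),
        Submission("team_c", correct=5, api_calls=50),
    ]
    for place, s in enumerate(rank_submissions(entries), start=1):
        print(place, s.team, s.correct, s.api_calls)
```

In this made-up example, team_b places ahead of team_a because both attribute the same number of models but team_b uses fewer API calls. team_c uses the fewest calls overall yet ranks last, since call count only matters among tied entries.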

