LG CL CR · Feb 13, 2023

Machine Learning Model Attribution Challenge

Elizabeth Merkhofer, Deepesh Chaudhari, Hyrum S. Anderson, Keith Manville, Lily Wong, João Gante

arXiv:2302.06716

Originality: Synthesis-oriented

AI Analysis

This paper addresses model provenance and accountability in machine learning, particularly for LLMs, but its contribution is incremental because it focuses on a specific challenge setting.

The paper tackles the problem of attributing fine-tuned large language models to their publicly available base models using only textual outputs. Manual approaches based on output similarities and public documentation were the most successful in the challenge.

We present the findings of the Machine Learning Model Attribution Challenge. Fine-tuned machine learning models may derive from other trained models without obvious attribution characteristics. In this challenge, participants identify the publicly available base models that underlie a set of anonymous, fine-tuned large language models (LLMs) using only textual output of the models. Contestants aim to correctly attribute the most fine-tuned models, with ties broken in favor of contestants whose solutions use fewer calls to the fine-tuned models' API. The most successful approaches were manual, as participants observed similarities between model outputs and developed attribution heuristics based on public documentation of the base models, though several teams also submitted automated, statistical solutions.
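The ranking rule described in the abstract (most correct attributions first, ties broken by fewer API calls) can be illustrated with a minimal sketch. The `Submission` type, the team names, and all numbers below are hypothetical and made up for illustration; they are not taken from the challenge results, and the organizers' actual scoring code may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    team: str       # hypothetical team label
    correct: int    # fine-tuned models attributed to the right base model
    api_calls: int  # total queries sent to the fine-tuned models' API


def rank_submissions(submissions):
    """Order by most correct attributions; break ties by fewer API calls."""
    return sorted(submissions, key=lambda s: (-s.correct, s.api_calls))


# Made-up example data, not real challenge results.
submissions = [
    Submission("team_a", correct=7, api_calls=1200),
    Submission("team_b", correct=7, api_calls=300),
    Submission("team_c", correct=5, api_calls=50),
]

ranking = [s.team for s in rank_submissions(submissions)]
print(ranking)
```

In this sketch, `team_b` ranks ahead of `team_a` because both attribute the same number of models but `team_b` uses fewer API calls. `team_c` ranks last despite its low query count, since the API-call count only matters as a tie-breaker.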

