LG AI CY

Feb 2, 2024

On Catastrophic Inheritance of Large Foundation Models

Hao Chen, Bhiksha Raj, Xing Xie, Jindong Wang

arXiv:2402.01909v2

15.715 citations

h-index: 14

J. Data-centric Mach. Learn. Res.

Originality: Synthesis-oriented

AI Analysis

This paper addresses a critical issue for AI developers and users by highlighting the risks that foundation models inherit from their training data, though its contribution is incremental, since it builds on existing concerns about data bias.

The paper identifies Catastrophic Inheritance as a problem where large foundation models inherit weaknesses from biased pre-training data, leading to issues like bias and poor generalization in downstream tasks, and proposes the UIM framework to understand, interpret, and mitigate these effects.

Large foundation models (LFMs) are claiming incredible performance. Yet great concerns have been raised about their mythic and uninterpreted potential, not only in machine learning but also in various other disciplines. In this position paper, we propose to identify a neglected issue deeply rooted in LFMs: Catastrophic Inheritance, which describes the weaknesses and limitations that the downstream behavior of LFMs inherits from biased large-scale pre-training data, including samples that are corrupted, long-tailed, noisy, or out-of-distribution, to name a few. Such inheritance can potentially cause catastrophes in downstream applications, such as bias, lack of generalization, deteriorated performance, security vulnerability, privacy leakage, and value misalignment. We discuss the challenges behind this issue and propose UIM, a framework to Understand the catastrophic inheritance of LFMs from both pre-training and downstream adaptation, Interpret its implications for downstream tasks, and Mitigate it. UIM aims to unite the machine learning and social sciences communities for more responsible and promising AI development and deployment.
