AI · Mar 6, 2024

Understanding Biology in the Age of Artificial Intelligence

Cambridge · Harvard · MIT
arXiv:2403.04106v1 · 6 citations · h-index: 15
Originality: Synthesis-oriented
AI Analysis

This paper addresses a gap in understanding how ML interacts with scientific inquiry in biology, a question relevant to researchers in both the life sciences and AI. Its contribution is incremental, since it builds on existing philosophical frameworks.

The paper examines how machine learning (ML) models in biology affect scientific understanding. It applies epistemological principles to applications such as protein structure prediction and single-cell RNA-sequencing, and proposes that features such as information compression and dependency modeling can guide ML design to advance knowledge.

Modern life sciences research is increasingly relying on artificial intelligence approaches to model biological systems, primarily centered around the use of machine learning (ML) models. Although ML is undeniably useful for identifying patterns in large, complex data sets, its widespread application in biological sciences represents a significant deviation from traditional methods of scientific inquiry. As such, the interplay between these models and scientific understanding in biology is a topic with important implications for the future of scientific research, yet it is a subject that has received little attention. Here, we draw from an epistemological toolkit to contextualize recent applications of ML in biological sciences under modern philosophical theories of understanding, identifying general principles that can guide the design and application of ML systems to model biological phenomena and advance scientific knowledge. We propose that conceptions of scientific understanding as information compression, qualitative intelligibility, and dependency relation modelling provide a useful framework for interpreting ML-mediated understanding of biological systems. Through a detailed analysis of two key application areas of ML in modern biological research - protein structure prediction and single cell RNA-sequencing - we explore how these features have thus far enabled ML systems to advance scientific understanding of their target phenomena, how they may guide the development of future ML models, and the key obstacles that remain in preventing ML from achieving its potential as a tool for biological discovery. Consideration of the epistemological features of ML applications in biology will improve the prospects of these methods to solve important problems and advance scientific understanding of living systems.
