GenoTEX: An LLM Agent Benchmark for Automated Gene Expression Data Analysis
This work addresses the limited scalability of gene expression analysis in computational genomics, a problem for researchers and bioinformaticians. The contribution is incremental, since it builds on existing LLM capabilities to create a new benchmark.
The authors tackle the problem of automating gene expression data analysis, which requires extensive expertise and manual effort. They introduce GenoTEX, a benchmark dataset for evaluating LLM-based agents, and GenoAgent, a baseline method whose results show the potential of the approach and whose error analysis highlights the remaining challenges.
Recent advancements in machine learning have significantly improved the identification of disease-associated genes from gene expression datasets. However, these processes often require extensive expertise and manual effort, limiting their scalability. Large Language Model (LLM)-based agents have shown promise in automating these tasks due to their increasing problem-solving abilities. To support the evaluation and development of such methods, we introduce GenoTEX, a benchmark dataset for the automated analysis of gene expression data. GenoTEX provides analysis code and results for solving a wide range of gene-trait association problems, encompassing dataset selection, preprocessing, and statistical analysis, in a pipeline that follows computational genomics standards. The benchmark includes expert-curated annotations from bioinformaticians to ensure accuracy and reliability. To provide baselines for these tasks, we present GenoAgent, a team of LLM-based agents that adopt a multi-step programming workflow with flexible self-correction, to collaboratively analyze gene expression datasets. Our experiments demonstrate the potential of LLM-based methods in analyzing genomic data, while error analysis highlights the challenges and areas for future improvement. We propose GenoTEX as a promising resource for benchmarking and enhancing automated methods for gene expression data analysis. The benchmark is available at https://github.com/Liu-Hy/GenoTEX.
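The abstract describes GenoAgent as following a multi-step programming workflow with flexible self-correction. The sketch below illustrates that general idea only and is not GenoAgent's implementation: the function `run_step_with_self_correction`, the `toy_generator` stand-in for an LLM call, and the convention of storing the answer in a variable named `result` are all assumptions made for this example.

```python
import traceback
from typing import Callable, Optional


def run_step_with_self_correction(
    generate_code: Callable[[str, Optional[str]], str],
    task: str,
    max_attempts: int = 3,
) -> dict:
    """Run one analysis step, feeding any error back to the code generator.

    `generate_code` is a hypothetical callable standing in for an LLM: it takes
    the task description and the previous error trace (or None) and returns
    Python source that is expected to assign its answer to `result`.
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        code = generate_code(task, feedback)
        namespace: dict = {}
        try:
            # Execute the generated code in a fresh namespace.
            exec(code, namespace)
            return {"ok": True, "attempts": attempt, "result": namespace.get("result")}
        except Exception:
            # Keep the full traceback so the next draft can react to it.
            feedback = traceback.format_exc()
    return {"ok": False, "attempts": max_attempts, "result": None}


def toy_generator(task: str, feedback: Optional[str]) -> str:
    """Toy stand-in for an LLM: the first draft is buggy, the revision is not."""
    if feedback is None:
        # `values` is undefined here, so this draft raises NameError.
        return "result = sum(values) / len(values)"
    return "values = [2.0, 4.0, 6.0]\nresult = sum(values) / len(values)"


outcome = run_step_with_self_correction(toy_generator, "compute mean expression")
print(outcome)
```

In this toy run the first draft fails, the traceback is passed back as feedback, and the second draft succeeds. A real agent would also need sandboxing for generated code, which this sketch omits.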