CL AI Sep 9, 2024

Elsevier Arena: Human Evaluation of Chemistry/Biology/Health Foundational Large Language Models

Camilo Thorne, Christian Druckenbrodt, Kinga Szarkowska, Deepika Goyal, Pranita Marajan, Vijay Somanath, Corey Harper, Mao Yan, Tony Scerri

arXiv:2409.05486v2

Originality: Synthesis-oriented

AI Analysis

This work set out to address the need for human evaluation of AI models in scientific domains, but it is assessed as incomplete and incremental because the paper was removed from arXiv.

The paper aimed to evaluate foundational large language models in chemistry, biology, and health, but it was removed from arXiv due to licensing issues, so no results or numbers are available.

arXiv admin comment: This version has been removed by arXiv administrators as the submitter did not have the rights to agree to the license at the time of submission
