DC · AI · CL · Oct 3, 2023

HPC-GPT: Integrating Large Language Model for High-Performance Computing

arXiv:2311.12833v1 · 55 citations · h-index: 33 · Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses the gap in LLM performance on HPC-domain tasks. The contribution is incremental, since it adapts existing methods to a new domain.

The paper tackles the problem of poor performance of large language models (LLMs) in high-performance computing (HPC) tasks by proposing HPC-GPT, a fine-tuned LLaMA-based model, achieving comparable performance to existing methods on tasks like managing AI models and data race detection.

Large Language Models (LLMs), including the LLaMA model, have exhibited their efficacy across various general-domain natural language processing (NLP) tasks. However, their performance in high-performance computing (HPC) domain tasks has been less than optimal due to the specialized expertise required to interpret the model responses. In response to this challenge, we propose HPC-GPT, a novel LLaMA-based model that has undergone supervised fine-tuning using generated QA (Question-Answer) instances for the HPC domain. To evaluate its effectiveness, we concentrate on two HPC tasks: managing AI models and datasets for HPC, and data race detection. By employing HPC-GPT, we demonstrate comparable performance with existing methods on both tasks, exemplifying its excellence in HPC-related scenarios. Our experiments on open-source benchmarks yield extensive results, underscoring HPC-GPT's potential to bridge the performance gap between LLMs and HPC-specific tasks. With HPC-GPT, we aim to pave the way for LLMs to excel in HPC domains, simplifying the utilization of language models in complex computing applications.
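
To make "supervised fine-tuning using generated QA instances" concrete, the following is a minimal sketch of how one data race detection QA instance might be packaged as a training record. The abstract does not specify the record schema or prompt template, so the `instruction`/`input`/`output` field names, the `### Instruction:` template, and the helper names `make_record` and `to_prompt` are all assumptions for illustration, not the paper's format. The OpenMP loop is a textbook example of a race: iteration `i` reads `a[i + 1]` while iteration `i + 1` may write it.

```python
import json


def make_record(question: str, context: str, answer: str) -> dict:
    """Build one QA instance.

    The field names follow a common instruction-tuning convention;
    they are an assumption, not the schema used by HPC-GPT.
    """
    return {"instruction": question, "input": context, "output": answer}


def to_prompt(record: dict) -> str:
    """Flatten a record into a single training string (hypothetical template)."""
    parts = [f"### Instruction:\n{record['instruction']}"]
    if record["input"]:
        # The input section is only emitted when there is code or context to show.
        parts.append(f"### Input:\n{record['input']}")
    parts.append(f"### Response:\n{record['output']}")
    return "\n\n".join(parts)


# Illustrative C/OpenMP snippet: iteration i reads a[i + 1], which
# iteration i + 1 writes, so parallel execution has a data race.
OPENMP_SNIPPET = """#pragma omp parallel for
for (int i = 0; i < n - 1; i++)
    a[i] = a[i + 1] + 1;"""

record = make_record(
    "Does the following OpenMP loop contain a data race? Answer yes or no.",
    OPENMP_SNIPPET,
    "yes",
)

jsonl_line = json.dumps(record)  # one line of a JSON Lines training file
prompt = to_prompt(record)       # the text a fine-tuning run would consume
```

Storing one record per line as JSON keeps generated instances easy to filter and deduplicate before training, while the flattening step stays separate so the template can change without regenerating the data.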

