CL | AI | Jan 9, 2024

TechGPT-2.0: A large language model project to solve the task of knowledge graph construction

arXiv:2401.04507v1 | 10 citations | h-index: 3 | Has Code
Originality: Synthesis-oriented
AI Analysis

This project is an incremental contribution for researchers in the Chinese open-source model community, with a focus on domain-specific applications such as medicine and law.

The authors address knowledge graph construction with TechGPT-2.0, a large language model project that strengthens named entity recognition and relationship triple extraction. They release two 7B model weights and a QLoRA weight for long texts; TechGPT-2.0 was trained on Huawei's Ascend server.

Large language models have exhibited robust performance across diverse natural language processing tasks. This report introduces TechGPT-2.0, a project designed to enhance the capabilities of large language models specifically in knowledge graph construction tasks, including named entity recognition (NER) and relationship triple extraction (RTE) in NLP applications. Additionally, it serves as an LLM accessible for research within the Chinese open-source model community. We offer two 7B large language model weights and a QLoRA weight specialized for processing lengthy texts. Notably, TechGPT-2.0 is trained on Huawei's Ascend server. Inheriting all functionalities from TechGPT-1.0, it exhibits robust text processing capabilities, particularly in the domains of medicine and law. Furthermore, we introduce new capabilities to the model, enabling it to process texts in various domains such as geographical areas, transportation, organizations, literary works, biology, natural sciences, astronomical objects, and architecture. These enhancements also strengthen the model's ability to handle hallucinations, unanswerable queries, and lengthy texts. This report provides a comprehensive and detailed introduction to the full fine-tuning process on Huawei's Ascend servers, encompassing experiences in Ascend server debugging, instruction fine-tuning data processing, and model training. Our code is available at https://github.com/neukg/TechGPT-2.0

Code Implementations: 1 repo
