LLM-Powered Ensemble Learning for Paper Source Tracing: A GPU-Free Approach
This work addresses the problem of resource-efficient reference source identification for academic researchers, though its contribution is incremental: it adapts existing LLM technology to a specific competition task.
The paper tackles the KDD CUP 2024 paper source tracing competition with a GPU-free approach based on closed-source LLMs, achieving 3rd place without fine-tuning any pre-trained model.
We participated in the KDD CUP 2024 paper source tracing competition and achieved 3rd place. The competition tasked participants with identifying the reference sources (ref-sources, in the organizers' terminology) of given academic papers. Unlike most teams, which addressed this challenge by fine-tuning pre-trained neural language models such as BERT or ChatGLM, we relied primarily on closed-source large language models (LLMs). Recent closed-source LLMs have demonstrated the ability to handle complex reasoning tasks in zero-shot and few-shot settings. Lacking GPUs, we therefore used closed-source LLMs to generate predicted reference sources directly from the provided papers, and then refined these predictions through ensemble learning. Notably, ours was the only award-winning approach that did not require GPUs for model training. Code is available at https://github.com/Cklwanfifa/KDDCUP2024-PST.
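The abstract does not specify how the LLM predictions are ensembled, so the following is only a minimal sketch of one plausible reading: several prediction runs (for example, different prompts or different LLMs) each assign a score to every candidate reference, and the scores are combined by a weighted average. The function names `ensemble_scores` and `rank_references`, the reference ids, and the scores are hypothetical and are not taken from the authors' repository; the LLM calls themselves are omitted.

```python
from typing import Dict, List, Optional


def ensemble_scores(
    runs: List[Dict[str, float]],
    weights: Optional[List[float]] = None,
) -> Dict[str, float]:
    """Weighted average of per-reference scores across prediction runs.

    Assumption: a reference missing from a run counts as scored 0.0 there.
    """
    if not runs:
        return {}
    if weights is None:
        weights = [1.0] * len(runs)  # default: every run counts equally
    if len(weights) != len(runs):
        raise ValueError("need exactly one weight per run")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    combined: Dict[str, float] = {}
    for run, weight in zip(runs, weights):
        for ref_id, score in run.items():
            combined[ref_id] = combined.get(ref_id, 0.0) + weight * score
    return {ref_id: s / total for ref_id, s in combined.items()}


def rank_references(scores: Dict[str, float]) -> List[str]:
    """Sort reference ids by descending score, breaking ties by id."""
    return sorted(scores, key=lambda ref_id: (-scores[ref_id], ref_id))


# Illustrative, made-up outputs of three hypothetical LLM runs for one paper:
# 1.0 means the run flagged the reference as a ref-source, 0.0 means it did not.
runs = [
    {"ref_a": 1.0, "ref_b": 0.0, "ref_c": 1.0},
    {"ref_a": 1.0, "ref_b": 0.0},
    {"ref_a": 0.0, "ref_b": 1.0, "ref_c": 1.0},
]
combined = ensemble_scores(runs)
ranking = rank_references(combined)
```

Averaging is only one option; a learned combiner over the per-run scores (for instance, a gradient-boosted model trained on CPU) would fit the same GPU-free constraint, but whether the authors used one is not stated here.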