SE AI | Feb 16, 2025

Performance Review on LLM for solving leetcode problems

Lun Wang, Chuanqi Shi, Shaoshui Du, Yiyi Tao, Yixian Shen, Hang Zheng, Yanxin Shen, Xinyu Qiu

arXiv:2502.15770v2 | 11.312 citations | h-index: 7 | 2024 4th International Symposium on Artificial Intelligence and Intelligent Manufacturing (AIIM)

Originality: Synthesis-oriented

AI Analysis

This work assesses LLMs for automated programming assistance. The contribution is incremental: it applies existing methods to a new dataset.

This paper evaluates how well Large Language Models (LLMs) such as GPT-4 and GPT-3.5-turbo solve Leetcode programming problems. It reports pass@k success rates and a runtime analysis, and identifies strengths and limitations in code generation and problem solving.

This paper presents a comprehensive performance evaluation of Large Language Models (LLMs) in solving programming challenges from Leetcode, a widely used platform for algorithm practice and technical interviews. We began by crawling the Leetcode website to collect a diverse set of problems encompassing various difficulty levels and topics. Using this dataset, we generated solutions with multiple LLMs, including GPT-4 and GPT-3.5-turbo (ChatGPT-turbo). The generated solutions were systematically evaluated for correctness and efficiency. We employed the pass@k metric to assess the success rates within a given number of attempts and analyzed the runtime performance of the solutions. Our results highlight the strengths and limitations of current LLMs [10] in code generation and problem-solving tasks, providing insights into their potential applications and areas for improvement in automated programming assistance.
