CL Feb 19, 2023

Can ChatGPT Understand Too? A Comparative Study on ChatGPT and Fine-tuned BERT

Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, Dacheng Tao

arXiv:2302.10198v223.1310 citations h-index: 36 Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the lack of quantitative analysis of ChatGPT's understanding ability, giving NLP researchers a direct comparison with fine-tuned BERT-style models.

The study quantitatively evaluated ChatGPT's understanding ability on the GLUE benchmark, finding it falls short in paraphrase and similarity tasks, outperforms fine-tuned BERT models in inference tasks by a large margin, and achieves comparable performance in sentiment analysis and question-answering tasks.

Recently, ChatGPT has attracted great attention, as it can generate fluent and high-quality responses to human inquiries. Several prior studies have shown that ChatGPT attains remarkable generation ability compared with existing models. However, the quantitative analysis of ChatGPT's understanding ability has been given little attention. In this report, we explore the understanding ability of ChatGPT by evaluating it on the most popular GLUE benchmark, and comparing it with 4 representative fine-tuned BERT-style models. We find that: 1) ChatGPT falls short in handling paraphrase and similarity tasks; 2) ChatGPT outperforms all BERT models on inference tasks by a large margin; 3) ChatGPT achieves comparable performance compared with BERT on sentiment analysis and question-answering tasks. Additionally, by combining some advanced prompting strategies, we show that the understanding ability of ChatGPT can be further improved.
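
To make the evaluation setup concrete, here is a minimal sketch of how a free-text chat response might be scored on a GLUE-style sentiment task. The abstract does not give the prompts or the answer-parsing rules, so the prompt template, the `LABEL_WORDS` mapping, and the `parse_label`, `accuracy`, and `toy_model` names are all assumptions made for illustration, not the authors' method. The model is a stand-in function; no real API is called.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical mapping from answer words to label ids (an assumption).
LABEL_WORDS = {"positive": 1, "negative": 0}


def parse_label(response: str) -> Optional[int]:
    """Map a free-text response to a label id, or None if it is ambiguous."""
    text = response.lower()
    hits = {label for word, label in LABEL_WORDS.items() if word in text}
    # Accept the response only when exactly one label is mentioned.
    return hits.pop() if len(hits) == 1 else None


def accuracy(examples: List[Tuple[str, int]],
             respond: Callable[[str], str]) -> float:
    """Fraction of examples whose parsed response matches the gold label."""
    correct = 0
    for sentence, gold in examples:
        # Illustrative prompt template, not taken from the paper.
        prompt = (f'For the sentence: "{sentence}", '
                  'is the sentiment positive or negative?')
        if parse_label(respond(prompt)) == gold:
            correct += 1
    return correct / len(examples)


def toy_model(prompt: str) -> str:
    """Stand-in for a chat model: answers by a single keyword check."""
    if "great" in prompt.lower():
        return "The sentiment is positive."
    return "The sentiment is negative."


examples = [("A great film.", 1), ("A dull film.", 0), ("Great fun.", 0)]
score = accuracy(examples, toy_model)
```

Unparseable or ambiguous responses count as errors in this sketch; a real evaluation would need to state how such cases are handled, since that choice affects the reported scores.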
