ChatGPT’s answers to software engineering questions were 52% incorrect

OpenAI’s ChatGPT answered about 52 per cent of software engineering questions incorrectly, according to a study, raising questions about the popular language model’s accuracy.
Despite ChatGPT’s popularity, there hasn’t been a thorough investigation into the quality and usability of its responses to software engineering queries, said researchers from Purdue University in the US.
To address this gap, the team undertook a comprehensive analysis of ChatGPT’s replies to 517 questions from Stack Overflow (SO).
“Our examination revealed that 52 per cent of ChatGPT’s answers contain inaccuracies and 77 per cent are verbose,” the researchers wrote in the paper, which has not been peer-reviewed and was published on a pre-print site.
Importantly, the team found that 54 per cent of the errors arose because ChatGPT did not understand the concepts behind the questions.
Even when it could understand the question, it failed to show an understanding of how to solve the problem, contributing to a high number of conceptual errors, they said. Further, the researchers observed limitations in ChatGPT’s reasoning. “In many cases, we saw ChatGPT give a solution, code, or formula without foresight or thinking about the outcome,” they said.