Technology

Latest Version of ChatGPT Hallucinates Much Less

Published March 3, 2025

Companies such as OpenAI, Meta, xAI, and DeepSeek have been rapidly improving their AI chatbots. Attention has shifted accordingly: the question is no longer only which company has the most advanced or most conversational AI, but also which chatbot gives the most accurate answers to users' questions.

OpenAI recently introduced GPT-4.5, a new version of the AI model that powers ChatGPT. The company describes it as its "largest and best model for chat yet."

According to a blog post by OpenAI, GPT-4.5 is designed to better understand and react to subtle cues in users' written prompts. This model is optimized for various tasks, including chatting, writing, and coding.

One of the significant improvements in GPT-4.5 is a reduced tendency to produce "hallucinations," the term for instances in which an AI model generates incorrect or fabricated responses to user questions.

OpenAI reported that GPT-4.5 has a hallucination rate of 37.1% on its SimpleQA benchmark, a marked decrease from 61.8% for GPT-4o.
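To put the two reported rates in perspective, the short Python sketch below works out the absolute and relative reduction. The function name `hallucination_drop` is a hypothetical helper introduced only for illustration; the only inputs are the two percentages quoted above.

```python
def hallucination_drop(old_rate: float, new_rate: float) -> tuple[float, float]:
    """Return (absolute drop in percentage points, relative drop in percent)."""
    absolute = old_rate - new_rate          # difference in percentage points
    relative = absolute / old_rate * 100    # share of the old rate that was eliminated
    return absolute, relative


# Rates as reported in the article: 61.8% before, 37.1% for GPT-4.5.
absolute, relative = hallucination_drop(61.8, 37.1)
print(f"{absolute:.1f} percentage points, {relative:.1f}% relative reduction")
# prints "24.7 percentage points, 40.0% relative reduction"
```

In other words, the reported rate fell by roughly two fifths, although more than a third of the benchmark answers still counted as hallucinations.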

To achieve these improvements, OpenAI relied on post-training, a stage in which human feedback is used to refine the model's responses, according to Bloomberg. The company has also developed new training methods that use data from the training of its existing GPT-4o model, said Mia Glaese, a vice president of research at OpenAI.

In its blog post, OpenAI wrote, "Early testing shows that interacting with GPT-4.5 feels more natural." The company said the model's broader knowledge, improved ability to follow user intent, and greater emotional intelligence (EQ) make it useful for improving writing, programming, and solving practical problems.

Despite the progress in minimizing hallucinations, OpenAI's CEO, Sam Altman, has urged caution regarding expectations for GPT-4.5. He remarked, "It is a giant, expensive model. A heads-up: this isn't a reasoning model and won't crush benchmarks."

Altman's comments come as competitors are also making strides in AI technology. Elon Musk's xAI, for instance, recently launched its updated Grok 3 chatbot model, which it says was trained with "more than 10 times" the computing power of its predecessor. Musk claims that Grok 3 outperforms systems such as OpenAI's GPT-4o, Google's Gemini, DeepSeek's V3 model, and Anthropic's Claude across multiple math, science, and coding benchmarks.

xAI asserts that Grok 3 surpasses GPT-4o on benchmarks such as AIME, which consists of competition math questions, and GPQA, which uses complex PhD-level problems in physics, biology, and chemistry. The Grok 3 family includes two reasoning models, Grok 3 Reasoning and Grok 3 mini Reasoning, which reportedly "think through" problems in a way similar to reasoning models such as OpenAI's o3-mini and DeepSeek's R1.

For now, GPT-4.5 is available as a "research preview" to a limited group of software developers and ChatGPT Pro subscribers. OpenAI plans to roll the model out more widely after gathering their feedback and making any necessary adjustments.
