Study Finds ChatGPT Search Provides Inaccurate Answers
ChatGPT has already been viewed as a potential competitor to Google Search, but the introduction of ChatGPT Search was anticipated to further solidify its position against alternatives like Perplexity AI. However, a recent study conducted by Columbia University’s Tow Center for Digital Journalism indicates that ChatGPT Search often fails to deliver precise answers to user inquiries.
The researchers examined content from 20 publications across three distinct categories: those that collaborate with OpenAI to have their content used in ChatGPT Search results, those involved in litigation against OpenAI, and independent publishers who have either granted or denied permission for ChatGPT to scrape their articles.
To assess accuracy, the study selected 10 articles from each publisher and focused on specific quotes that, when searched on conventional search engines like Google or Bing, reliably returned the original source in the top three results. The study then evaluated how effectively ChatGPT's new search tool recognized these original sources.
Interestingly, 40 quotes were sourced from publications that collaborate with OpenAI but have not permitted their content to be scraped. Despite this restriction, ChatGPT Search confidently provided incorrect or fabricated answers.
According to the study, "ChatGPT produced responses that were partially or wholly wrong on 153 occasions, yet it only acknowledged its inability to accurately respond seven times." In those limited instances, the chatbot used cautious language like "appears," "it’s possible," or stated, "I couldn’t locate the exact article," which suggests a lack of commitment to correctness.
ChatGPT Search's casual approach to accuracy raises concerns not only about its own credibility but also about the reputations of the sources it cites. For example, during the study, it misattributed a report originally published by Time to the Orlando Sentinel, and it linked to a third-party website that had duplicated a New York Times article rather than to the article itself.
In response to the findings, OpenAI contended that the issues noted in the study were a result of flaws in the methodology employed by the Columbia researchers. They stated, "Misattribution is challenging to resolve without access to both the data and the methodology that the Tow Center did not provide," arguing that the findings don’t reflect a standard evaluation of their product. Furthermore, OpenAI promised to continue improving its search functionalities.