Technology

Major Websites Impede Google From Utilizing Content in AI Development

Published March 15, 2024

A cohort of leading online entities has recently initiated measures to obstruct Alphabet Inc.'s Google from appropriating their content for the enhancement of artificial intelligence (AI) models. The implicated websites have adopted this approach as a defensive measure against the perceived jeopardy that AI systems may present to the status quo of web traffic distribution.

Introduction of Google-Extended

A recent innovation, dubbed Google-Extended, affords webmasters the capability to bar Google from leveraging their site content in AI training endeavors. This tool has garnered implementation among approximately 10% of the foremost 1,000 websites, according to insights from Originality.ai noted by Business Insider.

High-Profile Entities Take A Stand

Leading news outlet The New York Times, in the throes of an AI copyright tussle with ChatGPT-progenitor OpenAI, has instated the Google-Extended blockade. Correspondingly, it has constricted OpenAI's access to its articles. Other prominent platforms deploying Google-Extended encompass CNN, BBC, Yelp, and Business Insider.

Nonetheless, compared to alternative AI data restriction tools such as OpenAI’s GPTBot, which is currently active on nearly one-third of the top-ranked 1,000 websites, Google-Extended's adoption is less widespread. Jonathan Gillham, Chief Executive Officer at Originality.ai, postulated that the reluctance to endorse Google-Extended may stem from the potential fallout of websites being omitted from AI-crafted search results, should they choose to prohibit Google's data access.

Risks to Web Traffic

Publishers have voiced alarm over the implications of Google's AI-driven search enhancements, which could autonomously furnish answers to user inquiries, thereby diminishing the necessity for users to engage with external sites. For instance, The Atlantic has aired its concern, noting the pivotal reliance on Google for a substantial fraction of its online visitor traffic.

The ongoing copyright dispute between The New York Times and OpenAI is symptomatic of a broader conflict within the AI sector. Tensions have been escalating, highlighted by Elon Musk's accusations that OpenAI purloined significant amounts of data for its AI program, Sora.

Alphabet, Google, AI