ChatGPT Receives Personal Assistant Upgrade for Premium Users
On Thursday, OpenAI announced a new feature called Operator, designed to turn ChatGPT into a virtual assistant that can carry out tasks such as ordering food and booking flights. For now, the service is available only to Pro subscribers, a tier that costs $200 per month in the U.S.
The tool lets ChatGPT browse the internet on its own, a significant step for the company toward autonomous web agents. It also highlights a growing financial divide in access to advanced AI: users who pay more get more capable tools, while those on lower tiers go without.
The service is accessed through operator.chatgpt.com, where users can ask ChatGPT to handle various online activities on their behalf.
There have been earlier attempts at similar functionality, including OpenAI's own plugin store and the so-called Large Action Models. Those systems often relied on APIs, which made them difficult and inconvenient to set up.
What sets Operator apart is how it works. Instead of depending on APIs, it uses a cloud-based browser and mimics human behavior by clicking buttons and filling out forms. As Operator works, it captures screenshots to show users what actions it is taking.
For instance, if you want to buy a ticket for an event, ChatGPT can open a browser, visit the relevant website, search for the event, and display the ticket options before asking you to confirm the payment.
The screenshots also give users a transparent, step-by-step record of what the AI is doing. If anything goes wrong, users can take manual control at any time.
To build this capability, OpenAI developed a dedicated AI model that can visually interpret what is shown in a web browser and carry out tasks using keyboard and mouse commands. This new model, powered by GPT-4o, is called the Computer-Using Agent (CUA).
Rather than following fixed scripts, the model can read and understand website layouts, adapt to different designs, and handle unexpected pop-ups or error messages.
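The screenshot, decide, act cycle described above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not OpenAI's implementation: `FakeBrowser`, `Action`, `scripted_policy`, and `run_agent` are hypothetical names, the "screenshot" is a text label rather than pixels, and the policy replays a fixed list of actions where a real agent would call a vision model.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Action:
    kind: str        # "click", "type", "confirm", or "done" (hypothetical set)
    target: str = ""
    text: str = ""


@dataclass
class FakeBrowser:
    """Stand-in for a remote browser: records actions instead of loading pages."""
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real agent would receive pixels; here it is just a step label.
        return f"step {len(self.log)}"

    def apply(self, action: Action) -> None:
        self.log.append(f"{action.kind}:{action.target}:{action.text}")


def scripted_policy(script: List[Action]) -> Callable[[str], Action]:
    """Hypothetical policy that replays a fixed script, then reports 'done'."""
    steps = iter(script)
    return lambda screenshot: next(steps, Action("done"))


def run_agent(
    browser: FakeBrowser,
    policy: Callable[[str], Action],
    approve: Callable[[Action], bool],
    max_steps: int = 20,
) -> Tuple[str, List[str]]:
    """Observe, decide, act; pause for approval on sensitive steps."""
    screenshots: List[str] = []
    for _ in range(max_steps):
        shot = browser.screenshot()
        screenshots.append(shot)          # kept so the user can review each step
        action = policy(shot)
        if action.kind == "done":
            return "done", screenshots
        if action.kind == "confirm" and not approve(action):
            return "handed_to_user", screenshots   # user takes manual control
        browser.apply(action)
    return "step_limit", screenshots


if __name__ == "__main__":
    script = [
        Action("click", "search box"),
        Action("type", "search box", "concert tickets"),
        Action("confirm", "pay button"),
    ]
    status, shots = run_agent(FakeBrowser(), scripted_policy(script), lambda a: True)
    print(status, shots)
```

The `approve` callback stands in for the confirmation prompt before a payment: if the user declines, the loop stops and control returns to them.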
The system has some impressive tricks. For example, if a user provides a photo of a messy handwritten shopping list, it can use the model's vision capabilities to read the list and then place an order with a designated grocery store.
OpenAI has partnered with various companies to improve how well the tool works across different platforms.
When a task involves booking a ride or ordering food, the AI can interact smoothly with services such as Uber and DoorDash, since it is already set up to work with their interfaces.
For websites without such support, Operator will still attempt the task using its general browsing capabilities, which makes it more flexible than earlier API-dependent tools.
Performance figures released by OpenAI indicate that Operator significantly outperforms other leading models. It scored 38.1% on OSWorld, a benchmark of general computer-use tasks, compared with 22% for the best competitor, and 58.1% on WebArena, a benchmark of web-browsing tasks on simulated sites that include e-commerce, against 36.2% for other models.
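To put those figures in perspective, the short sketch below computes the gap in percentage points and the relative improvement for each benchmark. It uses only the scores quoted above; the `compare` helper is a hypothetical name introduced for illustration.

```python
from typing import Tuple

# Scores in percent, as quoted in the article: (Operator, best competitor).
SCORES = {
    "OSWorld": (38.1, 22.0),
    "WebArena": (58.1, 36.2),
}


def compare(operator: float, previous_best: float) -> Tuple[float, float]:
    """Return (gain in percentage points, relative gain in percent)."""
    absolute = round(operator - previous_best, 1)
    relative = round((operator / previous_best - 1) * 100, 1)
    return absolute, relative


for name, (op, prev) in SCORES.items():
    points, pct = compare(op, prev)
    print(f"{name}: +{points} points, {pct}% relative improvement")
```

On these numbers, the lead is roughly 16 points on OSWorld and 22 points on WebArena, though both scores still leave a large share of tasks unsolved.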
Operator is still a research preview, however, so users should expect errors and bugs.
One concern for security-conscious users is that they must trust Operator with their login details. The cloud-based browser needs access to personal accounts to do its job, and because Operator cannot run in a local browser, some users may hesitate to trust OpenAI with sensitive information.
OpenAI plans a wider launch soon, with availability for Plus subscribers to follow. Developers stand to benefit as well: OpenAI is expected to release Operator through its API in the coming weeks, which could lead to a new wave of AI-driven automation applications.
OpenAI has indicated that further enhancements are on the way. During demonstrations, the team said it aims to expand its range of AI agents beyond the current general-purpose assistant.