Anthropic Introduces Computer Control Feature in Claude 3.5 Sonnet AI
Anthropic has unveiled a new capability in its Claude 3.5 Sonnet AI model that allows it to control a computer autonomously. Known as "computer use," the capability lets Claude interact with a computer much as a person would: it can look at the screen, move the cursor, click buttons, and type text.
The computer use feature is currently in public beta and available through the API, so developers can build it into their own applications. Anthropic has also released a demonstration showing Claude operating a Mac.
Other tech giants have introduced related features. Microsoft's Copilot Vision and OpenAI's desktop app for ChatGPT can interpret what is on the computer screen to help with tasks, and Google's Gemini app offers comparable capabilities for Android users. However, none of these companies has yet rolled out a tool that mimics human interaction by clicking around and completing tasks autonomously.
Anthropic stresses that the computer use feature is still experimental and acknowledges that it may be "cumbersome and error-prone" at this stage. The company says the early release is meant to gather feedback from developers, and it expects the feature to improve rapidly.
Anthropic's developers have indicated that there are specific actions Claude cannot yet perform, such as dragging and zooming. Claude's perception of the screen is also described as a "flipbook" approach: it takes a series of screenshots and pieces them together rather than processing a real-time video stream. As a result, it may miss short-lived actions or notifications that appear and disappear between screenshots.
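The "flipbook" limitation can be illustrated with a small simulation. The Python sketch below is not Anthropic's implementation: the ScreenEvent class, the two-second screenshot interval, and the example events are all hypothetical, chosen only to show how sampling a screen at discrete moments can miss something that appears and disappears between two captures.

```python
from dataclasses import dataclass


@dataclass
class ScreenEvent:
    """Something visible on screen during [start, end), in seconds (hypothetical)."""
    name: str
    start: float
    end: float


def screenshot_times(duration: float, interval: float) -> list[float]:
    """Moments at which an assumed agent captures a screenshot."""
    count = int(duration // interval) + 1
    return [i * interval for i in range(count)]


def split_by_visibility(events, times):
    """Separate events caught in at least one screenshot from those never caught."""
    seen, missed = [], []
    for event in events:
        captured = any(event.start <= t < event.end for t in times)
        (seen if captured else missed).append(event.name)
    return seen, missed


# Illustrative timeline; none of these values come from Anthropic.
events = [
    ScreenEvent("dialog box", 1.0, 5.0),
    ScreenEvent("toast notification", 2.5, 3.5),
    ScreenEvent("progress bar", 6.0, 9.0),
]

times = screenshot_times(duration=10.0, interval=2.0)
seen, missed = split_by_visibility(events, times)
print("seen:", seen)
print("missed:", missed)
```

In this made-up timeline the one-second notification falls entirely between the screenshots taken at 2 and 4 seconds, so it never appears in any frame, while the longer-lived dialog box and progress bar are both captured.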
To promote responsible use, this version of Claude is designed to avoid engaging with social media and has built-in mechanisms to monitor requests related to election activities. It is also guided to refrain from generating social media content, registering domains, or interacting with government websites.
Alongside computer use, Anthropic has reported improvements in other areas of the Claude 3.5 Sonnet model. The latest version scores higher across a range of benchmarks while keeping the same pricing and efficiency as its predecessor, with the largest gains in agentic coding and tool use.
On coding, the updated model raises its score on the SWE-bench Verified benchmark from 33.4% to 49.0%, surpassing all publicly available models. On TAU-bench, an agentic tool use evaluation, it improves from 62.6% to 69.2% in the retail domain and from 36.0% to 46.0% in the more complex airline domain. Anthropic's results suggest Claude is becoming more capable at multi-step, tool-driven tasks.
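To put the reported scores side by side, the short Python sketch below computes each improvement both in percentage points and relative to the earlier score. The scores are the ones quoted above; the gains helper and the one-decimal rounding are illustrative choices, not part of any benchmark tooling.

```python
# Scores (in percent) as reported in the article: (before, after).
scores = {
    "SWE-bench Verified": (33.4, 49.0),
    "TAU-bench retail": (62.6, 69.2),
    "TAU-bench airline": (36.0, 46.0),
}


def gains(before: float, after: float) -> tuple[float, float]:
    """Return (percentage-point gain, relative gain in percent), rounded to 1 decimal."""
    points = after - before
    relative = points / before * 100
    return round(points, 1), round(relative, 1)


results = {name: gains(before, after) for name, (before, after) in scores.items()}

for name, (points, relative) in results.items():
    print(f"{name}: +{points} points ({relative}% relative to the earlier score)")
```

The absolute gains work out to 15.6, 6.6, and 10.0 percentage points respectively. The relative figures show that the SWE-bench Verified and airline results are the larger jumps when measured against where the earlier model started.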