Anthropic's New AI Model Operates Computers Like Humans, Errors Included

Have you ever dreamed of an AI that can interact with your computer just like a human would? Thanks to Anthropic, that dream is getting closer to reality. On Tuesday, the company unveiled an upgraded version of its Claude 3.5 Sonnet model that can operate a computer on its own. The capability is currently in public beta, and developers can experiment with it through an API.
Anthropic proudly labels Claude 3.5 Sonnet as the "first frontier AI model to offer computer use in public beta." This means developers can program it to perform a variety of tasks on a computer, such as viewing the screen, maneuvering the cursor, clicking buttons, and even typing on a virtual keyboard. The goal? To replicate the way we interact with our computers every day.
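To make that list of actions concrete, here is a minimal sketch of how a developer's harness might carry out actions requested by a model. Everything in it is an assumption for illustration: the VirtualDesktop class, the apply_action function, and the action names (move_cursor, click, type_text) are hypothetical and are not Anthropic's actual API schema.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualDesktop:
    """Toy stand-in for a real screen, mouse, and keyboard (hypothetical)."""
    width: int = 1024
    height: int = 768
    cursor: tuple = (0, 0)
    clicks: list = field(default_factory=list)
    typed: str = ""

    def screenshot(self) -> str:
        # A real harness would return image data; a text summary is enough here.
        return f"{self.width}x{self.height} cursor={self.cursor} typed={self.typed!r}"


def apply_action(desktop: VirtualDesktop, action: dict) -> str:
    """Carry out one requested action and return what the screen now shows."""
    kind = action["type"]
    if kind == "screenshot":
        pass  # viewing the screen changes nothing
    elif kind == "move_cursor":
        x, y = action["x"], action["y"]
        if not (0 <= x < desktop.width and 0 <= y < desktop.height):
            raise ValueError(f"cursor target {(x, y)} is off screen")
        desktop.cursor = (x, y)
    elif kind == "click":
        # A click always lands wherever the cursor currently is.
        desktop.clicks.append(desktop.cursor)
    elif kind == "type_text":
        desktop.typed += action["text"]
    else:
        raise ValueError(f"unknown action type: {kind}")
    return desktop.screenshot()


if __name__ == "__main__":
    desktop = VirtualDesktop()
    for step in (
        {"type": "move_cursor", "x": 200, "y": 150},
        {"type": "click"},
        {"type": "type_text", "text": "hello"},
    ):
        print(apply_action(desktop, step))
```

Returning a fresh view of the screen after every action mirrors the idea that the model only knows what it can see, so each step ends with a new observation.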
Because the capability is still experimental, it has its hiccups: it can be clumsy and error-prone at times. That's exactly why Anthropic released it as a beta, to gather feedback from developers and refine the model over time.
Why Should We Care About AI Using Computers?
Anthropic has a clear answer to that question: "A vast amount of modern work happens via computers." By enabling AI to interact with software the same way humans do, the company hopes to unlock a range of new applications that current AI assistants can't handle.
How Can Developers and Users Benefit?
Instead of creating specific tools for each task, Anthropic is teaching Claude general computer skills. This allows the AI to utilize a wide range of standard software programs designed for humans. Developers can harness this capability to automate repetitive tasks, build and test software, and even conduct research.
Several companies are already leveraging Claude 3.5 Sonnet's computer skills, including Asana, Canva, Cognition, DoorDash, Replit, and The Browser Company. For instance, Replit is using these capabilities to enhance its Replit Agent product.
How Did They Train Claude to Use Computers?
Training Claude to navigate a computer involved a lot of trial and error, according to Anthropic. The model has to interpret images of the computer screen and then decide which actions to take based on what it sees. Claude 3.5 Sonnet does this by analyzing screenshots, counting how many pixels the cursor must move vertically and horizontally to reach its target, and issuing mouse commands.
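The screenshot, pixel-counting, and mouse-command steps can be pictured with a toy example. The sketch below is an assumption-heavy illustration, not Anthropic's method: the "screenshot" is a small grid of characters, the button is marked with the letter B, and the function names and command tuples are all hypothetical.

```python
# A tiny text "screenshot": each character is one pixel, B marks the button.
SCREEN = [
    "..........",
    "..........",
    "....BBB...",
    "....BBB...",
    "..........",
]


def find_target_center(screen, marker="B"):
    """Analyze the screenshot and return the (x, y) center of the marked region."""
    xs = [x for row in screen for x, ch in enumerate(row) if ch == marker]
    ys = [y for y, row in enumerate(screen) for ch in row if ch == marker]
    if not xs:
        raise ValueError("target not visible in screenshot")
    return ((min(xs) + max(xs)) // 2, (min(ys) + max(ys)) // 2)


def plan_mouse_commands(cursor, target):
    """Count the pixels between cursor and target, then emit mouse commands."""
    dx = target[0] - cursor[0]  # horizontal pixels to travel
    dy = target[1] - cursor[1]  # vertical pixels to travel
    return [("move_relative", dx, dy), ("click",)]


def run_step(screen, cursor, marker="B"):
    """Observe, decide, act: return the final cursor and whether the click hit."""
    target = find_target_center(screen, marker)
    for command in plan_mouse_commands(cursor, target):
        if command[0] == "move_relative":
            cursor = (cursor[0] + command[1], cursor[1] + command[2])
        elif command[0] == "click":
            x, y = cursor
            return cursor, screen[y][x] == marker
    return cursor, False


if __name__ == "__main__":
    print(run_step(SCREEN, (0, 0)))
```

A real screen has millions of pixels and no convenient marker, which is part of why the real task is so much harder than this sketch suggests.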
How Well Is Claude Performing?
On OSWorld, a benchmark that assesses AI models' ability to use computers, Claude 3.5 Sonnet scored 14.9%. That is far below the 70% to 75% typical of human performance, but nearly double the 7.7% scored by the next-best AI model in the same category.
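For readers who want to check the comparison, the short calculation below works through the figures quoted above. The variable names are illustrative only, and the numbers are taken straight from the article rather than from any independent run of the benchmark.

```python
# Scores quoted in the article, in percent.
CLAUDE_SCORE = 14.9
NEXT_BEST_SCORE = 7.7
HUMAN_RANGE = (70.0, 75.0)

# How many times the next-best score Claude achieved.
ratio_to_next_best = CLAUDE_SCORE / NEXT_BEST_SCORE

# Claude's score as a share of each end of the human range.
share_of_human = tuple(CLAUDE_SCORE / human for human in HUMAN_RANGE)

# Remaining gap to human-level performance, in percentage points.
gap_to_human = tuple(human - CLAUDE_SCORE for human in HUMAN_RANGE)

if __name__ == "__main__":
    print(f"{ratio_to_next_best:.2f}x the next-best score")
    print("share of human level: " + ", ".join(f"{s:.1%}" for s in share_of_human))
    print("gap in points: " + ", ".join(f"{g:.1f}" for g in gap_to_human))
```

The ratio comes out at a little over 1.9, which is where "nearly double" comes from, while the model reaches only about a fifth of human-level performance.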
Despite these promising results, Claude's computer use is still in its infancy. It still has trouble with more complex actions such as dragging windows or zooming into the screen. And because it works from a series of screenshots rather than a continuous view of the screen, it can miss short-lived actions and notifications.
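Why would screenshots cause missed notifications? A rough sketch, assuming screenshots are taken at a fixed interval, shows the idea: anything that appears and disappears between two captures is never seen. The interval, the timings, and the function names below are all made up for illustration.

```python
def capture_times(interval_s, count):
    """Times (in seconds) at which screenshots are taken, starting at zero."""
    return [i * interval_s for i in range(count)]


def is_captured(event_start_s, event_duration_s, shots):
    """True if at least one screenshot falls while the event is on screen."""
    event_end_s = event_start_s + event_duration_s
    return any(event_start_s <= t < event_end_s for t in shots)


if __name__ == "__main__":
    shots = capture_times(interval_s=2.0, count=5)  # captures at 0, 2, 4, 6, 8 s
    # A one-second notification that fits entirely between two captures.
    print(is_captured(0.5, 1.0, shots))
    # The same notification, starting a second later, overlaps the 2 s capture.
    print(is_captured(1.5, 1.0, shots))
```

Taking screenshots more often narrows the blind spots but never removes them entirely, and every extra screenshot is one more image for the model to process.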
Anthropic remains optimistic, stating, "We expect that computer use will rapidly improve to become faster, more reliable, and more useful for the tasks our users want to complete." The company also says that as the technology evolves, it will become more accessible to people with less software development experience, and that strict safety measures will stay in place along the way.
Claude 3.5 Sonnet is now accessible to everyone. Developers can start building applications with the computer-use beta on the Anthropic API, Amazon Bedrock, and Google Cloud's Vertex AI.
Comments (8)
When an AI starts making the same mistakes I make on a computer, it's reassuring in its own way 😂 What worries me is how ready we are to trust software with this kind of direct interaction with the interface. It's a straight road both to incredible productivity and to total chaos if something goes wrong. It seems like time to think about new "rules of the road" for robot assistants.
This AI that makes mistakes like a human sounds both funny and a little unsettling 😅 So we've created the ideal digital intern, one that also mixes up Ctrl+C and Ctrl+V? I wonder how this will affect security: what if it accidentally deletes something important while trying to "help"?
Wow, Claude 3.5 Sonnet sounds like a game-changer! An AI that mimics human computer use, errors and all? That’s wild. I wonder how it handles my messy desktop and random browser tabs 😅. Super curious to see it in action!
This AI acting like a human on computers is wild! 😮 Makes me wonder if it’ll start rage-quitting when apps crash like I do.







