Anthropic launches Claude Sonnet 4.5 with coding improvements and major advance in computer use

29/09/2025

Anthropic has introduced Claude Sonnet 4.5, its new artificial intelligence model that leads programming and computer use evaluations. The launch includes updates to Claude Code, the API and applications, plus the new Claude Agent SDK for developers.

Anthropic launches Claude Sonnet 4.5 with coding improvements and major advance in computer use

Claude Sonnet 4.5 leads evaluations of programming and computer use capabilities in real-world conditions. In SWE-bench Verified, a test measuring coding skills in real situations, the model reaches 82.0%. In OSWorld, which evaluates the ability to perform real computing tasks, it reaches 61.4%, compared to 42.2% achieved by Claude Sonnet 4 four months ago. According to Anthropic, the model can maintain focus for over 30 hours on complex, multi-step tasks.

The launch includes significant updates to the company's products. Claude Code, the command-line tool for developers, incorporates checkpoints that allow saving progress and returning to previous states instantly. The terminal interface has been completely redesigned and a native VS Code extension has been launched. The Claude API adds context editing and memory functions that enable agents to execute longer and more complex tasks.

Claude applications now integrate code execution and file creation directly into conversations. Users can generate spreadsheets, presentations and documents without leaving the chat. The Claude for Chrome extension, available to Max subscribers who joined the waitlist last month, allows the model to navigate websites, fill spreadsheets and complete tasks directly in the browser.

Alongside the model, Anthropic launches the Claude Agent SDK, the infrastructure it uses internally to develop Claude Code. The kit provides developers with tools to build AI agents, including memory management systems for long-running tasks, permissions that balance autonomy with user control, and coordination of subagents working toward common goals. While Claude Code focuses on programming, the SDK can be applied to a wide variety of tasks.

Regarding alignment and safety, Anthropic describes this as its most aligned model to date. Internal evaluations show significant reductions in problematic behaviors such as excessive flattery, deception, power-seeking and tendency to encourage delusional thinking. For agent and computer use capabilities, defenses against prompt injection attacks have been implemented.

The model is released under AI Safety Level 3 protections, which include classifiers to detect potentially dangerous content related to chemical, biological, radiological and nuclear weapons. These classifiers may occasionally incorrectly identify normal content, so Anthropic has made it easy for users to continue interrupted conversations with Sonnet 4, a model that presents lower risk in this area.

Experts in finance, law, medicine and STEM disciplines have evaluated the model and found notable improvements in domain-specific knowledge and reasoning compared to previous models, including Opus 4.1. Anthropic has published detailed safety and alignment evaluations that, for the first time, include tests using mechanistic interpretability techniques.

The model is available today through the Claude API with the identifier claude-sonnet-4-5, maintaining the same pricing structure as its predecessor. Anthropic recommends upgrading to Claude Sonnet 4.5 for all uses, as it functions as a drop-in replacement with improved performance.

Key points

  • Claude Sonnet 4.5 reaches 82.0% in SWE-bench Verified and 61.4% in OSWorld, leading programming and computer use evaluations
  • The model can maintain focus for over 30 hours on complex, multi-step tasks
  • Anthropic launches the Claude Agent SDK, the infrastructure it uses internally to develop its products, now available to developers
  • Claude Code incorporates checkpoints to save progress, redesigned interface and native VS Code extension
  • Claude applications allow code execution and file creation (spreadsheets, presentations, documents) directly in conversations
  • The API adds memory and context editing functions so agents can execute longer and more complex tasks
  • The model shows significant reductions in misaligned behaviors such as flattery, deception and power-seeking according to internal evaluations
  • Defenses against prompt injection attacks and classifiers to detect CBRN weapons-related content are implemented under AI Safety Level 3 protections

Videos

Related AI

Anthropic

AI systems you can rely on

Anthropic develops reliable and interpretable artificial intelligence systems through a scientific approach to safety. The company integrates advanced research and multidisciplinary collaboration to ...

Claude

Create with Claude

Claude is a conversational AI system from Anthropic designed to process natural language and images, providing analysis, logical reasoning, code generation, and multilingual communication under ...

Claude Code

Terminal-based coding assistant

Claude Code is an agentic coding tool for terminal that integrates AI into the development workflow. It enables file editing, problem solving, test execution and git management through natural ...

Lastest news

Trustpilot
This website uses technical, personalization and analysis cookies, both our own and from third parties, to facilitate anonymous browsing and analyze website usage statistics. We consider that if you continue browsing, you accept their use.