Claude 3.7 Sonnet: First Hybrid Reasoning Model and Launch of Intelligent Coding Tool Claude Code

AI News6mos agorelease AI Sharing Circle

1.5K 00

Just last night, news of Anthropic's upcoming release of a new model spread quickly through the AI community, but not in the way that was previously expected. Claude 4.0, but rather Claude 3.7 Sonnet version.

Claude 3.7 Sonnet：首创混合推理模式并推出智能编码工具 Claude Code

Early this morning, Anthropic released its latest flagship model right on time, theThe official launch of Claude 3.7 Sonnet, claimed to be the smartest to date and the first hybrid inference model on the market!The

Claude 3.7 Sonnet delivers both fast responses in near real-time and deeper, more detailed step-by-step thinking based on user needs. As Anthropic The description "One model, two ways to think..." refers to the fact that it has both standard and extended modes of thinking. In addition, API users have more fine-grained control over the length of time a model can think.

In addition to the release of Claude 3.7 Sonnet.Anthropic has also launched a parallel command line tool called Claude Code that focuses on smart codingClaude is available as a limited research preview. The tool is currently available as a limited research preview and is designed to allow developers to leave a large number of engineering tasks to Claude directly in the terminal environment.

In terms of coding capabilities, Anthropic has further optimized the coding experience on the Claude.ai platform. Its GitHub integration is now available across all Claude programs, allowing developers to connect their code repositories directly to Claude, and by providing a deeper understanding of personal, work, and open source projects, Claude will become an even more powerful assistant for developers when it comes to bug fixing, feature development, and documentation building in GitHub projects.

Because of this, and benefiting from significant improvements in coding and front-end web development capabilities.Claude 3.7 Sonnet became Anthropic's best encoding model to date.The

Currently, users can experience the latest Claude 3.7 Sonnet model through all Claude plans (including Free, Pro, Team, and Enterprise), as well as platforms such as Anthropic API, Amazon Bedrock, and Google Cloud Vertex AI. In addition to Free users, all paid users can experience its Extended Thinking model.

In the standard and extended thinking modes, thePricing for Claude 3.7 Sonnet remains consistent with the previous generation of Claude 3.5 Sonnet at $3 per million input tokens and $15 per million output tokens (including think tokens).The

As one user commented, "Every new release from Anthropic is surprising and exciting!"

Maximum Claude 3.7 Sonnet

Putting cutting-edge reasoning at your fingertips

Anthropic emphasizes that Claude 3.7 Sonnet was developed with a different philosophy than other reasoning models on the market, arguing that just as the human brain is able to react quickly and think deeply at the same time, AI reasoning should integrate the capabilities of cutting-edge models, rather than separating them from each other. This unified design approach aims to provide a smoother user experience.

In line with this philosophy, the Claude 3.7 Sonnet offers a number of unique advantages.

First.Claude 3.7 Sonnet is unique in that it can be used as a general-purpose LLM but also has powerful reasoning capabilities. Depending on your needs, you can choose to have the model give you a quick answer, or to think more deeply before answering.The Claude 3.7 Sonnet can be seen as an upgrade from the previous Claude 3.5 Sonnet. In standard mode, Claude 3.7 Sonnet can be seen as an upgraded version of its predecessor, Claude 3.5 Sonnet. In Extended Thinking Mode, it reflects on itself before giving an answer, which significantly improves its performance on a wide range of tasks, including math, physics, instruction following, coding, etc. Anthropic officials note that in both modes, the model understands and processes the cue words in a similar way.

Secondly.When calling Claude 3.7 Sonnet using the API, users can also customize the model's "thinking budget". Specifically, the user can set Claude to think in terms of the maximum number of token Number (N). Regardless of the N value, the model caps the number of output tokens at 128K. This allows the user to find the optimal balance between speed (and cost) of response and quality of answer.

Third, in developing its inference model, theInstead of focusing excessively on optimizing model performance on math and computer science competition questions, as others have done, Anthropic focuses on real-world tasks that are more relevant to practical application scenarios in the enterpriseThe

From the Claude 3.7 Sonnet benchmark results, in the SWE-bench Verified benchmark (which was designed to evaluate LLM's ability to solve real software problems on GitHub), theClaude 3.7 Sonnet achieved SOTA-level performance, significantly ahead of models such as Claude 3.5 Sonnet, OpenAI's o3-mini (high) and o1, and DeepSeek R1.The

The Claude 3.7 Sonnet also performed well in the TAU-bench benchmark, a benchmarking platform used to evaluate LLM's ability to interact with the tool in complex, realistic scenarios, achieving SOTA-level performance, outperforming both the Claude 3.5 Sonnet and OpenAI's o1 model.

Claude 3.7 Sonnet demonstrates excellent performance in a number of areas, including instruction adherence, generalized reasoning, multimodal capabilities, and intelligent coding, with significant enhancements in math and science, especially in Extended Thinking Mode. However, in some specific areas, it still falls slightly short of OpenAI's o3-mini (high), Grok-3 Beta, and other models.

It's easy to see that Anthropic has focused on coding capabilities with Claude 3.7 Sonnet, with relatively less prominent improvements in other areas. It is clear that Anthropic intends to position the Sonnet series as an AI model focused on coding (and is actually moving in that direction).

It's worth noting that in addition to excelling in traditional benchmarks, the Claude 3.7 Sonnet even outperformed all previous models in the Pokémon playtest.

Anthropic has already conducted extensive early testing with its partners, and the results have amply demonstrated the leadership of the Claude family of models in terms of encoding capability.

For example, the Cursor team noted that Claude was once again the preferred solution for real-world coding tasks, showing significant improvements in handling complex code bases and using advanced tools, and the Cognition team found that Claude outperformed the other models in code change planning and full-stack update processing. Vercel emphasized Claude's accuracy in complex agent workflows, and Replit successfully used Claude to build complex web applications and dashboards from scratch where other models struggled, while Canva's evaluation showed that Claude consistently produced well-designed, production-ready code with significantly fewer bugs. Significantly reduced error rates.

Claude Code

Intelligent Coding for Easier Development

Since June 2024, the Sonnet family of models has been the go-to choice for developers around the world. Today, theAnthropic has officially released Claude Code, its first intelligent coding tool (currently in a limited research preview), designed to further enhance developer productivity and capabilityThe

Functionally, Claude Code is positioned as a proactive collaboration partner, capable of performing tasks such as code searching and reading, file editing, test writing and running, code committing and pushing to GitHub, and invoking various command line tools.

Let's go through a few examples Claude Code application scenarios, such as explaining the project structure:

Writing tests:

Build the application:

Although still in early preview, Claude Code has become an indispensable tool for the Anthropic team, especially for test-driven development, debugging complex problems, and large-scale code refactoring.

In early testing, Claude Code has been able to accomplish tasks in a single pass that would normally take more than 45 minutes to complete manually, significantly reducing development time and costs.The

In the coming weeks, Anthropic plans to continue optimizing Claude Code based on feedback from its own usage, including improving the reliability of tool calls, enhancing support for long-running commands, improving in-app rendering, and expanding the depth of Claude's understanding of its own functionality.

The launch of Claude Code is designed to provide a deeper understanding of how developers work with Claude for coding, thus providing a valuable reference for future iterations and upgrades of Anthropic's models. Those who participate in the Claude Code preview experience will have early access to the powerful tools Anthropic uses internally to build and optimize Claude models.

Responsible construction and future perspectives

Anthropic thoroughly tested and evaluated Claude 3.7 Sonnet and worked with external security experts to ensure that the model fully meets the security and reliability standards it sets for itself.

At the same time, Claude 3.7 Sonnet demonstrates finer judgment in distinguishing between harmful and benign requests. Compared to the previous generation model, it has reduced the number of unnecessary rejections by 45%.

CoT fidelity assessment results.

In the Model Card for Claude 3.7 Sonnet, Anthropic details its framework for evaluating responsible AI scaling policies and draws on the hands-on experience of other AI labs and researchers in related work. Additionally, the model card outlines the new types of risks posed by the application of AI technologies, specifically rapid injection attacks, and explains how Anthropic assesses and responds to these potential security vulnerabilities, as well as how it trains the Claude model to defend against and mitigate these risks. In addition to this, the Model Card delves into the potential security benefits that inference models can bring, and examines questions such as "how to understand the model's decision-making process" and "whether the model's inference results are truly trustworthy and reliable".

Anthropic believes that the release of Claude 3.7 Sonnet and Claude Code marks a critical step towards truly empowering humans with AI systems. With superior deep reasoning, autonomous work, and efficient collaboration, Anthropic is bringing us closer to a vision of a future in which AI technology fully enriches and expands human potential.

Anthropic also has an exciting vision for the future: by 2025, they expect Claude to have evolved into an expert intelligence that can work autonomously for hours on end, and by 2027, Anthropic expects Claude to be able to tackle complex problems that would take years for a human team to solve.

AI News

The article is copyrighted and should not be reproduced without permission.

Sam Altman: OpenAI Confirms Release of AI Agents to Revolutionize Enterprise Efficiency

AI News

7mos ago

01.5K

A Deep Dive into the Next Generation of AI Programming Tools and the Innovative Practices of AutoDev Sketch

AI News

5mos ago

01K

Project-level code generation results are in! o3/Claude 3.7 leads the way, R1 is in the top tier!

AI News

5mos ago

0936

Devin uses autonomous AI programming assistant to fix problematic code for open source community

AI News

7mos ago

01.3K

No comments

You must be logged in to leave a comment!

No comments...

Claude 3.7 Sonnet: First Hybrid Reasoning Model and Launch of Intelligent Coding Tool Claude Code

Maximum Claude 3.7 Sonnet

Putting cutting-edge reasoning at your fingertips

Claude Code

Intelligent Coding for Easier Development

Responsible construction and future perspectives

Monica (Monica) opens a domestic domain name and compares it to the overseas paid model, the domestic version is free to use!

Claude 3.7 Sonnet and Claude Code: cutting-edge reasoning meets Agentic coding

Related posts

Sam Altman: OpenAI Confirms Release of AI Agents to Revolutionize Enterprise Efficiency

A Deep Dive into the Next Generation of AI Programming Tools and the Innovative Practices of AutoDev Sketch

Project-level code generation results are in! o3/Claude 3.7 leads the way, R1 is in the top tier!

Devin uses autonomous AI programming assistant to fix problematic code for open source community

No comments

Latest Collections

Latest Articles

Claude 3.7 Sonnet: First Hybrid Reasoning Model and Launch of Intelligent Coding Tool Claude Code

Maximum Claude 3.7 Sonnet

Putting cutting-edge reasoning at your fingertips

Claude Code

Intelligent Coding for Easier Development

Responsible construction and future perspectives

Monica (Monica) opens a domestic domain name and compares it to the overseas paid model, the domestic version is free to use!

Claude 3.7 Sonnet and Claude Code: cutting-edge reasoning meets Agentic coding

Related posts

Sam Altman: OpenAI Confirms Release of AI Agents to Revolutionize Enterprise Efficiency

A Deep Dive into the Next Generation of AI Programming Tools and the Innovative Practices of AutoDev Sketch

Project-level code generation results are in! o3/Claude 3.7 leads the way, R1 is in the top tier!

Devin uses autonomous AI programming assistant to fix problematic code for open source community

No comments

Selected AI Tools

Latest Collections

Latest Articles