2025 02 24

GCP – Announcing Claude 3.7 Sonnet, Anthropic’s first hybrid reasoning model, is available on Vertex AI

Today, we’re announcing Claude 3.7 Sonnet, Anthropic’s most intelligent model to date and the first hybrid reasoning model on the market, is available in preview on Vertex AI Model Garden. Claude 3.7 Sonnet can produce quick responses or extended, step-by-step thinking that is made visible to the user. Claude 3.7 Sonnet includes improvements in coding, and is optimized for real-world, practical use cases to reflect customers’ needs.

“Claude 3.7 Sonnet represents an exciting breakthrough as the first hybrid reasoning model, combining rapid responses and reasoning in a single model,” said Kate Jensen, Head of Revenue at Anthropic. “By making Claude 3.7 Sonnet available through Vertex AI, Google Cloud customers can now apply this transformative technology across their organizations. Whether developing complex software solutions, delivering customer experiences, or conducting strategic analysis, Claude on Vertex AI helps teams to tackle their most challenging business problems with enterprise-grade reliability.”

We’re also announcing Vertex AI support for Anthropic’s new agentic coding tool, Claude Code. Claude Code lets developers delegate coding tasks to Claude directly from their terminal and is available through Anthropic’s limited research preview. For more information on Claude 3.7 Sonnet and Claude Code, including how to access Claude Code, check out Anthropic’s blog here.

Build on a unified AI platform with Vertex AI

To explore the full potential of foundational models like Claude, you’ll need advanced development tools and enterprise-grade reliability to use them in your applications. That’s what you get with Vertex AI, which is built on Google’s AI-optimized infrastructure, stringent security, and learnings from serving 300+ real-world use cases.

Vertex AI empowers you to take your Claude-powered applications from concept to production on a unified platform. With Vertex AI’s Model-as-a-Service (MaaS) offering, you benefit from simplified procurement, fully managed infrastructure, enterprise-grade security, and advanced developer tools.

Confidently deploy agents in production: Power production-grade AI agents with Claude 3.7 Sonnet, using Vertex AI’s full suite of agentic tools and services, including RAG Engine and Agent Engine (coming soon).
Optimize performance with fully managed infrastructure: Simplify how you deploy and scale Claude 3.7 Sonnet with Vertex AI’s fully managed infrastructure that’s tailored for AI workloads.
Accelerate development with powerful MLOps tools: Explore and evaluate Claude 3.7 Sonnet with fully integrated platform tools like Vertex AI Evaluation for model testing and evaluation and the LangChain integration for custom application building.
Build with enterprise-grade security, compliance, and data governance: Leverage Google Cloud’s robust built-in security, privacy, and compliance measures to securely scale your applications. Enterprise controls, such as Vertex AI Model Garden’s organization policy, provide the right access controls to make sure only approved models can be accessed.

Additional features to make the most of Claude on Vertex AI

To enhance your interaction and deployment of Claude models on Vertex AI, including Claude 3.7 Sonnet, we also offer advanced features designed to reduce latency and costs, increase throughput, and optimize Claude model utilization:

Count tokens (generally available): Make more informed decisions about your prompts and usage by determining the number of tokens in a message before sending it to Claude. Learn more on how to use count tokens with Claude models and which models are supported here.
Citations (generally available): Verify sources with detailed references to the exact sentences and passages it uses to generate responses, leading to more verifiable, trustworthy outputs. Claude 3.7 Sonnet, upgraded Claude 3.5 Sonnet, and Claude 3.5 Haiku support Citations.
Batch predictions (preview): Process large volumes of requests asynchronously for cost savings. Popular applications include analyzing large datasets—such as customer databases—for risk assessment or fraud detection, and applications that require periodic updates—such as generating daily reports. Each batch job is processed in less than 24 hours and costs 50% less than standard Anthropic API calls. Learn more on how to use batch predictions with Claude models and which models are supported here.
Prompt caching (preview): Provide Claude with more background knowledge and example outputs to improve response accuracy—all while reducing costs. You can cache all or specific parts of your frequently used inputs, so that subsequent queries can use the cached results. Learn more on how to use prompt caching with Claude models and which models are supported here.

We’re also excited to share that Claude 3.5 Haiku, which is already available on Vertex AI Model Garden, now supports multi-modal image input. Claude 3.5 Haiku is Anthropic’s fastest and most cost-effective model.

Customers are driving business results with Anthropic on Google Cloud

AES, a global energy company, uses Claude on Vertex AI to significantly increase the accuracy and speed of the company’s health and safety audits:

“Our auditors previously spent 14 days completing each audit process. Now, with our Claude-powered agents on Vertex AI, the same work is completed in just one hour. I love the accuracy of Anthropic’s Claude models and the security and advanced AI tools that Google Cloud provides to utilize these models for our auditing process.” — Sean Otto, Senior Director of Data Science & Analytics at AES

Palo Alto Networks, a global cybersecurity company, is accelerating software development and security by deploying Anthropic’s Claude models on Vertex AI:

“With Claude running on Vertex AI, we saw a 20% to 30% increase in feature development and code implementation. Running Claude on Google Cloud’s Vertex AI not only accelerates development projects, it enables us to hardwire security into code before it ships.” — Gunjan Patel, Director of Engineering, Office of the CPO at Palo Alto Networks

Quora, the global knowledge-sharing platform, is harnessing Claude’s capabilities on Vertex AI to facilitate millions of daily interactions through Quora’s own AI-powered chat platform, Poe:

“We consistently hear from our users about how much they enjoy the intelligence, adaptability, and natural conversational abilities of Anthropic’s Claude models. They’re relying on these qualities for a wide variety of tasks, from the complex to the creative. By leveraging Claude with Vertex AI’s secure and scalable platform, we’re able to facilitate millions of daily interactions, ensuring both speed and reliability.” — Spencer Chan, Product Lead at Poe by Quora

Replit, a platform for software development and deployment, leverages Claude on Vertex AI to power Replit Agent, which empowers people across the world to use natural language prompts to turn their ideas into applications, regardless of coding experience.

“Our AI agent is made more powerful through Anthropic’s Claude models running on Vertex AI. This integration allows us to easily connect with other Google Cloud services, like Cloud Run, to work together behind the scenes to help customers turn their ideas into apps.” — Amjad Masad, Founder and CEO of Replit

Get started

Select the Claude 3.7 Sonnet model card in Vertex AI Model Garden. You can also find and easily procure Claude 3.7 Sonnet on Google Cloud Marketplace and take advantage of the ability to draw down on your Google Cloud spend commitments.
Select “Enable” and follow the proceeding instructions.
Explore our sample notebook and documentation to start building.

GCP – Announcing Claude 3.7 Sonnet, Anthropic’s first hybrid reasoning model, is available on Vertex AI

Build on a unified AI platform with Vertex AI

Additional features to make the most of Claude on Vertex AI

Customers are driving business results with Anthropic on Google Cloud

Get started

Related Posts

AWS – Amazon VPC Route Server now available in new regions

GCP – Palo Alto Networks automates customer intelligence document creation with agentic design

GCP – Vibe querying: Write SQL queries faster with Comments to SQL in BigQuery