What’s new with Google Cloud networking
The AI era is here, fundamentally reshaping industries and demanding unprecedented network capabilities for training, inference, and serving of AI models. To power this transformation, organizations need global networking solutions that deliver massive capacity, seamless connectivity, and robust security.
At Next 25, we’re addressing these critical needs and enabling customers to build and deliver distributed AI applications with ease through a suite of innovations in our cloud networking products and Cross-Cloud Network solutions.
These innovations include AI-optimized networking, simplified and secure service networking, and zero-trust security against zero-day threats. We are also expanding our Cross-Cloud Network solutions to include programmability and performance for the Global Front End for web, media, and generative AI services, plus our newest solution, Cloud WAN, which provides a fully managed global network for secure, simplified connectivity across enterprise locations, powered by our extensive global infrastructure.
AI-optimized networking: performant, secure, scalable
To make your AI models work their best, you need a network that can handle massive amounts of data and intense processing. Whether you’re training huge models or serving them to users (“inference”), speed, reliability, and security are essential. You’re dealing with complex infrastructure, and moving tons of data to provide lightning-fast responses. Our innovations are focused on enabling the infrastructure you need for these demanding AI workloads:
- Massive data ingestion with 400G Cloud Interconnect and Cross-Cloud Interconnect: Onboard your AI datasets faster and train cross-cloud with 4x the bandwidth of our 100G Cloud Interconnect and Cross-Cloud Interconnect, providing connectivity from on-premises or other cloud environments to Google Cloud. Available later this year.
- Unprecedented cluster scale: Build massive AI services with networking support for up to 30,000 GPUs per cluster in a non-blocking configuration, available in preview now.
- Zero-trust RDMA security: Help secure your high-performance GPU and TPU traffic with our RDMA firewall, featuring dynamic enforcement policies for zero-trust networking. Available later this year.
- Accelerated GPU-to-GPU communication: Unleash up to 3.2 Tbps of non-blocking GPU-to-GPU bandwidth with our high-throughput, low-latency RDMA networking, now generally available.
“Google Cloud plays a key role in our AI infrastructure, by supporting us to deliver high performance and secure AI experiences for our users at scale, while optimizing utilization of our resources.” – Xu Ning, Dir of Engineering, AI Platform, Snap, Inc.
The increasing complexity of AI inference, particularly when enterprises deploy multiple task-optimized models, presents significant networking challenges. The growing demand for AI capacity strains network infrastructure because efficiently routing data to GPU or TPU resources that are often distributed across regions requires high bandwidth and low latency. Furthermore, the introduction of gen AI applications and agents expands the attack surface, creating vulnerabilities for sensitive data leaks during inference, necessitating robust AI safety and security measures. To address these challenges, we are introducing GKE Inference Gateway, now in preview, which offers:
- Differentiated performance for gen AI applications without exorbitant serving costs. New capabilities in GKE Inference Gateway reduce serving costs by up to 30% and tail latency by up to 60%, and increase throughput by up to 40% compared to other managed and open-source Kubernetes offerings, based on internal benchmarks. Features in GKE Inference Gateway include intelligent load balancing based on model server metrics from Google Jetstream, NVIDIA, and vLLM; dynamic request routing; and efficient, dynamic serving of LoRA fine-tuned models.
- AI security and safety with powerful new integrations. Now, you can leverage GKE Inference Gateway and Cloud Load Balancing alongside Model Armor, NVIDIA NeMo Guardrails, and Palo Alto Networks AI Runtime Security. This combined approach uses Service Extensions to provide comprehensive protection for your AI models, simplifying governance for platform engineering and security teams.
- Google Cloud Load Balancing optimizations for LLM inference that let you leverage NVIDIA GPU capacity across multiple cloud providers or on-premises infrastructure.
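To make the routing model above concrete, here is a minimal sketch of how a platform team might register a model with an inference gateway from Go, using client-go's dynamic client. GKE Inference Gateway is in preview and this post does not document its resource schema, so the group, version, kind, and field names below (InferenceModel, poolRef, the vLLM pool name) are illustrative assumptions modeled on the open-source Gateway API inference extension, not the definitive product API.

```go
package main

import (
	"context"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig; the GKE cluster is assumed
	// to already have the inference gateway resources (CRDs) installed.
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	dyn, err := dynamic.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	// Hypothetical group/version/resource; the names used by the preview
	// product may differ.
	gvr := schema.GroupVersionResource{
		Group:    "inference.networking.x-k8s.io",
		Version:  "v1alpha1",
		Resource: "inferencemodels",
	}

	// Declare a model (for example, a LoRA fine-tune served by a vLLM pool)
	// so the gateway can route requests for it and load-balance on
	// model-server metrics.
	model := &unstructured.Unstructured{Object: map[string]interface{}{
		"apiVersion": "inference.networking.x-k8s.io/v1alpha1",
		"kind":       "InferenceModel",
		"metadata":   map[string]interface{}{"name": "chat-lora-v1"},
		"spec": map[string]interface{}{
			"modelName": "chat-lora-v1", // name clients reference in requests
			"poolRef":   map[string]interface{}{"name": "vllm-pool"},
		},
	}}

	if _, err := dyn.Resource(gvr).Namespace("default").
		Create(context.Background(), model, metav1.CreateOptions{}); err != nil {
		log.Fatal(err)
	}
	log.Println("registered inference model with the gateway")
}
```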
“Enterprises across industries are seeking full-stack, integrated infrastructure to deploy agentic AI securely and cost-effectively. By integrating NVIDIA inference software for real-time observability and NeMo Guardrails for robust security enforcement with GKE Inference Gateway, NVIDIA and Google Cloud are delivering advanced capabilities that enhance performance and reliability in AI deployments.” – Kari Briski, vice president of generative AI software for Enterprise, NVIDIA
Programmable global front end for web, media, and AI
The Cross-Cloud Network Global Front End solution accelerates and secures the most demanding web, media, and now gen AI applications, regardless of where your backends are hosted, without having to expose your infrastructure to the internet. Today, we’re introducing new innovations for modern and gen AI apps:
- Edge programmability with Service Extensions: Unlock the power of the edge with open programmability through Service Extensions plugins, powered by WebAssembly (Wasm). Automate, extend, and customize your applications with over 60 plugin examples in Rust, C++, and Go (a minimal plugin sketch follows this list). Cloud Load Balancing support is now generally available, and Cloud CDN support will follow later this year.
- Accelerated web performance: Deliver static and dynamic content at global scale with Cloud CDN's fast cache invalidation, and boost application performance for resumed connections with TLS 1.3 0-RTT. Both capabilities are now available in preview.
- End-to-end mTLS security: Strengthen your security posture with end-to-end mTLS, protecting your data from client to backend infrastructure via Cloud Load Balancing. Client-to-frontend mTLS launched last year, and mTLS to backends is now in preview.
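To give a feel for what an edge plugin looks like, here is a minimal sketch of a request-header plugin written against the open-source proxy-wasm Go SDK, the programming model Service Extensions plugins build on. The SDK module path and the header name are assumptions chosen for illustration; the official plugin samples mentioned above are the authoritative starting point.

```go
package main

import (
	"github.com/tetratelabs/proxy-wasm-go-sdk/proxywasm"
	"github.com/tetratelabs/proxy-wasm-go-sdk/proxywasm/types"
)

// Compiled to Wasm (for example with TinyGo), this plugin runs in the
// load balancer's request path and tags every request with a header.
func main() {
	proxywasm.SetVMContext(&vmContext{})
}

type vmContext struct{ types.DefaultVMContext }

func (*vmContext) NewPluginContext(uint32) types.PluginContext {
	return &pluginContext{}
}

type pluginContext struct{ types.DefaultPluginContext }

func (*pluginContext) NewHttpContext(uint32) types.HttpContext {
	return &httpContext{}
}

type httpContext struct{ types.DefaultHttpContext }

// OnHttpRequestHeaders is invoked once per request before it is forwarded.
func (*httpContext) OnHttpRequestHeaders(numHeaders int, endOfStream bool) types.Action {
	// Hypothetical header; real plugins might rewrite paths, add auth
	// tokens, or make routing decisions here instead.
	if err := proxywasm.AddHttpRequestHeader("x-edge-plugin", "enabled"); err != nil {
		proxywasm.LogErrorf("failed to add header: %v", err)
	}
	return types.ActionContinue
}
```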
“Service Extensions plugins give us the ability to customize our web services by easily running custom code directly in the request/response path. Having an edge programmability solution based on open standards like WebAssembly with a large set of out-of-the-box examples allows our developers to quickly address custom requirements from the business.” – Justin Reid, Principal Engineer, Shopify
Service-centric networking simplifies development
Whether you’re building cutting-edge gen AI applications or modernizing an existing system, service-centric architectures are essential for rapid iteration. As a pioneer of service-centric architectures, we are on a journey to help NetOps, DevOps, SecOps, and developer teams simplify service deployment and management. By abstracting away the complexities of the underlying networking and security layers, we empower developers to quickly deploy, update, and secure services across multiple applications. Today, we’re unveiling new innovations in automation, security, and scale with enhanced service-centric networking:
- Streamlined service discovery and management. App Hub integration simplifies producer-consumer interactions by automating service discovery and cataloging. Service health (coming later this year) enables resilient global services with network-driven cross-regional failover.
- Simplified multi-network, multi-service, multi-compute deployment. Later in 2025, you'll be able to use Private Service Connect to publish multiple services within a single GKE cluster, making them natively accessible from non-peered GKE clusters, Cloud Run, or Service Mesh.
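Until the multi-service publishing capability lands, the basic producer-side flow today is to publish a service behind an internal load balancer as a Private Service Connect service attachment. The sketch below shows that flow with the Go Compute client under stated assumptions: the project, region, subnet, and forwarding-rule names are placeholders, and field names should be checked against the current client library.

```go
package main

import (
	"context"
	"log"

	compute "cloud.google.com/go/compute/apiv1"
	"cloud.google.com/go/compute/apiv1/computepb"
	"google.golang.org/protobuf/proto"
)

func main() {
	ctx := context.Background()

	client, err := compute.NewServiceAttachmentsRESTClient(ctx)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Publish an existing internal load balancer (its forwarding rule) as a
	// Private Service Connect service attachment that consumers in other
	// VPCs can reach privately. All names below are placeholders.
	req := &computepb.InsertServiceAttachmentRequest{
		Project: "my-project",
		Region:  "us-central1",
		ServiceAttachmentResource: &computepb.ServiceAttachment{
			Name:                 proto.String("inference-frontend"),
			TargetService:        proto.String("regions/us-central1/forwardingRules/ilb-frontend"),
			ConnectionPreference: proto.String("ACCEPT_AUTOMATIC"),
			NatSubnets:           []string{"regions/us-central1/subnetworks/psc-nat-subnet"},
		},
	}

	op, err := client.Insert(ctx, req)
	if err != nil {
		log.Fatal(err)
	}
	if err := op.Wait(ctx); err != nil {
		log.Fatal(err)
	}
	log.Println("service published via Private Service Connect")
}
```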
“Our collaboration with Google has enabled us to streamline service discovery and helped to empower our developers to iterate faster and more efficiently.” – Jonathan Perry, Partner, Engineering, Goldman Sachs
Protect modern and gen AI apps from evolving attacks
We’re witnessing a surge in sophisticated attacks: terabit-scale DDoS, DNS tunneling for data exfiltration, and the growing prevalence of AI-driven threats that evade traditional defenses. These cyber risks demand a fundamental shift in your approach to network security and underscore the need for advanced network security capabilities that extend beyond traditional perimeter defenses. Today, we’re announcing powerful network security enhancements that provide comprehensive protection for your distributed multi-cloud applications and internet-facing services.
Our strategy has three core pillars:
Secure the workload: Planet-scale DDoS protection with up to 24x better threat efficacy
Protecting your distributed applications and internet-facing services against critical network attack vectors is paramount. Today, we’re introducing several key enhancements:
- DNS Armor: DNS traffic often lacks adequate monitoring, making it a prime target for data exfiltration. Attackers exploit this blind spot, using DNS tunneling, domain generation algorithms (DGAs), and other sophisticated techniques to bypass traditional security controls. Powered by Infoblox Threat Defense, with visibility into 70 billion DNS events daily, DNS Armor detects these DNS-based data exfiltration attacks. Available in preview later this year.
- Enhanced security posture enforcement: Strengthen your security posture with consistent org-wide protection using new hierarchical policies for Cloud Armor. Enforce granular protection independent of your network architecture with new network types and new firewall tags for Cloud NGFW hierarchical firewall policies, coming this quarter in preview.
- Layer 7 domain filtering for Cloud NGFW: In 2024, we launched Cloud NGFW Enterprise with up to 24x higher threat efficacy than other major public clouds. We are continuing to improve Cloud NGFW with new layer 7 domain filtering, which will let firewall administrators monitor and restrict outbound web traffic to approved destinations, coming later in 2025.
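The hierarchical policies and layer 7 domain filtering above are still rolling out, but they extend the same programmatic model used for VPC firewall rules today. As a hedged illustration of that model, the sketch below creates an ordinary egress deny rule with the Go Compute client; the project, network, and IP ranges are placeholders, not recommended values.

```go
package main

import (
	"context"
	"log"

	compute "cloud.google.com/go/compute/apiv1"
	"cloud.google.com/go/compute/apiv1/computepb"
	"google.golang.org/protobuf/proto"
)

func main() {
	ctx := context.Background()

	client, err := compute.NewFirewallsRESTClient(ctx)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Deny egress to an untrusted range on the default network.
	// All identifiers here are placeholders for illustration.
	req := &computepb.InsertFirewallRequest{
		Project: "my-project",
		FirewallResource: &computepb.Firewall{
			Name:      proto.String("deny-egress-untrusted"),
			Network:   proto.String("global/networks/default"),
			Direction: proto.String("EGRESS"),
			Priority:  proto.Int32(1000),
			Denied: []*computepb.Denied{{
				IPProtocol: proto.String("tcp"),
				Ports:      []string{"443"},
			}},
			DestinationRanges: []string{"203.0.113.0/24"},
		},
	}

	op, err := client.Insert(ctx, req)
	if err != nil {
		log.Fatal(err)
	}
	if err := op.Wait(ctx); err != nil {
		log.Fatal(err)
	}
	log.Println("firewall rule created")
}
```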
“We use Cloud NGFW and Cloud Armor to protect our critical applications and websites in Google Cloud. The new Network Security innovations announced at Next will help us improve protection for our users and simplify how we manage network security.” – Jason Jones, Sr. Director, Security Engineering, UKG
Secure the data: Introducing inline network DLP
In today's data-driven world, your enterprise's intellectual property is its most valuable asset, but ensuring its security and compliance can be complex. We understand the need for robust yet streamlined data loss prevention (DLP) across both data at rest and data in transit. Our upcoming inline network DLP for Secure Web Proxy and Application Load Balancer provides real-time protection for sensitive data in transit through integration with third-party solutions such as Symantec DLP, using Service Extensions. In preview this quarter, inline network DLP helps you safeguard critical data and maintain compliance without sacrificing performance or agility.
Open security ecosystem: Third-party security insertion
We give you the flexibility to choose your preferred security solutions, tailoring protection to your specific needs, and we are excited to expand our security partner ecosystem with deeper integrations. Recently, we announced that you can insert partner network services or virtual appliances alongside Google Cloud workloads via Network Security Integration. Now generally available, this capability helps you maintain consistent policies across hybrid and multi-cloud environments without changing your routing policies or network architecture.
Plus, to broaden our web and API protection ecosystem, we’ve worked with Imperva to integrate Imperva Application Security with Cloud Load Balancing, also via Service Extensions, and it’s now available in the Google Cloud Marketplace.
Cloud WAN: the enterprise backbone for the AI era
Connecting a modern business is incredibly complex. Customers have to deal with many different networks and security architectures, and must make difficult choices to balance reliability, application speeds, and cost. This can lead to complex, custom solutions that are hard to manage, weaken security postures, and often don’t deliver the best results. Cloud WAN, our newest Cross-Cloud Network solution, is a fully managed, reliable and secure enterprise backbone to transform enterprise WAN architectures and address these challenges.
Cloud WAN delivers significant advantages:
- Up to 40% savings in total cost of ownership (TCO) compared with a customer-managed WAN solution built on colocation facilities1
- Global reach and performance via Google's expansive backbone network, with 99.99% reliability
- Up to 40% better performance for Cross-Cloud Network traffic compared to the public internet2
- An open, flexible, and tightly integrated ecosystem with major SD-WAN and security vendors
For more details, please read the full Cloud WAN announcement.
A network to deliver the AI era
Our cloud networking products and solutions give you the ability to connect, simplify, modernize, and secure your organization across the planet. With these new innovations, plus the new Cloud WAN, we continue to provide you with the flexibility to adapt to new technologies, services, applications, and locations, all with the agility required for the AI era.
For more on our Google Cloud Next 2025 announcements, you can watch our Cross-Cloud Network innovations session, and check out the many great networking breakout sessions.
1. Architecture includes SD-WAN and third-party firewalls, and compares a customer-managed WAN using multi-site colocation facilities to a WAN managed and hosted by Google Cloud.
2. During testing, network latency was more than 40% lower when traffic to a target traveled over the Cross-Cloud Network compared to when traffic to the same target traveled across the public internet.