The complexity of the infrastructure behind AI training and high performance computing (HPC) workloads can really slow teams down. At Google Cloud, where we work with some of the world’s largest AI research teams, we see it everywhere we go: researchers hampered by complex configuration files, platform teams struggling to manage GPUs with home-grown scripts, and operational leads battling the constant, unpredictable hardware failures that derail multi-week training runs. Access to raw compute isn’t enough. To operate at the cutting edge, you need reliability that survives hardware failures, orchestration that respects topology, and a lifecycle management strategy that adapts to evolving needs.
Today, we are delivering on those requirements with the General Availability (GA) of Cluster Director and the Preview of Cluster Director support for Slurm on Google Kubernetes Engine (GKE).
Cluster Director (GA) is a managed infrastructure service designed to meet the rigorous demands of modern supercomputing. It replaces fragile DIY tooling with a robust topology-aware control plane that handles the entire lifecycle of Slurm clusters, from the first deployment to the thousandth training run.
We are expanding Cluster Director to support Slurm on GKE (Preview), designed to give you the best of both worlds: the familiar precision of high-performance scheduling and the automated scale of Kubernetes. It achieves this by treating GKE node pools as a direct compute resource for your Slurm cluster, allowing you to scale your workloads with Kubernetes’ power without changing your existing Slurm workflows.
Cluster Director, now GA
Cluster Director offers advanced capabilities at each phase of the cluster lifecycle, spanning preparation (Day 0), where infrastructure design and capacity are determined; deployment (Day 1), where the cluster is automatically deployed and configured; and monitoring (Day 2), where performance, health, and optimization are continuously tracked.
This holistic approach ensures that you get the benefits of fully configurable infrastructure while automating lower-level operations so your compute resources are always optimized, reliable, and available.
So, what does all this cost? That’s the best part. There’s no extra charge to use Cluster Director. You only pay for the underlying Google Cloud resources — your compute, storage, and networking.
How Cluster Director supports each phase of deployment
Day 0: Preparation
Standing up a cluster typically involves weeks of planning, wrangling Terraform, and debugging the network. Cluster Director changes the ‘Day 0’ experience entirely, with tools for designing infrastructure topology that’s optimized for your workload requirements.
To streamline your Day 0 setup, Cluster Director provides:
Reference architectures: We’ve codified Google’s internal best practices into reusable cluster templates, enabling you to spin up standardized, validated clusters in minutes. This helps ensure that every team in your organization is using the same security standards for their deployments and deploying on infrastructure that is configured correctly by default — right down to the network topology and storage mounting.
Guided configuration: We know that having too many options can lead to configuration paralysis. The Cluster Director control plane guides you through a streamlined setup flow. You select your resources, and our system handles the complex backend mapping, ensuring that storage tiers, network fabrics, and compute shapes are compatible and optimized before you deploy.
Broad hardware support: Cluster Director offers full support for large-scale AI systems, including Google Cloud’s A4X and A4X Max VMs powered by NVIDIA GB200 and GB300 GPUs, and versatile CPUs such as N2 VMs for cost-effective login nodes and debugging partitions.
Flexible consumption options: Cluster Director integrates with your preferred procurement strategy, with support for Reservations for guaranteed capacity during critical training runs, Dynamic Workload Scheduler Flex-start for dynamic scaling, or Spot VMs for opportunistic low-cost runs.
“Google Cloud’s Cluster Director is optimized for managing large-scale AI and HPC environments. It complements the power and performance of NVIDIA’s accelerated computing platform. Together, we’re providing customers with a simplified, powerful, and scalable solution to tackle the next generation of computing challenges.” – Dave Salvator, Director of Accelerated Computing Products, NVIDIA
Day 1: Deployment
Deploying hardware is one thing, but maximizing performance is another thing entirely. Day 1 is the execution phase, where your configuration transforms into a fully operational cluster. The good news is that Cluster Director doesn’t just provision VMs, it validates that your software and hardware components are healthy, properly networked, and ready to accept the first workload.
To ensure a high-performance deployment, Cluster Director automates:
Getting a clean “bill of health”: Before your job ever touches a GPU, Cluster Director runs a rigorous suite of health checks, including DCGMI diagnostics and NCCL performance validation, to verify the integrity of the network, storage, and accelerators (a manual sketch of these checks follows this list).
Keeping accelerators fed with data: Storage throughput is often the silent killer of training efficiency. That’s why Cluster Director fully supports Google Cloud Managed Lustre with selectable performance tiers, allowing you to attach high-throughput parallel storage directly to your compute nodes, so your GPUs are never starved for data.
Maximizing interconnect performance: To achieve peak scaling, Cluster Director implements topology-aware scheduling and compact placement policies. By utilizing dense reservations on Google’s non-blocking fabric, the system ensures that your distributed workloads are placed on the shortest physical path possible, minimizing tail latency and maximizing collective communication (NCCL) speeds from the get-go.
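For a sense of what this automation replaces, here is a rough manual equivalent. This is a sketch only: Cluster Director runs its own validation suite, and the policy, instance, and machine-type names below are placeholders.

```bash
# GPU health: run NVIDIA DCGM diagnostics at an extended level,
# as the automated checks do before a job ever touches an accelerator.
dcgmi diag -r 3

# Interconnect health: an all-reduce bandwidth sweep from nccl-tests,
# from 8 bytes up to 8 GB across 8 local GPUs.
./build/all_reduce_perf -b 8 -e 8G -f 2 -g 8

# Compact placement: a collocated placement policy keeps instances on
# the shortest physical network path.
gcloud compute resource-policies create group-placement my-placement \
    --collocation=collocated \
    --vm-count=4 \
    --region=us-central1

gcloud compute instances create my-gpu-node \
    --zone=us-central1-a \
    --machine-type=a3-highgpu-8g \
    --resource-policies=my-placement \
    --maintenance-policy=TERMINATE
```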
Day 2: Monitoring
The reality of AI and HPC infrastructure is that hardware fails and requirements change. A rigid cluster is an inefficient cluster. As you move into the ongoing “Day 2” operational phase, you need to maintain cluster health and maximize utilization and performance. Cluster Director provides a control plane equipped for the complexities of long-term operations. Today we are introducing new active cluster management capabilities to handle the messy reality of Day 2 operations.
New active cluster management capabilities include:
Topology-level visibility: You can’t orchestrate what you can’t see. Cluster Director’s observability graphs and topology grids let you visualize your entire fleet, spot thermal throttles or interconnect issues, and optimize job placement based on physical proximity.
One-click remediation: When a node degrades, you shouldn’t have to SSH in to debug it. Cluster Director allows you to replace faulty nodes with a single click directly from the Google Cloud console. The system handles the draining, teardown, and replacement, returning your cluster to full capacity in minutes.
Adaptive infrastructure: When your research needs change, so should your cluster. You can now modify active clusters on the fly, for example by adding or removing storage filesystems, without tearing down the cluster or interrupting ongoing work.
Cluster Director support for Slurm on GKE, now in preview
Innovation thrives in the open. Google, the creator of Kubernetes, and SchedMD, the developers behind Slurm, have long championed the open-source technologies that power the world’s most advanced computing. For years, NVIDIA and SchedMD have worked in lockstep to optimize GPU scheduling, introducing foundational features like the Generic Resource (GRES) framework and Multi-Instance GPU (MIG) support that are essential for modern AI. By acquiring SchedMD, NVIDIA is doubling down on its commitment to Slurm as a vendor-neutral standard, ensuring that the software powering the world’s fastest supercomputers remains open, performant, and perfectly tuned for the future of accelerated computing.
Building on this foundation of accelerated computing, Google is deepening its collaboration with SchedMD to answer a fundamental industry challenge: how to bridge the gap between cloud-native orchestration and high-performance scheduling. We are excited to announce the Preview of Cluster Director support for Slurm on GKE, utilizing SchedMD’s Slinky offering.
This initiative brings together the two standards of the infrastructure world. By running a native Slurm cluster directly on top of GKE, we are amplifying the strengths of both communities:
Researchers get the uncompromised Slurm interface and batch capabilities, such as sbatch and squeue, that have defined HPC for decades (see the job script sketch after this list).
Platform teams gain the operational velocity that GKE, with its auto-scaling, self-healing, and bin-packing, brings to the table.
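To make those unchanged workflows concrete, here is a minimal sketch of the kind of job script that keeps working as-is. The partition name, resource shape, and training script are hypothetical; the point is that nothing in it is GKE-specific.

```bash
#!/bin/bash
# Minimal Slurm batch script; partition and paths are illustrative.
#SBATCH --job-name=llm-pretrain
#SBATCH --partition=gpu
#SBATCH --nodes=4
#SBATCH --gpus-per-node=8
#SBATCH --time=72:00:00
#SBATCH --output=%x-%j.out

# srun fans the training command out across the allocated nodes.
srun python train.py --config configs/pretrain.yaml
```

Researchers still submit with sbatch and monitor with squeue exactly as before, while the Slinky-based integration sources the underlying compute from GKE node pools.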
Slurm on GKE is strengthened by our long-standing partnership with SchedMD, which helps create a unified, open, and powerful foundation for the next generation of AI and HPC workloads. Request preview access now.
Try Cluster Director today
Ready to start using Cluster Director for your AI and HPC cluster automation?
Learn more about the end-to-end capabilities in the documentation.
For most organizations, the question is no longer if they will use AI, but how to scale it from a promising prototype into a production-grade service that drives business outcomes. In this age of inference, competitive advantage is defined by your ability to serve useful information to users around the world at the lowest possible cost. As you move from demos to production deployments at scale, you need to simplify infrastructure operations with integrated systems that provide the latest AI software and accelerator hardware platforms, while keeping costs and architectural complexity low.
Yesterday, Forrester released The Forrester Wave™: AI Infrastructure Solutions, Q4 2025 report, evaluating 13 vendors, and we believe their findings validate our commitment to solving these core challenges. Google received the highest score of all vendors in the Current Offering category and received the highest possible score in 16 out of 19 evaluation criteria, including, but not limited to: Vision, Architecture, Training, Inferencing, Efficiency, and Security.
Accelerating time-to-value with an integrated system
Enterprises don’t run AI in a vacuum. They need to integrate it with a diverse range of applications and databases while adhering to stringent security protocols. Forrester recognized Google Cloud’s strategy of co-design by giving us the highest possible score in the Efficiency and Scalability criteria:
“Google pursues a strategy of silicon-infrastructure co-design. It develops TPUs to improve inference efficiency and NVIDIA GPUs for access to broader ecosystem compatibility. Google designs TPUs to integrate tightly with its networking fabric, giving customers high bandwidth and low latency for inference at scale.”
For over two decades, we have operated some of the world’s largest services, from Google Search and YouTube to Maps, whose unprecedented scale required us to solve problems that no one else had. We couldn’t simply buy the platform and infrastructure we needed; we had to invent it. This led to a decade-long journey of deep, system-level co-design, building everything from our custom network fabric and specialized accelerators to frontier models, all under one roof.
The result was an integrated supercomputing system, AI Hypercomputer, which has paid significant dividends for our customers. It supports a wide range of AI-optimized hardware, allowing you to optimize for granular, workload-level objectives — whether that’s higher throughput, lower latency, faster time-to-results, or lower TCO. That means you can use our custom Tensor Processing Units (TPUs), the latest NVIDIA GPUs, or both, backed by a system that tightly integrates accelerators with networking and storage for exceptional performance and efficiency. It’s also why today, leading generative AI companies such as Anthropic, Lightricks, and LG AI Research trust Google Cloud to power their most demanding AI workloads.
This system-level integration lays the foundation for speed, but operational complexity could still slow you down. To accelerate your time-to-market, we provide multiple ways to deploy and manage AI infrastructure, abstracting away the heavy lifting regardless of your preferred workflow. Google Kubernetes Engine (GKE) Autopilot automates management for containerized applications, helping customers like LiveX.AI reduce operational costs by 66%. Similarly, Cluster Director simplifies deployment for Slurm-based environments, enabling customers like LG AI Research to slash setup time from 10 days to under one day.
Managing AI cost and complexity
Forrester gave Google Cloud the highest score possible in the Pricing Flexibility and Transparency criterion. The price of compute is only one part of the AI infrastructure cost equation. A complete view should also account for development costs, downtime, and inefficient resource utilization. We offer optionality at every layer of the stack to provide the flexibility businesses demand.
Flexible consumption: Dynamic Workload Scheduler allows you to secure compute at up to 50% savings by ensuring you only pay for the capacity you need, when you need it.
Load balancing: GKE Inference Gateway improves throughput by using AI-aware routing to balance requests across models, preventing bottlenecks and ensuring servers aren’t sitting idle.
Eliminating data bottlenecks: Anywhere Cache co-locates data with compute, reducing read latency by up to 96% and eliminating the “integration tax” of moving data. By using Anywhere Cache together with our unified data platform BigQuery, you can avoid latency and egress fees while keeping your accelerators fed with data.
Mitigating strategic risk through flexibility and choice
We are also committed to enabling customer choice across accelerators, frameworks and multicloud environments. This isn’t new for us. Our deep experience with Kubernetes, which we developed then open-sourced, taught us that open ecosystems are the fastest path to innovation and provide our customers with the most flexibility. We are bringing that same ethos to the AI era by actively contributing to the tools you already use.
Open source frameworks and hardware portability: We continue to support open frameworks such as PyTorch, JAX, and Keras. We’ve also directly addressed concerns about workload portability on custom silicon by investing in TPU support for vLLM, allowing developers to easily switch between TPUs and GPUs (or use both) with only minimal configuration changes (a minimal sketch follows this list).
Hybrid and multicloud flexibility: Our commitment to choice extends to where you run your applications. Google Distributed Cloud brings our services to on-premises, edge, and cloud locations, while Cross-Cloud Network securely connects applications and users with high-speed connectivity between your environments and other clouds. This powerful combination means you’re no longer locked into a specific environment; you can easily migrate workloads and apply uniform management practices, streamlining operations and mitigating the risk of lock-in.
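As an illustration of that portability claim, here is a minimal vLLM sketch. The model name is an example, not a requirement, and the same script runs unchanged on a GPU or TPU host, with vLLM selecting the appropriate backend for the hardware present.

```python
# Minimal vLLM sketch: no device-specific code appears anywhere.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")  # example model
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(
    ["Explain workload portability in one paragraph."], params
)
print(outputs[0].outputs[0].text)
```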
Systems you can rely on
When your entire business model depends on the availability of AI services, infrastructure uptime is critical. Google Cloud’s global infrastructure is engineered for enterprise-grade reliability, an approach rooted in our history as the birthplace of Site Reliability Engineering (SRE).
We operate one of the world’s largest private software-defined networks, handling approximately 25% of global internet egress traffic. Unlike providers that rely on the public internet, we keep your traffic on Google’s own fiber to improve speed and reliability and reduce latency. This global backbone is powered by our Jupiter data center fabric, which scales to 13 Petabits/sec of bandwidth, delivering 50x greater reliability than previous generations — to say nothing of other providers. Finally, to improve cluster-level fault tolerance, we employ capabilities like elastic training and multi-tier checkpointing, which allow jobs to continue uninterrupted by dynamically resizing the cluster around failed nodes while minimizing the time to recovery.
Building on a secure foundation
Our approach is to secure AI from the ground up. In fact, Google Cloud maintains a leading track record for cloud security. Independent analysis from cloudvulndb.org (2024-2025) shows that our platform has up to 70% fewer critical and high vulnerabilities compared to the other two leading cloud providers. We were also the first in the industry to publish an AI/ML Privacy Commitment, which guarantees that we do not use your data to train our models. With those safeguards in place, security is integrated into the foundation of Google Cloud, based on the zero-trust principles that protect Google’s own services:
A hardware root of trust: Our custom Titan chips, as part of our Titanium architecture, create a verifiable hardware root of trust. We recently extended this with Titanium Intelligence Enclaves for Private AI Compute, allowing you to process sensitive data in a hardened, isolated, and encrypted environment.
Built-in AI security: Security Command Center (SCC) natively integrates with our infrastructure, providing AI Protection by automatically discovering assets, preventing security issues, detecting active threats with frontline Google Threat Intelligence, and discovering known and unknown risks before attackers can exploit them.
Sovereign solutions: We enable you to meet stringent data residency, operational control, and software sovereignty requirements through solutions like Data Boundary. This is complemented by flexible options like partner-operated sovereign controls and Google Distributed Cloud for air-gapped needs.
Platform controls for AI and agent governance: Vertex AI provides the essential governance layer for the enterprise builder to deploy models and agents at scale. This trust is anchored in Google Cloud’s secure-by-default infrastructure, utilizing platform controls like VPC Service Controls (VPC-SC) and Customer-Managed Encryption Keys (CMEK) to sandbox environments and protect sensitive data, and Agent Identity for granular IAM permissions. At the platform level, Vertex AI and Agent Builder integrate Model Armor to provide runtime protection against emergent agentic threats, such as prompt injection and data exfiltration.
Delivering continuous AI innovation
We are honored to be recognized as a Leader in The Forrester Wave™ report, which we believe validates decades of R&D and our approach to building ultra-scale AI infrastructure. Look to us to continue on this path of system-level innovation as we help you convert the promise of AI into a reality.
Today, we’re expanding the Gemini 3 model family with Gemini 3 Flash, which offers frontier intelligence built for speed at a fraction of the cost.
Gemini 3 Flash builds on the model series that developers and enterprises already love, optimized for high-frequency workflows that demand speed, without sacrificing quality. It allows enterprises to process near real-time information, automate complex workflows, and build responsive agentic applications.
Gemini 3 Flash is built to be highly efficient, pushing the boundaries of quality at better price performance and faster speed. With near real-time responses from the model, businesses can now provide more engaging experiences for their end users at production scale, without sacrificing quality.
Optimized for speed and scale
Gemini 3 Flash strikes an ideal balance between reasoning and speed, for agentic coding, production-ready systems, and responsive interactive applications. It is available now in Gemini Enterprise, Vertex AI, and Gemini CLI, so businesses and developers can access:
Advanced multimodal processing: Gemini 3 Flash enables enterprises to build applications capable of complex video analysis, data extraction, and visual Q&As in near real-time. Whether streamlining back-office operations by extracting structured data from thousands of documents, or analyzing video archives to identify trends, Gemini 3 Flash delivers these insights with the speed required for modern data pipelines.
Cost-efficient and high-performance execution for code and agents: Gemini 3 Flash delivers exceptional performance on coding and agentic tasks combined with a lower price point, allowing teams to deploy sophisticated reasoning across high-volume processes without hitting barriers.
Low latency for near-real-time experiences: Gemini 3 Flash delivers the intelligence of much larger models without the lag typically associated with them. Its low latency powers responsive applications, from live customer support agents to in-game assistants. These applications can now offer more natural interactions for both quick answers and deep reasoning.
Gemini 3 Flash clearly demonstrates that speed and scale do not have to come at the cost of intelligence.
Real-world value across industries
With the launch of Gemini 3 Pro last month, we introduced frontier performance across complex reasoning, multimodal and vision understanding, as well as agentic and vibe-coding tasks. Gemini 3 Flash retains this foundation, combining Gemini 3’s Pro-grade reasoning with Flash-level latency, efficiency, and cost.
We are already seeing a tremendous response from companies using Gemini 3 Flash. With inference speed and reasoning capabilities that are typically associated with larger models, Gemini 3 Flash is unlocking new and more efficient use cases for companies like Salesforce, Workday and Figma.
Reasoning and multimodality
“Gemini 3 Flash shows a relative improvement of 15% in overall accuracy compared to Gemini 2.5 Flash, delivering breakthrough precision on our hardest extraction tasks like handwriting, long-form contracts, and complex financial data. This is a significant jump in performance, and we’re excited to continue collaborating to bring this specialist-level reasoning to Box AI users.” – Yashodha Bhavnani, Head of AI, Box
“At Bridgewater, we require models capable of reasoning over vast, unstructured multimodal datasets without sacrificing conceptual understanding. Gemini 3 Flash is the first to deliver Pro-class depth at the speed and scale our workflows demand. Its long-context performance on complex problems is exceptional.” – Jasjeet Sekhon, Chief Scientist and Head of AI, AIA Labs, Bridgewater Associates
“ClickUp leverages Gemini 3 Flash’s advanced reasoning to help power our next generation of autonomous agents. Gemini is decomposing high-level user goals into granular tasks, and we are seeing massive quality improvements on critical path identification and long-horizon task sequencing.” – Justin Midyet, Director, Software Engineering, ClickUp
“Gemini 3 Flash has achieved a meaningful step up in reasoning, improving over 7% on Harvey’s BigLaw Bench from its predecessor, Gemini 2.5 Flash. These quality improvements, combined with Flash’s low latency, are impactful for high-volume legal tasks such as extracting defined terms and cross-references from contracts.” – Niko Grupen, Head of Applied Research, Harvey
Agentic coding
“Our engineers have found Gemini 3 Flash to work well together with Debug Mode in Cursor. Flash is fast and accurate at investigating issues and finding the root cause of bugs.” – Lee Robinson, VP of Developer Experience, Cursor
“Gemini 3 Flash is a major step above other models in its speed class when it comes to instruction following and intelligence. It’s immediately become our go-to for latency-sensitive experiences in Devin, and we’re excited to roll it out to more use cases.” – Walden Yan, Co-Founder, Cognition
“The improvements in the latest Gemini 3 Flash model are impressive. Even without specific optimization, we saw an immediate 10% baseline improvement on agentic coding tasks, including complex user-driven queries.” – Daniel Lewis, Distinguished Data Scientist, Geotab
“In our JetBrains AI Chat and Junie agentic-coding evaluation, Gemini 3 Flash delivered quality close to Gemini 3 Pro, while offering significantly lower inference latency and cost. In a quota-constrained production setup, it consistently stays within per-customer credit budgets, allowing complex multi-step agents to remain fast, predictable, and scalable.” – Denis Shiryaev, Head of AI DevTools Ecosystem, JetBrains
“For the first time, Gemini 3 Flash combines speed and affordability with enough capability to power the core loop of a coding agent. We were impressed by its tool usage performance, as well as its strong design and coding skills.” – Michele Catasta, President & Head of AI, Replit
“Gemini 3 Flash remains the best fit for Warp’s Suggested Code Diffs, where low latency and cost efficiency are hard constraints. With this release, it resolves a broader set of common command-line errors while staying fast and economical. In our internal evaluations, we’ve seen an 8% lift in fix accuracy.” – Zach Lloyd, Founder & CEO, Warp
Agentic applications
“Gemini 3 Flash is a great option for teams who want to quickly test and iterate on product ideas in Figma Make. The model can rapidly and reliably create prototypes while maintaining attention to detail and responding to specific design direction.” – Loredana Crisan, Chief Design Officer, Figma
“Presentations.ai is using Gemini 3 Flash to enhance our intelligent slide-generation agents, and we’re consistently impressed by the Pro-level quality at lightning-fast speeds. With previous Flash-sized models there were many things we simply couldn’t attempt because of the speed vs. quality tradeoff. With Gemini 3 Flash, we’re finally able to explore those workflows.” – Saravanan Govindaraj, Co-Founder & Head of Product Development, Presentations.ai
“Integrating Gemini 3 Flash into Agentforce is another step forward in our commitment to bring the best AI to our customers and deploy intelligent agents faster than ever. By pairing Google’s latest model capabilities with the power of Agentforce, we’re unlocking high-quality reasoning, stronger responses, and rapid iteration all inside the tools our customers already use.” – John Kucera, SVP of Product Management, Salesforce AI
“Gemini 3 Flash gives us a powerful new frontier model to fuel Workday’s AI-first strategy. From delivering sharper inference in our customer-facing applications to unlocking greater efficiency in our own operations and development, it provides the performance boost to continue to innovate rapidly.” – Dean Arnold, VP of AI Platform, Workday
“Gemini 3 Flash model’s superb speed and quality allow our users to keep generating content without interruptions. With its improved Korean abilities and adherence to prompts, Gemini 3 Flash can be used for a variety of use cases including agentic workflow and story generation. As the largest consumer AI company in Korea, we’d love to keep using Gemini 3 models and be part of its continuous improvement cycles.” – DJ Lee, Chief Product Officer, WRTN Technologies Inc.
Get started with Gemini 3 Flash
Today, you can safely put Gemini 3 Flash to work.
Business teams can access Gemini 3 Flash in preview on Gemini Enterprise, our advanced agentic platform for teams to discover, create, share, and run AI agents, all in one secure place.
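Developers can also call the model from Vertex AI. Here is a minimal sketch using the google-genai SDK; the model ID shown is an assumption, so check the Vertex AI documentation for the current Gemini 3 Flash identifier.

```python
# Hedged sketch: calling Gemini 3 Flash through Vertex AI with the
# google-genai SDK. The model ID below is an assumption.
from google import genai

client = genai.Client(vertexai=True, project="your-project", location="global")

response = client.models.generate_content(
    model="gemini-3-flash-preview",  # hypothetical identifier
    contents="Extract the invoice number and total due from this text: ...",
)
print(response.text)
```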
Your security program is robust. Your audits are clean. But are you ready for a real-world attack? A tenacious human adversary can expose a critical blind spot for security leaders: a program can be compliant, but not resilient. Bridging this gap requires more than just going through the red-teaming motions.
To help security teams forge better instincts for responding to actual cyber-crisis events, we developed ThreatSpace, a cyber proving ground: a realistic corporate network that includes all the digital noise of real employee activity.
From gaps to battle: The ThreatSpace cyber range
The ThreatSpace environment is architecturally stateless and disposable to allow the deployment of real-world malware. It emulates the tactics, techniques, and procedures (TTPs) of real-world adversaries, informed by the latest, unparalleled threat intelligence from Google Threat Intelligence Group and Mandiant. By design, it never puts your actual business assets at risk.
Recently, stakeholders from the U.S. Embassy, the FBI, and Cote d’Ivoire cybersecurity agencies used ThreatSpace to conduct advanced defense training. Funded by the Bureau of International Narcotics and Law Enforcement Affairs (INL), this workshop brought together public and private sector partners to strengthen regional digital security.
“Cybersecurity is a team sport, and our goal is to make Cote d’Ivoire a safer place for Ivorians and Americans to do business. This five-day workshop, funded by INL, brought together world-class instructors from Mandiant with local agencies and private sector partners to build the collaborative muscle we need to defend against modern threats,” said Colin McGuire, FBI law enforcement attaché, Dakar in Cabo Verde and Gulf of Guinea.
More than just helping to train individuals, the workshop helped make the global digital ecosystem safer by uniting diverse groups of defenders facing shared threats. By practicing collaboration during a crisis and operating as a unit, defenders are better equipped to fight and win against adversaries.
ThreatSpace provides a safe place for your team to miss an indicator of compromise, exercise processes, and stress-test collaboration, building the muscle memory and confidence needed to execute flawlessly when real adversaries come knocking. Validating that readiness against a live adversary is the next step, and that is where an Offensive Security red team assessment comes in.
Catch me if you can: The Mandiant red team reality check
The Mandiant red team doesn’t follow a script. Our work on the frontlines of incident response lets us see precisely how determined adversaries operate, including their persistent, creative approaches to exploiting the complex seams between your technology, your processes, and your people.
These observations enable our offensive security experts to mimic and emulate genuine threat actor behavior to achieve specific business objectives. Here are three scenarios developed by our red team to help stress-test and enhance our customers’ defenses:
The “Impossible” Blackout. One organization believed their grid controls were isolated and secure. When our team demonstrated that a nationwide blackout was technically possible through their current architecture, the conversation shifted from compliance to survival. This finding empowered them to implement stricter controls immediately, preventing a theoretical catastrophe from becoming a reality.
The Runaway Train. In another engagement, we gained remote system control of a locomotive. The client didn’t just get a technical report; they learned exactly how physical access vectors could bypass digital security. This exposure allowed them to harden their operational technology against vectors they had previously considered secure.
The Generous Chatbot. Innovation brings new risks. In a recent test of a financial services chatbot, our team used simple prompts to bypass safety filters, ultimately convincing the AI to approve a 200-month loan at 0% APR. This finding prompted the client to immediately implement critical guardrails and grounding sources, ensuring they could innovate safely without exposing their business to manipulation.
From reactive to resilient
Building true cyber resilience requires a continuous feedback loop. It starts with analyzing your current state and enhancing your capability roadmap to align with operational priorities. You then validate those capabilities through incident response learnings and offensive security insights, and feed the results back into the loop for the next iteration.
By combining these disciplines, and grounding them with threat intelligence, you can move your organization from a reactive posture to a state of proactive resilience. You find and expose your weaknesses today, so you can build the strength required to secure your future.
To battle-test your defenses, contact Mandiant to learn how our Offensive Security and ThreatSpace cyber range services can help you strengthen your defenses and build your resilience.
When it comes to public health, having a clear picture of a community’s needs is vital. These insights help officials secure crucial funding, launch new initiatives, and ultimately improve people’s lives.
That is the idea that inspired Dr. Phillip Levy, M.D., M.P.H., Professor of Emergency Medicine and Associate Vice President for Population Health and Translational Science at Wayne State University, and his colleagues to develop Project PHOENIX: the Population Health OutcomEs aNd Information eXchange. PHOENIX ingests information from electronic health records, including demographic data, blood pressure readings, and clinical diagnoses, and combines it with social and environmental factors from more than 70 anonymized data sources into an integrated virtual warehouse. Researchers, advocates, community leaders, and policy makers can use this data to better understand how different factors correlate with health outcomes and design targeted interventions.
With such functionality, the PHOENIX team recognized the potential to transform the Community Health Needs Assessment (CHNA) process. The federal government requires public health departments, nonprofit hospitals, and Federally Qualified Health Centers in the United States to complete a CHNA every three years—a largely manual, time-consuming task that can take up to a year to complete.
That’s where a collaboration between Wayne State University, Google Public Sector, and Syntasa came in. They teamed up to create CHNA 2.0, an innovative solution that drastically cuts down the time it takes to create these vital reports. By combining PHOENIX data with Vertex AI Platform, CHNA 2.0 can deliver a complete CHNA in a matter of weeks, giving health leaders valuable insights more quickly than ever.
Extracting community sentiment from public data
One of the most challenging parts of drafting a CHNA report involves conducting in-depth surveys to understand conditions in the community. This is often the most time-consuming part of the process, as it takes months to create, review, run, and analyze insightful surveys. By the time a CHNA report is complete, data from the surveys might be nearly a year out of date, which can prevent organizations from making a meaningful impact on their communities.
CHNA 2.0 uses public health data from the PHOENIX warehouse along with insights from Syntasa Sentiment Analytics, which combines information from surveys with real-time data from Google Search and social media posts. Syntasa Sentiment Analytics provides insights regarding the questions people are asking and what issues they’re posting about to uncover health-related problems affecting a given community, such as growing concerns about asthma or frustrations with long waits at clinics.
The architecture for this solution was built on the Syntasa Data + AI Platform. Workloads run on Google Kubernetes Engine (GKE) for its scalability, allowing the platform to process incoming sentiment data quickly. The platform also uses Cloud SQL and Google Cloud Storage as part of its data foundation, with BigQuery doing the heavy lifting for sentiment analysis. BigQuery provides the performance, efficiency, and versatility needed to handle large datasets of search and social media information efficiently.
Creating reports with the power of humans + AI
After gathering the necessary information, CHNA 2.0 uses Vertex AI and Gemini to help analysts create the report in less time. CHNA reports are highly complex and lengthy, requiring analysts to manually integrate multiple data elements. Syntasa solved this challenge by breaking the report down into smaller, more manageable tasks and bringing human oversight into the loop.
Now the person in charge of handling the CHNA defines the report’s structure. Gemini extracts insights from tailored datasets and fills in the relevant details. By combining both human and AI intelligence, CHNA 2.0 delivers reports in a fraction of the time.
Organizations can also use this method to deliver a living document that is constantly updated with fresh data. This means public health officials don’t have to wait years to understand their communities—they can access the latest insights at any time to make faster and more impactful decisions. The net result is a transformation of the CHNA process from static to dynamic, enabling real-time, data-driven decision-making for the betterment of all.
Supporting public health with technology
The City of Dearborn, Michigan, was the first to use CHNA 2.0, to great success. The long-term vision is to bring this same capability to other cities and counties in Michigan and across the nation.
This project with Wayne State University and Syntasa showcases how the right technology and a strategic partner can create a powerful, scalable solution to a long-standing public sector challenge. By partnering with Google Public Sector to leverage the most advanced AI and data tools, Wayne State not only automated a critical process, but also empowered public health officials to better serve their communities.
From improving community health to modernizing infrastructure, discover how technology is transforming the public sector. Sign up for our newsletter to stay informed on the latest trends and solutions.
We have exciting news for Google Cloud partners: Today we’re announcing our new partner program, Google Cloud Partner Network, which will formally roll out in the first quarter of 2026.
This new program marks a fundamental shift in how we measure success and reward value. Applicable to all partner types and sizes – ISVs, RSIs, GSIs, and more – the new program reinforces our strategic move toward recognizing partner contribution across the entire customer lifecycle.
Google Cloud Partner Network is being completely streamlined to focus on real-world results. This marks a strategic shift from measuring program work to valuing genuine customer outcomes. This includes rewarding successful co-sell sales efforts, high-quality service delivery, and shared innovation with ISVs. We are also integrating AI into the program’s core to make partner participation much easier, allowing more focus on customers instead of routine program administration.
With its official kickoff in Q1, the new program will provide a six-month transition window for partners to adjust to the new framework. Today, we are sharing the first details of the Google Cloud Partner Network, which is centered on three pillars: simplicity, outcomes, and automation.
Simplicity
We’re making the program simpler by moving away from tracking traditional program requirements, such as business plans and customer stories, and toward recognizing partner contributions, including pre-sales influence, co-innovation, and post-sales support.
Because the program is designed to put the customer first, we’ve narrowed requirements to focus on partner efforts that deliver real, measurable value. For example, the program will recognize investments in skills, real-world experience, and successful customer outcomes.
Outcomes
The new program will provide clear visibility into how partner impact is recognized and rewarded, focusing on customer outcomes. Critical highlights include:
New tiers: We’re evolving from a two-tier to a three-tier model: Select, Premier, and a new Diamond tier. Diamond is our highest distinction – it is intentionally selective, reserved for the few partners who consistently deliver exceptional customer outcomes. Each tier will reflect our joint customer successes, determined by customer outcomes across Google Cloud and Google Workspace.
New baseline competencies: A new competency framework marks a fundamental shift that will replace today’s specializations, in order to reward partners for their deep technical and sales capabilities. The framework focuses on a partner’s proven ability to help customers, measuring two key dimensions: capacity (knowledge and skills development, validated by technical certifications and sales credentials) and capability (real-world success, measured by pre-sales and post-sales contributions to validated closed/won opportunities). This framework operates independently from tiering to allow partners to earn a competency without any dependency on a program tier.
New advanced competency: The new global competencies introduce a second, higher level of achievement: Advanced Competency.
Automation
Building on the proven success and transparency delivered through tools like the Earnings Hub and Statement of Work Analyzer, today’s Partner Network Hub will transform to deliver automation and transparency across the program.
The administrative work required to participate in the program will be dramatically reduced through the use of AI and other tools. For example, a key change is the introduction of automated tracking across tiering and competency achievements. We will automatically apply every successful customer engagement toward a partner’s progress in all eligible tiers and competencies. This radical simplification eliminates redundant reporting and ensures seamless, comprehensive recognition for the outcomes delivered.
What’s next…
The new program and portal will officially launch in Q1 2026, enabling partners to immediately log in, explore benefits and differentiation paths, and begin achieving new tiers and competencies. To ensure a smooth transition, we will host a series of webinars and listening sessions throughout early next year to educate partners on Google Cloud Partner Network.
When extreme weather or an unexpected natural disaster strikes, time is the single most critical resource. For public sector agencies tasked with emergency management, the challenge isn’t just about crafting a swift response, it’s about communicating that response to citizens effectively. At our recent Google Public Sector Summit, we demonstrated how Google Workspace with Gemini is helping government agencies turn complex, legally precise official documents into actionable, personalized public safety tools almost instantly, transforming the speed and efficacy of disaster response communication.
Let’s dive deeper into how Google Workspace with Gemini can help transform government operations and boost the speed and effectiveness of critical public outreach during a natural disaster.
The challenge: Turning authority into action
Imagine you are a Communications Director at the Office of Emergency Management. In the aftermath of a severe weather event, the state government has just issued a critical Executive Order (EO). The EO is a foundational text: legally precise and essential for internal agency coordination. However, its technical, authoritative language is not optimized for the public’s urgent questions, such as: “Am I safe? Is my family safe? What should I do now?”
Manually translating and contextualizing this information for the public, and finding official answers to critical questions – often hidden in the details – can create a dangerous information gap during a fast-moving natural disaster.
Built on a foundation of trust
Innovation requires security. Google Workspace with Gemini empowers agencies to adopt AI without compromising on safety or sovereignty, supported by:
FedRAMP High authorization to meet the rigorous compliance standards of the public sector.
Data residency & access controls including data regions, access transparency, and access approvals.
Advanced defense mechanisms like context-aware access (CAA), data loss prevention (DLP), and client-side encryption (CSE).
Operational resilience with Business Continuity editions to help keep your agency connected and operational during critical events.
Google Workspace with Gemini: Your natural disaster response partner
This is one area where Google Workspace with Gemini can serve as an essential natural disaster partner, empowering government leaders to move beyond manual translation and rapidly create dynamic, user-facing tools.
For example, by using the Gemini app, the Communications Director at the Office of Emergency Management can simply upload the Executive Order PDF and prompt Gemini to ‘create an interactive safety check tool based on these rules.’ Gemini instantly parses the complex legal definitions—identifying specific counties, curfew times, and exemptions—and writes the necessary code to render a functional, interactive interface directly within the conversation window.
What was once a static document becomes a clickable prototype in seconds, ready to be tested and deployed.
Image: Gemini turns a natural disaster declaration into an interactive map
Three core capabilities driving transformation
This process is driven by three core Google Workspace with Gemini capabilities.
Unprecedented speed of transformation. The journey from a complex, static document to a working, interactive application is measured in minutes, not days or weeks. This acceleration completely changes the speed of development for mission-critical tools. In a disaster, the ability to deploy a targeted public safety resource instantly can be life-saving.
Deep contextual understanding. Gemini’s advanced AI goes beyond simple summarization. When provided with a full document and specific instructions, it can synthesize the data to perform complex tasks. For example, Gemini can analyze an executive order to identify embedded technical terms and locations, interpreting them as specific geographic areas that require attention. It extracts this pertinent information—while citing sources for grounding—and can transform raw text into a practical, location-aware tool for the public.
A repeatable blueprint for any natural disaster. The entire process—from secure document upload to the creation of a working, live application—is repeatable. This means the model can be saved and leveraged for any future public safety resource, whether it’s a severe weather warning, a health advisory, or a general preparedness guide. This repeatable blueprint future-proofs an agency’s ability to communicate quickly and effectively during any emergency.
Serving the public with speed and clarity
By leveraging Google Workspace with Gemini, public sector agencies can ensure that official emergency declarations immediately translate into clear, actionable details for the public. This shift from dense legal text to personalized guidance is paramount for strengthening public trust, improving citizen preparedness, and ultimately keeping communities safe.
Are you ready to drive transformation within your own agency? Check out the highlights from our recent Google Public Sector Summit where leaders gathered to share how they are applying the latest Google AI and security technologies to solve complex challenges and advance their missions. Learn more about our Google Workspace Test Drive, and sign up for a no-cost 30-day pilot that provides your agency with full, hands-on access to Google Workspace with Gemini, commitment-free, on your own terms.
The AI state of the art is shifting rapidly from simple chat interfaces to autonomous agents capable of planning, executing, and refining complex workflows. In this new landscape, the ability to ground these intelligent agents in your enterprise data is key to unlocking true business value. Google Cloud is at the forefront of this shift, empowering you to build robust, data-driven applications quickly and accurately.
Last month, Google announced Antigravity, an AI-first integrated development environment (IDE). Now, you can give the AI agents you build in Antigravity direct, secure access to the trusted data infrastructure that powers your organization, turning abstract reasoning into concrete, data-aware action. With Model Context Protocol (MCP) servers powered by MCP Toolbox for Databases now available within Antigravity, you can securely connect your AI agents to services like AlloyDB for PostgreSQL, BigQuery, Spanner, Cloud SQL, Looker, and others within Google’s Data Cloud, all within your development workflow.
Why use MCP in Antigravity?
We designed Antigravity to keep you in the flow, but the power of an AI agent is limited by what it “knows.” To build truly useful applications, your agent needs to understand your data. MCP acts as the universal translator. You can think of it like a USB-C port for AI. It allows the LLMs in your IDE to plug into your data sources in a standardized way. By integrating pre-built MCP servers directly into Antigravity, you don’t need to perform any manual configuration. Your agents can now converse directly with your databases, helping you build and iterate faster without ever leaving the IDE.
Getting started with MCP servers
In Antigravity, connecting an agent to your data is a UI-driven experience, eliminating the challenges we’ve all faced when wrestling with complex configuration files just to get a database connection running. Here’s how to get up and running.
1. Discover and launch
You can find MCP servers for Google Cloud in the Antigravity MCP Store. Search for the service you need, such as “AlloyDB for PostgreSQL” or “BigQuery,” and click on Install to start the setup process.
Launching the Antigravity MCP store
2. Configure your connection
Antigravity presents a form where you can add your service details such as Project ID and region. You can also enter your password or have Antigravity use your Identity and Access Management (IAM) credentials for additional security. These are stored securely, so your agent can access the tools it needs without exposing raw secrets in your chat window.
Installing the AlloyDB for PostgreSQL MCP Server
See your agents in action
Once connected to Antigravity, your agent gains a suite of “tools” (executable functions) that it can use to assist you and transform your development and observability experience across different services. Let’s take a look at a couple of common scenarios.
Streamlining database tasks with AlloyDB for PostgreSQL
When building against a relational database like PostgreSQL, you may spend time switching between your IDE and a SQL client to check schema names or test queries. With the AlloyDB MCP server, your agent handles that context and gains the ability to perform database administration and generate high-quality SQL code you can include in your apps — all within the Antigravity interface.
For example:
Schema exploration: The agent can use list_tables and get_table_schema to read your database structure and explain relationships to you instantly.
Query development: Ask the agent to “Write a query to find the top 10 users,” and it can use execute_sql to run it and verify the results immediately (see the SQL sketch below).
Optimization: Before you commit code, use the agent to run get_query_plan to ensure your logic is performant.
Antigravity agent using the MCP tools
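To give a flavor of what those tool calls run under the hood, here is a hypothetical exchange. The table names are invented for illustration; execute_sql simply executes statements like the first one against your instance, and get_query_plan surfaces the planner's view of the same statement, roughly as EXPLAIN does.

```sql
-- Hypothetical "top 10 users" query the agent might run via execute_sql
-- after exploring the schema with list_tables and get_table_schema.
SELECT u.id, u.name, COUNT(o.id) AS order_count
FROM users u
JOIN orders o ON o.user_id = u.id
GROUP BY u.id, u.name
ORDER BY order_count DESC
LIMIT 10;

-- The planner's view of the same statement, roughly what get_query_plan
-- reports so the agent can flag sequential scans or missing indexes
-- before you commit the code.
EXPLAIN
SELECT u.id, u.name, COUNT(o.id) AS order_count
FROM users u
JOIN orders o ON o.user_id = u.id
GROUP BY u.id, u.name
ORDER BY order_count DESC
LIMIT 10;
```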
Unlocking analytics with BigQuery
For data-heavy applications, your agent can act as a helpful data analyst. Leveraging the BigQuery MCP server, it can, for example:
Forecast: Use forecast to predict future trends based on historical data.
Search the catalog: Use search_catalog to discover and manage data assets.
Augmented analytics: Use analyze_contribution to understand the impact of different factors on data metrics.
Building on truth with Looker
Looker acts as your single source of truth for business metrics. Looker’s MCP server allows your agent to bridge the gap between code and business logic, for example:
Ensuring metric consistency: No more guessing whether a field is named total_revenue or revenue_total. Use get_explores and get_dimensions to ask your agent, “What is the correct measure for Net Retention?” and receive the precise field reference from the semantic model.
Instantly validating logic: Don’t wait to deploy a dashboard to test a theory. Use run_query to execute ad-hoc tests against the Looker model directly in your IDE, so that your application logic matches the live data.
Auditing reports: Use run_look to pull results from existing saved reports, allowing you to verify that your application’s output aligns with official business reporting.
Build with data in Antigravity
By integrating Google’s Data Cloud MCP servers into Antigravity, it’s easier than ever to use AI to discover insights and develop new applications. Now, with access to the wide variety of data sources that run your business, you’re ready to take the leap from simply talking to your code to creating new experiences for your users.
To get started, check out the following resources:
Building Generative AI applications has become accessible to everyone, but moving those applications from a prototype to a production-ready system requires one critical step: Evaluation.
How do you know if your LLM is safe? How do you ensure your RAG system isn’t hallucinating? How do you test an agent that generates SQL queries on the fly?
At its core, GenAI Evaluation is about using data and metrics to measure the quality, safety, and helpfulness of your system’s responses. It moves you away from “vibes-based” testing (just looking at the output) to a rigorous, metrics-driven approach using tools like Vertex AI Evaluation and the Agent Development Kit (ADK).
To guide you through this journey, we have released four hands-on labs that take you from the basics of prompt testing to complex, data-driven agent assessment.
Evaluating Single LLM Outputs
Before you build complex systems, you must understand how to evaluate a single prompt and its response. This lab introduces you to GenAI Evaluation, a service that helps you automate the evaluation of your model’s outputs.
You will learn how to define metrics, such as safety, groundedness, and instruction following. You will also learn how to run evaluation tasks against a dataset. This is the foundational step for any production-ready AI application.
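In the spirit of the lab, here is a minimal sketch of that workflow using the Vertex AI SDK's evaluation module. The one-row dataset is a toy example, and the exact metric names may differ by SDK version.

```python
# Minimal sketch: a bring-your-own-response evaluation on Vertex AI.
# Toy one-row dataset; metric names may vary by SDK version.
import pandas as pd
import vertexai
from vertexai.evaluation import EvalTask, MetricPromptTemplateExamples

vertexai.init(project="your-project", location="us-central1")

eval_dataset = pd.DataFrame({
    "prompt": ["Summarize the refund policy in one sentence."],
    "response": ["Refunds are available within 30 days of purchase."],
})

task = EvalTask(
    dataset=eval_dataset,
    metrics=[
        MetricPromptTemplateExamples.Pointwise.SAFETY,
        MetricPromptTemplateExamples.Pointwise.INSTRUCTION_FOLLOWING,
    ],
)
result = task.evaluate()
print(result.summary_metrics)
```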
Evaluate RAG Systems with Vertex AI
Retrieval Augmented Generation (RAG) is a powerful pattern, but it introduces new failure points: did the search fail to find the document, or did the LLM fail to summarize it?
This lab takes you deeper into the evaluation lifecycle. You will learn how to verify “Faithfulness” (did the answer come from the context?) and “Answer Relevance” (did it actually answer the user’s question?). You will pinpoint exactly where your RAG pipeline needs improvement.
Evaluating Agents with ADK
Agents are dynamic; they choose tools and plan steps differently based on the input. This makes them harder to test than standard prompts. You aren’t just grading the final answer; you are grading the trajectory, which is the path the agent took to get there.
This lab focuses on using the Agent Development Kit (ADK) to trace and evaluate agent decisions. You will learn how to define specific evaluation criteria for your agent’s reasoning process and how to visualize the results to ensure your agent is using its tools correctly.
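Here is a minimal sketch of what such a test can look like with the ADK's agent evaluator, assuming a hypothetical agent package named my_agent and a recorded eval set file next to it. The paths and names are illustrative, and the evaluator's exact module path may vary by ADK version.

```python
# Hedged sketch: replay a recorded eval set against an agent, scoring
# both final responses and tool-call trajectories. Names are illustrative.
import asyncio
from google.adk.evaluation.agent_evaluator import AgentEvaluator

async def main():
    await AgentEvaluator.evaluate(
        agent_module="my_agent",  # hypothetical agent package
        eval_dataset_file_path_or_dir="my_agent/session.test.json",
    )

asyncio.run(main())
```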
Build and Evaluate BigQuery Agents
When an agent interacts with data, precision is paramount. A SQL-generating agent must write syntactically correct queries and retrieve accurate numbers. A hallucination here doesn’t just look bad; it can lead to bad business decisions.
In this advanced lab, you will build an agent capable of querying BigQuery and then use the GenAI Eval Service to verify the results. You will learn to measure Factual Accuracy and Completeness, ensuring your agent provides the exact data requested without omission.
Trust Your AI in Production
Ready to make your AI applications production-grade? Start evaluating your model’s outputs or the trajectory taken by your agents with these codelabs:
These labs are part of the AI Evaluation module in our official Production-Ready AI with Google Cloud program. Explore the full curriculum for more content that will help you bridge the gap from a promising prototype to a production-grade AI application.
To build a production-ready agentic system, where intelligent agents can freely collaborate and act, we need standards and shared protocols for how agents talk to tools and how they talk to each other.
In the Agent Production Patterns module in the Production-Ready AI with Google Cloud Learning Path, we focus on interoperability, exploring the standard patterns for connecting agents to data, tools and each other. Here are three hands-on labs to help you build these skills.
Connecting to Data with MCP
Once you understand the basics, the next step is giving your agent access to knowledge. Whether you are analyzing massive datasets or searching operational records, the MCP Toolbox provides a standard way to connect your agent to your databases.
Expose a Cloud SQL database to an MCP Client
If you need your agent to search for specific records—like flight schedules or hotel inventory—this lab demonstrates how to connect to a Cloud SQL relational database.
From Prototype to Production
By moving away from custom integrations and adopting standards like MCP and A2A, you can build agents that are easier to maintain and scale. These labs provide the practical patterns you need to connect your agents to your data, your tools, and each other.
These labs are part of the Agent Production Patterns module in our official Production-Ready AI with Google Cloud Learning Path. Explore the full curriculum for more content that will help you bridge the gap from a promising prototype to a production-grade AI application.
Share your progress using the hashtag #ProductionReadyAI. Happy learning!
Welcome to the first Cloud CISO Perspectives for December 2025. Today, Francis deSouza, COO and president, Security Products, Google Cloud, shares our Cybersecurity Forecast report for the coming year, with additional insights from our Office of the CISO colleagues.
As with all Cloud CISO Perspectives, the contents of this newsletter are posted to the Google Cloud blog. If you’re reading this on the website and you’d like to receive the email version, you can subscribe here.
Forecasting 2026: The year AI rewrites the security playbook
By Francis deSouza, COO, Google Cloud
We are at a unique point in time where we’re facing a generational refactoring of the entire technology stack, including the threat landscape. 2025 was a watershed year in cybersecurity, where AI moved to the forefront of every company’s agenda, changing the game for both security offense and defense.
While threats continue to intensify — with attackers using AI for sophisticated phishing and deepfakes — defenders also have been gaining ground. This year’s evolutions will continue to drive change in the coming year, and our annual Cybersecurity Forecast report for 2026 explores how today’s lessons will impact tomorrow’s cybersecurity across four key areas: artificial intelligence, cybercrime, nation-state threats, and regulatory obligations.
1. The rise of agentic security automation
AI and agents will redefine how organizations secure their environment, turning the security operations center from a monitoring hub into an engine for automated action. This is critical because the window of opportunity has decreased; bad actors operate in hours, not weeks.
As data volumes explode, AI agents can give defenders a speed advantage we haven’t had in years. By stepping in to detect anomalies, automate data analysis, and initiate response workflows, your security teams can focus on the complex decisions that require human judgment. This shift won’t just improve speed — it will drive similar gains in proactively strengthening your entire security posture.
2. Building AI fluency as a defense
We will likely see a wave of AI-driven attacks targeting employees, largely because the weak link in security remains the user. Organizations haven’t spent enough time preparing their workforces to use AI securely. It is essential that companies build a learning culture around security that includes true AI fluency.
Every organization should deploy something like our Model Armor to protect their AI models. Implementing a validation layer at the gateway level ensures that guardrails are active controls rather than just theoretical guidelines.
However, technology is only half the equation. We also need a security-conscious workforce. If we don’t help our employees build these skills, teams simply won’t be equipped to identify the new wave of threats or understand how best to defend against them.
This means looking past standard training and investing in efforts like agentic security operations center (SOC) workshops and internal cyber war games to educate employees on what the threat landscape looks like in an AI world.
Read on for the key points from the Cybersecurity Forecast report, bolstered with new insights from our Office of the CISO.
AI advantages
Widespread adoption of AI agents will create new security challenges, requiring organizations to develop new methodologies and tools to effectively map their new AI ecosystems. A key part of this will be the evolution of identity and access management (IAM) to treat AI agents as distinct digital actors with their own managed identities.
AI adoption will transform security analysts’ roles, shifting them from drowning in alerts to directing AI agents in an agentic SOC. This will allow analysts to focus on strategic validation and high-level analysis, as AI handles data correlation, incident summaries, and threat intelligence drafting.
Taylor Lehmann, director, health care and life sciences
A year from now, we’re going to have an awesome security opportunity to secure a new persona in our organizations: Knowledge workers who produce truly useful, mission-critical applications and software using ideas and words — but not necessarily well-written, vetted, and tested code.
We’re going to need better and more fine-grained paths to help these new “idea-native developers” who use powerful AI tools and agents to build, test, submit, manage and blast secure code into secure production as safely and as fast as they can. In 2026 and 2027, we’re going to see how big this opportunity is. We should prepare to align our organizations, operations, and technology (OOT) to take advantage of it.
A corollary to this comes from our DORA reports: Just as AI has amplified productivity and begun optimizing work, it amplifies organizational dysfunctions — especially those that lead to inefficiently and ineffectively secured data.
Marina Kaganovich, executive trust lead
The heightened capability of agentic AI to take actions and execute tasks autonomously elevates the importance of cybersecurity basics. Organizations will need to create discrete boundary definitions for the authorization, authentication, and monitoring of each agent.
Beyond technical controls, organizational defense will depend on fostering an AI-literate workforce through training and awareness, as staff shift from performing tasks to architecting and overseeing agents. To be successful, organizations will require a fundamental shift in risk-informed culture.
Bill Reid, security advisor
Aggressive adoption of agentic AI will drive a renewed interest in threat modeling practices. Security teams will be asked to deeply understand what teams are trying to build, and will need to think about the data flows, the trust boundaries, and the guardrails needed.
Agentic AI will also demand that the supply chain be considered within that threat model, beyond the software bill of materials (SBOM), to look at how those services will control autonomous actions. It will also force a renewed look at identity and entitlements, as agents are asked to act on behalf of or as an extension of employees in the enterprise.
Wide scopes that were once acceptable when covered by detective controls may no longer be sufficient, given the speed of action that comes with automation and the chaining of models together in goal-seeking behavior.
Vesselin Tzvetkov, senior cybersecurity advisor
As Francis noted, agentic security operations are set to become the standard for modern SOCs, dramatically enhancing the speed and capabilities of security organizations. The agentic SOC in 2026 will feature multiple small, dedicated agents for tasks like summarization, alert grouping, similarity detection, and predictive remediation.
This shift will transform modern SOC roles and processes, moving away from tiered models in favor of CI/CD-like automation. AI capabilities and relevant know-how are essential for security personnel.
As AI drives new threat-hunting capabilities that draw insight from data lakes in previously underexplored areas, such as OT protocols for manufacturing and industry-specific protocols like SS7 for telecommunications, SOC coverage and overall industry security will improve.
Vinod D’Souza, director, manufacturing and industry
In 2026, agentic AI will help the manufacturing and industrial sector cross the critical threshold from static automation to true autonomy. Machines will self-correct and self-optimize with a speed and precision that exceeds human capacity.
The engine powering this transformation is the strategic integration of cloud-native SCADA and AI-native architectures. Security leaders should redefine their mandate from protecting a perimeter to enabling a trusted ecosystem anchored in cyber-physical identity.
Every sensor, service, autonomous agent, and digital twin should be treated as a verified entity. By rooting security strategies in data-centered Zero Trust, organizations stop treating security as a gatekeeper and transform it into the architectural foundation. More than just securing infrastructure, the goal is to secure the decision-making integrity of autonomous systems.
AI threats
We anticipate threat actors will move decisively from using AI as an exception to using it as the norm. They will use AI to enhance the speed, scope, and effectiveness of their operations, streamlining and scaling attacks.
A critical and growing threat is prompt injection, an attack that manipulates AI to bypass its security protocols and follow an attacker’s hidden command. Expect a significant rise in targeted attacks on enterprise AI systems.
Threat actors will accelerate the use of highly manipulative AI-enabled social engineering. This includes vishing (voice phishing) with AI-driven voice cloning to create hyperrealistic impersonations of executives or IT staff, making attacks harder to detect and defend against.
Anton Chuvakin, security advisor
We’ve been hearing about the sizzle of AI for some time, but now we need the steak to be served. While there’s still a place for exciting, hypothetical use cases, we need tangible AI benefits backed by solid security data, with value demonstrably obtained and proven.
Whether your company adopts agents or not, your employees will use them for work. Shadow agents raise new and interesting risks, especially when your employees connect their personal agents to corporate systems. Organizations will have to invest to mitigate the risks of shadow agents — merely blocking them simply won’t work (they will sneak back in immediately).
David Stone, director, financial services
As highlighted in the Google Threat Intelligence Group report on adversarial use of AI, attackers will use gen AI to exploit bad hygiene, employ deepfake capabilities to erode trust in processes, and discover zero-day vulnerabilities. Cyber defenders will likewise have to adopt gen AI capabilities to find and fix cyber hygiene, patch code at scale, and scrutinize critical business processes to get signals to find and stop exploitation of humans in the process.
Security will continue to grow in importance in the boardroom as the key focus on resilience, business enablement, and business continuity — especially as AI-driven attacks evolve.
Jorge Blanco, director, Iberia and Latin America
The increasing complexity of hybrid and multicloud architectures, coupled with the rapid, ungoverned introduction of AI agents, will accelerate the crisis in IAM failures, cementing them as the primary initial access vector for significant enterprise compromise.
The proliferation of sophisticated, autonomous agents — often deployed by employees without corporate approval (the shadow agent risk) — will create invisible, uncontrolled pipelines for sensitive data, leading to data leaks and compliance violations. The defense against this requires the evolution of IAM to agentic identity management, treating AI agents as distinct digital actors with their own managed identities.
Organizations that fail to adopt this dynamic, granular control — focusing on least privilege, just-in-time access, and robust delegation — will be unable to minimize the potential for privilege creep and unauthorized actions by these new digital actors. The need for practical guidance on securing multicloud environments, including streamlined IAM configuration, will be acutely felt as security teams grapple with this evolving threat landscape.
Sri Gourisetti, senior cybersecurity advisor
The increased adversarial use of AI for the development of malware modules will likely result in “malware bloat”: a high volume of AI-generated malicious code that is non-functional or poorly optimized, creating significant noise for amateur adversaries and defenders alike.
Functional malware will become more modular and mature, designed to be compatible and interact with factory floor and OT environments as the manufacturing and industrial sector moves beyond initial exploration of generative AI toward the structural deployment of agentic AI in IT, OT, and manufacturing workflows.
Widya Junus, strategy operations
Over 70% of cloud breaches stem from compromised identities, according to a recent Cloud Threat Horizons report, and we expect that trend to accelerate as threat actors exploit AI. The security focus should shift from human-centered authentication to automated governance of non-human identities using Cloud Infrastructure Entitlement Management (CIEM) and Workload Identity Federation (WIF).
Accordingly, as AI-assisted attacks lower the barrier for entry and cloud-native ransomware specifically targets APIs to encrypt workloads, organizations will increasingly rely on tamper-proof backups (such as Backup Vault) and AI-driven automated recovery workflows to ensure business continuity — rather than relying solely on perimeter defenses to stop every attack.
Cybercrime
The combination of ransomware, data theft, and multifaceted extortion will remain the most financially disruptive category of cybercrime. The volume of activity is escalating, with focus on targeting third-party providers and exploiting zero-day vulnerabilities for high-volume data exfiltration.
As the financial sector increasingly adopts cryptocurrencies, threat actors are expected to migrate core components of their operations onto public blockchains for unprecedented resilience against traditional takedown efforts.
As security controls mature in guest operating systems, adversaries are pivoting to the underlying virtualization infrastructure, which is becoming a critical blind spot. A single compromise here can grant control over the entire digital estate and render hundreds of systems inoperable in a matter of hours.
David Homovich, advocacy lead
In 2026, we expect to see more boards pressuring CISOs to translate security exposure and investment into financial terms, focusing on metrics like potential dollar losses and the actual return on security investment. Crucially, operational resilience — the organization’s ability to quickly recover from an AI-fueled attack — is a non-negotiable board expectation.
CISOs take note: Boards are asking us about business resilience and the impact of advanced, machine-speed attacks — like adversarial AI and securing autonomous identities such as AI agents. Have your dollar figures ready, because this is the new language of defense for boards.
Crystal Lister, security advisor
Next year, we’ll see the first sustained, automated campaigns where threat actors use agentic AI to autonomously discover and exploit vulnerabilities faster than human defenders can patch exploited vulnerabilities.
2025 showed us that adversaries are no longer leveraging artificial intelligence just for productivity gains; they are deploying novel AI-enabled malware in active operations. The ShadowV2 botnet was likely a test run for autonomous C2 infrastructure.
Furthermore, the November 2025 revelations about Chinese state-sponsored actors using Anthropic’s Claude to automate espionage code-writing demonstrate that barriers to entry for sophisticated attacks have collapsed. Our security value proposition should shift from detection to AI-speed preemption.
The global stage: Threat actors
Cyber operations in Russia are expected to undergo a strategic shift, prioritizing long-term global strategic goals and the development of advanced cyber capabilities over just tactical support for the conflict in Ukraine.
The volume of China-nexus cyber operations is expected to continue surpassing that of other nations. They will prioritize stealthy operations, aggressively targeting edge devices and exploiting zero-day vulnerabilities.
Driven by regional conflicts and the goal of regime stability, Iranian cyber activity will remain resilient, multifaceted, and semi-deniable, deliberately blurring the lines between espionage, disruption, and hacktivism.
North Korea will continue to conduct financial operations to generate revenue for the regime, cyber espionage against perceived adversaries, and seek to expand IT worker operations.
Bob Mechler, director, Telco, Media, Entertainment and Gaming
The telecom cybersecurity landscape in 2026 will be dominated by the escalation of AI-driven attacks and persistent geopolitical instability. We may witness the first major AI-driven cybersecurity breach, as adversaries use AI to automate exploit development and craft sophisticated attacks that outpace traditional defenses.
This technological escalation coincides with a baseline of state-backed and politically-motivated cyber-threat activity, where critical infrastructure is targeted as part of broader geopolitical conflicts. Recent state-sponsored campaigns, such as Salt Typhoon, highlight how adversaries are already penetrating telecommunications networks to establish long-term access, posing a systemic threat to national security.
Toby Scales, security advisor
Sovereign cloud will become a drumbeat across most of Europe, as EU member states seek to decrease their reliance on American tech companies.
At the same time, the AI capability gap will continue to widen and both enterprises and governments will chase agreements with frontier model providers. Regulatory bodies may seek to enforce “locally hosted fine-tuned models” as a way to protect state secrets, but will face predictable opposition from frontier model developers.
Meeting regulatory obligations
Governance has taken on new importance in the AI era. Key areas of focus are expanding to include data integrity to prevent poisoning attacks, model security to defend against evasion and theft, and governance fundamentals to ensure transparency and accountability.
Bhavana Bhinder, security, privacy, and compliance advisor
In 2026, we will see the validated AI operating model become the industry standard for healthcare and life sciences (HCLS), with a shift from pilot projects to organizations seeking full-scale production deployments that are compliant and audit-ready by design. The logical evolution for HCLS will move towards agentic evaluation, where autonomous agents act as real-time auditors.
Instead of periodic reviews, these agents will continuously validate that generative AI outputs (such as clinical study reports) remain factually grounded and conform to regulatory standards. Organizations that use the governed, quality-scored data necessary to trust advanced models like Gemini across the drug lifecycle, clinical settings, and quality management will depend on AI workflows that natively support industry- and domain-specific regulations.
Odun Fadahunsi, senior security risk and compliance advisor
As regulators and sectoral bodies in finance, healthcare and critical infrastructure define AI-specific resilience obligations, CISOs must treat AI resilience as a primary pillar of security, not a separate or optional discipline. AI systems are poised to become so deeply embedded in identity, fraud detection, customer operations, cloud automation, and decisioning workflows that AI availability and reliability will directly determine an organization’s operational resilience.
Unlike traditional systems, AI can fail in silent, emergent, or probabilistic ways — drifting over time, degrading under adversarial prompts, and behaving unpredictably after upstream changes in data or model weights. These failure modes will create security blind spots, enabling attackers to exploit model weaknesses that bypass traditional controls.
CISOs and governance, risk, and compliance teams should work together to build an AI resilience architecture, establish continuous AI health monitoring, integrate AI into business continuity and incident response, and embed AI resilience into security governance.
For more leadership guidance from Google Cloud experts, please see our CISO Insights hub.
Here are the latest updates, products, services, and resources from our security teams so far this month:
Responding to React2Shell (CVE-2025-55182): Follow these recommendations to minimize remote code execution risks in React and Next.js from the React2Shell (CVE-2025-55182) vulnerability. Read more.
How Google Does It: Securing production services, servers, and workloads: Here are the three core pillars that define how we protect production workloads at Google-scale. Read more.
How Google Does It: Using Binary Authorization to boost supply chain security: “Don’t trust, verify,” guides how we secure our entire software supply chain. Here’s how we use Binary Authorization to ensure that every component meets our security best practices and standards. Read more.
New data on ROI of AI in security: Our new ROI of AI in security report showcases how organizations are getting value from AI in cybersecurity, and finds a significant, practical shift is underway. Read more.
Using MCP with Web3: How to secure blockchain-interacting agents: In the Web3 world, who hosts AI agents, and who holds the private key to operations, are pressing questions. Here’s how to get started with the two most likely agent models. Read more.
Expanding the Google Unified Security Recommended program: We are excited to announce Palo Alto Networks as the latest addition to the Google Unified Security Recommended program, joining previously announced partners CrowdStrike, Fortinet and Wiz. Read more.
Why PQC is Google’s path forward (and not QKD): After closely evaluating Quantum Key Distribution (QKD), here’s why we chose post-quantum cryptography (PQC) as the more scalable solution for our needs. Read more.
Architecting security for agentic capabilities in Chrome: Following the recent launch of Gemini in Chrome and the preview of agentic capabilities, here’s our approach and some new innovations to improve the safety of agentic browsing. Read more.
Android Quick Share support for AirDrop: As part of our efforts to continue to make cross-platform communication easier, we’ve made Quick Share interoperable with AirDrop, allowing for two-way file sharing between Android and iOS devices, starting with the Pixel 10 Family. Read more.
Please visit the Google Cloud blog for more security stories published this month.
Threat Intelligence news
Intellexa’s prolific zero-day exploits continue: Despite extensive scrutiny and public reporting, commercial surveillance vendors such as Intellexa continue to operate unimpeded. Known for its “Predator” spyware, new GTIG analysis shows that Intellexa is evading restrictions and thriving. Read more.
APT24’s pivot to multi-vector attacks: GTIG is tracking a long-running and adaptive cyber espionage campaign by APT24, a People’s Republic of China (PRC)-nexus threat actor that has been deploying BADAUDIO over the past three years. Here’s our analysis of the malware, and how defenders can detect and mitigate this persistent threat. Read more.
Get going with Time Travel Debugging using a .NET process hollowing case study: Unlike traditional live debugging, this technique captures a deterministic, shareable record of a program’s execution. Here’s how to start incorporating TTD into your analysis. Read more.
Analysis of UNC1549 targeting the aerospace and defense ecosystem: Following last year’s post on suspected Iran-nexus espionage activity targeting the aerospace, aviation, and defense industries in the Middle East, we discuss additional tactics, techniques, and procedures (TTPs) observed in incidents Mandiant has responded to. Read more.
Please visit the Google Cloud blog for more threat intelligence stories published this month.
Now hear this: Podcasts from Google Cloud
The truth about autonomous AI hacking: Heather Adkins, Google’s Security Engineering vice-president, separates the hype from the hazards of autonomous AI hacking, with hosts Anton Chuvakin and Tim Peacock. Listen here.
Escaping 1990s vulnerability management: Caleb Hoch, consulting manager for security transformations, Mandiant, discusses with Anton and Tim how vulnerability management has evolved beyond basic scanning and reporting, and the biggest gaps between modern practices and what organizations are actually doing. Listen here.
The art and craft of cloud bug hunting: Bug bounty professionals Sivanesh Ashok and Sreeram KL have won the Most Valuable Hacker award from the Google Cloud VRP team. They chat about all things buggy with Anton and Tim, including how to write excellent bug bounty reports. Listen here.
Behind the Binary: The art of deconstructing problems: Host Josh Stroschein is joined by Nino Isakovic, a long-time low-level security expert, for a thought-provoking conversation that spans the foundational and the cutting-edge — including his discovery of the ScatterBrain obfuscating compiler. Listen here.
To have our Cloud CISO Perspectives post delivered twice a month to your inbox, sign up for our newsletter. We’ll be back in a few weeks with more security-related updates from Google Cloud.
We can all agree that the quality of AI-driven answers relies on the consistency of the underlying data. But AI models, while powerful, lack business context out of the box. As more organizations ask questions of their data using natural language, it is increasingly important to unify business measures and dimensions and ensure consistency company-wide. If you want trustworthy AI, you need a semantic layer that acts as the single source of truth for business metrics.

But how do you make that data accessible and actionable for your end users? Building off the recent introduction of Looker’s Model Context Protocol (MCP) server, in this blog we take you through the process of creating an Agent Development Kit (ADK) agent that is connected to Looker via the MCP Toolbox for Databases, and exposing it within Gemini Enterprise. Let’s get started.

Step 1: Set up Looker Integration in MCP Toolbox
MCP Toolbox for Databases is a central open-source server that hosts and manages toolsets, enabling agentic applications to leverage Looker’s capabilities without working directly with the platform. Instead of managing tool logic and authentication themselves, agents act as MCP clients and request tools from the Toolbox. The MCP Toolbox handles all the underlying complexities, including secure connections to Looker, authentication and query execution.
The MCP Toolbox for Databases natively supports Looker’s pre-built toolset. To access these tools, follow the below steps:
Connect to Cloud Shell. Check that you’re already authenticated, and that the project is set to your project ID using the following command:
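A typical check looks like the following (a sketch using standard gcloud commands; substitute your own project ID):

gcloud auth list
gcloud config set project YOUR_PROJECT_ID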
Install the binary version of the MCP Toolbox for Databases via the commands below. These commands are for Linux; if you’re on macOS or Windows, check the releases page for the binary that matches your operating system and architecture.
export OS="linux/amd64"  # one of linux/amd64, darwin/arm64, darwin/amd64, or windows/amd64
curl -O https://storage.googleapis.com/genai-toolbox/v0.12.0/$OS/toolbox
chmod +x toolbox
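Optionally, you can run the server locally first to confirm the Looker toolset loads. This is a sketch; the --prebuilt flag and the Looker environment variable names are assumptions based on the MCP Toolbox documentation, so verify them against the current docs:

export LOOKER_BASE_URL=https://your.looker.instance
export LOOKER_CLIENT_ID=your_client_id
export LOOKER_CLIENT_SECRET=your_client_secret
./toolbox --prebuilt looker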
Deploy Toolbox to Cloud Run
Next, you’ll need to run MCP Toolbox. The simplest way to do that is on Cloud Run, Google Cloud’s fully managed container application platform. Here’s how:
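A minimal deployment sketch is below. The container image path, region, and environment variable values are assumptions; check the MCP Toolbox documentation for the current image and required Looker settings:

gcloud run deploy toolbox \
  --image=us-central1-docker.pkg.dev/database-toolbox/toolbox/toolbox:latest \
  --region=us-central1 \
  --set-env-vars=LOOKER_BASE_URL=https://your.looker.instance,LOOKER_CLIENT_ID=your_client_id,LOOKER_CLIENT_SECRET=your_client_secret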
Cloud Run will ask whether to allow unauthenticated invocations; select No (Allow unauthenticated: N).
Step 2: Deploy ADK Agent to Agent Engine
Next, you need to configure Agent Development Kit (ADK), a flexible and modular framework for developing and deploying AI agents. ADK was designed to make agent development feel more like software development, to make it easier for developers to create, deploy, and orchestrate agentic architectures that range from simple tasks to complex workflows. And while ADK is optimized for Gemini and the Google ecosystem, it’s also model-agnostic, deployment-agnostic, and is built for compatibility with other frameworks.
Vertex AI Agent Engine, a part of the Vertex AI Platform, is a set of services that enables developers to deploy, manage, and scale AI agents in production. Agent Engine handles the infrastructure to scale agents in production so you can focus on creating applications.
Open a new terminal tab in Cloud Shell and create a folder named my-agents as follows. You also need to navigate to the my-agents folder.
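For example:

mkdir my-agents
cd my-agents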
Now you’re ready to use adk to create scaffolding (folders, environment, and basic files) for the Looker agent application via the adk create command with the app name looker_app:
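Assuming the ADK CLI is installed (for example, via pip install google-adk), the command is a one-liner:

adk create looker_app

During creation, you will be prompted for: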
A Gemini model for the root agent
Vertex AI for the backend
Your default Google Cloud project ID and region
Choose a model for the root agent:
1. gemini-2.5-flash-001
2. Other models (fill later)
Choose model (1, 2): 1

1. Google AI
2. Vertex AI
Choose a backend (1, 2): 2

Enter Google Cloud project ID [your_current_project_id]:
Enter Google Cloud region [us-central1]:

Agent created in /home/romin/looker-app:
- .env
- __init__.py
- agent.py
Take a look at the folder: a default template and the required files for the agent have been created.
First up is the .env file:
GOOGLE_GENAI_USE_VERTEXAI=1
GOOGLE_CLOUD_PROJECT=YOUR_GOOGLE_PROJECT_ID
GOOGLE_CLOUD_LOCATION=YOUR_GOOGLE_PROJECT_REGION
The values indicate that you will be using Gemini via Vertex AI along with the respective values for the Google Cloud Project Id and location.
Then you have the __init__.py file that marks the folder as a module and has a single statement that imports the agent from the agent.py file:
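The generated file is a single line:

from . import agent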
Finally, take a look at the agent.py file. The contents can be edited to be similar to the example below:
Insert the Cloud Run URL where highlighted (not the one with the project number in the URL).
import os
from google.adk.agents import LlmAgent
from google.adk.planners.built_in_planner import BuiltInPlanner
from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset
from google.adk.tools.mcp_tool.mcp_session_manager import SseConnectionParams, StreamableHTTPConnectionParams
from google.genai.types import ThinkingConfig
from google.auth import compute_engine
import google.auth.transport.requests
import google.oauth2.id_token

# Replace this URL with the correct endpoint for your MCP server.
MCP_SERVER_URL = "YOUR_CLOUD_RUN_URL/mcp"
if not MCP_SERVER_URL:
    raise ValueError("The MCP_SERVER_URL is not set.")

def get_id_token():
    """Get an ID token to authenticate with the MCP server."""
    target_url = MCP_SERVER_URL
    audience = target_url.split('/mcp')[0]
    auth_req = google.auth.transport.requests.Request()
    id_token = google.oauth2.id_token.fetch_id_token(auth_req, audience)
    # Return the ID token.
    return id_token


root_agent = LlmAgent(
    model='gemini-2.5-flash',
    name='looker_agent',
    description='Agent to answer questions about Looker data.',
    instruction=(
        'You are a helpful agent who can answer user questions about Looker data '
        'the user has access to. Use the tools to answer the question. If you are '
        'unsure on what model to use, try defaulting to thelook and if you are also '
        'unsure on the explore, try order_items if using thelook model'
    ),
    planner=BuiltInPlanner(
        thinking_config=ThinkingConfig(include_thoughts=False, thinking_budget=0)
    ),
    tools=[
        MCPToolset(
            connection_params=StreamableHTTPConnectionParams(
                url=MCP_SERVER_URL,
                headers={
                    "Authorization": f"Bearer {get_id_token()}",
                },
            ),
            errlog=None,
            # Load all tools from the MCP server at the given URL
            tool_filter=None,
        )
    ],
)
NOTE: Ensure you grant the Cloud Run Invoker role to the default Agent Engine Service Account (i.e., service-PROJECT_NUMBER@gcp-sa-aiplatform-re.iam.gserviceaccount.com)
Step 3: Connect to Gemini Enterprise
Now it’s time to create a Gemini Enterprise app (instructions here).
Run the below command with the GCP Project Number, Reasoning Engine resource name output from the ‘deploy agent_engine’ command above, and your Gemini Enterprise Agent ID from the Gemini Enterprise Apps interface:
Your Looker data will now be available within your Gemini Enterprise app. If you don’t have access to this feature, contact your Google Cloud account team.
Querying business data made easier
Connecting Looker’s semantic layer to Vertex AI Agent services by way of the ADK and MCP Toolbox is a big win for data accessibility. By exposing your trusted Looker models and Explores in Gemini Enterprise, you empower end-users to query complex business data using natural language. This integration closes the gap between data insights and immediate action, ensuring that your organization’s semantic layer is not just a source of passive reports, but an active, conversational, and decision-driving asset.
Today, many organizations operate with data that’s trapped in silos, in disconnected legacy systems, and is days or hours old. However, the rise of AI presents the need and the opportunity to unify these environments, tap into unstructured data from audio, video, and text files (which together make up more than 80% of enterprise data), and enable business decisions informed by real-time data. Data teams navigating AI also face a new set of challenges, such as automating complex workflows and apps, grounding them in enterprise data, activating real-time insights on multimodal data, and building a foundation that inspires trust in AI.
Google’s Data Cloud is an AI-native platform designed to unify an organization’s entire data foundation and enable intelligent applications and agentic experiences. Data Cloud integrates Google infrastructure, intelligence, and data platform with pioneering AI advancements, including Gemini for working with data, automation of metadata management and governance, and flexible workflows for developers, allowing customers to focus on innovation and business outcomes rather than integration challenges.
Recently, we were honored to be recognized as a Leader in the 2025 Gartner® Magic Quadrant™ for Data Integration Tools. In our opinion this demonstrates Data Cloud’s tight integration with data integration tools and vision for AI, including customer use cases for multimodal data processing, and scalable, efficient vectorization. In addition, we were recognized as a leader in The Forrester Wave™: Streaming Data Platforms, Q4 2025. In this blog post, we take a look at recent updates and innovations that we believe made recognition from these two leading analyst firms possible.
Boost productivity with Gemini-powered intelligence
Data agents are revolutionizing the way different data roles operate by bringing automation, intelligence, and natural language capabilities into their daily workflows. Whether you’re a data analyst querying and visualizing data more efficiently, a developer building smarter applications, or a data scientist accelerating model development, agents can help streamline repetitive tasks and boost your productivity. Data engineers benefit from automated data preparation and pipeline management, while ML engineers can deploy and monitor models more effectively. Even business users, who traditionally rely on technical teams for insights, can now interact with data directly using natural language.
Recent innovations to Gemini with BigQuery for data engineering provide automation to build data pipelines to ingest, transform, and validate data. This includes data transformations like data cleaning, deduplication, formatting, standardizing, joins, and aggregations as well as data quality to enforce rules and standards. Building on these capabilities, the Data Engineering Agent further accelerates productivity by intelligently automating these standard integration patterns and proactively monitoring pipeline health.
Speed efficiency with multimodal automation and governance
We are removing the friction to build AI applications using autonomous vector embedding for multimodal data. Building on our BigQuery Vector Search capabilities, data teams can build, manage, and maintain complex data pipelines without needing to update vector embeddings. BigQuery now takes care of this automatically with added capabilities for agents to connect user intent to enterprise data. This is powering customer systems like the in-store product finder at Morrisons, which handles 50,000 customer searches on a busy day.
We are also helping organizations ensure their data platform acts as a real-time brain for AI, including orchestration and AI-infused services. Governance is foundational to data and AI success. In today’s world of distributed data spanning lakes, warehouses, and operational systems, intelligence is impossible without unified governance.
New automated cataloging with Dataplex Universal Catalog allows data teams to discover, ingest, and index metadata from a wide range of sources, minimizing the effort involved in cataloging data, and providing a near-real-time view of your data and AI landscape. Dataplex provides context to your data teams and your agents beyond the normal scope of a universal catalog. It leverages Gemini to continuously derive relationships and auto-generate business semantics, providing AI agents with trusted, real-time context.
Ericsson uses Dataplex to deliver a unified business vocabulary to users, including data classification, ownership, retention policies, and sensitivity labels. This allows different data personas to instantly understand a data origin, increasing trust and reducing investigation time.
Optimize workloads for broad usability
Managing data across cloud and hybrid environments can be piecemeal, leading to costly inefficiencies, redundant storage, and complex data movement.
To help, visual pipelines provide a code-free user experience for designing, deploying, managing and monitoring pipelines, with a metadata-driven approach to improving developer productivity. And enhancements to data preparation in BigQuery provide a single platform to clean, structure, enrich and build data pipelines.
For ML transformations supporting retrieval augmented generation (RAG) use cases, recent innovations enhance model inference to ML models in real-time or batch. And support for libraries and frameworks for multimodal data allows data teams to leverage multiple models in a single pipeline, improving accuracy and recall.
Integrating real-time data and context for AI
Agents need context in order to be effective, and are significantly limited when they rely on static or outdated information. To make accurate decisions that genuinely help users and the business, they need real-time access to the current state of your systems and users. We launched Managed Service for Apache Kafka last year to help you integrate your operational and transactional data into your AI and data platform, which in turn can power your AI agents. This year, we added critical enterprise capabilities such as Apache Kafka Connect, VPC Service Controls, mutual TLS authentication, and Kafka access control, which have helped customers like MadHive deploy to production in a matter of months.

To enable new streaming architectures, we added User-Defined Functions (UDFs) support in Pub/Sub for transforming messages (like JSON) before they go to destinations like BigQuery, allowing custom logic, validation, and enrichment on streaming data and making Pub/Sub pipelines more powerful and flexible. We also enhanced Dataflow, our advanced unified streaming and batch processing engine, with critical capabilities such as parallel updates, Managed I/O, Google Cloud TPU support, speculative execution, and more, bringing the power of AI-enabled data processing to advanced stream processing use cases such as continuous ML feature extraction and real-time fraud detection.
Data integration and streaming momentum
It was a busy year for the Google Data Cloud team, and we are honored to be recognized in these recent Gartner and Forrester reports. We look forward to continuing to innovate and partner with you on your data transformation journey.
Forrester does not endorse any company, product, brand, or service included in its research publications and does not advise any person to select the products or services of any company or brand based on the ratings included in such publications. Information is based on the best available resources. Opinions reflect judgment at the time and are subject to change. For more information, read about Forrester’s objectivity here.
Gartner, Magic Quadrant for Data Integration Tools, Michele Launi, Nina Showell, Robert Thanaraj, Sharat Menon, 8 December 2025
Gartner does not endorse any vendor, product or service depicted in its research publications and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s Research & Advisory organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally, and MAGIC QUADRANT is a registered trademark of Gartner, Inc. and/or its affiliates and are used herein with permission. All rights reserved.
Pioneering organizations have been using Gemini Live API to build the next generation of multimodal conversational AI that blends voice, vision, and text, to deliver fluid, human-like, and highly contextual interactions. For Google Cloud customers, this means you can deploy low-latency voice and video agents with the stability and performance required for your most demanding workflows.
A new standard with real-time multimodal AI agents
Gemini Live API represents a new standard for bringing AI to life. Imagine an agent that doesn’t just listen, but instantly understands the user’s intent, the context of their screen, captures the emotion in their voice, and responds with a human-like voice — all in real time.
The power behind this dynamic capability is the Gemini 2.5 Flash Native Audio model. Our approach is based on a simple commitment: to bring the same high-quality conversational intelligence found in advanced experiences across Google directly to your enterprise applications.
In a real-time interaction, precision and speed are non-negotiable. Gemini Live API is natively multimodal and is designed to handle the instantaneous complexity of human dialogue:
It can process interruptions mid-sentence without missing a beat, ensuring natural turn-taking.
It understands acoustic cues like pitch and pace, deciphering intent and tone.
It can see and discuss complex visual data (charts, live video, diagrams) shared by a user, providing immediate, contextual assistance.
The confidence to deploy on Vertex AI
Gemini Live API is engineered for enterprise success. Vertex AI provides the security and stability your mission-critical agents need for production.
The Gemini 2.5 Flash Native Audio model is optimized to process a high volume of concurrent interactions with consistent, low-latency performance. Deploying on Vertex AI allows you to leverage our expanding global infrastructure across multiple regions, delivering reliability for your users. Additionally, enterprise-grade data residency features allow you to manage where your data is processed, helping you meet critical regulatory and compliance standards.
Building real-world impact with Gemini Live API
The true power of Gemini Live API is demonstrated by the companies who are using it today to redefine their customer experiences.
Shopify, the leading global commerce platform, developed Sidekick, a multimodal AI assistant powered by Gemini Live API on Vertex AI. It provides personalized, robust support away from a desk, enabling real-time problem solving that eliminates traditional ticketing workflows.
“Users often forget they’re talking to AI within a minute of using Sidekick, and in some cases have thanked the bot after a long chat. This is an exciting time to be an entrepreneur. New AI capabilities offered through Gemini empower our merchants to win.” – David Wurtz, VP of Product, Shopify
United Wholesale Mortgage (UWM) transformed its business process by using their AI Loan Officer Assistant, Mia, to dramatically increase business efficiency for their broker partners.
“By integrating the Gemini 2.5 Flash Native Audio model and harnessing the Gemini Live API capabilities on the Vertex AI platform, we’ve significantly enhanced Mia’s capabilities since launching in May 2025. This powerful combination has enabled us to generate over 14,000 loans for our broker partners, proving that AI is much more than just a buzzword at UWM.” – Jason Bressler, Chief Technology Officer, UWM
SightCall provides remote video support and AI-driven visual assistance, helping customer service and field teams solve problems faster.
“What makes this partnership so exciting is that the Gemini 2.5 Flash Native Audio model isn’t just fast — it’s seamlessly human. When combined with SightCall Xpert Knowledge™, it becomes a real-time expert that knows what your best technicians know… This is the future of visual support.” – Thomas Cottereau, CEO, SightCall
Napster uses the Gemini Live API’s vision and audio capabilities so their users can co-create and receive live guidance from specialized AI companions.
“By utilizing the Gemini 2.5 Flash Native Audio model on Vertex AI, we’ve built something we couldn’t before: AI Companions that see you, see your screen, and respond like real experts in real-time conversation. This combination of vision and audio enables genuine collaboration — no prompting, no engineering — just natural dialogue where AI understands your full context and unlocks creativity and expertise for everyone.” – Edo Segal, CTO, Napster
Lumeris is deploying their health AI assistant, Tom, in high-stakes environments where nuance and emotional sensitivity are non-negotiable.
“The transition to the Gemini Live API on Vertex AI is a strategic investment in more intuitive and efficient patient conversations. The result is a more responsive and personalized voice experience. For Lumeris, our goal is elevating the quality of every interaction between patients and Tom, our agentic primary care team member. This helps us set a new standard for patient care.” – Jean-Claude Saghbini, President and Chief Technology Officer, Lumeris
Newo deploys versatile AI Receptionists that achieve a conversational quality that is truly lifelike and emotionally intuitive, handling tasks from general inquiries to sales.
“Working with the Gemini 2.5 Flash Native Audio model through Vertex AI allows Newo.ai AI Receptionists to achieve unmatched conversational intelligence — combining ultra-low latency with advanced reasoning. They can identify the main speaker even in noisy settings, switch languages mid-conversation, and sound remarkably natural and emotionally expressive. Our Gemini Live API-powered outbound AI Sales Agents can laugh, joke, and truly connect — making every call feel human.” – David Yang, co-founder, Newo.ai
11Sight is redefining customer interactions with AI-powered conversational agents that book appointments and close sales.
“The Gemini 2.5 Flash Native Audio model on Vertex AI gave us the enterprise-grade platform required to rapidly develop our voice AI agents with very low latency. Integrating this solution with our Sentinel AI Agents pushed our call resolution rates from 40% in February to 60% in November.” – Dr. Farokh Eskafi, CTO, 11Sight
Give your AI apps and agents a natural, almost human-like interface, all through a single WebSocket connection.
Today, we announced the general availability of Gemini Live API on Vertex AI, which is powered by the latest Gemini 2.5 Flash Native Audio model. This is more than just a model upgrade; it represents a fundamental move away from rigid, multi-stage voice systems towards a single, real-time, emotionally aware, and multimodal conversational architecture.
We’re thrilled to give developers a deep dive into what this means for building the next generation of multimodal AI applications. In this post we’ll look at two templates and three reference demos that help you understand how to best use Gemini Live API.
Gemini Live API as your new voice foundation
For years, building conversational AI involved stitching together a high-latency pipeline of Speech-to-Text (STT), a Large Language Model (LLM), and Text-to-Speech (TTS). This sequential process created the awkward, turn-taking delays that prevented conversations from ever feeling natural.
Gemini Live API fundamentally changes the engineering approach with a unified, low-latency, native audio architecture.
Native audio processing: Gemini 2.5 Flash Native Audio model processes raw audio natively through a single, low-latency model. This unification is the core technical innovation that dramatically reduces latency.
Real-time multimodality: The API is designed for unified processing across audio, text, and visual modalities. Your agent can converse about topics informed by live streams of visual data (like charts or live video feeds shared by a user) simultaneously with spoken input.
Next-generation conversation features
Gemini Live API gives you a suite of production-ready features that define a new standard for AI agents:
Affective dialogue (emotional intelligence): By natively processing raw audio, the model can interpret subtle acoustic nuances like tone, emotion, and pace. This allows the agent to automatically de-escalate stressful support calls or adopt an appropriately empathetic tone.
Proactive audio (smarter barge-in): This feature moves beyond simple Voice Activity Detection (VAD). As demonstrated in our live demo, you can configure the agent to intelligently decide when to respond and when to remain a silent co-listener. This prevents unnecessary interruptions when passive listening is required, making the interaction feel truly natural.
Tool use: Developers can seamlessly integrate tools like Function Calling and Grounding with Google Search into these real-time conversations, allowing agents to pull real-time world knowledge and execute complex actions immediately based on spoken and visual input.
Continuous memory: Agents maintain long, continuous context across all modalities.
Enterprise-grade stability: With GA release, you get the high availability required for production workloads, including multi-region support to ensure your agents remain responsive and reliable for users globally.
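Several of these features surface as connection-time configuration. The sketch below uses the google-genai Python SDK; field names such as enable_affective_dialog and proactivity reflect our reading of the current API documentation and may require a preview API version, so treat them as assumptions to verify:

from google import genai
from google.genai import types

# Hypothetical project ID; these preview fields may require a specific
# api_version, so check the Live API reference before relying on them.
client = genai.Client(vertexai=True, project="your-project-id", location="us-central1")

config = types.LiveConnectConfig(
    response_modalities=["AUDIO"],
    # Affective dialogue: adapt responses to the speaker's tone and emotion.
    enable_affective_dialog=True,
    # Proactive audio: let the model decide when to speak and when to stay silent.
    proactivity={"proactive_audio": True},
    # Tool use: real-time sessions can ground answers with Google Search.
    tools=[types.Tool(google_search=types.GoogleSearch())],
)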
Developer quickstart: Getting started
For developers, the quickest way to experience the power of low-latency, real-time audio is to understand the flow of data. Unlike REST APIs where you make a request and wait, Gemini Live API requires managing a bi-directional stream.
Gemini Live API flow
Before diving into code, it is critical to visualize the production architecture. While a direct connection is possible for prototyping, most enterprise applications require a secure, proxied flow: User-facing App -> Your Backend Server -> Gemini Live API (Google Backend).
In this architecture, your frontend captures media (microphone/camera) and streams it to your secure backend, which then manages the persistent WebSocket connection to Gemini Live API in Vertex AI. This ensures sensitive credentials never leave your server and allows you to inject business logic, persist conversation state, or manage access control before data flows to Google.
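To make that flow concrete, here is a minimal sketch of the backend side of a session using the google-genai Python SDK. The project ID and the stub audio chunk are placeholders, and audio capture and playback are elided:

import asyncio
from google import genai
from google.genai import types

client = genai.Client(vertexai=True, project="your-project-id", location="us-central1")
MODEL = "gemini-live-2.5-flash-native-audio"

async def main():
    config = {"response_modalities": ["AUDIO"]}
    # A single stateful connection carries traffic in both directions.
    async with client.aio.live.connect(model=MODEL, config=config) as session:
        # Upstream: forward 16 kHz, 16-bit PCM chunks from your capture pipeline.
        await session.send_realtime_input(
            audio=types.Blob(data=b"\x00\x00", mime_type="audio/pcm;rate=16000")  # stub chunk
        )
        # Downstream: consume the model's audio as it streams back.
        async for message in session.receive():
            if message.data:
                pass  # hand message.data (24 kHz PCM) to your playback pipeline

asyncio.run(main())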
To help you get started, we have released two distinct Quickstart templates – one for understanding the raw protocol, and one for modern component-based development.
Core implementation: You interact with the gemini-live-2.5-flash-native-audio model via a stateful WebSocket connection.
const client = new GeminiLiveAPI(proxyUrl, projectId, model);

// Connect using the access token handled by the proxy
client.connect(accessToken);

// Stream audio from the user's microphone
client.sendAudioMessage(base64AudioChunk);
Running the Vanilla JS Demo:
pip3 install -r requirements.txt
gcloud auth application-default login
python3 server.py
# Open http://localhost:8000
Pro-tip: Debugging raw audio. Working with raw PCM audio streams can be tricky. If you need to verify your audio chunks or test Base64 strings, we’ve included a PCM Audio Debugger in the repository.
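As a lightweight alternative, you can also wrap a captured chunk in a WAV header yourself using only the Python standard library. This sketch assumes the Live API default of 16-bit mono PCM (24 kHz here, the model output rate; microphone input is typically 16 kHz):

import base64
import wave

def pcm_base64_to_wav(b64_chunk: str, path: str, sample_rate: int = 24000) -> None:
    """Decode a Base64 PCM chunk and wrap it in a WAV container for playback."""
    pcm = base64.b64decode(b64_chunk)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)        # mono
        wav.setsampwidth(2)        # 16-bit little-endian samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)

# Usage: pcm_base64_to_wav(chunk_from_your_logs, "check.wav"), then listen to the result.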
React starter template: Best for building scalable, production-ready applications with complex UIs.
If you are building a robust enterprise application, our React starter provides a modular architecture using AudioWorklets for high-performance, low-latency audio processing.
Features:
Real-time streaming: Audio and video streaming to Gemini with React state management.
AudioWorklets: Uses capture.worklet.js and playback.worklet.js for dedicated audio processing threads.
Secure proxy: Python backend handles Google Cloud authentication.
If you prefer a simpler development process for specific telephony or WebRTC environments, we have third-party partner integrations with Daily, Twilio, LiveKit, and Voximplant. These platforms have integrated the Gemini Live API over the WebRTC protocol, allowing you to drop these capabilities directly into your existing voice and video workflows without managing the networking stack yourself.
Gemini Live API: Three production-ready demos
Once you have your foundation set with either template, how do you scale this into a product? We’ve built three demos showcasing the distinct “superpowers” of Gemini Live API.
1. Real-time proactive advisor agent
The core of building truly natural conversational AI lies in creating a partner, not just a chatbot. This specialized application demonstrates how to build a business advisor that listens to a conversation and provides relevant insights based on a provided knowledge base.
It showcases two critical capabilities for professional agents: Dynamic Knowledge Injection and Dual Interaction Modes.
The Scenario: An advisor sits in on a business meeting. It has access to specific injected data (revenue stats, employee counts) that the user defines in the UI.
Dual modes:
Silent mode: The advisor listens and “pushes” visual information via a show_modal tool without speaking. This is perfect for unobtrusive assistance where you want data, not interruption.
Outspoken mode: The advisor politely interjects verbally to offer advice, combining audio response with visual data.
Barge-in control: The demo uses activity_handling configurations to prevent the user from accidentally interrupting the advisor, ensuring complete delivery of complex advice when necessary (see the configuration sketch after this list).
Tool use: Uses a custom show_modal tool to display structured information to the user.
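Here is a hedged sketch of that barge-in control using the google-genai Python SDK. The field names reflect our reading of the current SDK, and the project ID is a placeholder; verify both against the SDK reference before relying on them.

import asyncio
from google import genai
from google.genai import types

async def main():
    client = genai.Client(vertexai=True, project="your-project-id",
                          location="us-central1")
    config = types.LiveConnectConfig(
        response_modalities=["AUDIO"],
        realtime_input_config=types.RealtimeInputConfig(
            # NO_INTERRUPTION lets the advisor finish complex advice even if the
            # user speaks; the default START_OF_ACTIVITY_INTERRUPTS allows barge-in.
            activity_handling=types.ActivityHandling.NO_INTERRUPTION,
        ),
    )
    async with client.aio.live.connect(model="gemini-live-2.5-flash-native-audio",
                                       config=config) as session:
        await session.send_client_content(
            turns=types.Content(role="user",
                                parts=[types.Part(text="Summarize the revenue stats.")])
        )
        async for message in session.receive():  # stream the uninterrupted reply
            if message.server_content and message.server_content.turn_complete:
                break

asyncio.run(main())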
Check out the full source code for the real-time advisor agent implementation in our GitHub repository.
2. Multimodal customer support agent
Customer support agents must be able to act on what they “see” and “hear.” This demo layers Contextual Action and Affective Dialogue onto the voice stream, creating a support agent that can resolve issues instantly.
This application simulates a futuristic customer support interaction where the agent can see what you see, understand your tone, and take real actions to resolve your issues instantly. Instead of describing an item for a return, the user simply shows it to the camera. The agent combines this visual input with emotional understanding to drive real actions:
Multimodal Understanding: The agent visually inspects items shown by the customer (e.g., verifying a product for return) while listening to their request.
Empathetic Response: Using affective dialogue, the agent detects the user’s emotional state (frustration, confusion) and adjusts its tone to respond with appropriate empathy.
Action Taking and Tool Use: It doesn’t just chat; it uses custom tools like process_refund (handling transaction IDs) or connect_to_human (transferring complex issues) to actually solve the problem (a declaration sketch follows this list).
Real-time Interaction: Low-latency voice interaction using Gemini Live API over WebSockets.
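For reference, here is a minimal sketch of how those two tools could be declared with the google-genai SDK. The parameter schemas are illustrative assumptions; the demo repository defines the authoritative versions.

from google.genai import types

process_refund = types.FunctionDeclaration(
    name="process_refund",
    description="Issue a refund for a verified return.",
    parameters=types.Schema(
        type=types.Type.OBJECT,
        properties={
            "transaction_id": types.Schema(type=types.Type.STRING),
            "reason": types.Schema(type=types.Type.STRING),
        },
        required=["transaction_id"],
    ),
)

connect_to_human = types.FunctionDeclaration(
    name="connect_to_human",
    description="Transfer the conversation to a human agent.",
    parameters=types.Schema(
        type=types.Type.OBJECT,
        properties={"issue_summary": types.Schema(type=types.Type.STRING)},
    ),
)

# Attach both tools to the Live session config so the model can call them
# mid-conversation based on what it sees and hears.
config = types.LiveConnectConfig(
    response_modalities=["AUDIO"],
    tools=[types.Tool(function_declarations=[process_refund, connect_to_human])],
)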
Check out the full source code for the multimodal customer support agent implementation in our GitHub repository.
3. Real-time video game assistant
Gaming is better with a co-pilot. In this demo, we build a Real-Time Gaming Guide that moves beyond simple chat to become a true companion that watches your gameplay and adapts to your style.
This React application streams both your screen capture and microphone audio to the model simultaneously, allowing the agent to understand the game state instantly. It showcases three advanced capabilities:
Multimodal awareness: The agent acts as a second pair of eyes, analyzing your screen to spot enemies, loot, or puzzle clues that you might miss.
Persona switching: You can dynamically toggle the agent’s personality – from a “Wise Wizard” offering cryptic hints to a “SciFi Robot” or “Commander” giving tactical orders. This demonstrates how system instructions can instantly change the voice and style of assistance (sketched after this list).
Google Search Grounding: The agent pulls real-time information to provide up-to-date walkthroughs and tips, ensuring you never get stuck on a new level.
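A compact sketch of the persona toggle follows, under the assumption that each switch opens a fresh Live session with a new system instruction (system instructions are fixed for the lifetime of a session). The persona strings are paraphrased from the demo; the config shape should be checked against the google-genai SDK.

from google.genai import types

PERSONAS = {
    "wizard": "You are a Wise Wizard. Offer cryptic, lore-flavored hints.",
    "robot": "You are a SciFi Robot. Give terse, precise tactical guidance.",
    "commander": "You are a Commander. Issue confident, direct orders.",
}

def live_config_for(persona: str) -> types.LiveConnectConfig:
    """Build a Live session config whose voice and style follow the chosen persona."""
    return types.LiveConnectConfig(
        response_modalities=["AUDIO"],
        system_instruction=types.Content(parts=[types.Part(text=PERSONAS[persona])]),
        tools=[types.Tool(google_search=types.GoogleSearch())],  # Search grounding
    )

Reconnecting trades a brief pause for a clean persona switch.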
Check out the full source code for the real-time video game assistant implementation in our GitHub repository.
Get started today
Try it out today: Experiment with Gemini Live API in Vertex AI Studio
Start building: Access Gemini Live API on Vertex AI today and move beyond chatbots to create truly intelligent, responsive, and empathetic user experiences.
Get the code: All demos and quickstarts are available in our official GitHub repository.
Public sector agencies are under constant pressure to do more with less. With shrinking budgets and expanding responsibilities, the need for efficient, secure, and modern tools has never been greater. Outdated systems can hinder collaboration, create security risks, and impact employee morale, making it difficult to deliver the services constituents require.
Modernizing mission delivery
Google Workspace is a modern solution designed to meet these challenges head-on. It provides a unified, cloud-native platform with built-in AI to streamline workflows, enhance security, and foster seamless collaboration. To prove its value to your team, we invite you to the Google Workspace test drive. This no-cost, 30-day pilot provides your agency with full, hands-on access to the entire suite, commitment-free.
Our team provides comprehensive support throughout the pilot, including:
Guided setup: We handle the technical setup and configuration to get you started quickly.
User onboarding: We provide hands-on training to ensure your team feels confident and capable.
Workflow testing: We help you test Google Workspace with your agency’s actual day-to-day tasks.
Impact assessment: We deliver a final report you can present to leadership, detailing the value and user feedback.
Secure, efficient AI you can trust
In the public sector, efficiency isn’t just about saving money—it’s about maximizing mission impact. Every dollar saved and every hour reclaimed is a resource that can be reinvested into serving the community, from improving public safety to delivering critical services. AI is the key to unlocking this potential, automating routine administrative tasks and freeing up public servants to focus on the high-value, human-centric work that truly makes a difference. Google Workspace meets this challenge by embedding Gemini directly into the tools your team uses every day.
Underscoring our commitment to the public sector, Gemini in Workspace is the first generative AI assistant for a productivity and collaboration suite to achieve FedRAMP High authorization. This certification allows government agencies to confidently deploy advanced AI tools, knowing sensitive data is protected within a stringent federal security framework. It enables agencies to leverage Gemini as a force multiplier, helping staff reclaim time and focus on mission-critical work.
This commitment to security extends to our partners in the Defense Industrial Base (DIB) as well. Google Public Sector recently achieved Cybersecurity Maturity Model Certification (CMMC) Level 2 Certification. This certification, validated by a certified third-party assessment organization (C3PAO), affirms that Google Public Sector’s internal systems used to handle Controlled Unclassified Information (CUI) meet the Department of Defense’s rigorous cybersecurity standards for protecting CUI.
Transforming workflows with Gemini
Imagine drafting a plan in minutes using “Help me write” in Docs, instantly summarizing a long email thread in Gmail, chatting with people in another language with instant translation, or getting automatic meeting notes and action items from Meet. This is the practical power of Gemini. The 30-day test drive is your opportunity to go beyond theory and see how these AI capabilities can transform your agency’s real-world workflows. Our team will guide you in developing workflows and collecting feedback, culminating in an assessment report for your leadership. Your team will have full access to experience how Gemini can reduce administrative burdens and accelerate productivity firsthand.
The U.S. Department of Transportation (DOT) selected Google Workspace as its new agency-wide collaboration suite, becoming the first cabinet-level agency to fully transition its workforce away from legacy providers to Google Workspace with Gemini, using the General Services Administration (GSA) OneGov Strategy. More than 50,000 DOT employees, including those from all transportation modes – ranging from the Federal Aviation Administration (FAA) to the National Highway Traffic Safety Administration (NHTSA) – will be able to take advantage of Workspace’s modern suite of cloud-based productivity and collaboration tools, including apps like Gmail, Docs, Drive, and Meet, to help employees securely connect, create, and collaborate. More than 12,000 users have access today, with 40,000 more coming onboard in 2026.
The Google Workspace Test Drive has been a game-changer for our customers. Instead of lengthy pilots that often lose momentum, this focused 30-day sprint delivers quick wins and clear outcomes. Our clients walk away with tangible insights into how Google Workspace can improve collaboration and security, and we’re able to accelerate decision-making while building stronger executive alignment.
Sean Maday, CTO, Game Plan Tech
Take the next step
See the difference AI makes in mission delivery. Get 30 days of no-cost access to our FedRAMP High-authorized productivity suite with built-in AI. Sign up for the Google Workspace Test Drive to pilot Google Workspace with Gemini, on your own terms.
Written by: Aragorn Tseng, Robert Weiner, Casey Charrier, Zander Work, Genevieve Stark, Austin Larsen
Introduction
On Dec. 3, 2025, a critical unauthenticated remote code execution (RCE) vulnerability in React Server Components, tracked as CVE-2025-55182 (aka “React2Shell”), was publicly disclosed. Shortly after disclosure, Google Threat Intelligence Group (GTIG) began observing widespread exploitation across many threat clusters, ranging from opportunistic cyber crime actors to suspected espionage groups.
GTIG has identified distinct campaigns leveraging this vulnerability to deploy the MINOCAT tunneler, the SNOWLIGHT downloader, and the HISONIC and COMPOOD backdoors, as well as XMRig cryptocurrency miners, some of which overlap with activity previously reported by Huntress. These observed campaigns highlight the risk posed to organizations running unpatched versions of React and Next.js. This post details the observed exploitation chains and post-compromise behaviors and provides intelligence to assist defenders in identifying and remediating this threat.
CVE-2025-55182 is an unauthenticated RCE vulnerability in React Server Components with a CVSS v3.x score of 10.0 and a CVSS v4 score of 9.3. The flaw allows unauthenticated attackers to send a single HTTP request that executes arbitrary code with the privileges of the user running the affected web server process.
GTIG considers CVE-2025-55182 to be a critical-risk vulnerability. Due to the use of React Server Components (RSC) in popular frameworks like Next.js, there are a significant number of exposed systems vulnerable to this issue. Exploitation potential is further increased by two factors: 1) there are a variety of valid payload formats and techniques, and 2) the mere presence of vulnerable packages on systems is often enough to permit exploitation.
The specific RSC packages that are vulnerable to CVE-2025-55182 are versions 19.0, 19.1.0, 19.1.1, and 19.2.0 of:
react-server-dom-webpack
react-server-dom-parcel
react-server-dom-turbopack
In the initial days after disclosure, a large number of non-functional exploits were widely distributed, along with false information regarding viable payloads and exploitation logic. One example of a repository that started out wholly non-functional is the one published by the GitHub user “ejpir”, which initially claimed to be a legitimate, functional exploit; its README has since been updated to appropriately label the original research claims as AI-generated and non-functional. While this repository still contains non-functional exploit code, it now also contains legitimate exploit code with Unicode obfuscation. Instances like this initially caused confusion across the industry, but the number of legitimate exploits and their capabilities have since expanded massively, including in-memory Next.js web shell deployment. There are also exploit samples (some entirely fake, some non-functional, and some with legitimate functionality) that contain malware targeting security researchers. Researchers should validate all exploit code before trusting its capabilities or legitimacy.
Technical write-ups about this vulnerability have been published by reputable security firms, such as the one from Wiz. Researchers should refer to such trusted publications for up-to-date and accurate information when validating vulnerability details, exploit code, or published detections.
Additionally, there was a separate CVE issued for Next.js (CVE-2025-66478); however, this CVE has since been marked as a duplicate of CVE-2025-55182.
Observed Exploitation Activity
Since exploitation of CVE-2025-55182 began on Dec. 3, 2025, GTIG has observed diverse payloads and post-exploitation behaviors across multiple regions and industries. In this blog post, we focus on China-nexus espionage and financially motivated activity, but we have additionally observed Iran-nexus actors exploiting CVE-2025-55182.
China-Nexus Activity
As of Dec. 12, GTIG has identified multiple China-nexus threat clusters utilizing CVE-2025-55182 to compromise victim networks globally. Amazon Web Services (AWS) reporting indicates that China-nexus threat groups Earth Lamia and Jackpot Panda are also exploiting this vulnerability. GTIG tracks Earth Lamia as UNC5454. Currently, there are no public indicators available to assess a group relationship for Jackpot Panda.
MINOCAT
GTIG observed China-nexus espionage cluster UNC6600 exploiting the vulnerability to deliver the MINOCAT tunneler. The threat actor retrieved and executed a bash script that creates a hidden directory ($HOME/.systemd-utils), kills any processes named “ntpclient”, downloads a MINOCAT binary, and establishes persistence by creating a new cron job and a systemd service and by inserting malicious commands into the current user’s shell config so that MINOCAT executes whenever a new shell is started. MINOCAT is a 64-bit ELF executable for Linux that includes a custom “NSS” wrapper and an embedded, open-source Fast Reverse Proxy (FRP) client that handles the actual tunneling.
SNOWLIGHT
In separate incidents, suspected China-nexus threat actor UNC6586 exploited the vulnerability to execute a command using cURL or wget to retrieve a script that then downloaded and executed a SNOWLIGHT downloader payload (7f05bad031d22c2bb4352bf0b6b9ee2ca064a4c0e11a317e6fedc694de37737a). SNOWLIGHT is a component of VSHELL, a publicly available multi-platform backdoor written in Go, which has been used by threat actors of varying motivations. GTIG observed SNOWLIGHT making HTTP GET requests to C2 infrastructure (e.g., reactcdn.windowserrorapis[.]com) to retrieve additional payloads masquerading as legitimate files.
Figure 1: cURL command executed to fetch SNOWLIGHT payload
COMPOOD
GTIG also observed multiple incidents in which threat actor UNC6588 exploited CVE-2025-55182, then ran a script that used wget to download a COMPOOD backdoor payload. The script then executed the COMPOOD sample, which masqueraded as Vim. GTIG did not observe any significant follow-on activity, and this threat actor’s motivations are currently unknown.
Figure 2: COMPOOD downloaded via wget and executed
COMPOOD has historically been linked to suspected China-nexus espionage activity. In 2022, GTIG observed COMPOOD in incidents involving a suspected China-nexus espionage actor, and we also observed samples uploaded to VirusTotal from Taiwan, Vietnam, and China.
HISONIC
Another China-nexus actor, UNC6603, deployed an updated version of the HISONIC backdoor. HISONIC is a Go-based implant that utilizes legitimate cloud services, such as Cloudflare Pages and GitLab, to retrieve its encrypted configuration. This technique allows the actor to blend malicious traffic with legitimate network activity. In this instance, the actor embedded an XOR-encoded configuration for the HISONIC backdoor delimited between two markers, “115e1fc47977812” to denote the start of the configuration and “725166234cf88gxx” to mark the end. Telemetry indicates this actor is targeting cloud infrastructure, specifically AWS and Alibaba Cloud instances, within the Asia Pacific (APAC) region.
Finally, we also observed a China-nexus actor, UNC6595, exploiting the vulnerability to deploy ANGRYREBEL.LINUX. The threat actor uses an installation script (b.sh) that attempts to evade detection by masquerading the malware as the legitimate OpenSSH daemon (sshd) within the /etc/ directory, rather than its standard location. The actor also employs timestomping to alter file timestamps and executes anti-forensics commands, such as clearing the shell history (history -c). Telemetry indicates this cluster is primarily targeting infrastructure hosted on international Virtual Private Servers (VPS).
Financially Motivated Activity
Threat actors that monetize access via cryptomining are often among the first to exploit newly disclosed vulnerabilities. GTIG observed multiple incidents, starting on Dec. 5, in which threat actors exploited CVE-2025-55182 and deployed XMRig for illicit cryptocurrency mining. In one observed chain, the actor downloaded a shell script named “sex.sh,” which downloads and executes the XMRig cryptocurrency miner from GitHub. The script also attempts to establish persistence for the miner via a new systemd service called “system-update-service.”
GTIG has also observed numerous discussions regarding CVE-2025-55182 in underground forums, including threads in which threat actors have shared links to scanning tools, proof-of-concept (PoC) code, and their experiences using these tools.
Outlook and Implications
After the disclosure of high-visibility, critical vulnerabilities, it is common for affected products to undergo a period of increased scrutiny, resulting in a swift but temporary increase in the number of vulnerabilities discovered. Since the disclosure of CVE-2025-55182, three additional React vulnerabilities have been disclosed: CVE-2025-55183, CVE-2025-55184, and CVE-2025-67779. In this case, two of these follow-on vulnerabilities have relatively limited impact (restricted information disclosure and a denial-of-service (DoS) condition, respectively). The third vulnerability (CVE-2025-67779) also causes a DoS condition, as it arose from an incomplete patch for CVE-2025-55184.
Recommendations
Organizations utilizing React or Next.js should take the following actions immediately:
Patch Immediately:
To prevent remote code execution due to CVE-2025-55182, patch vulnerable React Server Components to at least 19.0.1, 19.1.2, or 19.2.1, depending on your vulnerable version. Patching to 19.2.2 or 19.2.3 will also prevent the potential for remote code execution.
To prevent the information disclosure impacts due to CVE-2025-55183, patch vulnerable React Server Components to at least 19.2.2.
To prevent DoS impacts due to CVE-2025-55184 and CVE-2025-67779, patch vulnerable React Server Components to 19.2.3. The 19.2.2 patch was found to be insufficient in preventing DoS impacts.
Deploy WAF Rules: Google has rolled out a Cloud Armor web application firewall (WAF) rule designed to detect and block exploitation attempts related to this vulnerability. We recommend deploying this rule as a temporary mitigation while your vulnerability management program patches and verifies all vulnerable instances.
Audit Dependencies: Determine if vulnerable React Server Components are included as a dependency in other applications within your environment (a combined audit-and-hunt sketch follows this list).
Monitor Network Traffic: Review logs for outbound connections to the indicators of compromise (IOCs) listed below, particularly wget or cURL commands initiated by web server processes.
Hunt for Compromise: Look for the creation of hidden directories like $HOME/.systemd-utils, the unauthorized termination of processes such as ntpclient, and the injection of malicious execution logic into shell configuration files like $HOME/.bashrc.
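To make the audit and hunt steps actionable, here is a minimal triage sketch in Python. The package names, version list, and host indicators come from this post (we assume the “19.0” release corresponds to the 19.0.0 package version); treat it as a starting point, not a complete detection.

import json
import pathlib

VULNERABLE = {"19.0.0", "19.1.0", "19.1.1", "19.2.0"}  # versions named in this post
PACKAGES = {"react-server-dom-webpack", "react-server-dom-parcel",
            "react-server-dom-turbopack"}

def audit_node_modules(root: str) -> None:
    """Flag vulnerable React Server Components anywhere under node_modules."""
    for pkg_json in pathlib.Path(root).rglob("node_modules/*/package.json"):
        try:
            meta = json.loads(pkg_json.read_text())
        except (json.JSONDecodeError, OSError):
            continue
        if meta.get("name") in PACKAGES and meta.get("version") in VULNERABLE:
            print(f"[!] vulnerable {meta['name']}@{meta['version']} at {pkg_json}")

def hunt_host_iocs() -> None:
    """Check for the host indicators described in this post."""
    home = pathlib.Path.home()
    if (home / ".systemd-utils").exists():  # hidden dir used by the MINOCAT installer
        print("[!] hidden directory ~/.systemd-utils present")
    for rc in (home / ".bashrc", home / ".profile", home / ".zshrc"):
        if rc.exists() and ".systemd-utils" in rc.read_text(errors="ignore"):
            print(f"[!] suspicious reference injected into {rc}")

if __name__ == "__main__":
    audit_node_modules(".")
    hunt_host_iocs()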
Indicators of Compromise (IOCs)
To assist defenders in hunting for this activity, GTIG is providing the following indicators observed during recent investigations. A Google Threat Intelligence Collection of IOCs is available for registered users.
In today’s dynamic business environment, accurate forecasting is the bedrock of efficient operations. Yet businesses across all industries grapple with the constant challenge of predicting future demand, resource needs, and market trends. This is no abstract problem; the cost of miscalculation can be substantial, leading to wasted resources and missed opportunities.
Imagine a large retail chain struggling to predict seasonal demand for its popular clothing lines. A miscalculation means either mountains of unsold inventory and costly markdowns or constant stock-outs that lead to lost sales and frustrated customers. Or consider a manufacturer trying to optimize the procurement of raw materials. Inaccurate forecasts force them into a reactive cycle of expensive rush orders and production delays, or they see their capital tied up in slow-moving inventory.
Whether it’s outdated processes, siloed systems, or missing data semantics slowing down your forecasts — and your business — there’s now a better way. A new era of AI-powered enterprise intelligence means we can move beyond reactive measures and achieve new levels of foresight. Google Cloud and App Orchid have now developed a novel multi-agent application for business forecasting that helps transform the predictive challenges of the past into strategic advantages.
In this post, we will outline the elements of our multi-agent system, the benefits of this innovative approach to agentic frameworks, and how others can benefit from not only our offering but rethink how they create their own.
Transforming enterprise intelligence with AI agents
App Orchid is a leader in making data actionable with AI, with a mission to make AI a force for good. The goal is to empower every employee with trusted, understandable, and accessible data. While enterprise data is now a mission-critical asset for organizations, it’s often underutilized, difficult to access, and buried under layers of complexity. To help, App Orchid partnered with Google to build a multi-agent application that provides a new level of precision in operational forecasting.
This innovative solution combines two powerful, specialized AI agents: a prediction agent built by Google Cloud and App Orchid’s Data Agent offering. These agents work in concert to solve complex business problems, acting as complementary specialists — each an expert in its own domain. App Orchid’s agent possesses unparalleled understanding of an enterprise’s past and present, while Google’s agent brings world-class capabilities in predicting the future.
Adopting a multi-agent approach provides clear, tangible advantages that directly address the forecasting problems that often plague businesses, including:
Improved accuracy: Achieve a level of forecasting precision that was previously unattainable, reducing costly errors.
Increased operational efficiency: Automate and streamline the complex processes of data preparation and prediction, freeing up valuable human resources.
Faster insights: Gain real-time, actionable insights, enabling quicker and more informed decision-making.
Reduced costs and increased revenue: Directly impact the bottom line by minimizing inventory waste, reducing stock-outs to maximize sales, and optimizing resource allocation.
Greater agility and adaptability: Rapidly adapt to market shifts and unforeseen disruptions with agile forecasting capabilities.
How it works: Three powerful agents, one seamless solution
To better understand the power of this new agentic framework, it’s first essential to understand how these two AI agents work together as complementary specialists under the direction of a third, orchestrator agent.
1. Google prediction agent – The forecasting powerhouse
The prediction agent, which is primarily the custom engineering work of Google Cloud, is the system’s window to the future. It takes rich, contextualized historical data and applies Google’s state-of-the-art predictive models to generate highly accurate and actionable forecasts. The agent utilizes specialized foundation models like TimesFM, which is pre-trained on billions of data points for time-series forecasting, and the Population Dynamics Foundation Model (PDFM), which analyzes geospatial data to understand demographic similarities. By combining these powerful models, the Google prediction agent helps businesses anticipate future demand and identify market trends with a new level of precision.
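As an illustration of the kind of call such an agent wraps, here is a hedged sketch using the open-source TimesFM package with its public Hugging Face checkpoint. The hyperparameters and API mirror the TimesFM README as we understand it and may differ across library versions.

import numpy as np
import timesfm

# Load the public 200M-parameter checkpoint (a PyTorch variant also exists).
tfm = timesfm.TimesFm(
    hparams=timesfm.TimesFmHparams(backend="cpu", per_core_batch_size=1,
                                   horizon_len=28),
    checkpoint=timesfm.TimesFmCheckpoint(
        huggingface_repo_id="google/timesfm-1.0-200m"),
)

history = np.sin(np.linspace(0, 20, 256))  # stand-in for daily demand history
point_forecast, quantile_forecast = tfm.forecast([history], freq=[0])  # freq 0 = high frequency
print(point_forecast.shape)  # (1, horizon_len)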
2. App Orchid Data Agent – The enterprise intelligence data expert
Accurate predictions depend on high-quality, AI-ready data, which is where App Orchid’s Data Agent excels. This agent acts as the intelligent query engine for your enterprise data, connecting even disparate and siloed information to draw insights and offer informed feedback. The Data Agent utilizes the Google Cloud Cortex Framework for a unified view of business data and then applies App Orchid’s own powerful semantic-layer technology, which closes the “AI context gap” by transforming raw, siloed data into a unified, trustworthy knowledge foundation that grounds AI models for accuracy and scalability.
Together, this creates a “smart data layer” within the Data Agent that maps an organization’s entire data landscape into a unified knowledge graph. This allows the agent to understand the unique context, relationships, and terminology specific to a business — from internal acronyms to complex operational data — and deliver the comprehensive, time-series datasets that are essential for producing accurate predictions.
3. The combined business forecasting agent
At the heart of the solution is a unified business forecasting agent, which brings together the capabilities of our unique prediction and data agents in a discrete instance for the user. While our multi-agent system delivers an automated, end-to-end tool for next-generation forecasting, it’s the business forecasting agent that the user interacts with; few users realize that multiple agents are working behind the scenes, nor do they need to.
The process begins by having the forecasting agent function as an orchestrator of user queries, directing the App Orchid Data Agent to construct a unified, trustworthy knowledge foundation; this is a crucial step for eliminating data silos and prepares the high-context, time-series data required for accurate AI. This smart data layer is then passed to the Google prediction agent, which applies its specialized foundation models to project future outcomes with impressive levels of precision.
The final result is a single, comprehensive forecast delivered back to the user via the orchestrator agent. And by automating the labor-intensive processes of data wrangling and prediction execution, the combined solution empowers business leaders to make instant, highly accurate decisions that directly reduce costly markdowns, prevent stock-outs, maximize sales, and efficiently optimize resource allocation.
The technical glue: A2A Protocol, Google’s Agent Development Kit (ADK), and Gemini
The A2A Protocol enables AI agents, even those built by different teams and organizations, to discover, securely communicate, and collaborate across various systems. This allows developers to unite agents from multiple platforms, improving modularity, reducing vendor lock-in, and speeding up innovation. While Model Context Protocol (MCP) allows developers to connect data and APIs to agents, the agents themselves need a communication layer. The A2A protocol enables the bi-directional agentic communication needed to achieve multi-agent systems.
The Google ADK provides a robust framework for building sophisticated, scalable multi-agent applications. It supports both code-first development (Python and Java SDKs) and no-code, YAML-based configuration for defining agent behavior and orchestrating workflows. The ADK also utilizes MCP Tools, which provide a standardized interface for agents to interact with external systems.
For instance, the MCP Toolbox for Databases provides out-of-the-box support for agents to easily access and query data from a variety of sources, including BigQuery. Agents built with the ADK can be deployed across virtually any environment — from a fully managed, enterprise-grade runtime like Vertex AI Agent Engine to Google Cloud services like Compute Engine, Cloud Run, or Google Kubernetes Engine (GKE), other cloud providers, or on-premises infrastructure.
At the heart of both agents’ intelligence are Google’s Gemini models, which are engineered for sophisticated reasoning and lead the industry on long-context performance, with many models offering context windows of 1 million tokens or more. This massive context window is critical: it enables the data agent to understand natural language queries, enterprise data, and its underlying schema and relationships simultaneously, while enabling the prediction agent to analyze complex historical data and grasp the nuances of forecasting tasks.
This holistic understanding is what makes it possible for both agents to break down complex problems, identify subtle trends, and follow multi-step instructions without losing the initial context. Additionally, native tool use allows the agents to proficiently interact with external systems — whether querying a database or invoking a predictive model — and interpret the outputs to fulfill user requests.
Solution architecture: A secure, managed platform for AI agents
The Google and App Orchid multi-agent application is accessible through Gemini Enterprise, which provides a fully managed platform for organizations to discover, manage, and interact with AI agents, featuring enterprise-grade security, data privacy, and governance. The user-facing agent is deployed on Vertex AI Agent Engine and registered within this secure environment, providing authorized users with a simple entry point to the powerful multi-agent capabilities while ensuring that all interactions are grounded in the company’s private data and adhere to its security and compliance policies.
From an architectural perspective, the business forecasting agent acts as the orchestrator agent of this multi-agent system, managing and directing the entire workflow between the two specialized agents. It uses the A2A protocol to pass instructions and data back and forth, hiding the underlying complexity from the user.
For example, when a user asks the business forecasting agent to predict revenue per marketing channel, the agent:
Calls upon the App Orchid Data Agent to retrieve historical sales data.
Passes that information to the Google prediction agent to run its models and generate the forecast.
Receives the prediction data back and then summarizes it for the user.
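A hedged sketch of this orchestration pattern with the ADK’s Python SDK is below. Agent names, models, and instructions are illustrative, and in the deployed system the two specialists are reached over A2A rather than as in-process sub-agents.

from google.adk.agents import LlmAgent

data_agent = LlmAgent(
    name="data_agent",
    model="gemini-2.5-flash",
    description="Retrieves unified, time-series business data.",
    instruction="Given a forecasting request, assemble the historical data it needs.",
)

prediction_agent = LlmAgent(
    name="prediction_agent",
    model="gemini-2.5-flash",
    description="Generates forecasts from prepared time-series data.",
    instruction="Given prepared time-series data, produce and explain a forecast.",
)

forecasting_agent = LlmAgent(
    name="business_forecasting_agent",
    model="gemini-2.5-flash",
    instruction=(
        "Orchestrate: first delegate data assembly to data_agent, then pass the "
        "result to prediction_agent, then summarize the forecast for the user."
    ),
    sub_agents=[data_agent, prediction_agent],  # delegation targets
)

Keeping the routing logic in the orchestrator’s instruction is what lets the user see a single agent.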
The future is collaborative
As the agentic era gets underway, it is evolving quickly. Our multi-agent approach demonstrates both how true agentic systems are most successful when multiple agents are at play, and the importance of finding strong partners with distinct capabilities to help build and assemble these agentic systems.
The partnership between App Orchid and Google is a testament to the power of collaboration in the AI space. By combining App Orchid’s deep understanding of enterprise data and Google’s global leadership in AI, this solution is greater than the sum of its parts. With the complementary strengths of each agent, enabled by the ADK and A2A Protocol, businesses can achieve levels of forecasting accuracy that were previously unattainable at the speed and ease agents can offer.
This innovative multi-agent system not only optimizes business processes but also paves the way for a future where AI agents from across the industry can collaborate to solve the world’s most complex challenges.
The Google Cloud team would like to thank App Orchid’s CTO Ravi Bommakanti for his contributions to this project.
Editor’s note: Today we hear from Dave McCarthy of IDC about a total cost of ownership crisis for AI infrastructure — and what you can do about it. Read on for his insights.
The AI landscape is undergoing a seismic shift. For the past few years, the industry has been focused on the massive, resource-intensive process of training generative AI models. But the focus is now rapidly pivoting to a new, even larger challenge: inference.
Inference — the process of using a trained model to make real-time predictions — is no longer just one part of the AI lifecycle; it is quickly becoming the dominant workload. In a recent IDC global survey of over 1,300 AI decision-makers, inference was already cited as the largest AI workload segment, accounting for 47% of all AI operations.
This dominance is driven by the sheer volume of real-world applications. While a model is trained periodically, it is used for inference non-stop, with every user query, API call, and recommendation. It is also critical to recognize that this inference surge will be distributed across hybrid environments. According to IDC survey respondents, 63% of workloads will reside in the cloud, which remains the standard for scalable applications like content creation and chatbots. In contrast, 37% will be deployed on on-premises infrastructure, usually related to use cases such as robotics and other systems that interact directly with the physical world.
Now, a new factor is set to multiply this demand: the rise of autonomous and semi-autonomous AI agents.
These “agentic workflows” represent the next logical step in AI, where models don’t just respond to a single prompt but execute complex, multi-step tasks. An AI agent might be asked to “plan a trip to Paris,” requiring it to perform dozens of interconnected operations: browsing for flights, checking hotel availability, comparing reviews, and mapping locations. Each of these steps is an inference operation, creating a cascade of requests that must be orchestrated across different systems.
This surge in demand is exposing a critical vulnerability for many organizations: the AI efficiency gap.
The TCO crisis in an age of agents
The AI efficiency gap is the difference between the theoretical performance of an AI stack and the actual, real-world performance achieved. This gap is the source of a Total Cost of Ownership (TCO) crisis, and it’s driven by system-wide inefficiencies.
Our research shows that more than half (54.3%) of organizations use multiple AI frameworks and hardware platforms. While this flexibility seems beneficial, it has a staggering downside: 92% of these organizations report a negative effect on efficiency.
This fragmented “patchwork” approach, stitched together from disparate and non-optimized services, creates a ripple effect of problems:
41.6% reported increased compute costs: Redundant processes and poor utilization drive up spending.
40.4% reported increased engineering complexity: Teams spend more time managing the fragmented stack than delivering value.
40.0% reported increased latency: Bottlenecks in one part of the system (like storage or networking) degrade the overall performance of an application.
The core problem is that organizations are paying for expensive, high-performance accelerators, but are failing to keep them busy. Our data shows that 29% of all AI budget waste is tied to inference. This waste is a direct result of idle GPU time (cited by 29.4% of respondents) and inefficient use of resources (22.3%).
When an expensive accelerator is idle, it’s often waiting for data from a slow storage system or for the application server to prepare the next request. This is a system-level failure, not a component failure.
This failure is often compounded by significant hurdles in data management, which serves as the fuel for these AI engines. Survey respondents highlighted three primary challenges contributing to this gap: 47.7% struggle with ensuring data quality and governance, 45.6% grapple with data storage management and related costs, and 44.1% cite the complexity and time required for data cleaning and preparation. When data pipelines cannot keep pace with high-speed accelerators, the entire infrastructure becomes inefficient.
Closing the gap: From fragmented stacks to integrated systems
To scale cost-effectively in the age of AI agents, we must stop thinking about individual components and start focusing on system-level design.
An agentic workflow, for example, requires tight coordination between two distinct types of compute:
General-purpose compute: This is the operational backbone. It runs the application servers, orchestrates the workflow, pre-processes data, and handles all the logic around the model.
Specialized accelerators: This is the high-performance engine that runs the AI model itself.
In a fragmented environment, these two sides are inefficiently connected, and latency skyrockets. The path forward is an optimized architecture where the software, networking, storage, and compute — both general-purpose and specialized — are designed to work as a single, cohesive system.
This holistic approach is the only sustainable way to manage the TCO of AI. It redefines the goal away from simply buying faster accelerators and toward improving the overall “price-performance” and “unit economics” of the entire end-to-end workflow. By eliminating bottlenecks and maximizing the utilization of every resource, organizations can finally close the efficiency gap. Organizations are actively shifting strategies to capture this value. Our survey indicates that 28.9% of respondents are prioritizing model optimization techniques, while 26.3% are partnering with AI service providers to navigate this complexity. Additionally, 25% are investing in training to upskill their teams, ensuring they can increase the value of their AI investments.
The age of inference is here, and the age of agents is right behind it. This next wave of innovation will be won not by the organizations with the most powerful accelerators, but by those who build the most efficient, integrated, and cost-effective systems to power them.
A message from Google Cloud
We sponsored this IDC research to help IT leaders navigate the critical shift to the “Age of Inference.” We recognize that the “efficiency gap” identified here — driven by fragmented stacks and idle resources — is the primary barrier to sustainable ROI. That is why we created AI Hypercomputer: an integrated supercomputer system designed to deliver exceptional performance and efficiency for demanding AI workloads.
Today, we’re thrilled to announce that the IDC MarketScape has positioned Google as a Leader in the 2025 IDC MarketScape for Worldwide Hyperscaler Marketplaces. We believe this recognition underscores our commitment to deliver a cloud marketplace experience that fuels the AI agent economy and accelerates innovation. This achievement reflects our dedication to creating an open and interoperable agentic AI ecosystem for our customers and partners.
The IDC MarketScape rigorously evaluated hyperscaler marketplaces based on their strategies and capabilities, incorporating buyer and seller feedback. We believe being named a Leader reflects Google’s strengths in delivering a comprehensive, integrated, and forward-thinking platform for cloud and AI solutions.
According to the IDC MarketScape, Google was recognized for:
Comprehensive solution portfolio and integration: Google Cloud Marketplace offers a broad selection of SaaS, AI agents, foundational models, data, professional services, and other listing categories, all validated for enterprise readiness and tightly integrated with Google Cloud.
AI focus: The platform emphasizes AI innovation through our dedicated AI agent category, Vertex AI tools, and foundational models, enabling customers and partners to co-innovate, monetize AI solutions, and deploy through Gemini Enterprise.
Delivering a trusted platform for validated partner solutions
Google Cloud Marketplace offers a broad array of third-party solutions from our trusted partner ecosystem that help organizations implement value-generating solutions faster. Partner solutions undergo a rigorous solution validation process to ensure they integrate with our first-party AI and cloud solutions for a streamlined customer discovery and deployment experience.
Cloud Marketplace also offers robust cost control and governance capabilities for enterprises, including fine-grained access control through Identity and Access Management, and a granular governance toolkit via Private Marketplace. This helps enterprise clients efficiently align technology procurement with internal requirements.
Activating agentic AI
Cloud Marketplace offers a growing portfolio of agentic and generative AI solutions. Customers can access open source and partner models, along with Google’s own models, that integrate with Vertex AI. Organizations can also access datasets and AI agents that have been validated for Agent2Agent Protocol (A2A) and Gemini Enterprise, further benefiting from our large and open partner ecosystem.
AI-driven discovery features in Cloud Marketplace, including natural-language search and contextual recommendations, help users quickly identify relevant solutions for their business and industry. Our platform supports flexible buying needs through both direct and multi-partner Marketplace Channel Private Offers, providing consistent private offer experiences for AI agents and tools, software, datasets, and professional services.
Google Cloud serves customers in 200+ countries and territories, with Google Cloud Marketplace facilitating global transactions between our customers and partners with localized payment and billing across 65 countries and 30 currencies.
For our partners, we continue to invest in programs and tools to streamline co-selling across the ecosystem. This includes recent enhancements such as industry-standard partner agreements with consistent terms across hyperscalers, ISVs, and channels. Customers and partners benefit from streamlined contract review cycles and faster time to market, while marketplace vendors maintain the flexibility to either use Google’s standard terms or apply their own.
We will also be providing partners with improved deal telemetry and insights to better predict channel-led buying behavior and identify Marketplace Channel Private Offer opportunity potential. In addition, our flagship cloud marketplace incentive program, Marketplace Customer Credit Program, offers an additional 3% in Google Cloud credits when customers purchase an eligible cloud marketplace solution for the first time, whether directly through an ISV or via a chosen channel partner.
Continuously enhancing the Cloud Marketplace experience
Our commitment to fostering rapid innovation and business transformation remains unwavering. We will continue to expand our offerings, particularly in agentic AI capabilities, and empower our customers and partners with the tools they need to accelerate value.
The IDC MarketScape vendor analysis model is designed to provide an overview of the competitive fitness of technology suppliers in a given market. The research methodology utilizes a rigorous scoring methodology based on both qualitative and quantitative criteria that results in a single graphical illustration of each supplier’s position within a given market. The Capabilities score measures supplier product, go-to-market, and business execution in the short term. The Strategy score measures alignment of supplier strategies with customer requirements in a 3-5-year timeframe. Supplier market share is represented by the size of the icons.