Welcome to the second Cloud CISO Perspectives for December 2025. Today, Google Cloud’s Nick Godfrey, senior director, and Anton Chuvakin, security advisor, look back at the year that was.
As with all Cloud CISO Perspectives, the contents of this newsletter are posted to the Google Cloud blog. If you’re reading this on the website and you’d like to receive the email version, you can subscribe here.
aside_block
<ListValue: [StructValue([(‘title’, ‘Get vital board insights with Google Cloud’), (‘body’, <wagtail.rich_text.RichText object at 0x7f39c06c5ca0>), (‘btn_text’, ‘Visit the hub’), (‘href’, ‘https://cloud.google.com/solutions/security/board-of-directors?utm_source=cloud_sfdc&utm_medium=email&utm_campaign=FY24-Q2-global-PROD941-physicalevent-er-CEG_Boardroom_Summit&utm_content=-&utm_term=-‘), (‘image’, <GAEImage: GCAT-replacement-logo-A>)])]>
2025 in review: Highlighting cloud security and evolving AI
By Nick Godfrey, senior director, and Anton Chuvakin, security advisor, Office of the CISO
Nick Godfrey, senior director, Office of the CISO
Cybersecurity is facing a unique moment, where AI-enhanced threat intelligence, products, and services have begun to give defenders an advantage over the threats they face that had proven elusive — until now.
However, threat actors have also begun to take advantage of AI in ways that have moved towards a wider use of tools.
At Google Cloud, we continue to strive towards our goals of bringing simplicity, streamlining operations, and enhancing efficiency and effectiveness for security essentials. AI is now part of that essential security approach, both building AI securely and using AI to boost defenders.
Anton Chuvakin, security advisor, Office of the CISO
Looking back at 2025, we’re sharing our top stories across five vital areas of development in cybersecurity: securing cloud, securing AI, AI-enabled defense, threat intelligence, and building the most trusted cloud.
Securing cloud
This year reinforced the importance of cloud security fundamentals. Cybersecurity risks continue to accelerate with the number and severity of breaches continuing to grow, and more organizations are turning to multi-cloud and hybrid solutions that introduce their own complex management challenges.
2025 was a crucial year as we continued our efforts to build AI securely — and to encourage others to do so, too. From AI governance to building agents securely, we wanted to give our customers the tools they need to secure their AI supply chain and tools.
We have seen some incredible strides towards empowering defenders with AI this year. As defenders guide others on how to secure their use of AI, we must ensure that we also use AI to support stronger defensive action.
As defenders have made significant advances in using AI to boost their efforts this year, government-backed threat actors and cybercriminals have been trying to do the same. At Google, we strongly believe in the power of threat intelligence to enhance defender abilities to respond to critical threats faster and more efficiently.
We continued to enhance our security capabilities and controls on our cloud platform to help organizations secure their cloud environments and address evolving policy, compliance, and business objectives.
As security professionals, we know that threat actors will continue to innovate to achieve their mission objectives. To help defenders proactively prepare for the coming year, we publish our annual forecast report with insights from across Google. We look forward to sharing more insights to help organizations strengthen their security posture in the new year.
For more leadership guidance from Google Cloud experts, please visit ourCISO Insights hub.
Here are the latest updates, products, services, and resources from our security teams so far this month:
How Google Does It: Collecting and analyzing cloud forensics: Here’s how Google’s Incident Management and Digital Forensics team gathers and analyzes digital evidence. Read more.
When securing Web3, remember your Web2 fundamentals: As Web3 matures, the stakes continue to rise. For Web3 to thrive, security should expand beyond the blockchain to protect operational infrastructure. Here’s how. Read more.
How Mandiant can help test and strengthen your cyber resilience: To help teams better prepare for actual incidents, we developed ThreatSpace, a cyber proving ground with all the digital noise of real employee activities. Read more.
Exploiting agency of autonomous AI agents with task injection: Learn what a task injection attack is, how it differs from prompt injection, and how it is particularly relevant to AI agents designed for a wide range of actions and tasks, such as computer-use agents. Read more.
Please visit the Google Cloud blog for more security stories published this month.
aside_block
<ListValue: [StructValue([(‘title’, ‘Join the Google Cloud CISO Community’), (‘body’, <wagtail.rich_text.RichText object at 0x7f39c06c5d90>), (‘btn_text’, ‘Learn more’), (‘href’, ‘https://rsvp.withgoogle.com/events/ciso-community-interest?utm_source=cgc-blog&utm_medium=blog&utm_campaign=2024-cloud-ciso-newsletter-events-ref&utm_content=-&utm_term=-‘), (‘image’, <GAEImage: GCAT-replacement-logo-A>)])]>
Threat Intelligence news
How threat actors are exploiting React2Shell: Shortly after CVE-2025-55182 was disclosed, Google Threat Intelligence Group (GTIG) began observing widespread exploitation across many threat clusters, from opportunistic cybercrime actors to suspected espionage groups. Here’s what GTIG has observed so far. Read more.
Intellexa’s prolific zero-day exploits continue: Despite extensive scrutiny and public reporting, commercial surveillance vendors such as Intellexa continue to operate unimpeded. Known for its “Predator” spyware, new GTIG analysis shows that Intellexa is evading restrictions and thriving. Read more.
APT24’s pivot to multi-vector attacks: GTIG is tracking a long-running and adaptive cyber espionage campaign by APT24, a People’s Republic of China (PRC)-nexus threat actor that has been deploying BADAUDIO over the past three years. Here’s our analysis of the campaign and malware, and how defenders can detect and mitigate this persistent threat. Read more.
Please visit the Google Cloud blog for more threat intelligence stories published this month.
Now hear this: Podcasts from Google Cloud
Bruce Schneier on the AI offense-defense balance: From rewiring democracy to hacking trust, Bruce Schneier discusses the impact of AI on society with hosts Anton Chuvakin and Tim Peacock. Hear his take on whether it will help support liberal democracy more, or boost the forces of corruption, illiberalism, and authoritarianism. Listen here.
The truth about autonomous AI hacking: Heather Adkins, Google’s Security Engineering vice-president, separates the hype from the hazards of autonomous AI hacking, with Anton and Tim. Listen here.
Escaping 1990s vulnerability management: Caleb Hoch, consulting manager for security transformations, Mandiant, discusses with Anton and Tim how vulnerability management has evolved beyond basic scanning and reporting, and the biggest gaps between modern practices and what organizations are actually doing. Listen here.
Adopting a dual offensive-defensive mindset: Betty DeVita, private and public board director and fintech advisor, shares her take on how boards can take on an offensive and defensive approach to cybersecurity for their organizations. Listen here.
To have our Cloud CISO Perspectives post delivered twice a month to your inbox, sign up for our newsletter. We’ll be back in a few weeks with more security-related updates from Google Cloud.
Data teams seem to be constantly balancing the need for governed, trusted metrics with business needs for agility and ad-hoc analysis. To help bridge the gap between managed reporting and rapid data exploration, we are introducing several new features in Looker, to expand users’ self-service capabilities. These updates allow individuals to analyze local data alongside governed models, organize complex dashboards more effectively, and align the look and feel of their analytics with corporate branding, all within the Looker platform.
Analyze ad-hoc data with Looker self-service Explores
Valuable data often exists outside of the primary database — whether in budget spreadsheets, sales lists, or ad-hoc research files. With self-service Explores, now in Preview, users can upload CSV and spreadsheet-based data using a drag-and-drop interface directly within Looker.
This feature allows users to combine local files with fully modeled Looker data to test new theories and enrich insights. Once uploaded, users can visually add new measures and dimensions to their self-service Explores, customize them, and share the results via dashboards and Looks.
Uploading a CSV file and creating a new self-service Explore in just a few clicks
To maintain governance, administrators retain oversight regarding which files are uploaded to the Looker instance and who has permission to perform uploads. Additionally, we have introduced a new content certification flow, which makes it easier to signal which content is the vetted, trusted source of truth, ensuring users can distinguish between ad-hoc experiments and certified data.
Certifying a self-service Explore
Upload data and content certification are available in Public Preview as of Looker 25.20.
Deliver clearer, cohesive data stories with tabbed dashboards
The new tabbed dashboard feature helps dashboard editors organize complex information into logical narratives, moving away from dense, single-page views. Editors can now streamline content creation with controls for adding, renaming, and reordering tabs.
For the viewer, the experience is designed to be seamless. Filters automatically pass values across the entire dashboard, while each tab displays only the filters relevant to the current view, reducing visual clutter. Users can share unique URLs for specific tabs and schedule or download the comprehensive multi-tab dashboard as a single PDF document.
Navigating between tabs on a multi-tab dashboard
This feature is currently available in preview.
Apply custom styling to dashboards
Matching internal dashboards to company branding can help create a familiar data experience and increase user engagement. We are announcing the Public Preview of internal dashboard theming, which allows creators to apply custom changes to tile styles, colors, fonts, and formatting directly to dashboards consumed inside the Looker application.
Applying custom theming for internal dashboards
With this feature, you can save, share, and apply pre-configured themes to ensure consistency. Users with permission to manage internal themes can create new templates for existing dashboards or select a default theme to apply across the entire instance.
You can enable Internal dashboard theming today on the Admin > Labs page.
Enabling the preview for internal dashboard theming
Get started
These new self-service capabilities in Looker are designed to help you and all users in your organization get more value out of your data by improving presentation flexibility and quality. Try self-service Explores and internal dashboard themes for yourself today and let us know your feedback.
In the AI era, when one year can feel like 10, you’re forgiven for forgetting what happened last month, much less what happened all the way back in January. To jog your memory, we pulled the readership data for top product and company news of 2025. And because we publish a lot of great thought leadership and customer stories, we pulled that data too. Long story short: the most popular stories largely mapped to our biggest announcements. But not always — there were more than a few sleeper hits on this year’s list. Read on to relive this huge year, and perhaps discover a few gems that you may have missed.
2025 started strong with important new virtual machine offerings, foundational AI tooling, and tools for both Kubernetes and data professionals. We also launched our “How Google Does It” series, looking at the internal systems and engineering principles behind how we run a modern threat-detection pipeline. We showed developers how to get started with JAX and made AI predictions for the year ahead. Readers were excited to learn about how L’Oréal built its MLOps platform and Deutsche Börse’s pioneering work on cloud-native financial trading.
There are AI products, and then there are products enhanced by AI. This month’s top launch, Gen AI Toolbox for Databases, falls into the latter category. This was also the month readers got serious about learning, with blogs about upskilling, resources, and certifications topping the charts. The fruits of our partnership with Anthropic made an appearance in our best-read list, and engineering leaders detailed Google’s extensive efforts to optimize AI system energy consumption. Execs ate up an opinion piece about how agents will unlock insights into unstructured data (which makes up 90% of enterprises’ information assets), and digested a sobering report on AI and cybercrime. During the Mobile World Congress event, we saw considerable interest in our work with telco leaders like Vodafone Italy and Amdocs.
Back when we announced it, our intent to purchase cybersecurity startup Wiz was Google’s largest deal ever, and the biggest tech deal of the year. We built on that security momentum with the launch of AI Protection. We also spread our wings to the Nordics with a new region, and announced the Gemma 3 open model on Vertex AI. Meanwhile, we explained the threat that North Korean IT workers pose to employers, gave readers a peek under the hood of the Colossus file system, and reminisced about what we’ve learned over 25 years of building data centers. Readers were interested in Levi’s approach to data and weaving it into future AI efforts, and in honor of the GDC Festival of Gaming, our AI partners shared some new perspectives on “living games.”
With April came Google Cloud Next, our flagship annual conference. From Firebase Studio, Ironwood TPUs, and Google Agentspace, to Vertex AI, Cloud WAN, and Gemini 2.5, it’s hard to limit ourselves to just a few stories, there were so many bangers (for the whole list, there’s always the event recap). Meanwhile, our systems team discussed innovations to keep data center infrastructure’s thermal envelope in check. And at the RSA Conference, we unveiled our vision for the agentic security operations center of the future. On the customer front, we highlighted the startups who played a starring role at Next, and took a peek behind the curtain of The Wizard of Oz at Sphere.
School was almost out, but readers got back into learning mode to get certified as generative AI leaders. You were also excited about new gen AI media models in Vertex AI, the availability of Anthropic’s Claude Opus 4 and Claude Sonnet 4. We also learned that you’re very excited to use AI to generate SQL code, and about using Cloud Run as a destination for your AI apps. We outlined the steps for building a well-defined data strategy, and showed governments how AI can actually improve their security posture. And on the customer front, we launched our “Cool Stuff Customers Built” round-ups, and ran stories from Formula E and MLB.
Up until this point, the promise of generative AI was largely around text and code. The launch of Veo 3 changed all that. Developers writing and deploying AI apps saw the availability of GPUs on Cloud Run as a big win, and we continued our steady drumbeat of Gemini innovation with 2.5 Flash and Flash-Lite. We also shared our thoughts on securing AI agents. And to learn how to actually build these agents, readers turned to stories about Box, the British real estate firm Schroders, and French luxury conglomerate LVMH (home of Louis Vuitton, Channel, Sephora and more).
Readers took a break from reading about AI to read about network infrastructure — the new Sol transatlantic cable, to be precise. Then it was back to AI: new video generation models in Vertex; a crucial component for building stateful, context-aware agents; and a new toolset for connecting BigQuery data to Agent Development Kit (ADK) and Multi-Cloud Protocol (MCP) environments. Developers cheered the integration between Cloud Run and Docker Compose, and executive audiences enjoyed a listicle on actionable, real-world uses for AI agents.
On the security front, we took a back-to-basics approach this month, exploring the persistence of some cloud security problems. And then, back to AI again, with our Big Sleep agent. Readers were also interested in how AI is alleviating record-keeping for nurses at HCA Healthcare, Ulta Beauty’s data warehousing and mobile record keeping initiatives, and how SmarterX migrated from Snowflake to BigQuery.
AI is compute- and energy-intensive; in a new technical paper, we released concrete numbers about our AI infrastructure’s power consumption. Then people went [nano] bananas for Gemini 2.5 Flash Image on Vertex AI, and developers got a jump on their AI projects with a wealth of technical blueprints to work from. The summer doldrums didn’t stop our security experts from tackling the serious challenge of cyber-enabled fraud. We also took a closer look at the specific agentic tools empowering workers at Wells Fargo, and how Keeta processes 11 million blockchain transactions per second with Spanner.
AI is cool tech, but how do you monetize it? One answer is the Agent Payment Protocol, or AP2. Developers and data scientists preparing for AI flocked to blogs about new Data Cloud offerings, the 2025 DORA Report, and new trainings. Executives took in our thoughts on building an agentic data strategy, and took notes on the best prompts with which to kickstart their AI usage. And because everybody is impacted by the AI era, including business leaders, we explained what it means to be “bilingual” in AI and security. Then, at Google’s AI Builders Forum, startups described how Google’s AI, infrastructure, and services are supporting their growth. Not to be left out, enterprises like Target and Mr. Cooper also showed off their AI chops.
Welcome to the Gemini Enterprise era, which brings enhanced security, data control, and advanced agent capabilities to large organizations. To help you prepare, we relaunched a variety of enhancements to our learning platform, and added new commerce and security programs. And while developers versed themselves on the finer points of Veo prompts, we discussed securing the AI supply chain, building AI agents for cybersecurity and defense, and a new vision on economic threat modeling. We partnered with PayPal to enable commerce in AI chats, Germany’s Planck Institute showed how AI can help share deep scientific expertise, and DZ Bank pioneered ways to make blockchain-based finance more reliable.
Whether it was Gemini 3, Nana Banana Pro, or our seventh-generation Ironwood TPUs, this was the month that we gave enterprise customers access to all our latest and greatest AI tech. We also did a deep dive on how we built the largest-ever Kubernetes cluster, clocking in at a massive 130,000 nodes, and we announced a new collaboration with AWS to improve connectivity between clouds.
Meanwhile, we updated our findings on the adversarial misuse of AI by threat actors and on the ROI of AI for security, and executives vibed out on our piece about vibe coding. Then, just in time for the holidays, we took a look at how Mattel is using AI tools to revamp its toys, and Waze showed how it uses Memorystore to keep the holiday traffic flowing.
The year is winding down, but we still have lots to say. Early returns show that you were interested in how to mitigate the React2Shell vulnerability, support for MCP across Google services, and the early access launch of AlphaEvolve. And let’s not forget Gemini 3 Flash, which is turning heads with its high-level reasoning, plus amazing speed and a flexible cost profile.
What does this all mean for you and your future? It’s important to contextualize these technology developments, especially AI. For example, the DORA team put together a guide on how high-performing platform teams can integrate AI capabilities into their workflows, we discussed what it looks like to have an AI-ready workforce, and our Office of the CISO colleagues put out their 2026 cybersecurity predictions. More to the point (guard), you could do like Golden State Warrior Stephen Curry and turn to Gemini to analyze your game, to prepare for the year ahead. We’ll be watching on Christmas Day to see how Steph is faring with Gemini’s advice.
In the latest episode ofthe Agent Factory, Mofi Rahmanand I had the pleasure of hosting, Brandon Royal, the PM working on agentic workloads on GKE. We dove deep into the critical questions around the nuances of choosing the right agent runtime, the power of GKE for agents, and the essential security measures needed for intelligent agents to run code.
This post guides you through the key ideas from our conversation. Use it to quickly recap topics or dive deeper into specific segments with links and timestamps.
We kicked off our discussion by tackling a fundamental question: why choose GKE as your agent runtime when serverless options like Cloud Run or fully managed solutions like Agent Engine exist?
Brandon explained that the decision often boils down to control versus convenience. While serverless options are perfectly adequate for basic agents, the flexibility and governance capabilities of Kubernetes and GKE become indispensable in high-scale scenarios involving hundreds or thousands of agents. GKE truly shines when you need granular control over your agent deployments.
We’ve discussed the Agent Development Kit (ADK) in previous episodes, and Mofi highlighted to us how seamlessly it integrates with GKE and even showed a demo with the agent he built. ADK provides the framework for building the agent’s logic, traces, and tools, while GKE provides the robust hosting environment. You can containerize your ADK agent, push it to Google Artifact Registry, and deploy it to GKE in minutes, transforming a local prototype into a globally accessible service.
As agents become more sophisticated and capable of writing and executing code, a critical security concern emerges: the risk of untrusted, LLM-generated code. Brandon emphasized that while code execution is vital for high-performance agents and deterministic behavior, it also introduces significant risks in multi-tenant systems. This led us to the concept of a “sandbox.”
For those less familiar with security engineering, Brandon clarified that a sandbox provides kernel and network isolation. Mofi further elaborated, explaining that agents often need to execute scripts (e.g., Python for data analysis). Without a sandbox, a hallucinating or prompt-injected model could potentially delete databases or steal secrets if allowed to run code directly on the main server. A sandbox creates a safe, isolated environment where such code can run without harming other systems.
So, how do we build this “high fence” on Kubernetes? Brandon introduced the Agent Sandbox on Kubernetes, which leverages technologies like gVisor, an application kernel sandbox. When an agent needs to execute code, GKE dynamically provisions a completely isolated pod. This pod operates with its own kernel, network, and file system, effectively trapping any malicious code within the gVisor bubble.
Mofi walked us through a compelling demo of the Agent Sandbox in action.We observed an ADK agent being given a task requiring code execution. As the agent initiated code execution, GKE dynamically provisioned a new pod, visibly labeled as “sandbox-executor,” demonstrating the real-time isolation. Brandon highlighted that this pod is configured with strict network policies, further enhancing security.
While the Agent Sandbox offers incredible security, the latency of spinning up a new pod for every task is a concern. Mofi demoed the game-changing solution: Pod Snapshots. This technology allows us to save their state of running sandboxes and then near-instantly restore them when an agent needs them. Brandon noted that this reduces startup times from minutes to seconds, revolutionizing real-time agentic workflows on GKE.
Conclusion
It’s incredible to see how GKE isn’t just hosting agents; it’s actively protecting them and making them faster.
Your turn to build
Ready to put these concepts into practice? Dive into the full episode to see the demos in action and explore how GKE can supercharge your agentic workloads.
In computing’s early days of the 1940s, mathematicians discovered a flawed assumption about the behavior of round-off errors. Instead of canceling out, fixed-point arithmetic accumulated errors, compromising the accuracy of calculations. A few years later, “random round-off” was proposed, which would round up or down based on a random probability proportional to the remainder.
In today’s age of generative AI, we face a new numerical challenge. To overcome memory bottlenecks, the industry is shifting to lower precision formats like FP8 and emerging 4-bit standards. However, training in low precision is fragile. Standard rounding destroys the tiny gradient updates driving learning, causing model training to stagnate. That same technique from the 1950s, now known as stochastic rounding, is allowing us to train massive models without losing the signal. In this article, you’ll learn how frameworks like JAX and Qwix apply this technique on modern Google Cloud hardware to make low-precision training possible.
When Gradients Vanish
The challenge in low-precision training is vanishing updates. This occurs when small gradient updates are systematically rounded to zero by “round to nearest” or RTN arithmetic. For example, if a large weight is 100.0 and the learning update is 0.001, a low-precision format may register 100.001 as identical to 100.0. The update effectively vanishes, causing learning to stall.
Let’s consider the analogy of a digital swimming pool that only records the water level in whole gallons. If you add a teaspoon of water, the system rounds the new total back down to the nearest gallon. This effectively deletes your addition. Even if you pour in a billion teaspoons one by one, the recorded water level never rises.
Precision through Probability
Stochastic rounding, or SR for short, solves this by replacing deterministic rounding rules with probability. For example, instead of always rounding 1.4 down to 1, SR rounds it to 1 with 60% probability and 2 with 40% probability.
Mathematically, for a value x in the interval [⌊x⌋,⌊x⌋+1], the definition is:
The defining property is that SR is unbiased in expectation:
Stochastic Rounding: E[SR(x)] = x
Round-to-Nearest: E[RTN(x)] ≠ x
To see the difference, look at our 1.4 example again. RTN is deterministic: it outputs 1 every single time. The variance is 0. It is stable, but consistently wrong. SR, however, produces a noisy stream like 1, 1, 2, 1, 2.... The average is correct (1.4), but the individual values fluctuate.
We can quantify the “cost” of zero bias with the variance formula:
Var(SR(x))=p(1-p)wherep=x-⌊x⌋
In contrast, RTN has zero variance, but suffers from fast error accumulation. In a sum of N operations, RTN’s systematic error can grow linearly (O(N)). If you consistently round down by a tiny amount, those errors stack up fast.
SR behaves differently. Because the errors are random and unbiased, they tend to cancel each other out. This “random walk” means the total error grows only as the square root of the number of operations O(√N).
While stochastic rounding introduces noise, the tradeoff can often be benign. In deep learning, this added variance often acts as a form of implicit regularization, similar to dropout or normalization, helping the model escape shallow local minima and generalize better.
Google’s TPU architecture includes native hardware support for stochastic rounding in the Matrix Multiply Unit (MXU). This allows you to train in lower-precision formats like INT4, INT8 and FP8 without meaningful degradation of model performance.
You can use Google’s Qwix library, a quantization toolkit for JAX that supports both training (QAT) and post-training quantization (PTQ). Here is how you might configure it to quantize a model in INT8, explicitly enabling stochastic rounding for the backward pass to prevent vanishing updates:
code_block
<ListValue: [StructValue([(‘code’, “import qwixrnrn# Define quantization rules selecting which layers to compressrnrules = [rn qwix.QtRule(rn module_path=’.*’,rn weight_qtype=’int8′,rn act_qtype=’int8′,rn bwd_qtype=’int8′, # Quantize gradientsrn bwd_stochastic_rounding=’uniform’, # Enable SR for gradientsrn )rn]rnrn# Apply Quantization Aware Training (QAT) rulesrnmodel = qwix.quantize_model(model, qwix.QtProvider(rules))”), (‘language’, ‘lang-py’), (‘caption’, <wagtail.rich_text.RichText object at 0x7fcbc7d66f10>)])]>
Qwix abstracts the complexity of low-level hardware instructions, allowing you to inject quantization logic directly into your model’s graph with a simple configuration.
NVIDIA Blackwell & A4X VMs
The story is similar if you are using NVIDIA GPUs on Google Cloud. You can deploy A4X VMs, the industry’s first cloud instance powered by the NVIDIA GB200 NVL72 system. These VMs connect 72 Blackwell GPUs into a single supercomputing unit, the AI Hypercomputer.
Blackwell introduces native hardware support for NVFP4, a 4-bit floating-point format that utilizes a block scaling strategy. To preserve accuracy, the NVFP4BlockScaling recipe automatically applies stochastic rounding to gradients to avoid bias, along with other advanced scaling techniques.
When you wrap your layers in te.autocast with this recipe, the library engages these modes for the backward pass:
By simply entering this context manager, the A4X’s GB200 GPUs perform matrix multiplications in 4-bit precision while using stochastic rounding for the backward pass, delivering up to 4x higher training performance than previous generations without compromising convergence.
Best Practices for Production
To effectively implement SR in production, first remember that stochastic rounding is designed for training only. Because it is non-deterministic, you should stick to standard Round-to-Nearest for inference workloads where consistent outputs are required.
Second, use SR as a tool for debugging divergence. If your low-precision training is unstable, check your gradient norms. If they are vanishing, enabling SR may help, while exploding gradients suggest problems elsewhere.
Finally, manage reproducibility carefully. Since SR relies on random number generation, bit-wise reproducibility is more challenging. Always set a global random seed, for example, using jax.random.key(0), to ensure that your training runs exhibit “deterministic randomness,” producing the same results each time despite the internal probabilistic operations.
Stochastic rounding transforms the noise of low-precision arithmetic into the signal of learning. Whether you are pushing the boundaries with A4X VMs or Ironwood TPUs, this 1950’s numerical method is the key to unlocking the next generation of AI performance.
Connect on LinkedIn, X, and Bluesky to continue the discussion about the past, present, and future of AI infrastructure.
You’ve built a powerful AI agent. It works on your local machine, it’s intelligent, and it’s ready to meet the world. Now, how do you take this agent from a script on your laptop to a secure, scalable, and reliable application in production? On Google Cloud, there are multiple paths to deployment, each offering a different developer experience.
For teams seeking the simplest path to production, Vertex AI Agent Engine removes the need to manage web servers or containers entirely. It provides an opinionated environment optimized for python agents, where you define the agent’s logic, and the platform handles the execution, memory, and tool invocation.
aside_block
<ListValue: [StructValue([(‘title’, ‘Start the lab!’), (‘body’, <wagtail.rich_text.RichText object at 0x7fcbc59891f0>), (‘btn_text’, ”), (‘href’, ”), (‘image’, None)])]>
The Serverless Experience: Cloud Run
For teams that want the flexibility of containers without the operational overhead, Cloud Run abstracts away the infrastructure, allowing you to deploy your agent as a container that automatically scales up when busy and down to zero when quiet.
This path is particularly powerful if you need to build in languages other than Python, use custom frameworks, or integrate your agent into existing declarative CI/CD pipelines.
aside_block
<ListValue: [StructValue([(‘title’, ‘Start the lab!’), (‘body’, <wagtail.rich_text.RichText object at 0x7fcbc5989400>), (‘btn_text’, ”), (‘href’, ”), (‘image’, None)])]>
The Orchestrated Experience: Google Kubernetes Engine (GKE)
For teams that need precise configuration over their environment, GKE is designed to manage that complexity. This path shows you how an AI agent functions not just as a script, but as a microservice within a broader orchestrated cluster.
aside_block
<ListValue: [StructValue([(‘title’, ‘Start the lab!’), (‘body’, <wagtail.rich_text.RichText object at 0x7fcbc5989460>), (‘btn_text’, ”), (‘href’, ”), (‘image’, None)])]>
Your Path to Production
Whether you are looking for serverless speed, orchestrated control, or a fully managed runtime, these labs provide the blueprint to get you there.
These labs are part of the Deploying Agents module in our official Production-Ready AI with Google Cloud program. Explore the full curriculum for more content that will help you bridge the gap from a promising prototype to a production-grade AI application.
Share your progress and connect with others on the journey using the hashtag #ProductionReadyAI. Happy learning!
The White House’s Genesis Mission has set a bold ambition for our nation: to double our scientific productivity within the decade and harness artificial intelligence (AI) to accelerate the pace of discovery. This requires a profound transformation in our national scientific enterprise, one that seamlessly integrates high-performance computing, world-class experimental facilities, and AI. The challenge is no longer generating exabytes of exquisite data from experiments and simulations, but rather curating and exploring it using AI to accelerate the discoveries hidden within.
Through our Genesis Mission partnership with the Department of Energy (DOE), Google is committed to powering this new era of federally-funded scientific discovery with the necessary tools and platforms.
State-of-the-art reasoning for science
The National Labs can take advantage of Gemini for Government—a secure platform with an accredited interface that provides scaled access to a new class of agentic tools designed to augment the scientific process. This includes access to the full capabilities of Gemini, our most powerful and general-purpose AI model. Its native multimodal reasoning operates across the diverse data types of modern science. This means researchers can ask questions in natural language to generate insights grounded in selected sources—from technical reports, code, and images, to a library of enterprise applications, and even organizational and scientific datasets.
In addition to the Gemini for Government platform, the National Labs will have access to several Google technologies that support their mission. Today, Google DeepMind announced an accelerated access program for all 17 National Labs, beginning with AI co-scientist—a multi-agent virtual collaborator built on Gemini that can accelerate hypothesis development from years to days—with plans to expand to other frontier AI tools in 2026.
Google Cloud provides the secure foundation to bring these innovations to the public sector. By making these capabilities commercially available through our cloud infrastructure, we are ensuring that the latest frontier AI models and tools from Google DeepMind are accessible for the mission-critical work of our National Labs.
Accelerating the research cycle with autonomous workflows
Gemini for Government brings together the best of Google accredited cloud services, industry-leading Gemini models, and agentic solutions. The platform is engineered to enable autonomous workflows that orchestrate complex tasks.
A prime example is Deep Research, which can traverse decades of scientific literature and experimental databases to identify previously unseen connections across different research initiatives or flag contradictory findings that warrant new investigation. By automating complex computational tasks, like managing large-scale simulation ensembles or orchestrating analysis pipelines across hybrid cloud resources, scientists can dramatically accelerate the ‘design-build-test-learn’ cycle, freeing up valuable time for the creative thinking that drives scientific breakthroughs.
To ensure agencies can easily leverage these advanced capabilities—including the DOE and its National Laboratories—Gemini for Government is available under the same standard terms and pricing already established for all federal agencies through the General Services Administration’s OneGov Strategy. This streamlined access enables National Labs to quickly deploy an AI-powered backbone for their most complex, multi-lab research initiatives.
A secure fabric for big team science
The future of AI-enabled research requires interconnected experimental facilities, data repositories, and computing infrastructure stewarded by the National Labs.
Gemini for Government provides a secure, federated foundation required to reimagine “Big Team Science,” creating a seamless fabric connecting the entire DOE complex. AI models and tools in this integrated environment empower researchers to weave together disparate datasets from the field to the benchtop, and combine observations with models, revealing more insights across vast temporal and spatial scales.
Ultimately, this transformation can change the nature of discovery, creating a frictionless environment where AI manages complex workflows, uncovers hidden insights, and acts as a true creative research partner to those at our National Labs.
Learn more about Gemini for Government by registering for Google Public Sector Summit On-Demand. Ready to discuss how Gemini for Government can address your organization’s needs? Please reach out to our Google Public Sector team at geminiforgov@google.com.
Today’s AI capabilities provide a great opportunity to enable natural language (NL) interactions with your enterprise data through applications using text and voice. In fact, in the world of agentic applications, natural language is rapidly becoming the interaction standard. That means agents need to be able to issue natural language questions to a database and receive accurate answers in return. At Google Cloud, this drove us to build Natural-Language-to-SQL (NL2SQL) technology in the AlloyDB database that can receive a question as input and return a NL result, or the SQL query that will help you retrieve it.
Currently in preview, the AlloyDB AI natural language API enables developers to build an agentic application that answers natural language questions on their database data by agents or end users in a secure, business-relevant, explainable manner, with accuracy approaching 100% — and we’re focused on bringing this capability to a broader set of Google Cloud databases.
When we first released the API in 2024, it already provided leading NL2SQL accuracy, albeit not close to 100%. But leading accuracy isn’t enough. In many industries, it’s not sufficient to translate text into SQL with accuracy of 80% or even 90%. Low-quality answers carry a real cost, often measurable in monetary terms: disappointed customers or poor business decisions. A real estate search application that fails to understand what the end user is asking for (their “intent”) risks becoming irrelevant. In retail product search, less relevant answers lead to lower conversions into sales. In other words, the accuracy of the text-to-SQL translation must almost always be extremely high.
In this blog we help you understand the value of the AlloyDB AI natural language API and techniques for maximizing the accuracy of its answers.
Getting to ~100% accurate and relevant results
Achieving highly accurate text-to-SQL takes more than just prompting Gemini with a question. Rather, when developing your app, you need to provide AlloyDB AI with descriptive context, including descriptions of the database tables and columns; this context can be autogenerated. Then, when the AlloyDB AI natural language API receives a question, it can intelligently retrieve the relevant pieces of descriptive context, enabling Gemini to see how the question relates to the database data.
Still, many of our customers asked us for explainable, certifiable and business-relevant answers that would enable them to reach even higher accuracy, approaching 100% (such as >95% or even higher than 99%), for their use cases.
The latest preview release of the AlloyDB AI natural language API provides capabilities for improving your answers in several ways:
Business relevance:Answers should contain and properly rank information in order to improve business metrics, such as conversions or end-user engagement.
Explainability: Results should include anexplanation of intent that clarifies — in language that end users can understand — what the NL API understood the question to be. For example, when a real estate app interprets the question “Can you show me Del Mar homes for families?” as “Del Mar homes that are close to good schools”, it explains its interpretation to the end user.
Verified results:The result should always be consistent with the intent, as it was explained to the user or agent.
Accuracy: The result should correctly capture the intent of the question.
With this, the AlloyDB AI natural language API enables you to progressively improve accuracy for your use case, what’s sometimes referred to as “hill-climbing”. As you work your way towards 100% accuracy, AlloyDB AI’s intent explanations mitigate the effect of the occasional remaining inaccuracies, allowing the end user or agent to understand that the API answered a slightly different question than the one they intended to ask.
aside_block
<ListValue: [StructValue([(‘title’, ‘Get started with a 30-day AlloyDB free trial instance’), (‘body’, <wagtail.rich_text.RichText object at 0x7f5bdd3e8fa0>), (‘btn_text’, ”), (‘href’, ”), (‘image’, None)])]>
Hill-climbing to approximate 100% accuracy
Iteratively improving the accuracy of AlloyDB AI happens via a simple workflow.
First, you start with the NL2SQL API that AlloyDB AI provides out of the box. It’s highly (although not perfectly) accurate thanks to its built-in agent that translates natural language questions into SQL queries, as well as automatically generated descriptive context that is used by the included agent.
Next, you can quickly iterate to hill-climb to approximately 100% accuracy and business relevance by improving context. Crucially, in the AlloyDB AI natural language API, context comes in two forms:
Descriptive context, which includes table and column descriptions, and
Prescriptive context, which includes SQL templates and (condition) facets, allowing you to control how the NL request is translated to SQL.
Finally, a “value index” disambiguates terms (such as SKUs and employee names) that are private to your database, and thus that are not immediately clear to foundation models.
The ability to hill-climb to approximate 100% accuracy flexibly and securely relies on two types of context and the value index in AlloyDB.
Let’s take a deeper look at context and the value index.
1. Descriptive and prescriptive context
As mentioned above, the AlloyDB AI natural language API relies on descriptive and prescriptive context to improve the accuracy of the SQL code it generates.
By improving descriptive context, mostly table and column descriptions, you increase the chances that the SQL queries employ the right tables and columns in the right roles. However, prescriptive context resolves a harder problem: accurately interpreting difficult questions that matter for a given use case. For example, an agentic real-estate application may need to answer a question such as “Can you show me homes near good schools in <provided city>?” Notice the challenges:
What exactly is “near”?
How do you define a “good” school?
Assuming the database provides ratings, what is the cutoff for a good school rating?
What is the optimal tradeoff (for ranking purposes and thus for business relevance of the top results) between distance from the school and ranking of the school when the solutions are presented as a list?
To help, the AlloyDB natural language API lets you supply templates, which allow you to associate a type of question with a parameterized SQL query and a parameterized explanation. This enables the AlloyDB NL API to accurately interpret natural language questions that may be very nuanced; this makes templates a good option for frequently asked, nuanced questions.
A second type of prescriptive context, facets, allows you to provide individual SQL conditions along with their natural language counterparts. Facets enable you to combine the accuracy of templates with the flexibility of searching over a gigantic number of conditions. For example, “near good schools” is just one of many conditions. Others may be price, “good for a young family”, “ocean view” or others. Some are combinations of these conditions, such as “homes near good schools with ocean views”. But you can’t have a template for each combination of conditions. In the past, to accommodate all these conditions, you could have tried to create a dashboard with a search field for every conceivable condition, but it would have become very unwieldy, very fast. Instead, when you use a natural language interface, you can use facets to cover any number of conditions, even in a single search field. This is where the strength of a natural language interface really shines!
The AlloyDB AI natural language API facilitates the creation of descriptive and prescriptive context. For example, rather than providing parameterized questions, parameterized intent explanations, and parameterized SQL, just add a template via the add_template API, in which you provide an example question (“Del Mar homes close to good schools”) and the correct corresponding SQL. AlloyDB AI automatically generalizes this question to handle any city and automatically prepares an intent explanation.
2. The value index
The second key enabler of approximate 100% accuracy is the AlloyDB AI value index, which disambiguates terms that are private to your database and, thus, not known to the underlying foundation model. Private terms in natural language questions pose many problems. For starters, users misspell words, and, indeed, misspellings increase with a voice interface. Second, natural language questions don’t always spell out a private term’s entity type. For instance, a university administrator may ask “How did John Smith perform in 2025?” without specifying whether John Smith is faculty or a student; each case requires a different SQL query to answer the question. The value index clarifies what kind of entity “John Smith” is, and can be automatically created by AlloyDB AI for your application.
Natural language search over structured, unstructured and multimodal data
When it comes to applications that provide search over structured data, the AlloyDB AI natural language API enables a clean and powerful search experience. Traditionally, applications present conditions as filters in the user interface that the end user can employ to narrow their search. In contrast, an NL-enabled application can provide a simple chat interface or even take voice commands that directly or indirectly pose any combination of search conditions, and still answer the question. Once search breaks free from the limitations of traditional apps, the possibilities for completely new user experiences really open up.
The combination of the NL2SQL technology with AI search features also makes it good for querying combinations of structured, unstructured and multimodal data.The AlloyDB AI natural language API can generate SQL queries that include vector search, text search and other AI search features such as the AI.IF condition, which enables checking semantic conditions on text and multimodal data. For example, our real estate app may be asked about “Del Mar move-in ready houses”. This would result in a SQL query with an AI.IF function that checks whether the text in the description column of the real_estate.properties table is similar to “move-in ready”.
Bringing the AlloyDB AI natural language API into your agentic application
Ready to integrate the AlloyDB AI natural language API into your agentic application? If you’re writing AI tools (functions) to retrieve data from AlloyDB, give MCP Toolbox for Databases a try. Or for no-code agentic programming, you can use Gemini Enterprise. For example, you can create a conversational agentic application that uses Gemini to answer questions from its knowledge of the web and the data it draws from your database — all without writing a single line of code! Either way, we look forward to seeing what you build.
The complexity of the infrastructure behind AI training and high performance computing (HPC) workloads can really slow teams down. At Google Cloud, where we work with some of the world’s largest AI research teams, we see it everywhere we go: researchers hampered by complex configuration files, platform teams struggling to manage GPUs with home-grown scripts, and operational leads battling the constant, unpredictable hardware failures that derail multi-week training runs. Access to raw compute isn’t enough. To operate at the cutting edge, you need reliability that survives hardware failures, orchestration that respects topology, and a lifecycle management strategy that adapts to evolving needs.
Today, we are delivering on those requirements with the General Availability (GA) of Cluster Director and the Preview of Cluster Director support for Slurm on Google Kubernetes Engine (GKE).
Cluster Director (GA) is a managed infrastructure service designed to meet the rigorous demands of modern supercomputing. It replaces fragile DIY tooling with a robust topology-aware control plane that handles the entire lifecycle of Slurm clusters, from the first deployment to the thousandth training run.
We are expanding Cluster Director to support Slurm on GKE (Preview), designed to give you the best of both worlds: the familiar precision of high-performance scheduling and the automated scale of Kubernetes. It achieves this by treating GKE node pools as a direct compute resource for your Slurm cluster, allowing you to scale your workloads with Kubernetes’ power without changing your existing Slurm workflows.
Cluster Director, now GA
Cluster Director offers advanced capabilities at each phase of the cluster lifecycle, spanning preparation (Day 0), where infrastructure design and capacity are determined; deployment (Day 1), where the cluster is automatically deployed and configured; and monitoring (Day 2), where performance, health, and optimization are continuously tracked.
This holistic approach ensures that you get the benefits of fully configurable infrastructure while automating lower-level operations so your compute resources are always optimized, reliable, and available.
So, what does all this cost? That’s the best part. There’s no extra charge to use Cluster Director. You only pay for the underlying Google Cloud resources — your compute, storage, and networking.
How Cluster Director supports each phase of deployment
Day 0: Preparation
Standing up a cluster typically involves weeks of planning, wrangling Terraform, and debugging the network. Cluster Director changes the ‘Day 0’ experience entirely, with tools for designing infrastructure topology that’s optimized for your workload requirements.
To streamline your Day 0 setup, Cluster Director provides:
Reference architectures: We’ve codified Google’s internal best practices into reusable cluster templates, enabling you to spin up standardized, validated clusters in minutes. This helps ensure that every team in your organization is using the same security standards for their deployments and deploying on infrastructure that is configured correctly by default — right down to the network topology and storage mounting.
Guided configuration: We know that having too many options can lead to configuration paralysis. The Cluster Director control plane guides you through astreamlined setup flow. You select your resources, and our system handles the complex backend mapping, ensuring that storage tiers, network fabrics, and compute shapes are compatible and optimized before you deploy.
Broad hardware support: Cluster Director offers full support for large-scale AI systems, including Google Cloud’s A4X and A4X Max VMs powered by NVIDIA GB200 and GB300 GPUs, and versatile CPUs such as N2 VMs for cost-effective login nodes and debugging partitions.
Flexible consumption options: Cluster Director integrates with your preferred procurement strategy, with support for Reservations for guaranteed capacity during critical training runs, Dynamic Workload Scheduler Flex-start for dynamic scaling, or Spot VMs for opportunistic low-cost runs.
“Google Cloud’s Cluster Director is optimized for managing large-scale AI and HPC environments. It complements the power and performance of NVIDIA’s accelerated computing platform. Together, we’re providing customers with a simplified, powerful, and scalable solution to tackle the next generation of computing challenges.“ – Dave Salvator, Director of Accelerated Computing Products, NVIDIA
Day 1: Deployment
Deploying hardware is one thing, but maximizing performance is another thing entirely. Day 1 is the execution phase, where your configuration transforms into a fully operational cluster. The good news is that Cluster Director doesn’t just provision VMs, it validates that your software and hardware components are healthy, properly networked, and ready to accept the first workload.
To ensure a high-performance deployment, Cluster Director automates:
Getting a clean “bill of health”: Before your job ever touches a GPU, Cluster Director runs a rigorous suite of health checks, including DCGMI diagnostics and NCCL performance validation, to verify the integrity of the network, storage, and accelerators.
Keeping accelerators fed with data: Storage throughput is often the silent killer of training efficiency. That’s why Cluster Director fully supports Google Cloud Managed Lustre with selectable performance tiers, allowing you to attach high-throughput parallel storage directly to your compute nodes, so your GPUs are never starved for data.
Maximizing Interconnect Performance: To achieve peak scaling, Cluster Director implements topology-aware scheduling and compact placement policies. By utilizing dense reservations on Google’s non-blocking fabric, the system ensures that your distributed workloads are placed on the shortest physical path possible, minimizing tail latency and maximizing collective communication (NCCL) speeds from the get-go.
Day 2: Monitoring
The reality of AI and HPC infrastructure is that hardware fails and requirements change. A rigid cluster is an inefficient cluster. As you move into the ongoing “Day 2” operational phase, you need to maintain cluster health, maximize utilization and performance. Cluster Director provides a control plane equipped for the complexities of long-term operations. Today we are introducing new active cluster management capabilities to handle the messy reality of Day 2 operations.
New active cluster management capabilities include:
Topology-level visibility: You can’t orchestrate what you can’t see. Cluster Director’s observability graphs and topology grids let you visualize your entire fleet, spot thermal throttles or interconnect issues, and optimize job placement based on physical proximity.
One-click remediation: When a node degrades, you shouldn’t have to SSH in to debug it. Cluster Director allows you to replace faulty nodes with a single click directly from the Google Cloud console. The system handles the draining, teardown, and replacement, returning your cluster to full capacity in minutes.
Adaptive infrastructure: When your research needs change, so should your cluster. You can now modify active clusters, with activities such as adding or removing storage filesystems, on the fly, without tearing down the cluster or interrupting ongoing work.
Cluster Director support for Slurm on GKE, now in preview
Innovation thrives in the open. Google, the creator of Kubernetes, and SchedMD, the developers behind Slurm, have long championed the open-source technologies that power the world’s most advanced computing. For years, NVIDIA and SchedMD have worked in lockstep to optimize GPU scheduling, introducing foundational features like the Generic Resource (GRES) framework and Multi-Instance GPU (MIG) support that are essential for modern AI. By acquiring SchedMD, NVIDIA is doubling down on its commitment to Slurm as a vendor-neutral standard, ensuring that the software powering the world’s fastest supercomputers remains open, performant, and perfectly tuned for the future of accelerated computing.
Building on this foundation of accelerated computing, Google is deepening its collaboration with SchedMD to answer a fundamental industry challenge: how to bridge the gap between cloud-native orchestration and high-performance scheduling. We are excited to announce the Preview of Cluster Director support for Slurm on GKE, utilizing SchedMD’s Slinky offering.
This initiative brings together the two standards of the infrastructure world. By running a native Slurm cluster directly on top of GKE, we are amplifying the strengths of both communities:
Researchers get the uncompromised Slurm interface and batch capabilities, such as sbatch and squeue, that have defined HPC for decades.
Platform teams gain the operational velocity that GKE, with its auto-scaling, self-healing, and bin-packing, brings to the table.
Slurm on GKE is strengthened by our long-standing partnership with SchedMD, which helps create a unified, open, and powerful foundation for the next generation of AI and HPC workloads. Request preview access now.
Try Cluster Director today
Ready to start using Cluster Director for your AI and HPC cluster automation?
Learn more about the end-to-end capabilities in documentation.
For most organizations, the question is no longer if they will use AI, but how to scale it from a promising prototype into a production-grade service that drives business outcomes. In this age of inference, competitive advantage is defined by your ability to serve useful information to users around the world at the lowest possible cost. As you move from demos to production deployments at scale, you need to simplify infrastructure operations with integrated systems that provide the latest AI software and accelerator hardware platforms, while keeping costs and architectural complexity low.
Yesterday, Forrester released The Forrester Wave™: AI Infrastructure Solutions, Q4 2025 report, evaluating 13 vendors, and we believe their findings validate our commitment to solving these core challenges. Google received the highest score of all vendors in the Current Offering category and received the highest possible score in 16 out of 19 evaluation criteria, including, but not limited to: Vision, Architecture, Training, Inferencing, Efficiency, and Security.
Accelerating time-to-value with an integrated system
Enterprises don’t run AI in a vacuum. They need to integrate it with a diverse range of applications and databases while adhering to stringent security protocols. Forrester recognized Google Cloud’s strategy of co-design by giving us the highest possible score in the Efficiency and Scalability criteria:
“Google pursues a strategy of silicon-infrastructure co-design. It develops TPUs to improve inference efficiency and NVIDIA GPUs for access to broader ecosystem compatibility. Google designs TPUs to integrate tightly with its networking fabric, giving customers high bandwidth and low latency for inference at scale.”
For over two decades, we have operated some of the world’s largest services, from Google Search and YouTube to Maps, where their unprecedented scale required us to solve problems that no one else had. We couldn’t simply buy the platform and infrastructure we needed; we had to invent it. This led to a decade-long journey of deep, system-level co-design, building everything from our custom network fabric and specialized accelerators to frontier models, all under one roof.
The result was an integrated supercomputing system, AI Hypercomputer, which has paid significant dividends for our customers. It supports a wide range of AI-optimized hardware, allowing you to optimize for granular, workload-level objectives — whether that’s higher throughput, lower latency, faster time-to-results, or lower TCO. That means you can use our custom Tensor Processing Units (TPUs), the latest NVIDIA GPUs, or both, backed by a system that tightly integrates accelerators with networking and storage for exceptional performance and efficiency. It’s also why today, leading generative AI companies such as Anthropic, Lightricks, and LG AI Research trust Google Cloud to power their most demanding AI workloads.
This system-level integration lays the foundation for speed, but operational complexity could still slow you down. To accelerate your time-to-market, we provide multiple ways to deploy and manage AI infrastructure, abstracting away the heavy lifting regardless of your preferred workflow. Google Kubernetes Engine (GKE) Autopilot automates management for containerized applications, helping customers like LiveX.AI reduce operational costs by 66%. Similarly, Cluster Director simplifies deployment for Slurm-based environments, enabling customers like LG AI Research to slash setup time from 10 days to under one day.
Managing AI cost and complexity
Forrester gave Google Cloud the highest scores possible in the Pricing Flexibility and Transparency criterion. The price of compute is only one part of the AI infrastructure cost equation. A complete view should also account for development costs, downtime and inefficient resource utilization. We offer optionality at every layer of the stack to provide the flexibility businesses demand.
Flexible consumption: Dynamic Workload Scheduler allows you to secure compute at up to 50% savings, by ensuring you only pay for the capacity you need, when you need it.
Load balancing: GKE Inference Gateway improves throughput by using AI-aware routing to balance requests across models, preventing bottlenecks and ensuring servers aren’t sitting idle.
Eliminating data bottlenecks: Anywhere Cache co-locates data with compute, reducing read latency by up to 96% and eliminating the “integration tax” of moving data. By using Anywhere Cache together with our unified data platform BigQuery, you can avoid latency and egress fees while keeping your accelerators fed with data.
Mitigating strategic risk through flexibility and choice
We are also committed to enabling customer choice across accelerators, frameworks and multicloud environments. This isn’t new for us. Our deep experience with Kubernetes, which we developed then open-sourced, taught us that open ecosystems are the fastest path to innovation and provide our customers with the most flexibility. We are bringing that same ethos to the AI era by actively contributing to the tools you already use.
Open source frameworks and hardware portability: We continue to support open frameworks such as PyTorch, JAX, and Keras. We’ve also directly addressed concerns about workload portability on custom silicon by investing in TPU support for vLLM, allowing developers to easily switch between TPUs and GPUs (or use both) with only minimal configuration changes.
Hybrid and multicloud flexibility: Our commitment to choice extends to where you run your applications. Google Distributed Cloud brings our services to on-premises, edge and cloud locations, while Cross-Cloud Network securely connects applications and users with high-speed connectivity between your environments and other clouds. This powerful combination means you’re no longer locked into a specific environment; you can easily migrate workloads and apply uniform management practices, streamlining operations, and mitigating the risk of lock-in.
Systems you can rely on
When your entire business model depends on the availability of AI services, infrastructure uptime is critical. Google Cloud’s global infrastructure is engineered for enterprise-grade reliability, an approach rooted in our history as the birthplace of Site Reliability Engineering (SRE).
We operate one of the world’s largest private software-defined networks, handling approximately 25% of global internet egress traffic. Unlike providers that rely on the public internet, we keep your traffic on Google’s own fiber to improve speed, reliability, and latency. This global backbone is powered by our Jupiter data center fabric, which scales to 13 Petabits/sec of bandwidth, delivering 50x greater reliability than previous generations — to say nothing of other providers. Finally, to improve cluster-level fault tolerance, we employ capabilities like elastic training and multi-tier checkpointing, which allow jobs to continue uninterrupted, by dynamically resizing the cluster around failed nodes while minimizing the time to recovery.
Building on a secure foundation
Our approach is to secure AI from the ground up. In fact, Google Cloud maintains a leading track record for cloud security. Independent analysis from cloudvulndb.org (2024-2025) shows that our platform has up to 70% fewer critical and high vulnerabilities compared to the other two leading cloud providers. We were also the first in the industry to publish an AI/ML Privacy Commitment, which guarantees that we do not use your data to train our models. With those safeguards in place, security is integrated into the foundation of Google Cloud, based on the zero-trust principles that protect Google’s own services:
A hardware root of trust: Our custom Titan chips, as part of our Titanium architecture, create a verifiable hardware root of trust. We recently extended this with Titanium Intelligence Enclaves for Private AI Compute, allowing you to process sensitive data in a hardened, isolated, and encrypted environment.
Built-in AI security:Security Command Center (SCC) natively integrates with our infrastructure, providing AI Protection by automatically discovering assets, preventing security issues, detecting active threats with frontline Google Threat Intelligence, and discovering known and unknown risks before attackers can exploit them.
Sovereign solutions: We enable you to meet stringent data residency, operational control, and software sovereignty requirements through solutions like Data Boundary. This is complemented by flexible options like partner-operated sovereign controls and Google Distributed Cloud for air-gapped needs.
Platform controls for AI and agent governance: Vertex AI provides the essential governance layer for the enterprise builder to deploy models and agents at scale. This trust is anchored in Google Cloud’s secure-by-default infrastructure, utilizing platform controls like VPC Service Controls (VPC-SC) and Customer-Managed Encryption Keys (CMEK) to sandbox environments and protect sensitive data, and Agent Identity for granular IAM permissions. At the platform level, Vertex AI and Agent Builder integrate Model Armor to provide runtime protection against emergent agentic threats, such as prompt injection and data exfiltration.
Delivering continuous AI innovation
We are honored to be recognized as a Leader in The Forrester Wave™ report, which we believe validates decades of R&D and our approach to building ultra-scale AI infrastructure. Look to us to continue on this path of system-level innovation as we help you convert the promise of AI into a reality.
Today, we’re expanding the Gemini 3 model family with Gemini 3 Flash, which offers frontier intelligence built for speed at a fraction of the cost.
Gemini 3 Flash builds on the model series that developers and enterprises already love, optimized for high-frequency workflows that demand speed, without sacrificing quality. It allows enterprises to process near real-time information, automate complex workflows, and build responsive agentic applications.
Gemini 3 Flash is built to be highly efficient, pushing the boundaries of quality at better price performance and faster speed. With a near real-time response from the model, businesses can now provide more engaging experiences for their end users at production scale, without sacrificing on quality.
Optimized for speed and scale
Gemini 3 Flash strikes an ideal balance between reasoning and speed, for agentic coding, production-ready systems, and responsive interactive applications. It is available now in Gemini Enterprise, Vertex AI, and Gemini CLI, so businesses and developers can access:
Advanced multimodal processing: Gemini 3 Flash enables enterprises to build applications capable of complex video analysis, data extraction, and visual Q&As in near real-time. Whether streamlining back-office operations by extracting structured data from thousands of documents, or analyzing video archives to identify trends, Gemini 3 Flash delivers these insights with the speed required for modern data pipelines.
Cost-efficient and high-performance execution for code and agents: Gemini 3 Flash delivers exceptional performance on coding and agentic tasks combined with a lower price point, allowing teams to deploy sophisticated reasoning across high-volume processes without hitting barriers.
Low latency for near-real-time experiences: Gemini 3 Flash eliminates the lag typically associated with large models when it comes to intelligence. Its low latency powers responsive applications, from live customer support agents to in-game assistants. These applications can now offer more natural interactions for both quick answers and deep reasoning.
Gemini 3 Flash clearly demonstrates that speed and scale do not have to come at the cost of intelligence.
Real-world value across industries
With the launch of Gemini 3 Pro last month, we introduced frontier performance across complex reasoning, multimodal and vision understanding, as well as agentic and vibe-coding tasks. Gemini 3 Flash retains this foundation, combining Gemini 3’s Pro-grade reasoning with Flash-level latency, efficiency, and cost.
We are already seeing a tremendous response from companies using Gemini 3 Flash. With inference speed and reasoning capabilities that are typically associated with larger models, Gemini 3 Flash is unlocking new and more efficient use cases for companies like Salesforce, Workday and Figma.
Reasoning and multimodality
“Gemini 3 Flash shows a relative improvement of 15% in overall accuracy compared to Gemini 2.5 Flash, delivering breakthrough precision on our hardest extraction tasks like handwriting, long-form contracts, and complex financial data. This is a significant jump in performance, and we’re excited to continue collaborating to bring this specialist-level reasoning to Box AI users.” –Yashodha Bhavnani, Head of AI, Box
“At Bridgewater, we require models capable of reasoning over vast, unstructured multimodal datasets without sacrificing conceptual understanding. Gemini 3 Flash is the first to deliver Pro-class depth at the speed and scale our workflows demand. Its long-context performance on complex problems is exceptional.” – Jasjeet Sekhon, Chief Scientist and Head of AI, AIA Labs, Bridgewater Associates
“ClickUp leverages Gemini 3 Flash’s advanced reasoning to help power our next generation of autonomous agents. Gemini is decomposing high-level user goals into granular tasks, and we are seeing massive quality improvements on critical path identification and long-horizon task sequencing.” –Justin Midyet, Director, Software Engineering, ClickUp
“Gemini 3 Flash has achieved a meaningful step up in reasoning, improving over 7% on Harvey’s BigLaw Bench from its predecessor, Gemini 2.5 Flash. These quality improvements, combined with Flash’s low latency, are impactful for high-volume legal tasks such as extracting defined terms and cross-references from contracts.” – Niko Grupen, Head of Applied Research, Harvey
Agentic coding
“Our engineers have found Gemini 3 Flash to work well together with Debug Mode in Cursor. Flash is fast and accurate at investigating issues and finding the root cause of bugs.” –Lee Robinson, VP of Developer Experience, Cursor
“Gemini 3 Flash is a major step above other models in its speed class when it comes to instruction following and intelligence. It’s immediately become our go-to for latency-sensitive experiences in Devin, and we’re excited to roll it out to more use cases.” –Walden Yan, Co-Founder, Cognition
“The improvements in the latest Gemini 3 Flash model are impressive. Even without specific optimization, we saw an immediate 10% baseline improvement on agentic coding tasks, including complex user-driven queries.” – Daniel Lewis, Distinguished Data Scientist, Geotab
“In our JetBrains AI Chat and Junie agentic-coding evaluation, Gemini 3 Flash delivered quality close to Gemini 3 Pro, while offering significantly lower inference latency and cost. In a quota-constrained production setup, it consistently stays within per-customer credit budgets, allowing complex multi-step agents to remain fast, predictable, and scalable.” – Denis Shiryaev, Head of AI DevTools Ecosystem, JetBrains
“For the first time, Gemini 3 Flash combines speed and affordability with enough capability to power the core loop of a coding agent. We were impressed by its tool usage performance, as well as its strong design and coding skills.” –Michele Catasta, President & Head of AI, Replit
“Gemini 3 Flash remains the best fit for Warp’s Suggested Code Diffs, where low latency and cost efficiency are hard constraints. With this release, it resolves a broader set of common command-line errors while staying fast and economical. In our internal evaluations, we’ve seen an 8% lift in fix accuracy.” – Zach Lloyd, Founder & CEO, Warp
Agentic applications
“Gemini 3 Flash is a great option for teams who want to quickly test and iterate on product ideas in Figma Make. The model can rapidly and reliably create prototypes while maintaining attention to detail and responding to specific design direction.” – Loredana Crisan, Chief Design Officer, Figma
“Presentations.ai is using Gemini 3 Flash to enhance our intelligent slide-generation agents, and we’re consistently impressed by the Pro-level quality at lightning-fast speeds. With previous Flash-sized models there were many things we simply couldn’t attempt because of the speed vs. quality tradeoff. With Gemini 3 Flash, we’re finally able to explore those workflows.” – Saravanan Govindaraj, Co-Founder & Head of Product Development, Presentations.ai
“Integrating Gemini 3 Flash into Agentforce is another step forward in our commitment to bring the best AI to our customers and deploy intelligent agents faster than ever. By pairing Google’s latest model capabilities with the power of Agentforce, we’re unlocking high-quality reasoning, stronger responses, and rapid iteration all inside the tools our customers already use.”– John Kucera, SVP of Product Management, Salesforce AI
“Gemini 3 Flash gives us a powerful new frontier model to fuel Workday’s AI-first strategy. From delivering sharper inference in our customer-facing applications to unlocking greater efficiency in our own operations and development, it provides the performance boost to continue to innovate rapidly.” – Dean Arnold, VP of AI Platform, Workday
“Gemini 3 Flash model’s superb speed and quality allow our users to keep generating content without interruptions. With its improved Korean abilities and adherence to prompts, Gemini 3 Flash can be used for a variety of use cases including agentic workflow and story generation. As the largest consumer AI company in Korea, we’d love to keep using Gemini 3 models and be part of its continuous improvement cycles.” –DJ Lee, Chief Product Officer, WRTN Technologies Inc.
Get started with Gemini 3 Flash
Today, you can safely put Gemini 3 Flash to work.
Business teams can access Gemini 3 Flash in preview onGemini Enterprise, our advanced agentic platform for teams to discover, create, share, and run AI agents all in one secure platform.
Your security program is robust. Your audits are clean. But are you ready for a real-world attack? A tenacious human adversary can create a critical blind spot for security leaders: A program can be compliant, but not resilient. Bridging this gap requires more than just going through the red-teaming motions.
To help security teams forge better instincts when responding to actual cyber-crisis events, we developed ThreatSpace, a cyber proving grounds and realistic corporate network that includes all the digital noise of real employee activities.
From gaps to battle: The ThreatSpace cyber range
The ThreatSpace environment is architecturally stateless and disposable to allow the deployment of real-world malware. It emulates the tactics, techniques, and procedures (TTPs) of real-world adversaries, informed by the latest, unparalleled threat intelligence from Google Threat Intelligence Group and Mandiant. By design, it never puts your actual business assets at risk.
Recently, stakeholders from the U.S. Embassy, the FBI, and Cote d’Ivoire cybersecurity agencies used ThreatSpace to conduct advanced defense training. Funded by the Bureau of International Narcotics and Law Enforcement Affairs (INL), this workshop brought together public and private sector partners to strengthen regional digital security.
“Cybersecurity is a team sport, and our goal is to make Cote d’Ivoire a safer place for Ivorians and Americans to do business. This five-day workshop, funded by INL, brought together world-class instructors from Mandiant with local agencies and private sector partners to build the collaborative muscle we need to defend against modern threats,” said Colin McGuire, FBI law enforcement attaché, Dakar in Cabo Verde and Gulf of Guinea.
More than just helping to train individuals, we helped make the global digital ecosystem safer by uniting diverse groups of defenders facing shared threats. By practicing collaboration during a crisis, and operating as a unit, we can help empower defenders to fight and win against adversaries.
ThreatSpace provides a safe place for your team to miss an indicator of compromise, exercise processes, and stress test collaboration and build the muscle memory and confidence needed to execute flawlessly when real adversaries come knocking. This is where an Offensive Security red team assessment comes in.
Catch me if you can: The Mandiant red team reality check
The Mandiant red team doesn’t follow a script. Our work on the frontlines of incident response lets us see precisely how determined adversaries operate, including their persistent, creative approaches to exploiting the complex seams between your technology, your processes, and your people.
aside_block
<ListValue: [StructValue([(‘title’, ‘Our Office of the CISO insights, direct to you’), (‘body’, <wagtail.rich_text.RichText object at 0x7ff71daccd30>), (‘btn_text’, ‘Subscribe today’), (‘href’, ‘https://go.chronicle.security/cloudciso-newsletter-signup?utm_source=cgc-blog&utm_medium=blog&utm_campaign=FY23-Cloud-CISO-Perspectives-newsletter-blog-embed-CTA&utm_content=-&utm_term=-‘), (‘image’, <GAEImage: Cloud CISO Perspectives new header July 2024 small>)])]>
These observations enable our offensive security experts to mimic and emulate genuine threat actor behavior to achieve specific business objectives. Here are three scenarios developed by our red team to help stress-test and enhance our customers’ defenses:
The “Impossible” Blackout. One organization believed their grid controls were isolated and secure. When our team demonstrated that a nationwide blackout was technically possible through their current architecture, the conversation shifted from compliance to survival. This finding empowered them to implement stricter controls immediately, preventing a theoretical catastrophe from becoming a reality.
The Runaway Train. In another engagement, we gained remote system control of a locomotive train. The client didn’t just get a technical report; they learned exactly how physical access vectors could bypass digital security. This exposure allowed them to harden their operational technology against vectors they had previously considered secure.
The Generous Chatbot. Innovation brings new risks. In a recent test of a financial services chatbot, our team used simple prompts to bypass safety filters, ultimately convincing the AI to approve a 200-month loan at 0% APR. This finding prompted the client to immediately implement critical guardrails and grounding sources, ensuring they could innovate safely without exposing their business to manipulation.
From reactive to resilient
Building true cyber resilience requires a continuous feedback loop. It starts with analyzing your current state and enhancing your capability roadmap to align with operational priorities. Then you validate them through incident response learnings and offensive security insights and feed those back into the loop for the next iteration.
By combining these disciplines, and grounding them with threat intelligence, you can move your organization from a reactive posture to a state of proactive resilience. You find and expose your weaknesses today, so you can build the strength required to secure your future.
To battle-test your defenses, contact Mandiant to learn how our Offensive Security and ThreatSpace cyber range services can help you strengthen your defenses and build your resilience.
When it comes to public health, having a clear picture of a community’s needs is vital. These insights help officials secure crucial funding, launch new initiatives, and ultimately improve people’s lives.
That is the idea that inspired Dr. Phillip Levy, M.D., M.P.H., Professor of Emergency Medicine and Associate Vice President for Population Health and Translational Science at Wayne State University and his colleagues to develop Project PHOENIX: the Population Health OutcomEs aNd iNnformation eXchange. PHOENIX ingests information from electronic health records including demographic data, blood pressures and clinical diagnosis, and combines this with social and environmental factors from more than 70 anonymized data sources into an integrated virtual warehouse. Researchers, advocates,community leaders, and policy makers are able to use this data to better understand how different factors correlate to health outcomes and design targeted interventions.
With such functionality, the PHOENIX team recognized the potential to transform the Community Health Needs Assessment (CHNA) process. Required by the federal government, public health departments, nonprofit hospitals, and Federally Qualified Health Centers in the United States must complete a CHNA every three years—a largely manual, time-consuming task that can take up to a year to complete.
That’s where a collaboration between Wayne State University, Google Public Sector, and Syntasa came in. They teamed up to create CHNA 2.0, an innovative solution that drastically cuts down the time it takes to create these vital reports. By combining PHOENIX data with Vertex AI Platform, CHNA 2.0 can deliver a complete CHNA in a matter of weeks, giving health leaders valuable insights more quickly than ever.
Extracting community sentiment from public data
One of the most challenging parts of drafting a CHNA report involves conducting in-depth surveys to understand conditions in the community. This is often the most time-consuming part of the process, as it takes months to create, review, run, and analyze insightful surveys. By the time a CHNA report is complete, data from the surveys might be nearly a year out of date, which can prevent organizations from making a meaningful impact on their communities.
CHNA 2.0 uses public health data from the PHOENIX warehouse along with insights from Syntasa Sentiment Analytics, which combines information from surveys with real-time data from Google Search and social media posts. Syntasa Sentiment Analytics provides insights regarding the questions people are asking and what issues they’re posting about to uncover health-related problems affecting a given community, such as growing concerns about asthma or frustrations with long waits at clinics.
The architecture for this solution was built on the Syntasa Data + AI Platform. Workloads run on Google Kubernetes Engine (GKE) for its scalability, allowing the platform to process incoming sentiment data quickly. The platform also uses Cloud SQL and Google Cloud Storage as part of its data foundation, with BigQuery doing the heavy lifting for sentiment analysis. BigQuery provides the performance, efficiency, and versatility needed to handle large datasets of search and social media information efficiently.
Creating reports with the power of humans + AI
After gathering the necessary information, CHNA 2.0 uses Vertex AI and Gemini to help analysts create the report in less time. CHNA reports are highly complex and lengthy – and require manually integrating multiple data elements. Syntasa solved this challenge by breaking down the report into smaller, more manageable tasks and bringing human oversight into the loop.
Now the person in charge of handling the CHNA defines the report’s structure. Gemini extracts insights from tailored datasets and fills in the relevant details. By combining both human and AI intelligence, CHNA 2.0 delivers reports in a fraction of the time.
Organizations can also use this method to deliver a living document that is constantly updated with fresh data. This means public health officials don’t have to wait years to understand their communities—they can access the latest insights at any time to make faster and more impactful decisions.The net result is a transformation of the CHNA process from static to dynamic, enabling real time, data driven decision making for the betterment of all.
Supporting public health with technology
The City of Dearborn, Michigan, became the first to use CHNA 2.0 to great success. The long-term vision is to bring this same capability to other cities and counties in Michigan and across the nation.
This project with Wayne State University and Syntasa showcases how the right technology and a strategic partner can create a powerful, scalable solution to a long-standing public sector challenge. By partnering with Google Public Sector to leverage the most advanced AI and data tools, Wayne State not only automated a critical process, but also empowered public health officials to better serve their communities.
From improving community health to modernizing infrastructure, discover how technology is transforming the public sector. Sign up for our newsletter to stay informed on the latest trends and solutions.
We have exciting news for Google Cloud partners: Today we’re announcing our new partner program, Google Cloud Partner Network, which will formally roll out in the first quarter of 2026.
This new program marks a fundamental shift in how we measure success and reward value. Applicable to all partner types and sizes – ISVs, RSIs, GSIs, and more – the new program reinforces our strategic move toward recognizing partner contribution across the entire customer lifecycle.
Google Cloud Partner Network is being completely streamlined to focus on real-world results. This marks a strategic shift from measuring program work to valuing genuine customer outcomes. This includes rewarding successful co-sell sales efforts, high-quality service delivery, and shared innovation with ISVs. We are also integrating AI into the program’s core to make partner participation much easier, allowing more focus on customers instead of routine program administration.
With its official kickoff in Q1, the new program will provide a six-month transition window for partners to adjust to the new framework. Today, we are sharing the first details of the Google Cloud Partner Network, which is centered on three pillars: simplicity, outcomes, and automation.
Simplicity
We’re making the program simpler by moving away from tracking the work of traditional program requirements, such as business plans and customer stories, and towards recognising partner contributions – includingpre-sales influence, co-innovation, and post-sales support.
Because the program is designed to put the customer first, we’ve narrowed requirements to focus on partner efforts that deliver real, measurable value. For example, the program will recognize investments in skills, real-world experience, and successful customer outcomes.
Outcomes
The new program will provide clear visibility into how partner impact is recognized and rewarded, focusing on customer outcomes. Critical highlights include:
New tiers: We’re evolving from a two-tier to a three-tier model:Select, Premier, and a new Diamond tier. Diamond is our highest distinction – it is intentionally selective, reserved for the few partners who consistently deliver exceptional customer outcomes. Each tier will now reflect our joint customer successes and will be determined based on exceptional customer outcomes across Google Cloud and Google Workspace.
New baseline competencies: A new competency framework marks a fundamental shift that will replace today’s specializations, in order to reward partners for their deep technical and sales capabilities. The framework focuses on a partner’s proven ability to help customers, measuring two key dimensions: capacity (knowledge and skills development, validated by technical certifications and sales credentials) and capability (real-world success, measured by pre-sales and post-sales contributions to validated closed/won opportunities). This framework operates independently from tiering to allow partners to earn a competency without any dependency on a program tier.
New advanced competency: The new global competencies introduce a second level of achievement, Advanced Competency, to signal a higher designation.
Automation
Building on the proven success and transparency delivered through tools like the Earnings Hub and Statement of Work Analyzer, today’s Partner Network Hubwill transform to deliver automation and transparency across the program.
The administrative responsibility for partners to participate in the program will be dramatically reduced through the use of AI and other tools. For example, a key change is the introduction of automated tracking across tiering and competency achievements. We will automatically apply every successful customer engagement toward a partner’s progress in all eligible tiers and competencies. This radical simplification eliminates redundant reporting and ensures seamless, comprehensive recognition for the outcomes delivered.
What’s next…
The new program and portal will officially launch in Q1 2026, enabling partners to immediately log in, explore benefits and differentiation paths, and begin achieving new tiers and competencies. To ensure a smooth transition, we will host a series of webinars and listening sessions throughout early next year to educate partners on Google Cloud Partner Network.
When extreme weather or unexpected natural disaster strikes, time is the single most critical resource. For public sector agencies tasked with emergency management, the challenge isn’t just about crafting a swift response, it’s about communicating that response to citizens effectively. At our recent Google Public Sector Summit, we demonstrated how Google Workspace with Gemini is helping government agencies turn complex, legally-precise official documents and text into actionable, personalized public safety tools almost instantly, thereby transforming the speed and efficacy of disaster response communication.
Let’s dive deeper into how Google Workspace with Gemini can help transform government operations and boost the speed and effectiveness of critical public outreach during a natural disaster.
The challenge: Turning authority into action
Imagine you are a Communications Director at the Office of Emergency Management. In the aftermath of a severe weather event, the state government has just issued a critical Executive Order (EO), which serves as a foundational text, legally precise, and essential for internal agency coordination. However, its technical, authoritative language is not optimized for the public’s urgent questions such as: “Am I safe? Is my family safe? What should I do now?”
Manually translating and contextualizing this information for the public, and finding official answers to critical questions – often hidden in the details – can create a dangerous information gap during a fast-moving natural disaster.
Built on a foundation of trust
Innovation requires security. Google Workspace with Gemini empowers agencies to adopt AI without compromising on safety or sovereignty, supported by:
FedRAMP High authorization to meet the rigorous compliance standards of the public sector.
Data residency & access controls including data regions, access transparency, and access approvals.
Advanced defense mechanisms like context-aware access (CAA), data loss prevention (DLP), and client-side encryption (CSE).
Operational resilience with Business Continuity editions to help keep your agency connected and operational during critical events.
Google Workspace with Gemini: Your natural disaster response partner
This is one area where Google Workspace with Gemini can help serve as your essential natural disaster partner, by empowering government leaders to move beyond manual translation and rapidly create dynamic, user-facing tools.
For example, by using the Gemini app, the Communications Director at the Office of Emergency Management can simply upload the Executive Order PDF and prompt Gemini to ‘create an interactive safety check tool based on these rules.’ Gemini instantly parses the complex legal definitions—identifying specific counties, curfew times, and exemptions—and writes the necessary code to render a functional, interactive interface directly within the conversation window.
What was once a static document becomes a clickable prototype in seconds, ready to be tested and deployed.
Image: Gemini turns natural disaster declaration into an interactive map
Three core capabilities driving transformation
This process is driven by three core Google Workspace with Gemini capabilities.
Unprecedented speed of transformation. The journey from a complex, static document to a working, interactive application is measured in minutes, not days or weeks. This acceleration completely changes the speed of development for mission-critical tools. In a disaster, the ability to deploy a targeted public safety resource instantly can be life-saving.
Deep contextual understanding.Gemini’s advanced AI goes beyond simple summarization. When provided with a full document and specific instructions, it can synthesize the data to perform complex tasks. For example, Gemini can analyze an executive order to identify embedded technical terms and locations, interpreting them as specific geographic areas that require attention. It extracts this pertinent information—while citing sources for grounding—and can transform raw text into a practical, location-aware tool for the public.
A repeatable blueprint for any natural disaster. The entire process—from secure document upload to the creation of a working, live application—is repeatable. This means the model can be saved and leveraged for any future public safety resource, whether it’s a severe weather warning, a health advisory, or a general preparedness guide. This repeatable blueprint future-proofs an agency’s ability to communicate quickly and effectively during any emergency.
Serving the public with speed and clarity
By leveraging Google Workspace with Gemini, public sector agencies can ensure that official emergency declarations immediately translate into clear, actionable details for the public. This shift from dense legal text to personalized guidance is paramount for strengthening public trust, improving citizen preparedness, and ultimately keeping communities safe.
Are you ready to drive transformation within your own agency? Check out the highlights from our recent Google Public Sector Summit where leaders gathered to share how they are applying the latest Google AI and security technologies to solve complex challenges and advance their missions. Learn more about our Google Workspace Test Drive, and sign up for a no-cost 30-day pilot which provides your agency with full, hands-on access to the entire Google Workspace with Gemini, commitment-free, on your own terms.
The AI state of the art is shifting rapidly from simple chat interfaces to autonomous agents capable of planning, executing, and refining complex workflows. In this new landscape, the ability to ground these intelligent agents in your enterprise data is key to unlocking true business value. Google Cloud is at the forefront of this shift, empowering you to build robust, data-driven applications quickly and accurately.
Last month, Google announced Antigravity, an AI-first integrated development environment (IDE). And now, you can now give the AI agents you build in Antigravity direct, secure access to the trusted data infrastructure that powers your organization, turning abstract reasoning into concrete, data-aware action. With Model Context Protocol (MCP) servers powered by MCP Toolbox for Databases now available within Antigravity, you can securely connect your AI agents to services like AlloyDB for PostgreSQL, BigQuery, Spanner, Cloud SQL, Looker and others within Google’s Data Cloud, all within your development workflow.
Why use MCP in Antigravity?
We designed Antigravity to keep you in the flow, but the power of an AI agent is limited by what it “knows.” To build truly useful applications, your agent needs to understand your data. MCP acts as the universal translator. You can think of it like a USB-C port for AI. It allows the LLMs in your IDE to plug into your data sources in a standardized way. By integrating pre-built MCP servers directly into Antigravity, you don’t need to perform any manual configuration. Your agents can now converse directly with your databases, helping you build and iterate faster without ever leaving the IDE.
Getting started with MCP servers
In Antigravity, connecting an agent to your data is a UI-driven experience, eliminating the challenges we’ve all faced when wrestling with complex configuration files just to get a database connection running. Here’s how to get up and running.
1. Discover and launch
You can find MCP servers for Google Cloud in the Antigravity MCP Store. Search for the service you need, such as “AlloyDB for PostgreSQL” or “BigQuery,” and click on Install to start the setup process.
Launching the Antigravity MCP store
2. Configure your connection
Antigravity presents a form where you can add your service details such as Project ID and region. You can also enter your password or have Antigravity use your Identity and Access Management (IAM) credentials for additional security. These are stored securely, so your agent can access the tools it needs without exposing raw secrets in your chat window.
Installing the AlloyDB for PostgreSQL MCP Server
See your agents in action
Once connected to Antigravity, your agent gains a suite of “tools” (executable functions) that it can use to assist you, and help transform your development and observability experience across different services. Let’s take a look at a couple of common scenarios.
Streamlining database tasks with AlloyDB for PostgreSQL
When building against a relational database like PostgreSQL, you may spend time switching between your IDE and a SQL client to check schema names or test queries. With the AlloyDB MCP server, your agent handles that context and gains the ability to perform database administration and generate high-quality SQL code you can include in your apps — all within the Antigravity interface.
For example:
Schema exploration: The agent can use list_tables and get_table_schema to read your database structure and explain relationships to you instantly.
Query development: Ask the agent to “Write a query to find the top 10 users,” and it can use execute_sql to run it and verify the results immediately.
Optimization: Before you commit code, use the agent to run get_query_plan to ensure your logic is performant.
Antigravity agent using the MCP tools
Unlocking analytics with BigQuery
For data-heavy applications, your agent can act as a helpful data analyst. Leveraging the BigQuery MCP server, it can, for example:
Forecast: Use forecast to predict future trends based on historical data.
Search the catalog: Use search_catalog to discover and manage data assets.
Augmented analytics: Use analyze_contribution to understand the impact of different factors on data metrics.
Building on truth with Looker
Looker acts as your single source of truth for business metrics. Looker’s MCP server allows your agent to bridge the gap between code and business logic, for example:
Ensuring metric consistency: No more guessing whether a field is named total_revenue or revenue_total. Use get_explores and get_dimensions to ask your agent, “What is the correct measure for Net Retention?” and receive the precise field reference from the semantic model.
Instantly validating logic: Don’t wait to deploy a dashboard to test a theory. Use run_query to execute ad-hoc tests against the Looker model directly in your IDE, so that your application logic matches the live data.
Auditing reports: Use run_look to pull results from existing saved reports, allowing you to verify that your application’s output aligns with official business reporting.
Build with data in Antigravity
By integrating Google’s Data Cloud MCP servers into Antigravity, it’s easier than ever to use AI to discover insights and develop new applications. Now, with access to a wide variety of data sources that run your business, get ready to take the leap from simply talking to your code, to creating new experiences for your users.
To get started, check out the following resources:
Building Generative AI applications has become accessible to everyone, but moving those applications from a prototype to a production-ready system requires one critical step: Evaluation.
How do you know if your LLM is safe? How do you ensure your RAG system isn’t hallucinating? How do you test an agent that generates SQL queries on the fly?
At its core, GenAI Evaluation is about using data and metrics to measure the quality, safety, and helpfulness of your system’s responses. It moves you away from “vibes-based” testing (just looking at the output) to a rigorous, metrics-driven approach using tools like Vertex AI Evaluation and the Agent Development Kit (ADK).
To guide you through this journey, we have released four hands-on labs that take you from the basics of prompt testing to complex, data-driven agent assessment.
Evaluating Single LLM Outputs
Before you build complex systems, you must understand how to evaluate a single prompt and its response. This lab introduces you to GenAI Evaluation, a service that helps you automate the evaluation of your model’s outputs.
You will learn how to define metrics, such as safety, groundedness, and instruction following. You will also learn how to run evaluation tasks against a dataset. This is the foundational step for any production-ready AI application.
aside_block
<ListValue: [StructValue([(‘title’, ‘Go to lab!’), (‘body’, <wagtail.rich_text.RichText object at 0x7fc07005beb0>), (‘btn_text’, ”), (‘href’, ”), (‘image’, None)])]>
Evaluate RAG Systems with Vertex AI
Retrieval Augmented Generation (RAG) is a powerful pattern, but it introduces new failure points: did the search fail to find the document, or did the LLM fail to summarize it?
This lab takes you deeper into the evaluation lifecycle. You will learn how to verify “Faithfulness” (did the answer come from the context?) and “Answer Relevance” (did it actually answer the user’s question?). You will pinpoint exactly where your RAG pipeline needs improvement.
aside_block
<ListValue: [StructValue([(‘title’, ‘Go to lab!’), (‘body’, <wagtail.rich_text.RichText object at 0x7fc07005b790>), (‘btn_text’, ”), (‘href’, ”), (‘image’, None)])]>
Evaluating Agents with ADK
Agents are dynamic; they choose tools and plan steps differently based on the input. This makes them harder to test than standard prompts. You aren’t just grading the final answer; you are grading the trajectory, which is the path the agent took to get there.
This lab focuses on using the Agent Development Kit (ADK) to trace and evaluate agent decisions. You will learn how to define specific evaluation criteria for your agent’s reasoning process and how to visualize the results to ensure your agent is using its tools correctly.
aside_block
<ListValue: [StructValue([(‘title’, ‘Go to lab!’), (‘body’, <wagtail.rich_text.RichText object at 0x7fc0787f3e20>), (‘btn_text’, ”), (‘href’, ”), (‘image’, None)])]>
Build and Evaluate BigQuery Agents
When an agent interacts with data, precision is paramount. A SQL-generating agent must write syntactically correct queries and retrieve accurate numbers. A hallucination here doesn’t just look bad, it might lead to bad business decisions.
In this advanced lab, you will build an agent capable of querying BigQuery and then use the GenAI Eval Service to verify the results. You will learn to measure Factual Accuracy and Completeness, ensuring your agent provides the exact data requested without omission.
aside_block
<ListValue: [StructValue([(‘title’, ‘Go to lab!’), (‘body’, <wagtail.rich_text.RichText object at 0x7fc0787f35b0>), (‘btn_text’, ”), (‘href’, ”), (‘image’, None)])]>
Trust Your AI in Production
Ready to make your AI applications production-grade? Start evaluating your model’s outputs or the trajectory taken by your agents with these codelabs:
These labs are part of the AI Evaluation module in our official Production-Ready AI with Google Cloud program. Explore the full curriculum for more content that will help you bridge the gap from a promising prototype to a production-grade AI application.
To build a production-ready agentic system, where intelligent agents can freely collaborate and act, we need standards and shared protocols for how agents talk to tools and how they talk to each other.
In the Agent Production Patterns module in the Production-Ready AI with Google Cloud Learning Path, we focus on interoperability, exploring the standard patterns for connecting agents to data, tools and each other. Here are three hands-on labs to help you build these skills.
<ListValue: [StructValue([(‘title’, ‘Start the lab!’), (‘body’, <wagtail.rich_text.RichText object at 0x7fc07046d100>), (‘btn_text’, ”), (‘href’, ”), (‘image’, None)])]>
Connecting to Data with MCP
Once you understand the basics, the next step is giving your agent access to knowledge. Whether you are analyzing massive datasets or searching operational records, the MCP Toolbox provides a standard way to connect your agent to your databases.
<ListValue: [StructValue([(‘title’, ‘Start the lab!’), (‘body’, <wagtail.rich_text.RichText object at 0x7fc07046db80>), (‘btn_text’, ”), (‘href’, ”), (‘image’, None)])]>
Expose a CloudSQL database to an MCP Client
If you need your agent to search for specific records—like flight schedules or hotel inventory—this lab demonstrates how to connect to a CloudSQL relational database.
aside_block
<ListValue: [StructValue([(‘title’, ‘Start the lab!’), (‘body’, <wagtail.rich_text.RichText object at 0x7fc07046d040>), (‘btn_text’, ”), (‘href’, ”), (‘image’, None)])]>
From Prototype to Production
By moving away from custom integrations and adopting standards like MCP and A2A, you can build agents that are easier to maintain and scale. These labs provide the practical patterns you need to connect your agents to your data, your tools, and each other.
These labs are part of the AgentProduction Patterns module in our official Production-Ready AI with Google Cloud Learning Path. Explore the full curriculum for more content that will help you bridge the gap from a promising prototype to a production-grade AI application.
Share your progress using the hashtag #ProductionReadyAI. Happy learning!
Welcome to the first Cloud CISO Perspectives for December 2025. Today, Francis deSouza, COO and president, Security Products, Google Cloud, shares our Cybersecurity Forecast report for the coming year, with additional insights from our Office of the CISO colleagues.
As with all Cloud CISO Perspectives, the contents of this newsletter are posted to the Google Cloud blog. If you’re reading this on the website and you’d like to receive the email version, you can subscribe here.
aside_block
<ListValue: [StructValue([(‘title’, ‘Get vital board insights with Google Cloud’), (‘body’, <wagtail.rich_text.RichText object at 0x7fa5b03dd1c0>), (‘btn_text’, ‘Visit the hub’), (‘href’, ‘https://cloud.google.com/solutions/security/board-of-directors?utm_source=cloud_sfdc&utm_medium=email&utm_campaign=FY24-Q2-global-PROD941-physicalevent-er-CEG_Boardroom_Summit&utm_content=-&utm_term=-‘), (‘image’, <GAEImage: GCAT-replacement-logo-A>)])]>
Forecasting 2026: The year AI rewrites the security playbook
By Francis deSouza, COO, Google Cloud
Francis deSouza, COO and president, Security Products, Google Cloud
We are at a unique point in time where we’re facing a generational refactoring of the entire technology stack, including the threat landscape. 2025 was a watershed year in cybersecurity, where AI moved to the forefront of every company’s agenda, changing the game for both security offense and defense.
While threats continue to intensify — with attackers using AI for sophisticated phishing and deepfakes — defenders also have been gaining ground. This year’s evolutions will continue to drive change in the coming year, and our annual Cybersecurity Forecast report for 2026 explores how today’s lessons will impact tomorrow’s cybersecurity across four key areas: artificial intelligence, cybercrime, nation-state threats, and regulatory obligations.
Organizations haven’t spent enough time preparing their workforces to use AI securely. It is essential that companies build a learning culture around security that includes true AI fluency.
1. The rise of agentic security automation
AI and agents will redefine how organizations secure their environment, turning the security operations center from a monitoring hub into an engine for automated action. This is critical because the window of opportunity has decreased; bad actors operate in hours, not weeks.
As data volumes explode, AI agents can give defenders a speed advantage we haven’t had in years. By stepping in to detect anomalies, automate data analysis, and initiate response workflows, your security teams can focus on the complex decisions that require human judgment. This shift won’t just improve speed — it will drive similar gains in proactively strengthening your entire security posture.
2. Building AI fluency as a defense
We will likely see a wave of AI-driven attacks targeting employees, largely because the weak link in security remains the user. Organizations haven’t spent enough time preparing their workforces to use AI securely. It is essential that companies build a learning culture around security that includes true AI fluency.
Every organization should deploy something like our Model Armor to protect their AI models. Implementing a validation layer at the gateway level ensures that guardrails are active controls rather than just theoretical guidelines.
However, technology is only half the equation. We also need a security-conscious workforce. If we don’t help our employees build these skills, teams simply won’t be equipped to identify the new wave of threats or understand how best to defend against them.
This means looking past standard training, and investing in efforts, like agentic security operations center (SOC) workshops and internal cyber war games efforts, to help educate their employees on what the threat landscape looks like in an AI world.
Read on for the key points from the Cybersecurity Forecast report, bolstered with new insights from our Office of the CISO.
AI advantages
Widespread adoption of AI agents will create new security challenges, requiring organizations to develop new methodologies and tools to effectively map their new AI ecosystems. A key part of this will be the evolution of identity and access management (IAM) to treat AI agents as distinct digital actors with their own managed identities.
AI adoption will transform security analysts’ roles, shifting them from drowning in alerts to directing AI agents in an agentic SOC. This will allow analysts to focus on strategic validation and high-level analysis, as AI handles data correlation, incident summaries, and threat intelligence drafting.
The heightened capability of agentic AI to take actions and execute tasks autonomously elevates the importance of cybersecurity basics. Organizations will need to create discrete boundary definitions for the authorization, authentication, and monitoring of each agent.
Taylor Lehmann, director, health care and life sciences
A year from now, we’re going to have an awesome security opportunity to secure a new persona in our organizations: Knowledge workers who produce truly useful, mission-critical applications and software using ideas and words — but not necessarily well-written, vetted, and tested code.
We’re going to need better and more fine-grained paths to help these new “idea-native developers” who use powerful AI tools and agents to build, test, submit, manage and blast secure code into secure production as safely and as fast as they can. In 2026 and 2027, we’re going to see how big this opportunity is. We should prepare to align our organizations, operations, and technology (OOT) to take advantage of it.
A corollary to this comes from our DORA reports: Just as AI has amplified productivity and begun optimizing work, it amplifies organizational dysfunctions — especially those that lead to inefficiently and ineffectively secured data.
Marina Kaganovich, executive trust lead
The heightened capability of agentic AI to take actions and execute tasks autonomously elevates the importance of cybersecurity basics. Organizations will need to create discrete boundary definitions for the authorization, authentication, and monitoring of each agent.
Beyond technical controls, organizational defense will depend on fostering an AI-literate workforce through training and awareness, as staff shift from performing tasks to architecting and overseeing agents. To be successful, organizations will require a fundamental shift in risk-informed culture.
Bill Reid, security advisor
Aggressive adoption of agentic AI will drive a renewed interest in threat modeling practices. Security teams will be asked to deeply understand what teams are trying to build, and will need to think about the data flows, the trust boundaries, and the guardrails needed.
Agentic AI will also demand that the supply chain be considered within that threat model, beyond the software bill of materials (SBOM), to look at how those services will control autonomous actions. It will also force a renewed look at identity and entitlements, as agents are asked to act on behalf of or as an extension of employees in the enterprise.
What may have been acceptable wide scopes covered by detective controls may no longer be sufficient, given the speed of action that comes with automation and the chaining of models together in goal seeking behavior.
Vesselin Tzvetkov, senior cybersecurity advisor
As Francis noted, agentic security operations are set to become the standard for modern SOCs, dramatically enhancing the speed and capabilities of security organizations. The agentic SOC in 2026 will feature multiple small, dedicated agents for tasks like summarization, alert grouping, similarity detection, and predictive remediation.
This shift will transform modern SOC roles and processes, moving away from tiered models in favor of CI/CD-like automation. AI capabilities and relevant know-how are essential for security personnel.
As AI drives new AI threat hunting capabilities to gain insight from data lakes in previously underexplored areas, such as OT protocols for manufacturing and industry-specific protocols like SS7 for telecommunications, the overall SOC coverage and overall industry security will improve.
Vinod D’Souza, director, manufacturing and industry
In 2026, agentic AI will help the manufacturing and industrial sector cross the critical threshold from static automation to true autonomy. Machines will self-correct and self-optimize with a speed and precision that exceeds human capacity.
The engine powering this transformation is the strategic integration of cloud-native SCADA and AI-native architectures. Security leaders should redefine their mandate from protecting a perimeter to enabling a trusted ecosystem anchored in cyber-physical identity.
Every sensor, service, autonomous agent, and digital twin should be treated as a verified entity. By rooting security strategies in data-centered Zero Trust, organizations stop treating security as a gatekeeper and transform it into the architectural foundation. More than just securing infrastructure, the goal is to secure the decision-making integrity of autonomous systems.
AI threats
We anticipate threat actors will move decisively from using AI as an exception to using it as the norm. They will use AI to enhance the speed, scope, and effectiveness of their operations, streamlining and scaling attacks.
A critical and growing threat is prompt injection, an attack that manipulates AI to bypass its security protocols and follow an attacker’s hidden command. Expect a significant rise in targeted attacks on enterprise AI systems.
Threat actors will accelerate the use of highly manipulative AI-enabled social engineering. This includes vishing (voice phishing) with AI-driven voice cloning to create hyperrealistic impersonations of executives or IT staff, making attacks harder to detect and defend against.
The increasing complexity of hybrid and multicloud architectures, coupled with the rapid, ungoverned introduction of AI agents, will accelerate the crisis in IAM failures, cementing them as the primary initial access vector for significant enterprise compromise.
Anton Chuvakin, security advisor
We’ve been hearing about the sizzle of AI for some time, but now we need the steak to be served. While there’s still a place for exciting, hypothetical use cases, we need tangible AI benefits backed by solid security data of value and benefits obtained and proven.
Whether your company adopts agents or not, your employees will use them for work. Shadow agents raise new and interesting risks, especially when your employees connect their personal agents to corporate systems. Organizations will have to invest to mitigate the risks of shadow agents — merely blocking them simply won’t work (they will sneak back in immediately).
David Stone, director, financial services
As highlighted in the Google Threat Intelligence Group report on adversarial use of AI, attackers will use gen AI to exploit bad hygiene, employ deepfake capabilities to erode trust in processes, and discover zero-day vulnerabilities. Cyber defenders will likewise have to adopt gen AI capabilities to find and fix cyber hygiene, patch code at scale, and scrutinize critical business processes to get signals to find and stop exploitation of humans in the process.
Security will continue to grow in importance in the boardroom as the key focus on resilience, business enablement, and business continuity — especially as AI-driven attacks evolve.
Jorge Blanco, director, Iberia and Latin America
The increasing complexity of hybrid and multicloud architectures, coupled with the rapid, ungoverned introduction of AI agents, will accelerate the crisis in IAM failures, cementing them as the primary initial access vector for significant enterprise compromise.
The proliferation of sophisticated, autonomous agents — often deployed by employees without corporate approval (the shadow agent risk) — will create invisible, uncontrolled pipelines for sensitive data, leading to data leaks and compliance violations. The defense against this requires the evolution of IAM to agentic identity management, treating AI agents as distinct digital actors with their own managed identities.
Organizations that fail to adopt this dynamic, granular control — focusing on least privilege, just-in-time access, and robust delegation — will be unable to minimize the potential for privilege creep and unauthorized actions by these new digital actors. The need for practical guidance on securing multicloud environments, including streamlined IAM configuration, will be acutely felt as security teams grapple with this evolving threat landscape.
Sri Gourisetti, senior cybersecurity advisor
The increased adversarial use of AI for the development of malware modules may likely result in “malware bloat” — a high volume of AI-generated malicious code that is non-functional or poorly optimized, creating significant noise for amateur adversaries and defenders.
Functional malware will become more modular and mature, designed to be compatible and interact with factory floor and OT environments as the manufacturing and industrial sector moves beyond initial exploration of generative AI toward the structural deployment of agentic AI in IT, OT, and manufacturing workflows.
Widya Junus, strategy operations
Over 70% of cloud breaches stem from compromised identities, according to a recent Cloud Threat Horizons report, and we expect that trend to accelerate as threat actors exploit AI. The security focus should shift from human-centered authentication to automated governance of non-human identities using Cloud Infrastructure Entitlement Management (CIEM) and Workload Identity Federation (WIF).
Accordingly, as AI-assisted attacks lower the barrier for entry and cloud-native ransomware specifically targets APIs to encrypt workloads, organizations will increasingly rely on tamper-proof backups (such as Backup Vault) and AI-driven automated recovery workflows to ensure business continuity — rather than relying solely on perimeter defenses to stop every attack.
Cybercrime
The combination of ransomware, data theft, and multifaceted extortion will remain the most financially disruptive category of cybercrime. The volume of activity is escalating, with focus on targeting third-party providers and exploiting zero-day vulnerabilities for high-volume data exfiltration.
As the financial sector increasingly adopts cryptocurrencies, threat actors are expected to migrate core components of their operations onto public blockchains for unprecedented resilience against traditional takedown efforts.
As security controls mature in guest operating systems, adversaries are pivoting to the underlying virtualization infrastructure, which is becoming a critical blind spot. A single compromise here can grant control over the entire digital estate and render hundreds of systems inoperable in a matter of hours.
Next year, we’ll see the first sustained, automated campaigns where threat actors use agentic AI to autonomously discover and exploit vulnerabilities faster than human defenders can patch exploited vulnerabilities.
David Homovich, advocacy lead
In 2026, we expect to see more boards pressuring CISOs to translate security exposure and investment into financial terms, focusing on metrics like potential dollar losses and the actual return on security investment. Crucially, operational resilience — the organization’s ability to quickly recover from an AI-fueled attack — is a non-negotiable board expectation.
CISOs take note: Boards are asking us about business resilience and the impact of advanced, machine-speed attacks — like adversarial AI and securing autonomous identities such as AI agents. Have your dollar figures ready, because this is the new language of defense for boards.
Crystal Lister, security advisor
Next year, we’ll see the first sustained, automated campaigns where threat actors use agentic AI to autonomously discover and exploit vulnerabilities faster than human defenders can patch exploited vulnerabilities.
2025 showed us that adversaries are no longer leveraging artificial intelligence just for productivity gains, they are deploying novel AI-enabled malware in active operations. The ShadowV2 botnet was likely a test run for autonomous C2 infrastructure.
Furthermore, the November 2025 revelations about Chinese state-sponsored actors using Anthropic’s Claude to automate espionage code-writing demonstrates that barriers to entry for sophisticated attacks have collapsed. Our security value proposition should shift from detection to AI-speed preemption.
The global stage: Threat actors
Cyber operations in Russia are expected to undergo a strategic shift, prioritizing long-term global strategic goals and the development of advanced cyber capabilities over just tactical support for the conflict in Ukraine.
The volume of China-nexus cyber operations is expected to continue surpassing that of other nations. They will prioritize stealthy operations, aggressively targeting edge devices and exploiting zero-day vulnerabilities.
Driven by regional conflicts and the goal of regime stability, Iranian cyber activity will remain resilient, multifaceted, and semi-deniable, deliberately blurring the lines between espionage, disruption, and hacktivism.
North Korea will continue to conduct financial operations to generate revenue for the regime, cyber espionage against perceived adversaries, and seek to expand IT worker operations.
Sovereign cloud will become a drumbeat across most of Europe, as EU member states seek to decrease their reliance on American tech companies.
Bob Mechler, director, Telco, Media, Entertainment and Gaming
The telecom cybersecurity landscape in 2026 will be dominated by the escalation of AI-driven attacks and persistent geopolitical instability. We may witness the first major AI-driven cybersecurity breach, as adversaries use AI to automate exploit development and craft sophisticated attacks that outpace traditional defenses.
This technological escalation coincides with a baseline of state-backed and politically-motivated cyber-threat activity, where critical infrastructure is targeted as part of broader geopolitical conflicts. Recent state-sponsored campaigns, such as Salt Typhoon, highlight how adversaries are already penetrating telecommunications networks to establish long-term access, posing a systemic threat to national security.
Toby Scales, security advisor
Sovereign cloud will become a drumbeat across most of Europe, as EU member states seek to decrease their reliance on American tech companies.
At the same time, the AI capability gap will continue to widen and both enterprises and governments will chase agreements with frontier model providers. Regulatory bodies may seek to enforce “locally hosted fine-tuned models” as a way to protect state secrets, but will face predictable opposition from frontier model developers.
Meeting regulatory obligations
Governance has taken on new importance in the AI era. Key areas of focus are expanding to include data integrity to prevent poisoning attacks, model security to defend against evasion and theft, and governance fundamentals to ensure transparency and accountability.
CISOs and governance, risk, and compliance teams should work together to build an AI resilience architecture, establish continuous AI health monitoring, integrate AI into business continuity and incident response, and embed AI resilience into security governance.
Bhavana Bhinder, security, privacy, and compliance advisor
In 2026, we will see the validated AI operating model become the industry standard for healthcare and life sciences (HCLS), with a shift from pilot projects to organizations seeking full-scale production deployments that are compliant and audit-ready by design. The logical evolution for HCLS will move towards agentic evaluation, where autonomous agents act as real-time auditors.
Instead of periodic reviews, these agents will continuously validate that generative AI outputs (such as clinical study reports) remain factually grounded and conform to regulatory standards. Organizations using governed, quality-scored data necessary to trust advanced models like Gemini across the drug lifecycle, clinical settings, and quality management will depend on AI workflows that natively support industry- and domain-specific regulations.
Odun Fadahunsi, senior security risk and compliance advisor
As regulators and sectoral bodies in finance, healthcare and critical infrastructure define AI-specific resilience obligations, CISOs must treat AI resilience as a primary pillar of security, not a separate or optional discipline. AI systems are poised to become so deeply embedded in identity, fraud detection, customer operations, cloud automation, and decisioning workflows that AI availability and reliability will directly determine an organization’s operational resilience.
Unlike traditional systems, AI can fail in silent, emergent, or probabilistic ways — drifting over time, degrading under adversarial prompt, and behaving unpredictably after upstream changes in data or model weights. These failure modes will create security blindspots, enabling attackers to exploit model weaknesses that bypass traditional controls.
CISOs and governance, risk, and compliance teams should work together to build an AI resilience architecture, establish continuous AI health monitoring, integrate AI into business continuity and incident response, and embed AI resilience into security governance.
For more leadership guidance from Google Cloud experts, please see ourCISO Insights hub.
Here are the latest updates, products, services, and resources from our security teams so far this month:
Responding to React2Shell (CVE-2025-55182): Follow these recommendations to minimize remote code execution risks in React and Next.js from the React2Shell (CVE-2025-55182) vulnerability. Read more.
How Google Does It: Securing production services, servers, and workloads: Here are the three core pillars that define how we protect production workloads at Google-scale. Read more.
How Google Does It: Using Binary Authorization to boost supply chain security: “Don’t trust, verify,” guides how we secure our entire software supply chain. Here’s how we use Binary Authorization to ensure that every component meets our security best practices and standards. Read more.
New data on ROI of AI in security: Our new ROI of AI in security report showcases how organizations are getting value from AI in cybersecurity, and finds a significant, practical shift is underway. Read more.
Using MCP with Web3: How to secure blockchain-interacting agents: In the Web3 world, who hosts AI agents, and who holds the private key to operations, are pressing questions. Here’s how to get started with the two most likely agent models. Read more.
Expanding the Google Unified Security Recommended program: We are excited to announce Palo Alto Networks as the latest addition to the Google Unified Security Recommended program, joining previously announced partners CrowdStike, Fortinet and Wiz. Read more.
Why PQC is Google’s path forward (and not QKD): After closely evaluating Quantum Key Distribution (QKD), here’s why we chose post-quantum cryptography (PQC) as the more scalable solution for our needs. Read more.
Architecting security for agentic capabilities in Chrome: Following the recent launch of Gemini in Chrome and the preview of agentic capabilities, here’s our approach and some new innovations to improve the safety of agentic browsing. Read more.
Android Quick Share support for AirDrop: As part of our efforts to continue to make cross-platform communication easier, we’ve made Quick Share interoperable with AirDrop, allowing for two-way file sharing between Android and iOS devices, starting with the Pixel 10 Family. Read more.
Please visit the Google Cloud blog for more security stories published this month.
aside_block
<ListValue: [StructValue([(‘title’, ‘Join the Google Cloud CISO Community’), (‘body’, <wagtail.rich_text.RichText object at 0x7fa5b03dd430>), (‘btn_text’, ‘Learn more’), (‘href’, ‘https://rsvp.withgoogle.com/events/ciso-community-interest?utm_source=cgc-blog&utm_medium=blog&utm_campaign=2024-cloud-ciso-newsletter-events-ref&utm_content=-&utm_term=-‘), (‘image’, <GAEImage: GCAT-replacement-logo-A>)])]>
Threat Intelligence news
Intellexa’s prolific zero-day exploits continue: Despite extensive scrutiny and public reporting, commercial surveillance vendors such as Intellexa continue to operate unimpeded. Known for its “Predator” spyware, new GTIG analysis shows that Intellexa is evading restrictions and thriving. Read more.
APT24’s pivot to multi-vector attacks: GTIG is tracking a long-running and adaptive cyber espionage campaign by APT24, a People’s Republic of China (PRC)-nexus threat actor that has been deploying BADAUDIO over the past three years. Here’s our analysis of the malware, and how defenders can detect and mitigate this persistent threat. Read more.
Get going with Time Travel Debugging using a .NET process hollowing case study: Unlike traditional live debugging, this technique captures a deterministic, shareable record of a program’s execution. Here’s how to start incorporating TTD into your analysis. Read more.
Analysis of UNC1549 targeting the aerospace and defense ecosystem: Following last year’s post on suspected Iran-nexus espionage activity targeting the aerospace, aviation, and defense industries in the Middle East, we discuss additional tactics, techniques, and procedures (TTPs) observed in incidents Mandiant has responded to. Read more.
Please visit the Google Cloud blog for more threat intelligence stories published this month.
Now hear this: Podcasts from Google Cloud
The truth about autonomous AI hacking: Heather Adkins, Google’s Security Engineering vice-president, separates the hype from the hazards of autonomous AI hacking, with hosts Anton Chuvakin and Tim Peacock. Listen here.
Escaping 1990s vulnerability management: Caleb Hoch, consulting manager for security transformations, Mandiant, discusses with Anton and Tim how vulnerability management has evolved beyond basic scanning and reporting, and the biggest gaps between modern practices and what organizations are actually doing. Listen here.
The art and craft of cloud bug hunting: Bug bounty professionals Sivanesh Ashok and Sreeram KL, have won the Most Valuable Hacker award from the Google Cloud VRP team. They chat about all things buggy with Anton and Tim, including how to write excellent bug bounty reports. Listen here.
Behind the Binary: The art of deconstructing problems: Host Josh Stroschein is joined by Nino Isakovic, a long-time low-level security expert, for a thought-provoking conversation that spans the foundational and the cutting-edge — including his discovery of the ScatterBrain obfuscating compiler. Listen here.
To have our Cloud CISO Perspectives post delivered twice a month to your inbox, sign up for our newsletter. We’ll be back in a few weeks with more security-related updates from Google Cloud.
We can all agree that the quality of AI-driven answers relies on the consistency of the underlying data. But AI models, while powerful, lack business context out of the box. As more organizations ask questions of their data using natural language, it is increasingly important to unify business measures and dimensions, ensure consistency company-wide. If you want trustworthy AI, what you need is a semantic layer that acts as the single source of truth for business metrics.But how do you make that data accessible and actionable for your end users? Building off the recent introduction of Looker’s Model Context Protocol (MCP) server, in this blog we take you through the process of creating an Agent Development Kit (ADK) agent that is connected to Looker via the MCP Toolbox for Databases and exposing it within Gemini Enterprise. Let’s get started.Step 1 – Set up Looker Integration in MCP Toolbox
MCP Toolbox for Databases is a central open-source server that hosts and manages toolsets, enabling agentic applications to leverage Looker’s capabilities without working directly with the platform. Instead of managing tool logic and authentication themselves, agents act as MCP clients and request tools from the Toolbox. The MCP Toolbox handles all the underlying complexities, including secure connections to Looker, authentication and query execution.
The MCP Toolbox for Databases natively supports Looker’s pre-built toolset. To access these tools, follow the below steps:
Connect to Cloud Shell. Check that you’re already authenticated, and that the project is set to your project ID using the following command:
Install the binary version of the MCP Toolbox for Databases via the script given below. This command is for Linux; if you run on Macintosh or Windows, ensure that you download the correct binary. Check out the releases page for your Operation System and Architecture and download the correct binary.
code_block
<ListValue: [StructValue([(‘code’, ‘export OS=”linux/amd64″ # one of linux/amd64, darwin/arm64, darwin/amd64, or windows/amd64rncurl -O https://storage.googleapis.com/genai-toolbox/v0.12.0/$OS/toolboxrnchmod +x toolbox’), (‘language’, ”), (‘caption’, <wagtail.rich_text.RichText object at 0x7fa5baae0a30>)])]>
Deploy Toolbox to Cloud Run
Next, you’ll need to run MCP Toolbox. The simplest way to do that is on Cloud Run, Google Cloud’s fully managed container application platform. Here’s how:
The Cloud Run will ask if you want Unauthenticated, select No.Allow Unauthenticated: N
Step 2: Deploy ADK Agent to Agent Engine
Next, you need to configure Agent Development Kit (ADK), a flexible and modular framework for developing and deploying AI agents. ADK was designed to make agent development feel more like software development, to make it easier for developers to create, deploy, and orchestrate agentic architectures that range from simple tasks to complex workflows. And while ADK is optimized for Gemini and the Google ecosystem, it’s also model-agnostic, deployment-agnostic, and is built for compatibility with other frameworks.
Vertex AI Agent Engine, a part of the Vertex AI Platform, is a set of services that enables developers to deploy, manage, and scale AI agents in production. Agent Engine handles the infrastructure to scale agents in production so you can focus on creating applications.
Open a new terminal tab in Cloud Shell and create a folder named my-agents as follows. You also need to navigate to the my-agents folder.
Now you’re ready to use adk to create a scaffolding, including folders, environment and basic files, for our Looker Agent Application via the adkcreate command with an app name looker_app:
Gemini model for choosing a model for the root agent
Vertex AI for the backend
Your default Google Project Id and region
code_block
<ListValue: [StructValue([(‘code’, ‘Choose a model for the root agent:rn1. gemini-2.5-flash-001rn2. Other models (fill later)rnChoose model (1, 2): 1rnrnrn1. Google AIrn2. Vertex AIrnChoose a backend (1, 2): 2rnrnEnter Google Cloud project ID [your_current_project_id]:rnEnter Google Cloud region [us-central1]:rnrnAgent created in /home/romin/looker-app:rn- .envrn- __init__.pyrn- agent.py’), (‘language’, ”), (‘caption’, <wagtail.rich_text.RichText object at 0x7fa5bd2720a0>)])]>
Observe the folder in which a default template and required files for the Agent have been created.
First up is the .env file:
code_block
<ListValue: [StructValue([(‘code’, ‘GOOGLE_GENAI_USE_VERTEXAI=1rnGOOGLE_CLOUD_PROJECT=YOUR_GOOGLE_PROJECT_IDrnGOOGLE_CLOUD_LOCATION=YOUR_GOOGLE_PROJECT_REGION’), (‘language’, ”), (‘caption’, <wagtail.rich_text.RichText object at 0x7fa5bd272220>)])]>
The values indicate that you will be using Gemini via Vertex AI along with the respective values for the Google Cloud Project Id and location.
Then you have the __init__.py file that marks the folder as a module and has a single statement that imports the agent from the agent.py file:
Finally, take a look at the agent.py file. The contents can be edited to similar to the example below:
Insert the Cloud Run URL highlighted here (. not the one with the project number in the url).
code_block
<ListValue: [StructValue([(‘code’, ‘import osrnfrom google.adk.agents import LlmAgentrnfrom google.adk.planners.built_in_planner import BuiltInPlannerrnfrom google.adk.tools.mcp_tool.mcp_toolset import MCPToolsetrnfrom google.adk.tools.mcp_tool.mcp_session_manager import SseConnectionParams, StreamableHTTPConnectionParamsrnfrom google.genai.types import ThinkingConfigrnfrom google.auth import compute_enginernimport google.auth.transport.requestsrnimport google.oauth2.id_tokenrnrn# Replace this URL with the correct endpoint for your MCP server.rnMCP_SERVER_URL = “YOUR_CLOUD_RUN_URL/mcp”rnif not MCP_SERVER_URL:rn raise ValueError(“The MCP_SERVER_URL is not set.”)rndef get_id_token():rn “””Get an ID token to authenticate with the MCP server.”””rn target_url = MCP_SERVER_URLrn audience = target_url.split(‘/mcp’)[0]rn auth_req = google.auth.transport.requests.Request()rn id_token = google.oauth2.id_token.fetch_id_token(auth_req, audience)rn # Get the ID token.rn return id_tokenrnrnrnroot_agent = LlmAgent(rn model=’gemini-2.5-flash’,rn name=’looker_agent’,rn description=’Agent to answer questions about Looker data.’,rn instruction=(rn ‘You are a helpful agent who can answer user questions about Looker data the user has access to. Use the tools to answer the question. If you are unsure on what model to use, try defaulting to thelook and if you are also unsure on the explore, try order_items if using thelook model’rn ),rnplanner=BuiltInPlanner(rnthinking_config=ThinkingConfig(include_thoughts=False, thinking_budget=0)rn),rntools=[rnMCPToolset(rnconnection_params=StreamableHTTPConnectionParams(rnurl=MCP_SERVER_URL,rnheaders={rn”Authorization”: f”Bearer {get_id_token()}”,rn}rn),rnerrlog=None,rn# Load all tools from the MCP server at the given URLrntool_filter=None,rn)rn],rn)’), (‘language’, ”), (‘caption’, <wagtail.rich_text.RichText object at 0x7fa5bd272940>)])]>
NOTE: Ensure you grant the Cloud Run Invoker role to the default Agent Engine Service Account (i.e., service-PROJECT_NUMBER@gcp-sa-aiplatform-re.iam.gserviceaccount.com)
Step 3: Connect to Gemini Enterprise
Now it’s time to create a Gemini Enterprise app (instructions here).
Run the below command with the GCP Project Number, Reasoning Engine resource name output from the ‘deploy agent_engine’ command above, and your Gemini Enterprise Agent ID from the Gemini Enterprise Apps interface:
Your Looker data will now be available within your Gemini Enterprise app.If you don’t have access to this feature, contact your Google Cloud account team.
Querying business data made easier
Connecting Looker’s semantic layer to Vertex AI Agent services by way of the ADK and MCP Toolbox is a big win for data accessibility. By exposing your trusted Looker models and Explores in Gemini Enterprise, you empower end-users to query complex business data using natural language. This integration closes the gap between data insights and immediate action, ensuring that your organization’s semantic layer is not just a source of passive reports, but an active, conversational, and decision-driving asset.