In today’s data-driven world, teams struggle with siloed data, lack of business context, data reliability concerns, and inconsistent governance that hinders actionable insights. But what if there was a way that could transform your data landscape, unlocking the true value of your information?
That’s the problem we aim to solve with the experimental launch of data products in BigQuery, announced at Google Cloud Next.
Data products in BigQuery offer an approach to organizing, sharing, and leveraging your most valuable asset by treating data as a product. Imagine a ‘Customer Sales’ data product: a curated bundle of BigQuery views combining customer order details and regional sales data. The Sales Analytics team, as the data product owner, provides business context for campaign analysis, along with data freshness guarantees and a dedicated point of contact. With this context and these guarantees, data consumers can now effectively use this data product to make informed business decisions related to customer sales.
A data product in BigQuery simplifies the transaction between data producers and consumers by allowing data producers to bundle one or more BigQuery tables or views that address a use case, and distribute them as a logical block to data consumers. While BigQuery already provides a powerful way to share data through datasets listed in data exchanges, data products go beyond this by offering a higher level of abstraction and richer context on the business case the data addresses. Data products are available within the BigQuery experience, allowing data consumers to search and discover relevant assets as one consumable unit.
A data product allows data producers to manage their data as a product, which entails the following:
Build for use cases: Identify the customer, use case, and build a data product with one or more assets that addresses the use case.
Establish ownership: Define the owner and contact information for the data product, helping to ensure accountability and provide trust for consumers.
Democratize context: Distribute valuable context about the problems the product addresses, usage examples and expectations.
Streamline contracts: Provide data consumers the ability to annotate details on data freshness and quality to provide trust and cut down time to insight.
Govern assets: Control who can view the product and regulate access to the data that’s distributed via the data product.
Discover data: Provide data consumers the ability to easily discover and search data products.
Distribute data: Distribute the data product beyond the organization’s boundaries into private consortiums or to the public via a data exchange.
Evolve offerings: Iterate and evolve the product to address consumer needs.
When data producers build assets that address use cases and manage data as a product, it allows data teams to be more efficient, with:
Reduced redundancy: By creating standardized and reusable data products, data teams avoid building the same datasets or pipelines repeatedly for different users or purposes. This frees up their time and resources.
Better prioritization: Treating data as a product helps data teams prioritize their work based on the value and impact of each data product, aligning their efforts with business needs.
Demonstrable ROI: By tracking the usage and the impact of a data product, data teams can better measure and communicate the value of their work to the organization.
Built-in data governance: In the future, data products will be able to incorporate governance policies and compliance workflows, helping to ensure that data is managed responsibly and consistently.
Finally, all of these translate to efficiency for the data consumer by reducing the toil involved in finding the right asset. Data consumers get faster access to insight, since anyone within the organization can search, browse, and discover data products, as well as subscribe to the data product. They also get increased trust, because when data is well-defined, reliable, and properly documented, it’s easier to select the right data for a given use case.
Data products in BigQuery provide the building blocks and controls you need to manage data as a product. It leads to faster access to insights for data consumers through business-outcome-driven data management, maximizing value to the organization.
Are you ready to unlock the untapped potential of your data? Sign up for the experimental preview here.
Have you ever had something on the tip of your tongue, but you weren’t exactly sure how to describe what’s in your mind?
For developers, this is where “vibe coding” comes in. Vibe coding helps developers realize their vision by using models like Gemini 2.5 Pro to generate code from natural language prompts. Instead of writing every line of code, developers can now describe the desired functionality in plain language, and AI translates these “vibes” into your vision.
Today, we’ll show you how vibe coding can help developers create Model Context Protocol (MCP) servers. MCP, launched in November 2024 by Anthropic, provides an open standard for integrating AI models with various data sources and tools. Since its release, it has become increasingly popular for building AI applications – including with new experimental models like Gemini 2.5.
You can use Gemini 2.5 Pro’s code generation capabilities to create MCP servers with ease, turning intuitive, natural-language specifications into operational AI infrastructure.
The methodology
Effective AI-assisted coding, especially for specific tasks like generating MCP server code with models such as Gemini 2.5 Pro, starts with clear prompting. To achieve the best results:
Provide context: Offer relevant background information about the MCP server.
Be specific: Give clear and detailed instructions for the code you need.
Be patient: Generating and refining code can take time.
Remember that this is often an iterative process. Be prepared to refine your instructions and regenerate the code until the results are satisfactory.
How to create an MCP server using vibe coding, step by step
There are two ways to leverage Gemini 2.5 Pro for vibe coding: through the Gemini app (gemini.google.com) or by using the Google Gen AI SDK.
Approach 1: Use the Gemini app
Visit gemini.google.com and upload the saved PDF file.
Enter your prompt to generate the desired code.
Here’s an example of a prompt to generate a Google Cloud BigQuery MCP server:
instruction = """
You are an MCP server expert. Your mission is to write python code for MCP server. The MCP server development guide and examples are provided.
Please create MCP server code for Google Cloud BigQuery. It has two tools:
One is to list tables for all datasets,
The other is to describe a table.
Google Cloud project ID and location will be provided in the query string. Please use project id to access BigQuery client.
"""
Copy your code and test the server using this notebook.
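For reference, a server generated from a prompt like this often resembles the following minimal sketch, which assumes the MCP Python SDK’s FastMCP helper and the google-cloud-bigquery client and omits the location handling mentioned in the prompt; the exact code Gemini produces will differ:
from google.cloud import bigquery
from mcp.server.fastmcp import FastMCP

# Name of the MCP server as it will appear to MCP clients.
mcp = FastMCP("bigquery")

@mcp.tool()
def list_tables(project_id: str) -> str:
    """List tables for all datasets in the given project."""
    client = bigquery.Client(project=project_id)
    lines = []
    for dataset in client.list_datasets():
        for table in client.list_tables(dataset.dataset_id):
            lines.append(f"{dataset.dataset_id}.{table.table_id}")
    return "\n".join(lines)

@mcp.tool()
def describe_table(project_id: str, table_id: str) -> str:
    """Describe a table, returning its schema as 'name: type' lines."""
    client = bigquery.Client(project=project_id)
    table = client.get_table(table_id)  # e.g. "dataset.table"
    return "\n".join(f"{field.name}: {field.field_type}" for field in table.schema)

if __name__ == "__main__":
    mcp.run(transport="stdio")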
Alternatively, you can use Google Gen AI SDK to create your server code in a notebook.
Approach 2: Use the Google Gen AI SDK
1. Begin by configuring the system instruction.
system_instruction = f"""
You are an MCP server expert.
Your mission is to write python code for MCP server.
Here's the MCP server development guide and example:
{reference_content}
"""
2. Set user prompt
This step involves defining the instructions or questions that will guide the Gemini Vibe Coding process. The user prompt acts as the input for the AI model, specifying the desired outcome for building MCP servers.
url = "https://medlineplus.gov/about/developers/webservices/"
prompt_base = """
Please create an MCP server code for https://medlineplus.gov/about/developers/webservices/. It has one tool:
- get_medical_term. You provide a medical term, this tool will return an explanation of the medical term.
Here's the API details:
"""
prompt = [prompt_base, types.Part.from_uri(file_uri=url, mime_type="text/html")]
The example above creates an MCP server for a government website that offers a free API service. To improve Gemini’s understanding of the API being used, the API service URL is provided as additional context for content generation.
3. Generate code
Utilize the provided function to create the necessary server code.
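A minimal sketch of such a generation call with the Google Gen AI SDK might look like the following; the model ID and Vertex AI project settings are assumptions, so substitute your own values:
from google import genai
from google.genai import types

# Assumes Vertex AI; project and location are placeholders.
client = genai.Client(vertexai=True, project="your-project-id", location="us-central1")

response = client.models.generate_content(
    model="gemini-2.5-pro-exp-03-25",  # assumed model ID; use the current Gemini 2.5 Pro ID
    contents=prompt,  # the [prompt_base, types.Part.from_uri(...)] list defined above
    config=types.GenerateContentConfig(system_instruction=system_instruction),
)
print(response.text)  # the generated MCP server code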
4. Use this notebook to test the server. The complete and detailed code is available within this notebook.
Test it by yourself
Gemini 2.5 Pro, currently in preview, offers exceptional code generation capabilities for MCP servers, drastically speeding up and easing the development of your MCP applications. Keep in mind that vibe coding, even with models like Gemini 2.5 Pro, may produce errors, so thorough code review is essential before implementation.
To begin creating your own code, explore the Gemini app. We suggest experimenting with various prompts and Gemini models.
Google Threat Intelligence Group (GTIG) has identified a new piece of malware called LOSTKEYS, attributed to the Russian government-backed threat group COLDRIVER (also known as UNC4057, Star Blizzard, and Callisto). LOSTKEYS is capable of stealing files from a hard-coded list of extensions and directories, along with sending system information and running processes to the attacker. Observed in January, March, and April 2025, LOSTKEYS marks a new development in the toolset of COLDRIVER, a group primarily known for credential phishing against high-profile targets like NATO governments, non-governmental organizations (NGOs), and former intelligence and diplomatic officers. GTIG has been tracking COLDRIVER for many years, including their SPICA malware in 2024.
COLDRIVER typically targets high-profile individuals at their personal email addresses or at NGO addresses. They are known for stealing credentials and after gaining access to a target’s account they exfiltrate emails and steal contact lists from the compromised account. In select cases, COLDRIVER also delivers malware to target devices and may attempt to access files on the system.
Recent targets in COLDRIVER’s campaigns have included current and former advisors to Western governments and militaries, as well as journalists, think tanks, and NGOs. The group has also continued targeting individuals connected to Ukraine. We believe the primary goal of COLDRIVER’s operations is intelligence collection in support of Russia’s strategic interests. In a small number of cases, the group has been linked to hack-and-leak campaigns targeting officials in the UK and an NGO.
To safeguard at-risk users, we use our research on serious threat actors like COLDRIVER to improve the safety and security of Google’s products. We encourage potential targets to enroll in Google’s Advanced Protection Program, enable Enhanced Safe Browsing for Chrome, and ensure that all devices are updated.
Stage 1 — It Starts With A Fake CAPTCHA
LOSTKEYS is delivered at the end of a multi-step infection chain that starts with a lure website displaying a fake CAPTCHA. Once the CAPTCHA has been “verified,” a PowerShell command is copied to the user’s clipboard and the page prompts the user to execute it via the “run” prompt in Windows:
The first stage PowerShell that is pasted in will fetch and execute the second stage. In multiple observed cases, the second stage was retrieved from 165.227.148[.]68.
COLDRIVER is not the only threat actor to deliver malware by socially engineering their targets to copy, paste, and then execute PowerShell commands—a technique commonly called “ClickFix.” We have observed multiple APT and financially motivated actors use this technique, which has also been widely reported publicly. Users should exercise caution when encountering a site that prompts them to exit the browser and run commands on their device, and enterprise policies should implement least privilege and disallow users from executing scripts by default.
Stage 2 — Device Evasion
The second stage calculates the MD5 hash of the display resolution of the device and if the MD5 is one of three specific values it will stop execution, otherwise it will retrieve the third stage. This step is likely done to evade execution in VMs. Each observed instance of this chain uses different, unique identifiers that must be present in the request to retrieve the next stage. In all observed instances the third stage is retrieved from the same host as the previous stages.
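To illustrate the logic described above (the actual malware implements this in PowerShell, and the hash values below are placeholders rather than the real ones), the evasion check amounts to something like:
import hashlib

# Placeholders: the real chain compares against three specific MD5 values.
BLOCKED_RESOLUTION_HASHES = {"<md5-1>", "<md5-2>", "<md5-3>"}

def should_continue(display_resolution: str) -> bool:
    """Return False (stop execution) if the resolution's MD5 matches a blocked value."""
    digest = hashlib.md5(display_resolution.encode()).hexdigest()
    return digest not in BLOCKED_RESOLUTION_HASHES

# e.g. should_continue("1024x768") would return False if that resolution's
# MD5 were on the blocklist, which is how common VM resolutions are filtered out.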
Stage 3 — Retrieval of the Final Payload
The third stage is a Base64-encoded blob, which decodes to more PowerShell. This stage retrieves and decodes the final payload. To do this it pulls down two more files, from the same host as the others, and again using different unique identifiers per infection chain.
The first is a Visual Basic Script (VBS) file, which we call the “decoder” that is responsible for decoding the second one. The decoding process uses two keys, which are unique per infection chain. The decoder has one of the unique keys and the second key is stored in stage 3. The keys are used in a substitution cipher on the encoded blob, and are unique to each infection chain. A Python script to decode the final payload is:
# Args: encoded_file Ah90pE3b 4z7Klx1V
import base64
import sys

if len(sys.argv) != 4:
    print("Usage: decode.py file key1 key2")
    sys.exit(1)

if len(sys.argv[2]) != len(sys.argv[3]):
    print("Keys must be the same length")
    sys.exit(1)

# Read the encoded blob retrieved alongside the VBS decoder.
with open(sys.argv[1], 'r') as f:
    data = f.read()

x = sys.argv[2]
y = sys.argv[3]

# Substitution cipher: swap each character of key1 with the corresponding
# character of key2, using '!' as a temporary placeholder.
for i in range(len(x)):
    data = data.replace(x[i], '!').replace(y[i], x[i]).replace('!', y[i])

# The result is standard Base64; decode it to recover the final payload.
with open(sys.argv[1] + '.out', 'wb') as f:
    f.write(base64.b64decode(data))
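For example, running python decode.py encoded_file Ah90pE3b 4z7Klx1V (substituting the two per-infection keys recovered from the decoder VBS and stage 3) writes the decoded payload to encoded_file.out.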
The Final Payload (LOSTKEYS)
The end result of this is a VBS that we call LOSTKEYS. It is a piece of malware that is capable of stealing files from a hard-coded list of extensions and directories, along with sending system information and running processes to the attacker. The typical behavior of COLDRIVER is to steal credentials and then use them to steal emails and contacts from the target, but as we have previously documented they will also deploy malware called SPICA to select targets if they want to access documents on the target system. LOSTKEYS is designed to achieve a similar goal and is only deployed in highly selective cases.
A Link To December 2023
As part of the investigation into this activity, we discovered two additional samples, hashes of which are available in the IOCs section, dating back as early as December 2023. In each case, the samples end up executing LOSTKEYS but are distinctly different from the execution chain mentioned here in that they are Portable Executable (PE) files pretending to be related to the software package Maltego.
It is currently unclear if these samples from December 2023 are related to COLDRIVER, or if the malware was repurposed from a different developer or operation into the activity seen starting in January 2025.
Protecting the Community
As part of our efforts to combat threat actors, we use the results of our research to improve the safety and security of Google’s products. Upon discovery, all identified malicious websites, domains and files are added to Safe Browsing to protect users from further exploitation. We also send targeted Gmail and Workspace users government-backed attacker alerts notifying them of the activity and encouraging potential targets to enable Enhanced Safe Browsing for Chrome and ensure that all devices are updated.
We are committed to sharing our findings with the security community to raise awareness and with companies and individuals that might have been targeted by these activities. We hope that improved understanding of tactics and techniques will enhance threat hunting capabilities and lead to stronger user protections across the industry.
Indicators of compromise (IOCs) and YARA rules are included in this post, and are also available as a GTI collection and rule pack.
YARA Rules
rule LOSTKEYS__Strings {
    meta:
        author = "Google Threat Intelligence"
        description = "wscript that steals documents and beacons system information out to a hardcoded address"
        hash = "28a0596b9c62b7b7aca9cac2a07b067109f27d327581a60e8cb4fab92f8f4fa9"
    strings:
        $rep0 = "my_str = replace(my_str,a1,\"!\" )"
        $rep1 = "my_str = replace(my_str,b1 ,a1 )"
        $rep2 = "my_str = replace(my_str,\"!\" ,b1 )"
        $mid0 = "a1 = Mid(ch_a,ina+1,1)"
        $mid1 = "b1 = Mid(ch_b,ina+1,1)"
        $req0 = "ReqStr = base64encode( z & \";\" & ws.ExpandEnvironmentStrings(\"%COMPUTERNAME%\") & \";\" & ws.ExpandEnvironmentStrings(\"%USERNAME%\") & \";\" & fso.GetDrive(\"C:\\\").SerialNumber)"
        $req1 = "ReqStr = Chain(ReqStr,\"=+/\",\",-_\")"
        $cap0 = "CapIN \"systeminfo > \"\"\" & TmpF & \"\"\"\", 1, True"
        $cap1 = "CapIN \"ipconfig /all >> \"\"\" & TmpF & \"\"\"\", 1, True"
        $cap2 = "CapIN \"net view >> \"\"\" & TmpF & \"\"\"\", 1, True"
        $cap3 = "CapIN \"tasklist >> \"\"\" & TmpF & \"\"\"\", 1, True"
    condition:
        all of ($rep*) or all of ($mid*) or all of ($req*) or all of ($cap*)
}
UNC3944, which overlaps with public reporting on Scattered Spider, is a financially-motivated threat actor characterized by its persistent use of social engineering and brazen communications with victims. In early operations, UNC3944 largely targeted telecommunications-related organizations to support SIM swap operations. However, after shifting to ransomware and data theft extortion in early 2023, they impacted organizations in a broader range of industries. Since then, we have regularly observed UNC3944 conduct waves of targeting against a specific sector, such as financial services organizations in late 2023 and food services in May 2024. Notably, UNC3944 has also previously targeted prominent brands, possibly in an attempt to gain prestige and increased attention by news media.
Google Threat Intelligence Group (GTIG) observed a decline in UNC3944 activity after 2024 law enforcement actions against individuals allegedly associated with the group. Threat actors will often temporarily halt or significantly curtail operations after an arrest, possibly to reduce law enforcement attention, rebuild capabilities and/or partnerships, or shift to new tooling to evade detection. UNC3944’s existing ties to a broader community of threat actors could potentially help them recover from law enforcement actions more quickly.
Recent public reporting has suggested that threat actors used tactics consistent with Scattered Spider to target a UK retail organization and deploy DragonForce ransomware. Subsequent reporting by BBC News indicates that actors associated with DragonForce claimed responsibility for attempted attacks at multiple UK retailers. Notably, the operators of DragonForce ransomware recently claimed control of RansomHub, a ransomware-as-a-service (RaaS) that seemingly ceased operations in March of this year. UNC3944 was a RansomHub affiliate in 2024, after the ALPHV (aka Blackcat) RaaS shut down.
While GTIG has not independently confirmed the involvement of UNC3944 or the DragonForce RaaS, over the past few years, retail organizations have been increasingly posted on tracked data leak sites (DLS) used by extortion actors to pressure victims and/or leak stolen victim data. Retail organizations accounted for 11 percent of DLS victims in 2025 thus far, up from about 8.5 percent in 2024 and 6 percent in 2022 and 2023. It is plausible that threat actors including UNC3944 view retail organizations as attractive targets, given that they typically possess large quantities of personally identifiable information (PII) and financial data. Further, these companies may be more likely to pay a ransom demand if a ransomware attack impacts their ability to process financial transactions.
UNC3944 global targeting map
We have observed the following patterns in UNC3944 victimology:
Targeted Sectors: The group targets a wide range of sectors, with a notable focus on Technology, Telecommunications, Financial Services, Business Process Outsourcing (BPO), Gaming, Hospitality, Retail, and Media & Entertainment organizations.
Geographical Focus: Targets are primarily located in English-speaking countries, including the United States, Canada, the United Kingdom, and Australia. More recent campaigns have also included targets in Singapore and India.
Victim Organization Size: UNC3944 often targets large enterprise organizations, likely due to the potential for higher impact and ransom demands. They specifically target organizations with large help desk and outsourced IT functions which are susceptible to their social engineering tactics.
A high-level overview of UNC3944 tactics, techniques, and procedures (TTPs) is noted in the following figure.
UNC3944 attack lifecycle
Proactive Hardening Recommendations
The following provides prioritized recommendations to protect against tactics utilized by UNC3944, organized within the pillars of:
Identity
Endpoints
Applications and Resources
Network Infrastructure
Monitoring / Detections
While implementing the full suite of the recommendations in this guide will generally have some impact on IT and normal operations, Mandiant’s extensive experience supporting organizations to defend against, contain, and eradicate UNC3944 has shown that an effective starting point involves prioritizing specific areas. Organizations should begin by focusing on recommendations that:
Achieve complete visibility across all infrastructure, identity, and critical management services.
Ensure the segregation of identities throughout the infrastructure.
Enhance strong authentication criteria.
Enforce rigorous identity controls for password resets and multi-factor authentication (MFA) registration.
Educate and communicate the importance of remaining vigilant against modern-day social engineering attacks / campaigns (see Social Engineering Awareness section later in this post). UNC3944 campaigns not only target end-users, but also IT and administrative personnel within enterprise environments.
These serve as critical foundational measures upon which other recommendations in this guide can be built.
Google SecOps customers benefit from existing protections that actively detect and alert on UNC3944 activity.
Identity
Positive Identity Verification
UNC3944 has proven to be very prolific in using social engineering techniques to impersonate users when contacting the help desk. Therefore, further securing the “positive identity” process is critical.
Train help desk personnel to positively identify employees before modifying / providing security information (including initial enrollment). At a minimum, this process should be required for any privileged accounts and should include methods such as:
On-Camera / In-Person verification
ID Verification
Challenge / Response questions
If a suspected compromise is imminent or has occurred, temporarily disable or enhance validation for self-service password reset methods. Any account management activities should require a positive identity verification as the first step. Additionally, employees should be required to authenticate using strong authentication PRIOR to changing authentication methods (e.g., adding a new MFA device). Additionally, implement use of:
Trusted Locations
Notification of authentication / security changes
Out-of-band verification for high-risk changes. For example, require a call-back to a registered number or confirmation via a known corporate email before proceeding with any sensitive request.
Avoid reliance on publicly available personal data for verification (e.g., DOB, last 4 SSN) as UNC3944 often possesses this information. Use internal-only knowledge or real-time presence verification when possible.
Temporarily disable self-service MFA resets during elevated threat periods, and route all such changes through manual help desk workflows with enhanced scrutiny.
Strong Authentication
To prevent against social engineering or other methods used to bypass authentication controls:
Remove SMS, phone call, and/or email as authentication controls.
Utilize an authenticator app that requires phishing resistant MFA (e.g., number matching and/or geo-verification).
If possible, transition to passwordless authentication.
Leverage FIDO2 security keys for authenticating identities that are assigned privileged roles.
Ensure administrative users cannot register or use legacy MFA methods, even if those are permitted for lower-tier users.
Enforce multi-context criteria to enrich the authentication transaction. Examples include not only validating the identity, but also specific device and location attributes as part of the authentication transaction.
For organizations that leverage Google Workspace, these concepts can be enforced by using context-aware access policies.
For organizations that leverage Microsoft Entra ID, these concepts can be enforced by using a Conditional Access Policy.
MFA Registration and Modification
To prevent compromised credentials from being leveraged for modifying and registering an attacker-controlled MFA method:
Review authentication methods available for user registration and disallow any unnecessary or duplicative methods.
Restrict MFA registration and modification actions to only be permissible from trusted IP locations and based upon device compliance. For organizations that leverage Microsoft Entra ID, this can be accomplished using a Conditional Access Policy.
If a suspected compromise has occurred, MFA re-registration may be required. This action should only be permissible from corporate locations and/or trusted IP locations.
Review specific IP locations that can bypass the requirement for MFA. If using Microsoft Entra ID, these can be in Named Locations and the legacy Service Settings.
Investigate and alert when the same MFA method or phone number is registered across multiple user accounts, which may indicate attacker-controlled device registration.
Administrative Roles
To prevent against privilege escalation and further access to an environment:
For privileged access, decouple the organization’s identity store (e.g., Active Directory) from infrastructure platforms, services, and cloud admin consoles. Organizations should create local administrator accounts (e.g., local VMware VCenter Admin account). Local administrator accounts should adhere to the following principles:
Created with long and complex passwords
Passwords should not be temporarily stored within the organization’s password management or vault solution
Enforcement of Multi-Factor Authentication (MFA)
Restrict administrative portals to only be accessible from trusted locations and with privileged identities.
Leverage just-in-time controls for leveraging (“checking out”) credentials associated with privileged actions.
Enforce access restrictions and boundaries that follow the principle of least-privilege for accessing and administering cloud resources.
For organizations that leverage Google Cloud, these concepts can be enforced by using IAM deny policies or principal access boundary policies.
For organizations that leverage Microsoft Entra ID, these concepts can be enforced by using Azure RBAC and Entra ID RBAC controls.
Enforce that privileged accounts are hardened to prevent exposure or usage on non-Tier 0 or non-PAW endpoints.
Playbooks
Modern-day authentication is predicated on more than just a singular password. Therefore, organizations should ensure that processes and associated playbooks include steps to:
Revoke tokens and access keys.
Review MFA device registrations.
Review changes to authentication requirements.
Review newly enrolled devices and endpoints.
Endpoints
Device Compliance and Validation
An authentication transaction should not only include strong requirements for identity verification, but also require that the device be authenticated and validated. Organizations should consider the ability to:
Enforce posture checks for devices remotely connecting to an environment (e.g., via a VPN). Example posture checks for devices include:
Validating the installation of a required host-based certificate on each endpoint.
Verifying that the endpoint operates on an approved Operating System (OS) and meets version requirements.
Confirming the organization’s Endpoint Detection and Response (EDR) agent is installed and actively running. Enforce EDR installation and monitoring for all managed endpoint devices.
Rogue / Unauthorized Endpoints
To prevent against threat actors leveraging rogue endpoints to access an environment, organizations should:
Monitor for rogue bastion hosts or virtual machines that are either newly created or recently joined to a managed domain.
Harden policies to restrict the ability to join devices to Entra or on-premises Active Directory.
Review authentication logs for devices that contain default Windows host names.
Lateral Movement Hardening
To prevent against lateral movement using compromised credentials, organizations should:
Limit the ability for local accounts to be used for remote (network-based) authentication.
Disable or restrict local administrative and/or hidden shares from being remotely accessible.
Enforce local firewall rules to block inbound SMB, RDP, WinRM, PowerShell, & WMI.
GPOs: User Rights Assignment Lockdown (Active Directory)
For domain-based privileged and service accounts, where possible, organizations should restrict the ability for accounts to be leveraged for remote authentication to endpoints. This can be accomplished using a Group Policy Object (GPO) configuration for the following user rights assignments:
Deny log on locally
Deny log on through Remote Desktop Services
Deny access to this computer from network
Deny log on as a batch
Deny log on as a service
Applications and Resources
Virtual Private Network (VPN) Access
Threat actors may attempt to change or disable VPN agents to limit network visibility by security teams. Therefore, organizations should:
Disable the ability for end users to modify VPN agent configurations.
Ensure appropriate logging when configuration changes are made to VPN agents.
For managed devices, consider an “Always-On” VPN configuration to ensure continuous protection.
Privileged Access Management (PAM) Systems
To prevent against threat actors attempting to gain access to privileged access management (PAM) systems, organizations should:
Isolate and enforce network and identity access restrictions for enterprise password managers or privileged access management (PAM) systems. This should also include leveraging dedicated and segmented servers / appliances for PAM systems, which are isolated from enterprise infrastructure and virtualization platforms.
Reduce the scope of accounts that have access to PAM systems, in addition to requiring strong authentication (MFA).
Enforce role-based access controls (RBAC) within PAM systems, restricting the scope of accounts that can be accessed (based upon an assigned role).
Follow the principle of just-in-time (JIT) access for checking-out credentials stored in PAM systems.
Virtualization Infrastructure
To prevent against threat actors attempting to gain access to virtualization infrastructure, organizations should:
Isolate and restrict access to ESXi hosts / vCenter Server Appliances.
Ensure that backups of virtual machines are isolated, secured and immutable if possible.
Unbind the authentication for administrative access to virtualization platforms from the centralized identity provider (IdP). This includes individual ESXi hosts and vCenter Servers.
Proactively rotate local root / administrative passwords for privileged identities associated with virtualization platforms.
If possible use stronger MFA and bind to local SSO for all administrative access to virtualization infrastructure.
Enforce randomized passwords for local root / administrative identities correlating to each virtualized host that is part of an aggregate pool.
Disable / restrict SSH (shell) access to virtualization platforms.
Enable lockdown mode on all ESXi hosts.
Enhance monitoring to identify potential malicious / suspicious authentication attempts and activities associated with virtualization platforms.
Backup Infrastructure
To prevent against threat actors attempting to gain access to backup infrastructure and data, organizations should:
Leverage unique and separate (non-identity provider integrated) credentials for accessing and managing backup infrastructure, in addition to the enforcement of MFA for the accounts.
Ensure that backup servers are isolated from the production environment and reside within a dedicated network. To further protect backups, they should be within an immutable backup solution.
Implement access controls that restrict inbound traffic and protocols for accessing administrative interfaces associated with backup infrastructure.
Periodically validate the protection and integrity of backups by simulating adversarial behaviors (red teaming).
Endpoint Security Management
To prevent against threat actors weaponizing endpoint security and management technologies such as EDR and patch management tools, organizations should:
Segment administrative access to endpoint security tooling platforms.
Reduce the scope of identities that have the ability to create, edit, or delete Group Policy Objects (GPOs) in on-premises Active Directory.
If Intune is leveraged, enforce Intune access policies that require multi-administrator approval (MMA) to approve and enforce changes.
Monitor and review unauthorized access to EDR and patch management technologies.
Monitor script and application deployment on endpoints and systems using EDR and patch management technologies.
Review and monitor “allow-listed” executables, processes, paths, and applications.
Inventory installed applications on endpoints and review for potential unauthorized installations of remote access tools (RATs) and reconnaissance tools.
Cloud Resources
To prevent against threat actors leveraging access to cloud infrastructure for additional persistence and access, organizations should:
Monitor and review cloud resource configurations to identify and investigate newly created resources, exposed services, or other unauthorized configurations.
Monitor cloud infrastructure for newly created or modified network security group (NSG) rules, firewall rules, or publicly exposed resources that can be remotely accessed.
Monitor for the creation of programmatic keys and credentials (e.g., access keys).
Network Infrastructure
Access Restrictions
To proactively identify exposed applications, ingress pathways, and to reduce the risk of unauthorized access, organizations should:
Leverage vulnerability scanning to perform an external unauthenticated scan to identify publicly exposed domains, IPs, and CIDR IP ranges.
Enforce strong authentication (e.g., phishing-resistant MFA) for accessing any applications and services that are publicly accessible.
For sensitive data and applications, enforce connectivity to cloud environments / SaaS applications to only be permissible from specific (trusted) IP ranges.
Block TOR exit node and VPS IP ranges.
Network Segmentation
The term “Trusted Service Infrastructure” (TSI) typically refers to the management interfaces for platforms and technologies that provide core services for an organization. Examples include:
Asset and Patch Management Tools
Network Management Tools and Devices
Virtualization Platforms
Backup Technologies
Security Tooling
Privileged Access Management Systems
To minimize the direct access and exposure of the management plane for TSI, organizations should:
Restrict access to TSI to only originate from internal / hardened network segments or PAWs.
Create detections focused on monitoring network traffic patterns for directly accessing TSI, and alert on anomalies or suspicious traffic.
Egress Restrictions
To restrict the ability for command-and-control and reduce the capabilities for mass data exfiltration, organizations should:
Restrict egress communications from all servers. Organizations should prioritize enforcing egress restrictions from servers associated with TSI, Active Directory domain controllers, and crown jewel application and data servers.
Block outbound traffic to malicious domain names, IP addresses, and domain names/addresses associated with remote access tools (RATs).
Monitoring / Detections
Reconnaissance
Upon initial compromise, UNC3944 is known to search for documentation on topics such as: user provisioning, MFA and/or device registration, network diagrams, and shared credentials in documents or spreadsheets.
UNC3944 will also use network reconnaissance tools like ADRecon, ADExplorer, and SharpHound. Therefore, organizations should:
Ensure any sites or portals that include these documents have access restrictions to only required accounts.
Sweep for documents and spreadsheets that may contain shared credentials and remove them.
Implement alerting rules on endpoints with EDR agents for possible execution of known reconnaissance tools.
If utilizing an Identity monitoring solution, ensure detection rules are enabled and alerts are created for any reconnaissance and discovery detections.
Implement an automated mechanism to continuously monitor domain registrations. Identify domains that mimic the organization’s naming conventions, for instance: [YourOrganizationName]-helpdesk.com or [YourOrganizationName]-SSO.com. A lightweight check is sketched below.
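The following is a minimal sketch of such a check, assuming a small set of candidate patterns and simple DNS resolution; the organization name and patterns are placeholders, and production monitoring would more typically rely on certificate transparency or domain registration feeds:
import socket

ORG = "yourorganizationname"  # placeholder organization name
PATTERNS = ["{org}-helpdesk.com", "{org}-sso.com", "{org}-vpn.com"]

def find_lookalike_domains(org: str) -> list[str]:
    """Return candidate lookalike domains that currently resolve in DNS."""
    registered = []
    for pattern in PATTERNS:
        domain = pattern.format(org=org)
        try:
            socket.gethostbyname(domain)
            registered.append(domain)
        except socket.gaierror:
            pass  # does not resolve; likely not registered or not yet live
    return registered

if __name__ == "__main__":
    for domain in find_lookalike_domains(ORG):
        print(f"Possible lookalike domain registered: {domain}")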
MFA Registration
To further harden the MFA registration process, organizations should:
Review logs to specifically identify events related to the registration or addition of new MFA devices or methods to include actions similar to:
MFA device registered
Authenticator app added
Phone number added for MFA
The same MFA device / method / phone number being associated with multiple users
Verify the legitimacy of new registrations against expected user behavior and any onboarding or device enrollment records.
Contact users if new registrations are detected to confirm if the activity is intentional.
Collaboration and Communication Platforms
To prevent against social engineering and/or unauthorized access or modifications to communication platforms, organizations should:
Review organizational policies around communication tools such as Microsoft Teams.
Allow only trusted external domains for expected vendors and partners.
If external domains cannot be blocked, create a baseline of trusted domains and alert on new domains that attempt to contact employees.
Provide awareness training to employees and staff to directly contact the organization’s helpdesk if they receive suspicious calls or messages.
The following is a Microsoft Defender advanced hunting query example. The query is written to detect when an external account (attempting to impersonate the help desk) attempts to contact the organization’s users.
Note: The DisplayName field can be modified to include other relevant fields specific to the organization (such as “IT Support” or “ServiceDesk”).
CloudAppEvents
| where Application == "Microsoft Teams"
| where ActionType == "ChatCreated"
| extend HasForeignTenantUsers =
parse_json(RawEventData)["ParticipantInfo"]["HasForeignTenantUsers"]
| extend DisplayName = parse_json(RawEventData)["Members"][0]["DisplayName"]
| where IsExternalUser == 1 or HasForeignTenantUsers == 'true'
| where DisplayName contains "help" or AccountDisplayName contains "help"
or AccountId contains "help"
The following is a Google SecOps search query example.
Note: The DisplayName field can be modified to include other relevant fields specific to the organization (such as “IT Support” or “ServiceDesk”).
Organizations should also monitor and alert on:
Authentication from infrequent locations – including from proxy and VPN service providers.
Attempts made to change authentication methods or criteria.
Monitoring and hunting for authentication anomalies based upon social engineering tactics.
Bypassing Multi-Factor Authentication
UNC3944 has been known to modify requirements for the use of Multi-factor Authentication. Therefore, organizations should:
For Entra ID, monitor for modifications to any Trusted Named Locations that may be used to bypass the requirement for MFA.
For Entra ID, monitor for changes to Conditional Access Policies that enforce MFA, specifically focusing on exclusions of compromised user accounts and/or devices for an associated policy.
Ensure the SOC has visibility into token replay or suspicious device logins, aligning workflows that can trigger step-up (re)authentication when suspicious activity is detected.
Abuse of Domain Federation
For organizations that are using Microsoft Entra ID, monitor for possible abuse of Entra ID Identity Federation:
Check domain names that are registered in the Entra ID tenant, paying particular attention to domains that are marked as Federated.
Review the Federation configuration of these domains to ensure that they are correct.
Monitor for creation of any new domains within the tenant and for changing the authentication method to be Federated.
Abuse of Domain Federation requires the account accomplishing the changes to have administrative permissions in Entra ID. Hardening of all administrative accounts, portals, and programmatic access is imperative.
Social Engineering Awareness
UNC3944 is extremely proficient at using multiple forms of social engineering to convince users into doing something that will allow them to gain access. Organizations should educate users to be aware of and notify internal security teams of attempts that utilize the following tactics:
SMS phishing messages that claim to be from IT requesting users to download and install software on their machine. These may include claims that the user’s machine is out-of-compliance or is failing to report to internal management systems.
SMS messages or emails with links to sites that reference domain names that appear legitimate and reference SSO (single sign-on) and a variation of the company name. Messages may include text informing the user that they need to reset their password and/or MFA.
Phone calls to users from IT with requests to reset a password and/or MFA – or requesting that the user provide a validated one time passcode (OTP) from their device.
SMS messages or emails with requests to be granted access to a particular system, particularly if the organization already has an established method for provisioning access.
MFA fatigue attacks, where attackers may repeatedly send MFA push notifications to a victim’s device until the user unintentionally or out of frustration accepts one. Organizations should train users to reject unexpected MFA prompts and report such activity immediately.
Impersonation via collaboration tools – UNC3944 has used platforms like Microsoft Teams to pose as internal IT support or service desk personnel. Organizations should train users to verify unusual chat messages and avoid sharing credentials or MFA codes over internal collaboration tools like Microsoft Teams. Limiting external domains and monitoring for impersonation attempts (e.g., usernames containing ‘helpdesk’ or ‘support’) is advised.
In rare cases, attackers have used doxxing threats or aggressive language to scare users into compliance. Ensure employees understand this tactic and know that the organization will support them if they report these incidents.
Across industries, enterprises need efficient and proactive solutions. Imagine frontline professionals using voice commands and visual input to diagnose issues, access vital information, and initiate processes in real-time. The Gemini 2.0 Flash Live API empowers developers to create next-generation, agentic industry applications.
This API extends these capabilities to complex industrial operations. Unlike solutions relying on single data types, it leverages multimodal data – audio, visual, and text – in a continuous livestream. This enables intelligent assistants that truly understand and respond to the diverse needs of industry professionals across sectors like manufacturing, healthcare, energy, and logistics.
In this post, we’ll walk you through a use case focused on industrial condition monitoring, specifically motor maintenance, powered by the Gemini 2.0 Flash Live API. The Live API enables low-latency, bidirectional voice and video interactions with Gemini. With this API, we can give end users natural, human-like voice conversations, including the ability to interrupt the model’s responses using voice commands. The model can process text, audio, and video input, and it can provide text and audio output. Our use case highlights the API’s advantages over conventional AI and its potential for strategic collaborations.
Demonstrating multimodal intelligence: A condition monitoring use case
The demonstration features a live, bi-directional multimodal streaming backend driven by Gemini 2.0 Flash Live API, capable of real-time audio and visual processing, enabling advanced reasoning and life-like conversations. Utilizing the API’s agentic and function calling capabilities alongside Google Cloud services allows for building powerful live multimodal systems with a clean, mobile-optimized user interface for factory floor operators. The demonstration uses a motor with a visible defect as a real-world anchor.
Real-time visual identification: Pointing the camera at a motor, Gemini identifies the model and instantly summarizes relevant information from its manual, providing quick access to crucial equipment details.
Real-time visual defect identification: With a voice command like “Inspect this motor for visual defects,” Gemini analyzes the live video, identifies and localizes the defect, and explains its reasoning.
Streamlined repair initiation: Upon identifying defects, the system automatically prepares and sends an email with the highlighted defect image and part information, directly initiating the repair process.
Real-time audio defect identification: Analyzing pre-recorded audio of healthy and defective motors, Gemini accurately distinguishes the faulty one based on its sound profile and explains its analysis.
Multimodal QA on operations: Operators can ask complex questions about the motor while pointing the camera at specific components. Gemini intelligently combines visual context with information from the motor manual to provide accurate voice-based answers.
Under the hood: The technical architecture
The demonstration leverages the Gemini Multimodal Livestreaming API on Google Cloud Vertex AI. The API manages the core workflow and agentic function calling, while the regular Gemini API handles visual and audio feature extraction.
The workflow involves:
Agentic function calling: The API interprets user voice and visual input to determine the desired action (a minimal configuration sketch follows this list).
Audio defect detection: Upon user intent, the system records motor sounds, stores them in GCS, and triggers a function that uses a prompt with examples of healthy and defective sounds, analyzed by the Gemini Flash 2.0 API to diagnose the motor’s health.
Visual inspection: The API recognizes the intent to detect visual defects, captures images, and calls a function that uses zero-shot detection with a text prompt, leveraging the spatial understanding of the Gemini Flash 2.0 API to identify and highlight defects.
Multimodal QA: When users ask questions, the API identifies the intent for information retrieval, performs RAG on the motor manual, combines it with multimodal context, and uses the Gemini API to provide accurate answers.
Sending repair orders: Recognizing the intent to initiate a repair, the API extracts the part number and defect image, using a pre-defined template to automatically send a repair order via email.
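As a rough illustration of the agentic function calling setup referenced above, a sketch using the Google Gen AI SDK might look like the following; the tool names, parameter schemas, project settings, and model ID are assumptions for illustration, and the real demo wires these declarations to the full workflow described in this list:
from google import genai
from google.genai import types

client = genai.Client(vertexai=True, project="your-project-id", location="us-central1")

# Hypothetical tool declarations mirroring the workflow steps above.
inspect_motor = types.FunctionDeclaration(
    name="inspect_motor_visual",
    description="Capture frames from the live video and detect visual defects on the motor.",
    parameters=types.Schema(
        type=types.Type.OBJECT,
        properties={"motor_id": types.Schema(type=types.Type.STRING)},
    ),
)
diagnose_audio = types.FunctionDeclaration(
    name="diagnose_motor_audio",
    description="Record motor audio and classify it as healthy or defective.",
    parameters=types.Schema(
        type=types.Type.OBJECT,
        properties={"motor_id": types.Schema(type=types.Type.STRING)},
    ),
)

config = types.LiveConnectConfig(
    response_modalities=["AUDIO"],
    tools=[types.Tool(function_declarations=[inspect_motor, diagnose_audio])],
)

# The live session (async) then streams audio/video in and emits tool calls, e.g.:
# async with client.aio.live.connect(model="gemini-2.0-flash-live-preview-04-09",
#                                    config=config) as session: ...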
Such a demo can be easily built with minimal custom integration, by referring to the guide here, and incorporating the features mentioned in the diagram above. The majority of the effort would be in adding custom function calls for various use cases.
Key capabilities and industrial benefits with cross-industry use cases
Real-time multimodal processing: The API’s ability to simultaneously process live audio and visual streams provides immediate insights in dynamic environments, crucial for preventing downtime and ensuring operational continuity.
Use case: In healthcare, a remote medical assistant could use live video and audio to guide a field paramedic, receiving real-time vital signs and visual information to provide expert support during emergencies.
Advanced audio & visual reasoning: Gemini’s sophisticated reasoning interprets complex visual scenes and subtle auditory cues for accurate diagnostics.
Use Case: In manufacturing, AI can analyze the sounds and visuals of machinery to predict failures before they occur, minimizing production disruptions.
Agentic function calling for automated workflows: The API’s agentic nature enables intelligent assistants to proactively trigger actions, like generating reports or initiating processes, streamlining workflows.
Use case: In logistics, a voice command and visual confirmation of a damaged package could automatically trigger a claim process and notify relevant parties.
Seamless integration and scalability: Built on Vertex AI, the API integrates with other Google Cloud services, ensuring scalability and reliability for large-scale deployments.
Use case: In agriculture, drones equipped with cameras and microphones could stream live data to the API for real-time analysis of crop health and pest detection across vast farmlands.
Mobile-optimized user experience: The mobile-first design ensures accessibility for frontline workers, allowing interaction with the AI assistant at the point of need using familiar devices.
Use case: In retail, store associates could use voice and image recognition to quickly check inventory, locate products, or access product information for customers directly on the store floor.
Proactive maintenance and efficiency gains: By enabling real-time condition monitoring, industries can shift from reactive to predictive maintenance, reducing downtime, optimizing asset utilization, and improving overall efficiency across sectors.
Use case: In the energy sector, field technicians can use the API to diagnose issues with remote equipment like wind turbines through live audio and visual streams, reducing the need for costly and time-consuming site visits.
Get started
Explore the cutting edge of AI interaction with the Gemini Live API, as showcased by this solution. Developers can leverage its codebase – featuring low-latency voice, webcam/screen integration, interruptible streaming audio, and a modular tool system via Cloud Functions – as a robust starting point. Clone the project, adapt the components, and begin creating transformative, multimodal AI solutions that feel truly conversational and aware. The future of the intelligent industry is live, multimodal, and within reach for all sectors.
For AI developers building cutting-edge applications with large model sizes, a reliable foundation is non-negotiable. You need your AI to perform consistently, delivering results without hiccups, even under pressure. This means having dedicated resources that won’t get bogged down by other users’ activity. While existing Vertex AI Prediction Endpoints – managed pools of resources to deploy AI models for online inference – provide a capable serving solution, developers need better ways to reach consistent performance and resource isolation in case of shared resource contention.
Today, we are pleased to announce Vertex AI Prediction Dedicated Endpoints, a new family of Vertex AI Prediction endpoints, designed to address the needs of modern AI applications, including those related with large-scale generative AI models.
Dedicated endpoint architected for generative AI and large models
Serving generative AI and other large-scale models introduces unique challenges related to payload size, inference time, interactivity, and performance demands. The new Vertex AI Prediction Dedicated Endpoints have been specifically engineered to help you build more reliably with the following new integrated features:
Native support for streaming inference: Essential for interactive applications like chatbots or real-time content generation, Vertex AI Endpoints now provide native support for streaming, simplifying development and architecture, via the following APIs:
streamRawPredict: Utilize this dedicated API method for bidirectional streaming to send prompts and receive sequences of responses (e.g., tokens) as they become available.
OpenAI Chat Completion: To facilitate interoperability and ease migration, endpoints serving compatible models can optionally expose an interface conforming to the widely used OpenAI Chat Completion streaming API standard (see the sketch after this list).
gRPC protocol support: For latency-sensitive applications or high-throughput scenarios often encountered with large models, endpoints now natively support gRPC. Leveraging HTTP/2 and Protocol Buffers, gRPC can offer performance advantages over standard REST/HTTP.
Customizable request timeouts: Large models can have significantly longer inference times. We now provide the flexibility, via API, to configure custom timeouts for prediction requests, accommodating a wider range of model processing durations beyond the default settings.
Optimized resource handling: The underlying infrastructure is designed to better handle the resource demands (CPU/GPU, memory, network bandwidth) of large models, contributing to the overall stability and performance, especially when paired with Private Endpoints.
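As an example of the OpenAI-compatible streaming interface mentioned above, a client-side sketch might look like this; the base URL shape and the handling of the model field are assumptions, so use the dedicated DNS name and model reference shown for your endpoint in the Vertex AI console:
import google.auth
import google.auth.transport.requests
from openai import OpenAI

# Authenticate with a Google Cloud access token.
creds, _ = google.auth.default()
creds.refresh(google.auth.transport.requests.Request())

client = OpenAI(
    base_url="https://YOUR_DEDICATED_ENDPOINT_DNS/v1",  # placeholder; check your endpoint details
    api_key=creds.token,
)

# Stream tokens as they are generated by the self-deployed model.
stream = client.chat.completions.create(
    model="your-deployed-model",  # placeholder
    messages=[{"role": "user", "content": "Summarize streaming inference in one sentence."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)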
The newly integrated capabilities of Vertex AI Prediction Dedicated Endpoints offer a unified and robust serving solution tailored for demanding modern AI workloads. From today, Vertex AI Model Garden will use Vertex AI Prediction Dedicated Endpoints as the standard serving method for self-deployed models.
Optimized networking via Private Service Connect (PSC)
While Dedicated Endpoints Public remain available for models accessible over the public internet, we are enhancing networking options on Dedicated Endpoints utilizing Google Cloud Private Service Connect (PSC). The new Dedicated Endpoints Private (via PSC) provide a secure and performance-optimized path for prediction requests. By leveraging PSC, traffic routes entirely within Google Cloud’s network, offering significant benefits:
Enhanced security: Requests originate from within your Virtual Private Cloud (VPC) network, eliminating public internet exposure for the endpoint.
Improved performance consistency: Bypassing the public internet reduces latency variability.
Reduced performance interference: PSC facilitates better network traffic isolation, mitigating potential “noisy neighbor” effects and leading to more predictable performance, especially for demanding workloads.
For production workloads with strict security requirements and predictable latency, Private Endpoints using Private Service Connect are the recommended configuration.
How Sojern is using the new Vertex AI Prediction Dedicated Endpoints to serve models at scale
Sojern is a marketing company focusing on the hospitality industry, matching potential customers to travel businesses around the globe. As part of their growth plans, Sojern turned to Vertex AI. Leaving their self-managed ML stack behind, Sojern can focus more on innovation, while scaling out far beyond their historical footprint.
Given the nature of Sojern’s business, their ML deployments follow a unique deployment model, requiring several high throughput endpoints to be available and agile at all times, allowing for constant model evolution. Using Public Endpoints would cause rate limiting and ultimately degrade user experience; moving to a Shared VPC model would have required a major design change for existing consumers of the models.
With Private Service Connect (PSC) and Dedicated Endpoint, Sojern avoided hitting the quotas / limits enforced on Public Endpoints, while also avoiding a network redesign to accommodate Shared VPC.
The ability to quickly promote tested models, take advantage of Dedicated Endpoint’s enhanced feature set, and improve latency for their customers strongly aligned with Sojern’s goals. The Sojern team continues to onboard new models, always improving accuracy and customer satisfaction, powered by Private Service Connect and Dedicated Endpoint.
Get started
Are you struggling to scale your prediction workloads on Vertex AI? Check out the resources below to start using the new Vertex AI Prediction Dedicated Endpoints:
Your experience and feedback are important as we continue to evolve Vertex AI. We encourage you to explore these new endpoint capabilities and share your insights through the Google Cloud community forum.
When’s the last time you watched a race for the braking?
It’s the heart-pounding acceleration and death-defying maneuvers that keep most motorsport fans on the edge of their seats. Especially when it comes to Formula E — and really all EVs — the explosive, near-instantaneous acceleration of an electric motor is part of the appeal.
A less considered, yet no less important feature, is how EVs can regeneratively brake, turning friction into fuel. Part of Formula E’s mission is to make EVs a compelling automotive choice for consumers, not just world-class racers; highlighting this powerful aspect of the vehicles has become a priority. The question remained: How do you get others to feel the same exhilaration from deceleration?
The answer came from the mountains above Monaco, as well as some prompts in Gemini 2.5.
In the lead up to the Monaco E-Prix, Formula E and Google undertook a project dubbed Mountain Recharge. The challenge: Whether a Formula E GENBETA race car, starting with only 1% battery, could regenerate enough energy from braking during a descent through France’s coastal Alps to then complete a full lap of the iconic Monaco circuit.
More than just a stunt, this experiment is testing the boundaries of technology — and not just in EVs, but on the cloud, too. Without the live analytics and plenty of AI-powered planning, the Mountain Recharge might not have come to pass. In fact, AI even helped determine which mountain pass would be best suited for this effort. (Read on to find out which one, and see if we made it to the bottom.)
Mountain Recharge is exciting not only for the thrills on the course but also for the potential it shows for AI across industries. In addition to its role in helping to execute tasks, AI proved valuable to the brainstorming, experimentation, and rapid-fire simulations that helped get Mountain Recharge to the finish line.
Planning the charge up the mountain
Before even setting foot or wheel to the course, the team at Formula E and Google Cloud turned to Gemini to try and figure out if such an endeavor was possible.
To answer the fundamental question of feasibility, the team entered a straightforward prompt into Google’s AI Studio: “Starting with just 1% battery, could the GENBETA car potentially generate enough recharge by descending a high mountain pass to do a lap of the Circuit of Monaco?”
The AI Studio validator, running Gemini 2.5 Pro with its deep reasoning functionality, analyzed first-party data that had been uploaded by Formula E on the GENBETA’s capabilities; we then grounded the model with Google Search to further improve accuracy and reliability by connecting to the universe of information available online.
AI Studio shared its “thinking” in a detailed eight-step process, which included identifying the key information needed; consulting the provided documents; gathering external information through a simulated search; performing calculations and analysis; and finally synthesizing the answer based on the core question.
The final output: “theoretically feasible.” In other words, the perfect challenge.
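For readers curious what that kind of grounded prompt looks like in code, here is an illustrative sketch using the google-genai SDK with Google Search grounding. The model ID and client setup are assumptions made for the example; Formula E’s actual workflow also drew on uploaded first-party documents, which this sketch omits.

```python
# Illustrative sketch: ask the feasibility question through the Gemini API with
# Google Search grounding. The model ID and prompt wording are assumptions.
from google import genai
from google.genai import types

client = genai.Client()  # reads the API key from the environment

response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents=(
        "Starting with just 1% battery, could the GENBETA car potentially "
        "generate enough recharge by descending a high mountain pass to do "
        "a lap of the Circuit of Monaco?"
    ),
    config=types.GenerateContentConfig(
        # Ground the answer with Google Search results.
        tools=[types.Tool(google_search=types.GoogleSearch())],
    ),
)
print(response.text)
```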
Navigating the steep turns above Monaco helped generate plenty of power for Mountain Recharge.
Still working in AI Studio, we then used a new feature, the ability to build custom apps such as the Maps Explorer, to determine the best route, which turned out to be the Col de Braus. AI Studio then mapped out a route for the challenge. This rigorous, data-backed validation, facilitated by AI Studio and Gemini’s ability to incorporate technical specifications and estimations, transformed the project from a speculative what-if into something Formula E felt confident attempting.
AI played an important role away from the course, as well. To aid in coordination and planning, teams at Formula E and Google Cloud used NotebookLM to digest the technical regulations and battery specifications and locate relevant information within them, which, given the complexity of the challenge and the number of parties involved, helped ensure cross-functional teams were kept up to date and grounded with sourced data to help make informed decisions.
Smart cars, smart drivers, and a smartphone
During the mountain descent, real-time monitoring of the car’s progress and energy regeneration would be crucial. Firebase and BigQuery were instrumental in visualizing this real-time telemetry. Data from multiple sensors and Google Maps was streamed to BigQuery, Google Cloud’s data warehouse, from a high-performance mobile phone connected to the car (a Pixel 9 was well suited to the task).
This data stream proved to be yet another challenge to overcome, because of the patchy mobile signal in the mountainous terrain of the Maritime Alps. When data couldn’t be sent, it was cached locally on the phone until the signal was available again.
BigQuery’s capacity for real-time data ingestion and in-platform AI model creation enabled speedy analysis and the calculation of essential metrics. A web-based dashboard was developed using Firebase that connected to BigQuery to display both data and insights. AI Studio greatly facilitated the development of the application by translating a picture of a dashboard mockup into fully functional code.
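As a rough illustration of that telemetry path, the sketch below streams sensor rows into a BigQuery table and buffers them locally whenever the upload fails (for example, when the mobile signal drops). The table name, row schema, and retry logic are simplified assumptions, not the production pipeline.

```python
# Hedged sketch: stream telemetry rows to BigQuery, caching them locally
# whenever connectivity is unavailable. Table and schema are illustrative.
from google.cloud import bigquery

client = bigquery.Client()
TABLE_ID = "my-project.mountain_recharge.telemetry"
offline_buffer = []  # rows cached while the signal is down

def send_telemetry(rows):
    """Try to stream rows to BigQuery; keep them buffered on failure."""
    pending = offline_buffer + rows
    try:
        errors = client.insert_rows_json(TABLE_ID, pending)
        if errors:
            raise RuntimeError(f"Insert errors: {errors}")
        offline_buffer.clear()
    except Exception:
        # Signal dropped or insert failed: retain the rows for the next attempt.
        offline_buffer.clear()
        offline_buffer.extend(pending)

send_telemetry([
    {"ts": "2025-05-02T09:15:00Z", "speed_kmh": 74.2, "battery_pct": 3.8},
])
```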
“From figuring out if our crazy Mountain Recharge idea was even possible, to giving us live insights during the descent, AI was our guide,” said Alex Aidan, Formula E’s VP of Marketing. “It’s what turned an ambitious ‘what if’ into a reality we could track moment by moment.”
After completing its descent, the car had stored enough energy and is expected to complete its lap of the Monaco circuit on Saturday, as part of the E-Prix’s pre-race festivities.
A different kind of push start.
Benefits beyond the finish line
Both the success and the development of the Mountain Recharge campaign offer valuable lessons to others pursuing ambitious projects. It shows that AI doesn’t have to be central to a project — it can be just as powerful at facilitating and optimizing something we’ve been doing for years, like racing cars. Our results in the Mountain Recharge only underscore the potential benefits of AI for a wide range of industries:
Enhanced planning and exploration: Just as Gemini helped Formula E explore unconventional ideas and identify the optimal route, businesses can leverage large language models for innovative problem-solving, market analysis, and strategic planning, uncovering unexpected angles and accelerating the journey from “what if” to “we can do that”.
Streamlined project management: NotebookLM’s ability to centralize and organize vast amounts of information demonstrates how AI can significantly improve efficiency in complex projects, from logistics and resource allocation to research and compliance. This reduces the risk of errors and ensures smoother coordination across teams.
Data-driven decision making: The real-time data analysis capabilities showcased in the Mountain Recharge underscore the power of cloud-based data platforms like BigQuery. Organizations can leverage these tools to gain immediate insights from their data, enabling them to make agile adjustments and optimize performance on the fly. This is invaluable in dynamic environments where rapid responses are critical.
Deeper understanding of complex systems: By applying AI to analyze intricate data streams, teams can gain a more profound understanding of the factors influencing performance.
Such capabilities certainly impressed James Rossiter, a former Formula E Team Principal, current test driver, and broadcaster for the series. “I was really surprised at the detail of the advice and things to consider,” Rossiter said. “We always talk about these things as a team, but as this is so different to racing, I had to totally rethink the drive.”
The Formula E Mountain Recharge campaign is more than just an exciting piece of content; it’s a testament to the power of human ingenuity amplified by intelligent technology. It’s also the latest collaboration between Formula E and Google Cloud and our shared commitment to use AI to push the boundaries of what’s possible in the sport and in the world.
We’ve already developed an AI-powered digital driving coach to help level the field for EV racing. Now, with the Mountain Recharge, we can inspire everyday drivers well beyond the track with the capabilities of electric vehicles.
It’s thinking big, even if it all starts with a simple prompt on a screen. You just have to ask the right questions, starting with the most important ones: Is this possible, and how can we make it so?
At Google Cloud, we empower businesses to accelerate their generative AI innovation cycle by providing a path from prototype to production. Palo Alto Networks, a global cybersecurity leader, partnered with Google Cloud to develop an innovative security posture control solution that can answer complex “how-to” questions on demand, provide deep insights into risk with just a few clicks, and guide users through remediation steps.
Using advanced AI services, including Google’s Gemini models and managed Retrieval Augmented Generation (RAG) services such as Google Cloud’s Vertex AI Search, Palo Alto Networks had an ideal foundation for building and deploying gen AI-powered solutions.
The end result was Prisma Cloud Co-pilot, the Palo Alto Networks Prisma Cloud gen AI offering. It helps simplify cloud security management by providing an intuitive, AI-powered interface to help understand and mitigate risks.
Technical challenges and surprises
The Palo Alto Networks Prisma Cloud Co-pilot journey began in 2023, and the product launched in October 2024. During this time, Palo Alto Networks witnessed Google’s AI models evolve rapidly, from Text Bison (PaLM) to Gemini Flash 1.5. That rapid pace of innovation meant that each iteration brought new capabilities, necessitating a development process that could quickly adapt to the evolving landscape.
To effectively navigate the dynamic landscape of evolving gen AI models, Palo Alto Networks established robust processes that proved invaluable to their success:
Prompt engineering and management: Palo Alto Networks used Vertex AI to help manage prompt templates and built a diverse prompt library to generate a wide range of responses. To rigorously test each new model’s capabilities, limitations, and performance across various tasks, the Palo Alto Networks and Google Cloud teams systematically created and updated prompts for each submodule. Additionally, Vertex AI’s Prompt Optimizer helped streamline the tedious trial-and-error process of prompt engineering.
Intent recognition: Palo Alto Networks used the Gemini Flash 1.5 model to develop an intent recognition module, which efficiently routed user queries to the relevant co-pilot component. This approach provided users with many capabilities through a unified and lightweight user experience.
Input guardrails: Palo Alto Networks created guardrails as a first line of defense against unexpected, malicious, or simply incorrect queries that could compromise the functionality and experience of the chatbot. These guardrails maintain the chatbot’s intended functionality by preventing known prompt injection attacks, such as attempts to circumvent system instructions, and by restricting chatbot usage to its intended scope. Guardrails were created to detect whether user queries fall within the predefined domain of general cloud security, risks, and vulnerabilities; any topics outside this scope did not receive a response from the chatbot. Additionally, since the chatbot was designed to generate proprietary code for querying Palo Alto Networks internal systems, requests for general-purpose code generation similarly did not receive a response.
Evaluation dataset curation: A robust and representative evaluation dataset serves as the foundation for accurately and quickly assessing the performance of gen AI models. The Palo Alto Networks team took great care to choose high-quality evaluation data and keep it relevant by constantly refreshing it with representative questions and expert-validated answers. The evaluation dataset was sourced from and validated directly by Palo Alto Networks subject matter experts to ensure its accuracy and reliability.
Automated evaluation: In collaboration with Google Cloud, Palo Alto Networks developed an automated evaluation pipeline using Vertex AI’s gen AI evaluation service. This pipeline allowed Palo Alto Networks to rigorously scale their assessment of different gen AI models, and benchmark those models using custom evaluation metrics while focusing on key performance indicators such as accuracy, latency, and consistency of responses. (A minimal code sketch of this kind of evaluation follows this list.)
Human evaluator training and red teaming: Palo Alto Networks invested in training their human evaluation team to identify and analyze specific loss patterns and provide detailed answers on a broad set of custom rubrics. This allowed them to pinpoint where a model’s response was inadequate and provide insightful feedback on model performance, which then guided model selection and refinement.
The team also conducted red teaming exercises focused on key areas, including:
Manipulating the co-pilot: Can the co-pilot be tricked into giving bad advice by feeding it false information?
Extracting sensitive data: Can the co-pilot be manipulated into revealing confidential information or system details?
Bypassing security controls: Can the co-pilot be used to craft attacks that circumvent existing security measures?
Load testing: To ensure the gen AI solutions met real-time demands, Palo Alto Networks actively load tested them, working within the predefined QPM (queries per minute) and latency parameters of Gemini models. They simulated user traffic scenarios to find the optimal balance between responsiveness and scalability using provisioned throughput, which helped ensure a smooth user experience even during peak usage.
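As referenced above, here is a minimal sketch of automated evaluation with Vertex AI’s gen AI evaluation service. The dataset, metric choices, and experiment name are illustrative assumptions, not Palo Alto Networks’ actual pipeline; with a response column supplied in the dataset, the service can score pre-generated outputs without calling a model.

```python
# Hedged sketch: score prompt/response/reference triples with built-in metrics
# from the Vertex AI gen AI evaluation service. All values are illustrative.
import pandas as pd
import vertexai
from vertexai.evaluation import EvalTask

vertexai.init(project="my-project", location="us-central1")

eval_dataset = pd.DataFrame({
    "prompt": ["Which policies flag publicly exposed storage buckets?"],
    "response": ["Prisma Cloud flags storage buckets exposed to the internet as high-risk."],
    "reference": ["Buckets with public read access are reported as high-risk findings."],
})

eval_task = EvalTask(
    dataset=eval_dataset,
    metrics=["exact_match", "rouge_l_sum"],  # computation-based metrics
    experiment="copilot-eval",  # logs results to Vertex AI Experiments
)
result = eval_task.evaluate()
print(result.summary_metrics)
```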
Operational and business challenges
Operationalizing gen AI can introduce complex challenges across multiple functions, especially for compliance, legal, and information security. Evaluating ROI for gen AI solutions also requires new metrics. To address these challenges, Palo Alto Networks implemented the following techniques and processes:
Data residency and regional ML processing: Since many Palo Alto Networks customers need a regional approach for ML processing capabilities, we prioritized regional machine learning processing to help enable customer compliance with data residency needs and regional regulations, if applicable.
Where Google did not offer an AI data center matching Prisma Cloud data center locations, customers could choose to have their data processed in the U.S. before gaining access to the Prisma Cloud Co-pilot. We implemented strict data governance policies and used Google Cloud’s secure infrastructure to help safeguard sensitive information and uphold user privacy.
Deciding KPIs and measuring success for gen AI apps: The dynamic and nuanced nature of gen AI applications demands a bespoke set of metrics, tailored to capture their specific characteristics and comprehensively evaluate their efficacy. There are no standard metrics that work for all use cases. The Prisma Cloud AI Co-pilot team relied on technical and business metrics to measure how well the system was operating.
Technical metrics, such as recall, helped to measure how thoroughly the system fetches relevant URLs when answering questions from documents, and to help increase the accuracy of prompt responses and provide source information for users.
Customer experience metrics, such as measuring helpfulness, relied on explicit feedback and telemetry data analysis. This provided deeper insights into user experience that resulted in increased productivity and cost savings.
Collaborating with security and legal teams: Palo Alto Networks brought in legal, information security, and other critical stakeholders early in the process to identify risks and create guardrails for issues including, but not limited to: information security requirements, elimination of bias in the dataset, appropriate functionality of the tool, and data usage in compliance with applicable law and contractual obligations.
Given customer concerns, enterprises must prioritize clear communication around data usage, storage, and protection. By collaborating with legal and information security teams early on to create transparency in marketing and product communications, Palo Alto Networks was able to build customer trust and help ensure they have a clear understanding of how and when their data is being used.
Ready to get started with Vertex AI?
The future of generative AI is bright, and with careful planning and execution, enterprises can unlock its full potential. Explore your organization’s AI needs through practical pilots in Vertex AI, and rely on Google Cloud Consulting for expert guidance.
Your customers might not all speak the same language. If you operate internationally or serve a diverse customer base, you need your chatbot to meet them where they are – whether they’re searching for something in Spanish or Japanese. If you want to give your customers multilingual support with chatbots, you’ll need to orchestrate multiple AI models to handle diverse languages and technical complexities intelligently and efficiently. Customers expect quick, accurate answers in their language, from simple requests to complex troubleshooting.
To get there, developers need a modern architecture that can leverage specialized AI models – such as Gemma and Gemini – and a standardized communication layer so your LLM models can speak the same language, too. Model Context Protocol, or MCP, is a standardized way for AI systems to interact with external data sources and tools. It allows AI agents to access information and execute actions outside their own models, making them more capable and versatile. Let’s explore how we can build a powerful multilingual chatbot using Google’s Gemma, Translation LLM and Gemini models, orchestrated via MCP.
The challenge: Diverse needs, one interface
Building a truly effective support chatbot can be challenging for a few different reasons:
Language barriers: Support needs to be available in multiple languages, requiring high-quality, low-latency translation.
Query complexity: Questions range from simple FAQs (handled easily by a basic model) to intricate technical problems demanding advanced reasoning.
Efficiency: The chatbot needs to respond quickly without getting bogged down, especially when dealing with complex tasks or translations.
Maintainability: As AI models evolve and business needs change, the system must be easy to update without requiring a complete overhaul.
Trying to build a single, monolithic AI model to handle everything is often inefficient and complex. A better approach? Specialization and smart delegation.
MCP architecture for harnessing different LLMs
The key to making these specialized models work together effectively is MCP. MCP defines how an orchestrator (like our Gemma-powered client) can discover available tools, request specific actions (like translation or complex analysis) from other specialized services, pass necessary information (the “context”), and receive results back. It’s the essential plumbing that allows our “team” of AI models to collaborate. Here’s a framework for how it works with the LLMs:
Gemma: The chatbot uses a versatile LLM like Gemma to manage conversations, understand user requests, handle basic FAQs, and determine when to utilize specialized tools for complex tasks via MCP.
Translation LLM server: A dedicated, lightweight MCP server exposing Google Cloud’s Translation capabilities as a tool. Its sole focus is high-quality, fast translation between languages, callable via MCP.
Gemini: A specialized MCP server uses Gemini Pro or similar LLM for complex technical reasoning and problem-solving when invoked by the orchestrator.
Model Context Protocol: This protocol allows Gemma to discover and invoke the Translation and Gemini “tools” running on their respective servers.
How it works
Let’s walk through an example non-English language scenario:
A technical question arrives: A customer types a technical question into the chat window, but it’s in French.
Gemma receives the text: The Gemma-powered client receives the French text. It recognizes the language isn’t English and determines translation is needed.
Gemma calls on Translation LLM: Gemma uses the MCP connection to send the French text to the Translation LLM Server, requesting an English translation.
Text is translated: The Translation LLM Server performs the translation via its MCP-exposed tool and sends the English version back to the client.
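To make the orchestration concrete, here is a hedged sketch of the Translation tool server using the MCP Python SDK’s FastMCP helper together with the Cloud Translation API. The server name, tool signature, and transport are assumptions chosen for illustration, not the sample repository’s exact code.

```python
# Hedged sketch of a Translation tool server: expose Cloud Translation as an
# MCP tool that an orchestrating client (e.g., Gemma) can discover and call.
from mcp.server.fastmcp import FastMCP
from google.cloud import translate_v2 as translate

mcp = FastMCP("translation-llm-server")
translate_client = translate.Client()

@mcp.tool()
def translate_text(text: str, target_language: str = "en") -> str:
    """Translate user text so the orchestrator can route it in English."""
    result = translate_client.translate(text, target_language=target_language)
    return result["translatedText"]

if __name__ == "__main__":
    # Serve the tool over stdio; the MCP client discovers it at startup.
    mcp.run(transport="stdio")
```

A Gemma-powered MCP client would discover `translate_text` when it connects and call it whenever it detects non-English input, as in the French example above.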
This architecture offers broad applicability. For example, imagine a financial institution’s support chatbot where all user input, regardless of the original language, must be preserved in English in real time for fraud detection. Here, Gemma operates as the client, while Translation LLM, Gemini Flash, and Gemini Pro function on the server. In this configuration, the client-side Gemma manages multi-turn conversations for routine inquiries and intelligently directs complex requests to specialized tools. As depicted in the architectural diagram, Gemma manages all user interactions within a multi-turn chat. A tool leveraging Translation LLM can translate user queries and concurrently save them for immediate fraud analysis. Simultaneously, Gemini Flash and Pro models can generate responses based on the user’s requests. For intricate financial inquiries, Gemini Pro can be employed, while Gemini Flash can address less complex questions.
Let’s look at this sample GitHub repo that illustrates how this architecture works.
Why this is a winning combination
This is a powerful combination because it’s designed for both efficiency and adaptability.
The main idea is splitting up the work. The Gemma-based client that users interact with stays light, handling the conversation and sending requests where they need to go. Tougher jobs, like translation or complex reasoning, are sent to separate LLMs built specifically for those tasks. This way, each piece does what it’s best at, making the whole system perform better.
A big plus is how this makes things easier to manage and more flexible. Because the parts connect with a standard interface (the MCP), you can update or swap out one of the specialized LLMs – maybe to use a newer model for translation – without having to change the Gemma client. This makes updates simpler, reduces potential headaches, and lets you try new things more easily. You can use this kind of setup for things like creating highly personalized content, tackling complex data analysis, or automating workflows more intelligently.
Get started
Ready to build your own specialized, orchestrated AI solutions?
Explore the code: Clone the GitHub repository for this project and experiment with the client and server setup.
We’re thrilled to share that Google Cloud Spanner has been recognized by Gartner in the Critical Capabilities for Cloud Database Management Systems for Operational Use Cases report, where it was ranked #1 in the Lightweight Transactions Use Case and #3 in both the OLTP Transactions Use Case and the Application State Management Use Case. This recognition showcases Spanner’s strength and versatility to handle the most demanding workloads.
Beyond traditional transactions: Expanding capabilities
We believe the Gartner recognition isn’t just about raw performance. We feel it’s about Spanner’s comprehensive feature set, which is designed to address the complex needs of modern enterprises. Beyond its renowned transactional consistency and global scalability, Spanner offers a powerful multi-model experience, seamlessly integrating the graph, full-text, and vector search functionality required by modern applications.
Graph database functionality: Spanner’s ability to model and query relationships makes it a strong fit for applications requiring graph analysis, such as social networks, fraud detection, and recommendation engines.
Full-text search: Integrated full-text search capabilities enable efficient retrieval of unstructured data, powering features like product catalogs, content management systems, and knowledge bases.
Vector search: With the rise of AI and machine learning, Spanner’s vector search capabilities facilitate similarity searches, enabling applications like image recognition, semantic search, and personalized recommendations.
This flexibility allows developers to build diverse applications on a single platform that provides dynamic elasticity combined with operational efficiency without the complexity of managing multiple specialized databases.
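As a hedged illustration of the vector search capability, the sketch below runs a similarity query against Spanner from Python. The table, column names, and embedding values are invented for the example, and the distance function and parameter handling should be checked against current Spanner documentation.

```python
# Hedged sketch: nearest-neighbor style query over an embedding column in
# Spanner using a vector distance function. All identifiers are placeholders.
from google.cloud import spanner

client = spanner.Client(project="my-project")
database = client.instance("my-instance").database("products")

query_embedding = [0.12, -0.08, 0.44]  # normally produced by an embedding model

with database.snapshot() as snapshot:
    rows = snapshot.execute_sql(
        """
        SELECT product_id, name,
               COSINE_DISTANCE(embedding, @query_embedding) AS distance
        FROM Products
        ORDER BY distance
        LIMIT 5
        """,
        params={"query_embedding": query_embedding},
        param_types={
            "query_embedding": spanner.param_types.Array(spanner.param_types.FLOAT64)
        },
    )
    for row in rows:
        print(row)
```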
A truly global service: Transactions and analytics combined
Spanner’s global footprint helps ensure low latency and high availability for transactional workloads, regardless of a user’s location. But its power extends beyond transactions. Spanner’s deep integration with BigQuery allows for federated queries, enabling real-time analytics on transactional data without the need for complex ETL processes. This integration also supports reverse ETL from BigQuery, allowing you to push analytical insights back into Spanner for operational use.
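For example, a federated query from BigQuery against Spanner might look like the following sketch. The connection ID, dataset, and table are placeholders, and this assumes a Spanner connection has already been created in BigQuery.

```python
# Hedged sketch: query live Spanner data from BigQuery with EXTERNAL_QUERY,
# using a pre-created Spanner connection. Identifiers are placeholders.
from google.cloud import bigquery

bq = bigquery.Client()

sql = """
SELECT *
FROM EXTERNAL_QUERY(
  'projects/my-project/locations/us/connections/spanner-orders',
  'SELECT order_id, customer_id, total FROM Orders WHERE total > 100'
)
"""
for row in bq.query(sql).result():
    print(dict(row))
```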
Real-world impact: Customer success stories
The true testament to Spanner’s capabilities is its impact on our customers. Here’s a sampling of how it’s being used in the field:
Spanner’s high availability, external consistency, and infinite horizontal scalability made it the ideal choice for Deutsche Bank’s business critical application for online banking.
By consolidating all user data with the exception of logs to a single database for development, COLOPL has eliminated the scalability constraints that occurred when using horizontally and vertically partitioned databases for large-scale services.
With Spanner’s fully-managed relational database, Kroger has been able to build a true event-driven ledger, which enables the company to capture unique events to make better-informed decisions about how to direct associates to be more productive.
Looking ahead
We believe Spanner’s recognition in the Gartner Critical Capabilities report reinforces Google’s position in the Cloud Database Management Systems market. We’re committed to continuing to innovate and expand Spanner’s capabilities, empowering our customers to build the next generation of mission-critical applications.
Whether you need a database for global transactions, multi-model applications or real-time analytics, Spanner is the solution you can rely on. Sign up for a free Spanner trial account and experience the power of multi-model Spanner today.
Gartner Critical Capabilities for Cloud Database Management Systems for Operational Use Cases, Ramke Ramakrishnan, Henry Cook, Xingyu Gu, Masud Miraz, Aaron Rosenbaum, 18 December, 2024.
GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and is used herein with permission. All rights reserved. This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from Google. Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
It’s a core part of our mission at Google Cloud to help you meet your evolving policy, compliance, and business objectives. To help further strengthen the security of your cloud environment, we continue regular delivery of new security controls and capabilities on our cloud platform.
We announced at Google Cloud Next multiple new capabilities in our IAM, Access Risk, and Cloud Governance portfolio. Our announcements covered a wide range of new product capabilities and security enhancements in Google Cloud, including:
Identity and Access Management (IAM)
Access Risk products including VPC Service Controls, Context-Aware Access and Identity Threat Detection and Response
Cloud Governance with Organization Policy Service
Resource Management
We also announced new AI capabilities to help cloud developers and operators at every step of the application lifecycle. These new capabilities take an application-centered approach and embed AI assistance throughout the application development lifecycle, driven by new features in Gemini Code Assist and Gemini Cloud Assist.
IAM, Access Risk, and Cloud Governance portfolio.
What’s new in Identity and Access Management
Workforce Identity Federation
Workforce Identity Federation extends Google Cloud’s identity capabilities to support syncless, attribute-based single sign-on. Over 95% of Google Cloud products now support Workforce Identity Federation. We also released support for FedRAMP High government requirements to help manage and satisfy compliance mandates.
Enhanced security for non-human identities
With the rise of microservices and the popularity of multicloud deployments, non-human and workload identities are growing rapidly, much faster than human identities. Many large enterprises now have between 10 and 45 times more non-human identities than human (user) identities, often with expansive permissions and privileges.
Securing non-human identities is a key goal for Google Cloud, and we are announcing two new capabilities to enhance authorization and access protection:
Keyless access to Google Cloud APIs using X.509 certificates, to further strengthen workload authentication.
Cloud Infrastructure Entitlement Management (CIEM) for multicloud
Across the security landscape, we are contending with the problem of excessive and often unnecessarily broad permissions. At Google Cloud, we work to proactively address the permission problem with tools that can help you control permission proliferation, while also providing comprehensive defense across all layers.
Cloud Infrastructure Entitlement Management (CIEM), our key tool for addressing permission issues, is now available for Azure (in preview) and generally available for Google Cloud and AWS.
IAM Admin Center
We also announced IAM Admin Center, a single pane of glass experience that is customized to your role, showcasing recommendations, notifications, and active tasks. You can also launch into other services directly from the console.
IAM Admin Center will provide organization administrators and project administrators a unified view to discover, learn, test, and use IAM capabilities. It’ll provide contextual discovery of features, enable focus on day-to-day tasks, and offer curated guides for getting started and resources for continuous learning.
Additionally, other IAM features grew in coverage and in feature depth.
Previously, we announced IAM Deny and Principal access boundary (PAB) policies, powerful mechanisms to set policy-based guardrails on access to resources. As these important controls continue to grow in service coverage and adoption, now there is a need for tooling to simplify planning and visualize impact.
What’s new with Access Risk
Comprehensive security demands continuous monitoring and control even with authenticated users and workloads equipped with the right permissions and engaged in active sessions. Google Cloud’s access risk portfolio brings dynamic capabilities that layer additional security controls around users, workloads, and data.
Enhanced access and session security
Today, you can use Context-Aware Access (CAA) to secure access to Google Cloud based on attributes including user identity, network, location, and corporate-managed devices.
Coming soon, CAA will be further enhanced with Identity Threat Detection and Response (ITDR) capabilities, using numerous activity signals, such as activity from a suspicious source or a new geo location, to automatically identify risky behavior, and trigger further security validations using mechanisms such as multi-factor authentication (MFA), re-authentication, or denials.
We also announced automatic re-authentication, which triggers a re-authentication request when users perform highly sensitive actions such as updating billing accounts. This will be enabled by default, and while you can opt out, we strongly recommend you keep it turned on.
Expanded coverage for VPC Service Controls
VPC Service Controls lets you create perimeters that protect your resources and data, and for services that you explicitly specify. To speed up diagnosis and troubleshooting when using VPC Service Controls, we launched Violation Analyzer and Violation Dashboard to help you diagnose an access denial event.
What’s new in Cloud Governance with Organization Policy Service
Expanded coverage for Custom Organization Policy
Google Cloud’s Organization Policy Service gives you centralized, programmatic control over your organization’s resources. Organization Policy already provides predefined constraints, but for greater control you can create custom organization policies. Custom organization policy has now expanded service coverage, with 62 services supported.
Google Cloud Security Baseline
Google Cloud strives to make good security outcomes easier for customers to achieve. As part of this continued effort, we are releasing an updated and stronger set of security defaults, our Google Cloud Security Baseline. These were rolled out to all new customers last year — enabled by default — and based on positive feedback, we are now recommending them to all existing customers.
Starting this year, existing customers are seeing recommendations in their console to adopt the Google Cloud Security Baseline. You also have access to a simulator that tests how these constraints will impact your current environment.
What’s new with resource management
App-enablement with Resource Manager
We also extended our application centric approach to Google Cloud’s Resource Manager. App-enabled folders, now in preview, streamline application management by organizing services and workloads into a single manageable unit, providing centralized monitoring and management, simplifying administration, and providing an application-centric view.
You can now enable application management on folders in a single step.
Learn more
To learn more, you can view the Next ‘25 session recording with an overview of these announcements.
Welcome to the second Cloud CISO Perspectives for April 2025. Today, Sandra Joyce, vice president, Google Threat Intelligence, will talk about the practical applications of AI in both attack and defense, adapted from her RSA Conference keynote.
As with all Cloud CISO Perspectives, the contents of this newsletter are posted to the Google Cloud blog. If you’re reading this on the website and you’d like to receive the email version, you can subscribe here.
Data-driven insights into AI and cybersecurity
By Sandra Joyce, vice president, Google Threat Intelligence
We have been talking about AI’s exciting potential for cybersecurity for a couple of years. While we should be really excited about the future, we also need to look at the here and now, where AI is already impacting our industry. It’s time for results.
Sandra Joyce, vice president, Google Threat Intelligence
When we look at the current state of AI and cybersecurity, I see three consistent patterns:
There’s a lot of speculation. Much of the conversation focuses on what AI might do in the future rather than the value it can provide right now. It’s treated as a horizon-scanning issue.
There’s experimentation, too. Many security teams are still testing different solutions, not entirely sure yet how they’re going to integrate AI into their workflows.
There are lots of anecdotes. Stories that create a distorted perspective of the landscape based on one-off incidents, which can increase the risk that we lurch from one headline to the next instead of focusing on the reality of AI development in security.
Today’s real world impact
To cut through the noise so we can understand where we should actually be focusing our AI efforts, we need better data – specifically in two buckets: AI in the threat landscape, and AI for defense.
With so many different potential adversarial use cases related to AI, we need to prioritize the most prominent AI-driven attack vectors so we can properly manage the risks they present.
At the same time, CISOs need AI to deliver for defense. What is AI’s real value proposition? How does it meaningfully help deliver savings and improve security outcomes over the next 6 to 12 months?
Today, I’m going to share data-driven analyses that can help eliminate the guesswork, and help you prioritize the practical applications of AI that we’re seeing have a tangible impact.
How attackers are using AI
As part of our work countering threats to Google and our users, Google Threat Intelligence Group analysts track known threat actors, and we investigate how these threat actors are currently attempting to use generative AI, specifically Gemini. We’ve identified Advanced Persistent Threat groups from more than 20 countries that have accessed our public Gemini AI services.
Threat actors have used Gemini to support several phases of the attack lifecycle, including researching potential infrastructure and free hosting providers, performing reconnaissance on target organizations, researching vulnerabilities, payload development, and seeking assistance with malicious scripting and evasion techniques.
Crucially, we see that these are existing attack phases being made more efficient, not fundamentally new AI-driven attacks. We’ve observed threat actors experimenting with AI and finding productivity gains, but not yet developing novel capabilities.
Much of the current discourse can feel overly alarmist. Our analysis shows that while AI is a useful tool for common tasks, we haven’t seen indications of adversaries developing fundamentally new attack vectors using these models.
Attackers are using Gemini the way many of us are using AI: It’s a productivity tool to help them brainstorm and refine their work. Instead of inventing brand new attack methods using AI, they are enhancing traditional tactics. We did not observe unique AI-enabled attacks, or prompt attacks.
The good news is that Gemini’s safety measures continue to restrict adversarial operational capabilities. While Gemini provided assistance with common, neutral tasks like content creation, summarization, and simple coding, it generated safety responses when prompted with more elaborate or explicitly malicious requests. We even observed unsuccessful attempts by threat actors to use Gemini to research techniques for abusing Google products such as Gmail, stealing data, and bypassing account verification.
How defenders are using AI
Thankfully, the same AI capabilities that attackers are using for productivity gains can have a different impact when defenders seize them: They have the power to make defenders even more resilient. There are use cases we recommend CISOs lean into right now to harness the potential of AI.
The growing volume of cyber threats has increased workloads for defenders and created a need for improved automation and innovative approaches. AI has enabled increased efficiency, supporting malware analysis, vulnerability research and analyst workflows.
The true test of any malware analysis tool lies in its ability to identify never-before-seen techniques that are not detected by traditional methods. Gemini can understand how code behaves in a deep way to spot new threats, even threats never seen before, and can make this kind of advanced analysis more widely accessible.
Our current results using large language models (LLMs) to create new fuzzing harnesses are showing real promise. We’ve achieved coverage increases of up to 7,000% across 272 C and C++ projects in OSS-Fuzz.
Google Project Zero and Google DeepMind collaborated on a project called Big Sleep, which has already uncovered its first real-world vulnerability using a LLM.
At Google, we’re using LLMs to speed up our security and privacy incident workflows. Gemini helps us write incident summaries 51% faster while also measurably improving their quality in blind evaluations by human reviewers.
We’re also using AI to reduce toil for our own analyst workflows. GTIG uses an internal AI tool that reviews thousands of event logs collected from an investigation and quickly summarizes them – in minutes – as a bite-sized overview that can be easily understood across the intelligence team, a process that previously took hours of effort.
Another internal AI tool also helps us provide crucial information to customers on the hacktivist threats they face, and reduce toil, in a way that would not be feasible without AI. Our analysts will onboard a hacktivist group’s main social channel (such as Telegram) into the AI tool, and when we have collected enough data from that channel, it creates a comprehensive report on the group’s behavior – including TTPs, preferred targets, and attacks that they’ve claimed credit for. That report is then reviewed, validated, and edited by a GTIG analyst.
We’ve only scratched the surface today of how AI is actively shaping the cybersecurity landscape right now. If you’re reading this from the RSA Conference, please come visit the Google Cloud Security Hub and speak to our experts about the tangible value we’re already gaining from integrated and agentic AI, and how to make Google part of your security team to benefit as well.
You can check out all our RSA Conference announcements here, and of course visit us anytime at our CISO Insights Hub.
In case you missed it
Here are the latest updates, products, services, and resources from our security teams so far this month:
From insight to action: M-Trends, agentic AI, and how we’re boosting defenders at RSAC 2025: From the latest M-Trends report to updates across Google Unified Security, our product portfolio, and our AI capabilities, here’s what’s new from us at RSAC. Read more.
The dawn of agentic AI in security operations at RSAC 2025: Agentic AI promises a fundamental, tectonic shift for security teams, where intelligent agents work alongside human analysts. Here’s our vision for the agentic future. Read more.
Building an open ecosystem for AI-driven security with MCP: Bring AI to your security tools with open-source model context protocol (MCP) servers for Google Security Operations, Google Threat Intelligence, and Security Command Center. Learn how to connect security tools to LLMs. Read more.
3 new ways to use AI as your security sidekick: Generative AI is already providing clear and impactful security results. Here are three decisive examples that organizations can adopt right now. Read more.
Introducing the Cyber Savvy Boardroom podcast: Our new monthly podcast features security and business leaders known for intuition, expertise, and guidance, discussing what matters most with experts from our Office of the CISO. Read more.
Your comprehensive guide to Google Cloud Security at RSA 2025: From connecting with experts to witnessing innovative cloud technology in action, Google Cloud Security is the place to be at the RSA Conference. Read more.
Please visit the Google Cloud blog for more security stories published this month.
Threat Intelligence news
Zero-day exploitation continues to grow gradually: Google Threat Intelligence Group (GTIG) has released a comprehensive overview and analysis of the 75 zero-day vulnerabilities exploited in the wild in 2024. While zero-day exploitation continues to grow at a slow but steady pace, we’ve also started seeing vendor efforts to mitigate zero-day exploitation start to pay off. Read more.
M-Trends 2025: Data, insights, and recommendations from the frontlines: The 16th edition of our annual threat intelligence report provides data, analysis, and learnings drawn from more than 450,000 hours of incident investigations conducted in 2024. Providing actionable insights into current cyber threats and attacker tactics, this year’s report continues our tradition of helping organizations understand the evolving threat landscape and improve their defenses based on real-world data. Read more.
Please visit the Google Cloud blog for more threat intelligence stories published this month.
Now hear this: Podcasts from Google Cloud
How cyber-savvy is your board: We’ve long extolled the importance of bringing boards of directors up to speed on cybersecurity challenges both foundational and cutting-edge, which is why we’ve launched “Cyber Savvy Boardroom,” a new monthly podcast from our Office of the CISO’s David Homovich, Alicja Cade, and Nick Godfrey. Our first three episodes feature security and business leaders known for their intuition, expertise, and guidance, including Karenann Terrell, Christian Karam, and Don Callahan. Listen here.
Going big with cloud security rewards: From vulnerability response at cloud scale to what makes a great vulnerability report, Google Cloud’s Michael Cote and Aadarsh Karumathil discuss and debate the ever-evolving world of vulnerability report rewards with hosts Anton Chuvakin and Tim Peacock. Listen here.
Defender’s Advantage: Going from Windows RDP to rogue: Host Luke McNamara is joined by GTIG Senior Security Researcher Rohit Nambiar to discuss interesting usage of Windows Remote Desktop Protocol by UNC5837. Listen here.
Behind the Binary: Inside a community’s fight against malware: We chat with founder Roman Huessy about the future of community-driven threat intelligence and abuse.ch, a vital non-profit project built by and for the global cybersecurity community to fight against threat actors. Listen here.
To have our Cloud CISO Perspectives post delivered twice a month to your inbox, sign up for our newsletter. We’ll be back in a few weeks with more security-related updates from Google Cloud.
Recently at Google Cloud Next 25, we announced our latest Cross-Cloud Network innovation: Cloud WAN, a fully managed, reliable, and secure solution to transform enterprise wide area network (WAN) architectures. Today, we continue our series of deep dives into the technologies powering Cloud WAN, namely Premium Tier networking and the Verified Peering Provider program.
The ever-changing enterprise network
The evolution of enterprise WANs is marked by a significant shift from primarily connecting branches and headquarters to managing a growing volume of traffic directed towards the internet, cloud-based services, and Software-as-a-Service (SaaS) applications. In this transformed landscape, achieving consistent end-to-end reliability has become a paramount concern for organizations.
However, ensuring end-to-end reliability with traditional WAN architectures can often be costly and complex. The integration of solutions from multiple vendors can escalate both expenses and operational intricacy. Furthermore, the common practice of overprovisioning network resources to guarantee reliability adds a significant financial burden. The reliance on multiple internet hops to reach cloud services can introduce latency and result in unpredictable service level agreements (SLAs). Compounding these challenges, fluctuating bandwidth demands make effective cost management increasingly difficult.
In essence, traditional WAN architectures present substantial hurdles to obtaining consistent end-to-end reliability in a cost-effective, manageable, and predictable manner, especially as businesses accelerate their adoption of cloud and SaaS solutions.
Cloud WAN: Global connectivity for distributed enterprises
Cloud WAN offers a single, fully managed network solution for reliable, any-to-any connectivity. It leverages Google’s high-performance global network — the same infrastructure that connects over a billion users daily to services like YouTube, Search, Maps, and Workspace — to connect enterprise sites, cloud applications, data centers, and users.
Cloud WAN enables a high-performance and consistently reliable architecture through:
Premium Tier: Google’s high-performance global backbone ensures reliable traffic delivery within the Google network, as well as for all internet-bound traffic between Google services and the internet
Verified Peering Provider (VPP) Program: This program ensures reliable, high-quality internet connectivity between the enterprise and Google networks.
The following sections will explore how these components work together to create a strong, dependable, and efficient WAN.
Premium Tier: a high-performance backbone
With Premium Tier, ingress user traffic enters Google’s network using Anycast at the edge point of presence (PoP) that is optimized for the user location. This traffic is carried over Google’s backbone to the relevant application hosted in any Google Cloud region. For outbound traffic, a peering location near the destination ISP is selected to avoid congestion on peering links. This sends outgoing packets along Google’s backbone for the bulk of their journey, and the traffic egresses near the destination, for the highest reliability. Google Cloud’s network is engineered and provisioned so that there are at least three independent paths (N+2 redundancy) between Google Cloud regions, providing availability even in the case of fiber issues or an unplanned outage.
Traffic travels from the peering PoP to the end-user via an internet service provider’s (ISP) network. When the ISP participates in Google’s Verified Peering Provider (VPP) program, that means that Google has pre-validated their redundant and diverse connectivity to Google’s global network, providing a more reliable experience for the customer.
Key benefits of Premium Tier include:
Global reach: Connectivity with 42 regions, 200+ network edge locations worldwide, 2M+ miles of fiber, and 33 subsea cable investments
Enhanced performance: Cross-Cloud Network provides up to 40% improved performance compared to the public internet1
High reliability: Backed by a 99.99% uptime SLA, providing peace of mind for critical applications
Predictable pricing: Starting in 2H’25, we will offer committed use discounts for Premium Tier and Standard Tier internet data transfers
“Google Cloud’s Premium Tier network provides exceptional global reachability and consistent low latency for Snap’s 450M daily active users, enabling reachability in many countries being served from Google cloud regions. Google Cloud’s low inter-region latency helps Snap ensure a responsive real-time user experience, which is critical for excellent user experience. The superior network performance including low latency across cloud providers is the primary reason why Snap chose to use Google Cloud.” – Mahmoud Ragab, Manager Software Engineering, Snap
Verified Peering Provider: Simple and reliable connectivity
The Verified Peering Provider program recognizes ISPs who have demonstrated high-quality, diverse, and reliable connectivity to Google’s network. Google Cloud customers can choose to reach Google’s services through these ISPs to access all publicly accessible Google services.
Choosing a Verified Peering Provider provides several benefits to Google Cloud customers:
Simplified and reliable connectivity: Choosing a Verified Peering Provider simplifies Google connectivity by identifying ISPs that offer internet services optimized for enterprises. Customers who choose to connect with a Verified Peering Provider aren’t required to meet Google’s peering requirements, leaving the complexities of peering arrangements to the ISPs.
Stable internet latency: The program’s peering redundancy requirements help ensure that participating ISPs can maintain diverse paths to Google’s network across physically separated network locations, minimizing single points of failure. This design helps keep latency stable and predictable during planned network maintenance or unexpected outages.
Ease of locating a well-connected ISP: Customers often struggle to locate an ISP with diverse and reliable connectivity to Google’s network, or understand where an ISP’s network is connected to Google. The Verified Peering Provider program discloses the locations where an ISP is connected to Google, allowing customers to select the ISP that is closest to their workloads.
Expanding the Verified Peering Provider program
With Cloud WAN, we also expanded the Verified Peering Provider program. Since its launch last year, the program has successfully enrolled more than 40 ISPs across 50 metropolitan areas, spanning North America, Europe, Latin America, and the Asia Pacific region. These partnerships have been crucial in enhancing the Google Cloud experience for our users, offering simplified connectivity solutions to access publicly accessible Google services.
Building on this momentum, we are broadening VPP enrollment eligibility to additional ISPs worldwide. We encourage all interested ISPs to review the technical criteria and begin the enrollment process.
The foundation of tomorrow’s enterprise network
Together, Premium Tier and the Verified Peering Provider program enable Cloud WAN with high-performance, reliable, and secure connectivity to Google Cloud resources and the broader internet.
Premium Tier helps ensure that traffic between the internet and Google Cloud stays on Google’s high-performance global network, as close to the user as possible, maximizing reliability and performance. This is crucial for globally distributed enterprises that require consistent application performance and user experience across different regions.
The Verified Peering Provider program works alongside Premium Tier, signaling that the ISPs connecting to Google have reliable and redundant connectivity from Enterprise branches to Google’s global network. By choosing a Verified Peering Provider, you get simplified, enterprise-ready connectivity using the ISP’s existing connection to Google — plus access to its SLAs and support offerings.
By combining Premium Tier and a Verified Peering Provider, enterprises can achieve end-to-end:
Improved performance: Higher bandwidth for faster application response times
Enhanced reliability: Increased network uptime with consistent user experience
Simplified management: Streamlined network operations and reduced complexity
The demand for reliable and efficient enterprise network connectivity will only continue to grow with the emergence of AI. By leveraging Cloud WAN, enterprises can upgrade their networks and unlock the potential of cloud-based applications and services. Learn more about Cloud WAN on the Cross-Cloud Network solution page, and read the first blog in our Cloud WAN deep dive series on NCC Gateway.
1. During testing, network latency was more than 40% lower when traffic to a target traveled over the Cross-Cloud Network compared to when traffic to the same target traveled across the public internet.
At Next ’25, we introduced several new innovations within BigQuery, the autonomous data to AI platform. BigQuery ML provides a full range of AI and ML capabilities, enabling you to easily build generative AI and predictive ML applications with BigQuery. The new AI and ML capabilities from BigQuery ML include:
a new state-of-the-art pre-trained forecasting model (TimesFM) which drastically simplifies forecasting problems
support for generating or extracting structured data with large language models (LLMs)
a set of new row-wise inference functions enabling you to mix gen AI processing with standard SQL
expanded model choice with Gemini and OSS models
the general availability of the Contribution Analysis feature, useful for explaining changes in your business metrics
Let us explore these new capabilities.
1. TimesFM forecasting model in BigQuery
Accurate time series forecasting is essential for many business scenarios such as planning, supply chain management, and resource allocation. BigQuery now embeds TimesFM, a state-of-the-art (SOTA) pre-trained model from Google Research, enabling powerful forecasting via the simple AI.FORECAST function. Trained on over 100 billion real-world time-points, TimesFM provides impressive zero-shot forecasting accuracy across various real world domains and at different granularities without requiring you to train or tune on your data.
Key benefits of TimesFM in BigQuery include:
Managed and scalable: A fully managed, highly scalable forecasting engine within BigQuery.
Easy forecasting: Generate forecasts for one or millions of time series in a single query – no model training required.
Here’s a basic example of creating a forecast using the new AI.FORECAST function with TimesFM:
SQL
SELECT * FROM AI.FORECAST(
  TABLE dataset.table,
  data_col => "data",
  timestamp_col => "timestamp",
  model => "TimesFM 2.0",
  horizon => 30
)
This query forecasts the “data” column for the next 30 time units, using “timestamp” as the time identifier. Please see the documentation for more details.
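To illustrate the “one or millions of time series” case, here is a minimal sketch that forecasts every product in a sales table independently. It assumes AI.FORECAST accepts an id_cols argument for identifying individual series; the table and column names are placeholders, so check the documentation for the exact signature.
SQL
-- Sketch: forecast 30 time units of sales for every product_id in one query.
-- Assumes an id_cols argument; mydataset.daily_sales and its columns are placeholder names.
SELECT * FROM AI.FORECAST(
  TABLE mydataset.daily_sales,
  data_col => "sales",
  timestamp_col => "sale_date",
  id_cols => ["product_id"],
  model => "TimesFM 2.0",
  horizon => 30
)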
2. Structured data extraction and generation with LLMs
Extracting structured information consistently from unstructured data such as customer reviews, emails, logs etc. can be complex. BigQuery’s new AI.GENERATE_TABLE function simplifies structured data extraction/generation using the constrained decoding capabilities of LLMs. This function takes a model, a table of input data and an output_schema as inputs and outputs a table whose schema is determined by the output_schema parameter.
Here’s how you can use AI.GENERATE_TABLE:
SQL
SELECT * FROM AI.GENERATE_TABLE(
  MODEL project_id.dataset.model,
  (SELECT medical_transcripts AS prompt FROM table),
  STRUCT("age INT64, medications ARRAY<STRING>" AS output_schema)
)
In this example, the output table has ‘age’ and ‘medications’ columns — no complex parsing required. The output is written as a BigQuery temporary table. To materialize the results to a permanent table, the above query can be used in a DDL statement:
CREATE TABLE project_id.dataset.my_structured_table
AS <AI.GENERATE_TABLE subquery>
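Putting the two pieces together, a complete materialization statement might look like the following; it simply wraps the earlier AI.GENERATE_TABLE query (with the same placeholder model, table, and column names) in a CREATE TABLE … AS statement.
SQL
-- Materialize the structured extraction results into a permanent table.
-- All names below are the same placeholders used in the example above.
CREATE TABLE project_id.dataset.my_structured_table AS
SELECT * FROM AI.GENERATE_TABLE(
  MODEL project_id.dataset.model,
  (SELECT medical_transcripts AS prompt FROM table),
  STRUCT("age INT64, medications ARRAY<STRING>" AS output_schema)
);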
3. Row-wise AI functions for LLM inference
The first wave of BigQuery’s LLM functions focused on table-valued functions (TVFs) that output entire tables. We are now introducing row-wise AI functions for LLM inference, enabling more flexible and expressive data manipulation and analysis. These scalar functions enhance the usability of LLMs within BigQuery, as they can be used anywhere a value is needed, such as in SELECT, WHERE, JOIN, and GROUP BY clauses. Let’s go through some of the capabilities we are adding:
a) Basic text generation with AI.GENERATE
First, let’s see how the new AI.GENERATE() can be used for convenient row-wise LLM inference:
SQL
SELECT
  city,
  AI.GENERATE(
    ('Give a short, one sentence description of ', city),
    connection_id => 'us.test_connection',
    endpoint => 'gemini-2.0-flash').result
FROM mydataset.cities;
b) Structured output with AI.GENERATE
In addition, the structured output generation capabilities introduced above also extend to row-wise AI functions. In this example, the query generates state capitals for a list of states, using the output_schema argument to set two custom fields in the output struct — state and capital:
SQL
SELECT
  state,
  AI.GENERATE(
    ('What is the capital of ', state, '?'),
    connection_id => 'us.example_connection',
    endpoint => 'gemini-2.0-flash',
    output_schema => 'state STRING, capital STRING').capital
FROM mydataset.states;
c) Type-specific functions (e.g., AI.GENERATE_BOOL)
For common tasks requiring specific data types like boolean, integer, or float, BigQuery now offers simple, type-specific functions. For instance, you can use AI.GENERATE_BOOL for classification or validation tasks:
SQL
SELECT city.name, AI.GENERATE_BOOL(
  ("Is", city.name, "in the state of WA?"),
  connection_id => "us.example_connection",
  endpoint => 'gemini-2.0-flash').result
FROM city
Additional type-specific functions, namely AI.GENERATE_INT and AI.GENERATE_DOUBLE, are also available for generating integer and floating-point results. Please see the documentation for more details.
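For illustration, a sketch of the integer variant follows. It assumes AI.GENERATE_INT takes the same prompt, connection_id, and endpoint arguments as AI.GENERATE_BOOL and returns its value in a result field; the table and column names are placeholders.
SQL
-- Sketch: extract one integer per row with AI.GENERATE_INT.
-- Assumes the same argument shape as AI.GENERATE_BOOL; mydataset.cities is a placeholder table.
SELECT
  city,
  AI.GENERATE_INT(
    ('Roughly how many people live in ', city, '? Answer with a single integer.'),
    connection_id => 'us.example_connection',
    endpoint => 'gemini-2.0-flash').result
FROM mydataset.cities;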
4. Expanded model choice: Gemini, OSS and third-party
BigQuery ML allows you to use LLMs to perform tasks such as entity extraction, sentiment analysis, translation, text generation, and more on your data using familiar SQL syntax. In addition to first-party Gemini models, BigQuery supports inference with open-source and third-party models, which comes in two flavors:
Customer-managed endpoints for open source models (previously announced): You can host any open source model of your choice on a Vertex AI Model Garden endpoint and then use it from BigQuery (see the sketch after this list).
Model as a service integrations: Access fully managed model endpoints directly through BigQuery. These integrations already include models like Anthropic’s Claude, and we are excited to announce newly added support for Llama and Mistral models, further expanding the model choice available to developers.
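To illustrate the customer-managed endpoint flavor above, here is a minimal sketch of registering a Model Garden endpoint as a BigQuery remote model and then querying it with ML.GENERATE_TEXT. The connection, project, region, and endpoint values are hypothetical placeholders, and the exact OPTIONS accepted for a given endpoint may differ, so treat this as an outline and consult the remote model documentation.
SQL
-- Sketch only: register a model deployed to a Vertex AI endpoint as a remote model.
-- `us.example_connection` and the endpoint URL are placeholder values, not real resources.
CREATE OR REPLACE MODEL mydataset.my_oss_llm
  REMOTE WITH CONNECTION `us.example_connection`
  OPTIONS (
    endpoint = 'https://us-central1-aiplatform.googleapis.com/v1/projects/my-project/locations/us-central1/endpoints/1234567890'
  );

-- Run batch inference over a table of prompts with the remote model.
SELECT *
FROM ML.GENERATE_TEXT(
  MODEL mydataset.my_oss_llm,
  (SELECT review AS prompt FROM mydataset.reviews),
  STRUCT(TRUE AS flatten_json_output)
);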
5. Contribution analysis now generally available
Businesses constantly need to answer questions like “Why did our sales drop last month?” or “For which user, device, and demographics combination was our marketing campaign most effective?” Answering these “why” questions accurately is vital, but often involves complex manual analysis. The BigQuery contribution analysis feature automates this analysis and helps you pinpoint the key factors (or combinations of factors) responsible for the most significant changes in a metric between the control and test groups you define.
Now generally available, the BigQuery ML contribution analysis release includes enhancements focused on improved interpretability and performance, including:
A new summable by category metric to analyze the sum of a numerical measure of interest normalized by a categorical variable
Top-K Insights by Apriori Support option to automatically fetch k insights with the largest segment size
A redundant insight pruning option, which improves result readability by returning only unique insights
Let’s say you want to understand what drove changes in the average sales per user across various vendors and payment types between the control and test data. To answer this with a contribution analysis model, you tell BigQuery which factors (dimensions) to investigate (dimension_id_cols), what metric you care about (contribution_metric), and which column identifies your test/control groups (is_test_col).
SQL
-- Define the contribution analysis task
CREATE MODEL bqml_tutorial.contribution_analysis_model
  OPTIONS (
    model_type = 'CONTRIBUTION_ANALYSIS',
    dimension_id_cols = ['vendor', 'month', 'payment_type'],
    contribution_metric = 'sum(sales)/count(distinct user_id)',
    is_test_col = 'is_test_col',
    top_k_insights_by_apriori_support = 25,
    pruning_method = 'PRUNE_REDUNDANT_INSIGHTS'
  ) AS
SELECT * FROM dataset.input_data;
Once the model is created, you can use a SQL query like the following to generate insights:
SELECT * FROM ML.GET_INSIGHTS (MODEL bqml_tutorial.contribution_analysis_model);
BigQuery returns a prioritized list showing which combinations of factors (e.g., “Users paying via Amex Credit Card from Vendor”) had the most significant impact on the average sales per user between your control and test groups.
Bring AI into your data
The latest BigQuery ML updates bring powerful AI/ML capabilities directly into your data workflows. Between forecasting with TimesFM, automated root-cause analysis with contribution analysis, flexible row-wise LLM functions, streamlined structured data generation, and expanded model choice, you can move faster from data to insights and impactful outcomes.
AI is fundamentally transforming the compute landscape, demanding unprecedented advances in data center infrastructure. At Google, we believe that physical infrastructure — the power, cooling, and mechanical systems that underpin everything — isn’t just important, but critical to AI’s continued scaling.
We have a long-standing partnership with the Open Compute Project (OCP) that has been instrumental in driving industry collaboration and open innovation in infrastructure. At the 2025 OCP EMEA Summit today, we discussed the power delivery transformation from 48 volts direct current (VDC) to the new +/-400 VDC, which will enable IT racks to scale from 100 kilowatts up to 1 megawatt. We also shared that we’ll contribute our fifth-generation cooling distribution unit, Project Deschutes, to OCP, helping to accelerate adoption of liquid cooling industry-wide.
Transforming power delivery with 1 MW per IT rack
Google has a long history of advancing data center power delivery. Almost 10 years ago, we championed the adoption of 48 VDC inside the IT rack to significantly increase power distribution efficiency and reduce losses compared to typical 12 VDC solutions. The industry responded to our call to action to collaborate on this technology, and the resulting architecture has worked well, scaling IT racks from 10 kilowatts to 100 kilowatts.
The AI era requires even greater power delivery capabilities for two distinct reasons. The first is simply that ML will require more than 500 kW per IT rack before 2030. The second is the densification of each IT rack, where every millimeter of space in the IT rack is used for tightly interconnected “xPUs” (e.g. GPUs, TPUs, CPUs). This requires a much higher voltage DC power distribution solution, where power components and battery backup are outside of the IT rack.
We are excited to introduce +/-400 VDC power delivery that can support up to 1 MW per rack. This is about much more than simply increasing power delivery capacity: selecting 400 VDC as the nominal voltage allows us to leverage the supply chain established by electric vehicles (EVs), for greater economies of scale, more efficient manufacturing, and improved quality, to name a few benefits. As part of the Mt Diablo project, we are collaborating with Meta and Microsoft at OCP to standardize the electrical and mechanical interfaces, and the 0.5 specification draft will be available for industry feedback in May.
The first embodiment of this work is an AC-to-DC sidecar power rack that disaggregates power components from the IT rack. This solution improves the end-to-end efficiency by ~ 3% while enabling the entire IT rack to be used for xPUs. Longer term, we are exploring directly distributing higher-voltage DC power within the data center and to the rack, for even greater power density and efficiency.
+/-400 VDC power delivery: AC-to-DC sidecar power rack
The liquid cooling imperative
The dramatic increase in chip power consumption — from 100W chips to accelerators exceeding 1000W — has made advanced thermal management essential. Packing more powerful chips into racks also creates significant challenges for cooling density. Liquid cooling has emerged as the clear solution, given its superior thermal and hydraulic properties. Water can transport approximately 4000 times more heat per unit volume than air for a given temperature change, while the thermal conductivity of water is roughly 30 times greater than air.
At Google, we’ve deployed liquid cooling at GigaWatt scale across more than 2000 TPU Pods in the past seven years with remarkable uptime — consistently at about 99.999%. Google first used liquid cooling in TPU v3 that was deployed in 2018. Liquid-cooled ML servers have nearly half the geometrical volume of their air-cooled counterparts because they replace bulky heatsinks with compact cold plates. This allowed us to double chip density and quadruple the size of our liquid-cooled TPU v3 supercomputer compared to the air-cooled TPU v2 generation.
We’ve continued to refine this technology generation over generation, from TPU v3 and TPU v4, through TPU v5, and most recently, Ironwood. Our implementation utilizes in-row coolant distribution units (CDUs) with redundant components and uninterruptible power supplies (UPS) for high availability. These CDUs isolate the rack’s liquid loop from the facility loop, providing a controlled, high-performance cooling system delivered via manifolds, flexible hoses, and cold plates that are directly attached to the high-power chips. In our CDU architecture, named Project Deschutes, the pump and heat exchanger unit is redundant, which is what has enabled us to consistently achieve the above-mentioned fleet-wide CDU availability of ~99.999% since 2020.
We will contribute the fifth-generation Project Deschutes CDU, currently in development, to OCP later this year. This contribution, including system details, specifications, and best practices, is intended to help accelerate the industry’s adoption of liquid cooling at scale. Our insights are drawn from nearly a decade of designing and deploying liquid cooling across four generations of TPUs, and encompass:
Design for high cooling performance
Manufacturing quality
Reliability and uptime
Deployment velocity
Serviceability and operational excellence
Supply ecosystem advancements
Project Deschutes CDU: 4th gen in deployment, 5th gen in concept
Get ready for the next generation of AI
We’re encouraged by the significant strides the industry has made in power delivery and liquid cooling. However, with the accelerating pace of AI hardware development, it’s clear that we must collectively quicken our pace to prepare data centers for what’s next. We’re particularly excited about the potential for rapid industry adoption of +/-400 VDC, facilitated by the upcoming Mt Diablo specification. We also strongly encourage the industry to adopt the Project Deschutes CDU design and leverage our extensive liquid cooling learnings. Together, by embracing these advancements and fostering deeper collaboration, we believe the most impactful innovations are still ahead.
The rise of AI is revolutionizing data management platforms, where advanced automation, built-in data intelligence, and AI-powered data management are changing how organizations manage traditional tasks like data ingestion, data processing and governance.
We’re excited to announce that Google was named a Leader in The Forrester Wave™: Data Management for Analytics Platforms, Q2 2025 report. In the report, Google received 5 out of 5, the highest score possible, across 13 different criteria. We believe this is a testament to our strengths in several key areas, particularly in delivering agentic experiences that automate manual tasks and accelerate gen AI use cases, built-in intelligence to unlock new insights from structured and unstructured data, real-time capabilities driving insights to action, and a secure and governed multimodal data foundation with governance across the data-to-AI lifecycle.
According to the report:
Google’s distinctive and forward-thinking vision is to provide a unified, agentic, intelligent, and seamlessly integrated data platform that blends data management, advanced analytics, and AI capabilities at scale. The platform continues to evolve rapidly, focusing on advanced automation, open standards, global scale, self-service, and deeper integration with other Google services. The vendor’s roadmap is exceptionally well-defined, delivering a powerful strategic direction and alignment with AI positioned at its core.
Google placed furthest on Strength of Strategy and received above-average customer feedback in the evaluation, denoted by the halo around Google’s circle. Customers such as Dun & Bradstreet, Shopify, General Mills, and many more choose BigQuery for its autonomous data and AI capabilities when building their data management platforms. Let’s take a closer look at the capabilities that differentiate Google Cloud’s data platform.
Agentic and AI-assisted capabilities to power your analytics
Data management isn’t just about storing and querying data; it’s also about intelligent automation and assistance. As highlighted in our recent announcements from Google Cloud Next 25, BigQuery has evolved into an autonomous data-to-AI platform, where specialized data agents, advanced engines, and business users can all operate on a self-managing multimodal data foundation built for processing and activating all types of data. With assistive capabilities powered by gen AI and integrations with Vertex AI for model building and deployment, you can reduce the complexities of data management and smooth the path from raw data to actionable AI-driven insights.
BigQuery’s AI-powered data management capabilities are designed for users of all skill levels. Data analysts can use natural language to query data, generate SQL, and summarize results. Data engineers can automate manual tasks like data preparation, building data pipelines, and performing anomaly detection to accelerate analytics workflows. Data scientists can use AI-driven notebook experiences and new engines to process complex data and support advanced analyses in real time.
A multimodal data foundation with unified governance
BigQuery helps unify analytics across diverse data types by allowing data teams to build on an open lakehouse foundation. It combines highly performant native data management capabilities with support for open formats like Apache Iceberg, Delta, and Hudi. Multimodal support lets you store and analyze structured and unstructured data within the same table, streamlining complex analytics workflows. Finally, BigQuery’s universal catalog lets you work across SQL, Spark, AI, BI, and third-party engines, all with a flexible and open data lakehouse architecture, supporting interoperability.
Beyond the universal catalog, BigQuery data governance (powered by Dataplex) provides a unified experience for discovering, managing, monitoring, and governing data across data lakes, warehouses, and AI models. It also enables consistent policy enforcement, automated data quality checks, and comprehensive lineage tracking. Combined with a robust security infrastructure and fine-grained access controls, it helps you manage your data and AI assets with confidence, supporting compliance and building trust. Features like managed disaster recovery, enhanced workload management for aligning budget with performance needs, and flexible pricing with spend-based commitments further reinforce enterprise readiness.
Built-in intelligence for real-time insights
BigQuery enables your teams to build and deploy machine learning models using their existing SQL skills. This helps to eliminate complexity and accelerates the adoption of AI across the organization. BigQuery’s integration with advanced AI models helps to extract insights from multimodal data in documents, videos, and images. Scalable vector search supports intelligent recommendations, while the new BigQuery AI query engine allows analysts to use familiar SQL and LLMs for real-world context when analyzing unstructured data.
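As one concrete illustration of the vector search capability mentioned above, a minimal sketch of a BigQuery VECTOR_SEARCH query might look like the following. The table names, embedding columns, and top_k value are placeholders, and the argument details should be checked against the current documentation.
SQL
-- Sketch: find the five nearest product embeddings for each query embedding.
-- mydataset.products, its embedding column, and mydataset.query_items are placeholder names.
SELECT
  query.item_id,
  base.product_id,
  distance
FROM VECTOR_SEARCH(
  TABLE mydataset.products, 'embedding',
  (SELECT item_id, embedding FROM mydataset.query_items),
  top_k => 5,
  distance_type => 'COSINE'
);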
Real-time data capabilities are important for bringing fresh data to your AI models. BigQuery is designed from the ground up to support high-throughput streaming ingestion, allowing data to be analyzed as soon as it arrives. Real-time data combined with built-in machine learning and AI enables use cases like real-time fraud detection, dynamic personalization, operational monitoring, and immediate response to changing market conditions. Combining real-time data pipelines with Vertex AI allows you to build and deploy models that react instantly, turning real-time data into real-time intelligent action.
Google is your partner for data to AI transformation
Google’s recognition as a Leader in The Forrester Wave™: Data Management For Analytics Platforms, Q2 2025 report validates our strategy and execution in delivering a comprehensive, AI-powered platform. Our focus on AI-driven assistance, a multimodal data foundation, and real-time intelligence helps reduce manual data management tasks, so you can accelerate insights and innovate faster.
As we evolve BigQuery into an autonomous data-to-AI platform, we are committed to helping you navigate the complexities of the modern data landscape and lead with data and AI. Thank you, our customers and partners, for choosing BigQuery to power your data management and analytics. Learn more about BigQuery today by visiting our website. Read the full Forrester Wave™: Data Management For Analytics Platforms, Q2 2025 report here.
Forrester does not endorse any company, product, brand, or service included in its research publications and does not advise any person to select the products or services of any company or brand based on the ratings included in such publications. Information is based on the best available resources. Opinions reflect judgment at the time and are subject to change. For more information, read about Forrester’s objectivity here.
The traditional drug discovery process involves massive capital investments, prolonged timelines, and is plagued with daunting failure rates. From initial research to obtaining regulatory approval, bringing a new drug to market can take decades. During this time, many drug candidates that had seemed very promising fail to deliver, either due to inefficacy or safety concerns. Only a small fraction of candidates successfully make it through clinical trials and regulatory hurdles.
Enter SandboxAQ, which is helping researchers explore vast chemical spaces, gain deep insights into molecular interactions, and predict biological outcomes with precision. It does so with cutting-edge computational approaches such as active learning, absolute free energy perturbation solution (AQFEP), generative AI, structural analysis, and predictive data analytics, ultimately reducing drug discovery and development timelines. And it does all this on a cloud-native foundation.
Drug design involves an iterative cycle of designing, synthesizing, and testing molecules referred to as the Design-Make-Test cycle. Many customers approach SandboxAQ during the design phase, often when their computational methods are falling short. By improving and accelerating this part of the cycle, SandboxAQ helps medicinal chemists bring innovative and effective molecules to market. For example, in a project related to neurodegenerative disease, SandboxAQ’s approach expanded chemical space from 250,000 to 5.6 million molecules, achieving a 30-fold increase in hit rate and dramatically accelerating the discovery of candidate molecules.
Cloud-native development for scientific insight
SandboxAQ’s software relies on large-scale computation, so to maximize flexibility and scale, the company uses a cloud strategy that includes Google Cloud infrastructure and tools.
The technologies in large-scale virtual screening campaigns need to be agile and scale cost-effectively. Specifically, SandboxAQ engineers need to be able to quickly iterate on scientific code, immediately run that code at scale cost-effectively, and store and organize all of the data it produces.
SandboxAQ achieved a significant boost in efficiency and scalability with Google Cloud infrastructure. They scaled their computational throughput by 100X to leverage tens of thousands of virtual machines (VMs) in parallel. They also improved utilization by reducing idle time by 90%. By consolidating development and deployment on Google Cloud, SandboxAQ streamlined its workflows, from code development and testing to large-scale batch processing and machine-learning model training.
All of SandboxAQ’s development and deployment takes place in the cloud. Code and data live in cloud-based services, and development is done on a cloud-based platform that provides scientists and engineers with self-service VMs with standardized and centrally maintained environments and tools. This is important, because scientific code often requires heavy-duty computing hardware. Scientists have access to hefty 96-core machines, or instances with large GPUs. They can also create new machines with alternate configurations or CPU types, enabling low-friction testing and development processes across heterogeneous resources.
SandboxAQ scientists and developers manage and access these bench machines using the company’s `bench` client. They can connect to machines via SSH or use any number of managed tools, for example a browser-based VNC service for instant remote desktop, or JupyterLab for a familiar notebook development flow.
When code is ready to be run at a larger scale, researchers can dispatch parameterized sets of computations as jobs on an internal SandboxAQ tool powered by Batch, a fully managed service to schedule, queue, and execute batch jobs on Google infrastructure. With development and batch runtime environments closely synced, changes can be quickly run at scale. Code developed on bench machines is pushed to GitHub and is immediately available for batch execution. Then, as tools are reviewed and merged into `main` of the company’s monorepo, the new tools become automatically available on SandboxAQ scientists’ bench machines, and scientists can launch parallel jobs processing millions of molecules on any kind of Google Cloud VM resource in any global zone, using either on-demand or Spot VMs.
SandboxAQ’s implementation of a globally resolved transitive dependency tree enables simple package and dependency management. With this practice, Batch can seamlessly integrate with individual tools developed by engineers to train many instances of a model in parallel.
Machine learning is a core component of SandboxAQ’s strategy, making easy data access especially important. At the same time, SandboxAQ’s Drug Discovery team also works with clients who have sensitive data. To secure customers’ data, bench and batch workloads read and write data from a unified interface that’s managed via IAM, allowing granular control of different data sources within the organization.
Meanwhile, Google Cloud services like Cloud Logging, Cloud Monitoring, Compute Engine and Cloud Run make it simple to develop tools to monitor these workloads, easily surface logs to SandboxAQ scientists, and comb through huge amounts of output data. As new features are tested or bugs show up, changes are made immediately available to the scientific team, without having to wrangle infrastructure. Then, as code becomes stable, they can incorporate it into downstream production applications, all in a centrally secured, unified way on Google Cloud.
In short, having a unified development, batch compute, and production environment on Google Cloud reduces the friction SandboxAQ faces to develop new workloads and run them at scale. With shared environments for scientific workload development and engineering, SandboxAQ makes it quick and easy for customers to move from experimentation to production, delivering the results customers want, fast.
SandboxAQ solution in the real world
SandboxAQ is already having a profound impact on drug discovery programs targeting a range of hard-to-treat diseases. For example, there are advanced collaborations with Professor Stanley Prusiner’s lab at the University of California, San Francisco (UCSF), Riboscience, Sanofi, and the Michael J. Fox Foundation, to name a few. With this approach built on Google Cloud, SandboxAQ has achieved a superior hit rate compared to other methods like high-throughput screening, demonstrating the transformative potential of SandboxAQ on drug discovery and bringing cures to patients faster.
At Google Cloud Next 25, we expanded the availability of Gemini in Looker, including Conversational Analytics, to all Looker platform users, redefining how line-of-business employees can rapidly gain access to trusted data-driven insights through natural language. Due to the complexity inherent in traditional business intelligence products, which require steep learning curves or advanced SQL knowledge, many potential users who could benefit from BI tools simply don’t. But with the convergence of AI and BI, the opportunity to ask questions and chat with your data using natural language breaks down the barriers that have long stood in the way.
Conversational Analytics from Looker is designed to make BI more simple and approachable, democratizing data access, enabling users to ask data-related queries in plain, everyday language, and go beyond static dashboards that often don’t answer all potential questions. In response, users receive accurate and relevant answers derived from Looker Explores or BigQuery tables, without needing to know SQL or specific data tools.
For data analysts, this means fewer support tickets and interruptions, so they can focus on higher-priority work. Business users can now run their own data queries and get answers, empowering trusted self-service and putting the controls in the hands of users who need the answers most. Now, instead of struggling with field names and date formats, users can simply ask questions like “What were our top-performing products last quarter?” or “Show me the trend of website traffic over the past six months.” Additionally, when using Conversational Analytics with Looker Explores, users can be sure tables are consistently joined and metrics are calculated the same way every time.
With Conversational Analytics, ask questions of your data and get AI-driven insights.
Conversational Analytics in Looker is designed to be simple, helpful, and easy to use, offering:
Trusted, consistent results: Conversational Analytics only uses fields defined by your data experts in LookML. Once the fields are selected, they are deterministically translated to SQL by Looker, the same way every time.
Transparency with “How was this calculated?”: This feature provides a clear, natural language explanation of the underlying query that generated the results, presented in easy-to-understand bullet points.
A deeper dive with follow-up questions: Just like a natural conversation, users can ask follow-up questions to explore the data further. For example, users can ask to filter a result to a specific region, to change the timeframe of the date filter, or to switch from a bar graph to an area chart. Conversational Analytics allows for seamless iteration and deeper exploration of the data.
Hidden insights with Gemini: Once the initial query results are displayed, users can click the “Insights” button to ask Gemini to analyze the data results and generate additional insights about patterns and trends they might have otherwise missed.
Empowering data analysts and developers
With the release of Conversational Analytics, our goal is for it to benefit data analysts and developers in addition to line-of-business teams. The Conversational Analytics agent lets data analysts provide crucial context and instructions to Gemini, enhancing its ability to answer business users’ questions effectively and empowering analysts to map business jargon to specific fields, specify the best fields for filtering, and define custom calculations.
Analysts can further curate the experience by creating agents for specific use cases. When business users select an agent, they can feel confident that they are interacting with the right data source.
As announced at Next 25, the Conversational Analytics API will power Conversational Analytics across multiple first-party Google Cloud experiences and third-party products, including customer applications, chat apps, Agentspace, and BigQuery, bringing the benefits of natural language queries over your data to the applications where you work every day. Later this year we’ll also bring Conversational Analytics into Looker Dashboards, allowing users to chat with their data in that familiar interface, whether inside Looker or embedded in other applications. Also, if you’re interested in solving even more complex problems while chatting with your data, you can try our new Code Interpreter (available in preview), which uses Python rather than SQL to perform advanced analysis like cohort analysis and forecasting. With the Conversational Analytics Code Interpreter, you can tackle data science tasks without learning advanced coding or statistical methods. Sign up for access here.
Expanding the reach of AI for BI
Looker Conversational Analytics is a step forward in making BI accessible to a wider audience. By removing the technical barriers and providing an intuitive, conversational interface, Looker is empowering more business users to leverage data in their daily routines. With Conversational Analytics available directly in Looker, organizations can now make data-driven insights a reality for everyone. Start using Conversational Analytics today in your Looker instance.
Written by: Casey Charrier, James Sadowski, Clement Lecigne, Vlad Stolyarov
Executive Summary
Google Threat Intelligence Group (GTIG) tracked 75 zero-day vulnerabilities exploited in the wild in 2024, a decrease from the number we identified in 2023 (98 vulnerabilities), but still an increase from 2022 (63 vulnerabilities). We divided the reviewed vulnerabilities into two main categories: end-user platforms and products (e.g., mobile devices, operating systems, and browsers) and enterprise-focused technologies, such as security software and appliances.
Vendors continue to drive improvements that make some zero-day exploitation harder, demonstrated by both dwindling numbers across multiple categories and reduced observed attacks against previously popular targets. At the same time, commercial surveillance vendors (CSVs) appear to be increasing their operational security practices, potentially leading to decreased attribution and detection.
We see zero-day exploitation targeting a greater number and wider variety of enterprise-specific technologies, although these technologies still remain a smaller proportion of overall exploitation when compared to end-user technologies. While the historic focus on the exploitation of popular end-user technologies and their users continues, the shift toward increased targeting of enterprise-focused products will require a wider and more diverse set of vendors to increase proactive security measures in order to reduce future zero-day exploitation attempts.
Scope
This report describes what Google Threat Intelligence Group (GTIG) knows about zero-day exploitation in 2024. We discuss how targeted vendors and exploited products drive trends that reflect threat actor goals and shifting exploitation approaches, and then closely examine several examples of zero-day exploitation from 2024 that demonstrate how actors use both historic and novel techniques to exploit vulnerabilities in targeted products. The following content leverages original research conducted by GTIG, combined with breach investigation findings and reporting from reliable open sources, though we cannot independently confirm the reports of every source. Research in this space is dynamic and the numbers may adjust due to the ongoing discovery of past incidents through digital forensic investigations. The numbers presented here reflect our best understanding of current data.
GTIG defines a zero-day as a vulnerability that was maliciously exploited in the wild before a patch was made publicly available. GTIG acknowledges that the trends observed and discussed in this report are based on detected and disclosed zero-days. Our analysis represents exploitation tracked by GTIG but may not reflect all zero-day exploitation.
Download the full report: A 2024 Zero-Day Exploitation Analysis (https://services.google.com/fh/files/misc/2024-zero-day-exploitation-analysis-en.pdf)
Key Takeaways
Zero-day exploitation continues to grow gradually. The 75 zero-day vulnerabilities exploited in 2024 follow a pattern that has emerged over the past four years. While individual year counts have fluctuated, the average trendline indicates that the rate of zero-day exploitation continues to grow at a slow but steady pace.
Enterprise-focused technology targeting continues to expand. GTIG continued to observe an increase in adversary exploitation of enterprise-specific technologies throughout 2024. In 2023, 37% of zero-day vulnerabilities targeted enterprise products. This jumped to 44% in 2024, primarily fueled by the increased exploitation of security and networking software and appliances.
Attackers are increasing their focus on security and networking products. Zero-day vulnerabilities in security software and appliances were a high-value target in 2024. We identified 20 security and networking vulnerabilities, which was over 60% of all zero-day exploitation of enterprise technologies. Exploitation of these products, compared to end-user technologies, can more effectively and efficiently lead to extensive system and network compromises, and we anticipate adversaries will continue to increase their focus on these technologies.
Vendors are changing the game. Vendor investments in exploit mitigations are having a clear impact on where threat actors are able to find success. We are seeing notable decreases in zero-day exploitation of some historically popular targets such as browsers and mobile operating systems.
Actors conducting cyber espionage still lead attributed zero-day exploitation. Between government-backed groups and customers of commercial surveillance vendors (CSVs), actors conducting cyber espionage operations accounted for over 50% of the vulnerabilities we could attribute in 2024. People’s Republic of China (PRC)-backed groups exploited five zero-days, and customers of CSVs exploited eight, continuing their collective leading role in zero-day exploitation. For the first year ever, we also attributed the exploitation of the same volume of 2024 zero-days (five) to North Korean actors mixing espionage and financially motivated operations as we did to PRC-backed groups.
Looking at the Numbers
GTIG tracked 75 exploited-in-the-wild zero-day vulnerabilities that were disclosed in 2024. This number appears to be consistent with a consolidating upward trend that we have observed over the last four years. After an initial spike in 2021, yearly counts have fluctuated but have not returned to the lower numbers we saw prior to 2021.
While there are multiple factors involved in discovery of zero-day exploitation, we note that continued improvement and ubiquity of detection capabilities along with more frequent public disclosures have both resulted in larger numbers of detected zero-day exploitation compared to what was observed prior to 2021.
Figure 1: Zero-days by year
Higher than any previous year, 44% (33 vulnerabilities) of tracked 2024 zero-days affected enterprise technologies, continuing the growth and trends we observed last year. The remaining 42 zero-day vulnerabilities targeted end-user technologies.
Enterprise Exploitation Expands in 2024 as Browser and Mobile Exploitation Drops
End-User Platforms and Products
In 2024, 56% (42) of the tracked zero-days targeted end-user platforms and products, which we define as devices and software that individuals use in their day-to-day life, although we acknowledge that enterprises also often use these. All of the vulnerabilities in this category were used to exploit browsers, mobile devices, and desktop operating systems.
Zero-day exploitation of browsers and mobile devices fell drastically, decreasing by about a third for browsers and by about half for mobile devices compared to what we observed last year (17 to 11 for browsers, and 17 to 9 for mobile).
Chrome was the primary focus of browser zero-day exploitation in 2024, likely reflecting the browser’s popularity among billions of users.
Exploit chains made up of multiple zero-day vulnerabilities continue to be almost exclusively (~90%) used to target mobile devices.
Third-party components continue to be exploited in Android devices, a trend we discussed in last year’s analysis. In 2023, five of the seven zero-days exploited in Android devices were flaws in third-party components. In 2024, three of the seven zero-days exploited in Android were found in third-party components. Third-party components are likely perceived as lucrative targets for exploit development since they can enable attackers to compromise many different makes and models of devices across the Android ecosystem.
2024 saw an increase in the total number of zero-day vulnerabilities affecting desktop operating systems (OSs) (22 in 2024 vs. 17 in 2023), indicating that OSs continue to be a strikingly large target. The proportional increase was even greater, with OS vulnerabilities making up just 17% of total zero-day exploitation in 2023, compared to nearly 30% in 2024.
Microsoft Windows exploitation continued to increase, climbing from 13 zero-days in 2022, to 16 in 2023, to 22 in 2024. As long as Windows remains a popular choice both in homes and professional settings, we expect that it will remain a popular target for both zero-day and n-day (i.e. a vulnerability exploited after its patch has been released) exploitation by threat actors.
Figure 2: Zero-days in end-user products in 2023 and 2024
Enterprise Technologies
In 2024, GTIG identified the exploitation of 33 zero-days in enterprise software and appliances. We consider enterprise products to include those mainly utilized by businesses or in a business environment. While the absolute number is slightly lower than what we saw in 2023 (36 vulnerabilities), the proportion of enterprise-focused vulnerabilities has risen from 37% in 2023 to 44% in 2024. Twenty of the 33 enterprise-focused zero-days targeted security and network products, a slight increase from the 18 observed in this category for 2023, but a 9% bump when compared proportionally to total zero-days for the year.
The variety of targeted enterprise products continues to expand across security and networking products, with notable targets in 2024 including Ivanti Cloud Services Appliance, Palo Alto Networks PAN-OS, Cisco Adaptive Security Appliance, and Ivanti Connect Secure VPN. Security and network tools and devices are designed to connect widespread systems and devices with high permissions required to manage the products and their services, making them highly valuable targets for threat actors seeking efficient access into enterprise networks. Endpoint detection and response (EDR) tools are not usually equipped to work on these products, limiting available capabilities to monitor them. Additionally, exploit chains are not generally required to exploit these systems, giving extensive power to individual vulnerabilities that can single-handedly achieve remote code execution or privilege escalation.
Over the last several years, we have also tracked a general increase of enterprise vendors targeted. In 2024, we identified 18 unique enterprise vendors targeted by zero-days. While this number is slightly less than the 22 observed in 2023, it remains higher than all prior years’ counts. It is also a stark increase in the proportion of enterprise vendors for the year, given that the 18 unique enterprise vendors were out of 20 total vendors for 2024. 2024’s count is still a significant proportional increase compared to the 22 unique enterprise vendors targeted out of a total of 23 in 2023.
Figure 3: Number of unique enterprise vendors targeted
The proportion of zero-days exploited in enterprise devices in 2024 reinforces a trend that suggests that attackers are intentionally targeting products that can provide expansive access and fewer opportunities for detection.
Exploitation by Vendor
The vendors affected by multiple 2024 zero-day vulnerabilities generally fell into two categories: big tech (Microsoft, Google, and Apple) and vendors who supply security and network-focused products. As expected, big tech took the top two spots, with Microsoft at 26 and Google at 11. Apple slid to the fourth most frequently exploited vendor this year, with detected exploitation of only five zero-days. Ivanti was third most frequently targeted with seven zero-days, reflecting increased threat actor focus on networking and security products. Ivanti’s placement in the top three reflects a new and crucial change, where a security vendor was targeted more frequently than a popular end-user technology-focused vendor. We discuss in a following section how PRC-backed exploitation has focused heavily on security and network technologies, one of the contributing factors to the rise in Ivanti targeting.
We note that exploitation is not necessarily reflective of a vendor’s security posture or software development processes, as targeted vendors and products depend on threat actor objectives and capabilities.
Types of Exploited Vulnerabilities
Threat actors continued to utilize zero-day vulnerabilities primarily for the purposes of gaining remote code execution and elevating privileges. In 2024, these consequences accounted for over half (42) of total tracked zero-day exploitation.
Three vulnerability types were most frequently exploited. Use-after-free vulnerabilities have maintained their prevalence over many years, with eight in 2024, and are found in a variety of targets including hardware, low-level software, operating systems, and browsers. Command injection (also at eight, including OS command injection) and cross-site scripting (XSS) (six) vulnerabilities were also frequently exploited in 2024. Both code injection and command injection vulnerabilities were observed almost entirely targeting networking and security software and appliances, displaying the intent to use these vulnerabilities in order to gain control over larger systems and networks. The XSS vulnerabilities were used to target a variety of products, including mail servers, enterprise software, browsers, and an OS.
All three of these vulnerability types stem from software development errors, and preventing them requires meeting higher programming standards. Safe and preventative coding practices, including, but not limited to, code reviews, updating legacy codebases, and utilizing up-to-date libraries, can appear to hinder production timelines. However, the existence of patches proves that these security exposures could have been prevented in the first place with proper intention and effort, which ultimately reduces the overall effort needed to properly maintain a product or codebase.
Who Is Driving Exploitation
Figure 4: 2024 attributed zero-day exploitation
Due to the stealthy access zero-day vulnerabilities can provide into victim systems and networks, they continue to be a highly sought after capability for threat actors. GTIG tracked a variety of threat actors exploiting zero-days in a variety of products in 2024, which is consistent with our previous observations that zero-day exploitation has diversified in both platforms targeted and actors exploiting them. We attributed the exploitation of 34 zero-day vulnerabilities in 2024, just under half of the total 75 we identified in 2024. While the proportion of exploitation that we could attribute to a threat actor dipped slightly from our analysis of zero-days in 2023, it is still significantly higher than the ~30% we attributed in 2022. While this reinforces our previous observation that platforms’ investment in exploit mitigations are making zero-days harder to exploit, the security community is also slowly improving our ability to identify that activity and attribute it to threat actors.
Consistent with trends observed in previous years, we attributed the highest volume of zero-day exploitation to traditional espionage actors, nearly 53% (18 vulnerabilities) of total attributed exploitation. Of these 18, we attributed the exploitation of 10 zero-days to likely nation-state-sponsored threat groups and eight to CSVs.
CSVs Continue to Increase Access to Zero-Day Exploitation
While we still expect government-backed actors to continue their historic role as major players in zero-day exploitation, CSVs now contribute a significant volume of zero-day exploitation. Although the total count and proportion of zero-days attributed to CSVs declined from 2023 to 2024, likely in part due to their increased emphasis on operational security practices, the 2024 count is still substantially higher than the count from 2022 and years prior. Their role further demonstrates the expansion of the landscape and the increased access to zero-day exploitation that these vendors now provide other actors.
In 2024, we observed multiple exploitation chains using zero-days developed by forensic vendors that required physical access to a device (CVE-2024-53104, CVE-2024-32896, CVE-2024-29745, CVE-2024-29748). These bugs allow attackers to unlock the targeted mobile device with custom malicious USB devices. For instance, GTIG and Amnesty International’s Security Lab discovered and reported on CVE-2024-53104 in exploit chains developed by forensic company Cellebrite and used against the Android phone of a Serbian student and activist by Serbian security services. GTIG worked with Android to patch these vulnerabilities in the February 2025 Android security bulletin.
PRC-Backed Exploitation Remains Persistent
PRC threat groups remained the most consistent government-backed espionage developer and user of zero-days in 2024. We attributed nearly 30% (five vulnerabilities) of traditional espionage zero-day exploitation to PRC groups, including the exploitation of zero-day vulnerabilities in Ivanti appliances by UNC5221 (CVE-2023-46805 and CVE-2024-21887), which GTIG reported on extensively. During this campaign, UNC5221 chained multiple zero-day vulnerabilities together, highlighting these actors’ willingness to expend resources to achieve their apparent objectives. The exploitation of five vulnerabilities that we attributed to PRC groups exclusively focused on security and networking technologies. This continues a trend that we have observed from PRC groups for several years across all their operations, not just in zero-day exploitation.
North Korean Actors Mix Financially Motivated and Espionage Zero-Day Exploitation
For the first time since we began tracking zero-day exploitation in 2012, in 2024, North Korean state actors tied for the highest total number of attributed zero-days exploited (five vulnerabilities) with PRC-backed groups. North Korean groups are notorious for their overlaps in targeting scope; tactics, techniques, and procedures (TTPs); and tooling that demonstrate how various intrusion sets support the operations of other activity clusters and mix traditional espionage operations with attempts to fund the regime. This focus on zero-day exploitation in 2024 marks a significant increase in these actors’ focus on this capability. North Korean threat actors exploited two zero-day vulnerabilities in Chrome as well as three vulnerabilities in Windows products.
In October 2024, it was publicly reported that APT37 exploited a zero-day vulnerability in Microsoft products. The threat actors reportedly compromised an advertiser to serve malicious advertisements to South Korean users that would trigger zero-click execution of CVE-2024-38178 to deliver malware. Although we have not yet corroborated the group’s exploitation of CVE-2024-38178 as reported, we have observed APT37 previously exploit Internet Explorer zero-days to enable malware distribution.
North Korean threat actors also reportedly exploited a zero-day vulnerability in the Windows AppLocker driver (CVE-2024-21338) in order to gain kernel-level access and turn off security tools. This technique abuses legitimate and trusted but vulnerable already-installed drivers to bypass kernel-level protections and provides threat actors an effective means to bypass and mitigate EDR systems.
Non-State Exploitation
In 2024, we linked almost 15% (five vulnerabilities) of attributed zero-days to non-state financially motivated groups, including a suspected FIN11 cluster’s exploitation of a zero-day vulnerability in multiple Cleo managed file transfer products (CVE-2024-55956) to conduct data theft extortion. This marks the third year of the last four (2021, 2023, and 2024) in which FIN11 or an associated cluster has exploited a zero-day vulnerability in its operations, almost exclusively in file transfer products. Despite the otherwise varied cast of financially motivated threat actors exploiting zero-days, FIN11 has consistently dedicated the resources and demonstrated the expertise to identify, or acquire, and exploit these vulnerabilities from multiple different vendors.
We attributed an additional two zero-days exploited in 2024 to non-state groups with mixed motivations, which conduct financially motivated activity in some operations and espionage in others. Both vulnerabilities (CVE-2024-9680 and CVE-2024-49039, detailed in the next section) were exploited as zero-days by CIGAR (also tracked as UNC4895 and publicly reported as RomCom), a group that has conducted financially motivated operations alongside espionage likely on behalf of the Russian government, based in part on its highly specific targeting of Ukrainian and European government and defense organizations.
A Zero-Day Spotlight on CVE-2024-44308, CVE-2024-44309, and CVE-2024-49039: A look into zero-days discovered by GTIG researchers
Spotlight #1: Stealing Cookies with WebKit
On Nov. 12, 2024, GTIG detected a potentially malicious piece of JavaScript code injected into https://online.da.mfa.gov[.]ua/wp-content/plugins/contact-form-7/includes/js/index.js?ver=5.4. The JavaScript was loaded directly from the main page of the website of the Diplomatic Academy of Ukraine, online.da.mfa.gov.ua. Upon further analysis, we discovered that the JavaScript code was a WebKit exploit chain specifically targeting macOS users on Intel hardware.
The exploit consisted of a WebKit remote code execution (RCE) vulnerability (CVE-2024-44308) stemming from a logic error in Just-In-Time (JIT) compilation, followed by a data isolation bypass (CVE-2024-44309). The RCE exploit employed simple, long-established JavaScriptCore exploitation techniques that are publicly documented, namely:
Setting up addrof/fakeobj primitives using the vulnerability
Leaking StructureID
Building a fake TypedArray to gain arbitrary read/write
JIT compiling a function to get an RWX memory mapping where shellcode can be written and executed
The shellcode traversed a set of pointers and vtables to find and call WebCookieJar::cookieRequestHeaderFieldValue with an empty firstPartyForCookies parameter, allowing the threat actor to access cookies of any arbitrary website passed as the third parameter to cookieRequestHeaderFieldValue.
The end goal of the exploit was to collect users’ cookies in order to access login.microsoftonline.com. The cookie values were appended directly to a GET request sent to https://online.da.mfa.gov.ua/gotcookie?.
This is not the first time we have seen threat actors stay within the browser to collect users’ credentials. In March 2021, a targeted campaign used a zero-day against WebKit on iOS to turn off Same-Origin-Policy protections in order to collect authentication cookies from several popular websites. In August 2024, a watering hole on various Mongolian websites used Chrome and Safari n-day exploits to exfiltrate users’ credentials.
While it is unclear why this abbreviated approach was taken as opposed to deploying full-chain exploits, we identified several possibilities, including:
The threat actor was not able to obtain all the pieces needed for a full-chain exploit. In this case, the exploit likely targeted only the MacIntel platform because the actor lacked a Pointer Authentication Code (PAC) bypass to target users on Apple Silicon devices; a PAC bypass is required to make the arbitrary calls needed for their data isolation bypass.
The price of a full-chain exploit was too high, especially for a chain meant to be used at relatively large scale. This applies in particular to watering hole attacks, where the chances of detection are high and the zero-day vulnerability and exploit might consequently be burned quickly.
Stealing credentials is sufficient for their operations and the information they want to collect.
This trend is also observed beyond the browser environment: threat actors target third-party mobile applications (e.g., messaging applications) and steal the information accessible only within the targeted application.
Spotlight #2: CIGAR Local Privilege Escalations
CIGAR’s Browser Exploit Chain
In early October 2024, GTIG independently discovered a fully weaponized exploit chain for the Firefox and Tor browsers employed by CIGAR. CIGAR is a threat group with dual financial and espionage motivations, assessed to be running both types of campaigns in parallel, often simultaneously. In 2023, we observed CIGAR utilizing an exploit chain in Microsoft Office (CVE-2023-36884) as part of an espionage campaign targeting attendees of the Ukrainian World Congress and the NATO Summit; in the October 2024 campaign, however, the use of the Firefox exploit appears more in line with the group’s financial motives.
Our analysis, which broadly matched ESET’s findings, indicated that the browser RCE used is a use-after-free vulnerability in the Animation timeline. The vulnerability, known as CVE-2024-9680, was an n-day at the time of discovery by GTIG.
Upon further analysis, we identified that the embedded sandbox escape, which was also used as a local privilege escalation to SYSTEM, exploited a previously unknown vulnerability. We reported this vulnerability to Mozilla and Microsoft, and it was later assigned CVE-2024-49039.
Double-Down on Privilege Escalation: from Low Integrity to SYSTEM
Firefox uses security sandboxing to introduce an additional security boundary and mitigate the effects of malicious code achieving code execution in content processes. Therefore, to achieve code execution on the host, an additional sandbox escape is required.
The in-the-wild CVE-2024-49039 exploit, which contained the PDB string C:etalonPocLowIL@OutputPocLowIL.pdb, achieved both a sandbox escape and privilege escalation. It abused two distinct issues to escalate privileges from Low Integrity Level (IL) to SYSTEM: the first allowed it to reach the WPTaskScheduler RPC interface (UUID: {33d84484-3626-47ee-8c6f-e7e98b113be1}), which is normally not accessible from a sandboxed Firefox content process, via the “less-secure endpoint” ubpmtaskhostchannel created in ubpm.dll; the second stems from insufficient Access Control List (ACL) checks in the WPTaskScheduler.dll RPC server, which allowed an unprivileged user to create and execute scheduled tasks as SYSTEM.
1. Securing the endpoint: In WPTaskScheduler::TsiRegisterRPCInterface, the third argument to RpcServerUseProtseq is a non-NULL security descriptor (SD).
This SD should prevent the Firefox “Content” process from accessing the WPTaskScheduler RPC endpoint. However, a lesser-known “feature” of RPC is that endpoints are multiplexed: if there is a less secure endpoint in the same process, it is possible to reach an interface indirectly through that other endpoint (with its more permissive ACL). This is what the exploit does: instead of accessing RPC through the ALPC port that WPTaskScheduler.dll sets up, it resolves the interface indirectly via ubpmtaskhostchannel. ubpm.dll uses a NULL security descriptor when initializing this endpoint, relying instead on the UbpmpTaskHostChannelInterfaceSecurityCb callback for ACL checks:
Figure 5: NULL security descriptor used when creating “ubpmtaskhostchannel” RPC endpoint in ubpm.dll::UbpmEnableTaskHostChannelRpcInterface, exposing a less secure endpoint for WPTaskScheduler interface
2. Securing the interface: In the same WPTaskScheduler::TsiRegisterRPCInterface function, an overly permissive security descriptor was used as an argument to RpcServerRegisterIf3. As we can see in the listing below, the CVE-2024-49039 patch addressed this by introducing a more locked-down SD.
Figure 6: Patched WPTaskScheduler.dll introduces a more restrictive security descriptor when registering an RPC interface
3. Ad-hoc security: Implemented in WPTaskScheduler.dll::CallerHasAccess, which is called prior to enabling or executing any scheduled task. The function checks whether the calling user is attempting to execute a task they created or one they should otherwise be able to access, but it performs no additional checks to prevent calls originating from an unprivileged user.
CVE-2024-49039 addresses the issue by applying a more restrictive ACL to the interface; however, the less secure endpoint described in “1. Securing the endpoint” remains, and a process running with a restricted token is still able to access that endpoint.
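As a rough illustration of the distinction drawn above, the following minimal C sketch (not WPTaskScheduler.dll’s actual code) contrasts a security descriptor applied to a single endpoint, which endpoint multiplexing can sidestep, with a security descriptor applied to the interface itself via RpcServerRegisterIf3, the kind of lock-down the CVE-2024-49039 patch introduced. The SDDL string, endpoint name, and interface handle are illustrative assumptions.

```c
/*
 * Sketch of endpoint-level vs. interface-level RPC security descriptors.
 * Assumptions: the SDDL string, endpoint name, and g_ifSpec placeholder are
 * illustrative; a real server would use its MIDL-generated interface handle.
 */
#include <windows.h>
#include <rpc.h>
#include <sddl.h>

#pragma comment(lib, "rpcrt4.lib")
#pragma comment(lib, "advapi32.lib")

static RPC_IF_HANDLE g_ifSpec; /* placeholder for the MIDL-generated handle */

int main(void)
{
    PSECURITY_DESCRIPTOR sd = NULL;

    /* Restrict access to SYSTEM and Administrators (illustrative ACL). */
    ConvertStringSecurityDescriptorToSecurityDescriptorW(
        L"D:(A;;GX;;;SY)(A;;GX;;;BA)", SDDL_REVISION_1, &sd, NULL);

    /* Endpoint-level SD: protects only this ALPC endpoint. If another endpoint
       in the same process was created with a NULL SD (as ubpm.dll does for
       ubpmtaskhostchannel), the interface remains reachable through it. */
    RpcServerUseProtseqEpW((RPC_WSTR)L"ncalrpc",
                           RPC_C_PROTSEQ_MAX_REQS_DEFAULT,
                           (RPC_WSTR)L"ExampleTaskSchedEndpoint", sd);

    /* Interface-level SD: enforced by the RPC runtime regardless of which
       endpoint the call arrived on, so a permissive sibling endpoint no longer
       lets an unprivileged caller through. */
    RpcServerRegisterIf3(g_ifSpec, NULL, NULL, RPC_IF_ALLOW_LOCAL_ONLY,
                         RPC_C_LISTEN_MAX_CALLS_DEFAULT, 0, NULL, sd);

    RpcServerListen(1, RPC_C_LISTEN_MAX_CALLS_DEFAULT, FALSE);
    LocalFree(sd);
    return 0;
}
```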
Unidentified Actor Using the Same Exploits
In addition to CIGAR, we discovered another, likely financially motivated, group using the exact same exploits (albeit with a different payload) while CVE-2024-49039 was still a zero-day. This actor used a watering hole on a legitimate but compromised cryptocurrency news website that redirected visitors to an attacker-controlled domain hosting the same CVE-2024-9680 and CVE-2024-49039 exploit chain.
Outlook and Implications
Defending against zero-day exploitation continues to be a race of strategy and prioritization. Not only are zero-day vulnerabilities becoming easier to procure, but attackers’ growing interest in new types of technology may strain less experienced vendors. While organizations have historically had to prioritize patching based on their own threats and attack surfaces, broader trends, together with lessons learned from major vendors’ mitigation efforts, can inform a more targeted approach.
We expect zero-day vulnerabilities to retain their allure for threat actors as opportunities for stealth, persistence, and detection evasion. While we observed improved vendor security postures and declining exploitation of certain historically popular products, particularly mobile devices and browsers, we anticipate that zero-day exploitation overall will continue to rise steadily. Given the ubiquity of operating systems and browsers in daily use, big tech vendors are consistently high-interest targets, and we expect this to continue. Phones and browsers will almost certainly remain popular targets, although enterprise software and appliances will likely see a continued rise in zero-day exploitation. Big tech companies have been victims of zero-day exploitation before and will continue to be targeted. That experience, along with the resources required to build more secure products and to detect and disclose vulnerabilities responsibly, allows larger companies to approach zero-days as a more manageable problem.
For newly targeted vendors and those whose products fall within the growing set of targeted enterprise technologies, security practices and procedures should evolve to consider how successful exploitation of these products could bypass typical protection mechanisms. Preventing successful exploitation will rely heavily on these vendors’ ability to enforce proper and safe coding practices. We continue to see the same types of vulnerabilities exploited over time, indicating patterns in which weaknesses attackers seek out and find most beneficial to exploit. The continued existence and exploitation of similar issues makes zero-day development easier; threat actors know what to look for and where exploitable weaknesses are most pervasive.
Vendors should account for this shift in threat activity and address gaps in configurations and architectural decisions that could allow exploitation of a single product to cause irreparable damage. This is especially true for highly valuable tools with administrator access and/or widespread reach across systems and networks. Best practices continue to represent a minimum threshold of what security standards an architecture should demonstrate, including zero-trust fundamentals such as least-privilege access and network segmentation. Continuous monitoring should occur where possible so that unauthorized access can be restricted and terminated swiftly, and vendors will need to account for EDR capabilities in technologies that currently lack them (e.g., many security and networking products). GTIG recommends acute threat surface awareness and corresponding due diligence in order to defend against today’s zero-day threat landscape. Zero-day exploitation will ultimately be dictated by vendors’ decisions and their ability to counter threat actors’ objectives and pursuits.
At Google, we believe in empowering people and founders to use AI to tackle humanity’s biggest challenges. That’s why we’re supporting the next generation of AI leaders through our Google for Startups Accelerator: AI First programs. We announced the program in January, and today we’re proud to welcome into our accelerator community 16 UK-based startups that are using AI to drive real-world impact.
Out of hundreds of applicants, we’ve carefully selected these 16 high-potential startups to receive 1:1 guidance and support from Google, each demonstrating a unique vision for leveraging AI to address critical challenges and opportunities. This diverse cohort showcases how AI is being applied across sectors — from early cancer detection and climate resilience, to smarter supply chains and creative content generation. By joining the Google for Startups Accelerator: AI First UK program, these startups gain access to technical expertise, mentorship, and a global network to help them scale responsibly and sustainably.
“Google for Startups Accelerator: AI First provides an exceptional opportunity for us to enhance our AI expertise, accelerate the development of our data-driven products, and engage meaningfully with potential investors.” – Denise Williams, Managing Director, Dysplasia Diagnostics.
Read more about the selected startups and the founders shaping the future of AI:
Bindbridge (London) is a generative AI platform that discovers and designs molecular glues for targeted protein degradation in plants.
Building Atlas (Edinburgh) uses data and AI to support the decarbonisation of non-domestic buildings by modelling the best retrofit plans for any portfolio size.
Comply Stream (London) helps to streamline financial crime compliance operations for businesses and consumers.
Datawhisper (London) provides safe and compliant AI Agentic solutions tailored for the fintech and payments industry.
Deducta (London) is a data intelligence platform that supports global procurement teams with supply chain insights and efficiencies.
Dysplasia Diagnostics (London) develops AI-based, non-invasive, and affordable solutions for early cancer detection and treatment monitoring.
Flow.bio (London) is an end-to-end cloud platform for running large sequencing pipelines and auto-structuring bio-data for machine learning workflows.
Humble (London) enables non-technical users to build and share AI-powered apps and workflows, allowing them to automate without writing code.
Immersive Fox (London) is an AI studio for creating presenter-led marketing and communication videos directly from text.
Kestrix (London) uses thermal drones and advanced software to map and quantify heat loss from buildings and generate retrofit plans.
Measmerize (Birmingham) provides sizing advice for fashion e-commerce retailers, enabling brands to increase sales and decrease return rates.
PSi (London) uses AI to host large-scale online deliberations, enabling local governments to harness collective intelligence for effective policymaking.
Shareback (London) is an AI platform that allows employees to securely interact with GPT-based assistants trained on company, department, or project-specific data.
Sikoia (London) streamlines customer verification for financial services by consolidating data, automating tasks, and delivering actionable insights.
SmallSpark (Cardiff) enables low power AI at the edge, simplifying the deployment, management, and optimization of ML models on embedded devices.
Source.dev (London) simplifies the software development lifecycle for smart devices, to help accelerate innovation and streamline software updates.
“Through the program, we aim to leverage Google’s expertise and cutting-edge AI infrastructure to supercharge our growth on all fronts.” – Lauren Ladd, Founder, Shareback
These 16 startups reflect the diversity and depth of AI innovation happening across the UK. Each company will receive technical mentorship, strategic guidance, and valuable connections from Google, and will continue to receive hands-on support via our alumni network after the program wraps in July.
Congratulations to this latest cohort! To learn more about applying for an upcoming Google for Startups program, visit the program page here.