Credentials are an essential part of modern software development and deployment, granting bearers privileged access to systems, applications, and data. However, credential-related vulnerabilities remain the predominant entry point exploited by threat actors in the cloud.
Stolen credentials “are now the second-highest initial infection vector, making up 16% of our investigations,” said Jurgen Kutscher, vice-president, Mandiant Consulting, in his summary of our M-Trends 2025 report.
Ensuring the safe management of these credentials is a vital task. Developers may accidentally include credentials in artifacts like source code, built software packages, or Docker images. If these credentials fall into the wrong hands, they can be used by malicious actors for data exfiltration, cryptojacking, ransomware attacks, and general resource abuse.
Safeguarding credentials is particularly acute for open-source developers because when a credential is accidentally included in an artifact that is pushed to a public repository (like GitHub, PyPI or DockerHub), that credential becomes available to anyone on the Internet.
To address this critical issue, we’ve developed a powerful tool that scans open-source package and image files by default for leaked Google Cloud credentials, helping to protect Google Cloud customers who publish open-source artifacts. Created by Google’s deps.dev team in collaboration with Google Cloud’s credential protection team, the tool has already delivered significant results in identifying and reporting exposed credentials such as API keys, service account keys, and OAuth client secrets in historical artifacts.
While this effort has initially focused on Google Cloud credentials, we plan to expand scanning to include third-party credentials later this year.
Beyond retrospective reporting, the tool also scans newly published open-source artifacts for leaked credentials. This advance can help drive immediate remediation of active exposures, significantly reducing the risk of developer compromise.
The tool can also cultivate a culture of improved security by effectively shifting security to earlier in the development lifecycle when problems are easier to solve. By shifting left and encouraging earlier security awareness, the tool can help foster improved credential management practices in the open-source community, ultimately strengthening the resilience and security of the entire software supply chain.
Understanding the dangers of exposed cloud credentials
Exposed credentials present a serious security risk to cloud users because they allow an individual to gain access to a user’s cloud environment, including their resources, applications and managed user data. A malicious actor can exploit this access for nefarious purposes such as data theft, cryptojacking, ransomware attacks, and general resource abuse which can result in severe financial, reputational, and operational damage.
Once a credential is obtained by malicious actors, it should be considered permanently compromised, because stolen credentials are easily copied and shared.
Open source developers, while contributing to the collaborative ecosystem, face the risk of inadvertently exposing sensitive credentials. While source code repository hosts like GitHub and GitLab already scan public source code (and, in some cases, package repositories) for exposed credentials, the challenge extends significantly beyond source code.
Built packages and Docker images often include configuration, compiled binaries, and build scripts, all potential sources of leaked credentials. Publishing these artifacts on open-source repositories like Maven Central, PyPI, or DockerHub can expose leaked credentials to exploitation by any individual on the internet. The ease and speed with which open-source artifacts are shared and distributed magnifies the potential damage, making strong credential management and proactive leak detection and remediation critical.
How to scan open source code for credentials
The deps.dev team provides services to help developers better understand the structure, construction, and security of open-source software. The team maintains and analyzes a continuously updated corpus of over 5 billion unique files across hundreds of millions of open-source software artifacts, including source code repositories, software packages, and Docker containers.
The pipeline that supports this corpus automatically ingests hundreds of millions of public artifacts from a variety of open-source repositories. These include package managers (such as npm, Maven Central, and PyPI), source code repository hosts (such as GitHub and GitLab), and Docker image registries.
Once artifacts are ingested, they undergo a comprehensive decomposition process, which extracts all constituent parts: every file at every commit in a Git repository, every unarchived or unzipped file in a software package, and every file in every individual layer of a Docker image — not just the files in the final image filesystem. These files are then analyzed, which includes scanning them for exposed Google Cloud credentials.
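The production detection rules aren’t reproduced here, but the basic idea can be sketched in a few lines of shell. In this minimal sketch, the unpacked artifact directory, the well-known AIza API key prefix, and the service-account JSON marker are illustrative assumptions, not the actual rules the pipeline applies:

```bash
#!/usr/bin/env bash
# Minimal illustration only; not the deps.dev pipeline.
# Scans an unpacked artifact directory for two well-known Google Cloud
# credential shapes: API keys (which begin with the "AIza" prefix) and
# service account key files (JSON containing "type": "service_account").
ARTIFACT_DIR="${1:?usage: scan.sh <unpacked-artifact-dir>}"

# Candidate API keys: the documented prefix followed by 35 URL-safe characters.
grep -rEoh 'AIza[0-9A-Za-z_-]{35}' "$ARTIFACT_DIR" | sort -u

# Files that look like service account keys.
grep -rlE '"type"[[:space:]]*:[[:space:]]*"service_account"' "$ARTIFACT_DIR"
```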
When a suspected Google Cloud credential is detected, the credential reporting backend immediately alerts the credential protection program. Since its creation, we’ve observed this system detect and remediate leaked credentials within minutes of their publication, matching or exceeding the speed with which malicious actors have been demonstrated to exploit them.
Credential containment and recovery
We’ve set up a web endpoint so vetted Google Cloud users and security researchers can submit suspected exposed credentials for review. Once a submitter’s identity is validated, the Google Cloud credential protection system proceeds to confirm the validity of the reported credentials. If the credential is confirmed to be active, Google Cloud provides immediate customer notification through multiple channels, including email, telemetry logs, and in-product alerts.
Google Cloud may take automated remediation steps to mitigate potential damage in accordance with customer configurable policy, such as disabling affected service account keys.
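As an illustration of the kind of remediation involved, the following gcloud commands list and disable a service account key. The service account email and key ID are placeholders, and whether this step runs automatically depends on the customer’s configured policy:

```bash
# List the keys attached to the affected service account (placeholder email).
gcloud iam service-accounts keys list \
    --iam-account=my-sa@my-project.iam.gserviceaccount.com

# Disable the leaked key (placeholder KEY_ID). The key can be re-enabled after
# review, or deleted outright with `keys delete`.
gcloud iam service-accounts keys disable KEY_ID \
    --iam-account=my-sa@my-project.iam.gserviceaccount.com
```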
What’s next?
We are actively working to further secure open source communities and protect Google Cloud customers alike by taking a proactive approach to credential exposure. Our efforts in this area include several key initiatives:
Broadening the scope of credential scanning: We’re expanding the range of credential types the tool can scan for, which can help protect more organizations and developers.
Increasing open-source coverage: We’re scanning more open-source platforms and repositories to discover exposed credentials, which can help mitigate risks across more of the ecosystem.
Empowering open-source communities with preventative measures: We’re developing and offering tools that allow open-source communities to integrate credential exposure checks directly into their publish workflow, which can help prevent credential leaks before they happen.
By focusing on both detection and prevention, we aim to foster a more secure and resilient open source environment. To report exposed Google Cloud credentials, please contact gcp-credentials-reports@google.com. If you are a credential provider and would like to talk about partnering with us to scan for your credentials, please contact depsdev@google.com.
As foundation model training infrastructure scales to tens of thousands of accelerators, efficient utilization of those high-value resources becomes paramount. In particular, as the cluster gets larger, hardware failures become more frequent (on the order of every few hours) and recovery from previously saved checkpoints becomes slower (up to 30 minutes), significantly slowing down training progress. A checkpoint represents the saved state of a model’s training progress at any given time and consists of a set of intermediary model weights and other parameters.
We recently introduced multi-tier checkpointing in AI Hypercomputer, our integrated supercomputing system that incorporates lessons from more than a decade of Google’s expertise in AI. This solution increases the ML Goodput of large training jobs (e.g., by 6.59% in a 35K-chip workload on TPU v5p) by utilizing multiple tiers of storage, including in-cluster memory (RAM) and replication, and Google Cloud Storage, thereby minimizing lost progress during a training job and improving mean-time-to-recovery (MTTR). This solution is compatible with JAX, using MaxText as a reference architecture, as well as with NeMo on PyTorch/GPUs.
Multi-tier checkpointing architecture: checkpoints are stored in (1) each node’s RAM, (2) in a different slice or superblock, and (3) in Cloud Storage.
What this means is that you can take checkpoints at the optimal frequency (checkpoint saves scale sub-linearly, to under 5 minutes) for the biggest models across a very large node cluster, and restore in under a minute across a cluster with thousands of nodes.
Increases in Goodput can translate directly into decreases in infrastructure costs. For example, consider a training job that runs on accelerator chips for a month; even with a somewhat smaller workload, the cost savings from optimal checkpointing can be significant. If you have a week-long training job spanning 1K VMs that cost $88/hour (a3-highgpu-8g), a 6.5% increase in Goodput on that training task could result in almost $1M in infrastructure savings.
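A quick back-of-the-envelope check of that figure, using the numbers above as assumptions:

```bash
# Total weekly spend: 1,000 VMs x $88/hour x 168 hours in a week.
echo $(( 1000 * 88 * 168 ))              # 14784000  -> ~$14.8M for the week
# A 6.5% Goodput improvement applied to that spend (65/1000 avoids floats):
echo $(( 1000 * 88 * 168 * 65 / 1000 ))  # 960960    -> almost $1M saved
```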
More failures require more checkpointing
Probabilistically, the mean time between failure (MTBF) of a training job decreases — failures happen more frequently — as the size of the cluster increases. Therefore, it is important that foundation model producers take checkpoints more frequently so they don’t lose too much progress on their training job. In the past, Google Kubernetes Engine (GKE) customers could only write a checkpoint every 30 minutes (saving it to Cloud Storage) and had to wait up to 30 minutes to read the last saved checkpoint and distribute it to all the nodes in the cluster.
Multi-tier checkpointing allows for much faster checkpoint writes and more frequent saves by writing data asynchronously to memory (RAM) on the node and then periodically replicating this data inside the cluster, and backing that data up to Cloud Storage. In the event of a failure, a job’s progress can be recovered quickly by using data from a nearby neighbor’s in-memory checkpoint. If the checkpoint data isn’t available in a nearby node’s RAM, checkpoints are downloaded from Cloud Storage bucket backups. With this solution, checkpoint write latency does not increase with the number of nodes in a cluster — it remains constant. Reads are also constant and scale independently, enabling faster checkpoint loading and reducing MTTR.
Architectural details
Conceptually, the multi-tier checkpointing solution provides a single “magic” local filesystem volume for ML training jobs to use for saving checkpoints and from which to restore. It’s “magic” because while it provides ramdisk-level read/write speeds, it also provides data durability associated with Cloud Storage.
When enabled, the local volume (node storage) is the only storage tier visible to ML jobs. The checkpoints written there are automatically replicated in-cluster to one, two, or more peer nodes and are regularly backed up to Cloud Storage.
When the job restarts, the checkpoint data specific for the new portion of the training job running on the node (i.e., NodeRank) automatically appears on the local volume for ML jobs to use. Behind the scenes, the necessary data may be fetched from another node in the cluster, or from Cloud Storage. Finding the most recent fully written checkpoint (no matter where it is) also happens transparently for ML jobs.
The component responsible for data movement across tiers is called the Replicator; it runs on every node as part of a CSI driver that provides the local volume mount.
Delving deeper, the Replicator performs the following critical functions:
Centralized intelligence: It analyzes Cloud Storage backups and the collective in-cluster data to determine the most recent, complete checkpoint with which to restore a job upon restart. Furthermore, it detects successful checkpoint saves by all nodes, signaling when older data can be safely garbage-collected, and strategically decides which checkpoints to back up to Cloud Storage.
Smart peer selection: Because it’s aware of the underlying network topology used by both TPUs and GPUs, the Replicator employs smart criteria to select replication peers for each node. This involves prioritizing a “near” peer with high bandwidth and low latency. This “near” peer may have a potentially higher risk of correlated failure (e.g., within the same TPU Slice or GPU Superblock) and as such, it also selects a “far” peer — one with slightly increased networking overhead but enhanced resilience to independent failures (e.g., that resides in a different GPU Superblock). In data parallelism scenarios, preference is given to any peers that possess identical data.
Automatic data deduplication: When data parallelism is employed, multiple nodes run identical training pipelines, resulting in the saving of identical checkpoints. The Replicator’s peer selection ensures these nodes are paired, eliminating the need for actual data replication. Instead, each node verifies the data integrity of its peers; no additional bandwidth is consumed, replication is instantaneous, and local storage usage is significantly reduced. If peers are misconfigured, standard checkpoint copying is maintained.
Huge-model mode with data parallelism assumption: Beyond optimization, this mode caters to the largest models, where local node storage is insufficient to house both a node’s own checkpoint as well as a peer’s data. In such cases, the ML job configures the Replicator to assume data parallelism, drastically reducing local storage requirements. This extends to scenarios where dedicated nodes handle Cloud Storage backups rather than the nodes storing the most recent checkpoints themselves.
Optimized Cloud Storage utilization: Leveraging data deduplication, all unique data is stored in Cloud Storage only once, optimizing storage space, bandwidth consumption, and associated costs.
Automated garbage collection: The Replicator continuously monitors checkpoint saves across all nodes. Once the latest checkpoint is confirmed to have been successfully saved everywhere, it automatically initiates the deletion of older checkpoints, while ensuring that checkpoints still being backed up to Cloud Storage are retained until the process is complete.
A wide range of checkpointing solutions
At Google Cloud, we offer a comprehensive portfolio of checkpointing solutions to meet diverse AI training needs. Options like direct Cloud Storage and Cloud Storage FUSE are simpler approaches and serve smaller to medium-scale workloads very effectively. Parallel file systems such as Lustre offer high throughput for large clusters, while multi-tier checkpointing is purpose-built for the most demanding, highest-scale (>1K nodes) training jobs that require very frequent saves and rapid recovery.
Multi-tier checkpointing is currently in preview, focused on JAX for Cloud TPUs and PyTorch on GPUs. Get started with it today by following our user guide, and don’t hesitate to reach out to your account team if you have any questions or feedback.
Businesses that rely on real-time data for decision-making and application development need a robust and scalable streaming platform, and Apache Kafka has emerged as the leading solution.
At its core, Kafka is a distributed streaming platform that allows applications to publish and subscribe to streams of records, much like a message queue or enterprise messaging system, and goes beyond traditional messaging with features like high throughput, persistent storage, and real-time processing capabilities. However, deploying, managing, and scaling Kafka clusters can be challenging. This is what Google Cloud’s Managed Service for Apache Kafka solves. This managed Kafka service is open-source compatible and portable, easy to operate, and secure, allowing you to focus on building and deploying streaming applications without worrying about infrastructure management, software upgrades, or scaling. It’s also integrated for optimal performance with other Google Cloud data services such as BigQuery, Cloud Storage and Dataflow.
While Apache Kafka offers immense power, achieving optimal performance isn’t automatic. It requires careful tuning and benchmarking. This post provides a hands-on guide to optimize your deployments for throughput and latency.
Note: We assume a high-level understanding of Apache Kafka and BASH scripting. For an introduction and overview of Apache Kafka, visit the Apache Software Foundation website. For an introduction to BASH, please visit this Geeks for Geeks tutorial.
Benchmarking Kafka producers, consumers and latencies
Benchmarking your Kafka deployment is crucial for understanding its performance characteristics and ensuring it can serve your application’s requirements. This involves a deep dive into metrics like throughput and latency, along with systematic experimentation by optimizing your producer and consumer configurations. It’s important to note that this is done at a topic / application level and should be replicated for each topic.
Optimizing for throughput and latency
The Apache Kafka bundle includes two utilities, kafka-producer-perf-test.sh and kafka-consumer-perf-test.sh, to assess producer and consumer performance as well as latencies.
Note: While we use some config values to demonstrate tool usage, it’s recommended that you use configurations (e.g., message size, message rates, etc.) that mirror your workloads.
The kafka-producer-perf-test.sh tool simulates producer behavior by sending a specified number of messages to a topic while measuring throughput and latencies, and takes the following flags:
topic (required): Specifies the target Kafka topic
num-records (required): Sets the total number of messages to send
record-size (required): Defines the size of each message in bytes
throughput (required): Sets a target throughput in messages per second (use -1 to disable throttling)
producer-props:
bootstrap.servers (required): Comma-separated list of Kafka bootstrap server or broker addresses.
acks (optional): Controls the level of acknowledgment required from brokers (0, 1, or all): 0 for no acknowledgment, 1 for the leader broker only, and ‘all’ for all brokers. The default value is ‘all’.
batch.size (optional): The maximum size of a batch of messages in bytes. The default value is 16KB.
linger.ms (optional): The maximum time to wait for a batch to fill before sending. The default value is 0 ms.
compression.type (optional): Any one of none, gzip, snappy, lz4, or zstd. The default value is none.
Sample code block #1: Kafka producer performance test
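A representative invocation, with a placeholder topic and broker addresses, might look like the following; exact values should mirror your own workload:

```bash
# Send one million 1KB messages with throttling disabled (--throughput -1),
# using acks=1, a 10KB batch, a 10ms linger window, and lz4 compression.
kafka-producer-perf-test.sh \
  --topic benchmark-topic \
  --num-records 1000000 \
  --record-size 1024 \
  --throughput -1 \
  --producer-props bootstrap.servers=broker-1:9092,broker-2:9092 \
      acks=1 batch.size=10000 linger.ms=10 compression.type=lz4
```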
The most crucial properties are acks, batch.size, linger.ms, and compression because they directly influence producer throughput and latency. While exact settings depend on your application, we suggest these baseline configurations:
acks: acks=1 requires acknowledgement from the leader broker only. This gives the best performance unless you need acknowledgments from the leader and all of its followers.
batch.size: 10000B, or 10 KB, is a good baseline value to start with. Increasing the batch size allows producers to send more messages in a single request, reducing overhead.
linger.ms: 10ms is a good value as a baseline. You can try within a range of 0-50ms. Increasing linger time further can result in increased latencies.
compression: The recommendation is to use compression to further increase your throughput and reduce latencies.
The kafka-consumer-perf-test.sh tool simulates consumer behavior by fetching messages from a Kafka topic and measuring the achieved throughput and latencies. Its key properties are listed below, followed by a sample invocation:
topic (required): Specifies the Kafka topic to consume from.
bootstrap-server (required): Comma-separated list of Kafka bootstrap server or broker addresses.
messages (required): The total number of messages to consume.
group (optional): The consumer group ID.
fetch-size (optional): The maximum amount of data to fetch in a single request. The default value is 1048576 bytes (1MB).
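A representative invocation, again with placeholder topic, broker, and group names:

```bash
# Consume one million messages from the benchmark topic with a 1MB fetch size
# and report the achieved throughput.
kafka-consumer-perf-test.sh \
  --topic benchmark-topic \
  --bootstrap-server broker-1:9092,broker-2:9092 \
  --messages 1000000 \
  --group perf-consumer-group \
  --fetch-size 1048576
```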
To achieve optimal consumer throughput, fetch-size is the crucial property to tune. The right value is largely determined by your consumption and throughput needs, and can range from around 1MB for smaller messages to 1-50MB for larger ones. It’s advisable to analyze the effects of different fetch sizes on both application responsiveness and throughput. By carefully documenting these tests and examining the results, you can pinpoint performance limitations and refine your settings accordingly.
How to benchmark throughput and latencies
Benchmarking the producer
When conducting tests to measure the throughput and latencies of Kafka producers, the key parameters are batch.size, the maximum size of a batch of messages, and linger.ms, the maximum time to wait for a batch to fill before sending. For the purposes of this benchmark, we suggest keeping acks at 1 (acknowledgment from the leader broker) to balance durability and performance. This helps us estimate the expected throughput and latencies for a producer. Note that message size is kept constant at 1KB, and the parameter sweep can be scripted as shown below.
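This is a minimal sketch of such a sweep (placeholder broker and topic; not the exact harness behind the numbers below):

```bash
# Sweep batch.size and linger.ms with a fixed 1KB message size and acks=1,
# mirroring the parameter grid in the results table below.
for batch in 1000 10000 100000; do      # ~1KB, 10KB, 100KB batches
  for linger in 10 100; do
    echo "=== batch.size=${batch} linger.ms=${linger} ==="
    kafka-producer-perf-test.sh \
      --topic benchmark-topic \
      --num-records 1000000 \
      --record-size 1024 \
      --throughput -1 \
      --producer-props bootstrap.servers=broker-1:9092 \
          acks=1 batch.size=${batch} linger.ms=${linger}
  done
done
```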
| Throughput (messages/s) | Throughput (MBs) | Latency (ms) | acks (=1) | batch_size | linger_ms |
| --- | --- | --- | --- | --- | --- |
| 48049 | 45 | 608 | Leader | 1KB | 10 |
| 160694 | 153 | 171 | Leader | 10KB | 10 |
| 117187 | 111 | 268 | Leader | 100KB | 10 |
| 111524 | 106 | 283 | Leader | 100KB | 100 |
Analysis and findings
The impact of batch size: As expected, increasing batch size generally leads to higher throughput (messages/s and MBs). We see a significant jump in throughput as we move from 1KB to 10KB batch sizes. However, further increasing the batch size to 100KB does not show a significant improvement in throughput. This suggests that an optimal batch size exists beyond which further increases may not yield substantial throughput gains.
Impact of linger time: Increasing the linger time from 10ms to 100ms with a 100KB batch size slightly reduced throughput (from 117,187 to 111,524 messages/s). This indicates that, in this scenario, a longer linger time is not beneficial for maximizing throughput.
Latency considerations: Latency tends to increase with larger batch sizes. This is because messages wait longer to be included in a larger batch before being sent. This is clearly visible when batch_size is increased from 10KB to 100KB.
Together, these findings highlight the importance of careful tuning when configuring Kafka producers. Finding the optimal balance between batch.size and linger.ms is crucial for achieving desired throughput and latency goals.
Benchmarking the consumer
To assess consumer performance, we conducted a series of experiments using kafka-consumer-perf-test, systematically varying the fetch size, as sketched below.
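A minimal sketch of that sweep, with placeholder broker and topic names:

```bash
# Vary fetch-size across the same range as the results table below
# (10KB, 100KB, 1MB, 10MB, 100MB, 500MB).
for fetch in 10240 102400 1048576 10485760 104857600 524288000; do
  echo "=== fetch-size=${fetch} bytes ==="
  kafka-consumer-perf-test.sh \
    --topic benchmark-topic \
    --bootstrap-server broker-1:9092 \
    --messages 1000000 \
    --fetch-size ${fetch}
done
```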
| Throughput (messages/sec) | Throughput (MBs) | fetch-size |
| --- | --- | --- |
| 2825 | 2.6951 | 10KB |
| 3645 | 3.477 | 100KB |
| 18086 | 17.8 | 1MB |
| 49048 | 46 | 10MB |
| 61334 | 58 | 100MB |
| 62562 | 60 | 500MB |
Analysis and findings
Impact of fetch size on throughput: The results clearly demonstrate a strong correlation between fetch.size and consumer throughput. As we increase the fetch size, both message throughput (messages/s) and data throughput (MBs) improve significantly. This is because larger fetch sizes allow the consumer to retrieve more messages in a single request, reducing the overhead of frequent requests and improving data transfer efficiency.
Diminishing returns: While increasing fetch.size generally improves throughput, we observe diminishing returns as we move beyond 100MB. The difference in throughput between 100MB and 500MB is not significant, suggesting that there’s a point where further increasing the fetch size provides minimal additional benefit.
Scaling the Google Managed Service for Apache Kafka
Based on further experiments, we explored optimal configurations for the managed Kafka cluster. Please note that for this exercise, we kept the message size at 1KB and the batch size at 10KB; the topic has 1,000 partitions, and the replication factor is 3. The results were as follows.
| Producer threads | cluster_bytes_in_count (MBs) | CPU util | Memory util | vCPU | Memory |
| --- | --- | --- | --- | --- | --- |
| 1 | 56 | 98% | 58% | 3 | 12GB |
| 1 | 61 | 24% | 41% | 12 | 48GB |
| 2 | 104 | 56% | 57% | 12 | 48GB |
| 4 | 199 | 64% | 60% | 12 | 48GB |
Scaling your managed Kafka cluster effectively is crucial to ensure optimal performance as your requirements grow. To determine the right cluster configuration, we conducted experiments with varying numbers of producer threads, vCPUs, and memory. Our findings indicate that vertical scaling, by increasing vCPUs and memory from 3 vCPUs/12GB to 12 vCPUs/48GB, significantly improved resource utilization. With two producer threads, the cluster’s bytes_in_count metric roughly doubled and CPU utilization increased from 24% to 56%. Your throughput requirements play a vital role: with 12 vCPUs/48GB, moving from 2 to 4 producer threads nearly doubled the cluster’s bytes_in_count. You also need to monitor resource utilization to avoid bottlenecks, as increasing throughput can increase CPU and memory utilization. Ultimately, optimizing managed Kafka service performance requires a careful balance between vertical scaling of the cluster and your throughput requirements, tailored to your specific workload and resource constraints.
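For reference, vertical scaling of a managed Kafka cluster is a single update operation. This is a sketch only: the cluster name and location are placeholders, and the exact flag names and memory unit are assumptions that should be confirmed against the current gcloud managed-kafka reference:

```bash
# Vertically scale a Managed Service for Apache Kafka cluster from
# 3 vCPU / 12 GiB to 12 vCPU / 48 GiB, as in the experiment above.
# NOTE: flag spellings and the byte-based memory value are assumptions;
# check `gcloud managed-kafka clusters update --help` for the exact syntax.
gcloud managed-kafka clusters update my-kafka-cluster \
  --location=us-central1 \
  --cpu=12 \
  --memory=51539607552   # 48 GiB expressed in bytes
```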
Build the Kafka cluster you need
In conclusion, optimizing your Google Cloud Managed Service for Apache Kafka deployment involves a thorough understanding of producer and consumer behavior, careful benchmarking, and strategic scaling. By actively monitoring resource utilization and adjusting your configurations based on your specific workload demands, you can ensure your managed Kafka clusters deliver the high throughput and low latency required for your real-time data streaming applications.
Interested in diving deeper? Explore the resources and documentation linked below:
As AI moves from promising experiments to landing core business impact, the most critical question is no longer “What can it do?” but “How well does it do it?”.
Ensuring the quality, reliability, and safety of your AI applications is a strategic imperative. To guide you, evaluation must be your North Star—a constant process that validates your direction throughout the entire development lifecycle. From crafting the perfect prompt and choosing the right model to deciding if tuning is worthwhile and evaluating your agents, robust evaluation provides the answers.
One year ago, we launched the Gen AI evaluation service, offering capabilities to evaluate various models including Google’s foundation models, open models, proprietary foundation models, and customized models. It provided online evaluation modes with pointwise and pairwise criteria, utilizing computation and Autorater methods.
Since then, we’ve listened closely to your feedback and focused on addressing your most important needs. That’s why today we’re excited to dive into the new features of the Gen AI Evaluation Service, designed to help you scale your evaluations, evaluate your autorater, customize your autorater with rubrics and evaluate your agents in production.
Framework to evaluate your generative AI
1. Scale your evaluation with Gen AI batch evaluation
One of the most pressing questions for AI developers is, “How can I run evaluation at scale?” Previously, scaling evaluations required heavy engineering, was hard to maintain, and was expensive: you had to build your own batch evaluation process by combining multiple Google Cloud services.
The new batch evaluation feature simplifies this process, providing a single API for large datasets. This means you can evaluate large volumes of data efficiently, supporting all methods and metrics available in the Gen AI evaluation service in Vertex AI. It’s designed to be cheaper and more efficient than previous approaches.
You can learn more about how to run batch evaluation with the Gemini API in Vertex AI in this tutorial.
2. Scrutinize your autorater and build trust
A common and critical concern we hear from developers is, “How can I customize and truly evaluate my autorater?” While using an LLM to assess an LLM-based application offers scale and efficiency, it also introduces valid questions about its limitations, robustness, and potential biases. The fundamental challenge is building trust in its results.
We believe that trust isn’t given; it’s built through transparency and control. Our features are designed to empower you to rigorously scrutinize and refine your autorater. This is achieved through two key capabilities:
First, you can evaluate your autorater’s quality. By creating a benchmark dataset of human-rated examples, you can directly compare the autorater’s judgments against your “source of truth.” This allows you to calibrate its performance, measure its alignment with your own judgments, and gain a clear understanding of areas that need improvement.
Second, you can actively improve its alignment. We provide several approaches to customize your autorater’s behavior. You can refine the autorater’s prompt with specific criteria, chain-of-thought reasoning, and detailed scoring guidelines. Furthermore, advanced settings and the ability to bring your own autorater and tune it with your own reference data ensure it meets your specific needs and can capture unique use cases.
Here is an example of analysis you can build with the new autorater customization features.
Check out the Advanced judge model customization series in the official documentation to learn more about how to evaluate and configure the judge model. For a practical example, here is a tutorial on how to customize your evaluations using an open autorater with Vertex AI Gen AI Evaluation.
3. Rubrics-driven evaluation
Evaluating complex AI applications can sometimes present a frustrating challenge: how can you use a fixed set of criteria when every input is different? A generic list of evaluation criteria often fails to capture the nuance of a complex multimodal use case, such as image understanding.
To solve this, our rubrics-driven evaluation feature breaks the evaluation experience into a two-step approach.
Step 1 – Rubric generation: First, instead of asking users to provide a static list of criteria, the system acts like a tailored test-maker. For each individual data point in your evaluation set, it automatically generates a unique set of rubrics—specific, measurable criteria adapted to that entry’s content. You can review and customize these tests, if needed.
Step 2 – Targeted autorating: Next, the autorater uses these custom-generated rubrics to assess the AI’s response. This is like a teacher writing unique questions for each student’s essay based on its specific topic, rather than using the same generic questions for the whole class.
This process ensures that every evaluation is contextual and insightful. It enhances interpretability by tying every score to criteria that are directly relevant to the specific task, giving you a far more accurate measure of your model’s true performance.
Here, you can see an example of the rubric-driven pairwise evaluation you will be able to produce with Gen AI evaluation service on Vertex AI.
4. Evaluate your agents

We are at the beginning of the agentic era, where agents reason, plan, and use tools to accomplish complex tasks. However, evaluating these agents presents a unique challenge. It’s no longer sufficient to just assess the final response; we need to validate the entire decision-making process. “Did the agent choose the right tool?”, “Did it follow a logical sequence of steps?”, “Did it effectively store and use information to provide personalized answers?” These are some of the critical questions that determine an agent’s reliability.
To address some of these challenges, the Gen AI evaluation service in Vertex AI introduces capabilities specifically for agent evaluation. You can evaluate not only the agent’s final output but also gain insights into its “trajectory”—the sequence of actions and tool calls it makes. With specialized metrics for trajectory, you can assess your agent’s reasoning path. Whether you’re building with Agent Development Kit, LangGraph, CrewAI, or other frameworks, and hosting them locally or on Vertex AI Agent Engine, you can analyze if the agent’s actions were logical and if the right tools were used at the right time. All results are integrated with Vertex AI Experiments, providing a robust system to track, compare, and visualize performance, enabling you to build more reliable and effective AI agents.
Here you can find a detailed documentation with several examples of agent evaluation with Gen AI evaluation service on Vertex AI.
Finally, we recognize that evaluation remains a research frontier. We believe that collaborative efforts are key to addressing current challenges. Therefore, we are actively working with companies like Weights & Biases, Arize, and Maxim AI. Together, we aim to find solutions for open challenges such as the cold-start data problem, multi-agent evaluation, and real-world agent simulation for validation.
Get started today
Ready to build reliable LLM applications for production on Vertex AI? The Gen AI evaluation service in Vertex AI addresses the most requested features from users, providing a powerful, comprehensive suite for evaluating your AI application. By enabling you to scale evaluations, build trust in your autorater, and assess multimodal and agentic use cases, we want to foster confidence and efficiency, ensuring your LLM-based applications perform as expected in production.
Introducing Pub/Sub Single Message Transforms (SMTs), which make it easy to perform simple data transformations such as validating, filtering, enriching, and altering individual messages as they move in real time, right within Pub/Sub. The first SMT is available now: JavaScript User-Defined Functions (UDFs), which let you perform simple, lightweight modifications to message attributes and/or data directly within Pub/Sub via snippets of JavaScript code. Learn more in the launch blog.
Serverless Spark is now generally available directly within BigQuery. Formerly Dataproc Serverless, the fully managed Google Cloud Serverless for Apache Spark helps to reduce TCO, provides strong performance with the new Lightning Engine, integrates and leverages AI, and is enterprise-ready. And by bringing Apache Spark directly into BigQuery, you can now develop, run and deploy Spark code interactively in BigQuery Studio. Read all about it here.
Next-Gen data pipelines: Airflow 3 arrives on Google Cloud Composer: Google is the first hyperscaler to provide selected customers with access to Apache Airflow 3, integrated into our fully managed Cloud Composer 3 service. This is a significant step forward, allowing data teams to explore the next generation of workflow orchestration within a robust Google Cloud environment. Airflow 3 introduces powerful capabilities, including DAG versioning for enhanced auditability, scheduler-managed backfills for simpler historical data reprocessing, a modern React-based UI for more efficient operations, and many more features.
June 2 – June 6
Enhancing BigQuery workload management: BigQuery workload management provides comprehensive control mechanisms to optimize workloads and resource allocation, preventing performance issues and resource contention, especially in high-volume environments. To make it even more useful, we announced several updates to BigQuery workload management around reservation fairness, predictability, flexibility and “securability,” new reservation labels, as well as autoscaler improvements. Get all the details here.
Bigtable Spark connector is now GA: The latest version of the Bigtable Spark connector opens up a world of possibilities for Bigtable and Apache Spark applications, not least of which is additional support for Bigtable and Apache Iceberg, the open table format for large analytical datasets. Learn how to use the Bigtable Spark connector to interact with data stored in Bigtable from Apache Spark, and delve into powerful use cases that leverage Apache Iceberg in this post.
BigQuery gets transactional: Over the years, we’ve added several capabilities to BigQuery to bring near-real-time, transactional-style operations directly into your data warehouse, so you can handle common data management tasks more efficiently from within the BigQuery ecosystem. In this blog post, you can learn about three of them: efficient fine-grained DML mutations; change history support for updates and deletes; and real-time updates with DML over streaming data.
Google Cloud databases integrate with MCP: We announced capabilities in MCP Toolbox for Databases (Toolbox) to make it easier to connect databases to AI assistants in your IDE. MCP Toolbox supports BigQuery, AlloyDB (including AlloyDB Omni), Cloud SQL for MySQL, Cloud SQL for PostgreSQL, Cloud SQL for SQL Server, Spanner, self-managed open-source databases including PostgreSQL, MySQL and SQLite, as well as databases from a growing list of other vendors, including Neo4j, Dgraph, and more. Get all the details here.
Welcome to the first Cloud CISO Perspectives for June 2025. Today, Anton Chuvakin, security advisor for Google Cloud’s Office of the CISO, discusses a new Google report on securing AI agents, and the new security paradigm they demand.
As with all Cloud CISO Perspectives, the contents of this newsletter are posted to the Google Cloud blog. If you’re reading this on the website and you’d like to receive the email version, you can subscribe here.
How Google secures AI Agents
By Anton Chuvakin, security advisor, Office of the CISO
Anton Chuvakin, security advisor, Office of the CISO
The emergence of AI agents promises to reshape our interactions with information systems — and ultimately with the real world, too. These systems, distinct from the foundation models they’re built on, possess the unique ability to act on information they’ve been given to achieve user-defined goals. However, this newfound capability introduces a critical challenge: agent security.
Agents strive to be more autonomous. They can take information and use it in conjunction with tools to devise and execute complex plans, so it’s critical that developers align agent behavior with user intent to prevent unintended and harmful actions.
With this great power comes a great responsibility for agent developers. To help mitigate the potential risks posed by rogue agent actions, we should invest in a new field of study focused specifically on securing agent systems.
While there are similarities to securing AI, securing AI agents is distinct and evolving, and demands a new security paradigm.
Google advocates for a hybrid defense-in-depth approach that combines the strengths of both traditional (deterministic) and reasoning-based (dynamic) security measures. This creates layered defenses that can help prevent catastrophic outcomes while preserving agent usefulness.
To help detail what we believe are the core issues, we’ve published a comprehensive guide covering our approach to securing AI agents that addresses concerns for both AI agent developers and security practitioners. Our goal is to provide a clear and actionable foundation for building secure and trustworthy AI agent systems that benefit society.
We cover the security challenges of agent architecture, the specific risks of rogue actions and sensitive data disclosure, and detail the three fundamental agent security principles: well-defined human controllers, limited agent powers, and observable agent actions.
Agents must have well-defined human controllers: Agents must operate under clear human oversight, with the ability to distinguish authorized user instructions from other inputs.
Agent powers must have limitations: Agent actions and resource access must be carefully limited and dynamically aligned with their intended purpose and user risk tolerance. This emphasizes the least-privilege principle.
Agent actions and planning must be observable: Agent activities must be transparent and auditable through robust logging and clear action characterization.
We believe that the most effective and efficient defense-in-depth path forward secures agents with both classic and AI controls. Our approach advocates for two distinct layers:
Layer 1: Use traditional, deterministic measures, such as runtime policy enforcement. Runtime policy engines act as external guardrails, monitoring and controlling agent actions before execution based on predefined rules. These engines use action manifests to capture the security properties of agent actions, such as dependency types, effects, authentication, and data types.
Layer 2: Deploy reasoning-based defense strategies. This layer uses the AI model’s own reasoning to enhance security. Techniques such as adversarial training and using specialized models as security analysts can help the agent distinguish legitimate commands from malicious ones, making it more resilient against attacks, data theft, and even model theft.
Of course, each of the above two layers should have their own layers of defense. For example, model-based input filtering coupled with adversarial training and other techniques can help reduce the risk of prompt injection, but not completely eliminate it. Similarly, these defense measures would make data theft more difficult, but would also need to be enhanced by traditional controls such as rule-based and algorithmic threat detection.
Key risks, limitations, and challenges
Traditional security paradigms, designed for static software or general AI, are insufficient for AI agents. They often lack the contextual awareness needed to know what the agent is reasoning about and can overly restrict an agent’s utility.
Similarly, relying solely on a model’s judgment for security is also inadequate because of the risk posed by vulnerabilities such as prompt injection, which can compromise the integrity and functionality of an agent over time.
In the wide universe of risks to AI, two risks associated with AI agents stand out from the crowd by being both more likely to manifest and more damaging if ignored.
Rogue actions are unintended, harmful, and policy-violating behaviors an agent might exhibit. They can stem from several factors, including the stochastic nature of underlying models, the emergence of unexpected behaviors, and challenges in aligning agent actions with user intent. Prompt injections are a significant vector for inducing rogue actions.
For example, imagine an agent designed to automate tasks in a cloud environment. A user intends to use the agent to deploy a virtual machine. However, due to a prompt injection attack, the agent instead attempts to delete all databases. A runtime policy engine, acting as a guardrail, would detect the “delete all databases” action (from its action manifest) and block it because it violates predefined rules.
Sensitive data disclosure involves the unauthorized revelation of private or confidential information by agents. Security measures would help ensure that access to sensitive data is strictly controlled.
For example, an agent in the cloud might have access to customer data to generate reports. If not secured, the agent might retain this sensitive data and then be coaxed to expose it. A malicious user could then ask a follow-up question that triggers the agent to inadvertently disclose some of that retained data.
However, securing AI agents is inherently challenging due to four factors:
Unpredictability (non-deterministic nature)
Emergent behaviors
Autonomy in decision-making
Alignment issues (ensuring actions match user intent)
Practical security considerations
Our recommended hybrid approach addresses several critical areas.
Agent/plugin user controls: Emphasizes human confirmation for critical and irreversible actions, clear distinction between user input and other data, and verifiable sharing of agent configurations.
Agent permissions: Adherence to the least-privilege principle, confining agent actions to its domain, limiting permissions, and allowing for user authority revocation. This level of granular control often surprises security leaders because such a traditional 1980s-style security control delivers high value for securing 2020s AI agents.
Orchestration and tool calls: The intricate relationship between AI agents and external tools and services they use for orchestration can present unique security risks, especially with “Actions as Code.” Robust authentication, authorization, and semantic tool definitions are crucial risk mitigations here.
Agent memory: Data stored in an agent’s memory can lead to persistent prompt injections and information leakage.
Response rendering: Safely rendering AI agent outputs into user-readable content is vital to prevent classic web vulnerabilities.
Assurance and future directions
Continuous assurance efforts are essential to validate agent security. This includes regression testing, variant analysis, red teaming, user feedback, and external research programs to ensure security measures remain effective against evolving threats.
Securing AI agents requires a multi-faceted, hybrid approach that carefully balances the utility of these systems with the imperative to mitigate their inherent risks. Google Cloud offers controls in Agentspace that follow these guidelines, such as authentication and authorization, model safeguards, posture assessment, and of course logging and detection.
To learn more about how Google is approaching securing AI agents, please read our research paper.
In case you missed it
Here are the latest updates, products, services, and resources from our security teams so far this month:
Project Shield blocked a massive recent DDoS attack. Here’s how: Project Shield, Google’s free service that protects at-risk sites against DDoS attacks, kept KrebsOnSecurity up during a recent, massive one. Here’s what happened. Read more.
Don’t test in prod. Use digital twins for safer, smarter resilience: Digital twins are replicas of physical systems using real-time data to create a safe test environment. Here’s how they can help business and security leaders. Read more.
How to build a digital twin with Google Cloud: Digital twins are essentially IT stunt doubles, cloud-based replicas of physical systems for testing. Learn how to build them on Google Cloud. Read more.
Enhancing protection: 4 new Security Command Center capabilities: Security Command Center has a unique vantage point to protect Google Cloud environments. Here are four new SCC capabilities. Read more.
Please visit the Google Cloud blog for more security stories published this month.
Threat Intelligence news
The cost of a call, from voice phishing to data extortion: Google Threat Intelligence Group (GTIG) is tracking threat actors who specialize in voice phishing (vishing) campaigns designed to compromise Salesforce instances for large-scale data theft and subsequent extortion. Here are several defensive measures you can take. Read more.
A technical analysis of vishing threats: Financially motivated threat actors have increasingly adopted voice-based social engineering, or “vishing,” as a primary vector for initial access, though their specific methods and end goals can vary significantly. Here’s how they do it — and what you can do to stop them. Read more.
Please visit the Google Cloud blog for more threat intelligence stories published this month.
Now hear this: Podcasts from Google Cloud
Debunking cloud breach myths (and what DBIR says now): Everything (and we mean everything) you wanted to know about cloud breaches, but were (legitimately, of course) afraid to ask. Verizon Data Breach Report lead Alex Pinto joins hosts Anton Chuvakin and Tim Peacock for a lively chat on breaching clouds. Listen here.
Is SIEM in 2025 still too hard?: Alan Braithwaite, co-founder and CTO, RunReveal, discusses the future of SIEM and security telemetry data with Anton and Tim. Listen here.
Cyber-Savvy Boardroom: Jamie Collier on today’s threat landscape: Jamie Collier, lead Europe advisor, GTIG, joins Office of the CISO’s David Homovich and Anton Chuvakin to talk about what boards need to know about today’s threat actors. Listen here.
Defender’s Advantage: Confronting a North Korean IT worker incident: Mandiant Consulting’s Nick Guttilla and Emily Astranova join Luke McNamara for an episode on the AI-driven use of voice-based phishing, or “vishing,” and how they use it during red team engagements. Listen here.
Behind the Binary: Protecting software intellectual property: Tim Blazytko, chief scientist and head of engineering, Emproof, talks with host Josh Stroschein about the essential strategies for protecting software intellectual property. Listen here.
To have our Cloud CISO Perspectives post delivered twice a month to your inbox, sign up for our newsletter. We’ll be back in a few weeks with more security-related updates from Google Cloud.
As a scalable, distributed, high-performance, Cassandra- and HBase-compatible NoSQL database, Bigtable processes more than 5 billion sustained queries per second and has more than 10 Exabytes of data under management. At this scale, we optimize Bigtable for high-throughput and low-latency reads and writes.
In a previous blog post, we shared details of how our single-row read projects delivered 20-50% throughput improvements while maintaining low latency. Since then, we’ve continued to innovate on our single-row read performance, delivering a further 50% throughput improvement. These improvements are immediately available and are reflected in our updated performance metrics: Bigtable now supports up to 17,000 point reads per second per node. This increases Bigtable’s read throughput by up to 1.7x (an additional 7,000 point reads per second over our 10,000 point reads per second baseline) at no additional cost to you (see figure 2).
Figure 2. Single-row read throughput improvements over time
And thanks to Bigtable’s linear scaling, clusters of all sizes, from a single node through to thousands of nodes, benefit equally from this performance.
For example, Stairwell leverages Bigtable performance for their cybersecurity workload. Their largest table has over 328 million rows and column counts ranging from 1 to 10,000. This table stores hundreds of billions of data points all while maintaining an average read latency of just 1.9 milliseconds and maxing out at 4 milliseconds.
“We’ve noticed the incremental throughput improvements over time, resulting in reduced node count when using Bigtable autoscaling. This means less cost for us, along with the improved performance.” – Ygor Barboza, Engineer, Spotify
Where we look for performance gains
Here on the Bigtable team, we continue to seek out opportunities to evolve and improve performance to meet customer expectations and business objectives. Let’s take a look at how we approach the problem.
1. Performance research and innovation We use a suite of benchmarks to continuously evaluate Bigtable’s performance. These represent a broad spectrum of workloads, access patterns, and data volumes that we see across the fleet. Benchmark results give us a high-level view of performance improvement opportunities, which we then investigate in depth using sampling profilers and pprof. Building on the insights and successes from our work in 2023, we identified the larger and more complex opportunities detailed below.
2. Improved caching for read performance Like many high performance systems, Bigtable caches frequently accessed data in DRAM to achieve high throughput and low latency. This low-level cache holds SSTable data blocks which reduces I/O costs associated with block retrieval from Colossus. The cache operates at the same abstraction layer as disk access, so request processing requires block-centric traversal of the in-memory data structures to build a full-stack row result from the log-structured merge tree SSTable stack (see figure 3).
Figure 3. Block-centric traversal of the in-memory SSTable stack data structures
This caching strategy works well in the general case and has been a pillar of Bigtable’s read throughput performance. However, in use cases where specific row-key queries are more frequent than other key ranges in a block, it can be advantageous to reduce block processing overhead for those rows. This can be especially beneficial for access patterns that read many blocks but return only a fraction of the data:
Reading the latest values from columns with frequent write traffic, i.e., high SSTable stack depth
Reading a row with many columns where the width of the row spans many blocks
Bigtable’s new row cache builds on the block cache to cache data at row granularity. This reduces CPU usage by up to 25% for point read operations. In contrast to cached block storage, cached rows use a sparse representation of the block data, maintaining only the data accessed by a query within any row. This format allows queries to reuse row-cache data so long as the cached row contains the required data. In the event that a new query requires data that is not present in the cache entry, the request falls back to block reads and populates the row cache structure with the missing data.
Both row and block caches share the same memory pool and employ an innovative eviction algorithm to optimize performance across a diverse set of query types, balancing block caching for breadth of response versus row caching for high-throughput access to the most frequently accessed data. Row caching also considers row size in its storage optimization algorithm to maintain high cache hit rates.
Figure 4. Read request row cache lifecycle and population
3. Single-row read operation efficiency Single-row read operations, Bigtable’s most common access pattern, are key to many critical serving workloads. To complement the throughput improvements delivered by the row-cache, we took further opportunities to tune the single-row read path and deliver larger throughput gains.
The complexity of processing queries for a single row versus row ranges can be substantial. Row-range queries can involve high levels of fan-out with RPCs to many tablet servers and complex async processing to merge the results into a single client response. For point operations, this complexity can be bypassed as the read is handled by one node, allowing the associated CPU overhead to be reduced. More efficient point operations led to a 12% increase in throughput. Further, we introduced a new query optimizer step that streamlined 50% of point-read filter expressions into more efficient queries.
4. Scheduler improvements In addition to internal database performance enhancements, we’ve added user-driven throughput improvements for point operations. These are enabled via user-configurable app-profile prioritization, made possible by the launch of request priorities. Users can annotate their most important traffic with a scheduling priority, which can bring higher throughput and lower latency to that traffic versus lower priority traffic. We built this feature to improve Bigtable’s support for hybrid transactional/analytical processing (HTAP) workloads after researching how to improve isolation between transactional operations, typically single row reads, and analytical operations, which are complex queries with multi-row results. This work identified two core opportunities for Bigtable’s scheduling algorithm:
Prioritization of requests across application profiles (request priorities)
Smarter scheduling of operations to reduce point-operation latency
5. Request prioritization
Request priorities let you set a priority on each application profile, allowing Bigtable to more effectively prioritize incoming queries across application profiles. Bigtable supports three priority levels:
PRIORITY_HIGH
PRIORITY_MEDIUM
PRIORITY_LOW
As you might expect, application profiles with PRIORITY_HIGH are given higher scheduling priority than those with PRIORITY_MEDIUM or PRIORITY_LOW, improving throughput and latency consistency for PRIORITY_HIGH application profiles. In the context of HTAP workloads, transactional traffic can run on a high-priority application profile while analytical work runs at low or even medium priority, protecting the latency profile (in particular, the p99 latency) of the serving workload, especially during periods of CPU-intensive batch and analytical processing.
The diagram below illustrates how application profile priorities may affect how operations are scheduled. It is simplified to avoid the complexities of our multithreaded, parallel execution environments.
6. Point-operation scheduling
Multiple application profiles within a single cluster can have the same request priority, and within a single application profile there can be a mix of traffic types. With this in mind, we worked to improve latency between operations at a single priority level by introducing scheduling improvements that distribute CPU time more evenly across all operation types. Consider an HTAP workload that has complex analytical operations interleaved with point operations. If that workload contains an equal number of point operations and complex operations, the complex operations may consume a disproportionately large share of CPU within a given time window, which can increase point-operation latency as those operations queue behind larger, more complex ones. The improved scheduler protects the latency profile of point operations by continuously monitoring operation execution time and adding yield points to long-running operations, allowing point operations to interleave with them (as shown below).
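The following toy Python sketch models the idea: long-running operations expose yield points, and a simple round-robin loop lets queued point operations interleave with them. It is illustrative only and is not Bigtable’s scheduler; all names are hypothetical.

# Illustrative only: cooperative scheduling with yield points.
from collections import deque

def point_op(op_id):
    # A point operation completes in a single slice.
    yield f"point op {op_id} done"

def complex_op(op_id, slices):
    # A long-running operation yields periodically so others can run.
    for s in range(slices):
        yield f"complex op {op_id}, slice {s}"

def run(operations):
    queue = deque(operations)
    while queue:
        op = queue.popleft()
        try:
            print(next(op))   # run until the next yield point
            queue.append(op)  # requeue behind any waiting operations
        except StopIteration:
            pass              # operation finished

# Point operations interleave with the long-running operation's slices.
run([complex_op("A", 3), point_op(1), point_op(2)])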
A commitment to Bigtable performance
Our dedicated focus on performance over the past few years has yielded significant results, delivering up to 1.7x single-row read throughput gains while crucially maintaining the same low-latency profile. What this means in practice is that each Bigtable node can now handle 70% more traffic than before, allowing you to improve cluster efficiency and manage workloads without compromising responsiveness. We’re incredibly excited about these advancements and remain committed to continuously evolving Bigtable to push the boundaries of its core performance characteristics. You can learn more about Bigtable performance and find resources for testing and troubleshooting in our documentation.
Give Bigtable a try today, and be sure to check out our newly announced product capabilities.
Over the past year, an exponential surge in data, the widespread rollout of 5G, and heightened customer expectations have placed unprecedented demands upon communications service providers (CSPs). To thrive in this challenging landscape, telecommunications leaders are rethinking traditional network management, embracing digital transformation, and using the power of AI to build smarter, more efficient, and self-managing networks.
Today, to help CSPs address these pressures, we are announcing the Autonomous Network Operations framework — enabling CSPs to enhance service reliability, proactively detect and resolve network issues, and turn fragmented data into value. This new framework takes an AI-first approach, leveraging the latest in Google Cloud AI, infrastructure, and analytics products to understand and make sense of complex network data, risks, and operations. The framework also offers an extensive ecosystem to help deploy these solutions, including partners and Google Cloud Consulting.
The Autonomous Network Operations framework draws on Google’s extensive expertise in operating its own global network, which has leveraged AI at scale for more than 25 years and is one of the industry’s most advanced and resilient autonomous networks. CSPs are already using the framework to improve service reliability and reduce mean time to repair (MTTR) by 25%, and now we’re making it broadly available to accelerate their autonomous network operations journeys.
Navigating network complexity in the AI era
Managing complex telecom networks is a costly and resource-intensive undertaking for CSPs. Legacy infrastructure, often built for previous generations of mobile technology, struggles to keep pace with the immense data demands of 5G and beyond. This has led to several challenges:
Increased operational costs and network demands: Manual tasks like alarm triage, troubleshooting, configuration, and service provisioning across diverse systems consume significant resources for CSPs, hindering innovation and modernization.
Sub-optimal customer experience: Network operations traditionally focus on technical KPIs (utilization, latency, etc.) without real-time visibility into how specific network events or degradations are actually affecting the quality of experience for individual subscribers or services. Taking action usually happens reactively, leading to eroded customer satisfaction and increased churn.
Fragmented and siloed data: In many legacy platforms, vital network-performance and customer-experience data reside in separate systems, and often are difficult to integrate. But without a unified view, correlating network events with their impact on the customer experience is hard, and can lead to ineffective resource prioritization and delayed root cause analysis.
Difficulty implementing advanced technologies: Adopting autonomous networking operations can be resource intensive and costly, presenting challenges such as integration with existing infrastructure, data management, cybersecurity, upskilling talent, and identifying a clear path to a positive return on investment.
Yet, we’ve heard from our customers that successfully embracing autonomous network operations has the potential to dramatically improve service uptime for subscribers, significantly reduce network complexity, and unify fragmented data for actionable insights.
Google Cloud’s Autonomous Network Operations framework
Google Cloud’s Autonomous Network Operations framework supports CSPs’ strategic pathways to achieving true network autonomy, building on our unique strengths in AI, infrastructure, and global expertise.
The framework integrates critical Google Cloud products to transform operations, enhance service reliability, and unlock new value in three key ways:
1. Differentiated building blocks for sophisticated use cases
Google Cloud uniquely helps CSPs build intelligent networks with cutting-edge, AI-powered tools tailored to their specific needs. At its core is Cloud Spanner, Google Cloud’s globally distributed database that acts like a real-time virtual copy (a “digital twin”) of national networks. Spanner tracks billions of changing data points across all network components, providing a single, reliable record that even retains historical network conditions. This is crucial for advanced analysis and quickly pinpointing the root cause of issues. BigQuery then adds data analysis that can handle massive amounts of live network information. Finally, Google’s Gemini, our most capable AI model available through Vertex AI, and specialized Graph Neural Network (GNN) models, deeply understand the network’s complex and evolving connections.
Together, these tools let CSPs move beyond simple automation to tackle advanced autonomous network operations like finding problems before they happen, linking issues across different network areas, and making quick, precise decisions based on live information, at any scale.
2. Extensive integration to accelerate time-to-value
What truly sets Google Cloud’s framework apart is how smoothly all of its pieces fit together, reducing complexity and helping CSPs get things done much faster. Our services are designed to work hand-in-hand from the start, cutting down on time-consuming data engineering. For instance, BigQuery can directly access data in Cloud Spanner, providing a unified view of current operations and historical trends. Additionally, with BigQuery ML, CSPs can build and deploy AI models using simple SQL commands, leveraging powerful AI capabilities like Gemini through Vertex AI. This tight integration across our data storage, analytics, and AI tools allows CSPs to quickly pilot, launch, and expand their AI initiatives. The entire system is AI-ready from day one, facilitating the ingestion of live network data and even enabling automated problem resolution, unlocking value in days, not months.
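As a hedged illustration of the SQL-driven modeling workflow described above, the sketch below trains and applies a simple BigQuery ML classifier from Python using the google-cloud-bigquery client. The dataset, table, and column names are placeholders for illustration, not part of the framework.

# A minimal sketch of training and querying a BigQuery ML model from Python.
# Dataset, table, and column names are placeholders.
from google.cloud import bigquery

client = bigquery.Client()

# Train a simple classifier over network telemetry with plain SQL (BigQuery ML).
client.query("""
    CREATE OR REPLACE MODEL `your_dataset.alarm_triage_model`
    OPTIONS (model_type = 'logistic_reg', input_label_cols = ['is_customer_impacting']) AS
    SELECT latency_ms, packet_loss, utilization, is_customer_impacting
    FROM `your_dataset.network_events`
""").result()

# Score fresh events with the trained model.
rows = client.query("""
    SELECT event_id, predicted_is_customer_impacting
    FROM ML.PREDICT(MODEL `your_dataset.alarm_triage_model`,
                    (SELECT * FROM `your_dataset.new_network_events`))
""").result()

for row in rows:
    print(row.event_id, row.predicted_is_customer_impacting)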
3. Google’s AI and network operations expertise through Google Cloud Consulting
Google’s global network, one of the most advanced and resilient networks in the world, has leveraged AI at scale for more than 25 years. Through Google Cloud Consulting, we bring this operational expertise directly to CSPs to help them design and implement their own autonomous network operations frameworks. Our teams work closely with CSPs to tailor the framework to their environments. This includes everything from setting up data pipelines to operationalizing use cases like predictive maintenance, fault correlation, and closed-loop automation — helping to ensure rapid and reliable data activation.
To help ensure these intelligent, automated operations remain secure, we also bring in Mandiant, Google’s frontline cybersecurity team. Mandiant helps CSPs embed security by design into our framework — securing data flows, detecting adversarial AI threats, and protecting automated decision loops from compromise. With this unified approach, CSPs can scale autonomous operations with the same level of resilience, visibility, and protection that underpins Google’s own global infrastructure.
A tightly integrated, open ecosystem
We strengthened the Autonomous Network Operations framework with a deep ecosystem of leading independent software vendors (ISVs) and global system integrators (GSIs), who bring specialized expertise and solutions to accelerate CSPs’ transformation journeys.
For example, Amdocs, Ericsson, and Nokia now offer their own autonomous network solutions as comprehensive offerings built on the Autonomous Network Operations framework’s capabilities, enabling their customers to easily adopt and accelerate their journey toward network autonomy. These partners bring crucial expertise in handling diverse network data from various vendors, facilitating the creation of a unified data model. This unified model is essential for building sophisticated, AI-driven automation.
“As CSPs navigate the complexities of modern networks—ranging from high operational costs to the need for enhanced resiliency and uptime—intelligent automation and the evolution to autonomous networks become essential. By leveraging Google Cloud’s AI infrastructure, our Amdocs Network AIOps solution and the network agents it includes empower CSPs to proactively manage their networks through predictive analytics, automated workflows, and closed-loop operations. This collaboration enables a transformative shift toward autonomous networks, enhancing efficiency and delivering superior customer experiences.” – Anthony Goonetilleke, group president of Technology and head of Strategy, Amdocs
“The transformation to full autonomy will shape the success of CSPs, paving the way for a transition to next-generation technologies. Ericsson and Google Cloud are committed to empowering this transformation. Our collaboration is driving a fundamental shift in how mobile core networks are built and operated on public cloud infrastructure. Ericsson and Google continue to combine their expertise on multiple fronts — technology innovation, streamlined delivery models, and, most importantly, a shared culture of relentless innovation — to empower operators in realizing their vision of autonomous networks.” — Razvan Teslaru, head of Strategy, Cloud Software and Services, Ericsson.
“The industry needs to work together to realize the benefits of Level 4/5 autonomous networks. Nokia has a long history of meaningful innovation in network automation and applied Telco AI. We’re excited about deepening our collaboration with Google Cloud, which is already delivering tangible benefits to CSPs on their own, unique journeys to fully autonomous networks.” – Kal De, senior vice president, Product and Engineering, Cloud and Network Services, Nokia.
Complementing the ISVs, GSIs including Accenture and Capgemini act as the execution arm for the CSP, playing a pivotal role in helping create the specific autonomous networking deployments, and scaling these autonomous operations across the entire organization.
Customers embracing the framework with Google Cloud
CSPs are already transforming their operations and enhancing customer experiences with the Google Cloud AI, infrastructure, and expertise provided in the Autonomous Network Operations framework:
Bell Canada achieved a 25% reduction in customer-generated reports and increased software delivery productivity by 75%. By leveraging Autonomous Network Operations framework capabilities such as Spanner Graph, to dynamically assess network relationships and traffic changes, and Google Cloud AI, to identify and prioritize network issues before they escalate, Bell’s new AI operations (AI Ops) solution enables faster detection and resolution of network problems, improving network performance.
Deutsche Telekom ensures high service uptime for its customers, even during peak demand, with the RAN Guardian agent built using capabilities from Google Cloud’s Autonomous Network Operations framework. This RAN Guardian is a multi-agent system that constantly analyzes key network details in real time to predict and detect anomalies. It also prioritizes network issues by combining data from monitoring, inventory, performance, and coverage. Then, it automatically implements fixes, such as reallocating resources or adjusting configurations, to keep service quality high.
Telstra and Google Cloud are also co-developing a new approach to optimizing its radio access network (RAN) with an AI-powered agent. This agent uses Telstra’s network data to rapidly pinpoint incidents and detect anomalies before they impact service. This project is a key step in Telstra’s ambition for an autonomous network. If successful, it will unlock a future of advanced AI capabilities, enabling dynamic RAN optimization and intelligent capacity management to deliver a more resilient and higher-performing network.
Customers such as MasOrange and VMO2 have also expressed interest in leveraging advanced autonomous network capabilities to enhance their operations and customer experiences.
“By achieving a 25% reduction in customer-generated reports and boosting software delivery productivity by 75%, we’re transforming our operations into a customer-centric ‘techco’ model. This lean approach, with the customer as our #1 priority, is paving the way for full network autonomy. This future-forward strategy promises not only self-healing and resilient systems but also significant cost efficiencies.” – Mark McDonald, EVP and Chief Technology Officer, Bell Canada
“Transforming our network operations is fundamental to delivering best-in-class connectivity and services. By deeply integrating Google Cloud’s cutting-edge capabilities like Spanner Graph with its robust data and AI tools that we use today — such as BigQuery and Vertex AI — we will better understand network behavior and anticipate service incidents. This integration is key to achieving a truly autonomous operation in our future NOC, ensuring the best experience for MasOrange customers.” – Miguel Santos Fernández, Chief Technology Officer, MasOrange
Unlock autonomous network operations today with Google Cloud
If you are a CSP who is looking to enhance service reliability, proactively detect and resolve network issues, and turn data into value, the Autonomous Network Operations framework can help. Contact a Google Cloud account manager or explore our framework on our telecommunications industry page to learn more about starting a proof-of-concept with the Google Cloud Autonomous Network Operations framework.
In today’s rapidly evolving landscape, the need to protect highly sensitive government data remains paramount. Today, we reinforce our commitment to providing the highest level of assurance that sensitive agency data is protected while also streamlining the adoption of secure and modern cloud technologies, with another significant achievement – FedRAMP High authorization for Agent Assist, Looker (Google Cloud core) and Vertex AI Vector Search.
These services are foundational components of broader AI and Data Cloud solutions that can help automate institutional knowledge, bolster efficiency, drive greater worker productivity, and surface insights for more informed decision making. In today’s landscape, these are critical priorities. Findings from a recently released study that Google commissioned with GovExec show top current and future federal AI use cases which include data analysis and reporting, predictive analytics, and decision support. We believe secure, AI-powered technologies will play a critical role in scaling these AI use cases across the public sector.
Now, let’s dive deeper into our latest FedRAMP High authorizations and what they mean for public sector agencies.
Agent Assist: Empower call center operators with real-time support
Our AI-powered Agent Assist empowers call center operators with real-time support and guidance during the call, providing important context as the conversation unfolds and enabling employees to find information for callers more efficiently. Agent Assist improves accuracy, reduces handle time and after call work, drives more personalized and effective engagement, and enhances the overall service delivery.
Let’s take a closer look at how Agent Assist empowers call center operators. One federal agency faced challenges with long wait times and inconsistent answers due to operators navigating complex, siloed systems. Agent Assist offers real-time support by transcribing calls and instantly surfacing key information for a number of use cases like benefits, military and agency healthcare, claims status, IT helpdesk and more. Agent Assist guides agents through complex procedures, ensuring accuracy and compliance. It also reduces caller wait times, eliminates additional restarts, supports streamlined handoffs, automates call summaries, and so much more.
Looker (Google Cloud core): Explore data and create reports with AI
Looker is a complete AI-powered business intelligence (BI) platform that lets users explore data, chat with their data via AI agents using natural language, and create dashboards and self-service reports with as little as a single natural-language query. As a cloud-native and cloud-agnostic conversational, enterprise-level BI tool, Looker provides simplified and streamlined provisioning and configuration. FedRAMP High authorization for Looker is the gateway for its use by federal agencies, providing the security, compliance, and efficiency assurances that government operations demand.
Let’s take a closer look at how Looker helps agency employees explore data and make their data more actionable. One state agency in Texas partnered with Google Public Sector to create an AI platform that identifies new road developments to help ease congestion and improve the motorist experience. The agency uses Looker for analytics and visualization, BigQuery for data management, and Apigee for third-party integrations to help them uncover new trends that may not have been recognized before.
Vertex AI Vector Search: Perform semantic search and matching on large datasets
Vertex AI Vector Search is our managed service that allows agencies to perform semantic search and similarity matching on large datasets by leveraging vector representations of data. Using Vertex AI Vector Search, public sector agencies can perform lightning-fast semantic searches, uncovering relevant information based on meaning and context rather than just keywords. This capability is crucial for enhancing the speed and quality of services, from providing citizens with more intuitive access to information to empowering policy analysts with more comprehensive data. The ability to quickly surface connections and patterns across disparate documents, images and other unstructured data allows for more informed decision-making and improved operational efficiency. This builds on a prior announcement where we shared FedRAMP High authorization for Vertex AI Search and Generative AI on Vertex AI and demonstrates the incredible momentum around our Vertex AI platform.
Let’s take a closer look at how Vertex AI Vector Search supports more efficient searches within large datasets. One federal agency responsible for overseeing critical incident response is prototyping the ability to use Vertex AI Vector Search to guide its teams during fast-moving events. When a new situation develops, personnel can use natural language to search thousands of policies and standard operating procedures in real-time. This allows them to instantly find the correct protocol for the specific circumstances, ensuring a faster, safer, and more consistent operational response.
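For illustration, a minimal sketch of querying a deployed Vector Search index with the Vertex AI SDK might look like the following. The project, endpoint and deployed index IDs, and the placeholder query vector are assumptions; in practice, the vector would come from an embedding model and must match the index’s dimensions.

# A minimal sketch of querying a deployed Vertex AI Vector Search index.
# Resource names and the query vector below are placeholders.
from google.cloud import aiplatform

aiplatform.init(project="your-project", location="us-central1")

endpoint = aiplatform.MatchingEngineIndexEndpoint(
    index_endpoint_name="projects/your-project/locations/us-central1/indexEndpoints/1234567890"
)

# Placeholder vector: in practice, produced by an embedding model from a
# natural-language query such as "evacuation protocol for coastal flooding".
query_vector = [0.0] * 768

matches = endpoint.find_neighbors(
    deployed_index_id="policies_deployed_index",
    queries=[query_vector],
    num_neighbors=5,
)
for neighbor in matches[0]:
    print(neighbor.id, neighbor.distance)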
Accelerating innovation across the public sector
All of this momentum builds on prior announcements where we shared FedRAMP High authorization for Agentspace and Gemini in Workspace apps and the Gemini app. We will continue to invest in securing the government’s most sensitive data, ensuring mission continuity, and building public trust through FedRAMP accreditations.
At Google Public Sector, we’re passionate about applying the latest cloud, AI and security innovations to help you meet your mission. Subscribe to our Google Public Sector Newsletter to stay informed and stay ahead with the latest updates, announcements, events and more.
In enterprises, departments often describe their data assets using siloed terminology, and frequently have different interpretations of the same term. This can lead to miscommunication and inconsistent understanding of the enterprise’s data assets. Moreover, some of these terms can be technical (based on the analysis required to arrive at them), making it difficult for different business users to understand them. This is where Dataplex business glossary comes in, letting you standardize business terminologies and build a shared understanding across the enterprise.
Today, we’re excited to announce the general availability of business glossaries in Dataplex Universal Catalog. Dataplex business glossary provides a central, trusted vocabulary for your data assets, streamlining data discovery and reducing ambiguity, leading to more accurate analysis, better governance, and faster insights.
What’s in glossaries
Dataplex business glossary provides a mechanism to capture taxonomies by grouping business terms in categories and glossaries that help you describe business context. It empowers you to enrich data assets with this rich business context, which can be used for searching for the linked assets and establishing a common understanding of business context across the enterprise.
Create a hierarchical glossary taxonomy: Manage and standardize business context by creating glossaries and terms. You can also group terms in a glossary into categories.
Create links between terms and between terms-data assets: Create associations between similar and related terms. Terms can also be used to describe the entire data asset or specific columns within a data asset.
Search: Find all assets linked to a term to drive analysis. Searching for terms, categories and glossaries is also supported.
Import taxonomies from external sources: Migrate glossaries from another tool to Dataplex business glossary by using the bulk import API in JSON format.
Migrate existing Data Catalog glossary taxonomy to Dataplex catalog: If you’re currently using the preview of glossaries in Data Catalog, you can use the export and import mechanism to transition them to glossaries on Dataplex Universal Catalog.
Here’s what Ericsson, an early adopter of Dataplex business glossaries, has to share:
“Google Cloud Dataplex business glossaries are a foundational capability in enhancing the clarity of our data assets. Our teams now possess a unified understanding of critical business terminology, fostering superior collaboration, facilitating more assured data-driven decision-making, and becoming an essential part of our data strategy. Business glossaries have proven transformative capabilities that can be effectively managed within Dataplex, adapting to changing business needs.” – William McCann Murphy, Head of Data Authority, Ericsson
Get started with using glossaries
You can navigate to business glossaries within the Glossary tab in Dataplex Universal Catalog. You can manage glossaries, create associations between terms and data assets and search for them, all from the console.
Dataplex business glossary is now generally available. To learn more, refer to the user guide for glossaries; to kickstart your transition from the preview to glossaries on Dataplex Universal Catalog, refer to the migration guide.
As you adopt Google Cloud or migrate to the latest Compute Engine VMs or to Google Kubernetes Engine (GKE), selecting the right block storage for your workload is crucial. Hyperdisk, Google Cloud’s workload-optimized block storage that’s designed for our latest VM families (C4, N4, M4, and more), delivers high-performance storage volumes that are cost-efficient, easily managed at scale, and enterprise-ready. In this post, we guide you through the basics and help you choose the optimal Hyperdisk for your environment.
Introduction to Hyperdisk block storage
With Hyperdisk, you can independently tune capacity and performance to match your block storage resources to your workload. Hyperdisk is available in a few flavors:
Hyperdisk Balanced: Designed to fit most workloads and offers the best combination and balance of price and performance. This is also the boot disk for your compute instances. With Hyperdisk Balanced, you can independently configure the capacity, throughput, and IOPS of each volume. Hyperdisk Balanced is available in High Availability and Multi-writer mode.
Hyperdisk Extreme: Delivers the highest IOPS of all Hyperdisk offerings and is suited for high-end, performance-critical databases. With Hyperdisk Extreme, you can drive up to 350K IOPS from a single volume.
Hyperdisk Throughput: Delivers capacity at a price comparable to cold object storage, with the semantics of a disk. Hyperdisk Throughput offers high throughput for bandwidth- and capacity-intensive workloads that do not require low latency. It can also be used to deliver cost-effective disks for cost-sensitive workloads (e.g., cold disks).
Hyperdisk ML: Purpose-built for loading static data into your compute clusters. With Hyperdisk ML, you hydrate the disk with a fixed data set (such as model weights or binaries), then connect up to 2,500 compute instances to the same volume, so a single volume can serve over 150x more compute instances than competitive block storage volumes1 in read-only mode. You get exceptionally high aggregate throughput across all of those nodes, enabling you to accelerate inference startup, train models faster, and ensure your valuable compute resources are highly utilized.
You can also leverage Hyperdisk Storage Pools, which lower TCO and simplify operations: you pre-provision an aggregate amount of capacity and performance, which is then dynamically consumed by the volumes in that pool. You create a storage pool with the aggregate capacity and performance your workloads need, create disks in the pool, and attach those disks to your VMs. Because the pool is shared, you can create disks with a much larger size or provisioned performance limit than they currently need, which simplifies planning and leaves room for growth without later changing each disk’s provisioned size or performance.
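As a hedged example of this independent capacity and performance tuning, the sketch below creates a Hyperdisk Balanced volume with explicitly provisioned IOPS and throughput using the google-cloud-compute client. The project, zone, and chosen limits are placeholders.

# A minimal sketch of provisioning a Hyperdisk Balanced volume with independently
# tuned capacity and performance. Project, zone, and limits are placeholders.
from google.cloud import compute_v1

project, zone = "your-project", "us-central1-a"

disk = compute_v1.Disk(
    name="orders-db-data",
    size_gb=2048,  # capacity, configured independently of performance
    type_=f"projects/{project}/zones/{zone}/diskTypes/hyperdisk-balanced",
    provisioned_iops=20000,       # IOPS tuned to the workload
    provisioned_throughput=600,   # throughput limit (MiB/s value is a placeholder)
)

client = compute_v1.DisksClient()
operation = client.insert(project=project, zone=zone, disk_resource=disk)
operation.result()  # wait for the create operation to finish
print("Created", disk.name)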
You can also use a set of comprehensive data protection capabilities such as high availability, cross-region replication and recovery, backup, and snapshots to protect your business critical workloads.
For specifics around capabilities, capacity, machine support, and performance, please visit the documentation.
Recommendations for the most common workloads
To make choosing the right Hyperdisk architecture simpler, here are high-level recommendations for some of the most common workloads we see. For an enterprise, the Hyperdisk portfolio lets you optimize an entire three-tier application matching the needs of each component of your application to the different flavors of Hyperdisk.
Enterprise applications including general-purpose databases:
Hyperdisk Balanced combined with Storage Pools offers an excellent solution for a wide variety of general-purpose workloads, including common database workloads. Hyperdisk Balanced can meet the IOPS and throughput needs for most databases including Clickhouse, MySQL, and PostgreSQL, at general-purpose pricing. Hyperdisk Balanced offers 160K IOPS per volume — 10x better than AWS EBS gp3 volumes2. With Storage Pools you can enhance efficiency and radically simplify planning. Storage Pools allows customers to save approximately 20-40% on storage costs for typical database workloads when compared to Hyperdisk Balanced Volumes or AWS EBS gp3 volumes3.
“At Sentry.io, a platform used by over 4 million developers and 130,000 teams worldwide to quickly debug and resolve issues, adopting Google Cloud’s Hyperdisk has enabled us to create a flexible architecture for the next-generation of our Event Analytics Platform, a product at the core of our business. Hyperdisk Storage Pools with advanced capacity and performance enabled us to reduce our planning cycles from weeks to minutes, while saving 37% in storage costs, compared to persistent disks.” – Dave Rosenthal, CTO, Sentry
“High Availability is essential for Blackline — we run database failover clustering, at massive scale, for our global and mission-critical deployment of Financial Close Management. We are excited to bring this workload to Google Cloud leveraging Hyperdisk Balanced High Availability to meet the performance, capacity, cost efficiency, and resilience requirements that our customers demand, and helps us address our customer’s financial regulatory needs globally.” – Justin Brodley, SVP of Cloud Engineering and Operations, Blackline
Tier-0 databases
For high-end, performance-critical databases like SAP HANA, SQL Server, and Oracle Database, Hyperdisk Extreme delivers uncompromising performance. With Hyperdisk Extreme, you can obtain up to 350K IOPS and 10 GiB/s of throughput from a single volume.
AI, analytics, and scale-out workloads
Hyperdisk offers excellent solutions for the most demanding next-generation machine learning and high performance computing workloads.
Dynamically scaling AI and analytics workloads and high-performance file systems
Workloads with fluctuating demand, and high peak throughput and IOPS, benefit from Hyperdisk Balanced and Storage Pools. These workloads can include customer-managed parallel file systems and scratch disks for accelerator clusters. Storage Pools’ dynamic resource allocation helps ensure that these workloads get the performance they need during peak times without requiring constant manual adjustments or inefficient over-provisioning. Further, once your Storage Pool is set up, planning at the per-disk level is significantly simpler. Note: If you want a fully managed file system, Managed Lustre is an excellent option for you to consider.
“Combining our use of cutting-edge machine learning in quantitative trading at Hudson River Trading (HRT) with Google Cloud’s accelerator-optimized machines, Dynamic Workload Scheduler (DWS) and Hyperdisk has been transformative in enabling us to develop [state-of-the-art] models. Hyperdisk storage pools have delivered substantial cost savings, lowering our storage expenses by approximately 50% compared to standard Hyperdisk while minimizing the amount of planning needed.” – Ragnar Kjørstad, Systems Engineer, Hudson River Trading
AI/ML and HPC data-load acceleration
Hyperdisk ML is specifically optimized for accelerating data load times for inference, training, and HPC workloads; it accelerates model load time by 3-5x compared to common alternatives4. Hyperdisk ML is particularly well-suited for serving tasks compared to other storage services on Google Cloud because it can concurrently provide exceptionally high aggregate throughput to many VMs (up to 1.2 TiB/s of aggregate throughput per volume, more than 100x the performance of competitive offerings)5. You write once (up to 64 TiB per disk) and attach multiple VM instances to the same volume in read-only mode. With Hyperdisk ML you can accelerate data load times for your most expensive compute resources, like GPUs and TPUs. For more, check out g.co/cloud/storage-design-ai.
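As a hedged sketch of this write-once, read-many pattern, the example below attaches an existing, pre-hydrated Hyperdisk ML volume to a VM in read-only mode using the google-cloud-compute client. The disk and instance names are placeholders.

# A minimal sketch of attaching a pre-hydrated Hyperdisk ML volume read-only.
# Resource names below are placeholders.
from google.cloud import compute_v1

project, zone = "your-project", "us-central1-a"

attached = compute_v1.AttachedDisk(
    source=f"projects/{project}/zones/{zone}/disks/model-weights-hdml",
    mode="READ_ONLY",   # many instances can attach the same volume read-only
    auto_delete=False,
)

client = compute_v1.InstancesClient()
op = client.attach_disk(
    project=project,
    zone=zone,
    instance="inference-node-0",
    attached_disk_resource=attached,
)
op.result()
print("Attached model-weights-hdml to inference-node-0")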
“At Resemble AI, we leverage our proprietary deep-learning models to generate high-quality AI audio through text-to-speech and speech-to-speech synthesis. By combining Google Cloud’s A3 VMs with NVIDIA H100 GPUs and Hyperdisk ML, we’ve achieved significant improvements in our training workflows. Hyperdisk ML has drastically improved our data loader performance, enabling 2x faster epoch cycles compared to similar solutions. This acceleration has empowered our engineering team to experiment more freely, train at scale, and accelerate the path from prototype to production.” –Zohaib Ahmed, CEO, Resemble AI
“Abridge AI is revolutionizing clinical documentation by leveraging generative AI to summarize patient-clinician conversations in real time. By adopting Hyperdisk ML, we’ve accelerated model loading speeds by up to 76% and reduced pod initialization times.” – Taruj Goyal, Software Engineer, Abridge
High-capacity analytics workloads:
For large-scale data analytics workloads like Hadoop and Kafka, which are less sensitive to disk latency fluctuations, Hyperdisk Throughput provides a cost-effective solution with high throughput. Its low cost per GiB and configurable throughput are ideal for processing large volumes of data with low TCO.
How to size and set up your Hyperdisk
To select and size the right Hyperdisk volume types for your workload, answer a few key questions:
Storage management. Decide if you want to manage the block storage for your workloads in a pool or individually. If your workload will have more than 10 TiB of capacity in a single project and zone, you should consider using Hyperdisk Storage Pools to lower your TCO and simplify planning. Note that Storage Pools do not affect disk performance; some data protection features such as Replication and High Availability are not supported in Storage Pools.
Latency. If your workload requires SSD-like latency (i.e., sub-millisecond), it likely should be served by Hyperdisk Balanced or Hyperdisk Extreme.
IOPS or throughput. If your application requires less than 160K IOPS or 2.4 GiB/s of throughput from a single volume, Hyperdisk Balanced is a great fit. If it needs more than that, consider Hyperdisk Extreme.
Sizing performance and capacity. Hyperdisk offers independently configurable capacity and performance, allowing you to pay for just the resources you need. You can leverage this capability to lower your TCO by understanding how much capacity your workload needs (i.e., how much data, in GiB or TiB, is stored on the disks which serve this workload) and the peak IOPS and throughput of the disks. If the workload is already running on Google Cloud, you can see many of these metrics in your console under “Metrics Explorer.”
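If you prefer to pull those metrics programmatically rather than through Metrics Explorer, a minimal sketch using the google-cloud-monitoring client might look like the following. The project and lookback window are placeholders, and disk read ops is just one of the metrics you would examine when sizing.

# A minimal sketch of pulling recent disk read-ops metrics from Cloud Monitoring
# to help size Hyperdisk performance. Project and window are placeholders.
import time
from google.cloud import monitoring_v3

project_name = "projects/your-project"
client = monitoring_v3.MetricServiceClient()

now = time.time()
interval = monitoring_v3.TimeInterval(
    {"end_time": {"seconds": int(now)},
     "start_time": {"seconds": int(now - 24 * 3600)}}  # last 24 hours
)

results = client.list_time_series(
    request={
        "name": project_name,
        "filter": 'metric.type = "compute.googleapis.com/instance/disk/read_ops_count"',
        "interval": interval,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    }
)

for series in results:
    # read_ops_count is a delta count per sampling interval.
    peak = max(point.value.int64_value for point in series.points)
    print(series.resource.labels.get("instance_id"), "peak read ops per interval:", peak)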
Another important consideration is the level of business continuity and data protection required for your workloads. Different workloads have different Recovery Point Objective (RPO) and Recovery Time Objective (RTO) requirements, each with different costs. Think about your workload tiers when making data-protection decisions. The more critical an application or workload, the lower the tolerance for data loss and downtime. Applications critical to business operations likely require zero RPO and RTO in the order of seconds. Hyperdisk business continuity and data protection helps customers meet the performance, capacity, cost efficiency, and resilience requirements they demand, and helps them address their financial regulatory needs globally.
Here are a few questions to consider when selecting which variety of Hyperdisk to use for a workload:
How do I protect my workloads from attacks and malicious insiders? Use Google Cloud Backup vault for cyber resilience, backup immutability, and indelibility, along with managed backup reporting and compliance. If you want to self-manage your own backups, Hyperdisk standard snapshots are an option for your workloads.
How do I protect data from user errors and bad upgrades cost efficiently with low RPO / RTO? You can use our point-in-time recovery with Instant Snapshots. This feature minimizes the risk of data loss from user error and bad upgrades with ultra-low RPO and RTO — creating a checkpoint is nearly instantaneous.
How do I easily deploy my critical workload (e.g., MySQL) with resilience across multiple locations? You can utilize Hyperdisk HA. This is a great fit for scenarios that require high availability and fast failover, such as SQL Server that leverages failover clustering. For such workloads, you can also choose our new capability with Hyperdisk Balanced High Availability with Multi-Writer support. This allows you to run clustered compute with workload-optimized storage in two zones with RPO=0 synchronous replication.
When a disaster occurs, how do I recover my workload elsewhere quickly and reliably, and run drills to confirm my recovery process? Utilize our disaster recovery capabilities with Hyperdisk Async Replication, which enables cross-region continuous replication and recovery from a regional failure, with fast validation support for disaster recovery drills via cloning. Further, consistency group policies help ensure that workload data distributed across multiple disks is recoverable when a workload needs to fail over between regions.
In short, Hyperdisk provides a wealth of options to help you optimize your block storage to the needs of your workloads. Further, selecting the right Hyperdisk and leveraging features such as Storage Pools can help you lower your TCO and simplify management. To learn more, please visit our website. For tailored recommendations, always consult your Google Cloud account team.
1. As of March 2025, based on published information for Amazon EBS and Azure managed disks.
2. As of May 2025, compared to the maximum IOPS per volume of Amazon EBS gp3 volumes.
3. As of March 2025, at list price, for 50 to 150 TiB, peak IOPS of 25K to 75K, and 25% compressibility, compared to Amazon EBS gp3 volumes.
4. As of March 2025, based on internal Google benchmarking, compared to Rapid Storage, GCSFuse with Anywhere Cache, Parallelstore, and Lustre for larger node sizes.
5. As of March 2025, based on published performance for Microsoft Azure Ultra SSD and Amazon EBS io2 Block Express.
The authors would like to thank David Seidman and Ruwen Hess for their contributions on this blog.
Today, we’re excited to announce the preview of our new G4 VMs based on NVIDIA RTX PRO 6000 Blackwell Server edition — the first cloud provider to do so. This follows the introduction earlier this year of A4 and A4X VMs powered by NVIDIA Blackwell GPUs, designed for large-scale AI training and serving. At the same time, we’re also seeing growing demand for GPUs to power a diverse range of workloads and data formats. G4 VMs round out our 4th generation NVIDIA GPU portfolio and bring a new level of performance and flexibility to enterprises and creators.
G4 VMs combine eight NVIDIA RTX PRO 6000 GPUs, two AMD Turin CPUs, and Google Titanium offloads:
RTX PRO 6000 Blackwell GPUs provide new fifth-generation Tensor Cores, second-generation Transformer Engine supporting FP6 and FP4 precision, fourth-generation Ray Tracing (RT) Cores, and Multi-Instance GPU (MIG) capabilities, delivering 4x the compute and memory, and 6x memory bandwidth compared to G2 VMs.
Turin CPUs offer up to 384 vCPUs and 1.4 TB of DDR5 memory, for a ratio of 48 vCPUs per GPU. This supports embedding models that precompute features on the CPUs, as well as graphics workloads where the CPU helps orchestrate simulations.
Titanium offloads provide dedicated network processing with up to 400 Gbps of bandwidth, 4x faster than in G2 VMs.
The G4 VM can power a variety of workloads, from cost-efficient inference to advanced physical AI, robotics simulation, generative AI-enabled content creation, and next-generation game rendering. For example, with advanced ray-tracing cores that simulate the physical behavior of light, NVIDIA RTX PRO 6000 Blackwell provides over 2x the performance of the prior generation, enabling hyper-realistic graphics for complex, real-time rendering. For demanding graphics and physical AI-enabled applications, the ability to run NVIDIA Omniverse workloads natively unlocks new possibilities for the manufacturing, automotive, and logistics industries, where digital twins and real-time simulation are rapidly transforming operations. G4 VMs also support the NVIDIA Dynamo inference framework to enable high-throughput, low-latency AI inference for generative models at scale.
Customers across industries — from media and entertainment to manufacturing, automotive, and gaming — are onboarding to use G4 VMs to accelerate AI-powered content creation, advanced simulation, and high-performance visualization:
“Our initial tests of the G4 VM show great potential, especially for self-hosted LLM inference use cases. We are excited to benchmark the G4 VM for a variety of other ranking workloads in the future.” – Vinay Kola, Snap, Senior Manager, Software Engineering
Altair is going to help customers accelerate their computer aided engineering (CAE) workloads with the performance and large memory of Google Cloud’s G4 instances.
Ansys will help its customers leverage Google Cloud’s G4 instances to accelerate their simulation workloads.
AppLovin is excited to use G4 for ad serving and recommendations.
WPP is excited to use G4 to continue ground-breaking work with physically-accurate generative AI and robotics simulation.
Nuro is looking to run drive simulations on G4 via NVIDIA Omniverse.
A major player in the video game industry is looking to use G4 for their next generation game rendering.
G4 VMs provide 768 GB of GDDR7 memory and 384 vCPUs with 12 TiB of Titanium local SSD, extensible with up to 512 TiB of Hyperdisk network block storage. For design and simulation workloads, G4 VMs support third-party engineering and graphics applications like Altair HyperWorks, Ansys Fluent, Autodesk AutoCAD, Blender, Dassault SolidWorks, and Unity.
G4 VMs are available as part of AI Hypercomputer, Google Cloud’s fully integrated AI supercomputing system, and work natively with Google Cloud services like Google Kubernetes Engine, Google Cloud Storage, and Vertex AI. Many customers use a combination of services such as Vertex AI or GKE with NVIDIA GPUs on Google Compute Engine and Google Cloud HyperdiskML for AI inference. Hyperdisk provides ultra-low latency and supports up to 500K IOPS and 10,000 MiB/s throughput per instance — making it well-suited for demanding inference workloads.
Machine Type       GPUs   GPU Memory (GB)   vCPUs   Host Memory (GB)   Local SSD (GB)
g4-standard-384    8      768               384     1,440              12,000
G4 is currently in preview and will be available globally by the end of the year. Reach out to your Google Cloud Sales representative to learn more.
At Google Cloud, we’re committed to providing the most streamlined, powerful, and cost-effective production- and enterprise-ready serverless Spark experience. To that end, we’re thrilled to announce a significant evolution for Apache Spark on Google Cloud, with Google Cloud Serverless for Apache Spark.
Serverless Spark is now also generally available directly within the BigQuery experience. This deeply integrated experience brings the full power of Google Cloud Serverless for Apache Spark into the BigQuery unified data-to-AI platform, offering a unified developer experience in BigQuery Studio, seamless interoperability, and industry-leading price/performance.
Why Google Cloud Serverless for Apache Spark?
Apache Spark is an incredibly popular and powerful open-source engine for data processing, analytics and AI/ML. However, developers often get bogged down managing clusters, optimizing jobs, and troubleshooting, taking valuable time away from building business logic.
By simplifying your Spark experience, you can focus on deriving insights, not managing infrastructure. Google Cloud Serverless for Apache Spark (formerly Dataproc Serverless) addresses these challenges with:
On-demand Spark for reduced total cost of ownership (TCO):
No cluster management. Develop business logic in Spark for interactive, batch, and AI workloads, without worrying about infrastructure.
Pay only for the job’s runtime, not for environment spinup/teardown.
On-demand Spark environments, so no more long running, under-utilized clusters.
Exceptional performance:
Support for Lightning Engine (in Preview), a Spark processing engine with vectorized execution, intelligent caching, and optimized storage I/O, for up to 3.6x faster query performance on industry benchmarks*
Popular ML libraries like XGBoost, PyTorch, Transformers, and many more, all pre-packaged with Google-certified serverless Spark images, boosting productivity, improving startup times, and reducing potential security issues from custom image management
GPU acceleration for distributed training and inference workloads
Enterprise-grade security capabilities:
No SSH access to VMs
Encryption by default, including support for Customer Managed Encryption Keys (CMEK)
A Unified Spark and BigQuery experience
Building on the power of serverless Spark, we’ve reimagined how you work with Spark and BigQuery, giving you the flexibility to use the right engine for the right job on a unified platform, with a shared notebook interface and a single copy of data.
With the general availability of serverless Apache Spark in BigQuery, we’re bringing Apache Spark directly into the BigQuery unified data platform. This means you can now develop, run and deploy Spark code interactively in the BigQuery Studio, offering an alternative, scalable, OSS processing framework alongside BigQuery’s renowned SQL engine.
“We rely on machine learning for connecting our customers with the greatest travel experiences at the best prices. With Google Serverless for Apache Spark, our platform engineers save countless hours configuring, optimizing, and monitoring Spark clusters, while our data scientists can now spend their time on true value-added work like building new business logic. We can seamlessly interoperate between engines and use BigQuery, Spark and Vertex AI capabilities for our AI/ML workflows. The unified developer experience across Spark and BigQuery, with built-in support for popular OSS libraries like PyTorch, Tensorflow, Transforms etc., greatly reduces toil and allows us to iterate quickly.” – Andrés Sopeña Pérez, Head of Content Engineering, trivago
Key capabilities and benefits of Spark in BigQuery
Apart from all the features and benefits of Google Cloud Serverless for Apache Spark outlined above, Spark in BigQuery offers deep unification:
1. Unified developer experience in BigQuery Studio:
Develop SQL and Spark code side-by-side in BigQuery Studio notebooks.
Leverage Gemini-based PySpark code generation (Preview), which uses the context of your data to reduce hallucinations in generated code.
Use Spark Connect for remote connectivity to serverless Spark sessions.
Because Spark permissions are unified with default BigQuery roles, you can get started without needing additional permissions.
2. Unified data access and engine interoperability:
Powered by the BigLake metastore, Spark and BigQuery can operate on a single copy of your data, whether it’s BigQuery managed tables or open formats like Apache Iceberg. No more juggling separate security policies or data governance models across engines. Refer to the documentation on using BigLake metastore with Spark.
Additionally, all data access to BigQuery, for both native and OSS formats, is unified via the BigQuery Storage Read API. Reads from serverless Spark jobs via the Storage API are now available at no additional cost.
3. Easy operationalization:
Collaborate with your team and integrate into your Git-based CI/CD workflows using BigQuery repositories.
In addition to functional unification, BigQuery spend-based CUDs now apply to all usage from serverless Spark jobs. For more information about serverless Spark pricing, please visit our pricing page.
You can create a default Spark session with a single line of code, as shown below.
from google_spark_session.session.spark.connect import DataprocSparkSession

# This line creates a default serverless Spark session powered by Google Cloud Serverless for Apache Spark
spark = DataprocSparkSession.builder.getOrCreate()

# Now you can use the 'spark' variable to run your Spark code
# For example, reading a BigQuery table:
df = spark.read.format("bigquery") \
    .option("table", "your-project.your_dataset.your_table") \
    .load()
df.show()
Customizing your Spark session: If you want to customize your session — for example, use a different VPC network, or a service account — you can get full control over the session’s configuration, using existing session templates or by providing configurations inline. For detailed instructions on configuring your Spark sessions, reading from and writing to BigQuery, and more, please refer to the documentation.
And that’s it, you are now ready to develop your business logic using the Spark session.
The bigger picture: A unified and open data cloud
With Google Cloud Serverless for Apache Spark and its new, deep integration with BigQuery, we’re breaking down barriers between powerful analytics engines, enabling you to choose the best tool for your specific task, all within a cohesive and managed environment.
We invite you to experience the power and simplicity of Google Cloud Serverless for Apache Spark and its new, deep integration with BigQuery.
We are incredibly excited to see what you will build. Stay tuned for more innovations as we continue to enhance Google Cloud Serverless for Apache Spark and its integrations across the Google Cloud ecosystem.
* The queries are derived from the TPC-H standard and as such are not comparable to published TPC-H standard results, as these runs do not comply with all requirements of the TPC-H standard specification.
This year, we’ve spent dozens of hours synthesizing hundreds of conversations with CXOs across leading organizations, trying to uncover their biggest thorns when it comes to building Multi-Agent Systems (MAS).
These conversations have revealed a clear pattern: MAS is helping enterprises re-think clunky legacy processes, but many CXOs are focused on automating those legacy processes rather than reimagining them. Plus, ethical risks are front and center – how do you balance innovation and ethical planning? How do CXOs take advantage of everything that’s available now, without uprooting their entire organization?
Today, we’ll explore some common missteps in the field, top questions executives have, and insights to move forward on adopting MAS today.
Quick recap: What’s the value of MAS?
MAS involves teams of coordinated AI agents working together to achieve multifaceted business goals. For example, when resolving complex customer issues, specialist agents (such as billing, usage, promotions) are managed by a coordinator agent. This orchestrator ensures that the overall resolution is driven by business logic and aligns with enterprise policies.
MAS is now transitioning from a conceptual promise to practical application. In contact centers, an orchestrator agent can analyze complex, multi-part customer queries and dynamically engage the right specialists, along with validation agents to ensure accuracy and compliance. This approach significantly improves first-contact resolution for intricate issues and increases call containment, thereby reducing the need to escalate to live agents.
Similar collaborative agent strategies are emerging across industries, such as supply chain optimization and complex research, which demonstrate MAS’s power to handle complexity through coordinated, intelligent action.
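To make the coordinator/specialist pattern described above concrete, here is a purely illustrative Python toy: a coordinator that routes a multi-part query to stubbed specialist agents and merges their answers. The agent names and keyword routing are simplifications for the sketch, not a production orchestration approach.

# Illustrative only: a toy coordinator routing to specialist agents.
def billing_agent(query):
    return "billing: last invoice was $42.10"

def usage_agent(query):
    return "usage: 18.2 GB of 20 GB used this cycle"

def promotions_agent(query):
    return "promotions: eligible for the loyalty data add-on"

SPECIALISTS = {
    "billing": billing_agent,
    "usage": usage_agent,
    "promotions": promotions_agent,
}

def coordinator(query):
    # A real orchestrator would use an LLM plus business rules to pick specialists;
    # simple keyword routing keeps this sketch self-contained.
    selected = [name for name in SPECIALISTS if name in query.lower()] or ["billing"]
    answers = [SPECIALISTS[name](query) for name in selected]
    return " | ".join(answers)

print(coordinator("Why is my billing higher this month, and what's my data usage?"))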
3 common missteps from the field
Misstep 1: Automating old processes instead of reimagining them
Applying MAS to automate existing processes severely limits its transformative potential. Real value comes from rethinking workflows to leverage MAS for dynamic and holistic-problem solving. A strong partnership between technical and business teams is essential to challenge the status quo. Customers are transitioning from bouncing customers between departments to answer complex queries, to empowering each department to answer questions more quickly, to ultimately consolidating everything into one MAS-driven department with oversight.
A key point to remember is that even though we are reimagining our current process, this doesn’t mean we need to do everything at once. If we want to increase the number of calls routed to a virtual agent, we should first identify the initial tranche of calls to address. Then, we can incrementally expand the types or topics the virtual agent can handle to ensure customer satisfaction and maintain overall support quality.
For example, this is how we sequentially move through the key steps of a multi-agent system program.
Misstep 2: Under-resourcing agent collaboration design
A critical error is under-resourcing the design of agent collaboration, particularly in defining roles, communication protocols, and conflict-resolution strategies.
As MAS evolves, it’s increasingly important to know what, when, and why a specialist agent should be engaged. But how do you validate this orchestration logic? Through rigorous testing using ground truth evaluation and high-quality test data.
Customers that succeed in this area have a clear understanding of what “good” versus “bad” answers look like across different question types. These examples are critical in building agents that can determine which tools, other agents, services, verbosity, tonality, and format to use when providing a response.
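As a hedged illustration of the ground-truth evaluation described above, the toy harness below scores a stand-in routing function against a small golden set. The questions, labels, and routing logic are all hypothetical.

# Illustrative only: scoring an orchestrator's routing decisions against
# ground-truth labels.
golden_set = [
    {"question": "Why is my bill higher this month?", "expected_agent": "billing"},
    {"question": "How much data have I used?", "expected_agent": "usage"},
    {"question": "Do I qualify for the spring promotion?", "expected_agent": "promotions"},
]

def route(question):
    # Stand-in for the real orchestration logic under test.
    q = question.lower()
    if "bill" in q:
        return "billing"
    if "data" in q or "used" in q:
        return "usage"
    return "promotions"

correct = sum(1 for case in golden_set if route(case["question"]) == case["expected_agent"])
print(f"routing accuracy: {correct}/{len(golden_set)}")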
Misstep 3: Delaying governance and ethical planning
Treating governance, ethics, and monitoring as afterthoughts invites significant risks, such as program delays, bias amplification, and critical policy gaps. The best way to avoid these risks with MAS is to embed responsible AI principles from the start, including establishing clear rules, audit trails, and transparency. The old adage “move slow to move fast” becomes more relevant as complexity increases.
For example, if bias monitoring is not considered until late in deployment, a virtual agent on an e-commerce platform might put too much weight on a customer’s zip code, displaying higher-priced products to those in wealthier areas and budget options to customers in lower-income zip codes. This could create an unfair shopping experience, where certain groups feel excluded or underserved, ultimately harming the brand’s reputation. The result is rework, redesign, and the need to roll back updates and repeat the solution design and testing processes, which can add upwards of six months of work.
These concepts and the teams responsible for them must be incorporated from day 1 of a MAS project.
Top 3 questions from the field
Question 1: “Beyond cost savings, how do we measure ROI?”
We focus on tracking improved outcomes for complex tasks, enhancing customer experience, reducing manual risks, and driving new revenue streams. For instance, an analyst assistant can support a wealth manager by providing instant insights into complex financial data, identifying key trends, and generating customized reports. This frees the wealth manager to engage more meaningfully with clients, ask targeted follow-up questions, and ultimately build stronger relationships. As a result, MAS improves customer retention, increases wallet share, and minimizes the risk of misinterpreting critical financial information.
Question 2: “How do we balance human oversight with autonomous agents?”
MAS isn’t about replacing humans; rather, it’s about focusing human skills where they have the most impact. Humans excel at navigating ambiguity, ethics, and novelty. In one real-world scenario, AI handles complex offers but escalates edge cases, such as price-matching a competitor’s promotion, to a human for final judgment. The key is ensuring that your use case and desired outcomes drive the solution. Not the other way around!
Question 3: “How can I predict outcomes and address ethical risks?”
Achieving successful outcomes in MAS requires thoughtful design, which starts with asking the right questions: What happens when a customer interacts with the system? What information is needed to answer their questions? Where should human oversight be applied, and how do we evaluate and monitor performance both in testing and production environments? To ensure reliability, we conduct a variety of tests with our customers, including load testing, accuracy and quality testing, red teaming, and user acceptance testing. This rigorous approach, combined with continuous monitoring, helps identify and correct unintended behaviors and ensures that the system performs as expected. Additionally, we proactively mitigate ethical risks such as bias amplification, unfairness, and accountability gaps by embedding rules, ensuring transparency and auditability, and assigning clear roles for both agents and humans.
This diagram depicts the MAS Ethical Lifecycle, showing the interconnected stages of Agent Design, Interaction and Coordination, Deployment and Operation, Human-AI Orchestration, and Continuous Improvement, all guided by fundamental ethical considerations.
Get started
Based on these field insights, consider prioritizing the following:
Develop a MAS strategy: Start small, think big
Prioritize governance, ethics and trust from day one
Foster a collaborative culture that puts your user first: IT and business unite
Datadog and Google Cloud have long provided customers with powerful capabilities that enable performant, scalable, and differentiated applications in the cloud; in the past two years alone, Datadog’s revenue on Google Cloud Marketplace has more than doubled. As these customers bring Google Cloud’s AI capabilities into their technology stacks, they require observability tools that allow them to better troubleshoot errors, optimize usage, and improve product performance.
Today, Datadog is announcing expanded AI monitoring capabilities with Vertex AI Agent Engine monitoring in its new AI Agents Console. This new feature joins a large and growing set of Google Cloud AI monitoring capabilities that allow joint customers to better innovate and optimize product performance across the AI stack.
Full-stack AI observability
With this extensive set of AI observability capabilities, Datadog customers with workloads on Google Cloud have enhanced visibility into all the layers of an AI application.
Application layer: As businesses adopt autonomous agents to power key workflows, visibility and governance become critical. Datadog’s new AI Agents Console now supports monitoring of agents deployed via Google’s Vertex AI Agent Engine, providing customers with a unified view of the actions, permissions, and business impact of third-party agents — including those orchestrated by Agent Engine.
Model layer: Datadog LLM Observability allows users to monitor, troubleshoot, improve and secure their large language model (LLM) applications. Earlier this year, Datadog introduced auto-instrumentation for Gemini models and LLMs in Vertex AI, which allows teams to start monitoring quickly, minimizing setup work and jumping right into troubleshooting efforts.
Infrastructure layer: In February, Datadog announced a new integration with Cloud TPU, allowing customers to monitor utilization, resource usage, and performance at the container, node, and worker levels. This helps customers rightsize TPU infrastructure and balance training performance with cost.
Data layer: Many Google Cloud customers use BigQuery for data insights. Datadog’s expanded BigQuery monitoring capabilities — launched at Google Cloud Next — help teams optimize costs by showing BigQuery usage per user and project, identifying top spenders and slow queries. It also flags failed jobs for immediate action and identifies data quality issues.
Optimize monitoring costs
Datadog has regularly invested in optimizing the cost of its Google Cloud integrations. Datadog customers can now use Google Cloud’s Active Metrics APIs, so Datadog only calls Google Cloud APIs when there is new data, significantly reducing API calls and associated costs without sacrificing visibility. This joins Datadog’s support for Google Cloud’s Private Service Connect, which helps Datadog users running on Google Cloud reduce data transfer costs, as another key tool for optimizing monitoring costs without reducing visibility.
Get started today
Datadog’s unified observability and security platform offers a powerful advantage for organizations that want to use Google Cloud’s cutting-edge AI services. By monitoring the full Google Cloud stack across a breadth of telemetry types, Datadog gives Google Cloud customers the tools and insights they need to build more performant, cost-efficient, and scalable applications.
Ready to try it for yourself? Purchase Datadog directly from the Google Cloud Marketplace and start monitoring your environment in minutes. And if you’re in the New York area, you can see some of these new capabilities in action by visiting the Google Cloud booth at Datadog’s annual conference DASH from June 10-11.
In the dynamic world of beauty retail, staying ahead requires more than just the hottest trends — it demands agility, data-driven insights, and seamless customer experiences. Ulta Beauty, a leader in the beauty and wellness industry, understands this.
Building on the success of modernizing its e-commerce platform with Google Kubernetes Engine (GKE), Ulta Beauty partnered with Google Cloud, Accenture, IBM and Infosys to embark on a comprehensive digital transformation, redefining the beauty retail experience.
Two key initiatives were at the heart of this makeover: Darwin, an enterprise data warehouse transformation, and MIA, a mobile inventory application.
A Foundation Built on Agility: The GKE Advantage
Ulta Beauty’s transformation began with a foundational shift to the cloud. In 2019, recognizing the limitations of its existing e-commerce infrastructure, the company migrated to GKE, embracing a containerized, microservices architecture. This strategic move provided the agility and scalability essential for supporting Ulta Beauty’s rapidly growing online presence and laid the groundwork for further innovation.
By adopting GKE, Ulta Beauty gained a more flexible and resilient platform, enabling the company to respond more effectively to changing market demands, seasonal traffic spikes, and customer expectations. This initial success with GKE instilled confidence in Google Cloud’s capabilities, paving the way for more ambitious modernization projects.
Darwin: Unleashing the Power of Data with BigQuery
Ulta Beauty recognized the need to modernize its analytics capabilities to keep pace with its growing data volume and complexity. “We wanted everything in one location, to get rid of manual tasks and to take the next step on the analytics curve,” explained Mac Coyle, director of Cloud Engineering at Ulta Beauty. Slow query performance, data silos, and limited access for business users hindered timely insights and agile decision-making.
Ulta Beauty found the solution in Google BigQuery, the foundation for its new analytics platform, Darwin. BigQuery’s serverless architecture, scalability, and performance provided the necessary ingredients for a data-driven transformation. Partnering with Accenture, Ulta Beauty migrated over 300 datasets and developed 50 core enterprise reports. Infosys played a key role in integrating Darwin with various systems, including S4 and legacy applications, ensuring seamless data flow and accelerating the development of critical reports.
“The opportunity to drive innovation is boundless when everything is centralized in one place,” says Coyle. “With Darwin, our teams are empowered with access to timely, actionable data, driving more informed decision-making across the enterprise.” Darwin now provides store managers and business leaders with real-time dashboards showing key performance indicators, enabling them to make data-driven decisions on the spot.
A unified platform, ready for the demands of AI, was the driving force behind Darwin’s development. “We built Darwin not just for today’s analytics needs, but for tomorrow’s AI-powered possibilities,” says Krish Das, VP of Enterprise Data and AI Officer at Ulta Beauty. This ensures data is ready for advanced analytics, machine learning, and personalization, positioning the company for continued growth.
MIA: Empowering Store Associates with Modern Inventory Management
Ulta Beauty also sought to modernize its inventory management system to empower store associates and enhance the guest experience.
Alongside its ERP upgrade, Project SOAR (“Strengthen, Optimize, Accelerate, Renew”), Ulta Beauty called upon Accenture to reimagine its inventory management processes along with partners Infosys and IBM to develop MIA (Mobile Inventory Application). Infosys played a vital role in developing and implementing MIA, building the real-time integrations with S/4HANA and optimizing the store rollout process, and now provides ongoing support and development for the application. MIA is a native mobile application built on GKE, Google Cloud Storage (GCS), and MongoDB.
“With MIA, we saw a double-digit reduction in the number of clicks throughout the application,” explains Natalie Fong, Senior Director of Business Initiatives and Transformation at Ulta Beauty. This streamlined approach translates to significant time savings, allowing associates to focus on delivering exceptional, personalized guest experiences.
Fong also highlighted broader time savings from streamlined processes, such as the paperless procurement process and a centralized supplier portal. Key MIA features include real-time inventory lookups, streamlined receiving, efficient cycle counting, mobile access to product information, and easy price label generation and store transfers. Now, associates are equipped with real-time data at their fingertips, enhancing their ability to quickly and accurately assist guests.
The Power of Partnership: A Collaborative Approach to Transformation
Ulta Beauty’s digital transformation has been a collaborative journey. Accenture played a key role in the Darwin implementation and ERP upgrade, while IBM led the development of MIA and Infosys provided crucial integration expertise for both initiatives.
“We couldn’t have achieved this transformation without the close partnership of Google Cloud, Accenture, and IBM,” says Krish Das, VP of Enterprise Data and AI Officer, Ulta Beauty. “Key to our success was our ability to combine our expertise and work together seamlessly to deliver the best solutions for Ulta Beauty.”
This close collaboration, including joint development efforts between Google Cloud, Accenture, IBM and Infosys, was imperative for aligning the technical aspects of both projects and ensuring cohesive outcomes.
A Vision for the Future: Data-Driven Beauty at Scale
Darwin and MIA, developed in close collaboration with Google Cloud, Accenture, IBM and Infosys, represent a significant leap forward in Ulta Beauty’s data-driven journey. These initiatives have not only delivered real-time insights and streamlined operations but also built a robust, AI-ready data foundation to innovate upon. Now, with the power of Google Cloud, including generative AI capabilities like Gemini, Ulta Beauty is poised to unlock even greater possibilities at the forefront of modern retail, and is ready to redefine the beauty industry.
Ready to modernize your data analytics and build a foundation for AI? Learn more about Google Cloud’s BigQuery Migration Services.
Today, we’re introducing Pub/Sub Single Message Transforms (SMTs) to make it easy to perform simple data transformations right within Pub/Sub itself.
This comes at a time when businesses are increasingly reliant on streaming data to derive real-time insights, understand evolving customer trends, and ultimately make critical decisions that impact their bottom line and strategic direction. In this world, the sheer volume and velocity of streaming data present both opportunities and challenges. Whether you’re generating and analyzing data, ingesting data from another source, or syndicating your data for others to use, you often need to perform transforms on that data to match your use case. For example, if you’re providing data to other teams or customers, you may need to redact personally identifiable information (PII) from messages before sharing them. And if you’re using data you generated or sourced from somewhere else – especially unstructured data – you may need to perform data format conversions or other types of data normalization.
Traditionally, the options for these simple transformations within a message involve either altering the source or destination of the data (which may not be an option) or using an additional component like Dataflow or Cloud Run, which incurs additional latency and operational overhead.
Pub/Sub SMTs
An overarching goal of Pub/Sub is to simplify streaming architectures. We already greatly simplified data movement with Import Topics and Export Subscriptions, which removed the need for additional services when ingesting raw streaming data through Pub/Sub into destinations like BigQuery. Pub/Sub Single Message Transforms (SMTs) are designed to be a suite of features that make it easy to validate, filter, enrich, and alter individual messages as they move in real time.
The first SMT is available now: JavaScript User-Defined Functions (UDFs), which let you perform simple, lightweight modifications to message attributes and/or data directly within Pub/Sub via snippets of JavaScript code.
Key examples of such modifications include:
Simple transforms: Perform common single-message transforms such as data format conversion, casting, or adding a new composite field.
Enhanced filtering: Filter based on message data (not just attributes), including regular-expression-based filters.
Data masking and redaction: Safeguard sensitive information by employing masking or redaction techniques on fields containing PII.
In order to stay true to Pub/Sub’s objective of decoupling publishers and subscribers, UDF transforms can be applied independently to a topic, a subscription, or both based on your needs.
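To make the data masking and redaction example above concrete, here is a minimal, illustrative sketch of a redaction UDF. The function name, the (message, metadata) signature, and the behavior of returning null to drop a message are assumptions based on the Single Message Transform overview rather than a verbatim reference, so confirm the exact UDF contract in the documentation before relying on it.
# Write an illustrative redaction UDF to a local file (hypothetical example).
# Assumption: the UDF receives a message object with a string `data` field and an
# `attributes` map, and returns the (possibly modified) message, or null to drop it.
cat > redact_ssn_udf.js <<'EOF'
function redactSSN(message, metadata) {
  // Parse the message payload, remove a sensitive field, and re-serialize it.
  var payload = JSON.parse(message.data);
  delete payload.ssn;
  message.data = JSON.stringify(payload);
  return message; // return null instead to filter the message out entirely
}
EOF
You would then attach a function like this to a topic or subscription as an SMT, for example through the “Add Transform” option in the console described below.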
JavaScript UDFs in Pub/Sub provide three key benefits:
Flexibility: JavaScript UDFs give you complete control over your transformation logic, catering to a wide variety of use cases, helping deliver a diverse set of transforms.
Simplified pipelines: Transformations happen directly within Pub/Sub, eliminating the need to maintain extra services or infrastructure for data transformation.
Performance: End-to-end latencies are improved for streaming architectures, as you avoid the need for additional products for lightweight transformations.
Pub/Sub JavaScript UDF Single Message Transforms are easy to use. You can add up to five JavaScript transforms on the topic and/or subscription. If a Topic SMT is configured, Pub/Sub transforms the message with the SMT logic and persists the transformed message. If a subscription SMT is configured, Pub/Sub transforms the message before sending the message to the subscriber. In the case of an Export Subscription, the transformed message gets written to the destination. Please see the Single Message Transform overview for more information.
Getting started with Single Message Transforms
JavaScript UDFs, the first Single Message Transform, are generally available starting today for all users. You’ll find the new “Add Transform” option in the Google Cloud console when you create a topic or subscription in your Google Cloud project. You can also use the gcloud CLI to start using JavaScript Single Message Transforms today.
We plan to launch additional Single Message Transforms in the coming months such as schema validation/encoding SMT, AI Inference SMT, and many more, so stay tuned for more updates on this front.
Today, we’re excited to announce the general availability of our newest memory-optimized machine series: Compute Engine M4, our most performant memory-optimized VM for workloads requiring under 6TB of memory.
The M4 family is designed for workloads like SAP HANA, SQL Server, and in-memory analytics that benefit from a higher memory-to-core ratio. The M4 is based on Intel’s latest 5th generation Xeon processors (code-named Emerald Rapids), with instances scaling up to 224 vCPUs and 6TB of DDR5 memory. M4 offers two memory-to-vCPU ratios, so you can choose the right shape when upgrading your memory-optimized infrastructure. Predefined shapes come in 13.3:1 and 26.6:1 memory-to-core ratios, with instance sizes ranging from 372GB to 6TB, and complete SAP HANA certification across all shapes and sizes.
M4 VMs are also engineered and fine-tuned to deliver consistent performance, with up to 66% better price-performance compared to our previous memory-optimized M3¹. The M4 outperforms the M3 with up to 2.25x² more SAPS, a substantial improvement in overall performance. Additionally, M4 delivers up to 2.44x better price-performance compared to the M2³. To support customers’ most business-critical workloads, M4 offers enterprise-grade reliability and granular controls for scheduled maintenance, and is backed by Compute Engine’s Memory Optimized 99.95% Single Instance SLA — important for business-critical in-memory database workloads such as SAP.
“We are excited to announce our collaboration with Google Cloud to bring the power of the 5th Gen Intel Xeon processors to the first memory-optimized (M4) instance type among leading hyperscalers. This launch represents a significant milestone in delivering cutting-edge performance, scalability, and efficiency to cloud users for large-scale databases such as SAP Hana and memory-intensive workloads. The new M4 instance delivers advanced capabilities for today and future workloads, empowering businesses to innovate and grow in the digital era.” – Rakesh Mehrotra, VP & GM DCAI Strategy & Product Management, Intel
A full portfolio of memory-optimized machine instances
M4 is just the latest in a long line of Compute Engine’s memory-optimized VM family. We introduced the M1 in 2018 for SAP HANA. M2 followed in 2019, supporting larger workloads. In 2023, we introduced M3, with improved performance and new features. X4 launched in 2024, supporting the largest in-memory databases, with up to 32TB of memory, making Google Cloud the first hyperscaler with an SAP-certified instance of that size.
“For years, SAP and Google Cloud have had a powerful partnership, helping businesses transform with RISE with SAP on Google Cloud. Now, fueled by the enhanced performance, high reliability, and cost efficiency of M4 machines, we’re accelerating our mission to deliver even greater value to our shared customers.” – Lalit Patil, CTO for RISE with SAP, Enterprise Cloud Services, SAP SE
Today, both customers and internal Google teams are adopting the M4 to take advantage of increased performance, new shapes, and Compute Engine’s newest innovations.
Powered by Titanium
M4 is underpinned by Google’s Titanium offload technology, enabling ultra-low latency with up to 200 Gb/s of networking bandwidth. By offloading storage and networking to the Titanium adapter, host resources are preserved for running your workloads. Titanium also provides M4 with enhanced lifecycle management, reliability, and security. With Titanium’s hitless upgrades and live migration capabilities, most infrastructure maintenance can be performed with minimal to no disruption, helping to ensure predictable performance. Additionally, Titanium’s custom-built security hardware root-of-trust further strengthens the security of customer workloads.
Next-level storage with Hyperdisk
M4 VMs come with the latest Hyperdisk storage technology, now available in both Hyperdisk Balanced and Hyperdisk Extreme options. With up to 320K IOPS per instance, Hyperdisk Balanced delivers a blend of performance and cost-efficiency for a wide range of workloads, handling typical transactional throughput and moderate query volumes effectively. Hyperdisk Extreme pushes the boundaries of storage performance, up to 500K IOPS and up to 10,000 MiB/s of throughput per M4 instance for the most demanding applications such as SAP HANA’s in-memory database operations, which require low-latency access to large datasets. You can attach up to 64 Hyperdisk volumes per M4 VM, with up to 512 TiB of total capacity, with a mix of Balanced and Extreme volumes.
Hyperdisk’s benefits go beyond raw performance. It allows you to dynamically tune IOPS and bandwidth in real time, so your workloads consistently have the resources they need. Hyperdisk storage pools, available for Hyperdisk Balanced volumes, support capacity pooling and flexible allocation of storage resources, optimizing both utilization and cost-efficiency. As a result, Hyperdisk delivers not only high performance and flexibility but also a significant reduction in total cost of ownership (TCO) compared to traditional storage solutions. The combination of Hyperdisk’s advanced features and Titanium’s storage acceleration offloads storage processing from the CPU, frees up compute resources, and enhances overall M4 performance.
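As a rough illustration of that dynamic tuning, the sketch below adjusts the provisioned IOPS and throughput of an existing Hyperdisk Balanced volume with gcloud. The disk name, zone, and values are placeholders, and you should confirm the exact flags and allowed ranges for your Hyperdisk type in the documentation.
# Illustrative only: tune performance of an existing Hyperdisk Balanced volume.
# "data-disk-1" and the zone are hypothetical placeholders; flag support varies by disk type.
gcloud compute disks update data-disk-1 \
  --zone=us-central1-a \
  --provisioned-iops=30000 \
  --provisioned-throughput=1200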
For SAP applications, including SAP NetWeaver-based applications deployed on non-SAP HANA databases (SAP ASE, DB2, SQL Server), such as SAP Business Suite and SAP Business Warehouse (BW), SAP certifications are available for the following machine shapes: 372GB, 744GB, 1,488GB, 2,976GB and 5,952GB. You can find more information on supported SAP applications in SAP Note 2456432.
Get started today
Whether you’re running advanced analytics, complex algorithms, or real-time insights for critical workloads on databases like SAP HANA and SQL Server in the cloud, M4 VMs provide the performance, features, and stability to meet your business needs. With high-performance infrastructure designed to handle massive datasets, M4 VMs offer robust memory and compute capabilities that can meet the needs of your most demanding workloads.
M4 instances are currently available in us-east4, europe-west4, europe-west3, and us-central1, and will be coming to additional regions. Like other instances of the M machine family, you can purchase them on-demand or with committed use discounts (CUDs). For more, see the M4’s predefined compute resource pricing, or start using M4 in your next project today.
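If you want to try an M4 shape from the command line, a minimal sketch looks like the following. The instance name, zone, and boot image are placeholders; the m4-megamem-56 machine type is one of the shapes referenced in this post.
# Minimal sketch: create an M4 VM (instance name, zone, and image are placeholders).
gcloud compute instances create m4-demo-vm \
  --zone=us-central1-a \
  --machine-type=m4-megamem-56 \
  --image-family=debian-12 \
  --image-project=debian-cloud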
1. M3-megamem-64 compared to M4-megamem-56. Performance based on the estimated SPECrate®2017_int_base performance benchmark score.
2. M4-megamem-224 compared to M3-megamem-128.
3. M4-ultramem-224 compared to M2-ultramem-208.
The pace of innovation in open-source AI is breathtaking, with models like Meta’s Llama4 and DeepSeek AI’s DeepSeek. However, deploying and optimizing large, powerful models can be complex and resource-intensive. Developers and machine learning (ML) engineers need reproducible, verified recipes that articulate the steps for trying out the models on available accelerators.
Today, we’re excited to announce enhanced support and new, optimized recipes for the latest Llama4 and DeepSeek models, leveraging our cutting-edge AI Hypercomputer platform. AI Hypercomputer helps build a strong AI infrastructure foundation using a set of purpose-built infrastructure components that are designed to work well together for AI workloads like training and inference. It is a systems-level approach that draws from our years of experience serving AI experiences to billions of users, and combines purpose-built hardware, optimized software and frameworks, and flexible consumption models. Our AI Hypercomputer resources repository on GitHub, your hub for these recipes, continues to grow.
In this blog, we’ll show you how to access Llama4 and DeepSeek models today on AI Hypercomputer.
Added support for new Llama4 models
Meta recently released the Scout and Maverick models in the Llama4 herd of models. Llama 4 Scout is a 17 billion active parameter model with 16 experts, and Llama 4 Maverick is a 17 billion active parameter model with 128 experts. These models deliver innovations and optimizations based on a Mixture of Experts (MoE) architecture. They support multimodal capability and long context length.
But serving these models can present challenges in terms of deployment and resource management. To help simplify this process, we’re releasing new recipes for serving Llama4 models on Google Cloud Trillium TPUs and A3 Mega and A3 Ultra GPUs.
JetStream, Google’s throughput and memory-optimized engine for LLM inference on XLA devices, now supports Llama-4-Scout-17B-16E and Llama-4-Maverick-17B-128E inference on Trillium, the sixth-generation TPU. New recipes now provide the steps to deploy these models using JetStream and MaxText on a Trillium TPU GKE cluster. vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. New recipes now demonstrate how to use vLLM to serve the Llama4 Scout and Maverick models on A3 Mega and A3 Ultra GPU GKE clusters.
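For a flavor of what the GPU recipes automate, a single-node vLLM serving command might look like the sketch below. The Hugging Face model ID, parallelism settings, and context length are assumptions for illustration; use the values from the published recipe for your machine type.
# Illustrative sketch: serve a Llama 4 model with vLLM on an 8-GPU node.
# The model ID and flag values are assumptions, not taken from the recipe itself.
pip install vllm
vllm serve meta-llama/Llama-4-Scout-17B-16E-Instruct \
  --tensor-parallel-size 8 \
  --max-model-len 8192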
For serving the Maverick model on TPUs, we utilize Pathways on Google Cloud. Pathways is a system which simplifies large-scale machine learning computations by enabling a single JAX client to orchestrate workloads across multiple large TPU slices. In the context of inference, Pathways enables multi-host serving across multiple TPU slices. Pathways is used internally at Google to train and serve large models like Gemini.
MaxText provides high-performance, highly scalable, open-source LLM reference implementations for OSS models, written in pure Python/JAX and targeting Google Cloud TPUs and GPUs for training and inference. MaxText now includes reference implementations for the Llama4 Scout and Maverick models, along with information on how to perform checkpoint conversion, training, and decoding for Llama4 models.
Added support for DeepSeek Models
Earlier this year, DeepSeek released two open-source models: the DeepSeek-V3 model, followed by the DeepSeek-R1 model. The V3 model provides model innovations and optimizations based on an MoE architecture. The R1 model provides reasoning capabilities through a chain-of-thought thinking process.
To help simplify deployment and resource management, we’re releasing new recipes for serving DeepSeek models on Google Cloud Trillium TPUs and A3 Mega and A3 Ultra GPUs.
JetStream now supports DeepSeek-R1-Distill-Llama70B inference on Trillium. A new recipe now provides the steps to deploy DeepSeek-R1-Distill-Llama-70B using JetStream and MaxText on a Trillium TPU VM. With the recent ability to work with Google Cloud TPUs, vLLM users can leverage the performance-cost benefits of TPUs with a few configuration changes. vLLM on TPU now supports all DeepSeek R1 Distilled models on Trillium. Here’s a recipe which demonstrates how to use vLLM, a high-throughput inference engine, to serve the DeepSeek distilled Llama model on Trillium TPUs.
You can also deploy DeepSeek models using the SGLang inference stack on our A3 Ultra VMs powered by eight NVIDIA H200 GPUs with this recipe. A recipe for A3 Mega VMs with SGLang is also available, which shows you how to deploy multihost inference utilizing two A3 Mega nodes. Cloud GPU users using the vLLM inference engine can also deploy DeepSeek models on the A3 Mega (recipe) and A3 Ultra (recipe) VMs.
MaxText now also includes support for architectural innovations from DeepSeek, such as Multi-Head Latent Attention (MLA), MoE shared and routed experts with loss-free load balancing, dropless expert parallelism, mixed dense and MoE decoder layers, and YaRN RoPE embeddings. The reference implementations for the DeepSeek family of models let you rapidly experiment with your models by incorporating some of these newer architectural enhancements.
Recipe example
The reproducible recipes show the steps to deploy and benchmark inference with the new Llama4 and DeepSeek models. For example, this TPU recipe outlines the steps to deploy the Llama-4-Scout-17B-16E Model with JetStream MaxText Engine with Trillium TPU. The recipe shows steps to provision the TPU cluster, download the model weights and set up JetStream and MaxText. It then shows you how to convert the checkpoint to a compatible format for MaxText, deploy it on a JetStream server, and run your benchmarks.
You can deploy Llama4 Scout and Maverick models or DeepSeekV3/R1 models today using inference recipes from the AI Hypercomputer Github repository. These recipes provide a starting point for deploying and experimenting with Llama4 models on Google Cloud. Explore the recipes and resources linked below, and stay tuned for future updates. We hope you have fun building and share your feedback!
When you deploy open models like DeepSeek and Llama, you are responsible for their security and legal compliance. You should follow responsible AI best practices, adhere to each model’s specific licensing terms, and ensure your deployment is secure and compliant with all regulations in your area.
Looking to fine-tune multimodal AI models for your specific domain but facing infrastructure and implementation challenges? This guide demonstrates how to overcome the multimodal implementation gap using Google Cloud and Axolotl, with a complete hands-on example fine-tuning Gemma 3 on the SIIM-ISIC Melanoma dataset. Learn how to scale from concept to production while addressing the typical challenges of managing GPU resources, data preparation, and distributed training.
Filling in the Gap
Organizations across industries are rapidly adopting multimodal AI to transform their operations and customer experiences. Gartner analysts predict 40% of generative AI solutions will be multimodal (text, image, audio and video) by 2027, up from just 1% in 2023, highlighting the accelerating demand for solutions that can process and understand multiple types of data simultaneously.
Healthcare providers are already using these systems to analyze medical images alongside patient records, speeding up diagnosis. Retailers are building shopping experiences where customers can search with images and get personalized recommendations. Manufacturing teams are spotting quality issues by combining visual inspections with technical data. Customer service teams are deploying agents that process screenshots and photos alongside questions, reducing resolution times.
Multimodal AI applications powerfully mirror human thinking. We don’t experience the world in isolated data types – we combine visual cues, text, sound, and context to understand what’s happening. Training multimodal models on your specific business data helps bridge the gap between how your teams work and how your AI systems operate.
Key challenges organizations face in production deployment
Moving from prototype to production with multimodal AI isn’t easy. PwC survey data shows that while companies are actively experimenting, most expect fewer than 30% of their current experiments to reach full scale in the next six months. The adoption rate for customized models remains particularly low, with only 20-25% of organizations actively using custom models in production.
The following technical challenges consistently stand in the way of success:
Infrastructure complexity: Multimodal fine-tuning demands substantial GPU resources – often 4-8x more than text-only models. Many organizations lack access to the necessary hardware and struggle to configure distributed training environments efficiently.
Data preparation hurdles: Preparing multimodal training data is fundamentally different from text-only preparation. Organizations struggle with properly formatting image-text pairs, handling diverse file formats, and creating effective training examples that maintain the relationship between visual and textual elements.
Training workflow management: Configuring and monitoring distributed training across multiple GPUs requires specialized expertise most teams don’t have. Parameter tuning, checkpoint management, and optimization for multimodal models introduce additional layers of complexity.
These technical barriers create what we call “the multimodal implementation gap” – the difference between recognizing the potential business value and successfully delivering it in production.
How Google Cloud and Axolotl together solve these challenges
Our collaboration brings together complementary strengths to directly address these challenges. Google Cloud provides the enterprise-grade infrastructure foundation necessary for demanding multimodal workloads. Our specialized hardware accelerators such as NVIDIA B200 Tensor Core GPUs and Ironwood are optimized for these tasks, while our managed services like Google Cloud Batch, Vertex AI Training, and GKE Autopilot minimize the complexities of provisioning and orchestrating multi-GPU environments. This infrastructure seamlessly integrates with the broader ML ecosystem, creating smooth end-to-end workflows while maintaining the security and compliance controls required for production deployments.
Axolotl complements this foundation with a streamlined fine-tuning framework that simplifies implementation. Its configuration-driven approach abstracts away technical complexity, allowing teams to focus on outcomes rather than infrastructure details. Axolotl supports multiple open source and open weight foundation models and efficient fine-tuning methods like QLoRA. This framework includes optimized implementations of performance-enhancing techniques, backed by community-tested best practices that continuously evolve through real-world usage.
Together, we enable organizations to implement production-grade multimodal fine-tuning without reinventing complex infrastructure or developing custom training code. This combination accelerates time-to-value, turning what previously required months of specialized development into weeks of standardized implementation.
Solution Overview
Our multimodal fine-tuning pipeline consists of five essential components:
Foundational model: Choose a base model that meets your task requirements. Axolotl supports a variety of open source and open weight multimodal models including Llama 4, Pixtral, LLaVA-1.5, Mistral-Small-3.1, Qwen2-VL, and others. For this example, we’ll use Gemma 3, our latest open and multimodal model family.
Data preparation: Create properly formatted multimodal training data that maintains the relationship between images and text. This includes organizing image-text pairs, handling file formats, and splitting data into training/validation sets.
Training configuration: Define your fine-tuning parameters using Axolotl’s YAML-based approach, which simplifies settings for adapters like QLoRA, learning rates, and model-specific optimizations.
Infrastructure orchestration: Select the appropriate compute environment based on your scale and operational requirements. Options include Google Cloud Batch for simplicity, Google Kubernetes Engine for flexibility, or Vertex AI Custom Training for MLOps integration.
Production integration: Streamlined pathways from fine-tuning to deployment.
The pipeline structure above represents the conceptual components of a complete multimodal fine-tuning system. In our hands-on example later in this guide, we’ll demonstrate these concepts through a specific implementation tailored to the SIIM-ISIC Melanoma dataset, using GKE for orchestration. While the exact implementation details may vary based on your specific dataset characteristics and requirements, the core components remain consistent.
Selecting the Right Google Cloud Environment
Google Cloud offers multiple approaches to orchestrating multimodal fine-tuning workloads. Let’s explore three options with different tradeoffs in simplicity, flexibility, and integration:
Google Cloud Batch
Google Cloud Batch is best for teams seeking maximum simplicity for GPU-intensive training jobs with minimal infrastructure management. It handles all resource provisioning, scheduling, and dependencies automatically, eliminating the need for container orchestration or complex setup. This fully managed service balances performance and cost effectiveness, making it ideal for teams who need powerful computing capabilities without operational overhead.
Vertex AI Custom Training
Vertex AI Custom Training is best for teams prioritizing integration with Google Cloud’s MLOps ecosystem and managed experiment tracking. Vertex AI Custom Training jobs automatically integrate with Experiments for tracking metrics, the Model Registry for versioning, Pipelines for workflow orchestration, and Endpoints for deployment.
Google Kubernetes Engine (GKE)
GKE is best for teams seeking flexible integration with containerized workloads. It enables unified management of training jobs alongside other services in your container ecosystem while leveraging Kubernetes’ sophisticated scheduling capabilities. GKE offers fine-grained control over resource allocation, making it ideal for complex ML pipelines. For our hands-on example, we’ll use GKE in Autopilot mode, which maintains these integration benefits while Google Cloud automates infrastructure management including node provisioning and scaling. This lets you focus on your ML tasks rather than cluster administration, combining the flexibility of Kubernetes with the operational simplicity of a managed service.
Take a look at our code sample here for a complete implementation that demonstrates how to orchestrate a multimodal fine-tuning job on GKE:
This repository includes ready-to-use Kubernetes manifests for deploying Axolotl training jobs on GKE in Autopilot mode, covering automated cluster setup with GPUs, persistent storage configuration, job specifications, and monitoring integration.
Hands-on example: Fine-tuning Gemma 3 on the SIIM-ISIC Melanoma dataset
This example uses the SIIM-ISIC Melanoma Classification dataset, which consists of dermoscopic images of skin lesions with labels indicating whether they are malignant or benign. With melanoma accounting for 75% of skin cancer deaths despite its relative rarity, early and accurate detection is critical for patient survival. By applying multimodal AI to this challenge, we unlock the potential to help dermatologists improve diagnostic accuracy and potentially save lives through faster, more reliable identification of dangerous lesions. Let’s walk through a complete example of fine-tuning Gemma 3 on this dataset.
For this implementation, we’ll leverage GKE in Autopilot mode to orchestrate our training job and monitoring, allowing us to focus on the ML workflow while Google Cloud handles the infrastructure management.
Data Preparation
The SIIM-ISIC Melanoma Classification dataset requires specific formatting for multimodal fine-tuning with Axolotl. Our data preparation process involves two main steps: (1) efficiently transferring the dataset to Cloud Storage using Storage Transfer Service, and (2) processing the raw data into the format required by Axolotl. To start, transfer the dataset.
Create a TSV file that contains the URLs for the ISIC dataset files:
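For example, a URL list for Storage Transfer Service is a TSV file whose first line is the TsvHttpData-1.0 header, followed by one publicly accessible URL per line. The ISIC file URLs below are placeholders; substitute the actual download links for the dataset files you need.
# Build a URL-list TSV for Storage Transfer Service and upload it to your bucket.
# The URLs shown are placeholders; replace them with the real ISIC file URLs.
cat > melanoma_dataset_urls.tsv <<'EOF'
TsvHttpData-1.0
https://example.com/path/to/ISIC_2020_Training_JPEG.zip
https://example.com/path/to/ISIC_2020_Training_GroundTruth.csv
EOF

# Copy the TSV to the Cloud Storage bucket referenced by the transfer job.
gcloud storage cp melanoma_dataset_urls.tsv gs://${GCS_BUCKET_NAME}/melanoma_dataset_urls.tsv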
Set up appropriate IAM permissions for the Storage Transfer Service:
# Get your current project ID
export PROJECT_ID=$(gcloud config get-value project)

# Get your project number
export PROJECT_NUMBER=$(gcloud projects describe ${PROJECT_ID} --format="value(projectNumber)")

# Enable the Storage Transfer API
echo "Enabling Storage Transfer API..."
gcloud services enable storagetransfer.googleapis.com --project=${PROJECT_ID}

# Important: The Storage Transfer Service account is created only after you access the service.
# Access the Storage Transfer Service in the Google Cloud Console to trigger its creation:
# https://console.cloud.google.com/transfer/cloud
echo "IMPORTANT: Before continuing, please visit the Storage Transfer Service page in the Google Cloud Console"
echo "Go to: https://console.cloud.google.com/transfer/cloud"
echo "This ensures the Storage Transfer Service account is properly created."
echo "After visiting the page, wait approximately 60 seconds for account propagation, then continue."
echo ""
echo "Press Enter once you've completed this step..."
read -p ""

# Grant Storage Transfer Service the necessary permissions
export STS_SERVICE_ACCOUNT_EMAIL="project-${PROJECT_NUMBER}@storage-transfer-service.iam.gserviceaccount.com"
echo "Granting permissions to Storage Transfer Service account: ${STS_SERVICE_ACCOUNT_EMAIL}"

gcloud storage buckets add-iam-policy-binding gs://${GCS_BUCKET_NAME} \
  --member=serviceAccount:${STS_SERVICE_ACCOUNT_EMAIL} \
  --role=roles/storage.objectViewer \
  --condition=None

gcloud storage buckets add-iam-policy-binding gs://${GCS_BUCKET_NAME} \
  --member=serviceAccount:${STS_SERVICE_ACCOUNT_EMAIL} \
  --role=roles/storage.objectUser \
  --condition=None
Set up a storage transfer job using the URL list:
Navigate to Cloud Storage > Transfer
Click “Create Transfer Job”
Select “URL list” as Source type and “Google Cloud Storage” as Destination type
Enter the path to your TSV file: gs://<GCS_BUCKET_NAME>/melanoma_dataset_urls.tsv
Select your destination bucket
Use the default job settings and click Create
The transfer will download approximately 32GB of data from the ISIC Challenge repository directly to your Cloud Storage bucket. Once the transfer is complete, you’ll need to extract the ZIP files before proceeding to the next step where we’ll format this data for Axolotl. See the notebook in the Github repository here for a full walk-through demonstration on how to format the data for Axolotl.
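One straightforward way to extract the archives, sketched below under the assumption that you are working from a machine with enough local disk, is to copy each ZIP down, unzip it, and copy the extracted files back to the bucket. The archive name is a placeholder for whichever files you transferred.
# Sketch: extract a transferred archive and push the contents back to the bucket.
# "ISIC_2020_Training_JPEG.zip" is a placeholder archive name.
gcloud storage cp gs://${GCS_BUCKET_NAME}/ISIC_2020_Training_JPEG.zip .
unzip -q ISIC_2020_Training_JPEG.zip -d isic_images
gcloud storage cp -r isic_images gs://${GCS_BUCKET_NAME}/images/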
Preparing Multimodal Training Data
For multimodal models like Gemma 3, we need to structure our data following the extended chat_template format, which defines conversations as a series of messages with both text and image content.
Below is an example of a single training input example:
{
  "messages": [
    {
      "role": "system",
      "content": [
        {"type": "text", "text": "You are a dermatology assistant that helps identify potential melanoma from skin lesion images."}
      ]
    },
    {
      "role": "user",
      "content": [
        {"type": "image", "path": "/path/to/image.jpg"},
        {"type": "text", "text": "Does this appear to be malignant melanoma?"}
      ]
    },
    {
      "role": "assistant",
      "content": [
        {"type": "text", "text": "Yes, this appears to be malignant melanoma."}
      ]
    }
  ]
}
We split the data into training (80%), validation (10%), and test (10%) sets, while maintaining the class distribution in each split using stratified sampling.
This format allows Axolotl to properly process both the images and their corresponding labels, maintaining the relationship between visual and textual elements during training.
Creating the Axolotl Configuration File
Next, we’ll create a configuration file for Axolotl that defines how we’ll fine-tune Gemma 3. We’ll use QLoRA (Quantized Low-Rank Adaptation) with 4-bit quantization to efficiently fine-tune the model while keeping memory requirements manageable. While A100 40GB GPUs have substantial memory, the 4-bit quantization with QLoRA allows us to train with larger batch sizes or sequence lengths if needed, providing additional flexibility for our melanoma classification task. The slight reduction in precision is typically an acceptable tradeoff, especially for fine-tuning tasks where we’re adapting a pre-trained model rather than training from scratch.
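The exact configuration we used lives in the accompanying repository; the heredoc below is only a minimal sketch of the kind of QLoRA settings involved. The base model variant, LoRA ranks, batch sizes, chat template name, and dataset path are assumptions that depend on your Axolotl version and environment, so treat the repository config as the source of truth.
# Minimal sketch of an Axolotl QLoRA config for Gemma 3 (all values are illustrative).
cat > gemma3-melanoma.yaml <<'EOF'
base_model: google/gemma-3-4b-it   # assumed Gemma 3 variant; pick the size you need
load_in_4bit: true                 # QLoRA: 4-bit quantized base weights
adapter: qlora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_linear: true

chat_template: gemma3              # assumed template name; check your Axolotl version
datasets:
  - path: /data/melanoma_train.jsonl   # placeholder dataset path
    type: chat_template

sequence_len: 2048
micro_batch_size: 2
gradient_accumulation_steps: 8
num_epochs: 1
learning_rate: 0.0002
optimizer: adamw_torch
lr_scheduler: cosine
bf16: true

val_set_size: 0.1
output_dir: /workspace/outputs/gemma3-melanoma
EOF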
This configuration sets up QLoRA fine-tuning with parameters optimized for our melanoma classification task. Next, we’ll set up our GKE Autopilot environment to run the training.
Setting up GKE Autopilot for GPU Training
Now that we have our configuration file ready, let’s set up the GKE Autopilot cluster we’ll use for training. As mentioned earlier, Autopilot mode lets us focus on our ML task while Google Cloud handles the infrastructure management.
Let’s create our GKE Autopilot cluster:
# Set up environment variables for cluster configuration
export PROJECT_ID=$(gcloud config get-value project)
export REGION=us-central1
export CLUSTER_NAME=melanoma-training-cluster
export RELEASE_CHANNEL=regular

# Enable required Google APIs
echo "Enabling required Google APIs..."
gcloud services enable container.googleapis.com --project=${PROJECT_ID}
gcloud services enable compute.googleapis.com --project=${PROJECT_ID}

# Create a GKE Autopilot cluster in the same region as your data
echo "Creating GKE Autopilot cluster ${CLUSTER_NAME}..."
gcloud container clusters create-auto ${CLUSTER_NAME} \
  --location=${REGION} \
  --project=${PROJECT_ID} \
  --release-channel=${RELEASE_CHANNEL}

# Install kubectl if not already installed
if ! command -v kubectl &> /dev/null; then
  echo "Installing kubectl..."
  gcloud components install kubectl
fi

# Install the GKE auth plugin required for kubectl
echo "Installing GKE auth plugin..."
gcloud components install gke-gcloud-auth-plugin

# Configure kubectl to use the cluster
echo "Configuring kubectl to use the cluster..."
gcloud container clusters get-credentials ${CLUSTER_NAME} \
  --location=${REGION} \
  --project=${PROJECT_ID}

# Verify kubectl is working correctly
echo "Verifying kubectl connection to cluster..."
kubectl get nodes
Now set up Workload Identity Federation for GKE to securely authenticate with Google Cloud APIs without using service account keys:
# Set variables for Workload Identity Federation
export PROJECT_ID=$(gcloud config get-value project)
export NAMESPACE="axolotl-training"
export KSA_NAME="axolotl-training-sa"
export GSA_NAME="axolotl-training-sa"

# Create a Kubernetes namespace for the training job
kubectl create namespace ${NAMESPACE} || echo "Namespace ${NAMESPACE} already exists"

# Create a Kubernetes ServiceAccount
kubectl create serviceaccount ${KSA_NAME} \
  --namespace=${NAMESPACE} || echo "ServiceAccount ${KSA_NAME} already exists"

# Create an IAM service account
if ! gcloud iam service-accounts describe ${GSA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com &>/dev/null; then
  echo "Creating IAM service account ${GSA_NAME}..."
  gcloud iam service-accounts create ${GSA_NAME} \
    --display-name="Axolotl Training Service Account"

  # Wait for IAM propagation
  echo "Waiting for IAM service account creation to propagate..."
  sleep 15
else
  echo "IAM service account ${GSA_NAME} already exists"
fi

# Grant necessary permissions to the IAM service account
echo "Granting storage.objectAdmin role to IAM service account..."
gcloud projects add-iam-policy-binding ${PROJECT_ID} \
  --member="serviceAccount:${GSA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"

# Wait for IAM propagation
echo "Waiting for IAM policy binding to propagate..."
sleep 10

# Allow the Kubernetes ServiceAccount to impersonate the IAM service account
echo "Binding Kubernetes ServiceAccount to IAM service account..."
gcloud iam service-accounts add-iam-policy-binding ${GSA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com \
  --role="roles/iam.workloadIdentityUser" \
  --member="serviceAccount:${PROJECT_ID}.svc.id.goog[${NAMESPACE}/${KSA_NAME}]"

# Annotate the Kubernetes ServiceAccount
echo "Annotating Kubernetes ServiceAccount..."
kubectl annotate serviceaccount ${KSA_NAME} \
  --namespace=${NAMESPACE} \
  iam.gke.io/gcp-service-account=${GSA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com --overwrite

# Verify the configuration
echo "Verifying Workload Identity Federation setup..."
kubectl get serviceaccount ${KSA_NAME} -n ${NAMESPACE} -o yaml
Now create a PersistentVolumeClaim for our model outputs. In Autopilot mode, Google Cloud manages the underlying storage classes, so we don’t need to create our own:
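As a reference, model-storage-pvc.yaml can be as simple as the sketch below. The claim name and size are placeholders, and Autopilot binds the claim to its default storage class.
# Sketch of model-storage-pvc.yaml (claim name and size are placeholders).
cat > model-storage-pvc.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: model-storage
  namespace: axolotl-training
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
EOF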
# Apply the PVC configuration
kubectl apply -f model-storage-pvc.yaml
Deploying the Training Job to GKE Autopilot
In Autopilot mode, we specify our GPU requirements using annotations and resource requests within the Pod template section of our Job definition. We’ll create a Kubernetes Job that requests a single A100 40GB GPU:
Create a ConfigMap with our Axolotl configuration:
# Create the ConfigMap
kubectl create configmap axolotl-config --from-file=gemma3-melanoma.yaml -n ${NAMESPACE}
Create a Secret with Hugging Face credentials:
# Create a Secret with your Hugging Face token
# This token is required to access the Gemma 3 model from Hugging Face Hub
# Generate a Hugging Face token at https://huggingface.co/settings/tokens if you don't have one
kubectl create secret generic huggingface-credentials -n ${NAMESPACE} --from-literal=token=YOUR_HUGGING_FACE_TOKEN
Apply training job YAML to start the training process:
# Start training job
kubectl apply -f axolotl-training-job.yaml
Monitor the Training Process
Fetch the pod name to monitor progress:
# Get the pod name for the training job
POD_NAME=$(kubectl get pods -n ${NAMESPACE} --selector=job-name=gemma3-melanoma-training -o jsonpath='{.items[0].metadata.name}')

# Monitor logs in real-time
kubectl describe pod $POD_NAME -n ${NAMESPACE}
kubectl logs -f $POD_NAME -n ${NAMESPACE}
To visualize training metrics, deploy TensorBoard and retrieve its external IP:
# Deploy TensorBoard
kubectl apply -f tensorboard.yaml

# Get the external IP to access TensorBoard
kubectl get service tensorboard -n ${NAMESPACE}
Model Export and Evaluation Setup
After training completes, we need to export our fine-tuned model and evaluate its performance against the base model. First, let’s export the model from our training environment to Cloud Storage:
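In practice, model-export.yaml is essentially a one-shot Job that mounts the same PersistentVolumeClaim and copies the output directory to Cloud Storage. The sketch below makes assumptions about the job name, bucket, and paths, so adjust it to your environment and use the repository manifest as the reference.
# Sketch of model-export.yaml: copy fine-tuned weights from the PVC to Cloud Storage.
# The bucket name and paths are placeholders.
cat > model-export.yaml <<'EOF'
apiVersion: batch/v1
kind: Job
metadata:
  name: gemma3-melanoma-export
  namespace: axolotl-training
spec:
  backoffLimit: 0
  template:
    spec:
      serviceAccountName: axolotl-training-sa
      restartPolicy: Never
      containers:
        - name: export
          image: google/cloud-sdk:slim
          command: ["bash", "-c", "gcloud storage cp -r /workspace/outputs/gemma3-melanoma gs://GCS_BUCKET_NAME/models/"]
          volumeMounts:
            - name: model-storage
              mountPath: /workspace/outputs
      volumes:
        - name: model-storage
          persistentVolumeClaim:
            claimName: model-storage
EOF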
After creating the model-export.yaml file, apply it:
# Export the model
kubectl apply -f model-export.yaml
This will start the export process, which copies the fine-tuned model from the Kubernetes PersistentVolumeClaim to your Cloud Storage bucket for easier access and evaluation.
Once exported, we have several options for evaluating our fine-tuned model. You can deploy both the base and fine-tuned models to their own respective Vertex AI Endpoints for systematic testing via API calls, which works well for high-volume automated testing and production-like evaluation. Alternatively, for exploratory analysis and visualization, a GPU-enabled notebook environment such as a Vertex Workbench Instance or Colab Enterprise offers significant advantages, allowing for real-time visualization of results, interactive debugging, and rapid iteration on evaluation metrics.
In this example, we use a notebook environment to leverage its visualization capabilities and interactive nature. Our evaluation approach involves:
Loading both the base and fine-tuned models
Running inference on a test set of dermatological images from the SIIM-ISIC dataset
Computing standard classification metrics (accuracy, precision, recall, etc.)
Analyzing the confusion matrices to understand error patterns
Generating visualizations to highlight performance differences
For the complete evaluation code and implementation details, check out our evaluation notebook in the GitHub repository.
Performance Results
Our evaluation demonstrated that domain-specific fine-tuning can transform a general-purpose multimodal model into a much more effective tool for specialized tasks like medical image classification. The improvements were significant across multiple dimensions of model performance.
The most notable finding was the base model’s tendency to over-diagnose melanoma. It showed perfect recall (1.000) but extremely poor specificity (0.011), essentially labeling almost every lesion as melanoma. This behavior is problematic in clinical settings where false positives lead to unnecessary procedures, patient anxiety, and increased healthcare costs.
Fine-tuning significantly improved the model’s ability to correctly identify benign lesions, reducing false positives from 3,219 to 1,438. While this came with a decrease in recall (from 1.000 to 0.603), the tradeoff resulted in much better overall diagnostic capability, with balanced accuracy improving substantially.
In our evaluation, we also included results from the newly announced MedGemma—a collection of Gemma 3 variants trained specifically for medical text and image comprehension recently released at Google I/O. These results further contribute to our understanding of how different model starting points affect performance on specialized healthcare tasks.
Below we can see the performance metrics across all three models:
Accuracy jumped from a mere 0.028 for base Gemma 3 to 0.559 for our tuned Gemma 3 model, representing an astounding 1870.2% improvement. MedGemma achieved 0.893 accuracy without any task-specific fine-tuning—a 3048.9% improvement over the base model and substantially better than our custom-tuned version.
While precision saw a significant 34.2% increase in our tuned model (from 0.018 to 0.024), MedGemma delivered a substantial 112.5% improvement (to 0.038). The most remarkable transformation occurred in specificity—the model’s ability to correctly identify non-melanoma cases. Our tuned model’s specificity increased from 0.011 to 0.558 (a 4947.2% improvement), while MedGemma reached 0.906 (an 8088.9% improvement over the base model).
These numbers highlight how fine-tuning helped our model develop a more nuanced understanding of skin lesion characteristics rather than simply defaulting to melanoma as a prediction. MedGemma’s results demonstrate that starting with a medically-trained foundation model provides considerable advantages for healthcare applications.
The confusion matrices further illustrate these differences:
Looking at the base Gemma 3 matrix (left), we can see it correctly identified all 58 actual positive cases (perfect recall) but also incorrectly classified 3,219 negative cases as positive (poor specificity). Our fine-tuned model (center) shows a more balanced distribution, correctly identifying 1,817 true negatives while still catching 35 of the 58 true positives. MedGemma (right) shows strong performance in correctly identifying 2,948 true negatives, though with more false negatives (46 missed melanoma cases) than the other models.
To illustrate the practical impact of these differences, let’s examine a real example, image ISIC_4908873, from our test set:
Disclaimer: Image for example case use only.
The base model incorrectly classified it as melanoma. Its rationale focused on general warning signs, citing its “significant variation in color,” “irregular, poorly defined border,” and “asymmetry” as definitive indicators of malignancy, without fully contextualizing these within broader benign patterns.
In contrast, our fine-tuned model correctly identified it as benign. While acknowledging a “heterogeneous mix of colors” and “irregular borders,” it astutely noted that such color mixes can be “common in benign nevi.” Crucially, it interpreted the lesion’s overall “mottled appearance with many small, distinct color variations” as being “more characteristic of a common mole rather than melanoma.”
Interestingly, MedGemma also misclassified this lesion as melanoma, stating, “The lesion shows a concerning appearance with irregular borders, uneven coloration, and a somewhat raised surface. These features are suggestive of melanoma. Yes, this appears to be malignant melanoma.” Despite MedGemma’s overall strong statistical performance, this example illustrates that even domain-specialized models can benefit from task-specific fine-tuning for particular diagnostic challenges.
These results underscore a critical insight for organizations building domain-specific AI systems: while foundation models provide powerful starting capabilities, targeted fine-tuning is often essential to achieve the precision and reliability required for specialized applications. The significant performance improvements we achieved—transforming a model that essentially labeled everything as melanoma into one that makes clinically useful distinctions—highlight the value of combining the right infrastructure, training methodology, and domain-specific data.
MedGemma’s strong statistical performance demonstrates that starting with a domain-focused foundation model significantly improves baseline capabilities and can reduce the data and computation needed for building effective medical AI applications. However, our example case also shows that even these specialized models would benefit from task-specific fine-tuning for optimal diagnostic accuracy in clinical contexts.
Next steps for your multimodal journey
By combining Google Cloud’s enterprise infrastructure with Axolotl’s configuration-driven approach, you can transform what previously required months of specialized development into weeks of standardized implementation, bringing custom multimodal AI capabilities from concept to production with greater efficiency and reliability.
For deeper exploration, check out these resources: