One of the most compelling aspects of cloud computing is the ability to automatically scale resources up and, almost as importantly, to scale them back down to manage costs and performance. This is standard practice with virtual machines, for instance with Compute Engine Managed Instance Groups, but because of their inherent complexity, less so with stateful services such as databases.
Memorystore for Redis Cluster capacity is determined by the number of shards in your cluster, which can be increased or decreased without downtime, and by your cluster's shard size, which maps onto the underlying node type. At this time, the node type of the cluster is immutable, so to scale capacity in or out, you modify the number of shards in your cluster. To automate this process, you can deploy the Memorystore Cluster Autoscaler to monitor your cluster metrics and rightsize your cluster based on that information. The Autoscaler performs the necessary resource adjustments using rulesets that evaluate memory and CPU utilization, without impacting cluster availability.
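To scale manually, you update the shard count of the instance in place. As a sketch, using the gcloud CLI (the instance name, region, and shard count below are illustrative):

# Scale a Memorystore for Redis Cluster instance to 10 shards.
gcloud redis clusters update my-cluster \
    --region=us-central1 \
    --shard-count=10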
The following chart shows the Autoscaler in action, with a Memorystore for Redis Cluster instance automatically scaling out as memory utilization increases. The green line represents data being written to the cluster at the rate of one gigabyte every five minutes. The blue line represents the number of shards in the cluster. You can see that the cluster scales out, with the number of shards increasing in proportion to the memory utilization, then plateaus when the writes stop, and finally scales back in when the keys are flushed at the end of the test.
Experience and deployment
To use the Autoscaler, deploy it to one of your Google Cloud projects. The Autoscaler is flexible, with multiple options for its deployment, so the repository contains several example Terraform deployment configurations, as well as documentation that describes the various deployment models.
Once you’ve deployed the Autoscaler, configure it according to the scaling requirements of the Memorystore instances being managed, to suit your workloads’ characteristics. You do this by setting Autoscaler configuration parameters for each of the Memorystore instances. Once configured, the Autoscaler autonomously manages and scales the Memorystore instances. You can read more about these parameters later in this post, and in the Autoscaler documentation.
Autoscaler architecture
The Autoscaler consists of two main components, the Poller and the Scaler. You can deploy these to either Cloud Run functions or Google Kubernetes Engine (GKE) via Terraform, and configure them so that the Autoscaler runs according to a user-defined schedule. The Poller queries the Memorystore metrics in Cloud Monitoring at a pre-defined interval to determine utilization, and passes them to the Scaler. The Scaler then compares the metrics against the recommended thresholds specified in the rule set, and determines if the instance should be scaled in or out, and if so, by how many shards. You can modify the sample configuration to determine minimum and maximum cluster sizes and any other thresholds suitable for your environment.
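As an illustration, a minimal Poller configuration for a single instance might look like the following; the parameter names are those described in this post, while the project, instance, and size values are placeholders:

[
  {
    "projectId": "my-project",
    "instanceId": "my-memorystore-cluster",
    "scalingProfile": "CPU_AND_MEMORY",
    "scalingMethod": "LINEAR",
    "minSize": 3,
    "maxSize": 30
  }
]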
Throughout the flow, the Autoscaler writes a step-by-step summary of its recommendations and actions to Cloud Logging for tracking and auditing, as well as metrics to Cloud Monitoring to provide insight into its actions.
Scaling rubrics
Memorystore performance is most commonly limited by in-memory storage and by CPU. The Autoscaler is configured by default to take both of these factors into consideration when scaling, by utilizing the CPU_AND_MEMORY profile. This is a good place to start your deployment, and can be replaced with a custom configuration, if required, to best suit your needs.
Defaults
Metric | Average Default Setting | Max Default Setting
CPU scale OUT | CPU > 70% | Max CPU > 80% and average CPU > 50%
CPU scale IN | CPU < 50% * | Max CPU < 60% and average CPU < 40% *
Memory scale OUT | Usage > 70% | Max usage > 80% and average usage > 50%
Memory scale IN | Usage < 50% * | Max usage < 60% and average usage < 40% *
* Scale-in is blocked if there are ongoing key evictions, which occur when the keyspace is full and keys are removed from the cache to make room. Scale-in is enabled by default, but can be configured using a custom scaling profile. Refer to the Scaling Profiles section of the documentation for more information on how to do this.
Scaling scenarios and methods
Let’s take a look at some typical scenarios and their specific utilization patterns, and the Autoscaler configurations best suited to each of them. You can read more about the options described in the following section in the configuration documentation.
Standard workloads
With many applications backed by Memorystore, users interact with the application at certain times of day more than others, in a regular pattern — think a banking application where users check their accounts in the morning, make transactions during the afternoon and early evening, but don’t use the application much at night.
We refer to this fairly typical scenario as a “standard workload” whose time series shows:
Large utilization increase or decrease at certain points of the day
Small spikes over and under the threshold
A recommended base configuration for this type of workload, sketched after this list, should include:
The LINEAR scalingMethod to cover large scale events
A small value for scaleOutCoolingMinutes — between 5 and 10 minutes — to minimize Autoscaler’s reaction time.
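Extending the base configuration sketched earlier, the relevant parameters might look like this (values are illustrative):

{
  "scalingMethod": "LINEAR",
  "scaleOutCoolingMinutes": 5
}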
Plateau workloads
Another common scenario is applications with more consistent utilization during the day such as global apps, games, or chat applications. User interactions with these applications are more consistent, so the jumps in utilization are less pronounced than for a standard workload.
These scenarios create a “plateau workload” whose time series shows:
A pattern composed of various plateaus during the day
Some larger spikes within the same plateau
A recommended base configuration for this type of workload, sketched after this list, should include:
The STEPWISE scalingMethod, with a stepSize sufficient to cover the largest utilization jump using only a few steps during a normal day, OR
The LINEAR scalingMethod, if there is likely to be a considerable increase or reduction in utilization at certain times, for example when breaking news is shared. Use this method together with a scaleInLimit to avoid reducing the capacity of your instance too quickly
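As a sketch, the two alternatives differ only in these parameters; the values, and the exact units of stepSize and scaleInLimit, are illustrative, so refer to the configuration documentation:

{
  "scalingMethod": "STEPWISE",
  "stepSize": 2
}

or, where large jumps are likely:

{
  "scalingMethod": "LINEAR",
  "scaleInLimit": 10
}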
Batch workloads
Customers often need increased capacity for their Memorystore clusters to handle batch processes or a sales event, where the timing is usually known in advance. These scenarios comprise a “batch workload” with the following properties:
A scheduled, well-known peak that requires additional compute capacity
A drop in utilization when the process or event is over
A recommended base configuration for this type of workload should include two separate scheduled jobs (see the sketch below):
One for the batch process or event, that includes an object in the configuration that uses the DIRECT scalingMethod, and a minSize value of the peak number of shards/nodes to cover the process or event
One for regular operations, that includes configuration with the same projectId and instanceId, but using the LINEAR or STEPWISE method. This job will take care of decreasing the capacity when the process or event is over
Be sure to choose an appropriate scaling schedule so that the two configurations don’t conflict. For both Cloud Run functions and GKE deployments, make sure the batch operation starts before the Autoscaler starts to scale the instance back in again. You can use the scaleInLimit parameter to slow the scale-in operation down if needed.
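A sketch of the two job configurations follows; the first covers the batch event with the DIRECT method, and the second restores regular LINEAR operations afterwards (all names, sizes, and limits are placeholders):

[
  {
    "projectId": "my-project",
    "instanceId": "my-memorystore-cluster",
    "scalingMethod": "DIRECT",
    "minSize": 20
  },
  {
    "projectId": "my-project",
    "instanceId": "my-memorystore-cluster",
    "scalingMethod": "LINEAR",
    "scaleInLimit": 10,
    "minSize": 5,
    "maxSize": 30
  }
]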
Spiky workloads
Depending on load, it can take several minutes for Memorystore to update the cluster topology and fully utilize new capacity. If your workload is characterized by very spiky traffic or sudden-onset load patterns, the Autoscaler might therefore not be able to provision capacity quickly enough to avoid increased latency, or efficiently enough to yield cost savings.
For these spiky workloads, a base configuration (sketched after this list) should:
Set a minSize that slightly over-provisions the usual instance workload
Use the LINEAR scalingMethod, in combination with a scaleInLimit to avoid further latency when the spike is over
Choose scaling thresholds large enough to smooth out some smaller spikes, while still being reactive to large ones
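Putting those recommendations together (values are again placeholders):

{
  "scalingMethod": "LINEAR",
  "minSize": 10,
  "scaleInLimit": 10
}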
Advanced usage
As described above, the Autoscaler is preconfigured with scaling rules designed to optimize cluster size based on CPU and memory utilization. However, depending on your workload(s), you may find that you need to modify these rules to suit your utilization, performance and/or budget goals.
There are several ways to customize the rule sets that are used for scaling, in increasing order of effort required:
Choose to scale on only memory or only CPU metrics. This can help if you find your clusters flapping, i.e., alternating rapidly between sizes. You can do this by specifying a scalingProfile of either CPU or MEMORY to override the default CPU_AND_MEMORY in the Autoscaler configuration.
Use your own custom scaling rules by specifying a scalingProfile of CUSTOM, and supplying a custom rule set in the Autoscaler configuration as shown in the example here.
Create your own custom rule sets and make them available for everyone in your organization to use as part of a scaling profile. You can do this by customizing one of the existing scaling profiles to suit your needs. We recommend starting by looking at the existing scaling rules and profiles, and creating your own customizations.
Next steps
The OSS Autoscaler comes with a Terraform configuration to get you started, which can be integrated into your codebase for production deployments. We recommend starting with non-production environments, and progressing through to production when you are confident with the behavior of the Autoscaler alongside your application(s). Some more tips for production deployments are here in the documentation.
If there are additional features you would like to see in the Autoscaler — or would like to contribute to it yourself — please don’t hesitate to raise an issue via the GitHub issues page. We’re looking forward to hearing from you.
Today, we are thrilled to announce the public beta launch of Gen AI Toolbox for Databases in partnership with LangChain, the leading orchestration framework for developers building large language model (LLM) applications.
Gen AI Toolbox for Databases (Toolbox) is an open-source server that empowers application developers to connect production-grade, agent-based generative AI (gen AI) applications to databases. It streamlines the creation, deployment, and management of sophisticated gen AI tools capable of querying databases with secure access, robust observability, scalability, and comprehensive manageability. It provides connectivity to popular open-source databases such as PostgreSQL and MySQL, as well as Google's industry-leading cloud databases like AlloyDB, Spanner, and Cloud SQL for SQL Server. We are open to contributions that add support for databases outside of Google Cloud.
In this post, we’ll explore how Gen AI Toolbox for Databases works, and how to get started.
Challenges in gen AI tool management
Building AI agents requires using different tools and frameworks and connecting to various data sources. This process creates several challenges for developers, particularly when these tools need to query databases. These include:
Scaling tool management: Current approaches to tool integration often require extensive, repetitive code and modifications across multiple locations for each tool. This complexity hinders consistency, especially when tools are shared across multiple agents or services. A more streamlined framework integration is needed to simplify tool management and ensure consistency across agents and applications.
Complex database connections: Databases require configuration, connection pooling, and caching for optimal performance at scale.
Security vulnerabilities: Ensuring secure access from gen AI models to sensitive data requires complex integration with auth services, databases and the application, which can be error-prone and introduce security risks.
Inflexible tool updates: Adding new tools or updating existing ones often requires a complete rebuild and redeployment of the application, potentially leading to downtime.
Limited workflow observability: Current solutions lack built-in support for comprehensive monitoring and troubleshooting, making it difficult to gain insights into gen AI workflows with databases.
Components
Gen AI Toolbox for Databases improves how gen AI tools interact with data, addressing common challenges in gen AI tool management. By acting as an intermediary between the application’s orchestration layer and data sources/databases, it enables faster development and more secure data access, improving the production-quality of tools.
Toolbox comprises two components: a server specifying the tools for application use, and a client interacting with this server to load these tools onto orchestration frameworks. This centralizes tool deployment and updates, incorporating built-in production best practices to enhance performance, security, and simplify deployments.
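As a sketch of how the server side fits together, tools are declared in a configuration file that names the data sources and the parameterized queries exposed to agents. The exact schema is defined in the Toolbox repository; the fields and values below are illustrative assumptions:

sources:
  my-pg-source:
    kind: postgres
    host: 127.0.0.1
    port: 5432
    database: hotels_db
    user: toolbox-user
    password: my-password
tools:
  search-hotels-by-name:
    kind: postgres-sql
    source: my-pg-source
    description: Search for hotels by name.
    parameters:
      - name: name
        type: string
        description: Name of the hotel.
    statement: SELECT * FROM hotels WHERE name ILIKE '%' || $1 || '%';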
Benefits
Toolbox offers various features that provide better manageability, security, and observability for AI agents. Some of the benefits for application developers are as follows:
Simplified development – Reduced boilerplate code and consolidated integration simplifies tool development and sharing across other agents.
Built-in performance and scale – Built-in connection pooling and optimized connectors for popular databases handle connection management efficiently.
Zero downtime deployment – A config-driven approach enables seamless deployment of new tools and updates without any service interruption and supports incremental rollouts.
Enhanced security – Using OAuth2 and OIDC, built-in support for common auth providers enables control over agents' access to tools and data.
End-to-end observability – Toolbox integrates with OpenTelemetry, providing day-one insights via logging, metrics, and tracing, offering end-to-end observability for better operations.
Compatibility with LangChain
LangChain is the most popular developer framework for building LLM applications, and we’re excited to announce Toolbox compatibility with the LangChain ecosystem from day one. Together with Toolbox, LangGraph can leverage LLMs like Gemini on Vertex AI to build powerful agentic workflows.
LangGraph extends LangChain’s capabilities by providing a framework for building stateful, multi-actor applications with LLMs. Its support for cycles, state management, and coordination enables the development of complex and dynamic AI agents. All of these capabilities integrate seamlessly with Toolbox.
Tool calling is essential for building agents. Agents need to call tools in a controlled and specified way, run the tool reliably, and then pass the correct context back to the LLM. LangGraph provides a low-level agent framework for managing how tools are called and how their responses are integrated, ensuring precision and control. Toolbox then handles the execution itself, seamlessly running the tool and returning results. Together, they create a powerful solution for tool calling in agent workflows.
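As a sketch, loading Toolbox tools into a LangGraph agent might look like the following; the Toolbox client package and method names here are assumptions based on the Toolbox repository, so check its documentation for the current API:

from langchain_google_vertexai import ChatVertexAI
from langgraph.prebuilt import create_react_agent
# Package and class names are assumptions based on the Toolbox repository.
from toolbox_langchain import ToolboxClient

# Connect to a locally running Toolbox server and load its tools.
client = ToolboxClient("http://127.0.0.1:5000")
tools = client.load_toolset()

# Hand the tools to a LangGraph ReAct agent backed by Gemini on Vertex AI.
model = ChatVertexAI(model_name="gemini-1.5-pro")
agent = create_react_agent(model, tools)
response = agent.invoke({"messages": [("user", "Find hotels named Basel")]})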
“The integration of Gen AI Toolbox for Databases with the LangChain ecosystem is a boon for all developers” says Harrison Chase, CEO of LangChain. “In particular, the tight integration between Toolbox and LangGraph will allow developers to build more reliable agents than ever before.”
Get started with Gen AI Toolbox for Databases
Gen AI Toolbox for Databases simplifies gen AI tool development and deployment by automating the entire lifecycle. Here are some resources to get you started:
Last year, we offered our first ever "Google Launchpad for Women" series to empower women within our customer ecosystem to grow their cloud and AI skills. The response from our customers has been tremendous: more than 11,000 women across a breadth of roles – sales, leadership, marketing, finance, and more – have completed previous editions of the program. As a result, they are building critical skills that help them put AI to work in their jobs, grow their careers, and help transform their businesses.
This year, in honor of International Women's Day, we are opening "Google Launchpad for Women" to thousands more customer participants, providing them with no-cost training, exam prep, and access to Google experts. Registration is now open to Google Cloud customers in the Americas, EMEA, and Japan, with the three-week program beginning on March 4th in Japan and March 6th in the Americas and EMEA. Program benefits include:
Expert-led training: Two days of in-depth, instructor-led training covering key cloud concepts and best practices.
Industry insights: Engage with Google Cloud experts through panel discussions on topics such as Generative AI.
Exam preparation: Dedicated sessions to prepare for the Cloud Digital Leader certification exam.
Complimentary exam voucher: Participants will receive a voucher for the $99 exam fee.
Why these trainings are critical
Harnessing the power of cloud computing and AI is essential for all job roles, not just IT. As more businesses adopt AI, people across business roles utilize this technology every day and often make purchasing decisions about new AI platforms and tools. However, a talent gap remains, and is particularly pronounced for women, who represent about 14% of the global cloud workforce according to recent data from the World Economic Forum.
We aim to help our customers reduce this gap, ensure they have access to the skilled experts they need to advance their digital and AI transformations, and give more people opportunities to grow their careers and lead these transformations. Ultimately, those who complete the Google Launchpad for Women program will be well-equipped to achieve the Cloud Digital Leader certification, putting them at the forefront of the cloud and AI era.
Google Launchpad for Women is open to all Google Cloud customers, regardless of prior technical experience or role. We welcome women from all professional backgrounds who are eager to develop their cloud skills and advance their careers. While this initiative is specifically focused on women, we invite everyone to participate.
Sign up today
Visit the links below to learn more about each regional session and contact your sales rep to sign up today.
Mobile devices have become the go-to for daily tasks like online banking, healthcare management, and personal photo storage, making them prime targets for malicious actors seeking to exploit valuable information. Bad actors often turn to publishing and distributing malware via apps as a lucrative channel for generating illegal and/or unethical profits.
Android takes a multi-layered approach to combating malware to help keep users safe (more on this later in the post), but while we continuously strengthen our defenses against malware, threat actors are persistently updating theirs to evade detection. Malware developers used to implement their entire malicious logic using common Android app development toolkits in Java, which is relatively easy to detect by reversing the Java bytecode. In recent years, malware developers have increasingly used native code to obfuscate critical malware behaviors, putting their hopes in the obscurity of compiled and symbol-stripped Executable and Linkable Format (ELF) files, whose true intentions can be more difficult and time-consuming to reveal.
To combat these new challenges, the Android Security and Privacy Team is partnering with Mandiant FLARE to extend the open-source binary analysis tool capa to analyze native ARM ELF files targeting Android. Together, we improved existing capa rules and developed new ones to detect capabilities observed in Android malware, used the capa rule matches to highlight the highly suspicious code in native files, and prompted Gemini with the highlighted code behaviors to summarize them, enhancing our review processes for faster decisions.
In this blog post, we will describe how we leverage capa behavior-detection capabilities and state-of-the-art Gemini summarization by:
Showcasing a malware sample that used various anti-analysis tricks to evade detection
Explaining how our existing and new capa rules identify and highlight those behaviors
Presenting how Gemini summarizes the highlighted code for security reviews
An Illegal Gambling App Under a Music App Façade
Google Play Store ensures all published apps conform to local laws and regulations. This includes gambling apps, which are prohibited or require licenses in some areas. Developing and distributing illegal gambling apps in such areas can generate significant illicit profits, which are sometimes associated with organized crime. To bypass Google Play Store's security-screening procedures, some gambling apps disguise themselves with harmless façades like music or casual games. These apps only reveal their gambling portals in certain geographic markets using various anti-analysis tricks. Unfortunately, dynamic analysis, such as emulation and sandbox detonation, relies on specific device configurations, and threat actors keep trying different combinations of settings to evade our detections. It's an ongoing game of cat and mouse!
In response, the Android Security and Privacy Team has evolved static analysis techniques, such as those that evaluate the behavior of a complete program and all its conditional logic. So, let’s describe an app that violated Google Play Store rules and show how we can better detect and block other apps like it.
We received reports of a music app opening gambling websites for users in certain geographical areas. It used an interesting trick of hiding key behaviors in a native ELF file that has most symbols (except the exported ones) stripped and is loaded at runtime to evade detection.
When we decompiled the app into Java source code, using a tool like JEB Decompiler, we found that the app has a song-playing functionality as shown in “MainActivity” of Figure 1. This looks like benign behavior and is fully within the limits of Google Play Store policies.
However, there was a small region of initialization code that loads an ELF file as soon as the app is initialized when calling the onCreate function, as shown in com.x.y.z class of Figure 1. To fully understand the behavior of the entire app, we also had to reverse engineer the ELF file, which requires a completely different toolset.
Figure 1: How the app applies anti-analysis techniques
Using a tool like Ghidra, we decompiled the ARM64 ELF file into C source code and found that this app estimates the user’s geographic location using timezone information (“Code Section 1” in Figure 1). The code implements a loop that compares the user’s timezone with a list of target regions (“Data Section” in Figure 1).
If the user’s location matches a value in the list (“Data Section” in Figure 1), this malware:
Downloads an encrypted DEX file from a remote server (“Code Section 2” in Figure 1)
Decrypts the downloaded DEX file (“Code Section 3” in Figure 1)
Loads the decrypted DEX file into memory (“Code Section 4” in Figure 1)
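For illustration only, the native library's control flow resembled the following hypothetical pseudo-C reconstruction; the function names, variables, and structure below are ours, not the app's:

// Hypothetical pseudo-C reconstruction of the cloaking logic.
void check_and_load_payload(JNIEnv *env) {
    // Code Section 1: estimate the user's region from the timezone.
    const char *tz = get_timezone_via_jni(env);
    for (int i = 0; i < NUM_TARGET_REGIONS; i++) {
        // Data Section: list of target regions.
        if (strcmp(tz, target_regions[i]) == 0) {
            // Code Section 2: fetch the encrypted DEX payload.
            download_file(payload_url, cache_path);
            // Code Section 3: decrypt the downloaded DEX file.
            decrypt_file(cache_path, dex_path);
            // Code Section 4: load the decrypted DEX into memory via JNI.
            load_dex_via_class_loader(env, dex_path);
            return;
        }
    }
    // Users outside the target regions only ever see the music player.
}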
The loaded DEX file uses further server-side cloaking techniques and finally loads a gambling website (Figure 3) for the app's users. Compared to the app icon in Figure 2, this is an obvious mismatch with the app's advertised functionality.
Figure 2: The app icon as published
Figure 3: The loaded gambling website in app
While there are many detection technologies, such as YARA, available for identifying malware distributed in ELF files, they are less resilient to app updates or variations introduced by threat actors. Fortunately, the Android Security and Privacy Team has developed new techniques for detecting malicious Android apps by inspecting their native ELF components. For example, in the gambling app in Figure 3, there are many API calls dynamically resolved via the Java Native Interface (JNI) that interact with the Android runtime. Our detection systems recognize these cross-runtime interactions and reason about their intent. We've enumerated behaviors commonly seen in Android malware, such as making ptrace API calls, extracting device information, downloading code from remote servers to local storage, and performing various cryptographic operations via JNI, turning them into capa detections we can use to identify and block Google Play Store threats.
Let’s now talk a little more about how this works.
Android capa Rules
capa is a tool that detects capabilities in executable files. You run it against a compiled program, and it tells you what it thinks the program can do. For example, capa might suggest that a file is a backdoor, is capable of installing services, or relies on HTTP to communicate.
Mandiant FLARE extended capa to support BinExport2, an architecture agnostic representation of disassembled programs. This enables capa to match capabilities for additional architectures and file formats, such as those supported by Ghidra and its BinExport2 plugin, with an initial focus on ARM64 ELF files. The Android Security and Privacy Team then created new capa rules focused specifically on detecting capabilities observed in ARM64 ELF files used by various Android malware samples. These proprietary rules alongside capa’s open-source rules are used to detect malware capabilities as part of internal Android malware analysis pipelines.
Referring back to the gambling app in Figure 3, the following Google proprietary rules and open-source capa rules matched the malicious functions performing the cloaking techniques, flagging them for further inspection.
Proprietary rules:
Make ptrace API calls
Extract device configuration information via JNI on Android
Extract timezone via JNI on Android
Encode or decode data using Base64 via JNI on Android
Encrypt or decrypt data using Cipher API via JNI on Android
Instead of browsing hundreds of thousands of lines of obfuscated code, our analysts were able to quickly identify evidence of the app's wrongdoing from the function addresses matched by those rules, and took enforcement action against the app.
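For reference, capa rules are written in YAML. The following is a simplified sketch of what a rule like "Extract timezone via JNI on Android" might look like; the features shown are illustrative, not the production rule:

rule:
  meta:
    name: extract timezone via JNI on Android
    scope: function
  features:
    - and:
      - or:
        - string: "java/util/TimeZone"
        - string: "getTimeZone"
      - api: "JNIEnv::CallObjectMethod"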
Gemini Summaries of capa Rule Matches
Safeguarding the Android ecosystem, our Android malware analysis pipelines scan millions of ELF files in depth every day, each containing thousands to millions of lines of decompiled code. Building on Gemini's fast-evolving malware-analysis capabilities, capa rules select the most interesting code for Gemini summarization, sharpening the focus to a much smaller set of the most suspicious functions.
Figure 4: How capa rules and Gemini help our reviewers and automation
We asked Gemini to summarize the functions matched on capa rules from the earlier gambling app with the following prompt:
You are a world-class Android reverse engineer with malware analysis specialities. You are given a few functions from an ARM/AArch64 ELF file as part of an Android App, with their decompiled pseudo-C codes. You need to analyze these functions to
1. summarize what these functions try to achieve
2. decide whether any of them has code/behavior patterns that would be found in Android malware.
{some more instructions…}
…
These functions are decompiled by Ghidra into pseudo-C format as follows:
[function_1 content]
[function_2 content]
…
In your response, please give your overall verdict assessing the risk level using HIGH / MEDIUM / LOW in the beginning, and explain your rationale for each potential malicious behavior.
Gemini responded with the following suggestions:
VERDICT: HIGH RISK
The provided Android application code exhibits several concerning behaviors strongly indicative of malicious intent. The risk level is assessed as HIGH due to the presence of multiple red flags consistent with sophisticated Android malware techniques. The analysis below details the specific malicious behavior patterns observed:
1. Dynamic Code Loading (HIGH): The function comeOn demonstrates a pattern of dynamically loading and executing code. The app fetches files (potentially DEX or ELF) using getCacheFilePath and downloadFile, then attempts to execute them using getDexClassLoader. This is a classic technique for delivering malicious payloads after initial installation, allowing the malware to easily update its functionality and evade static analysis. The use of custom decryption (decryptFile, decryptAES) further enhances the obfuscation and anti-analysis capabilities, indicating a deliberate attempt to conceal the loaded code’s true nature.
2. Custom Decryption and Execution (HIGH): The decryptFile function performs custom decryption of a file, likely to obfuscate the malicious payload. The decryption utilizes an AES algorithm, which adds a layer of complexity to the reversal process. In combination with the dynamic loading, this points strongly towards a sophisticated malware strategy. The decrypted file is then written to disk (fopen, fwrite) and executed using a custom class loader.
3. Time Cloaking (MEDIUM): thread_function incorporates time-based cloaking. It checks the user's timezone (getTimeZoneInfo) and potentially alters behavior if the timezone is not "America/Sao Paulo". More importantly, it appends a timestamp ("?time=" + current time) to a string used for downloading further files. This suggests an attempt to evade detection by changing behavior based on the device's time and potentially making analysis more challenging by dynamically creating different network requests.
4. Network Cloaking (MEDIUM): thread_function exhibits additional cloaking behavior, though details are limited. The function interacts with network resources to download additional resources, showing its network-active nature. The data used in network operations are likely obfuscated through string manipulation and URL obfuscation, making identification and analysis difficult. The precise cloaking mechanisms used are not fully clear from the decompiled code, but the presence of this behavior is a significant concern.
5. Obfuscation Techniques (HIGH): The entire codebase uses extensive obfuscation techniques. The code is heavily reliant on complex data structures (deques, hash tables, trees) and intricate control flow. Function and variable names are nonsensical, hindering analysis. The use of multiple layers of indirection, combined with aggressive heap memory management (operator_new, operator_delete), makes the code difficult to understand and analyze. These obfuscation methods are typical of malware aiming to evade reverse engineering.
6. Anti-Debugging/Anti-Analysis Techniques (HIGH): The JNI_OnLoad function includes a call to ptrace(PTRACE_TRACEME, 0, 0, 0). This is a clear indicator of an attempt to detect debugging or analysis tools. If a debugger is attached, this function’s behavior will likely differ; it could crash, stall the execution, or even hide its malicious code.
Conclusion:
The combination of dynamic code loading, custom decryption, sophisticated obfuscation, and anti-debugging techniques creates a high likelihood of malicious activity. The incomplete understanding of the role of several custom functions further aggravates the risk. The app’s behavior is heavily contingent on received data and time-sensitive factors, raising serious concerns about its potential for wide-ranging malicious acts. Immediate action should be taken to contain and investigate this application further. A detailed sandbox analysis is critical to identify any network communications, file system manipulation, or other abusive activities.
As seen in the Gemini output, the Android ELF behaviors are explained clearly for the functions matched by capa rules.
In this particular example, Gemini helped to:
Highlight the function call sequences that perform dynamic code loading, where our analysts can easily inspect the key function calls getCacheFilePath and getDexClassLoader
Identify the timezone extraction with the additional URL parameter hint, where our analysts may try to probe the malicious payload quickly and accurately
Describe more potential suspicious behaviors (e.g. getDexClassLoader JNI call, URL obfuscation) for further rule-writing ideas
capa rules in Android together with Gemini summarization shows great potential for further malware detection with more advanced techniques. Our analysts are closely monitoring the malware trends and techniques in the market and writing up-to-date capa rules to catch the bad actors in the wild.
Android’s Multi-Layered Security Approach
Android’s ever-evolving, multi-layered security approach includes integrating advanced features and working with developers and device implementers to keep the Android platform and ecosystem safe. This includes, but is not limited to:
Advanced built-in protections: Google Play Protect automatically scans every app on Android devices with Google Play Services, no matter the download source. This built-in protection, enabled by default, provides crucial security against malware and unwanted software. Google Play Protect scans more than 200 billion apps daily and performs real-time scanning at the code-level on novel apps to combat emerging and hidden threats, like polymorphic malware. In 2024, Google Play Protect’s real-time scanning identified more than 13 million new malicious apps from outside Google Play.
Google Play and developer protections from malware: To create a trusted experience for everyone on Google Play, we use our SAFE principles as a guide, incorporating multi-layered protections that are always evolving to help keep Google Play safe. These protections start with the developers themselves, who play a crucial role in building secure apps. We provide developers with best-in-class tools, best practices, and on-demand training resources for building safe, high-quality apps. Every app undergoes rigorous review and testing, with only approved apps allowed to appear in the Play Store. Before a user downloads an app from Play, users can explore its user reviews, ratings, and Data safety section on Google Play to help them make an informed decision.
Engagement with the security research community: Google works closely with the security community on multiple levels, including the App Defense Alliance, to advance app safety standards. Android also collaborates with Google Threat Intelligence Group (GTIG) to address emerging threats and safeguard Android users worldwide.
Equipped with the fast-evolving Gemini, our analysts are able to spend less time on these sophisticated samples, minimizing exposure to malicious apps and helping ensure the safety of the Android ecosystem.
Acknowledgement
Special thanks to Willi Ballenthin, Yannis Gasparis, Mike Hunhoff, and Moritz Raabe for their support.
As AI continues to unlock new opportunities for business growth and societal benefits, we’re working to reduce the carbon intensity of AI systems — including by optimizing software, improving hardware efficiency, and powering AI models with carbon-free energy.
Today we’re releasing a first-of-its-kind study1 on the lifetime emissions of our Tensor Processing Unit (TPU) hardware. Over two generations — from TPU v4 to Trillium — more efficient TPU hardware design has led to a 3x improvement in the carbon-efficiency of AI workloads.2
Our life-cycle assessment (LCA) provides the first detailed estimate of emissions from an AI accelerator, using observational data spanning raw material extraction and manufacturing through energy consumption during operation. These measurements provide a snapshot of the average, chip-level carbon intensity of Google's TPU hardware, and enable us to compare efficiency across generations.
Introducing Compute Carbon Intensity (CCI)
Our study examined five models of TPUs to estimate their full life-cycle emissions and understand how hardware design decisions have impacted their carbon-efficiency. To measure emissions relative to computational performance and enable apples-to-apples comparisons between chips, we developed a new metric — Compute Carbon Intensity (CCI) — that we believe can enable greater transparency and innovation across the industry.
CCI quantifies an AI accelerator chip’s carbon emissions per unit of computation (measured in grams of CO2e per Exa-FLOP).3 Lower CCI scores mean lower emissions from the AI hardware platform for a given AI workload — for example training an AI model. We’ve used CCI to track the progress we’ve made in increasing the carbon-efficiency of our TPUs, and we’re excited to share the results.
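In other words:

CCI = total lifetime emissions (embodied + operational, in gCO2e) / total lifetime compute (in Exa-FLOPs)

For example, with purely hypothetical numbers, a chip responsible for 2,000 kgCO2e of lifetime emissions that delivers 2,000,000 Exa-FLOPs of compute over its lifetime would have a CCI of 1 gCO2e per Exa-FLOP.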
Key takeaways
Google’s TPUs have become significantly more carbon-efficient. Our study found a 3x improvement in the CCI of our TPU chips over 4 years, from TPU v4 to Trillium. By choosing newer generations of TPUs — like our 6th-generation TPU, Trillium — our customers not only get cutting-edge performance, but also generate fewer carbon emissions for the same AI workload.
Operational electricity emissions are key. Today, operational electricity emissions comprise the vast majority (70%+) of a Google TPU's lifetime emissions. This underscores the importance of improving the energy efficiency of AI chips and reducing the carbon intensity of the electricity that powers them. Google's effort to run on 24/7 carbon-free energy (CFE) on every grid where we operate by 2030 aims directly at reducing the largest contributor to TPU emissions: operational electricity consumption.
Manufacturing matters. While operational emissions dominate an AI chip’s lifetime emissions, emissions associated with chip manufacturing are still notable — and their share of total emissions will increase as we reduce operational emissions with carbon-free energy. The study’s detailed manufacturing LCA helps us target our manufacturing decarbonization efforts towards the highest-impact initiatives. We’re actively working with our supply chain partners to reduce these emissions through more sustainable manufacturing processes and materials.
Our significant improvements in AI hardware carbon-efficiency in this paper complement rapid advancements in AI model and algorithm design. Outside of this study, continued optimization of AI models is reducing the number of computations required for a given model performance. Some models that once required a supercomputer to run can now be run on a laptop, and at Google we’re using techniques like Accurate Quantized Training and speculative decoding to further increase model efficiency. We expect model advancements to continue unlocking carbon-efficiency gains, and are working to quantify the impact of software design on carbon-efficiency in future studies.
Partnering for a sustainable AI future
The detailed approach we’ve taken here allows us to target our efforts to continue increasing the carbon-efficiency of our TPUs.
This life-cycle analysis of AI hardware is an important first step in quantifying and sharing the carbon-efficiency of our AI systems, but it’s just the beginning. We will continue to analyze other aspects of AI’s emissions footprint — for example AI model emissions and software efficiency gains — and share our insights with customers and the broader industry.
1. The authors would like to thank and acknowledge the co-authors for their important contributions: Ian Schneider, Hui Xu, Stephan Benecke, Tim Huang, and Cooper Elsworth.
2. A February 2025 Google case study quantified the full lifecycle emissions of TPU hardware as a point-in-time snapshot across Google's generations of TPUs. To estimate operational emissions from electricity consumption of running workloads, we used a one-month sample of observed machine power data from our entire TPU fleet, applying Google's 2023 average fleetwide carbon intensity. To estimate embodied emissions from manufacturing, transportation, and retirement, we performed a life-cycle assessment of the hardware. Data center construction emissions were estimated based on Google's disclosed 2023 carbon footprint. These findings do not represent model-level emissions, nor are they a complete quantification of Google's AI emissions. Based on the TPU location of a specific workload, CCI results of specific workloads may vary.
3. CCI includes both estimates of lifetime embodied and operational emissions in order to understand the impact of improved chip design on our TPUs. In this study, we hold the impact of carbon-free energy on carbon intensity constant across generations by using Google's 2023 average fleetwide carbon intensity. We did this purposefully to remove the impact of deployment location on the results.
Last year we announced Imagen 3, our highest quality image generation model. Imagen 3 is available to Vertex AI customers, which means businesses can create high quality images that reflect their own brand style and logos for use in marketing, advertising, or product design.
Today, we'll share how you can build your brand style with a logo using Imagen 3, Gemini, and the Python library Pillow.
First, use Imagen 3 to generate visual options
Imagen 3 generates the most realistic and highest quality images from simple text prompts, surpassing previous versions of Imagen in detail, lighting, and artifact reduction. The new Imagen 3 generation model (002) delivers even higher visual appeal, prompt alignment, and overall preference.
Here’s how it works: Imagen 3 generates the initial images, Gemini selects and refines them, while Pillow enables precise integration and manipulation. This collaborative workflow allows for a high degree of customization and efficiency in building your brand identity.
Imagen 3 uses natural language processing (NLP) to transform text descriptions into high-quality images. But here’s the secret to getting the right image: combine Imagen with Gemini’s selection process.
Let’s take an example. Imagine you’re opening a coffee shop named “Layo Cafe.” You want a logo that embodies your brand’s modern, inviting aesthetic.
Here’s how you can use Imagen and Gemini to help:
Describe your vision: Provide Imagen with a prompt, for example, "Create an image for a new coffee shop campaign". Gemini will rephrase your prompt to generate a better prompt for the image generation, for example, "Photorealistic image of a bright, modern coffee shop interior, showcasing a steaming cup of coffee on a minimalist table, bathed in warm, natural light. Focus on the coffee and the inviting atmosphere."
Generate options: Imagen will generate multiple variations based on your description.
Gemini’s selection: Gemini, Google’s next-generation large language model, steps in to analyze each image. It considers factors like aesthetics, readability, and brand alignment to select the most suitable option.
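A minimal sketch of the generation step using the Vertex AI SDK for Python follows; the project details and model ID are assumptions, so check the Imagen documentation for current values:

import vertexai
from vertexai.preview.vision_models import ImageGenerationModel

vertexai.init(project="my-project", location="us-central1")

# Model ID is an assumption; use the current Imagen 3 model name.
model = ImageGenerationModel.from_pretrained("imagen-3.0-generate-002")

# The Gemini-refined prompt from the step above.
prompt = (
    "Photorealistic image of a bright, modern coffee shop interior, "
    "showcasing a steaming cup of coffee on a minimalist table, "
    "bathed in warm, natural light."
)

# Generate multiple variations for Gemini to choose from.
response = model.generate_images(prompt=prompt, number_of_images=4)
for i, image in enumerate(response.images):
    image.save(location=f"option_{i}.png")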
In this example, Imagen created four images.
When asked which one performs the best, Gemini chose the first one. Why? Based on the provided instructions, it showed the best balance of elements. It shows a steaming cup of latte art coffee, in a bright, modern setting with warm natural light streaming in from large windows. The background is nicely blurred, keeping the focus on the coffee. The overall aesthetic is inviting and appealing, likely to attract customers. The other images either lack the latte art (important for showcasing the cafe’s offerings) or the lighting isn’t as warm and inviting.
Adjust or add instructions to the Gemini prompt based on the desired output to ensure the best generated image is selected, as each use case and expectation may vary.
Next, build your logo
Now that you have the right image, it's time to integrate it with your marketing visuals. This works with three tools working together: Gemini, Imagen, and the Pillow library.
Set the scene: Provide Imagen with a prompt describing the desired image, for example, "Create an image for a new coffee shop campaign". Gemini will rephrase your prompt to generate a better prompt for the image generation, for example, "Photorealistic image of a bright, modern coffee shop interior, showcasing a steaming cup of coffee on a minimalist table, bathed in warm, natural light. Focus on the coffee and the inviting atmosphere."
Ask Gemini to curate a selection based on your brand needs: Gemini analyzes the generated images and selects the one that best represents your brand and aligns with the desired aesthetic. Repeat the process to create a new logo, or if you already have a logo, proceed to the next step.
Integrate with Pillow: The Pillow library adds your Layo Cafe logo to the chosen image, ensuring optimal placement and size for maximum impact.
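A sketch of the Pillow step, assuming a logo file with transparency (file names, sizing, and placement are illustrative):

from PIL import Image

background = Image.open("option_0.png").convert("RGBA")
logo = Image.open("layo_cafe_logo.png").convert("RGBA")

# Scale the logo to roughly 15% of the background width.
scale = background.width * 0.15 / logo.width
logo = logo.resize((int(logo.width * scale), int(logo.height * scale)))

# Paste in the top-left corner with a margin, using the logo's alpha
# channel as the mask so transparency is preserved.
margin = 40
background.paste(logo, (margin, margin), mask=logo)
background.save("campaign_with_logo.png")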
In this case, this was the preferred logo option:
Finally, land your message
Amplify your message by overlaying text with visuals. Whether it’s a catchy tagline or a special offer, integrating text into your AI-generated images is a powerful way to engage your audience.
Craft your message: Decide on the text you want to overlay on your image. For example, “Layo Cafe: Your daily dose of inspiration.”
Apply text overlay using the Pillow library: This Python Imaging Library acts as the artist’s brush, expertly adding the text to the image according to Gemini’s recommendations. With Pillow, the integration of text becomes seamless, allowing for a polished final product.
Reach a global audience: One of the most exciting features of this process is the ability to overlay text in any language on your generated images. This multilingual support broadens your creative horizons, enabling you to reach diverse audiences with tailored messages.
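A sketch of the text overlay with Pillow; the font file, coordinates, and styling are illustrative:

from PIL import Image, ImageDraw, ImageFont

image = Image.open("campaign_with_logo.png")
draw = ImageDraw.Draw(image)

# Any TrueType font works, including fonts covering non-Latin
# scripts for multilingual messages.
font = ImageFont.truetype("Roboto-Regular.ttf", 64)
draw.text(
    (image.width // 2, image.height - 120),
    "Layo Cafe: Your daily dose of inspiration.",
    font=font,
    fill="white",
    anchor="mm",  # center the text on the given point
)
image.save("campaign_final.png")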
Let’s bring everything together. Here is the logo with text on Imagen’s best-generated image.
Get started today
By combining the creative ability of Imagen with the intelligent selection and design capabilities of Gemini, you can generate a logo, create branded marketing materials, and enhance your visual storytelling. Want to see the code and examples? Check out the code here on GitHub.
Artificial intelligence (AI) is no longer a futuristic concept. In 2025, it’s poised to become a cornerstone of public sector operations, transforming how agencies make decisions and serve constituents. By leading with AI-powered innovation, government agencies can improve efficiency, enhance decision-making, and provide better services to citizens.
According to a recent Google-commissioned IDC study that surveyed 161 federal Chief AI Officers and other AI decision makers, 50% of agencies reported high levels of AI maturity, which corresponds to mature behaviors like embracing an innovation culture and piloting generative AI use cases to drive mission impact.
Let’s dive into five AI trends we believe will shape the future of the public sector in the year ahead.
Trend #1: Multimodal AI: Unleashing the power of context
Imagine a world where AI can understand and analyze information from various sources like text, images, video and even audio. This is the power of multimodal AI. This year will be pivotal for AI adoption, driven largely by multimodal learning and the contextual awareness it enables. Looking ahead, multimodal AI will enable agencies to analyze local and state-level data and combine it with data from other sources like Google Earth Engine, Google Maps, Waze, and public data sets to improve decision-making, pre-empt climate-related risks and improve public infrastructure.
Google AI empowers Hawaii Department of Transportation (HDOT) to address climate challenges. By leveraging Google Earth Engine and Google Cloud to deploy a Climate Resilience Platform, they are able to assess risk and prioritize investment decisions based on multiple climate risks, asset conditions, and community impact.
Trend #2: AI agents: The evolution from chatbots to multi-agent systems
We’ve gone from simple chatbots of the past, to sophisticated AI agents of the future, capable of handling the most complex tasks. These agents can reason, plan, and even learn from their interactions. In the future, AI agents will help government employees work and code more efficiently, manage their applications, gain deeper data insights, identify and resolve security threats, and bring their best ideas to life.
Sullivan County, NY is using virtual agents that use Google AI to serve more citizens faster, at any hour of the day. With limited staff and budget, Sullivan County created a solution that supports citizens beyond traditional work hours while freeing up government workers to focus on strategic work.
Trend #3: Assistive search: The next frontier for knowledge work
AI is already transforming the way we access and understand information. With generative AI, governments can improve the accuracy and efficiency of searching vast datasets. By investing in semantic search, automated metadata tools, and advanced document transcription, agencies can unlock the value of their data and make it more accessible.
The Air Force Research Laboratory (AFRL) is leveraging Google Cloud’s cutting-edge artificial intelligence (AI) and machine learning (ML) capabilities to tackle complex challenges across various domains, from materials science and bioinformatics to human performance optimization. AFRL is embracing the power of AI and cloud computing to accelerate its mission of developing and transitioning advanced technologies to the air, space, and cyberspace forces.
Trend #4: AI-powered constituent experience: seamless and personalized
In 2025, AI will help improve citizen experiences. AI-powered tools and technologies can build trust and foster closer citizen-government relationships by enabling citizens to quickly and easily navigate government websites and services—such as applying for permits and licenses—offered in multiple languages and available 24/7.
Working with Google AI, the Wisconsin Department of Workforce Development (DWD) was able to scale the state's response to unemployment insurance claims and speed up overall response time. They were also successful in screening out fraudulent claims so that the UI program could be administered, with integrity, to Wisconsinites who needed financial assistance.
Trend #5: Security gets tighter, and tougher, with AI
As AI becomes more prevalent, so do the threats it poses. However, AI can also be a powerful tool for enhancing security. By automating threat detection, analyzing vast amounts of data, and responding to incidents quickly, AI can help protect government systems and sensitive information. As AI increasingly permeates government and citizen services, agencies must prioritize security to combat threats like deepfakes and disinformation.
New York City is hit by 90 billion cyber events every single week. “We have to distill those 90 billion events down to less than 50 or 60 things we look at. We couldn’t do that without a lot of artificial intelligence and automated decision-making tools.” – Matthew Fraser, Chief Technology Officer, New York City.
The opportunity ahead
We are already seeing incredible energy around AI in public sector agencies across the country who are using it to improve efficiency, enhance decision-making, and better serve their constituents. Going forward, we expect agencies to appoint and invest in AI leaders who bolster AI governance, build public trust, and invest in an effective workforce strategy with AI and security at the forefront. This will be crucial to ensure the responsible and secure development and deployment of AI across all levels of government.
Download the AI Trends 2025 in Public Sector report to learn more about how AI can help support your mission and accelerate your impact.
Cloud SQL Enterprise Plus edition provides high performance and availability for demanding applications with enhanced read and write performance. And high-performance applications often require that you tune the underlying database services.
To help application developers and DBAs build and deploy high-performing applications, we've added new capabilities to query insights for Cloud SQL Enterprise Plus edition. This new database observability offering builds on top of the existing query insights capabilities in Cloud SQL Enterprise edition, and provides a unified and comprehensive observability experience that helps developers and database teams optimize performance faster than ever. Query insights for Cloud SQL Enterprise Plus edition captures and analyzes query telemetry and statistics to surface key performance indicators, diagnostic reports, and recommendations in an easy-to-consume format in the Google Cloud console. These signals and recommendations help application developers and database teams observe and tune overall database performance quickly, easily, and efficiently.
With the new enhanced capabilities in query insights for Cloud SQL Enterprise Plus edition, you can:
Solve nuanced database performance issues faster than ever before. Access fine-grained database metrics such as wait events to conduct deeper root-cause analysis. With richer, near-real-time diagnostics, you can easily analyze query executions at a granular level. Query insights for Cloud SQL Enterprise Plus edition also helps you detect query plan regressions by capturing plans for all unique query executions and highlighting the rate-determining step for each execution.
Control your active query executions. Gain visibility into the queries currently running in your system. You can also choose to terminate sub-optimally running queries to unblock other critical queries and manage system resources better.
Enhance database performance with intelligent recommendations tailored for dynamic workloads. Query insights for Cloud SQL Enterprise Plus edition automatically analyzes workloads, highlights performance issues, and provides recommendations to solve them. It looks for common problems, like missing indexes and missing or incorrect flags, to help optimize queries and tune databases.
Use an AI-powered chat interface to ask your performance-related questions. Query insights for Cloud SQL Enterprise Plus edition comes with an AI-powered natural language interface that provides advanced troubleshooting tips and tailored recommendations to make the resolution of complex database problems easier.
Let’s look at these capabilities in more detail.
Detailed query plans with 30-day telemetry
With 30 days of telemetry data, you can analyze long-term trends and identify recurring issues in query performance. By reviewing detailed query execution plans and comparing them over time, you can pinpoint inefficiencies and make data-driven optimizations for sustained database improvements.
Optimize query performance with historical analysis and execution plans over time
Wait events
Wait events in query insights help you identify where your database is stuck, such as on disk I/O or locks. This enables faster diagnosis of performance bottlenecks and smarter resource optimization.
Diagnose and optimize query performance with wait events
Index recommendations
Index recommendations help you address performance bottlenecks by detecting missing indexes and providing precise, actionable guidance to improve query performance. These recommendations offer specific SQL index-creation commands, show their potential performance impact, and highlight the affected queries, streamlining database performance optimization.
Index recommendations for enhanced query performance
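For example, such a recommendation might surface a ready-to-run statement shaped like the following (an illustrative sketch only; the table and column names here are hypothetical, and the actual recommendation names the objects in your schema along with the estimated impact):
-- Hypothetical example of an index-creation command a recommendation might provide
CREATE INDEX idx_orders_customer_id ON orders (customer_id);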
Get started
Query insights for Cloud SQL Enterprise Plus edition brings enterprise-grade observability to tier-1 workloads on Google Cloud managed databases. With these advanced performance management capabilities, you can simplify complex workflows, monitor database health, write better queries, and meaningfully optimize system performance. This reduces mean time to resolution for database performance issues, and allows application developers and database teams to focus more on core business logic.
Query insights for Cloud SQL Enterprise Plus edition is now available in preview. Simply access query insights within the console and begin monitoring and managing your database performance. To get started, please visit our documentation.
In the federal government, organizations face unique challenges in meeting strict security and compliance requirements. FedRAMP, IL4, and IL5 standards set forth rigorous guidelines to ensure the protection of sensitive data and systems. Google Cloud understands the importance of meeting these accreditation needs and provides a comprehensive suite of tools and capabilities to support federal agencies.
Security-first development with Google Cloud
Google Cloud offers a suite of developer tools to streamline your workflow and enhance security:
Cloud Workstations: Provides secure, cloud-based development workstations, enabling remote work without weakening your security posture.
Gemini Code Assist: Provides AI-powered code completion and suggestions, helping you write better code faster.
Assured Open Source Software: Offers a curated catalog of audited and approved open-source software, helping ensure compliance and security.
Cloud Build: Provides an automated build and deployment service that streamlines the development process and reduces the risk of errors.
Artifact Registry: Provides secure storage and management of container images, making it easy to track and manage versions of software.
The Google Cloud DevOps platform creates a strong foundation for a secure software supply chain by addressing various aspects of security throughout the development lifecycle. Powered by Cloud Workstations, Cloud Build, Artifact Registry,Google Kubernetes Engine and Cloud Run, this robust approach can help federal agencies protect their applications from supply chain attacks and ensure the integrity of their software.
Meet FedRAMP, IL4, and IL5 compliance with ease
Google Cloud offers security controls and certifications to help federal agencies meet various compliance regime requirements, such as FedRAMP, IL4 and IL5. These include:
Comprehensive encryption: Data is encrypted at rest and in transit, both in Google Cloud and when transferred to and from on-premises systems.
Vulnerability management: Google Cloud provides tools and services to identify and remediate vulnerabilities in a timely manner.
Continuous monitoring: Security Command Center provides real-time monitoring and threat detection.
Accelerating federal security and compliance with Google Cloud
Google Cloud provides a comprehensive platform that enables federal agencies to meet security accreditation needs while enhancing developer productivity. With a range of tools and capabilities, Google Cloud is the ideal partner for public sector organizations looking to innovate and transform their operations.
At Google Cloud, we’re deeply invested in making AI helpful to organizations everywhere — not just for our valued customers, but for our equally important partners.
Today, we’re thrilled to introduce a significant leap forward in how we enable our partners to co-market with us: Gemini-powered content creation within Partner Marketing Studio. These AI features are designed to streamline marketing efforts across our entire ecosystem, empowering our partners to unlock new levels of success, efficiency, and impact.
The evolving landscape of partner marketing and the need for AI
Today’s marketers are faced with a complex landscape of digital channels, diverse customer segments, and an ever-increasing demand for personalized experiences. In this dynamic environment, the ability to create high-quality, targeted content quickly and efficiently is more critical than ever.
For our partners, this challenge is amplified by the need to not only promote their own unique services and solutions, but also to tell a joint story with Google Cloud. This requires a delicate balance of creativity, strategic thinking, and operational efficiency. That’s where AI comes in.
With the addition of generative AI content creation, Partner Marketing Studio — our complimentary marketing automation tool for partners — evolves from a content repository into an extension of a partner’s marketing team, making it easier than ever to build campaigns that generate awareness and demand for partner services and solutions in three ways:
#1 Customize campaigns with precision
Partner Marketing Studio is where Google Cloud partners can access a curated library of customizable assets, tap into Google’s marketing expertise, and streamline campaign execution. Our new AI editing feature sits on top of this curated campaign library to give partners the ability to customize campaigns for their specific audience, whether it’s by organizational maturity, target persona, or industry.
Imagine you’re a partner launching a new campaign to promote your expertise in data analytics. You find a pre-built campaign in Partner Marketing Studio that aligns with your goals, but you need to tailor it for the healthcare industry. With the editing feature, you can easily refine the existing content with just a few prompts, ensuring your message best resonates with your target audience.
#2: Generate original campaigns from scratch with ease
Often partners need to create co-marketing content entirely from scratch. That’s where the AI creation feature comes in. This powerful new capability empowers partners to generate original marketing content quickly and efficiently within a co-branded template, without the need for extensive design or copywriting resources.
Imagine a partner is launching a new service that leverages Google Cloud’s AI capabilities. They need a fresh email campaign to promote this service to their target audience. With the creation feature, they can simply provide the objective of the asset they are creating and any relevant filters (like industry or audience), and in just moments, they will get a compelling, ready-to-use, co-branded email, complete with subject line, body copy, and a clear call to action—all infused with Google’s marketing expertise.
With the creation feature, partners can generate original marketing content, tailored to the needs of their organization, saving time and resources while ensuring the message is on point for the target audience.
#3: Built on Google’s foundation of AI innovation and expertise
These new features are powered by Gemini and grounded in Google’s content and brand standards. Built on Google’s deep understanding of AI and trained on our vast knowledge base of best practices and marketing insights, these features aren’t just providing access to generative AI capabilities. Partners gain access to Google’s marketing expertise, distilled into an intuitive, easy-to-use interface.
We’ve leveraged the power of Gemini to ensure these tools are not only powerful but also aligned with our brand standards and best practices. This means you can trust that the content generated by the editing and creation features is not only effective but also consistent with the Google Cloud brand.
Bringing AI marketing training to our partners
We’re committed to providing training and support to help partners effectively leverage new AI capabilities across Google products. That’s why we’re thrilled to announce that in 2025, we’ll be making one of our most successful internal training series available to our partners: AI Boost Bites.
This video training series is designed to provide practical, on-the-job training in just 5 minutes. Each bite-sized episode features Google marketers who’ve successfully integrated AI into their daily work, using tools like Gemini, Gemini for Workspace, NotebookLM, and AI Studio. Partners will get to see firsthand how Google marketers use these tools to produce compelling content (from text and graphics to video and audio), develop market insights, and solve real-world marketing challenges. To help put these learnings into practice, each video is followed by a “challenge” to apply the concepts. AI Boost Bites has been instrumental in upskilling Google marketers in AI, fostering a culture of continuous learning and innovation that we are excited to extend to our partners. To find the AI Boost Bites training series and start using the AI features, log in to Partner Marketing Studio.
Investing in our partners through comprehensive co-marketing benefits:
Partners at the Partner and Premier levels of the Partner Advantage Program can access Partner Marketing Studio and get started today. Partner Marketing Studio is more than just an AI-powered content creation tool: it’s a comprehensive platform designed to support the marketing efforts of Google Cloud partners every step of the way. In addition to generative AI features, partners can also access:
Global content library: Localized assets like Google-authored reports, targeted emails, social posts, banner ads, one-pagers, and pitch decks support our partners across the globe.
Automation tools: Monitor the performance of your campaigns and make data-driven decisions.
Marketing resources: Access a wealth of co-marketing resources, including brand guidelines, logos, and marketing playbooks.
Campaigns: Explore our collection of pre-built campaigns, designed to help you promote your services and solutions.
Messaging: Use Google Cloud product and solution messaging to train your sales and marketing teams and develop your unique joint message with Google Cloud.
Templates: Design and build effective marketing and sales assets with ease using pre-built templates and plug-and-play editing.
Live support: Get personalized guidance from Google Cloud marketing experts. Our team hosts regular, live webinars, weekly orientation sessions for those new to Partner Marketing Studio, and one-on-one support to help you develop effective marketing strategies and maximize your results.
Google Cloud speakers: Request a Google Cloud speaker for your events. Elevate your events with the expertise and insights of our industry-leading speakers.
Google-vetted agencies: Connect with trusted marketing agencies with proven success supporting Google Cloud partners. We’ve curated a list of top-tier agencies that specialize in Google Cloud marketing, so you have ready access to proven outside support for your marketing initiatives.
What our pilot partners are saying
“With the new AI features in Partner Marketing Studio, we can create more targeted industry and persona-based versions of our Google Cloud marketing campaigns automatically. I’m excited that this efficiency will enable my team to focus on more strategic marketing activities and close more deals.” – Elissa Robins, Head of Marketing, SADA, An Insight company
Ready to unlock AI for your marketing?
We invite our partners and their marketing teams to join us for our upcoming webinar on February 18th to learn more about Partner Marketing benefits, and discover how Google Cloud is empowering partners to achieve unprecedented marketing success. This is your opportunity to get a firsthand look at these powerful new tools, including the new AI features, and learn how you can leverage them to drive your business forward.
The ability to deploy Swift’s Alliance Connect Virtual in Google Cloud allows financial institutions to leverage the scalability, flexibility, and cost-effectiveness of cloud infrastructure while maintaining the security and reliability standards required for financial transactions. By virtualizing the traditionally hardware-based Swift VPN connections, institutions can streamline their infrastructure, reduce operational overhead, and accelerate their digital transformation initiatives. Additionally, Google Cloud’s robust security features and compliance certifications help keep sensitive financial data protected.
“Cloud technology has been game-changing for the financial industry over the past decade and will be a key enabler of future transaction forms and flows. With the launch of Alliance Connect Virtual, Swift has taken a major step forward in supporting our customers’ cloud journeys, offering seamless and secure access to Swift via the public cloud. Teaming up with Google Cloud, we’re proud to deliver flexible and resilient solutions that align with the fast-growing cloud-first mindset of our customers, driving innovation while maintaining the highest levels of security and reliability. The feedback we have received from our pilot customers on Google Cloud has been overwhelmingly positive, and we are looking forward to seeing the adoption of the new offer scale.” – Sophie Racquet, Head of Alliance Connect Product Management, Swift
Architecting Alliance Connect Virtual on Google Cloud
The following diagrams show reference architectures for deploying the Alliance Connect Virtual connectivity project on Google Cloud. Alliance Connect Virtual is set up in Google Cloud and provides connectivity to Swift through virtualized Juniper vSRX VPNs, over the internet or over pseudo-leased-line connections to the Swift network through network providers, depending on the customer-chosen connectivity offering (Gold, Silver, or Bronze). A pseudo leased line consists of four VLAN attachments; each pair of VLAN attachments has its own Cloud Router and two Partner Interconnect connections.
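As a rough illustration of that wiring, the following Terraform sketch shows one such pair: a Cloud Router serving two Partner Interconnect VLAN attachments placed in separate edge availability domains. This is a minimal sketch under assumed names; the resource names, network, and region are hypothetical placeholders, not values from Swift’s deployment guides, and a full pseudo leased line would repeat this pattern with a second router and attachment pair.
# One Cloud Router with a pair of Partner Interconnect VLAN attachments
# (all names and the region below are illustrative placeholders).
resource "google_compute_router" "swift_router_a" {
  name    = "swift-router-a"
  network = "untrust-vpc"
  region  = "europe-west1"
}

resource "google_compute_interconnect_attachment" "swift_vlan_a1" {
  name                     = "swift-vlan-a1"
  type                     = "PARTNER"
  router                   = google_compute_router.swift_router_a.id
  region                   = "europe-west1"
  edge_availability_domain = "AVAILABILITY_DOMAIN_1"
}

resource "google_compute_interconnect_attachment" "swift_vlan_a2" {
  name                     = "swift-vlan-a2"
  type                     = "PARTNER"
  router                   = google_compute_router.swift_router_a.id
  region                   = "europe-west1"
  edge_availability_domain = "AVAILABILITY_DOMAIN_2"
}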
Alliance Connect Virtual is offered in three packages: Bronze, Silver and Gold. Depending on your Swift traffic’s criticality along with resiliency requirements, you can use the tier that best aligns with your needs. Find below the architecture for each package.
Alliance Connect Virtual Gold:
The Alliance Connect Virtual Gold connectivity package provides the strongest resiliency and service level of the three options. Connectivity to Swift is made through Partner Interconnect, provisioning two connections of equal capacity, with an enterprise-grade connection to Google Cloud that has the highest throughput of the three packages. Traffic goes through a service provider with a dedicated connection. By bypassing the public internet, your traffic takes fewer hops, so there are fewer points of failure where it might get dropped or disrupted. This option is designed for customers handling more than 40,000 messages per day.
Alliance Connect Virtual Silver:
The Alliance Connect Virtual Silver package provides connectivity through one dedicated pseudo-leased-line connection through a network provider using Partner Interconnect, providing high bandwidth and throughput. In this setup, an internet connection is added as a backup. This option is designed for customers handling between 1,000 and 40,000 messages per day.
Alliance Connect Virtual Bronze:
The Alliance Connect Virtual Bronze option provides low-cost internet connectivity. In this setup, you can connect two VPN boxes in order to maintain a backup connection in case of failure. This option is designed for customers handling up to 1,000 messages per day.
Find out more about the different Alliance Connect Virtual packages here.
This architecture includes the following components:
A set of VPC networks for different vSRX network interfaces to segregate the traffic (Untrust VPC, Trust VPC, Interconnect VPC, and Management VPC). The traffic to Partner Interconnect or the internet goes through the Untrust VPC.
A set of VPC subnets for different vSRX network interfaces to segregate the traffic (Untrust subnets, Trust subnets, Interconnect subnets, and Management subnets).
A set of firewall rules to control egress/ingress traffic between the Swift network and other VPCs.
Configuration of the routes for the VPCs created above.
Cloud Routers, as per the architecture above, that provide the routing for Cloud Interconnect.
Swift offers various messaging interfaces tailored to different customer needs and levels of complexity. Below, we show how each of the following messaging applications can be deployed on Google Cloud and connected via Alliance Connect Virtual.
Alliance Cloud
Alliance Access
Alliance Messaging Hub
Along with the messaging interface, the High Availability (HA) tool is deployed in the application project. This tool enhances the resilience and uptime of the connection to the Swift network through Alliance Connect Virtual (the connectivity packs deployed in the VPN project). The HA VMs achieve this by:
Monitoring and managing routing tables: This helps ensure that if one connection path to the Swift network or one availability zone becomes unavailable, the traffic can be seamlessly rerouted through the alternative path, minimizing disruption.
Maintaining redundant vSRX machines: Typically, the HA VMs oversee the two Compute Engine VMs that host the Juniper vSRX VPN, with one vSRX acting as the primary connection point and the other on standby. If the primary vSRX fails, the other vSRX automatically takes over the connection, helping to ensure continuity of service.
1. Alliance Cloud on Google Cloud:
Alliance Cloud is a fully managed, cloud-based financial messaging interface that connects customers to Swift’s services with the benefits of cloud deployments, such as reduced infrastructure management. Alliance Cloud offers a reduced total cost of ownership given that it is managed and hosted by Swift. Find more information on the Swift website.
Alliance Cloud offers the following connectivity options to integrate the messaging flows of customers’ back-office applications with Alliance Cloud:
Alliance Cloud offers a direct API called the Swift Messaging API, allowing customer back-office systems to integrate with Alliance Cloud using RESTful APIs. This can be achieved by choosing from Swift’s API footprint options: zero footprint, Swift SDK, or Swift Microgateway (more information can be found on the Swift Developer Portal).
Alliance Cloud offers a software footprint through the Swift Integration Layer. This offers both file and RESTful API connectivity between the Swift Integration Layer and the customer back-office applications.
2. Alliance Access on Google Cloud:
Alliance Access is a Swift messaging interface that enables banks and financial institutions to connect securely to Swift. Find more information on the Swift website. Alliance Access components can be deployed and managed within your Google Cloud environment. The following components make up the Alliance Access solution:
Alliance Access Server: This is the core of the solution, a software application installed on the institution’s infrastructure. It acts as the interface between the institution’s internal systems and the Swift network.
Alliance Web Platform: A web-based interface that allows users to monitor message flows, manage configurations, and perform various operational tasks related to Swift messaging.
Alliance Gateway: A component that provides additional security and routing capabilities, by concentrating your flows from different interfaces through to Swift.
SwiftNet Link (SNL): Enables Alliance Gateway to perform application-to-application communication over SwiftNet services. Connectivity can be established via the different connectivity packs of Alliance Connect Virtual on Google Cloud.
Below, we present a few reference architectures showing what a deployment of Alliance Access on Google Cloud, using Alliance Connect Virtual to establish connectivity to the Swift network, could look like:
Alliance Access itself does not require an independent Oracle database instance for its core functionality, as it comes with its own embedded Oracle Database Standard Edition instance. The reference architecture above uses this embedded Oracle database, which is the supported deployment method for Alliance Access on Google Cloud.
Alliance Gateway and Alliance Web Platform come with an embedded Oracle database Standard Edition. These products mainly use it for storing configuration and logs, and do not store business data.
3. Alliance Messaging Hub on Google Cloud:
Alliance Messaging Hub (AMH) is a modular financial messaging solution offered by Swift. AMH provides high throughput and sophisticated data management, delivering routing between different messaging services. Find more information on the Swift website. The following components make up the Alliance Messaging Hub (AMH) solution:
AMH Physical Nodes (servers): This is the core of the solution. An AMH Physical Node is a software application that acts as the interface between the institution’s internal systems and the Swift network. One or more such servers can be deployed.
Alliance Gateway: An optional component that provides additional security and routing capabilities, by concentrating your flows from different interfaces to Swift.
SNL: Enables Alliance Gateway to perform application-to-application communication over SwiftNet services. Connectivity can be established via the different connectivity packs of Alliance Connect Virtual on Google Cloud.
An Oracle database shared by AMH Physical Nodes: Unlike Alliance Access, AMH does not come with the option of an embedded Oracle database; AMH customers need to provide the database. To host their Oracle database on Google Cloud, customers can use Bare Metal Solution, which provides a secure environment in which they can run specialized workloads, such as Oracle databases, on high-performance, bare-metal servers. The Google Cloud and Oracle partnership also opens up many possibilities for customers to host their Oracle database on the cloud, such as using Oracle Database@Google Cloud or hosting Oracle on Compute Engine. Oracle Database@Google Cloud allows customers to host database services in a Google Cloud datacenter running on Oracle Cloud Infrastructure (OCI) hardware.
Oracle Database@Google Cloud
Oracle Database on Google Compute Engine
Bare Metal Solution
OCI and Google Cross-Cloud Interconnect
Why deploy Swift connectivity on Google Cloud
Deploying the Swift connectivity stack on Google Cloud offers a compelling solution for financial institutions due to the platform’s inherent advantages:
Google Cloud’s robust infrastructure, designed to meet specific workload and industry needs, ensures high availability and reliability for mission-critical financial operations.
This infrastructure is optimized for AI, allowing institutions to leverage advanced analytics and automation for enhanced efficiency and security.
Additionally, Google Cloud’s commitment to sustainability aligns with the growing emphasis on responsible business practices, helping organizations minimize their environmental footprint while benefiting from advanced technology.
Furthermore, Google Cloud’s collaborative tools, powered by AI, streamline communication and workflow processes, empowering teams to work more efficiently and effectively.
The reference architectures above enable a secure and reliable connection to Swift by leveraging Google Cloud Infrastructure and network components. The following Google Cloud components play a crucial role in establishing a secure connection to Swift:
Partner Interconnect: Google Cloud Partner Interconnect offers a way to connect Swift’s on-premises network and Alliance Connect Virtual VPC network through a supported service provider. This type of connection provides secure and reliable data transfer, bypassing the public internet. This solution is also scalable, allowing you to increase capacity as your needs change.
Bare Metal Rack HSM: A key component of the Swift architecture is the Swift HSM, a dedicated hardware device that safeguards Swift’s Public Key Infrastructure (PKI) credentials, ensuring secure signing of live traffic and authentication of production services. To bring the benefits of the cloud to Swift HSM hosting, customers can use Bare Metal Rack HSM, which provides dedicated racks and switches for hosting HSMs, ensuring isolation and a high degree of control over the environment. This aligns well with the security requirements of Swift HSM, which demands robust protection of sensitive key material. The Bare Metal Rack HSM solution is hosted in colocation facilities with active peering fabrics, ensuring low-latency connections to Google Cloud workloads. Google’s standards for these facilities and redundant infrastructure contribute to a highly available service. These facilities are also compliant with PCI-DSS, PCI-3DS, and SOC 1, 2, and 3 standards.
Oracle Database: The deployment of Alliance Messaging Hub requires Swift customers to deploy an Oracle database. Google provides customers with several options to deploy Oracle databases through the Google and Oracle partnership, which makes it easy for customers to migrate, modernize, and manage their Oracle-based applications in the cloud. You can find the different ways to deploy Oracle on Google Cloud here, offering flexibility for your deployments.
To learn more about the exciting collaboration between Google Cloud and Swift, contact your Google Cloud sales representative, partner manager, or your Swift account manager.
Written By: Jacob Paullus, Daniel McNamara, Jake Rawlins, Steven Karschnia
Executive Summary
Mandiant exploited flaws in the Microsoft Software Installer (MSI) repair action of Lakeside Software’s SysTrack installer to obtain arbitrary code execution.
An attacker with low-privilege access to a system running the vulnerable version of SysTrack could escalate privileges locally.
Mandiant responsibly disclosed this vulnerability to Lakeside Software, and the issue has been addressed in version 11.0.
Introduction
Building upon the insights shared in a previous Mandiant blog post, Escalating Privileges via Third-Party Windows Installers, this case study explores the ongoing challenge of securing third-party Windows installers. These vulnerabilities are rooted in insecure coding practices when creating Microsoft Software Installer (MSI) Custom Actions and can be caused by references to missing files, broken shortcuts, or insecure folder permissions. These oversights create gaps that inadvertently allow attackers the ability to escalate privileges.
As covered in our previous blog post, after software is installed with an MSI file, Windows caches the MSI file in the C:\Windows\Installer folder for later use. This allows users on the system to access and use the “repair” feature, which is intended to address various issues that may be impacting the installed software. During execution of an MSI repair, several operations (such as file creation or execution) may be triggered from an NT AUTHORITY\SYSTEM context, even if initiated by a low-privilege user, thereby creating privilege escalation opportunities.
This blog post specifically focuses on the discovery and exploitation of CVE-2023-6080, a local privilege escalation vulnerability that Mandiant identified in Lakeside Software’s SysTrack Agent version 10.7.8.
Exploiting the SysTrack Installer
Mandiant began by using Microsoft’s Process Monitor (ProcMon) to analyze and review file operations executed during the repair process of SysTrack’s MSI. While running the repair process as a low-privileged user, Mandiant observed file creation and execution within the user’s %TEMP% folder from MSIExec.exe.
Figure 1: MSIExec.exe copying and executing .tmp file in user’s %TEMP% folder
Each time Mandiant ran the repair functionality, MSIExec.exe wrote a new .tmp file to the %TEMP% folder using a formula-based name, and then executed it. Mandiant discovered, through dynamic analysis of the installer, that the name generated by the repair function would consist of the string “wac” followed by four randomly chosen hex characters (0-9, A-F). With this naming scheme, there were 65,536 possible filename options.
Due to the %TEMP% folder being writable by a low-privilege user, Mandiant tested the behavior of the repair tool when all possible filenames already existed within the %TEMP% folder. Mandiant created a PowerShell script to copy an arbitrary test executable to each possible file name in the range of wac0000.tmp to wacFFFF.tmp.
# Path to the permutations file
$csvFilePath = '.\permutations.csv'
# Path to the executable
$exePath = '.\test.exe'
# Target directory (using the system's temp directory)
$targetDirectory = [System.IO.Path]::GetTempPath()
# Read the csv file content
$csvContent = Get-Content -Path $csvFilePath
# Split the content into individual values
$values = $csvContent -split ","
# Loop through each value and copy the exe to the target directory with the new name
foreach ($value in $values) {
    $newFilePath = Join-Path -Path $targetDirectory -ChildPath ($value + ".tmp")
    Copy-Item -Path $exePath -Destination $newFilePath
}
Write-Output "Copy operation completed to $targetDirectory"
Figure 2: Creating all possible .tmp files in %TEMP%
Figure 3: Excerpt of .tmp files created in %TEMP%
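The script above reads its candidate names from permutations.csv. A minimal sketch for generating that file (a hypothetical helper of our own, not part of Mandiant's published tooling):
# Generate all 65,536 candidate names (wac0000 through wacFFFF) as a comma-separated list
$names = 0..0xFFFF | ForEach-Object { 'wac{0:X4}' -f $_ }
$names -join ',' | Set-Content -Path '.\permutations.csv'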
After filling the previously identified namespace, Mandiant reran the MSI repair function to observe its subsequent behavior. Upon review of the ProcMon output, Mandiant observed that when the namespace was filled, the application would failover to an incrementing filename pattern. The pattern began with wac1.tmp and incremented the number each time in a predictable pattern, if the previous file existed. To prove this theory, Mandiant manually created wac1.tmp and wac2.tmp, then observed the MSI repair action in ProcMon. When running the MSI repair function, the resulting filename was wac3.tmp.
Figure 4: MSIExec.exe writing and executing a predicted .tmp file
Additionally, Mandiant observed that there was a small delay between the file write action and the file execution action, which could potentially result in a race condition vulnerability. Since Mandiant could now force the program to use a predetermined filename, Mandiant wrote another PowerShell script designed to attempt to win the race condition by copying a file (test.exe) to the %TEMP% folder, using the predicted filename, between the file write and execution in order to overwrite the file created by MSIExec.exe. In this test, test.exe was a simple proof-of-concept executable that would start notepad.exe.
while ($true) {
    if (Test-Path -Path "C:\Users\USER\AppData\Local\Temp\wac3.tmp") {
        Copy-Item -Path "C:\Users\USER\Desktop\test.exe" -Destination "C:\Users\USER\AppData\Local\Temp\wac3.tmp" -Force
    }
}
Figure 5: PowerShell race condition script to copy arbitrary file into %TEMP%
With wac1.tmp and wac2.tmp staged in the %TEMP% folder, Mandiant ran both the PowerShell script and the MSI repair action targeting wac3.tmp. With the race condition script running, execution of the repair action resulted in test.exe overwriting the intended binary and subsequently being executed by MSIExec.exe, opening cmd.exe as NT AUTHORITY\SYSTEM.
Figure 6: Obtaining an NT AUTHORITY\SYSTEM command prompt
Defensive Considerations
As discussed in Mandiant’s previous blog post, misconfigured Custom Actions can be trivial to find and exploit, making them a significant security risk for organizations. It is essential for software developers to follow secure coding practices and review their implemented Custom Actions to prevent attackers from hijacking high-privilege operations triggered by the MSI repair functionality. Refer to the original blog post for general best practices when configuring Custom Actions. In discovering CVE-2023-6080, Mandiant identified several misconfigurations and oversights that allowed for privilege escalation to NT AUTHORITY\SYSTEM.
The SysTrack MSI performed file operations, including creation and execution, in the user’s %TEMP% folder, which provides a low-privilege user the opportunity to alter files being actively used in a high-privilege context. Software developers should keep folder permissions in mind and ensure all privileged file operations are performed from folders that are appropriately secured. This can include altering the read/write permissions for the folder, or using built-in folders such as C:\Program Files or C:\Program Files (x86), which are inherently protected from low-privilege users.
Additionally, the software’s filename generation schema included a failover mechanism that allowed an attacker to force the application into using a predetermined filename. When using randomized filenames, developers should make the random portion long enough that an attacker cannot exhaust all possible filenames and force the application into unexpected behavior. In this case, knowing the target filename before execution made it significantly easier to win the race condition, as opposed to dynamically identifying and replacing the target file between the time of its creation by MSIExec.exe and the time of its execution.
Something security professionals must also consider is the safety of the programs running on corporate machines. Many approved applications may inadvertently contain security vulnerabilities that increase the risk in our environments. Mandiant recommends that companies consider auditing the security of their individual endpoints to ensure that defense in depth is maintained at an organizational level. Furthermore, where possible, companies should monitor the spawning of administrative shells such as cmd.exe and powershell.exe in an elevated context to alert on possible privilege escalation attempts.
A Final Word
Domain privilege escalation is often the focus of security vendors and penetration tests, but it is not the only avenue for privilege escalation or compromise of data integrity in a corporate environment. Compromise of integrity on a single system can allow an attacker to mount further attacks throughout the network; for example, the Network Access Account used by SCCM can be compromised through a single workstation and when misconfigured can be used to escalate privileges within the domain and pivot to additional systems within the network.
Mandiant offers dedicated endpoint security assessments, during which customer endpoints are tested from multiple contexts, including the perspective of an adversary with low-privilege access attempting to escalate privileges. For more information about Mandiant’s technical consulting services, including comprehensive endpoint security assessments, visit our website.
We would like to extend a special thanks to Andrew Oliveau, who was a member of the testing team that discovered this vulnerability during his time at Mandiant.
CVE-2023-6080 Disclosure Timeline
June 13, 2024 – Vulnerability reported to Lakeside Software
July 1, 2024 – Lakeside Software confirmed the vulnerability
August 7, 2024 – Confirmed vulnerability fixed in version 11.0
For developers who want to use the PyTorch deep learning framework with Cloud TPUs, the PyTorch/XLA Python package is key, offering developers a way to run their PyTorch models on Cloud TPUs with only a few minor code changes. It does so by leveraging OpenXLA, developed by Google, which gives developers the ability to define their model once and run it on many different types of machine learning accelerators (e.g., GPUs, TPUs).
The latest release of PyTorch/XLA comes with several improvements for developers:
A new experimental scan operator to speed up compilation for repetitive blocks of code (e.g., for loops)
Host offloading to move TPU tensors to the host CPU’s memory to fit larger models on fewer TPUs
Improved goodput for tracing-bound models through a new base Docker image compiled with the C++ 2011 Standard application binary interface (C++11 ABI) flags
In addition to these improvements, we’ve also reorganized the documentation to make it easier to find what you’re looking for!
Let’s take a look at each of these features in greater depth.
Experimental scan operator
Have you ever experienced long compilation times, for example when working with large language models and PyTorch/XLA — especially when dealing with models with numerous decoder layers? During graph tracing, where we traverse the graph of all the operations being performed by the model, these iterative loops are completely “unrolled” — i.e., each loop iteration is copied and pasted for every cycle — resulting in large computation graphs. These larger graphs lead directly to longer compilation times. But now there’s a solution: the new experimental scan function, inspired by jax.lax.scan.
The scan operator works by changing how loops are handled during compilation. Instead of compiling each iteration of the loop independently, which creates redundant blocks, scan compiles only the first iteration. The resulting compiled high-level operation (HLO) is then reused for all subsequent iterations. This means that there is less HLO or intermediate code that is being generated for each subsequent loop. Compared to a for loop, scan compiles in a fraction of the time since it only compiles the first loop iteration. This improves the developer iteration time when working on models with many homogeneous layers, such as LLMs.
Building on top of torch_xla.experimental.scan, the torch_xla.experimental.scan_layers function offers a simplified interface for looping over sequences of nn.Modules. Think of it as a way to tell PyTorch/XLA “These modules are all the same, just compile them once and reuse them!” For example:
import torch
import torch.nn as nn
import torch_xla
from torch_xla.experimental.scan_layers import scan_layers

class DecoderLayer(nn.Module):
    def __init__(self, size):
        super().__init__()
        self.linear = nn.Linear(size, size)

    def forward(self, x):
        return self.linear(x)

with torch_xla.device():
    layers = [DecoderLayer(1024) for _ in range(64)]
    x = torch.randn(1, 1024)

# Instead of a for loop, we can scan_layers once:
# for layer in layers:
#     x = layer(x)
x = scan_layers(layers, x)
One thing to note is that custom Pallas kernels do not yet support scan. Here is a complete example of using scan_layers in an LLM for reference.
Host offloading
Another powerful tool for memory optimization in PyTorch/XLA is host offloading. This technique allows you to temporarily move tensors from the TPU to the host CPU’s memory, freeing up valuable device memory during training. This is especially helpful for large models where memory pressure is a concern. You can use torch_xla.experimental.stablehlo_custom_call.place_to_host to offload a tensor and torch_xla.experimental.stablehlo_custom_call.place_to_device to retrieve it later. A typical use case involves offloading intermediate activations during the forward pass and then bringing them back during the backward pass. Here’s an example of host offloading for reference.
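A minimal sketch of that pattern, assuming the two functions take and return a tensor as described above (the tensor shapes and the surrounding computation here are hypothetical):
import torch
import torch_xla
from torch_xla.experimental.stablehlo_custom_call import (
    place_to_host,
    place_to_device,
)

device = torch_xla.device()

# Forward pass: compute an intermediate activation, then park it in host memory
x = torch.randn(8192, 8192, device=device)
activation = torch.relu(x)
activation_on_host = place_to_host(activation)  # frees TPU HBM for later layers

# ... the rest of the forward pass runs with more HBM headroom ...

# Backward pass: bring the activation back to the TPU when it is needed again
activation_on_tpu = place_to_device(activation_on_host)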
Strategic use of host offloading, such as when you’re working with limited memory and are unable to use the accelerator continuously, may significantly improve your ability to train large and complex models within the memory constraints of your hardware.
Alternative base Docker image
Have you ever encountered a situation where your TPUs are sitting idle while your host CPU is heavily loaded tracing your model execution graph for just-in-time compilation? This suggests your model is “tracing bound,” meaning performance is limited by the speed of tracing operations.
The C++11 ABI image offers a solution. Starting with this release, PyTorch/XLA offers a choice of C++ ABI flavors for both Python wheels and Docker images. This gives you a choice for which version of C++ you’d like to use with PyTorch/XLA. You’ll now find builds with both the pre-C++11 ABI, which remains the default to match PyTorch upstream, and the more modern C++11 ABI.
Switching to the C++11 ABI wheels or Docker images can lead to noticeable improvements in the above-mentioned scenarios. For example, we observed a 20% relative improvement in goodput with the Mixtral 8x7B model on v5p-256 Cloud TPU (with a global batch size of 1024) when we switched from the pre-C++11 ABI to the C++11 ABI! ML Goodput gives us an understanding of how efficiently a given model utilizes the hardware. So if we have a higher goodput measurement for the same model on the same hardware, that indicates better performance of the model.
An example of using a C++11 ABI Docker image in your Dockerfile might look something like:
# Use the C++11 ABI PyTorch/XLA image as the base
FROM us-central1-docker.pkg.dev/tpu-pytorch-releases/docker/xla:r2.6.0_3.10_tpuvm_cxx11

# Install any additional dependencies here
# RUN pip install my-other-package

# Copy your code into the container
COPY . /app
WORKDIR /app

# Run your training script
CMD ["python", "train.py"]
Alternatively, if you are not using Docker images, because you’re testing locally for instance, you can install the C++11 ABI wheels for version 2.6 using the following command (Python 3.10 example):
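A sketch of what that command looks like (the wheel URL below is our best-effort reconstruction for the 2.6 release and Python 3.10; treat it as an assumption and check the PyTorch/XLA release notes for the authoritative command):
pip install torch==2.6.0+cpu.cxx11.abi \
  https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.6.0%2Bcxx11-cp310-cp310-manylinux_2_28_x86_64.whl \
  'torch_xla[tpu]' \
  -f https://storage.googleapis.com/libtpu-releases/index.html \
  -f https://download.pytorch.org/whl/torch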
The above command works for Python 3.10. We have instructions for other versions within our documentation.
The flexibility to choose between C++ ABIs lets you choose the optimal build for your specific workload and hardware, ultimately leading to better performance and efficiency in your PyTorch/XLA projects!
So what are you waiting for? Go try out the latest version of PyTorch/XLA! For additional information, check out the latest release notes.
A note on GPU support
We aren’t offering a PyTorch/XLA:GPU wheel in the PyTorch/XLA 2.6 release. We understand this is important and plan to reinstate GPU support by the 2.7 release. PyTorch/XLA remains an open-source project and we welcome contributions from the community to help maintain and improve the project. To contribute, please start with the contributors guide.
The latest stable version where a PyTorch/XLA:GPU wheel is available is torch_xla 2.5.
Modern AI workloads require powerful accelerators and high-speed interconnects to run sophisticated model architectures on an ever-growing diverse range of model sizes and modalities. In addition to large-scale training, these complex models need the latest high-performance computing solutions for fine-tuning and inference.
Today, we’re excited to bring the highly anticipated NVIDIA Blackwell GPUs to Google Cloud with the preview of A4 VMs, powered by NVIDIA HGX B200. The A4 VM features eight Blackwell GPUs interconnected by fifth-generation NVIDIA NVLink, and offers a significant performance boost over the previous-generation A3 High VM. Each GPU delivers 2.25 times the peak compute and 2.25 times the HBM capacity, making A4 VMs a versatile option for training and fine-tuning a wide range of model architectures, while the increased compute and HBM capacity make them well-suited for low-latency serving.
The A4 VM integrates Google’s infrastructure innovations with Blackwell GPUs to bring the best cloud experience for Google Cloud customers, from scale and performance, to ease-of-use and cost optimization. Some of these innovations include:
Enhanced networking: A4 VMs are built on servers with our Titanium ML network adapter, optimized to deliver a secure, high-performance cloud experience for AI workloads, building on NVIDIA ConnectX-7 network interface cards (NICs). Combined with our datacenter-wide 4-way rail-aligned network, A4 VMs deliver non-blocking 3.2 Tbps of GPU-to-GPU traffic with RDMA over Converged Ethernet (RoCE). Customers can scale to tens of thousands of GPUs with our Jupiter network fabric with 13 Petabits/sec of bi-sectional bandwidth.
Google Kubernetes Engine: With support for up to 65,000 nodes per cluster, GKE is the most scalable and fully automated Kubernetes service for customers to implement a robust, production-ready AI platform. Out of the box, A4 VMs are natively integrated with GKE. Integrating with other Google Cloud services, GKE facilitates a robust environment for the data processing and distributed computing that underpin AI workloads.
Vertex AI: A4 VMs will be accessible through Vertex AI, our fully managed, unified AI development platform for building and using generative AI, powered under the hood by the AI Hypercomputer architecture.
Open software: In addition to PyTorch and CUDA, we work closely with NVIDIA to optimize JAX and XLA, enabling the overlap of collective communication and computation on GPUs. Additionally, we added optimized model configurations and example scripts for GPUs with XLA flags enabled.
Hypercompute Cluster: Our new highly scalable clustering system streamlines infrastructure and workload provisioning, and ongoing operations of AI supercomputers with tight GKE and Slurm integration.
Multiple consumption models: In addition to the On-demand, Committed use discount, and Spot consumption models, we reimagined cloud consumption for the unique needs of AI workloads with Dynamic Workload Scheduler, which offers two modes for different workloads: Flex Start mode for enhanced obtainability and better economics, and Calendar mode for predictable job start times and durations.
Hudson River Trading, a multi-asset-class quantitative trading firm, will leverage A4 VMs to train its next generation of capital market model research. The A4 VM, with its enhanced inter-GPU connectivity and high-bandwidth memory, is ideal for the demands of larger datasets and sophisticated algorithms, accelerating Hudson River Trading’s ability to react to the market.
“We’re excited to leverage A4, powered by NVIDIA’s Blackwell B200 GPUs. Running our workload on cutting edge AI Infrastructure is essential for enabling low-latency trading decisions and enhancing our models across markets. We’re looking forward to leveraging the innovations in Hypercompute Cluster to accelerate deployment of training our latest models that deliver quant-based algorithmic trading.” – Iain Dunning, Head of AI Lab, Hudson River Trading
“NVIDIA and Google Cloud have a long-standing partnership to bring our most advanced GPU-accelerated AI infrastructure to customers. The Blackwell architecture represents a giant step forward for the AI industry, so we’re excited that the B200 GPU is now available with the new A4 VM. We look forward to seeing how customers build on the new Google Cloud offering to accelerate their AI mission.” – Ian Buck, Vice-President and General Manager of Hyperscale and HPC, NVIDIA
Better together: A4 VMs and Hypercompute Cluster
Effectively scaling AI model training requires precise and scalable orchestration of infrastructure resources. These workloads often stretch across thousands of VMs, pushing the limits of compute, storage, and networking.
Hypercompute Cluster enables you to deploy and manage these large clusters of A4 VMs with compute, storage and networking as a single unit. This makes it easy to manage complexity while delivering exceptionally high performance and resilience for large distributed workloads. Hypercompute Cluster is engineered to:
Deliver high performance through co-location of A4 VMs densely packed to enable optimal workload placement
Optimize resource scheduling and workload performance with GKE and Slurm, packed with intelligent features like topology-aware scheduling
Increase reliability with built-in self-healing capabilities, proactive health checks, and automated recovery from failures
Enhance observability and monitoring for timely and customized insights
Automate provisioning, configuration, and scaling, integrated with GKE and Slurm
We’re excited to be the first hyperscaler to announce preview availability of an NVIDIA Blackwell B200-based offering. Together, A4 VMs and Hypercompute Cluster make it easier for organizations to create and deliver AI solutions across all industries. If you’re interested in learning more, please reach out to your Google Cloud representative.
We are thrilled to announce the collaboration between Google Cloud, AWS, and Azure on Kube Resource Orchestrator, or kro (pronounced “crow”). kro introduces a Kubernetes-native, cloud-agnostic way to define groupings of Kubernetes resources. With kro, you can group your applications and their dependencies as a single resource that can be easily consumed by end users.
Challenges of Kubernetes resource orchestration
Platform and DevOps teams want to define standards for how application teams deploy their workloads, and they want to use Kubernetes as the platform for creating and enforcing these standards. Each service needs to handle everything from resource creation to security configurations, monitoring setup, defining the end-user interface, and more. There are client-side templating tools that can help with this (e.g., Helm, Kustomize), but Kubernetes lacks a native way for platform teams to create custom groupings of resources for consumption by end users.
Before kro, platform teams needed to invest in custom solutions such as building custom Kubernetes controllers, or using packaging tools like Helm, which can’t leverage the benefits of Kubernetes CRDs. These approaches are costly to build, maintain, and troubleshoot, and complex for non-Kubernetes experts to consume. This is a problem many Kubernetes users face. Rather than developing vendor-specific solutions, we’ve partnered with Amazon and Microsoft on making K8s APIs simpler for all Kubernetes users.
aside_block
<ListValue: [StructValue([(‘title’, ‘$300 in free credit to try Google Cloud containers and Kubernetes’), (‘body’, <wagtail.rich_text.RichText object at 0x3e2ac7fe4d00>), (‘btn_text’, ‘Start building for free’), (‘href’, ‘http://console.cloud.google.com/freetrial?redirectpath=/marketplace/product/google/container.googleapis.com’), (‘image’, None)])]>
How kro simplifies the developer experience
kro is a Kubernetes-native framework that lets you create reusable APIs to deploy multiple resources as a single unit. You can use it to encapsulate a Kubernetes deployment and its dependencies into a single API that your application teams can use, even if they aren’t familiar with Kubernetes. You can use kro to create custom end-user interfaces that expose only the parameters an end user should see, hiding the complexity of Kubernetes and cloud-provider APIs.
kro does this by introducing the concept of a ResourceGraphDefinition, which specifies how a standard Kubernetes Custom Resource Definition (CRD) should be expanded into a set of Kubernetes resources. End users define a single resource, which kro then expands into the custom resources defined in the CRD.
kro can be used to group and manage any Kubernetes resources. Tools like ACK, KCC, or ASO define CRDs to manage cloud provider resources from Kubernetes (these tools enable cloud provider resources, like storage buckets, to be created and managed as Kubernetes resources). kro can also be used to group resources from these tools, along with any other Kubernetes resources, to define an entire application deployment and the cloud provider resources it depends on.
Example use cases
Below, you’ll find some examples of kro being used with Google Cloud. You can find additional examples on the kro website.
Example 1: GKE cluster definition
Imagine that a platform administrator wants to give end users in their organization self-service access to create GKE clusters. The platform administrator creates a kro ResourceGraphDefinition called GKEclusterRGD that defines the required Kubernetes resources and a CRD called GKEcluster that exposes only the options they want to be configurable by end users. In addition to creating a cluster, the platform team also wants clusters to deploy administrative workloads such as policies, agents, etc. The ResourceGraphDefinition defines the following resources, using KCC to provide the mappings from K8s CRDs to Google Cloud APIs:
GKE cluster, Container Node Pools, IAM ServiceAccount, IAM PolicyMember, Services, Policies
The platform administrator would then define the end-user interface so that end users can create a new cluster by creating an instance of the CRD, supplying only the fields the platform team has chosen to expose.
Everything related to policy, service accounts, and service activation (and how these resources relate to each other) is hidden from the end user, simplifying their experience.
Example 2: Web application definition
In this example, a DevOpsEngineer wants to create a reusable definition of a web application and its dependencies. They create a ResourceGraphDefinition called WebAppRGD, which defines a new Kubernetes CRD called WebApp. This new resource encapsulates all the necessary resources for a web application environment, including:
Deployments, Services, service accounts, monitoring agents, and cloud resources like object storage buckets.
The WebAppRGD ResourceGraphDefinition can set a default configuration, and also define which parameters can be set by the end user at deployment time (kro gives you the flexibility to decide what is immutable, and what an end user is able to configure). A developer then creates an instance of the WebApp CRD, inputting any user-facing parameters. kro then deploys the desired Kubernetes resource.
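To make this concrete, here is a minimal sketch of what such a pair could look like. The field names follow the general shape documented for kro's ResourceGraphDefinition, but this exact definition is illustrative rather than copied from the kro docs, and the WebApp fields shown are hypothetical:
apiVersion: kro.run/v1alpha1
kind: ResourceGraphDefinition
metadata:
  name: webapp-rgd
spec:
  schema:
    apiVersion: v1alpha1
    kind: WebApp
    spec:
      # Only these fields are exposed to end users
      name: string
      image: string | default="nginx:latest"
      replicas: integer | default=2
  resources:
    - id: deployment
      template:
        apiVersion: apps/v1
        kind: Deployment
        metadata:
          name: ${schema.spec.name}
        spec:
          replicas: ${schema.spec.replicas}
          selector:
            matchLabels:
              app: ${schema.spec.name}
          template:
            metadata:
              labels:
                app: ${schema.spec.name}
            spec:
              containers:
                - name: web
                  image: ${schema.spec.image}
An end user would then deploy an instance of the generated WebApp resource, touching none of the underlying Deployment details:
apiVersion: kro.run/v1alpha1
kind: WebApp
metadata:
  name: storefront
spec:
  name: storefront
  image: us-docker.pkg.dev/example-project/web/storefront:v1
  replicas: 3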
Key benefits of kro
We believe kro is a big step forward for platform engineering teams, delivering a number of advantages:
Kubernetes-native: kro leverages Kubernetes Custom Resource Definitions (CRDs) to extend Kubernetes, so it works with any Kubernetes resource and integrates with existing Kubernetes tools and workflows.
Lets you create a simplified end-user experience: kro makes it easy to define end-user interfaces for complex groups of Kubernetes resources, so that people who are not Kubernetes experts can consume services built on Kubernetes.
Enables standardized services for application teams: kro templates can be reused across different projects and environments, promoting consistency and reducing duplication of effort.
Get started with kro
kro is available as an open-source project on GitHub. The GitHub organization is currently jointly owned by teams from Google, AWS, and Microsoft, and we welcome contributions from the community. We also have a website with documentation on installing and using kro, including example use cases. As an early-stage project, kro is not yet ready for production use, but we still encourage you to test it out in your own Kubernetes development environments!
Welcome to the second Cloud CISO Perspectives for January 2025. Iain Mulholland, senior director, Security Engineering, shares insights on the state of ransomware in the cloud from our new Threat Horizons Report. The research and intelligence in the report should prove helpful to all cloud providers and security professionals. Similarly, the recommended risk mitigations will work well with Google Cloud, but are generally applicable to all clouds.
As with all Cloud CISO Perspectives, the contents of this newsletter are posted to the Google Cloud blog. If you’re reading this on the website and you’d like to receive the email version, you can subscribe here.
–Phil Venables, VP, TI Security & CISO, Google Cloud
How cloud security can adapt to ransomware threats in 2025
By Iain Mulholland, senior director, Security Engineering, Google Cloud
How should cloud providers and cloud customers respond to the threat of ransomware? As we explain in our new Threat Horizons Report, cloud security strategies in 2025 should prioritize protecting against data exfiltration and identity abuse.
The research and intelligence in the report show that threat actors have made stealing data and exploiting weaknesses in identity security their top targets. We've seen some threat actor groups recently adapt, adopting new ransomware families to achieve their goals. We've also observed them rapidly change their tactics to evade detection and attribution, which makes it harder to accurately identify the source of attacks and increases the likelihood that victims will pay ransom demands.
As part of our shared fate approach, we act as active partners with our customers, helping them secure their cloud use by sharing our expertise, best practices, and detailed guidance. In that spirit, this edition of Threat Horizons gives all cloud security professionals a deeper understanding of the threats they face, coupled with actionable risk mitigations from Google's security and threat intelligence experts.
These mitigations will work well with Google Cloud, but are generally applicable to other clouds, too.
Evolving ransomware and data-theft risks in the cloud
Ransomware and data threats in the cloud are not new, and investigations and analysis of the threats and risks they pose have been a key part of previous Threat Horizons Reports. Notably, Google Cloud security and intelligence experts exposed ransomware-related trends in the Threat Horizons Report published in February 2024, including threat actors prioritizing data exfiltration over encryption and exploiting server-side vulnerabilities.
In the second half of 2024, we observed a concerning shift: threat actors became more adept at obscuring their identities. This latest evolution in their tactics, techniques, and procedures makes it harder for defenders to counter their attacks and increases the likelihood of ransom payments, which totaled $1.1 billion in 2023. We also saw threat actors adapt by relying more on ransomware-as-a-service (RaaS) to target cloud services, which we detail in the full report.
We recommend that organizations incorporate automation and awareness strategies such as strong password policies, mandatory multi-factor authentication (MFA), regular reviews of user access and cloud storage bucket security, leaked credential monitoring on the dark web, and account lockout mechanisms. Importantly, educate employees about security best practices to help prevent credential compromise.
Government insights can help here, too. The Cybersecurity and Infrastructure Security Agency's Ransomware Vulnerability Warning Pilot can proactively identify and warn about vulnerabilities that ransomware actors could exploit.
I've summarized risk mitigations below to enhance your Google Cloud security posture and better protect against threats such as account takeover, which can lead to ransomware and data extortion operations.
To help prevent cloud account takeover, your organization can:
Enroll in MFA: Google Cloud’s phased approach to mandatory MFA can make it harder for attackers to compromise accounts even if they have stolen credentials and authentication cookies.
Implement robust Identity and Access Management (IAM) policies: Use IAM policies to grant users only the necessary permissions for their jobs. Google Cloud offers a range of tools to help implement and manage IAM policies, including Policy Analyzer.
To help mitigate ransomware and extortion risks, your organization can:
Establish a cloud-specific backup strategy: Disaster recovery testing should include configurations, templates, and full infrastructure redeployment, and backups should be immutable for maximum protection.
Enable proactive virtual machine scanning: Part of Security Command Center (SCC), Virtual Machine Threat Detection (VMTD) scans virtual machines for malicious applications to detect threats, including ransomware.
Monitor and control unexpected costs: With Google Cloud, you can identify and manage unusual spending patterns across all projects linked to a billing account, which could indicate unauthorized activity.
Organizations can use multiple Google Cloud products to enhance protection against ransomware and data theft extortion. Security Command Center can help establish a multicloud security foundation for your organization that can help detect data exfiltration and misconfigurations. Sensitive Data Protection can help protect against potential data exfiltration by identifying and classifying sensitive data in your Google Cloud environment, and also help you monitor for unauthorized access and movement of data.
Threats beyond ransomware
There's much more to the cloud threat landscape than ransomware, and more that organizations can do to mitigate the risks they face. Below, I've summarized five more threat landscape trends that we identify in the report, along with suggested mitigations to improve your organization's security posture.
Service account risks, including over-privileged service accounts exploited with lateral movement tactics.
What you should do: Investigate and protect service accounts to help prevent exploitation of overprivileged accounts and reduce detection noise from false positives.
Identity exploitation, including compromised user identities in hybrid environments exploited with lateral movement between on-premises and cloud environments.
What you should do: Combine strong authentication with attribute-based validation, and modernize playbooks and processes for comprehensive identity incident response (including enforcing mandatory MFA).
Attacks against cloud databases, including active exploitation of vulnerabilities and of weak credentials guarding sensitive information.
Diversified attack methods, including privilege escalation that allows threat actors to charge against victim billing accounts in an effort to maximize profits from compromised accounts.
What you should do: As discussed above, enroll in MFA, use automated sensitive-data monitoring and alerting, and implement robust IAM policies.
Data theft and extortion attacks, including MFA bypass techniques and aggressive communication strategies with victims, use increasingly sophisticated tactics against cloud-based services to compromise accounts and maximize profits.
What you should do: Use a defense-in-depth strategy that includes strong password policies, MFA, regular reviews of user access, leaked credential monitoring, account lockout mechanisms, and employee education. Robust tools such as SCC can help monitor for data exfiltration and unauthorized access to data.
We provide more detail on each of these in the full report.
How Threat Horizons Reports can help
The Threat Horizons report series is intended to present a snapshot of the current state of threats to cloud environments, and how we can work together to mitigate those risks and improve cloud security for all. The reports provide decision-makers with strategic threat intelligence that cloud providers, customers, cloud security leaders, and practitioners can use today.
Threat Horizons reports are informed by Google Threat Intelligence Group (GTIG), Mandiant, Google Cloud's Office of the CISO, Product Security Engineering, and Google Cloud intelligence, security, and product teams.
The Threat Horizons Report for the first half of 2025 can be read in full here. Previous Threat Horizons reports are available here.
In case you missed it
Here are the latest updates, products, services, and resources from our security teams so far this month:
Get ready for a unique, immersive security experience at Next ‘25: Here’s why Google Cloud Next is shaping up to be a must-attend event for security experts and the security-curious alike. Read more.
How Google secures its own cloud use: Take a peek under the hood at how we use and secure our own cloud environments, as part of our new “How Google Does It” series. Read more.
Privacy-preserving Confidential Computing now on even more machines and services: Confidential Computing is available on even more machine types than before. Here’s what’s new. Read more.
Use custom Org Policies to enforce CIS benchmarks for GKE: Many CIS recommendations for GKE can be enforced with custom Organization Policies. Here’s how. Read more.
Making GKE more secure with supply-chain attestation and SLSA: You can now verify the integrity of Google Kubernetes Engine components with SLSA, the Supply-chain Levels for Software Artifacts framework. Read more.
Office of the CISO 2024 year in review: Google Cloud’s Office of the CISO shared insights in 2024 on how to approach generative AI securely, featured industry experts on the Cloud Security Podcast, published research papers, and examined security lessons learned across many sectors. Read more.
Celebrating one year of AI bug bounties at Alphabet: What we learned in the first year of AI bug bounties, and how those lessons will inform our approach to vulnerability rewards going forward. Read more.
Please visit the Google Cloud blog for more security stories published this month.
Threat Intelligence news
How to stop cryptocurrency heists: Many factors are spurring a spike in cryptocurrency heists, including the lucrative nature of the rewards and the difficulty of attributing attacks to malicious actors. In our new Securing Cryptocurrency Organizations guide, we detail the defense measures organizations should take to stop cryptocurrency heists. Read more.
Please visit the Google Cloud blog for more threat intelligence stories published this month.
Now hear this: Google Cloud Security and Mandiant podcasts
How the modern CISO balances risk, innovation, business strategy, and cloud: John Rogers, CISO, MSCI, talks about the biggest cloud security challenges CISOs are facing today — and they’re evolving — with host Anton Chuvakin and guest co-host Marina Kaganovich from Google Cloud’s Office of the CISO. Listen here.
Slaying the ransomware dragon: Can startups succeed where others have failed, and once and for all end ransomware? Bob Blakley, co-founder and chief product officer of ransomware defense startup Mimic, tells hosts Anton Chuvakin and Tim Peacock his personal reasons for joining the fight against ransomware, and how his company can help. Listen here.
Behind the Binary: How a gamer became a renowned hacker: Stephen Eckels, from Google Mandiant’s FLARE team, discusses how video game modding helped kickstart his cybersecurity career. Listen here.
To have our Cloud CISO Perspectives post delivered twice a month to your inbox, sign up for our newsletter. We’ll be back in February with more security-related updates from Google Cloud.
In today’s complex digital world, building truly intelligent applications requires more than just raw data — you need to understand the intricate relationships within that data. Graph analysis helps reveal these hidden connections, and when combined with techniques like full-text search and vector search, enables you to deliver a new class of AI-enabled application experiences. However, traditional approaches based on niche tools result in data silos, operational overhead, and scalability challenges. That’s why we introduced Spanner Graph, and today we’re excited to announce that it’s generally available.
In a previous post, we described how Spanner Graph reimagines graph data management with a unified database that integrates graph, relational, search, and gen AI capabilities with virtually unlimited scalability. With Spanner Graph, you gain access to an intuitive ISO Standard Graph Query Language (GQL) interface that simplifies pattern matching and relationship traversal. You also benefit from full interoperability between GQL and SQL, for tight integration between graph and tabular data. Powerful vector and full-text search enable fast data retrieval using semantic meaning and keywords. And you can rely on Spanner’s scalability, availability, and consistency to provide a solid data foundation. Finally, integration with Vertex AI gives you access to powerful AI models directly within Spanner Graph.
What’s new in Spanner Graph
Since the preview, we have added exciting new capabilities and partner integrations to make it easier for you to build with Spanner Graph. Let’s take a closer look.
1) Spanner Graph Notebook: Graph visualization is key to developing with graphs. The new open-source Spanner Graph Notebook tool provides an efficient way to query Spanner Graph visually. This tool is natively installed in Google Colab, meaning you can use it directly within that environment. You can also leverage it in notebook environments like Jupyter Notebook. With this tool, you can use magic commands with GQL to visualize query results and graph schemas with multiple layout options, inspect node and edge properties, and analyze neighbor relationships.
Open-source Spanner Graph Notebook.
2) GraphRAG with LangChain integration: Spanner Graph’s integration with LangChain allows for quick prototyping of GraphRAG applications. Conventional RAG, while capable of grounding the LLM by providing relevant context from your data using vector search, cannot leverage the implicit relationships present in your data. GraphRAG overcomes this limitation by constructing a graph from your data that captures these complex relationships. At retrieval time, GraphRAG uses the combined power of graph queries with vector search to provide a richer context to the LLM, enabling it to generate more accurate and relevant answers.
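To make that retrieval flow concrete, here is a minimal Python sketch under stated assumptions: it uses the google-cloud-spanner client directly, and the Documents table, DocGraph graph, and all column names are hypothetical placeholders. In practice, the LangChain integration wraps this pattern for you.

# A sketch of the GraphRAG retrieval pattern: vector search finds seed
# documents, then a graph query expands them with related context.
# Schema names (Documents, DocGraph, embedding, text) are assumptions.
from google.cloud import spanner
from google.cloud.spanner_v1 import param_types

client = spanner.Client()
database = client.instance("my-instance").database("my-database")

question_embedding = [0.12, 0.56, 0.91]  # toy stand-in for a real embedding

with database.snapshot(multi_use=True) as snapshot:
    # Step 1: vector search finds documents semantically closest to the question.
    seed_rows = snapshot.execute_sql(
        "SELECT id FROM Documents "
        "ORDER BY COSINE_DISTANCE(embedding, @q) LIMIT 5",
        params={"q": question_embedding},
        param_types={"q": param_types.Array(param_types.FLOAT64)},
    )
    seed_ids = [row[0] for row in seed_rows]

    # Step 2: a graph query expands the seeds to related documents,
    # capturing relationships that vector search alone would miss.
    context_rows = snapshot.execute_sql(
        "GRAPH DocGraph "
        "MATCH (d:Document)-[:References]->{1,2}(related:Document) "
        "WHERE d.id IN UNNEST(@seeds) "
        "RETURN DISTINCT related.text AS text",
        params={"seeds": seed_ids},
        param_types={"seeds": param_types.Array(param_types.INT64)},
    )
    context = "\n".join(row[0] for row in context_rows)

# Step 3: pass `context` plus the user's question to the LLM for grounding.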
3) Graph schema in Spanner Studio: The Spanner Studio Explorer panel now displays a list of defined graphs, their nodes and edges, and the associated labels and properties. You can explore and understand the structure of your graph data at a glance, making it easier to design, debug, and optimize your applications.
4) Graph query improvements: Spanner Graph now supports the path data type and functions, allowing you to retrieve and analyze the specific sequence of nodes and relationships that connect two nodes in your graph. For example, you can create a path variable in a path pattern, use the IS_ACYCLIC function to check whether the path is free of repeating nodes, and return the entire path:
GRAPH FinGraph
MATCH p = (:Account)-[:Transfers]->{2,5}(:Account)
RETURN IS_ACYCLIC(p) AS is_acyclic_path, TO_JSON(p) AS full_path;
5) Graph visualization partner integrations: Spanner Graph is now integrated with leading graph visualization partners. For example, Spanner Graph customers can use GraphXR, Kineviz’s flagship product, which combines cutting-edge visualization technology with advanced analytics to help organizations make sense of complex, connected data.
“We are thrilled to partner with Google Cloud to bring graph analytics to big data. By integrating GraphXR with Spanner Graph, we’re empowering businesses to visualize and interact with their data in ways that were previously unimaginable.” – Weidong Yang, CEO, Kineviz
“Businesses can finally handle graph data with both speed and scale. By combining Graphistry’s GPU-accelerated graph visualization and AI with Spanner Graph’s global-scale querying, teams can now easily go all the way from raw data to graph-informed action. Whether detecting fraud, analyzing journeys, hunting hackers, or surfacing risks, this partnership is enabling teams to move with confidence.” – Leo Meyerovich, Founder and CEO, Graphistry
Visual analytics capabilities in Graphistry, including zooming, clustering, filtering, histograms, time-bar filtering, and node styling (colors), allow point-and-click analysis to quickly understand the data, identify clusters, and surface patterns, anomalies, and other insights.
Furthermore, you can use G.V(), a quick-to-install graph database client, with Spanner Graph to perform day-to-day graph visualization and data analytics tasks with ease. Data professionals benefit from high-performance graph visualization, no-code data exploration, and highly customizable data visualization options.
“Graphs thrive on connections, which is why I’m so excited about this new partnership between G.V() and Google Cloud Spanner Graph. Spanner Graph turns big data into graphs, and G.V() effortlessly turns graphs into interactive data visualizations. I’m keen to see what data professionals build combining both solutions.” – Arthur Bigeard, Founder, gdotv Ltd.
Visually querying and exploring Spanner Graph with G.V().
What customers are saying
Through our road to GA, we have also been working with multiple customers to help them innovate with Spanner Graph:
“The Commercial Financial Network manages commercial credit data for more than 30 million U.S. businesses. Managing the hierarchy of these businesses can be complex due to the volume of these hierarchies, as well as their dynamic nature driven by mergers and acquisitions. Equifax is committed to providing lenders with the accurate, reliable and timely information they need as they make financial decisions. Spanner Graph helps us manage these rapidly changing, dynamic business hierarchies easily at scale.” – Yuvaraj Sankaran, Chief Architect of Global Platforms, Equifax
“As we strive to enhance our fraud detection capabilities, having a robust, multi-model database like Google Spanner is crucial for our success. By integrating SQL for transactional data management with advanced graph data analysis, we can efficiently manage and analyze evaluated fraud data. Spanner’s new capabilities significantly improve our ability to maintain data integrity and uncover complex fraud patterns, ensuring our systems are secure and reliable.” – Hai Sadon, Data Platform Group Manager, Transmit Security
“Spanner Graph has provided a novel and performant way for us to query this data, allowing us to deliver features faster and with greater peace of mind. Its flexible data modeling and high-performance querying have made it far easier to leverage the vast amount of data we have in our online applications.” – Aaron Tang, Senior Principal Engineer, U-NEXT
Are you a cloud architect or IT admin tasked with ensuring deployments are following best practices and generating configuration validation reports? The struggle of adopting best practices is real. And not just the first time: ensuring that a config doesn’t drift from org-wide best practices over time is notoriously difficult.
Workload Manager provides a rule-based validation service for evaluating your workloads running on Google Cloud. Workload Manager scans your workloads, including SAP and Microsoft SQL Server, to detect deviations from standards, rules, and best practices to improve system quality, reliability, and performance.
Introducing custom rules in Workload Manager
Today, we're excited to extend Workload Manager with custom rules, now generally available: a detective-control service that doesn't block your deployments, but lets you easily detect compliance issues across different architectural intents. Now you can flexibly and consistently validate your Google Cloud deployments across projects, folders, and organizations against best practices and custom standards to help ensure they remain compliant.
Here’s how to get started with Workload Manager custom rules in a matter of minutes.
1) Codify best practices and validate resources
Identify best practices relevant to your deployments from the Google Cloud Architecture Framework, codify them in Rego, a declarative policy language that's used to define rules and express policies over complex data structures, and run or schedule evaluation scans across your deployments.
You can create new Rego rules based on your preferences, or reach out to your account team to get more help crafting new rules.
2) Export findings to a BigQuery dataset and visualize them using Looker
You can configure your own BigQuery dataset to export each validation scan and easily integrate it with your existing reporting systems, build a new Looker dashboard, or export results to Google Sheets to plan remediation steps.
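Once findings land in BigQuery, a report is a single query away. Here is a minimal sketch using the BigQuery Python client; the dataset, table, and column names (wlm_results.evaluation_findings, rule_id, and so on) are assumptions for illustration, so adjust them to the schema your export produces:

# Summarize the most frequently violated rules over the last 30 days.
# Table and column names below are hypothetical placeholders.
from google.cloud import bigquery

client = bigquery.Client()
query = """
    SELECT rule_id, resource, COUNT(*) AS violations
    FROM `my-project.wlm_results.evaluation_findings`
    WHERE scan_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
    GROUP BY rule_id, resource
    ORDER BY violations DESC
"""
for row in client.query(query).result():
    print(row.rule_id, row.resource, row.violations)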
Additionally, you can configure Pub/Sub-based notifications to send email, Google Chat messages, or integrate with your third-party systems based on different evaluation success criteria.
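For the Pub/Sub path, a subscriber can be just a few lines. A minimal sketch follows; the subscription name is an assumption, and the message payload layout depends on how you configure the evaluation:

# Listen briefly for evaluation notifications on an assumed subscription.
from concurrent.futures import TimeoutError
from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path("my-project", "wlm-findings-sub")

def callback(message: pubsub_v1.subscriber.message.Message) -> None:
    # Each message carries the evaluation result payload you configured.
    print("Received evaluation notification:", message.data.decode("utf-8"))
    message.ack()  # acknowledge so the message is not redelivered

streaming_pull_future = subscriber.subscribe(subscription_path, callback=callback)
with subscriber:
    try:
        streaming_pull_future.result(timeout=60)  # listen for one minute
    except TimeoutError:
        streaming_pull_future.cancel()
        streaming_pull_future.result()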
A flexible system to do more than typical config validation
With custom rules, you can build rules with complex logic and validation requirements across multiple domains. You can delegate rule creation and management to your subject matter experts, reducing development time and accelerating the release of new policies.
And with central BigQuery table export, you can combine violation findings from multiple evaluations and easily integrate with your reporting system to build a central compliance program.
Get started today with custom rules in Workload Manager by referring to the documentation and testing sample policies against your deployments.
Need more help? Engage with your account teams to get more help in crafting, curating and adopting best practices.
Rapid advancements in artificial intelligence (AI) are unlocking new possibilities for the way we work and accelerating innovation in science, technology, and beyond. In cybersecurity, AI is poised to transform digital defense, empowering defenders and enhancing our collective security. Large language models (LLMs) open new possibilities for defenders, from sifting through complex telemetry to secure coding, vulnerability discovery, and streamlining operations. However, some of these same AI capabilities are also available to attackers, leading to understandable anxieties about the potential for AI to be misused for malicious purposes.
Much of the current discourse around cyber threat actors’ misuse of AI is confined to theoretical research. While these studies demonstrate the potential for malicious exploitation of AI, they don’t necessarily reflect the reality of how AI is currently being used by threat actors in the wild. To bridge this gap, we are sharing a comprehensive analysis of how threat actors interacted with Google’s AI-powered assistant, Gemini. Our analysis was grounded by the expertise of Google’s Threat Intelligence Group (GTIG), which combines decades of experience tracking threat actors on the front lines and protecting Google, our users, and our customers from government-backed attackers, targeted 0-day exploits, coordinated information operations (IO), and serious cyber crime networks.
We believe the private sector, governments, educational institutions, and other stakeholders must work together to maximize AI’s benefits while also reducing the risks of abuse. At Google, we are committed to developing responsible AI guided by our principles, and we share resources and best practices to enable responsible AI development across the industry. We continuously improve our AI models to make them less susceptible to misuse, and we apply our intelligence to improve Google’s defenses and protect users from cyber threat activity. We also proactively disrupt malicious activity to protect our users and help make the internet safer. We share our findings with the security community to raise awareness and enable stronger protections for all.
Executive Summary
Google Threat Intelligence Group (GTIG) is committed to tracking and protecting against cyber threat activity. We relentlessly defend Google, our users, and our customers by building the most complete threat picture to disrupt adversaries. As part of that effort, we investigate activity associated with threat actors to protect against malicious activity, including the misuse of generative AI or LLMs.
This report shares our findings on government-backed threat actor use of the Gemini web application. The report encompasses new findings across advanced persistent threat (APT) and coordinated information operations (IO) actors tracked by GTIG. By using a mix of analyst review and LLM-assisted analysis, we investigated prompts by APT and IO threat actors who attempted to misuse Gemini.
Advanced Persistent Threat (APT) refers to government-backed hacking activity, including cyber espionage and destructive computer network attacks.
Information Operations (IO) attempt to influence online audiences in a deceptive, coordinated manner. Examples include sockpuppet accounts and comment brigading.
GTIG takes a holistic, intelligence-driven approach to detecting and disrupting threat activity, and our understanding of government-backed threat actors and their campaigns provides the needed context to identify threat enabling activity. We use a wide variety of technical signals to track government-backed threat actors and their infrastructure, and we are able to correlate those signals with activity on our platforms to protect Google and our users. By tracking this activity, we’re able to leverage our insights to counter threats across Google platforms, including disrupting the activity of threat actors who have misused Gemini. We also actively share our insights with the public to raise awareness and enable stronger protections across the wider ecosystem.
Our analysis of government-backed threat actor use of Gemini focused on understanding how threat actors are using AI in their operations and if any of this activity represents novel or unique AI-enabled attack or abuse techniques. Our findings, which are consistent with those of our industry peers, reveal that while AI can be a useful tool for threat actors, it is not yet the game-changer it is sometimes portrayed to be. While we do see threat actors using generative AI to perform common tasks like troubleshooting, research, and content generation, we do not see indications of them developing novel capabilities.
Our key findings include:
We did not observe any original or persistent attempts by threat actors to use prompt attacks or other machine learning (ML)-focused threats as outlined in the Secure AI Framework (SAIF) risk taxonomy. Rather than engineering tailored prompts, threat actors used more basic measures or publicly available jailbreak prompts in unsuccessful attempts to bypass Gemini’s safety controls.
Threat actors are experimenting with Gemini to enable their operations, finding productivity gains but not yet developing novel capabilities. At present, they primarily use AI for research, troubleshooting code, and creating and localizing content.
APT actors used Gemini to support several phases of the attack lifecycle, including researching potential infrastructure and free hosting providers, reconnaissance on target organizations, research into vulnerabilities, payload development, and assistance with malicious scripting and evasion techniques. Iranian APT actors were the heaviest users of Gemini, using it for a wide range of purposes. Of note, we observed limited use of Gemini by Russian APT actors during the period of analysis.
IO actors used Gemini for research; content generation including developing personas and messaging; translation and localization; and to find ways to increase their reach. Again, Iranian IO actors were the heaviest users of Gemini, accounting for three quarters of all use by IO actors. We also observed Chinese and Russian IO actors using Gemini primarily for general research and content creation.
Gemini’s safety and security measures restricted content that would enhance adversary capabilities as observed in this dataset. Gemini provided assistance with common tasks like creating content, summarizing, explaining complex concepts, and even simple coding tasks. Assisting with more elaborate or explicitly malicious tasks generated safety responses from Gemini.
Threat actors attempted unsuccessfully to use Gemini to enable abuse of Google products, including researching techniques for Gmail phishing, stealing data, coding a Chrome infostealer, and bypassing Google’s account verification methods.
Rather than enabling disruptive change, generative AI allows threat actors to move faster and at higher volume. For skilled actors, generative AI tools provide a helpful framework, similar to the use of Metasploit or Cobalt Strike in cyber threat activity. For less skilled actors, they also provide a learning and productivity tool, enabling them to more quickly develop tools and incorporate existing techniques. However, current LLMs on their own are unlikely to enable breakthrough capabilities for threat actors. We note that the AI landscape is in constant flux, with new AI models and agentic systems emerging daily. As this evolution unfolds, GTIG anticipates the threat landscape to evolve in stride as threat actors adopt new AI technologies in their operations.
AI-Focused Threats
Attackers can use LLMs in two ways. One way is attempting to leverage LLMs to accelerate their campaigns (e.g., by generating code for malware or content for phishing emails). The overwhelming majority of activity we observed falls into this category. The second way attackers can use LLMs is to instruct a model or AI agent to take a malicious action (e.g., finding sensitive user data and exfiltrating it). These risks are outlined in Google’s Secure AI Framework (SAIF) risk taxonomy.
We did not observe any original or persistent attempts by threat actors to use prompt attacks or other AI-specific threats. Rather than engineering tailored prompts, threat actors used more basic measures, such as rephrasing a prompt or sending the same prompt multiple times. These attempts were unsuccessful.
Jailbreak Attempts: Basic and Based on Publicly Available Prompts
We observed a handful of cases of low-effort experimentation using publicly available jailbreak prompts in unsuccessful attempts to bypass Gemini’s safety controls. Threat actors copied and pasted publicly available prompts and appended small variations in the final instruction (e.g., basic instructions to create ransomware or malware). Gemini responded with safety fallback responses and declined to follow the threat actor’s instructions.
In one example of a failed jailbreak attempt, an APT actor copied publicly available prompts into Gemini and appended basic instructions to perform coding tasks. These tasks included encoding text from a file and writing it to an executable and writing Python code for a distributed denial-of-service (DDoS) tool. In the former case, Gemini provided Python code to convert Base64 to hex, but provided a safety filtered response when the user entered a follow-up prompt that requested the same code as a VBScript.
The same group used a different publicly available jailbreak prompt to request Python code for DDoS. Gemini provided a safety filtered response stating that it could not assist, and the threat actor abandoned the session and did not attempt further interaction.
What is an AI jailbreak?
Jailbreaks are one type of Prompt Injection attack, causing an AI model to behave in ways that it has been trained to avoid (e.g., outputting unsafe content or leaking sensitive information). Prompt Injections generally cause the LLM to execute malicious “injected” instructions embedded in data that was not meant to be executed by the LLM.
Controls against prompt injection include input/output validation and sanitization as well as adversarial training and testing. Training, tuning, and evaluation processes also help fortify models against prompt injection.
Example of a jailbreak prompt publicly available on GitHub
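For intuition only, here is a deliberately naive Python sketch of the input/output screening idea described above. Production systems rely on trained classifiers, adversarial testing, and layered model-side defenses rather than keyword lists, and every name in this sketch is illustrative:

# A toy illustration of input/output screening, not a real safety system.
import re

INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
    re.compile(r"you are now DAN", re.IGNORECASE),  # common public jailbreak persona
]

def looks_like_jailbreak(text: str) -> bool:
    """Return True if the text matches a known jailbreak pattern."""
    return any(p.search(text) for p in INJECTION_PATTERNS)

def answer(prompt: str, model_call) -> str:
    # Input-side check before the prompt ever reaches the model.
    if looks_like_jailbreak(prompt):
        return "Request declined by safety policy."
    response = model_call(prompt)  # call the underlying model
    # Output-side check applies the same screening to what the model returns.
    if looks_like_jailbreak(response):
        return "Response withheld by safety policy."
    return response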
Some malicious actors unsuccessfully attempted to prompt Gemini for guidance on abusing Google products, such as advanced phishing techniques for Gmail, assistance coding a Chrome infostealer, and ways to bypass Google's account creation verification. Gemini did not produce malware or other content that could plausibly be used in a successful malicious campaign. Instead, the responses consisted of safety-guided content and generally helpful, neutral advice about coding and cybersecurity. In our continuous work to protect Google and our users, we have not seen threat actors either expand their capabilities or better succeed in their efforts to bypass Google's defenses.
Government-backed attackers attempted to use Gemini for coding and scripting tasks, gathering information about potential targets, researching publicly known vulnerabilities, and enabling post-compromise activities, such as defense evasion in a target environment.
Iran: Iranian APT actors were the heaviest users of Gemini, using it for a wide range of purposes, including research on defense organizations, vulnerability research, and creating content for campaigns. APT42 focused on crafting phishing campaigns, conducting reconnaissance on defense experts and organizations, and generating content with cybersecurity themes.
China: Chinese APT actors used Gemini to conduct reconnaissance, for scripting and development, to troubleshoot code, and to research how to obtain deeper access to target networks. They focused on topics such as lateral movement, privilege escalation, data exfiltration, and detection evasion.
North Korea: North Korean APT actors used Gemini to support several phases of the attack lifecycle, including researching potential infrastructure and free hosting providers, reconnaissance on target organizations, payload development, and assistance with malicious scripting and evasion techniques. They also used Gemini to research topics of strategic interest to the North Korean government, such as the South Korean military and cryptocurrency. Of note, North Korean actors also used Gemini to draft cover letters and research jobs—activities that would likely support North Korea’s efforts to place clandestine IT workers at Western companies.
Russia: With Russian APT actors, we observed limited use of Gemini during the period of analysis. Their Gemini use focused on coding tasks, including converting publicly available malware into another coding language and adding encryption functions to existing code.
Google analyzed Gemini activity associated with known APT actors and identified APT groups from more than 20 countries that used Gemini. The highest volume of usage was from Iran and China. APT actors used Gemini to support several phases of the attack lifecycle, including researching potential infrastructure and free hosting providers, reconnaissance on target organizations, research into vulnerabilities, payload development, and assistance with malicious scripting and evasion techniques. The top use cases by APT actors focused on:
Assistance with coding tasks, including troubleshooting, tool and script development and converting or rewriting existing code
Vulnerability research focused on publicly reported vulnerabilities and specific CVEs
General research on various technologies, translations and technical explanations
Reconnaissance about likely targets, including details about specific organizations
Enabling post-compromise activity, such as seeking advice on techniques to evade detection, escalate privileges, or conduct internal reconnaissance in a target environment
We observed APT actors attempting to use Gemini to support all phases of the attack lifecycle.
Attack lifecycle and topics of Gemini usage:

Reconnaissance (attacker gathers information about the target):
- Iran: recon on experts, international defense organizations, and government organizations; topics related to the Iran-Israel proxy conflict
- North Korea: research on companies across multiple sectors and geographies; recon on the US military and operations in South Korea; research on free hosting providers
- China: research on the US military and US-based IT service providers; understanding a public database of US intelligence personnel; research on target network ranges and determining domain names of targets

Weaponization (attacker develops or acquires tools to exploit the target):
- Develop webcam recording code in C++
- Convert a Chrome infostealer function from Python to Node.js
- Rewrite publicly available malware into another language
- Add AES encryption functionality to provided code

Delivery (attacker delivers the weaponized exploit or payload to the target system):
- Better understanding of advanced phishing techniques
- Generating content for targeting a US defense organization
- Generating content with cybersecurity and AI themes

Exploitation (attacker exploits a vulnerability to gain access):
- Reverse engineer endpoint detection and response (EDR) server components for health check and authentication
- Access Microsoft Exchange using a password hash
- Research vulnerabilities in the WinRM protocol
- Understand publicly reported vulnerabilities, including Internet of Things (IoT) bugs

Installation (attacker installs tools or malware to maintain access):
- Sign an Outlook Visual Studio Tools for Office (VSTO) plug-in and deploy it silently to all computers
- Add a self-signed certificate to Active Directory
- Research Mimikatz for Windows 11
- Research Chrome extensions that provide parental controls and monitoring

Command and control, C2 (attacker establishes a communication channel with the compromised system):
- Generate code to remotely access Windows Event Log
- Active Directory management commands
- JSON Web Token (JWT) security and routing rules in Ruby on Rails
- Character encoding issues in smbclient
- Command to check IPs of admins on the domain controller

Actions on objectives (attacker achieves their intended goal, such as data theft or disruption):
- Automate workflows with Selenium (e.g., logging into a compromised account)
- Generate a PHP script to extract emails from Gmail into electronic mail (EML) files
- Upload large files to OneDrive
- Solutions to TLS 1.3 visibility challenges
Iranian Government-Backed Actors
Iranian government-backed actors accounted for the largest Gemini use linked to APT actors. Across Iranian government-backed actors, we observed a broad scope of research and use cases, including to enable reconnaissance on targets, for research into publicly reported vulnerabilities, to request translation and technical explanations, and to create content for possible use in future campaigns. Their use reflected strategic Iranian interests including research focused on defense organizations and experts, defense systems, foreign governments, individual dissidents, the Israel-Hamas conflict, and social issues in Iran.
At a Glance: Iranian APT Actors Using Gemini
Over 10 Iran-backed groups observed using Gemini
Google abuse-focused use cases:
- Researching methods for extracting data from Android devices, including SMS messages, accounts, contacts, and social media accounts
Example use cases:
- Coding and scripting
  - PowerShell and Linux commands
  - Python code for website scraping
  - Debugging and improving a Ghidra script
  - Developing PHP scripts to collect and store user IP addresses and browser information in a MySQL database
  - Assistance with C# programming
  - Modifying assembly code
  - Help understanding error messages
- Vulnerability research
  - Research on specific CVEs and technologies, such as WinRM and IoT devices
  - Exploitation techniques and proof-of-concept code
  - Research on server-side request forgery (SSRF) exploitation techniques
  - Research on the open-source router exploitation tool RomBuster
- Research about organizations
  - International defense organizations
  - Military and government organizations
  - Cybersecurity companies
  - International organizations that monitor development of advanced weapons
- Research about warfare defenses
  - Information on the Iran-Israel proxy conflict
  - Unmanned aerial vehicles (UAV)
  - Anti-drone systems
  - Satellite technology
  - Remote sensing technology
  - Israel defense systems
- Generating content
  - Generating content with cybersecurity and AI themes
  - Tailoring content to target a defense organization
  - Translating various texts into Farsi, Hebrew, and English
Crafting Phishing Campaigns
Over 30% of Iranian APT actors’ Gemini use was linked to APT42. APT42’s Gemini activity reflected the group’s focus on crafting successful phishing campaigns. We observed the group using Gemini to conduct reconnaissance into individual policy and defense experts, as well as organizations of interest for the group.
In addition to reconnaissance, APT42 used Gemini's text generation and editing capabilities to craft material for phishing campaigns, including generating content with cybersecurity themes and tailoring the output to a US defense organization. APT42 also used Gemini for translation and localization, tailoring content to a local audience's culture and language, for example asking for translations in fluent English.
Vulnerability Research
The majority of APT42’s efforts focused on research into publicly known vulnerabilities, such as a request to generate a list of critical vulnerabilities from 2023. They also focused on vulnerabilities in specific products such as Mikrotik, Apereo, and Atlassian.
Of note, APT42 appeared to be researching how to use generative AI tools for offensive purposes, asking Gemini for help preparing training content for a red team focused on how offensive teams can use AI tools in their operations.
Research Into Military and Weapons Systems
APT42 also appears to have used Gemini’s translation and explanation functions to better understand publicly available information on defense systems. Their efforts included general research into the Israel-Hamas conflict, as well as strategic trends in China’s defense industry. The threat actor also used Gemini for technical explanations about US-made aerospace systems.
Another Iranian APT group also focused on understanding warfare defenses including specific research into satellite signal jamming and anti-drone systems. Other Iranian APT actors researched specific defense systems, including researching information about specific unmanned aerial vehicle (UAV) models, jamming F-35 fighter jets, anti-drone systems, and Israel’s missile defense systems.
People’s Republic of China (PRC) Government-Backed Actors
Government-backed actors linked to the People's Republic of China (PRC) attempted to use Gemini to enable reconnaissance on targets, for scripting and development, to request translation and explanation of technical concepts, and to attempt to enable deeper access to a network following initial compromise. PRC threat actors' usage resembled an IT admin seeking to streamline, troubleshoot, or automate their tasks. In a malicious context, however, this activity could be used to enable lateral movement, privilege escalation, data exfiltration, and detection evasion.
At a Glance: People’s Republic of China APT Actors Using Gemini
Over 20 China-backed groups observed using Gemini
Notable use cases:
- Reconnaissance
  - Research US military and US-based IT organizations
  - Gather US government network ranges
  - Understand publicly available information about US intelligence community personnel
  - Determine domain names of targets spanning eight countries, mostly government agencies
  - Access Microsoft Exchange using password hash
- Vulnerability research
  - Reverse engineer Carbon Black EDR's server components for health check and authentication
- Scripting and development
  - Generate code to remotely access Windows Event Log
  - Active Directory management commands
- Translation and explanation
  - Understand graph databases (Nebula Graph)
  - Solutions to TLS 1.3 visibility challenges
  - Understand a malicious PHP script
  - Web JWT security and routing rules in Ruby on Rails
- Deeper system access and post-compromise actions
  - Sign an Outlook VSTO plug-in and deploy it silently to all computers
  - Add a self-signed certificate to Active Directory
  - Upload large files to OneDrive
  - Character encoding issues in smbclient
  - Command to check IPs of admins on the domain controller
  - Record passwords on the VMware vCenter
  - Impacket troubleshooting
Enabling Deeper Access in a Target Network
PRC-backed APT actors also used Gemini to work through scripting and development tasks, many of which appeared intended to enable deeper access in a target network after threat actors obtained initial access. For example, one PRC-backed group asked Gemini for assistance figuring out how to sign a plugin for Microsoft Outlook and silently deploy it to all computers. The same actor also asked Gemini to generate code to remotely access Windows Event Log; sought instructions on how to add a self-signed certificate to Active Directory; and asked Gemini for a command to identify the IP addresses of administrators on the domain controller. Other actors used Gemini for help troubleshooting Chinese character encoding issues in smbclient and how to record passwords on the VMware vCenter.
In another example, PRC-backed APT actors asked Gemini for assistance with Active Directory management commands and requested help troubleshooting impacket, a Python-based tool for working with network protocols. While impacket is commonly used for benign purposes, the context of the threat actor made it clear that the actor was using the tool for malicious purposes.
Explaining Tools, Concepts, and Code
PRC actors utilized Gemini to learn about specific tools and technologies and develop solutions to technical challenges. For example, a PRC APT actor used Gemini to break down how to use the graph database Nebula Graph. In another instance, the same actor used Gemini to offer possible solutions to TLS 1.3 visibility challenges. Another PRC-backed APT group sought to understand a malicious PHP script.
Vulnerability Research and Reverse Engineering
In one case, a PRC-backed APT actor attempted unsuccessfully to get Gemini’s help reverse engineering the endpoint detection and response (EDR) tool Carbon Black. The same threat actor copied disassembled Python bytecode into Gemini to convert the bytecode into Python code. It’s not clear what their objective was.
Unsuccessful Attempts to Elicit Internal System Information From Gemini
In one case, the PRC-backed APT actor APT41 attempted unsuccessfully to use Gemini to learn about Gemini’s underlying infrastructure and systems. The actor asked Gemini to share details such as its IP address, kernel version, and network configuration. Gemini responded but did not disclose sensitive information. In a helpful tone, the responses provided publicly available details that would be widely known about the topic, while also indicating that the requested information is kept secret to prevent unauthorized access.
North Korean Government-Backed Actors
North Korean APT actors used Gemini to support several phases of the attack lifecycle, including researching potential infrastructure and free hosting providers, reconnaissance on target organizations, payload development, and assistance with malicious scripting and evasion techniques. They also used Gemini to research topics of strategic interest to the North Korean government, such as South Korean nuclear technology and cryptocurrency. We also observed that North Korean actors were using LLMs in likely attempts to enable North Korea’s efforts to place clandestine IT workers at Western companies.
At a Glance: North Korean APT Actors Using Gemini
Nine North Korea-backed groups observed using Gemini
Google-focused use cases:
- Research advanced techniques for phishing Gmail
- Scripting to steal data from compromised Gmail accounts
- Understanding a Chrome extension that provides parental controls (capable of taking screenshots, keylogging)
- Convert Chrome infostealer function from Python to Node.js
- Bypassing restrictions on Google Voice
- Generate code snippets for a Chrome extension
Notable use cases:
- Enabling clandestine IT worker scheme
  - Best Discord servers for freelancers
  - Exchange with overseas employees
  - Jobs on LinkedIn
  - Average salary
  - Drafting work proposals
  - Generate cover letters from job postings
- Research on topics
  - Free hosting providers
  - Cryptocurrency
  - Operational technology (OT) and industrial networks
  - Nuclear technology and power plants in South Korea
  - Historic cyber events (e.g., major worms and DDoS attacks; Russia-Ukraine conflict) and cyber forces of foreign militaries
- Research about organizations
  - Companies across 11 sectors and 13 countries
  - South Korean military
  - US military
  - German defense organizations
- Malware development
  - Evasion techniques
  - Automating workflows for logging into compromised accounts
  - Understanding Mimikatz for Windows 11
  - Scripting and troubleshooting
Clandestine IT Worker Threat
North Korean APT actors used Gemini to draft cover letters and research jobs—activities that would likely support efforts by North Korean nationals to use fake identities and obtain freelance and full-time jobs at foreign companies while concealing their true identities and locations. One North Korea-backed group utilized Gemini to draft cover letters and proposals for job descriptions, researched average salaries for specific jobs, and asked about jobs on LinkedIn. The group also used Gemini for information about overseas employee exchanges. Many of the topics would be common for anyone researching and applying for jobs.
While normally employment-related research would be typical for any job seeker, we assess the usage is likely related to North Korea’s ongoing efforts to place clandestine workers in freelance gigs or full-time jobs at Western firms. The scheme, which involves thousands of North Korean workers and has affected hundreds of US-based companies, uses IT workers with false identities to complete freelance work and send wages back to the North Korean regime.
North Korea’s AI toolkit
Outside of their use of Gemini, North Korean cyber threat actors have shown a long-standing interest in AI tools. They likely use AI applications to augment malicious operations and improve efficiency and capabilities, and for producing content to support their campaigns, such as phishing lures and profile photos for fake personas. We assess with high confidence that North Korean cyber threat actors will continue to demonstrate an interest in these emerging technologies for the foreseeable future.
DPRK IT Workers
We have observed DPRK IT Workers leverage accounts on assistive writing tools, Monica (monica.im) and Ahrefs (ahrefs.com), which could potentially aid the group’s work despite a lack of language fluency. Additionally, the group has maintained accounts on Data Annotation Tech, a company hiring individuals to train AI models. Notably, a profile photo used by a suspected IT worker bore a noticeable resemblance to multiple different images on the internet, suggesting that a manipulation tool was used to generate the threat actor’s profile photo.
APT43
Google Threat Intelligence Group (GTIG) has detected evidence of APT43 actors accessing multiple publicly available LLM tools; however, the intended purpose is not clear. Based on the capabilities of these platforms and historical APT43 activities, it is possible these applications could be used in the creation of rapport-building emails, lure content, and malicious PowerShell and scripting efforts.
GTIG has detected APT43 actors reference publicly available AI chatbot tools alongside the topic “북핵 해결” (translation: “North Korean nuclear issue solution”), indicating the group is using AI applications to conduct technical research as well as open-source analysis on South Korean foreign and military affairs and nuclear issues.
GTIG has identified APT43 actors accessing multiple publicly available AI image generation tools, including tools used for image manipulation and creating realistic-looking human portraits.
Target Research and Reconnaissance
North Korean actors also engaged with Gemini with several questions that appeared focused on conducting initial research and reconnaissance into prospective targets. They also researched organizations and industries that are typical targets for North Korean actors, including the US and South Korean militaries and defense contractors. One North Korean APT group asked Gemini for information about companies and organizations across a variety of industry sectors and regions. Some of this Gemini usage related directly to organizations that the same group had attempted to target in phishing and malware campaigns that Google previously detected and disrupted.
In addition to research into companies, North Korean APT actors researched nuclear technology and power plants in South Korea, such as site locations, recent news articles, and the security status of the plants. Gemini responded with widely available, public information and facts that would be easily discoverable in an online search.
Help with Scripting, Payload Development, Defense Evasion
North Korean actors also tried to use Gemini to assist with development and scripting tasks. One North Korea-backed group attempted to use Gemini to help develop webcam recording code in C++. Gemini provided multiple versions of code, and the actor's repeated attempts suggest they were frustrated with Gemini's answers. The same group also asked Gemini to generate a robots.txt file to block crawlers and an .htaccess file to redirect all URLs except those with CSS extensions.
One North Korean APT actor used Gemini for assistance developing code for sandbox evasion. For example, the threat actor utilized Gemini to write code in C++ to detect VM environments and Hyper-V virtual machines. Gemini provided responses with short code snippets to perform simple sandbox checks. The same group also sought help troubleshooting Java errors when implementing AES encryption, and separately asked Gemini if it is possible to acquire a system password on Windows 11 using Mimikatz.
Russian Government-Backed Actors
During the period of analysis, we observed limited use of Gemini by Russia-backed APT actors. Of this limited use, the majority of usage appeared benign, rather than threat-enabling. The reasons for this low engagement are unclear. It is possible Russian actors avoided Gemini out of operational security considerations, staying off Western-controlled platforms to avoid monitoring of their activities. They may be using AI tools produced by Russian firms or locally hosting LLMs, which would ensure full control of their infrastructure. Alternatively, they may have favored other Western LLMs.
One Russian government-backed group used Gemini to request help with a handful of tasks, including help rewriting publicly available malware into another language, adding encryption functionality to code, and explanations for how a specific block of publicly available malicious code functions.
At a Glance: Russian APT Actors Using Gemini
Three Russia-backed groups observed using Gemini
Notable use cases:
Scripting
Help rewriting public malware into another language
Payload crafting
Add AES encryption functionality to provided code
Translation and explanation
Understand how some public malicious code works
Financially Motivated Actors Using LLMs
Threat actors in underground marketplaces are advertising ways to bypass LLM security guardrails so the models will assist with malware development, phishing, and other malicious tasks. The offerings include jailbroken LLMs that are ready-made for malicious use.
Throughout 2023 and 2024, GTIG observed underground forum posts related to LLMs, indicating a burgeoning market for nefarious versions of these models. Some advertisements boast customized, jailbroken LLMs with no restrictions on malware development, or tout the absence of the security measures typically found on legitimate services, allowing users to prompt the LLM on any topic or task without triggering security guardrails or query limits. Examples include FraudGPT, which has been advertised on Telegram as having no limitations, and WormGPT, a privacy-focused, “uncensored” LLM capable of developing malware.
Financially motivated actors are using LLMs to augment business email compromise (BEC) operations. GTIG has noted evidence of such actors using manipulated video and voice content in BEC scams, and media reports indicate that financially motivated actors have used WormGPT to create more persuasive BEC messages.
Findings: Information Operations (IO) Actors Misusing Gemini
At a Glance: Information Operations Actors
IO actors attempted to use Gemini for research, content generation, translation and localization, and to find ways to increase their reach.
Iran: Iranian IO actors used Gemini for a wide range of tasks, accounting for three quarters of all IO prompts. They used Gemini for content creation and manipulation, including generating articles, rewriting text with a specific tone, and optimizing it for better reach. Their activity also focused on translation and localization, adapting content for different audiences, and on general research into news, current events, and political issues.
China: Pro-China IO actors used Gemini primarily for general research, including on topics of strategic interest to the Chinese government. The most prolific IO actor we track, DRAGONBRIDGE, was responsible for the majority of this activity. These actors also researched current events and politics, and in a few cases used Gemini to generate articles or content on specific topics.
Russia: Russian IO actors used Gemini primarily for general research, content creation, and translation. For example, they sought help drafting content, rewriting article titles, and planning social media campaigns. Some activity demonstrated an interest in developing AI capabilities, with actors asking for information on tools for creating online AI chatbots, developer tools for interacting with LLMs, and options for textual content analysis.
IO actors used Gemini for research, content generation (including developing personas and messaging), translation and localization, and to find ways to increase their reach. Common use cases include general research into news and current events as well as specific research into individuals and organizations. In addition to creating campaign content such as personas and messaging, the actors researched ways to increase the efficacy of their campaigns, including automating distribution, using search engine optimization (SEO) to expand reach, and improving operational security. As with government-backed groups, IO actors also used Gemini for translation and localization and for understanding the meaning or context of content.
Iran-Linked Information Operations Actors
Iran-based IO groups used Gemini for a wide range of tasks, including general research, translation and localization, content creation and manipulation, and generating content with a specific bias or tone. We also observed Iran-based IO actors engaging with Gemini about news events and asking for details on economic and political issues in Iran, the US, the Middle East, and Europe.
In line with their practice of mixing original and borrowed content, Iranian IO actors translated existing material, including news-like articles. They then used Gemini to explain the context and meaning of particular phrases within the given text.
Iran-based IO actors also used Gemini to localize content, seeking natural-sounding translations and asking for help making text read as if written by a native English speaker. They further used Gemini to manipulate text (e.g., asking for help rewriting existing text on immigration and crime in a specific style or tone).
Iranian actors’ activity also included research into improving the reach of their campaigns. For example, they attempted to generate SEO-optimized content, likely in an effort to reach a larger audience. Some actors also used Gemini to suggest strategies for increasing engagement on social media.
At a Glance: Iran-Linked IO Actors Using Gemini
Eight Iran-linked IO groups observed using Gemini
Example use cases:
Content creation – text
Generate article titles
Generate SEO-optimized content and titles
Draft a report critical of Bahrain
Draft titles and hashtags in English and Farsi for videos that are catchy or create urgency to watch the content
Draft titles and descriptions promoting Islam
Translation – content in / out of native language
Translate provided texts into Farsi on a variety of topics, including the Iranian election, human rights, international law, and Islam
Translate Farsi-language idioms and proverbs to other languages
Translate news about the US economy, US government, and politics into Farsi, using a specified tone
Draft a French-language headline to get viewers to engage with specific content
Content manipulation – copy editing to refine content
Reformulate specific text about Sharia law
Paraphrase content describing specific improvements to Iran’s export economy
Rewrite a provided text about diplomacy and economic challenges with countries like China and Germany
Provide synonyms for specific words or phrases
Rewrite provided text about Islam and Iraq in different styles or tones
Proofread provided content
Content creation – biased text
Generate or reformulate text to criticize a government minister and other individuals for failures or other actions
Describe how a popular American TV show perpetuates harmful stereotypes
Generate Islam-themed titles for thumbnail previews on social media
General research – news and events
Provide an overview of current events in specific regions
Research about the Iran-Iraq war
Define specific terms
Suggest social media channels for information about Islam and the Quran
Provide information on countries’ policies toward the Middle East
Create persona – photo generation
Create a logo
PRC-Linked Information Operations Actors
IO actors linked to the People’s Republic of China (PRC) used Gemini primarily for general research. The most prolific IO actor we track, the pro-China group DRAGONBRIDGE, was responsible for approximately three quarters of this activity. The majority of its use was general research spanning a wide variety of topics, ranging from the features of various social media platforms to subjects of strategic interest to the PRC government. Actors researched current events and politics in other regions, with a focus on the US and Taiwan, and also showed interest in assessing the impact and risk of certain events. In a handful of cases, DRAGONBRIDGE used Gemini to generate articles or content on specific topics.
At a Glance: PRC-Linked IO Actors Using Gemini
Three PRC-linked IO groups observed using Gemini
Example use cases:
General research – political and social topics
Research about specific countries, organizations, and individuals
Research relations between specific countries and China
Research on topics sensitive to the Chinese government (e.g., the “five poisons”)
Research on Taiwanese politicians and their actions toward China
Research on US politics and political figures and their attitudes on China
Research foreign press coverage about China
Summarize key takeaways from a video
General research – technology
Compare functionality and features of different social media platforms
Explain technical concepts and suggest useful tools
Translation – content in / out of native language
Translate and summarize text between Chinese and other languages
Content creation – text
Draft articles on topics such as the use of AI and social movements in specific regions
Generate a summary of a movie trailer about a Chinese dissident
Create persona – text generation
Generate a company profile for a media company
DRAGONBRIDGE has experimented with other generative AI tools to create synthetic content in support of their IO campaigns. As early as 2022, the group used a commercial AI service in YouTube videos to depict AI-generated news presenters. Their use of AI-generated videos and images continued through 2024 but has not resulted in significantly higher engagement from real viewers, and Google detected and terminated the channels distributing this content immediately upon discovery.
Russia-Linked Information Operations Actors
Russian IO actors used Gemini for general research, content creation, and translation. Half of this activity was associated with the Russian IO actor we track as KRYMSKYBRIDGE, which is linked to a Russian consulting firm that works with the Russian government. Approximately 40% of the activity was linked to actors associated with Russian state-sponsored entities formerly controlled by the late Russian oligarch Yevgeny Prigozhin. We also observed usage by actors tracked publicly as Doppelganger.
The majority of Russian IO actor usage related to general research on topics ranging from the Russia-Ukraine war to details about various tools and online services. Russian IO actors also used Gemini for content creation, rewriting article titles, and planning social media campaigns. Translation to and from Russian was also a common task.
Russian IO actors also researched the generative AI landscape, which may indicate an interest in developing native AI capabilities on infrastructure they control. They researched tools for creating online AI chatbots and developer tools for interacting with LLMs, and one Russian IO actor used Gemini to suggest options for textual content analysis.
Pro-Russia IO actors have used AI in their influence campaigns in the past. In 2024, the actor known as CopyCop likely used LLMs to generate content, and some stories on their sites included metadata indicating an LLM was prompted to rewrite articles from genuine news sources with a particular political perspective or tone. CopyCop’s inauthentic news sites pose as US- and Europe-based news outlets and post Kremlin-aligned views on Western policy, the war in Ukraine, and domestic politics in the US and Europe.
At a Glance: Russia-Linked IO Actors Using Gemini
Four Russia-linked IO groups observed using Gemini
Example use cases:
General research
Research into the Russia-Ukraine war
Explain subscription plans and API details for online services
Research on different generative AI platforms, software, and systems for interacting with LLMs
Research on tools and methods for creating an online chatbot
Research tools for content analysis
Translation – content in / out of native language
Translate technical and business terminology into Russian
Translate text to/from Russian
Content creation – text
Draft a proposal for a social media agency
Rewrite article titles to garner more attention
Plan and strategize campaigns
Develop content strategy for different social media platforms and regions
Brainstorm ideas for a PR campaign and accompanying visual designs
Building AI Safely and Responsibly
We believe our approach to AI must be both bold and responsible. To us, that means developing AI in a way that maximizes the positive benefits to society while addressing the challenges. Guided by our AI Principles, Google designs AI systems with robust security measures and strong safety guardrails, and we continuously test the security and safety of our models to improve them. Our policy guidelines and prohibited use policies prioritize safety and responsible use of Google’s generative AI tools. Google’s policy development process includes identifying emerging trends, thinking end-to-end, and designing for safety. We continuously enhance safeguards in our products to offer scaled protections to users across the globe.
At Google, we leverage threat intelligence to disrupt adversary operations. We investigate abuse of our products, services, and platforms, as well as threats to our users, including malicious cyber activities by government-backed threat actors, and work with law enforcement when appropriate. Moreover, our learnings from countering malicious activities feed back into our product development to improve the safety and security of our AI models. Google DeepMind also develops threat models for generative AI to identify potential vulnerabilities, and creates new evaluation and training techniques to address misuse. In conjunction with this research, DeepMind has shared how it is actively deploying defenses within AI systems, along with measurement and monitoring tools, including a robust evaluation framework used to automatically red team an AI system’s vulnerability to indirect prompt injection attacks. Our AI development and Trust & Safety teams also work closely with our threat intelligence, security, and modeling teams to stem misuse.
The potential of AI, especially generative AI, is immense. As innovation moves forward, the industry needs security standards for building and deploying AI responsibly. That’s why we introduced the Secure AI Framework (SAIF), a conceptual framework to secure AI systems. We’ve shared a comprehensive toolkit for developers with resources and guidance for designing, building, and evaluating AI models responsibly. We’ve also shared best practices for implementing safeguards, evaluating model safety, and red teaming to test and secure AI systems.
About the Authors
Google Threat Intelligence Group brings together the Mandiant Intelligence and Threat Analysis Group (TAG) teams, and focuses on identifying, analyzing, mitigating, and eliminating entire classes of cyber threats against Alphabet, our users, and our customers. Our work includes countering threats from government-backed attackers, targeted 0-day exploits, coordinated information operations (IO), and serious cyber crime networks. We apply our intelligence to improve Google’s defenses and protect our users and customers.