The ability to deploy Swift’s Alliance Connect Virtual in Google Cloud allows financial institutions to leverage the scalability, flexibility, and cost-effectiveness of cloud infrastructure while maintaining the security and reliability standards required for financial transactions. By virtualizing the traditionally hardware-based Swift VPN connections, institutions can streamline their infrastructure, reduce operational overhead, and accelerate their digital transformation initiatives. Additionally, Google Cloud’s robust security features and compliance certifications help keep sensitive financial data protected.
“Cloud technology has been game-changing for the financial industry over the past decade and will be a key enabler of future transaction forms and flows. With the launch of Alliance Connect Virtual, Swift has taken a major step forward in supporting our customers’ cloud journeys, offering seamless and secure access to Swift via the public cloud. Teaming up with Google Cloud, we’re proud to deliver flexible and resilient solutions that align with the fast-growing cloud-first mindset of our customers, driving innovation while maintaining the highest levels of security and reliability. The feedback we have received from our pilot customers on Google Cloud has been overwhelmingly positive, and we are looking forward to seeing the adoption of the new offer scale.” – Sophie Racquet, Head of Alliance Connect Product Management, Swift
Architecting Alliance Connect Virtual on Google Cloud
The following diagrams show reference architectures for deploying the Alliance Connect Virtual connectivity project on Google Cloud. Alliance Connect Virtual is set up in Google Cloud and provides connectivity to Swift via virtualized Juniper vSRX VPNs, over the internet or over pseudo-leased-line connections to the Swift network through network providers, based on the customer-chosen connectivity offering (Gold, Silver, or Bronze). A pseudo leased line consists of four VLAN attachments, and each pair of VLAN attachments has its own Cloud Router and two Partner Interconnect connections.
Alliance Connect Virtual is offered in three packages: Bronze, Silver, and Gold. Depending on the criticality of your Swift traffic and your resiliency requirements, you can choose the tier that best aligns with your needs. The architecture for each package is shown below.
Alliance Connect Virtual Gold:
The Alliance Connect Virtual Gold connectivity package provides the strongest resiliency and service level of the three options. Connectivity to Swift is established through Partner Interconnect, provisioning two connections of equal capacity, with an enterprise-grade connection to Google Cloud that has the highest throughput of the three packages. Traffic goes through a service provider with a dedicated connection. By bypassing the public internet, your traffic takes fewer hops, so there are fewer points of failure where your traffic might get dropped or disrupted. This option is designed for customers handling more than 40,000 messages per day.
Alliance Connect Virtual Silver:
The Alliance Connect Virtual Silver package provides connectivity through one dedicated pseudo-leased-line connection from a network provider using Partner Interconnect, providing high bandwidth and throughput. In this setup, an internet connection is added as a backup. This option is designed for customers handling between 1,000 and 40,000 messages per day.
Alliance Connect Virtual Bronze:
The Alliance Connect Virtual Bronze option provides low-cost internet connectivity. In this setup, you can connect two VPN boxes to maintain a backup connection in case of failure. This option is designed for customers handling up to 1,000 messages per day.
Find out more about the different Alliance Connect Virtual packages here.
This architecture includes the following components:
A set of VPC networks for different vSRX network interfaces to segregate the traffic (Untrust VPC, Trust VPC, Interconnect VPC, and Management VPC). The traffic to Partner Interconnect or the internet goes through the Untrust VPC. A minimal provisioning sketch follows this list.
A set of VPC Subnets for different vSRX network interfaces to segregate the traffic (Untrust Subnets, Trust Subnets, Interconnect Subnets and Management Subnets)
A set of Firewall rules to control egress/ingress traffic between the Swift Network and other VPCs
Configuration of the Routes for the VPCs created above
Cloud Routers as per the architecture above that provide the routing for Cloud Interconnect.
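These components can be provisioned with whatever infrastructure tooling you already use. As a rough illustration only, here is a minimal Python sketch using the google-cloud-compute client to create one of the segregated VPCs and a matching subnet; the project ID, region, names, and CIDR range are placeholder assumptions, and the actual deployment templates come from Swift.

from google.cloud import compute_v1

PROJECT = "my-swift-vpn-project"  # placeholder project ID
REGION = "europe-west1"           # placeholder region

def create_vpc_and_subnet(vpc_name: str, subnet_name: str, cidr: str) -> None:
    # Create a custom-mode VPC (no auto subnets), for example the Untrust VPC.
    network = compute_v1.Network(name=vpc_name, auto_create_subnetworks=False)
    compute_v1.NetworksClient().insert(project=PROJECT, network_resource=network).result()

    # Add a regional subnet for the corresponding vSRX network interface.
    subnet = compute_v1.Subnetwork(
        name=subnet_name,
        ip_cidr_range=cidr,
        network=f"projects/{PROJECT}/global/networks/{vpc_name}",
        region=REGION,
    )
    compute_v1.SubnetworksClient().insert(
        project=PROJECT, region=REGION, subnetwork_resource=subnet
    ).result()

# Repeat per traffic domain (Untrust, Trust, Interconnect, Management).
create_vpc_and_subnet("untrust-vpc", "untrust-subnet", "10.10.0.0/24")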
Swift offers various messaging interfaces tailored to different customer needs and levels of complexity. Below, we show how each of the following messaging applications can be deployed on Google Cloud and connected via Alliance Connect Virtual:
Alliance Cloud
Alliance Access
Alliance Messaging Hub
Along with the messaging interface, the High Availability (HA) tool is deployed in the application project. This tool enhances the resilience and uptime of the connection to the Swift network through Alliance Connect Virtual (the connectivity packs deployed in the VPN project). The HA VMs achieve this by:
Monitoring and managing routing tables: This helps ensure that if one connection path to the Swift network or one availability zone becomes unavailable, the traffic can be seamlessly rerouted through the alternative path, minimizing disruption. A minimal sketch of this route failover follows this list.
Maintaining redundant vSRX machines: Typically, the HA VMs oversee the two Compute Engine VMs that host the Juniper vSRX VPN, with one vSRX acting as the primary connection point and the other on standby. If the primary vSRX fails, the other vSRX automatically takes over the connection, helping to ensure continuity of service.
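The exact failover logic is implemented by Swift's HA tooling, but conceptually it amounts to repointing a static route at the standby vSRX instance. Below is a minimal sketch of that idea using the google-cloud-compute client; the project, network, route name, destination range, and instance URL are placeholder assumptions, not the HA tool's actual implementation.

from google.cloud import compute_v1

PROJECT = "my-swift-vpn-project"                 # placeholder project ID
ROUTE_NAME = "to-swift-via-vsrx"                 # placeholder route name
NETWORK = f"projects/{PROJECT}/global/networks/trust-vpc"
STANDBY_VSRX = f"projects/{PROJECT}/zones/europe-west1-c/instances/vsrx-b"  # placeholder

def fail_over_route(dest_range: str = "10.99.0.0/16") -> None:
    routes = compute_v1.RoutesClient()
    # Remove the route that points at the failed primary vSRX...
    routes.delete(project=PROJECT, route=ROUTE_NAME).result()
    # ...and recreate it with the standby vSRX as the next hop.
    new_route = compute_v1.Route(
        name=ROUTE_NAME,
        network=NETWORK,
        dest_range=dest_range,
        next_hop_instance=STANDBY_VSRX,
        priority=100,
    )
    routes.insert(project=PROJECT, route_resource=new_route).result()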
1. Alliance Cloud on Google Cloud:
Alliance Cloud is a fully managed, cloud-based financial messaging interface that connects customers to Swift's services with the benefits of cloud deployments, such as reduced infrastructure management. Alliance Cloud offers a reduced total cost of ownership given that it is managed and hosted by Swift. Find more information on the Swift website.
Alliance Cloud offers the following connectivity options to integrate the messaging flows of customers' back-office applications with Alliance Cloud:
Alliance Cloud offers a direct API called the Swift Messaging API (more information is available on the Swift Developer Portal), allowing customer back-office systems to integrate with Alliance Cloud using RESTful APIs. This can be achieved by choosing from Swift's API footprint options: zero footprint, Swift SDK, or Swift Microgateway (more information can be found on the Swift Developer Portal).
Alliance Cloud offers a software footprint through the Swift Integration Layer. This offers both file and RESTful API connectivity between the Swift Integration Layer and the customer back-office applications.
2. Alliance Access on Google Cloud:
Alliance Access is a Swift messaging interface that enables banks and financial institutions to connect securely to Swift. Find more information on the Swift website. Alliance Access components can be deployed and managed within your Google Cloud environment. The following components make up the Alliance Access solution:
Alliance Access Server: This is the core of the solution, a software application installed on the institution’s infrastructure. It acts as the interface between the institution’s internal systems and the Swift network.
Alliance Web Platform: A web-based interface that allows users to monitor message flows, manage configurations, and perform various operational tasks related to Swift messaging.
Alliance Gateway: A component that provides additional security and routing capabilities, by concentrating your flows from different interfaces through to Swift.
SwiftNet Link (SNL): Enables Alliance Gateway to perform application-to-application communication over SwiftNet services. Connectivity can be established via the different connectivity packs of Alliance Connect Virtual on Google Cloud.
Below, we present a few reference architectures showing what a deployment of Alliance Access could look like, using Alliance Connect Virtual in Google Cloud to establish connectivity to the Swift network:
Alliance Access itself does not require an independent Oracle database instance for its core functionality, as it comes with its own embedded Oracle Database Standard Edition instance. For the Alliance Access deployment on Google Cloud, the reference architecture above uses the embedded Oracle database, which is the deployment method supported for Alliance Access on Google Cloud.
Alliance Gateway and Alliance Web Platform come with an embedded Oracle database Standard Edition. These products mainly use it for storing configuration and logs, and do not store business data.
3. Alliance Messaging Hub
Alliance Messaging Hub (AMH) is a modular, financial messaging solution offered by Swift. AMH provides extensive throughput and sophisticated data management, delivering routing between different messaging services. Find more information on their website. The following components will make up the Alliance Messaging Hub (AMH) solution:
AMH Physical Nodes (servers): This is the core of the solution. An AMH Physical Node is a software application that acts as the interface between the institution’s internal systems and the Swift network. One or more such servers can be deployed.
Alliance Gateway: An optional component that provides additional security and routing capabilities, by concentrating your flows from different interfaces to Swift.
SNL: Enables Alliance Gateway to perform application-to-application communication over SwiftNet services. Connectivity can be established via the different connectivity packs of Alliance Connect Virtual on Google Cloud.
An Oracle Database shared by AMH Physical Nodes: Unlike Alliance Access, AMH does not come with the option of an embedded Oracle database; AMH customers need to provide the database. To host their Oracle database on Google Cloud, customers can use Bare Metal Solution, which provides a secure environment in which they can run specialized workloads, such as Oracle databases, on high-performance, bare-metal servers. On the other hand, the Google Cloud and Oracle partnership opens up many possibilities for customers to host their Oracle database in the cloud, such as using Oracle Database@Google Cloud or hosting Oracle on Compute Engine. Oracle Database@Google Cloud allows customers to host database services in a Google Cloud datacenter running on Oracle Cloud Infrastructure (OCI) hardware.
Oracle Database@Google Cloud
Oracle Database on Google Compute Engine
Bare Metal Solution
OCI and Google Cross-Cloud Interconnect
Why deploy Swift connectivity on Google Cloud
Deploying the Swift connectivity stack on Google Cloud offers a compelling solution for financial institutions due to the platform’s inherent advantages:
Google Cloud’s robust infrastructure, designed to meet specific workload and industry needs, ensures high availability and reliability for mission-critical financial operations.
This infrastructure is optimized for AI, allowing institutions to leverage advanced analytics and automation for enhanced efficiency and security.
Additionally, Google Cloud’s commitment to sustainability aligns with the growing emphasis on responsible business practices, helping organizations minimize their environmental footprint while benefiting from advanced technology.
Furthermore, Google Cloud’s collaborative tools, powered by AI, streamline communication and workflow processes, empowering teams to work more efficiently and effectively.
The reference architectures above enable a secure and reliable connection to Swift by leveraging Google Cloud Infrastructure and network components. The following Google Cloud components play a crucial role in establishing a secure connection to Swift:
Partner Interconnect: Google Cloud Partner Interconnect offers a way to connect Swift's on-premises network and the Alliance Connect Virtual VPC network through a supported service provider. This type of connection provides secure and reliable data transfer, bypassing the public internet. The solution is also scalable, allowing you to increase capacity as your needs change. A minimal provisioning sketch follows this list.
Bare Metal Rack HSM: A key component of the Swift architecture is the Swift HSM, a dedicated hardware device that safeguards Swift's Public Key Infrastructure (PKI) credentials, ensuring secure signing of live traffic and authentication of production services. To gain the benefits of the cloud when hosting Swift HSM, customers can use Bare Metal Rack HSM, which provides dedicated racks and switches for hosting HSMs, ensuring isolation and a high degree of control over the environment. This aligns well with the security requirements of Swift HSM, which demands robust protection of sensitive key material. The Bare Metal Rack HSM solution is hosted in colocation facilities with active peering fabrics, ensuring low-latency connections to Google Cloud workloads. Google's standards for these facilities and redundant infrastructure contribute to a highly available service. It is also hosted in facilities compliant with PCI-DSS, PCI-3DS, and SOC 1, 2, and 3 standards.
Oracle Database: The deployment of Alliance Messaging Hub requires Swift customers to deploy an Oracle database. Google provides customers with several options to deploy Oracle databases through the Google and Oracle partnership, which makes it easy for customers to migrate, modernize, and manage their Oracle-based applications in the cloud. You can find the different ways to deploy Oracle on Google Cloud here, offering flexibility for your deployments.
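Referring back to the Partner Interconnect item above: on the Google Cloud side, a pseudo leased line ultimately maps to Cloud Routers and Partner Interconnect VLAN attachments. The following minimal Python sketch, using the google-cloud-compute client, is for illustration only; the project, region, names, and edge availability domain are placeholder assumptions, and activation is completed with the pairing key and the Swift-approved network provider.

from google.cloud import compute_v1

PROJECT = "my-swift-vpn-project"   # placeholder project ID
REGION = "europe-west1"            # placeholder region

# A Cloud Router for the attachment (ASN 16550 is required for Partner Interconnect).
router = compute_v1.Router(
    name="swift-ic-router-1",
    network=f"projects/{PROJECT}/global/networks/interconnect-vpc",
    bgp=compute_v1.RouterBgp(asn=16550),
)
compute_v1.RoutersClient().insert(
    project=PROJECT, region=REGION, router_resource=router
).result()

# A Partner Interconnect VLAN attachment; after creation, its pairing key is
# handed to the network provider to complete the pseudo-leased-line connection.
attachment = compute_v1.InterconnectAttachment(
    name="swift-vlan-attachment-1",
    type_="PARTNER",
    router=f"projects/{PROJECT}/regions/{REGION}/routers/swift-ic-router-1",
    edge_availability_domain="AVAILABILITY_DOMAIN_1",
    admin_enabled=False,
)
compute_v1.InterconnectAttachmentsClient().insert(
    project=PROJECT, region=REGION, interconnect_attachment_resource=attachment
).result()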
To learn more about the exciting collaboration between Google Cloud and Swift, contact your Google Cloud sales representative, partner manager, or your Swift account manager.
Written By: Jacob Paullus, Daniel McNamara, Jake Rawlins, Steven Karschnia
Executive Summary
Mandiant exploited flaws in the Microsoft Software Installer (MSI) repair action of Lakeside Software’s SysTrack installer to obtain arbitrary code execution.
An attacker with low-privilege access to a system running the vulnerable version of SysTrack could escalate privileges locally.
Mandiant responsibly disclosed this vulnerability to Lakeside Software, and the issue has been addressed in version 11.0.
Introduction
Building upon the insights shared in a previous Mandiant blog post, Escalating Privileges via Third-Party Windows Installers, this case study explores the ongoing challenge of securing third-party Windows installers. These vulnerabilities are rooted in insecure coding practices when creating Microsoft Software Installer (MSI) Custom Actions and can be caused by references to missing files, broken shortcuts, or insecure folder permissions. These oversights create gaps that inadvertently allow attackers the ability to escalate privileges.
As covered in our previous blog post, after software is installed with an MSI file, Windows caches the MSI file in the C:\Windows\Installer folder for later use. This allows users on the system to access and use the "repair" feature, which is intended to address various issues that may be impacting the installed software. During execution of an MSI repair, several operations (such as file creation or execution) may be triggered from an NT AUTHORITY\SYSTEM context, even if initiated by a low-privilege user, thereby creating privilege escalation opportunities.
This blog post specifically focuses on the discovery and exploitation of CVE-2023-6080, a local privilege escalation vulnerability that Mandiant identified in Lakeside Software’s SysTrack Agent version 10.7.8.
Exploiting the SysTrack Installer
Mandiant began by using Microsoft’s Process Monitor (ProcMon) to analyze and review file operations executed during the repair process of SysTrack’s MSI. While running the repair process as a low-privileged user, Mandiant observed file creation and execution within the user’s %TEMP% folder from MSIExec.exe.
Figure 1: MSIExec.exe copying and executing .tmp file in user’s %TEMP% folder
Each time Mandiant ran the repair functionality, MSIExec.exe wrote a new .tmp file to the %TEMP% folder using a formula-based name, and then executed it. Mandiant discovered, through dynamic analysis of the installer, that the name generated by the repair function consisted of the string "wac" followed by four randomly chosen hex characters (0-9, A-F). With this naming scheme, there were 65,536 possible filename options.
Due to the %TEMP% folder being writable by a low-privilege user, Mandiant tested the behavior of the repair tool when all possible filenames already existed within the %TEMP% folder. Mandiant created a PowerShell script to copy an arbitrary test executable to each possible file name in the range of wac0000.tmp to wacFFFF.tmp.
# Path to the permutations file
$csvFilePath = '.\permutations.csv'
# Path to the executable
$exePath = '.\test.exe'
# Target directory (using the system's temp directory)
$targetDirectory = [System.IO.Path]::GetTempPath()
# Read the CSV file content
$csvContent = Get-Content -Path $csvFilePath
# Split the content into individual values
$values = $csvContent -split ","
# Loop through each value and copy the exe to the target directory with the new name
foreach ($value in $values) {
    $newFilePath = Join-Path -Path $targetDirectory -ChildPath ($value + ".tmp")
    Copy-Item -Path $exePath -Destination $newFilePath
}
Write-Output "Copy operation completed to $targetDirectory"
Figure 2: Creating all possible .tmp files in %TEMP%
Figure 3: Excerpt of .tmp files created in %TEMP%
After filling the previously identified namespace, Mandiant reran the MSI repair function to observe its subsequent behavior. Upon review of the ProcMon output, Mandiant observed that when the namespace was filled, the application would fail over to an incrementing filename pattern. The pattern began with wac1.tmp and incremented the number each time the previous file existed, in a predictable manner. To prove this theory, Mandiant manually created wac1.tmp and wac2.tmp, then observed the MSI repair action in ProcMon. When running the MSI repair function, the resulting filename was wac3.tmp.
Figure 4: MSIExec.exe writing and executing a predicted .tmp file
Additionally, Mandiant observed that there was a small delay between the file write action and the file execution action, which could potentially result in a race condition vulnerability. Since Mandiant could now force the program to use a predetermined filename, Mandiant wrote another PowerShell script designed to attempt to win the race condition by copying a file (test.exe) to the %TEMP% folder, using the predicted filename, between the file write and execution in order to overwrite the file created by MSIExec.exe. In this test, test.exe was a simple proof-of-concept executable that would start notepad.exe.
while ($true) {
    if (Test-Path -Path "C:\Users\USER\AppData\Local\Temp\wac3.tmp") {
        Copy-Item -Path "C:\Users\USER\Desktop\test.exe" -Destination "C:\Users\USER\AppData\Local\Temp\wac3.tmp" -Force
    }
}
Figure 5: PowerShell race condition script to copy arbitrary file into %TEMP%
With wac1.tmp and wac2.tmp created in the %TEMP% folder, Mandiant ran both the PowerShell script and the MSI repair action targeting wac3.tmp. With the race condition script running, execution of the repair action resulted in test.exe overwriting the intended binary and subsequently being executed by MSIExec.exe, opening cmd.exe as NT AUTHORITY\SYSTEM.
Figure 6: Obtaining an NT AUTHORITY\SYSTEM command prompt
Defensive Considerations
As discussed in Mandiant's previous blog post, misconfigured Custom Actions can be trivial to find and exploit, making them a significant security risk for organizations. It is essential for software developers to follow secure coding practices and review their implemented Custom Actions to prevent attackers from hijacking high-privilege operations triggered by the MSI repair functionality. Refer to the original blog post for general best practices when configuring Custom Actions. In the discovery of CVE-2023-6080, Mandiant identified several misconfigurations and oversights that allowed for privilege escalation to NT AUTHORITY\SYSTEM.
The SysTrack MSI performed file operations including creation and execution in the user's %TEMP% folder, which provides a low-privilege user the opportunity to alter files being actively used in a high-privilege context. Software developers should keep folder permissions in mind and ensure all privileged file operations are performed from folders that are appropriately secured. This can include altering the read/write permissions for the folder, or using built-in folders such as C:\Program Files or C:\Program Files (x86), which are inherently protected from low-privilege users.
Additionally, the software's filename generation schema included a failover mechanism that allowed an attacker to force the application into using a predetermined filename. When using randomized filenames, developers should use names with enough length and entropy that an attacker cannot exhaust all possible filenames and force the application into unexpected behavior. In this case, knowing the target filename before execution made it significantly easier to win the race condition, as opposed to dynamically identifying and replacing the target file between the time of its creation by MSIExec.exe and the time of its execution.
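As a simple illustration of that guidance (not a patch for this product), the Python sketch below generates temporary file names with far more entropy than a four-character hex suffix and creates them exclusively, so a pre-planted file causes an error instead of being silently reused; the "wac" prefix is kept only to mirror the example above.

import os
import secrets
import tempfile

def create_private_temp(directory: str = tempfile.gettempdir()) -> str:
    # 32 hex characters gives 2**128 possible names, versus 65,536 for wacXXXX.tmp.
    name = f"wac{secrets.token_hex(16)}.tmp"
    path = os.path.join(directory, name)
    # O_EXCL makes creation fail if the file already exists, so an attacker
    # cannot pre-create the target and have it silently picked up.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    os.close(fd)
    return path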
Something security professionals must also consider is the safety of the programs running on corporate machines. Many approved applications may inadvertently contain security vulnerabilities that increase the risk in our environments. Mandiant recommends that companies consider auditing the security of their individual endpoints to ensure that defense in depth is maintained at an organizational level. Furthermore, where possible, companies should monitor the spawning of administrative shells such as cmd.exe and powershell.exe in an elevated context to alert on possible privilege escalation attempts.
A Final Word
Domain privilege escalation is often the focus of security vendors and penetration tests, but it is not the only avenue for privilege escalation or compromise of data integrity in a corporate environment. Compromise of integrity on a single system can allow an attacker to mount further attacks throughout the network; for example, the Network Access Account used by SCCM can be compromised through a single workstation and when misconfigured can be used to escalate privileges within the domain and pivot to additional systems within the network.
Mandiant offers dedicated endpoint security assessments, during which customer endpoints are tested from multiple contexts, including the perspective of an adversary with low-privilege access attempting to escalate privileges. For more information about Mandiant’s technical consulting services, including comprehensive endpoint security assessments, visit our website.
We would like to extend a special thanks to Andrew Oliveau, who was a member of the testing team that discovered this vulnerability during his time at Mandiant.
CVE-2023-6080 Disclosure Timeline
June 13, 2024 – Vulnerability reported to Lakeside Software
July 1, 2024 – Lakeside Software confirmed the vulnerability
August 7, 2024 – Confirmed vulnerability fixed in version 11.0
For developers who want to use the PyTorch deep learning framework with Cloud TPUs, the PyTorch/XLA Python package is key, offering developers a way to run their PyTorch models on Cloud TPUs with only a few minor code changes. It does so by leveraging OpenXLA, developed by Google, which gives developers the ability to define their model once and run it on many different types of machine learning accelerators (i.e., GPUs, TPUs, etc.).
The latest release of PyTorch/XLA comes with several performance improvements for developers:
A new experimental scan operator to speed up compilation for repetitive blocks of code (i.e., for loops)
Host offloading to move TPU tensors to the host CPU’s memory to fit larger models on fewer TPUs
Improved goodput for tracing-bound models through a new base Docker image compiled with the C++ 2011 Standard application binary interface (C++ 11 ABI) flags
In addition to these improvements we’ve also re-organized the documentation to make it easier to find what you’re looking for!
Let’s take a look at each of these features in greater depth.
Experimental scan operator
Have you ever experienced long compilation times, for example when working with large language models and PyTorch/XLA — especially when dealing with models with numerous decoder layers? During graph tracing, where we traverse the graph of all the operations being performed by the model, these iterative loops are completely “unrolled” — i.e., each loop iteration is copied and pasted for every cycle — resulting in large computation graphs. These larger graphs lead directly to longer compilation times. But now there’s a new solution: the new experimental scan function, inspired by jax.lax.scan.
The scan operator works by changing how loops are handled during compilation. Instead of compiling each iteration of the loop independently, which creates redundant blocks, scan compiles only the first iteration. The resulting compiled high-level operation (HLO) is then reused for all subsequent iterations. This means that there is less HLO or intermediate code that is being generated for each subsequent loop. Compared to a for loop, scan compiles in a fraction of the time since it only compiles the first loop iteration. This improves the developer iteration time when working on models with many homogeneous layers, such as LLMs.
Building on top of torch_xla.experimental.scan, the torch_xla.experimental.scan_layers function offers a simplified interface for looping over sequences of nn.Modules. Think of it as a way to tell PyTorch/XLA “These modules are all the same, just compile them once and reuse them!” For example:
import torch
import torch.nn as nn
import torch_xla
from torch_xla.experimental.scan_layers import scan_layers

class DecoderLayer(nn.Module):
    def __init__(self, size):
        super().__init__()
        self.linear = nn.Linear(size, size)

    def forward(self, x):
        return self.linear(x)

with torch_xla.device():
    layers = [DecoderLayer(1024) for _ in range(64)]
    x = torch.randn(1, 1024)

# Instead of a for loop, we can scan_layers once:
# for layer in layers:
#     x = layer(x)
x = scan_layers(layers, x)
One thing to note is that custom Pallas kernels do not yet support scan. Here is a complete example of using scan_layers in an LLM for reference.
Host offloading
Another powerful tool for memory optimization in PyTorch/XLA is host offloading. This technique allows you to temporarily move tensors from the TPU to the host CPU’s memory, freeing up valuable device memory during training. This is especially helpful for large models where memory pressure is a concern. You can use torch_xla.experimental.stablehlo_custom_call.place_to_host to offload a tensor and torch_xla.experimental.stablehlo_custom_call.place_to_device to retrieve it later. A typical use case involves offloading intermediate activations during the forward pass and then bringing them back during the backward pass. Here’s an example of host offloading for reference.
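Here is a minimal sketch of the pattern using the functions named above; the tensor shape and the point at which you offload and restore are assumptions for illustration, and the linked example shows a complete recipe.

import torch
import torch_xla
from torch_xla.experimental.stablehlo_custom_call import place_to_host, place_to_device

with torch_xla.device():
    # Stand-in for a large intermediate activation produced during the forward pass.
    activation = torch.randn(8192, 8192)

    # Move it to host CPU memory to free TPU HBM for the rest of the step...
    offloaded = place_to_host(activation)

    # ...and bring it back onto the device when the backward pass needs it.
    restored = place_to_device(offloaded)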
Strategic use of host offloading, such as when you’re working with limited memory and are unable to use the accelerator continuously, may significantly improve your ability to train large and complex models within the memory constraints of your hardware.
Alternative base Docker image
Have you ever encountered a situation where your TPUs are sitting idle while your host CPU is heavily loaded tracing your model execution graph for just-in-time compilation? This suggests your model is “tracing bound,” meaning performance is limited by the speed of tracing operations.
The C++11 ABI image offers a solution. Starting with this release, PyTorch/XLA offers a choice of C++ ABI flavors for both Python wheels and Docker images. This gives you a choice for which version of C++ you’d like to use with PyTorch/XLA. You’ll now find builds with both the pre-C++11 ABI, which remains the default to match PyTorch upstream, and the more modern C++11 ABI.
Switching to the C++11 ABI wheels or Docker images can lead to noticeable improvements in the above-mentioned scenarios. For example, we observed a 20% relative improvement in goodput with the Mixtral 8x7B model on v5p-256 Cloud TPU (with a global batch size of 1024) when we switched from the pre-C++11 ABI to the C++11 ABI! ML Goodput gives us an understanding of how efficiently a given model utilizes the hardware. So if we have a higher goodput measurement for the same model on the same hardware, that indicates better performance of the model.
An example of using a C++11 ABI docker image in your Dockerfile might look something like:
# Use the C++11 ABI PyTorch/XLA image as the base
FROM us-central1-docker.pkg.dev/tpu-pytorch-releases/docker/xla:r2.6.0_3.10_tpuvm_cxx11

# Install any additional dependencies here
# RUN pip install my-other-package

# Copy your code into the container
COPY . /app
WORKDIR /app

# Run your training script
CMD ["python", "train.py"]
Alternatively, if you are not using Docker images, because you’re testing locally for instance, you can install the C++11 ABI wheels for version 2.6 using the following command (Python 3.10 example):
The above command works for Python 3.10. We have instructions for other versions within our documentation.
The flexibility to choose between C++ ABIs lets you choose the optimal build for your specific workload and hardware, ultimately leading to better performance and efficiency in your PyTorch/XLA projects!
So, what are you waiting for? Go try out the latest version of PyTorch/XLA! For additional information, check out the latest release notes.
A note on GPU support
We aren’t offering a PyTorch/XLA:GPU wheel in the PyTorch/XLA 2.6 release. We understand this is important and plan to reinstate GPU support by the 2.7 release. PyTorch/XLA remains an open-source project and we welcome contributions from the community to help maintain and improve the project. To contribute, please start with the contributors guide.
The latest stable version where a PyTorch/XLA:GPU wheel is available is torch_xla 2.5.
Modern AI workloads require powerful accelerators and high-speed interconnects to run sophisticated model architectures on an ever-growing diverse range of model sizes and modalities. In addition to large-scale training, these complex models need the latest high-performance computing solutions for fine-tuning and inference.
Today, we’re excited to bring the highly-anticipated NVIDIA Blackwell GPUs to Google Cloud with the preview of A4 VMs, powered by NVIDIA HGX B200. The A4 VM features eight Blackwell GPUs interconnected by fifth-generation NVIDIA NVLink, and offers a significant performance boost over the previous generation A3 High VM. Each GPU delivers 2.25 times the peak compute and 2.25 times the HBM capacity, making A4 VMs a versatile option for training and fine-tuning for a wide range of model architectures, while the increased compute and HBM capacity makes it well-suited for low-latency serving.
The A4 VM integrates Google’s infrastructure innovations with Blackwell GPUs to bring the best cloud experience for Google Cloud customers, from scale and performance, to ease-of-use and cost optimization. Some of these innovations include:
Enhanced networking: A4 VMs are built on servers with our Titanium ML network adapter, optimized to deliver a secure, high-performance cloud experience for AI workloads, building on NVIDIA ConnectX-7 network interface cards (NICs). Combined with our datacenter-wide 4-way rail-aligned network, A4 VMs deliver non-blocking 3.2 Tbps of GPU-to-GPU traffic with RDMA over Converged Ethernet (RoCE). Customers can scale to tens of thousands of GPUs with our Jupiter network fabric with 13 Petabits/sec of bi-sectional bandwidth.
Google Kubernetes Engine: With support for up to 65,000 nodes per cluster, GKE is the most scalable and fully automated Kubernetes service for customers to implement a robust, production-ready AI platform. Out of the box, A4 VMs are natively integrated with GKE. Integrating with other Google Cloud services, GKE facilitates a robust environment for the data processing and distributed computing that underpin AI workloads.
Vertex AI: A4 VMs will be accessible through Vertex AI, our fully managed, unified AI development platform for building and using generative AI, which is powered by the AI Hypercomputer architecture under the hood.
Open software: In addition to PyTorch and CUDA, we work closely with NVIDIA to optimize JAX and XLA, enabling the overlap of collective communication and computation on GPUs. Additionally, we added optimized model configurations and example scripts for GPUs with XLA flags enabled.
Hypercompute Cluster: Our new highly scalable clustering system streamlines infrastructure and workload provisioning, and ongoing operations of AI supercomputers with tight GKE and Slurm integration.
Multiple consumption models: In addition to the On-demand, Committed use discount, and Spot consumption models, we reimagined cloud consumption for the unique needs of AI workloads with Dynamic Workload Scheduler, which offers two modes for different workloads: Flex Start mode for enhanced obtainability and better economics, and Calendar mode for predictable job start times and durations.
Hudson River Trading, a multi-asset-class quantitative trading firm, will leverage A4 VMs to train its next generation of capital market model research. The A4 VM, with its enhanced inter-GPU connectivity and high-bandwidth memory, is ideal for the demands of larger datasets and sophisticated algorithms, accelerating Hudson River Trading’s ability to react to the market.
“We’re excited to leverage A4, powered by NVIDIA’s Blackwell B200 GPUs. Running our workload on cutting edge AI Infrastructure is essential for enabling low-latency trading decisions and enhancing our models across markets. We’re looking forward to leveraging the innovations in Hypercompute Cluster to accelerate deployment of training our latest models that deliver quant-based algorithmic trading.” – Iain Dunning, Head of AI Lab, Hudson River Trading
“NVIDIA and Google Cloud have a long-standing partnership to bring our most advanced GPU-accelerated AI infrastructure to customers. The Blackwell architecture represents a giant step forward for the AI industry, so we’re excited that the B200 GPU is now available with the new A4 VM. We look forward to seeing how customers build on the new Google Cloud offering to accelerate their AI mission.” – Ian Buck, Vice-President and General Manager of Hyperscale and HPC, NVIDIA
Better together: A4 VMs and Hypercompute Cluster
Effectively scaling AI model training requires precise and scalable orchestration of infrastructure resources. These workloads often stretch across thousands of VMs, pushing the limits of compute, storage, and networking.
Hypercompute Cluster enables you to deploy and manage these large clusters of A4 VMs with compute, storage and networking as a single unit. This makes it easy to manage complexity while delivering exceptionally high performance and resilience for large distributed workloads. Hypercompute Cluster is engineered to:
Deliver high performance through co-location of A4 VMs densely packed to enable optimal workload placement
Optimize resource scheduling and workload performance with GKE and Slurm, packed with intelligent features like topology-aware scheduling
Increase reliability with built-in self-healing capabilities, proactive health checks, and automated recovery from failures
Enhance observability and monitoring for timely and customized insights
Automate provisioning, configuration, and scaling, integrated with GKE and Slurm
We’re excited to be the first hyperscaler to announce preview availability of an NVIDIA Blackwell B200-based offering. Together, A4 VMs and Hypercompute Cluster make it easier for organizations to create and deliver AI solutions across all industries. If you’re interested in learning more, please reach out to your Google Cloud representative.
We are thrilled to announce the collaboration between Google Cloud, AWS, and Azure on Kube Resource Orchestrator, or kro (pronounced “crow”). kro introduces a Kubernetes-native, cloud-agnostic way to define groupings of Kubernetes resources. With kro, you can group your applications and their dependencies as a single resource that can be easily consumed by end users.
Challenges of Kubernetes resource orchestration
Platform and DevOps teams want to define standards for how application teams deploy their workloads, and they want to use Kubernetes as the platform for creating and enforcing these standards. Each service needs to handle everything from resource creation to security configurations, monitoring setup, defining the end-user interface, and more. There are client-side templating tools that can help with this (e.g., Helm, Kustomize), but Kubernetes lacks a native way for platform teams to create custom groupings of resources for consumption by end users.
Before kro, platform teams needed to invest in custom solutions such as building custom Kubernetes controllers, or using packaging tools like Helm, which can’t leverage the benefits of Kubernetes CRDs. These approaches are costly to build, maintain, and troubleshoot, and complex for non-Kubernetes experts to consume. This is a problem many Kubernetes users face. Rather than developing vendor-specific solutions, we’ve partnered with Amazon and Microsoft on making K8s APIs simpler for all Kubernetes users.
How kro simplifies the developer experience
kro is a Kubernetes-native framework that lets you create reusable APIs to deploy multiple resources as a single unit. You can use it to encapsulate a Kubernetes deployment and its dependencies into a single API that your application teams can use, even if they aren’t familiar with Kubernetes. You can use kro to create custom end-user interfaces that expose only the parameters an end user should see, hiding the complexity of Kubernetes and cloud-provider APIs.
kro does this by introducing the concept of a ResourceGraphDefinition, which specifies how a standard Kubernetes Custom Resource Definition (CRD) should be expanded into a set of Kubernetes resources. End users define a single resource, which kro then expands into the custom resources defined in the CRD.
kro can be used to group and manage any Kubernetes resources. Tools like ACK, KCC, or ASO define CRDs to manage cloud provider resources from Kubernetes (these tools enable cloud provider resources, like storage buckets, to be created and managed as Kubernetes resources). kro can also be used to group resources from these tools, along with any other Kubernetes resources, to define an entire application deployment and the cloud provider resources it depends on.
Example use cases
Below, you’ll find some examples of kro being used with Google Cloud. You can find additional examples on the kro website.
Example 1: GKE cluster definition
Imagine that a platform administrator wants to give end users in their organization self-service access to create GKE clusters. The platform administrator creates a kro ResourceGraphDefinition called GKEclusterRGD that defines the required Kubernetes resources and a CRD called GKEcluster that exposes only the options they want to be configurable by end users. In addition to creating a cluster, the platform team also wants clusters to deploy administrative workloads such as policies, agents, etc. The ResourceGraphDefinition defines the following resources, using KCC to provide the mappings from K8s CRDs to Google Cloud APIs:
GKE cluster, Container Node Pools, IAM ServiceAccount, IAM PolicyMember, Services, Policies
The platform administrator would then define the end-user interface so that users can create a new cluster by creating an instance of the GKEcluster CRD. Everything related to policy, service accounts, and service activation (and how these resources relate to each other) is hidden from the end user, simplifying their experience.
Example 2: Web application definition
In this example, a DevOps engineer wants to create a reusable definition of a web application and its dependencies. They create a ResourceGraphDefinition called WebAppRGD, which defines a new Kubernetes CRD called WebApp. This new resource encapsulates all the necessary resources for a web application environment, including:
Deployments, services, service accounts, monitoring agents, and cloud resources like object storage buckets.
The WebAppRGD ResourceGraphDefinition can set a default configuration, and also define which parameters can be set by the end user at deployment time (kro gives you the flexibility to decide what is immutable, and what an end user is able to configure). A developer then creates an instance of the WebApp CRD, inputting any user-facing parameters. kro then deploys the desired Kubernetes resource.
Key benefits of kro
We believe kro is a big step forward for platform engineering teams, delivering a number of advantages:
Kubernetes-native: kro leverages Kubernetes Custom Resource Definitions (CRDs) to extend Kubernetes, so it works with any Kubernetes resource and integrates with existing Kubernetes tools and workflows.
Lets you create a simplified end-user experience: kro makes it easy to define end-user interfaces for complex groups of Kubernetes resources, so that people who are not Kubernetes experts can consume services built on Kubernetes.
Enables standardized services for application teams: kro templates can be reused across different projects and environments, promoting consistency and reducing duplication of effort.
Get started with kro
kro is available as an open-source project on GitHub. The GitHub organization is currently jointly owned by teams from Google, AWS, and Microsoft, and we welcome contributions from the community. We also have a website with documentation on installing and using kro, including example use cases. As an early-stage project, kro is not yet ready for production use, but we still encourage you to test it out in your own Kubernetes development environments!
Welcome to the second Cloud CISO Perspectives for January 2025. Iain Mulholland, senior director, Security Engineering, shares insights on the state of ransomware in the cloud from our new Threat Horizons Report. The research and intelligence in the report should prove helpful to all cloud providers and security professionals. Similarly, the recommended risk mitigations will work well with Google Cloud, but are generally applicable to all clouds.
As with all Cloud CISO Perspectives, the contents of this newsletter are posted to the Google Cloud blog. If you’re reading this on the website and you’d like to receive the email version, you can subscribe here.
–Phil Venables, VP, TI Security & CISO, Google Cloud
How cloud security can adapt to ransomware threats in 2025
By Iain Mulholland, senior director, Security Engineering, Google Cloud
How should cloud providers and cloud customers respond to the threat of ransomware? Cloud security strategies in 2025 should prioritize protecting against data exfiltration and identity access abuse, as we explain in our new Threat Horizons Report.
Research and intelligence in the report shows that threat actors have made stealing data and exploiting weaknesses in identity security top targets. We’ve seen recent adaptations from some threat actor groups, where they’ve started using new ransomware families to achieve their goals. We’ve also observed them rapidly adapt their tactics to evade detection and attribution, making it harder to accurately identify the source of attacks — and increasing the likelihood that victims will pay ransom demands.
As part of our shared fate approach, where we are active partners with our customers in helping them secure their cloud use by sharing our expertise, best practices, and detailed guidance, this edition of Threat Horizons provides all cloud security professionals with a deeper understanding of the threats they face, coupled with actionable risk mitigations from Google’s security and threat intelligence experts.
These mitigations will work well with Google Cloud, but are generally applicable for other clouds, too.
Evolving ransomware and data-theft risks in the cloud
Ransomware and data threats in the cloud are not new, and investigation and analysis of the threats and risks they pose have been a key part of previous Threat Horizons Reports. Notably, Google Cloud security and intelligence experts exposed ransomware-related trends in the Threat Horizons Report published in February 2024, which included threat actors prioritizing data exfiltration over encryption and exploiting server-side vulnerabilities.
We observed in the second half of 2024 a concerning shift that threat actors were becoming more adept at obscuring their identities. This latest evolution in their tactics, techniques, and procedures makes it harder for defenders to counter their attacks and increases the likelihood of ransom payments — which totalled $1.1 billion in 2023. We also saw threat actors adapt by relying more on ransomware-as-a-service (RaaS) to target cloud services, which we detail in the full report.
We recommend that organizations incorporate automation and awareness strategies such as strong password policies, mandatory multi-factor authentication (MFA), regular reviews of user access and cloud storage bucket security, leaked credential monitoring on the dark web, and account lockout mechanisms. Importantly, educate employees about security best practices to help prevent credential compromise.
Government insights can help here, too. Guidance from the Cybersecurity and Infrastructure Security Agency’s Ransomware Vulnerability Warning Pilot can proactively identify and warn about vulnerabilities that could be exploited by ransomware actors.
I've summarized risk mitigations to enhance your Google Cloud security posture and better protect against threats, including account takeover, which could lead to ransomware and data extortion operations.
To help prevent cloud account takeover, your organization can:
Enroll in MFA: Google Cloud’s phased approach to mandatory MFA can make it harder for attackers to compromise accounts even if they have stolen credentials and authentication cookies.
Implement robust Identity and Access Management (IAM) policies: Use IAM policies to grant users only the necessary permissions for their jobs. Google Cloud offers a range of tools to help implement and manage IAM policies, including Policy Analyzer.
To help mitigate ransomware and extortion risks, your organization can:
Establish a cloud-specific backup strategy: Disaster recovery testing should include configurations, templates, and full infrastructure redeployment, and backups should be immutable for maximum protection.
Enable proactive virtual machine scanning: Part of Security Command Center (SCC), Virtual Machine Threat Detection (VMTD) scans virtual machines for malicious applications to detect threats, including ransomware.
Monitor and control unexpected costs: With Google Cloud, you can identify and manage unusual spending patterns across all projects linked to a billing account, which could indicate unauthorized activity.
Organizations can use multiple Google Cloud products to enhance protection against ransomware and data theft extortion. Security Command Center can help establish a multicloud security foundation for your organization that can help detect data exfiltration and misconfigurations. Sensitive Data Protection can help protect against potential data exfiltration by identifying and classifying sensitive data in your Google Cloud environment, and also help you monitor for unauthorized access and movement of data.
Threats beyond ransomware
There’s much more to the cloud threat landscape than ransomware, and also more that organizations can do to mitigate the risks they face. As above, I’ve summarized here five more threat landscape trends that we identify in the report, and our suggested mitigations on how you can improve your organization’s security posture.
Service account risks, including over-privileged service accounts exploited with lateral movement tactics.
What you should do: Investigate and protect service accounts to help prevent exploitation of overprivileged accounts and reduce detection noise from false positives.
Identity exploitation, including compromised user identities in hybrid environments exploited with lateral movement between on-premises and cloud environments.
What you should do: Combine strong authentication with attribute-based validation, and modernize playbooks and processes for comprehensive identity incident response (including enforcing mandatory MFA).
Attacks against cloud databases, including active vulnerability exploits and exploiting weak credentials that guard sensitive information.
Diversified attack methods, including privilege escalation that allows threat actors to charge against victim billing accounts in an effort to maximize their profits from compromised accounts.
What you should do: As discussed above, enroll in MFA, use automated sensitive data monitoring and alerting, and implement robust IAM policies.
Data theft and extortion attacks, including MFA bypass techniques and aggressive communication strategies with victims, use increasingly sophisticated tactics against cloud-based services to compromise accounts and maximize profits.
What you should do: Use a defense-in-depth strategy that includes strong password policies, MFA, regular reviews of user access, leaked credential monitoring, account lockout mechanisms, and employee education. Robust tools such as SCC can help monitor for data exfiltration and unauthorized access of data.
We provide more detail on each of these in the full report.
How Threat Horizons Reports can help
The Threat Horizons report series is intended to present a snapshot of the current state of threats to cloud environments, and how we can work together to mitigate those risks and improve cloud security for all. The reports provide decision-makers with strategic threat intelligence that cloud providers, customers, cloud security leaders, and practitioners can use today.
Threat Horizons reports are informed by Google Threat Intelligence Group (GTIG), Mandiant, Google Cloud's Office of the CISO, Product Security Engineering, and Google Cloud intelligence, security, and product teams.
The Threat Horizons Report for the first half of 2025 can be read in full here. Previous Threat Horizons reports are available here.
In case you missed it
Here are the latest updates, products, services, and resources from our security teams so far this month:
Get ready for a unique, immersive security experience at Next ‘25: Here’s why Google Cloud Next is shaping up to be a must-attend event for security experts and the security-curious alike. Read more.
How Google secures its own cloud use: Take a peek under the hood at how we use and secure our own cloud environments, as part of our new “How Google Does It” series. Read more.
Privacy-preserving Confidential Computing now on even more machines and services: Confidential Computing is available on even more machine types than before. Here’s what’s new. Read more.
Use custom Org Policies to enforce CIS benchmarks for GKE: Many CIS recommendations for GKE can be enforced with custom Organization Policies. Here’s how. Read more.
Making GKE more secure with supply-chain attestation and SLSA: You can now verify the integrity of Google Kubernetes Engine components with SLSA, the Supply-chain Levels for Software Artifacts framework. Read more.
Office of the CISO 2024 year in review: Google Cloud’s Office of the CISO shared insights in 2024 on how to approach generative AI securely, featured industry experts on the Cloud Security Podcast, published research papers, and examined security lessons learned across many sectors. Read more.
Celebrating one year of AI bug bounties at Alphabet: What we learned in the first year of AI bug bounties, and how those lessons will inform our approach to vulnerability rewards going forward. Read more.
Please visit the Google Cloud blog for more security stories published this month.
Threat Intelligence news
How to stop cryptocurrency heists: Many factors are spurring a spike in cryptocurrency heists, including the lucrative nature of their rewards and the challenges associated with attribution to malicious actors. In our new Securing Cryptocurrency Organizations guide, we detail the defense measures organizations should take to stop cryptocurrency heists. Read more.
Please visit the Google Cloud blog for more threat intelligence stories published this month.
Now hear this: Google Cloud Security and Mandiant podcasts
How the modern CISO balances risk, innovation, business strategy, and cloud: John Rogers, CISO, MSCI, talks about the biggest cloud security challenges CISOs are facing today — and they’re evolving — with host Anton Chuvakin and guest co-host Marina Kaganovich from Google Cloud’s Office of the CISO. Listen here.
Slaying the ransomware dragon: Can startups succeed where others have failed, and once and for all end ransomware? Bob Blakley, co-founder and chief product officer of ransomware defense startup Mimic, tells hosts Anton Chuvakin and Tim Peacock his personal reasons for joining the fight against ransomware, and how his company can help. Listen here.
Behind the Binary: How a gamer became a renowned hacker: Stephen Eckels, from Google Mandiant’s FLARE team, discusses how video game modding helped kickstart his cybersecurity career. Listen here.
To have our Cloud CISO Perspectives post delivered twice a month to your inbox, sign up for our newsletter. We’ll be back in February with more security-related updates from Google Cloud.
In today’s complex digital world, building truly intelligent applications requires more than just raw data — you need to understand the intricate relationships within that data. Graph analysis helps reveal these hidden connections, and when combined with techniques like full-text search and vector search, enables you to deliver a new class of AI-enabled application experiences. However, traditional approaches based on niche tools result in data silos, operational overhead, and scalability challenges. That’s why we introduced Spanner Graph, and today we’re excited to announce that it’s generally available.
In a previous post, we described how Spanner Graph reimagines graph data management with a unified database that integrates graph, relational, search, and gen AI capabilities with virtually unlimited scalability. With Spanner Graph, you gain access to an intuitive ISO Standard Graph Query Language (GQL) interface that simplifies pattern matching and relationship traversal. You also benefit from full interoperability between GQL and SQL, for tight integration between graph and tabular data. Powerful vector and full-text search enable fast data retrieval using semantic meaning and keywords. And you can rely on Spanner’s scalability, availability, and consistency to provide a solid data foundation. Finally, integration with Vertex AI gives you access to powerful AI models directly within Spanner Graph.
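To illustrate the interoperability between GQL and SQL in practice, here is a minimal sketch that embeds a graph pattern inside a regular SQL statement and runs it with the Spanner Python client. The instance and database names are placeholders, the FinGraph graph with Account nodes, Transfers edges, and an amount property is an assumed example schema (similar to the one used later in this post), and the GRAPH_TABLE form shown follows the SQL/PGQ-style operator that Spanner Graph exposes; consult the Spanner Graph documentation for the exact syntax:

# Sketch: GQL embedded in SQL, run with the standard Python client.
# "my-instance", "finance-db", and the amount property are illustrative assumptions.
from google.cloud import spanner

database = spanner.Client().instance("my-instance").database("finance-db")

interop_sql = """
SELECT src, SUM(amount) AS total_out
FROM GRAPH_TABLE(
  FinGraph
  MATCH (a:Account)-[t:Transfers]->(:Account)
  RETURN a.id AS src, t.amount AS amount
)
GROUP BY src
ORDER BY total_out DESC
LIMIT 10
"""

with database.snapshot() as snapshot:
    # Each row carries an account id and its total outgoing transfer amount.
    for src, total_out in snapshot.execute_sql(interop_sql):
        print(f"account {src} sent a total of {total_out}")

Because the graph pattern produces an ordinary table, the surrounding SQL can aggregate, join, and filter it like any other relational data.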
What’s new in Spanner Graph
Since the preview, we have added exciting new capabilities and partner integrations to make it easier for you to build with Spanner Graph. Let’s take a closer look.
1) Spanner Graph Notebook: Graph visualization is key to developing with graphs. The new open-source Spanner Graph Notebook tool provides an efficient way to query Spanner Graph visually. This tool is natively installed in Google Colab, meaning you can use it directly within that environment. You can also leverage it in notebook environments like Jupyter Notebook. With this tool, you can use magic commands with GQL to visualize query results and graph schemas with multiple layout options, inspect node and edge properties, and analyze neighbor relationships.
Open-source Spanner Graph Notebook.
2) GraphRAG with LangChain integration: Spanner Graph’s integration with LangChain allows for quick prototyping of GraphRAG applications. Conventional RAG, while capable of grounding the LLM by providing relevant context from your data using vector search, cannot leverage the implicit relationships present in your data. GraphRAG overcomes this limitation by constructing a graph from your data that captures these complex relationships. At retrieval time, GraphRAG uses the combined power of graph queries with vector search to provide a richer context to the LLM, enabling it to generate more accurate and relevant answers.
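To make the retrieval step concrete, here is a minimal GraphRAG-style sketch written directly against the google-cloud-spanner Python client rather than the LangChain integration itself: a vector search finds seed nodes, a graph query expands their neighborhood, and the serialized subgraph becomes the extra context passed to the LLM. The instance, database, table, column, and edge-property names are assumptions, and the sketch assumes GQL queries can be issued through the same execute_sql interface as SQL:

# GraphRAG retrieval sketch. Identifiers (my-instance, finance-db,
# Account.embedding, Transfers.amount) are illustrative assumptions.
from google.cloud import spanner
from google.cloud.spanner_v1 import param_types

database = spanner.Client().instance("my-instance").database("finance-db")

def retrieve_context(question_embedding):
    with database.snapshot(multi_use=True) as snapshot:
        # 1) Vector search: the five accounts whose embeddings are closest to the question.
        seeds = [row[0] for row in snapshot.execute_sql(
            "SELECT id FROM Account ORDER BY COSINE_DISTANCE(embedding, @q) LIMIT 5",
            params={"q": question_embedding},
            param_types={"q": param_types.Array(param_types.FLOAT64)},
        )]
        # 2) Graph expansion: pull each seed account's outgoing transfers from FinGraph.
        lines = []
        for seed in seeds:
            rows = snapshot.execute_sql(
                "GRAPH FinGraph "
                "MATCH (a:Account)-[t:Transfers]->(b:Account) "
                "WHERE a.id = @id "
                "RETURN a.id AS src, b.id AS dst, t.amount AS amount",
                params={"id": seed},
                param_types={"id": param_types.INT64},
            )
            lines += [f"account {src} transferred {amount} to account {dst}"
                      for src, dst, amount in rows]
    # 3) The serialized subgraph is appended to the LLM prompt as grounding context.
    return "\n".join(lines)

In the LangChain integration, graph construction and this retrieval flow are handled for you; the sketch above only shows the moving parts that make GraphRAG answers richer than vector search alone.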
3) Graph schema in Spanner Studio: The Spanner Studio Explorer panel now displays a list of defined graphs, their nodes and edges, and the associated labels and properties. You can explore and understand the structure of your graph data at a glance, making it easier to design, debug, and optimize your applications.
4) Graph query improvements: Spanner Graph now supports the path data type and functions, allowing you to retrieve and analyze the specific sequence of nodes and relationships that connect two nodes in your graph. For example, you can bind a path variable in a path pattern, use the IS_ACYCLIC function to check whether the path repeats any nodes, and return the entire path:
GRAPH FinGraph
MATCH p = (:Account)-[:Transfers]->{2,5}(:Account)
RETURN IS_ACYCLIC(p) AS is_acyclic_path, TO_JSON(p) AS full_path;
5) Graph visualization partner integrations: Spanner Graph is now integrated with leading graph visualization partners. For example, Spanner Graph customers can use GraphXR, Kineviz’s flagship product, which combines cutting-edge visualization technology with advanced analytics to help organizations make sense of complex, connected data.
“We are thrilled to partner with Google Cloud to bring graph analytics to big data. By integrating GraphXR with Spanner Graph, we’re empowering businesses to visualize and interact with their data in ways that were previously unimaginable.” – Weidong Yang, CEO, Kineviz
“Businesses can finally handle graph data with both speed and scale. By combining Graphistry’s GPU-accelerated graph visualization and AI with Spanner Graph’s global-scale querying, teams can now easily go all the way from raw data to graph-informed action. Whether detecting fraud, analyzing journeys, hunting hackers, or surfacing risks, this partnership is enabling teams to move with confidence.” – Leo Meyerovich, Founder and CEO, Graphistry
Visual analytics capabilities in Graphistry: zooming, clustering, filtering, histograms, time-bar filtering, and node styling (colors) enable point-and-click analysis to quickly understand the data, identify clusters, and surface patterns, anomalies, and other insights.
Furthermore, you can use G.V(), a quick-to-install graph database client, with Spanner Graph to perform day-to-day graph visualization and data analytics tasks with ease. Data professionals benefit from high-performance graph visualization, no-code data exploration, and highly customizable data visualization options.
“Graphs thrive on connections, which is why I’m so excited about this new partnership between G.V() and Google Cloud Spanner Graph. Spanner Graph turns big data into graphs, and G.V() effortlessly turns graphs into interactive data visualizations. I’m keen to see what data professionals build combining both solutions.” – Arthur Bigeard, Founder, gdotv Ltd.
Visually querying and exploring Spanner Graph with G.V().
What customers are saying
On the road to GA, we have also been working with multiple customers to help them innovate with Spanner Graph:
“The Commercial Financial Network manages commercial credit data for more than 30 million U.S. businesses. Managing the hierarchy of these businesses can be complex due to the volume of these hierarchies, as well as their dynamic nature driven by mergers and acquisitions. Equifax is committed to providing lenders with the accurate, reliable and timely information they need as they make financial decisions. Spanner Graph helps us manage these rapidly changing, dynamic business hierarchies easily at scale.” – Yuvaraj Sankaran, Chief Architect of Global Platforms, Equifax
“As we strive to enhance our fraud detection capabilities, having a robust, multi-model database like Google Spanner is crucial for our success. By integrating SQL for transactional data management with advanced graph data analysis, we can efficiently manage and analyze evaluated fraud data. Spanner’s new capabilities significantly improve our ability to maintain data integrity and uncover complex fraud patterns, ensuring our systems are secure and reliable.” – Hai Sadon, Data Platform Group Manager, Transmit Security
“Spanner Graph has provided a novel and performant way for us to query this data, allowing us to deliver features faster and with greater peace of mind. Its flexible data modeling and high-performance querying have made it far easier to leverage the vast amount of data we have in our online applications.” – Aaron Tang, Senior Principal Engineer, U-NEXT
Are you a cloud architect or IT admin tasked with ensuring deployments are following best practices and generating configuration validation reports? The struggle of adopting best practices is real. And not just the first time: ensuring that a config doesn’t drift from org-wide best practices over time is notoriously difficult.
Workload Manager provides a rule-based validation service for evaluating your workloads running on Google Cloud. Workload Manager scans your workloads, including SAP and Microsoft SQL Server, to detect deviations from standards, rules, and best practices to improve system quality, reliability, and performance.
Introducing custom rules in Workload Manager
Today, we’re excited to extend Workload Manager with custom rules, now generally available. Custom rules act as a detective control: your validations never block deployments, yet you can still easily detect compliance issues across different architectural intents. Now you can flexibly and consistently validate your Google Cloud deployments across projects, folders, and organizations against best practices and custom standards to help ensure that they remain compliant.
Here’s how to get started with Workload Manager custom rules in a matter of minutes.
1) Codify best practices and validate resources
Identify best practices relevant to your deployments from the Google Cloud Architecture Framework, codify them in Rego, a declarative policy language that’s used to define rules and express policies over complex data structures, and run or schedule evaluation scans across your deployments.
You can create new Rego rules based on your preferences, or reach out to your account team to get more help crafting new rules.
2) Export findings to a BigQuery dataset and visualize them using Looker
You can configure your own BigQuery dataset to export each validation scan and easily integrate it with your existing reporting systems, build a new Looker dashboard, or export results to Google Sheets to plan remediation steps (see the query sketch after this step).
Additionally, you can configure Pub/Sub-based notifications to send email, Google Chat messages, or integrate with your third-party systems based on different evaluation success criteria.
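Returning to the BigQuery export: once findings land in your configured dataset, you can pull the top violations into your own tooling with the BigQuery Python client. This is a minimal sketch; the project, dataset, table, and column names below are assumptions standing in for whatever your export is configured to use:

# Sketch: summarize Workload Manager custom-rule violations exported to BigQuery.
# Dataset, table, and column names are assumptions; substitute your configured export.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

query = """
SELECT rule_id, resource, severity, COUNT(*) AS violation_count
FROM `my-project.wlm_findings.evaluation_results`
WHERE state = 'VIOLATED'
GROUP BY rule_id, resource, severity
ORDER BY violation_count DESC
LIMIT 20
"""

for row in client.query(query).result():
    # Each row identifies a rule, the offending resource, and how often it was flagged.
    print(f"{row.rule_id} on {row.resource} ({row.severity}): {row.violation_count} violation(s)")

The same query can back a Looker dashboard or a scheduled report, so remediation work is driven from one shared view of the findings.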
A flexible system to do more than typical config validation
With custom rules you can build rules with complex logic and validation requirements across multiple domains. You can delegate build and management to your subject matter experts, reducing development time and accelerating the time to release new policies.
And with central BigQuery table export, you can combine violation findings from multiple evaluations and easily integrate with your reporting system to build a central compliance program.
Get started today with custom rules in Workload Manager by referring to the documentation and testing sample policies against your deployments.
Need more help? Engage with your account teams to get more help in crafting, curating and adopting best practices.
Rapid advancements in artificial intelligence (AI) are unlocking new possibilities for the way we work and accelerating innovation in science, technology, and beyond. In cybersecurity, AI is poised to transform digital defense, empowering defenders and enhancing our collective security. Large language models (LLMs) open new possibilities for defenders, from sifting through complex telemetry to secure coding, vulnerability discovery, and streamlining operations. However, some of these same AI capabilities are also available to attackers, leading to understandable anxieties about the potential for AI to be misused for malicious purposes.
Much of the current discourse around cyber threat actors’ misuse of AI is confined to theoretical research. While these studies demonstrate the potential for malicious exploitation of AI, they don’t necessarily reflect the reality of how AI is currently being used by threat actors in the wild. To bridge this gap, we are sharing a comprehensive analysis of how threat actors interacted with Google’s AI-powered assistant, Gemini. Our analysis was grounded by the expertise of Google’s Threat Intelligence Group (GTIG), which combines decades of experience tracking threat actors on the front lines and protecting Google, our users, and our customers from government-backed attackers, targeted 0-day exploits, coordinated information operations (IO), and serious cyber crime networks.
We believe the private sector, governments, educational institutions, and other stakeholders must work together to maximize AI’s benefits while also reducing the risks of abuse. At Google, we are committed to developing responsible AI guided by our principles, and we share resources and best practices to enable responsible AI development across the industry. We continuously improve our AI models to make them less susceptible to misuse, and we apply our intelligence to improve Google’s defenses and protect users from cyber threat activity. We also proactively disrupt malicious activity to protect our users and help make the internet safer. We share our findings with the security community to raise awareness and enable stronger protections for all.
Download the full report, “Adversarial Misuse of Generative AI”: https://services.google.com/fh/files/misc/adversarial-misuse-generative-ai.pdf
Executive Summary
Google Threat Intelligence Group (GTIG) is committed to tracking and protecting against cyber threat activity. We relentlessly defend Google, our users, and our customers by building the most complete threat picture to disrupt adversaries. As part of that effort, we investigate activity associated with threat actors to protect against malicious activity, including the misuse of generative AI or LLMs.
This report shares our findings on government-backed threat actor use of the Gemini web application. The report encompasses new findings across advanced persistent threat (APT) and coordinated information operations (IO) actors tracked by GTIG. By using a mix of analyst review and LLM-assisted analysis, we investigated prompts by APT and IO threat actors who attempted to misuse Gemini.
Advanced Persistent Threat (APT) refers to government-backed hacking activity, including cyber espionage and destructive computer network attacks.
Information Operations (IO) attempt to influence online audiences in a deceptive, coordinated manner. Examples include sockpuppet accounts and comment brigading.
GTIG takes a holistic, intelligence-driven approach to detecting and disrupting threat activity, and our understanding of government-backed threat actors and their campaigns provides the needed context to identify threat enabling activity. We use a wide variety of technical signals to track government-backed threat actors and their infrastructure, and we are able to correlate those signals with activity on our platforms to protect Google and our users. By tracking this activity, we’re able to leverage our insights to counter threats across Google platforms, including disrupting the activity of threat actors who have misused Gemini. We also actively share our insights with the public to raise awareness and enable stronger protections across the wider ecosystem.
Our analysis of government-backed threat actor use of Gemini focused on understanding how threat actors are using AI in their operations and if any of this activity represents novel or unique AI-enabled attack or abuse techniques. Our findings, which are consistent with those of our industry peers, reveal that while AI can be a useful tool for threat actors, it is not yet the game-changer it is sometimes portrayed to be. While we do see threat actors using generative AI to perform common tasks like troubleshooting, research, and content generation, we do not see indications of them developing novel capabilities.
Our key findings include:
We did not observe any original or persistent attempts by threat actors to use prompt attacks or other machine learning (ML)-focused threats as outlined in the Secure AI Framework (SAIF) risk taxonomy. Rather than engineering tailored prompts, threat actors used more basic measures or publicly available jailbreak prompts in unsuccessful attempts to bypass Gemini’s safety controls.
Threat actors are experimenting with Gemini to enable their operations, finding productivity gains but not yet developing novel capabilities. At present, they primarily use AI for research, troubleshooting code, and creating and localizing content.
APT actors used Gemini to support several phases of the attack lifecycle, including researching potential infrastructure and free hosting providers, reconnaissance on target organizations, research into vulnerabilities, payload development, and assistance with malicious scripting and evasion techniques. Iranian APT actors were the heaviest users of Gemini, using it for a wide range of purposes. Of note, we observed limited use of Gemini by Russian APT actors during the period of analysis.
IO actors used Gemini for research; content generation including developing personas and messaging; translation and localization; and to find ways to increase their reach. Again, Iranian IO actors were the heaviest users of Gemini, accounting for three quarters of all use by IO actors. We also observed Chinese and Russian IO actors using Gemini primarily for general research and content creation.
Gemini’s safety and security measures restricted content that would enhance adversary capabilities as observed in this dataset. Gemini provided assistance with common tasks like creating content, summarizing, explaining complex concepts, and even simple coding tasks. Assisting with more elaborate or explicitly malicious tasks generated safety responses from Gemini.
Threat actors attempted unsuccessfully to use Gemini to enable abuse of Google products, including researching techniques for Gmail phishing, stealing data, coding a Chrome infostealer, and bypassing Google’s account verification methods.
Rather than enabling disruptive change, generative AI allows threat actors to move faster and at higher volume. For skilled actors, generative AI tools provide a helpful framework, similar to the use of Metasploit or Cobalt Strike in cyber threat activity. For less skilled actors, they also provide a learning and productivity tool, enabling them to more quickly develop tools and incorporate existing techniques. However, current LLMs on their own are unlikely to enable breakthrough capabilities for threat actors. We note that the AI landscape is in constant flux, with new AI models and agentic systems emerging daily. As this evolution unfolds, GTIG anticipates that the threat landscape will evolve in stride as threat actors adopt new AI technologies in their operations.
AI-Focused Threats
Attackers can use LLMs in two ways. One way is attempting to leverage LLMs to accelerate their campaigns (e.g., by generating code for malware or content for phishing emails). The overwhelming majority of activity we observed falls into this category. The second way attackers can use LLMs is to instruct a model or AI agent to take a malicious action (e.g., finding sensitive user data and exfiltrating it). These risks are outlined in Google’s Secure AI Framework (SAIF) risk taxonomy.
We did not observe any original or persistent attempts by threat actors to use prompt attacks or other AI-specific threats. Rather than engineering tailored prompts, threat actors used more basic measures, such as rephrasing a prompt or sending the same prompt multiple times. These attempts were unsuccessful.
Jailbreak Attempts: Basic and Based on Publicly Available Prompts
We observed a handful of cases of low-effort experimentation using publicly available jailbreak prompts in unsuccessful attempts to bypass Gemini’s safety controls. Threat actors copied and pasted publicly available prompts and appended small variations in the final instruction (e.g., basic instructions to create ransomware or malware). Gemini responded with safety fallback responses and declined to follow the threat actor’s instructions.
In one example of a failed jailbreak attempt, an APT actor copied publicly available prompts into Gemini and appended basic instructions to perform coding tasks. These tasks included encoding text from a file and writing it to an executable and writing Python code for a distributed denial-of-service (DDoS) tool. In the former case, Gemini provided Python code to convert Base64 to hex, but provided a safety filtered response when the user entered a follow-up prompt that requested the same code as a VBScript.
The same group used a different publicly available jailbreak prompt to request Python code for DDoS. Gemini provided a safety filtered response stating that it could not assist, and the threat actor abandoned the session and did not attempt further interaction.
What is an AI jailbreak?
Jailbreaks are one type of Prompt Injection attack, causing an AI model to behave in ways that it has been trained to avoid (e.g., outputting unsafe content or leaking sensitive information). Prompt Injections generally cause the LLM to execute malicious “injected” instructions that arrive as part of data that was not meant to be executed by the LLM.
Controls against prompt injection include input/output validation and sanitization as well as adversarial training and testing. Training, tuning, and evaluation processes also help fortify models against prompt injection.
Example of a jailbreak prompt publicly available on GitHub
Some malicious actors attempted to prompt Gemini for guidance on abusing Google products, such as advanced phishing techniques for Gmail, assistance coding a Chrome infostealer, and methods to bypass Google’s account creation verification. These attempts were unsuccessful. Gemini did not produce malware or other content that could plausibly be used in a successful malicious campaign. Instead, the responses consisted of safety-guided content and generally helpful, neutral advice about coding and cybersecurity. In our continuous work to protect Google and our users, we have not seen threat actors either expand their capabilities or better succeed in their efforts to bypass Google’s defenses.
Government-backed attackers attempted to use Gemini for coding and scripting tasks, gathering information about potential targets, researching publicly known vulnerabilities, and enabling post-compromise activities, such as defense evasion in a target environment.
Iran: Iranian APT actors were the heaviest users of Gemini, using it for a wide range of purposes, including research on defense organizations, vulnerability research, and creating content for campaigns. APT42 focused on crafting phishing campaigns, conducting reconnaissance on defense experts and organizations, and generating content with cybersecurity themes.
China: Chinese APT actors used Gemini to conduct reconnaissance, for scripting and development, to troubleshoot code, and to research how to obtain deeper access to target networks. They focused on topics such as lateral movement, privilege escalation, data exfiltration, and detection evasion.
North Korea: North Korean APT actors used Gemini to support several phases of the attack lifecycle, including researching potential infrastructure and free hosting providers, reconnaissance on target organizations, payload development, and assistance with malicious scripting and evasion techniques. They also used Gemini to research topics of strategic interest to the North Korean government, such as the South Korean military and cryptocurrency. Of note, North Korean actors also used Gemini to draft cover letters and research jobs—activities that would likely support North Korea’s efforts to place clandestine IT workers at Western companies.
Russia: With Russian APT actors, we observed limited use of Gemini during the period of analysis. Their Gemini use focused on coding tasks, including converting publicly available malware into another coding language and adding encryption functions to existing code.
Google analyzed Gemini activity associated with known APT actors and identified APT groups from more than 20 countries that used Gemini. The highest volume of usage was from Iran and China. APT actors used Gemini to support several phases of the attack lifecycle, including researching potential infrastructure and free hosting providers, reconnaissance on target organizations, research into vulnerabilities, payload development, and assistance with malicious scripting and evasion techniques. The top use cases by APT actors focused on:
Assistance with coding tasks, including troubleshooting, tool and script development and converting or rewriting existing code
Vulnerability research focused on publicly reported vulnerabilities and specific CVEs
General research on various technologies, translations and technical explanations
Reconnaissance about likely targets, including details about specific organizations
Enabling post-compromise activity, such as seeking advice on techniques to evade detection, escalate privileges, or conduct internal reconnaissance in a target environment
We observed APT actors attempting to use Gemini to support all phases of the attack lifecycle.
The following summarizes, for each phase of the attack lifecycle, the topics of Gemini usage we observed.

Reconnaissance (attacker gathers information about the target)
Iran: recon on experts, international defense organizations, and government organizations; topics related to the Iran-Israel proxy conflict
North Korea: research on companies across multiple sectors and geos; recon on the US military and operations in South Korea; research on free hosting providers
China: research on the US military and US-based IT service providers; understanding a public database of US intelligence personnel; research on target network ranges; determining domain names of targets

Weaponization (attacker develops or acquires tools to exploit the target)
Develop webcam recording code in C++
Convert a Chrome infostealer function from Python to Node.js
Rewrite publicly available malware into another language
Add AES encryption functionality to provided code

Delivery (attacker delivers the weaponized exploit or payload to the target system)
Better understand advanced phishing techniques
Generate content for targeting a US defense organization
Generate content with cybersecurity and AI themes

Exploitation (attacker exploits a vulnerability to gain access)
Reverse engineer endpoint detection and response (EDR) server components for health check and authentication
Access Microsoft Exchange using a password hash
Research vulnerabilities in the WinRM protocol
Understand publicly reported vulnerabilities, including Internet of Things (IoT) bugs

Installation (attacker installs tools or malware to maintain access)
Sign an Outlook Visual Studio Tools for Office (VSTO) plug-in and deploy it silently to all computers
Add a self-signed certificate to Active Directory
Research Mimikatz for Windows 11
Research Chrome extensions that provide parental controls and monitoring

Command and control (C2) (attacker establishes a communication channel with the compromised system)
Generate code to remotely access the Windows Event Log
Active Directory management commands
JSON Web Token (JWT) security and routing rules in Ruby on Rails
Character encoding issues in smbclient
Command to check the IPs of admins on the domain controller

Actions on objectives (attacker achieves their intended goal, such as data theft or disruption)
Automate workflows with Selenium (e.g., logging into a compromised account)
Generate a PHP script to extract emails from Gmail into electronic mail (EML) files
Upload large files to OneDrive
Solutions to TLS 1.3 visibility challenges
Iranian Government-Backed Actors
Iranian government-backed actors accounted for the largest Gemini use linked to APT actors. Across Iranian government-backed actors, we observed a broad scope of research and use cases, including to enable reconnaissance on targets, for research into publicly reported vulnerabilities, to request translation and technical explanations, and to create content for possible use in future campaigns. Their use reflected strategic Iranian interests including research focused on defense organizations and experts, defense systems, foreign governments, individual dissidents, the Israel-Hamas conflict, and social issues in Iran.
At a Glance: Iranian APT Actors Using Gemini
Over 10 Iran-backed groups observed using Gemini
Google abuse-focused use cases:
Researching methods for extracting data from Android devices, including SMS messages, accounts, contacts, and social media accounts
Example use cases:
Coding and scripting
PowerShell and Linux commands
Python code for website scraping
Debugging and improving a Ghidra script
Developing PHP scripts to collect and store user IP addresses and browser information in a MySQL database
Assistance with C# programming
Modifying assembly code
Help understanding error messages
Vulnerability research
Research on specific CVEs and technologies, such as WinRM and IoT devices
Exploitation techniques and proof-of-concept code
Research on server-side request forgery (SSRF) exploitation techniques
Research on the open-source router exploitation tool RomBuster
Research about organizations
International defense organizations
Military and government organizations
Cybersecurity companies
International organizations that monitor development of advanced weapons
Research about warfare defenses
Information on the Iran-Israel proxy conflict
Unmanned aerial vehicles (UAV)
Anti-drone systems
Satellite technology
Remote sensing technology
Israel defense systems
Generating content
Generating content with cybersecurity and AI themes
Tailoring content to target a defense organization
Translating various texts into Farsi, Hebrew, and English
Crafting Phishing Campaigns
Over 30% of Iranian APT actors’ Gemini use was linked to APT42. APT42’s Gemini activity reflected the group’s focus on crafting successful phishing campaigns. We observed the group using Gemini to conduct reconnaissance into individual policy and defense experts, as well as organizations of interest for the group.
In addition to reconnaissance, APT42 used the text generation and editing capabilities of Gemini to craft material for phishing campaigns, including generating content with cybersecurity themes and tailoring the output to a US defense organization. APT42 also utilized Gemini for translation including localization, or tailoring content for a local audience. This includes content tailored to local culture and local language, such as asking for translations to be in fluent English.
Vulnerability Research
The majority of APT42’s efforts focused on research into publicly known vulnerabilities, such as a request to generate a list of critical vulnerabilities from 2023. They also focused on vulnerabilities in specific products such as Mikrotik, Apereo, and Atlassian.
Of note, APT42 appeared to be researching how to use generative AI tools for offensive purposes, asking Gemini for help preparing training content for a red team focused on how offensive teams can use AI tools in their operations.
Research Into Military and Weapons Systems
APT42 also appears to have used Gemini’s translation and explanation functions to better understand publicly available information on defense systems. Their efforts included general research into the Israel-Hamas conflict, as well as strategic trends in China’s defense industry. The threat actor also used Gemini for technical explanations about US-made aerospace systems.
Another Iranian APT group also focused on understanding warfare defenses including specific research into satellite signal jamming and anti-drone systems. Other Iranian APT actors researched specific defense systems, including researching information about specific unmanned aerial vehicle (UAV) models, jamming F-35 fighter jets, anti-drone systems, and Israel’s missile defense systems.
People’s Republic of China (PRC) Government-Backed Actors
Government-backed actors linked to the People’s Republic of China (PRC) attempted to use Gemini to enable reconnaissance on targets, for scripting and development, to request translation and explanation of technical concepts, and to enable deeper access to a network following initial compromise. PRC threat actors’ usage resembled an IT admin seeking to streamline, troubleshoot, or automate their tasks. In a malicious context, however, this activity could be used to enable lateral movement, privilege escalation, data exfiltration, and detection evasion.
At a Glance: People’s Republic of China APT Actors Using Gemini
Over 20 China-backed groups observed using Gemini
Notable use cases:
Reconnaissance
Research US military and US-based IT organizations
Gather US government network ranges
Understand publicly available information about US intelligence community personnel
Determine domain names of targets spanning eight countries, mostly government agencies
Access Microsoft Exchange using password hash
Vulnerability research
Reverse engineer Carbon Black EDR’s server components for health check and authentication
Scripting and development
Generate code to remotely access Windows Event Log
Active Directory management commands
Translation and explanation
Understand graph databases (Nebula Graph)
Solutions to TLS 1.3 visibility challenges
Understand a malicious PHP script
Web JWT security and routing rules in Ruby on Rails
Deeper system access and post-compromise actions
Sign an Outlook VSTO plug-in and deploy it silently to all computers
Add a self-signed certificate to Active Directory
Upload large files to OneDrive
Character encoding issues in smbclient
Command to check IPs of admins on the Domain Controller
Record passwords on the VMware vCenter
Impacket troubleshooting
Enabling Deeper Access in a Target Network
PRC-backed APT actors also used Gemini to work through scripting and development tasks, many of which appeared intended to enable deeper access in a target network after threat actors obtained initial access. For example, one PRC-backed group asked Gemini for assistance figuring out how to sign a plugin for Microsoft Outlook and silently deploy it to all computers. The same actor also asked Gemini to generate code to remotely access Windows Event Log; sought instructions on how to add a self-signed certificate to Active Directory; and asked Gemini for a command to identify the IP addresses of administrators on the domain controller. Other actors used Gemini for help troubleshooting Chinese character encoding issues in smbclient and how to record passwords on the VMware vCenter.
In another example, PRC-backed APT actors asked Gemini for assistance with Active Directory management commands and requested help troubleshooting impacket, a Python-based tool for working with network protocols. While impacket is commonly used for benign purposes, the context of the threat actor made it clear that the actor was using the tool for malicious purposes.
Explaining Tools, Concepts, and Code
PRC actors utilized Gemini to learn about specific tools and technologies and develop solutions to technical challenges. For example, a PRC APT actor used Gemini to break down how to use the graph database Nebula Graph. In another instance, the same actor used Gemini to offer possible solutions to TLS 1.3 visibility challenges. Another PRC-backed APT group sought to understand a malicious PHP script.
Vulnerability Research and Reverse Engineering
In one case, a PRC-backed APT actor attempted unsuccessfully to get Gemini’s help reverse engineering the endpoint detection and response (EDR) tool Carbon Black. The same threat actor copied disassembled Python bytecode into Gemini to convert the bytecode into Python code. It’s not clear what their objective was.
Unsuccessful Attempts to Elicit Internal System Information From Gemini
In one case, the PRC-backed APT actor APT41 attempted unsuccessfully to use Gemini to learn about Gemini’s underlying infrastructure and systems. The actor asked Gemini to share details such as its IP address, kernel version, and network configuration. Gemini responded but did not disclose sensitive information. In a helpful tone, the responses provided publicly available details that would be widely known about the topic, while also indicating that the requested information is kept secret to prevent unauthorized access.
North Korean Government-Backed Actors
North Korean APT actors used Gemini to support several phases of the attack lifecycle, including researching potential infrastructure and free hosting providers, reconnaissance on target organizations, payload development, and assistance with malicious scripting and evasion techniques. They also used Gemini to research topics of strategic interest to the North Korean government, such as South Korean nuclear technology and cryptocurrency. We also observed that North Korean actors were using LLMs in likely attempts to enable North Korea’s efforts to place clandestine IT workers at Western companies.
At a Glance: North Korean APT Actors Using Gemini
Nine North Korea-backed groups observed using Gemini
Google-focused use cases:
Research advanced techniques for phishing Gmail
Scripting to steal data from compromised Gmail accounts
Understanding a Chrome extension that provides parental controls (capable of taking screenshots, keylogging)
Convert Chrome infostealer function from Python to Node.js
Bypassing restrictions on Google Voice
Generate code snippets for a Chrome extension
Notable use cases:
Enabling clandestine IT worker scheme
Best Discord servers for freelancers
Exchange with overseas employees
Jobs on LinkedIn
Average salary
Drafting work proposals
Generate cover letters from job postings
Research on topics
Free hosting providers
Cryptocurrency
Operational technology (OT) and industrial networks
Nuclear technology and power plants in South Korea
Historic cyber events (e.g., major worms and DDoS attacks; Russia-Ukraine conflict) and cyber forces of foreign militaries
Research about organizations
Companies across 11 sectors and 13 countries
South Korean military
US military
German defense organizations
Malware development
Evasion techniques
Automating workflows for logging into compromised accounts
Understanding Mimikatz for Windows 11
Scripting and troubleshooting
Clandestine IT Worker Threat
North Korean APT actors used Gemini to draft cover letters and research jobs—activities that would likely support efforts by North Korean nationals to use fake identities and obtain freelance and full-time jobs at foreign companies while concealing their true identities and locations. One North Korea-backed group utilized Gemini to draft cover letters and proposals for job descriptions, researched average salaries for specific jobs, and asked about jobs on LinkedIn. The group also used Gemini for information about overseas employee exchanges. Many of the topics would be common for anyone researching and applying for jobs.
While normally employment-related research would be typical for any job seeker, we assess the usage is likely related to North Korea’s ongoing efforts to place clandestine workers in freelance gigs or full-time jobs at Western firms. The scheme, which involves thousands of North Korean workers and has affected hundreds of US-based companies, uses IT workers with false identities to complete freelance work and send wages back to the North Korean regime.
North Korea’s AI toolkit
Outside of their use of Gemini, North Korean cyber threat actors have shown a long-standing interest in AI tools. They likely use AI applications to augment malicious operations and improve efficiency and capabilities, and for producing content to support their campaigns, such as phishing lures and profile photos for fake personas. We assess with high confidence that North Korean cyber threat actors will continue to demonstrate an interest in these emerging technologies for the foreseeable future.
DPRK IT Workers
We have observed DPRK IT Workers leverage accounts on assistive writing tools, Monica (monica.im) and Ahrefs (ahrefs.com), which could potentially aid the group’s work despite a lack of language fluency. Additionally, the group has maintained accounts on Data Annotation Tech, a company hiring individuals to train AI models. Notably, a profile photo used by a suspected IT worker bore a noticeable resemblance to multiple different images on the internet, suggesting that a manipulation tool was used to generate the threat actor’s profile photo.
APT43
Google Threat Intelligence Group (GTIG) has detected evidence of APT43 actors accessing multiple publicly available LLM tools; however, the intended purpose is not clear. Based on the capabilities of these platforms and historical APT43 activities, it is possible these applications could be used in the creation of rapport-building emails, lure content, and malicious PowerShell and scripting efforts.
GTIG has detected APT43 actors reference publicly available AI chatbot tools alongside the topic “북핵 해결” (translation: “North Korean nuclear issue solution”), indicating the group is using AI applications to conduct technical research as well as open-source analysis on South Korean foreign and military affairs and nuclear issues.
GTIG has identified APT43 actors accessing multiple publicly available AI image generation tools, including tools used for image manipulation and creating realistic-looking human portraits.
Target Research and Reconnaissance
North Korean actors also engaged with Gemini with several questions that appeared focused on conducting initial research and reconnaissance into prospective targets. They also researched organizations and industries that are typical targets for North Korean actors, including the US and South Korean militaries and defense contractors. One North Korean APT group asked Gemini for information about companies and organizations across a variety of industry sectors and regions. Some of this Gemini usage related directly to organizations that the same group had attempted to target in phishing and malware campaigns that Google previously detected and disrupted.
In addition to research into companies, North Korean APT actors researched nuclear technology and power plants in South Korea, such as site locations, recent news articles, and the security status of the plants. Gemini responded with widely available, public information and facts that would be easily discoverable in an online search.
Help with Scripting, Payload Development, Defense Evasion
North Korean actors also tried to use Gemini to assist with development and scripting tasks. One North Korea-backed group attempted to use Gemini to help develop webcam recording code in C++. Gemini provided multiple versions of code, and the actor’s repeated attempts potentially suggest frustration with Gemini’s answers. The same group also asked Gemini to generate a robots.txt file to block crawlers and an .htaccess file to redirect all URLs except CSS extensions.
One North Korean APT actor used Gemini for assistance developing code for sandbox evasion. For example, the threat actor utilized Gemini to write code in C++ to detect VM environments and Hyper-V virtual machines. Gemini provided responses with short code snippets to perform simple sandbox checks. The same group also sought help troubleshooting Java errors when implementing AES encryption, and separately asked Gemini if it is possible to acquire a system password on Windows 11 using Mimikatz.
Russian Government-Backed Actors
During the period of analysis, we observed limited use of Gemini by Russia-backed APT actors. Of this limited use, the majority of usage appeared benign, rather than threat-enabling. The reasons for this low engagement are unclear. It is possible Russian actors avoided Gemini out of operational security considerations, staying off Western-controlled platforms to avoid monitoring of their activities. They may be using AI tools produced by Russian firms or locally hosting LLMs, which would ensure full control of their infrastructure. Alternatively, they may have favored other Western LLMs.
One Russian government-backed group used Gemini to request help with a handful of tasks, including help rewriting publicly available malware into another language, adding encryption functionality to code, and explanations for how a specific block of publicly available malicious code functions.
At a Glance: Russian APT Actors Using Gemini
Three Russia-backed groups observed using Gemini
Notable use cases:
Scripting
Help rewriting public malware into another language
Payload crafting
Add AES encryption functionality to provided code
Translation and explanation
Understand how some public malicious code works
Financially Motivated Actors Using LLMs
Threat actors in underground marketplaces are advertising ways to bypass security guardrails so that LLMs can be used for malware development, phishing, and other malicious tasks. The offerings include jailbroken LLMs that are ready-made for malicious use.
Throughout 2023 and 2024, Google Threat Intelligence Group (GTIG) observed underground forum posts related to LLMs, indicating there is a burgeoning market for nefarious versions of LLMs. Some advertisements boast customized and jailbroken LLMs that don’t have restrictions for malware development purposes, or they tout a lack of security measures typically found on legitimate services, allowing the user to prompt the LLM about any topic or task without incurring security guardrails or limits on their queries. Examples include FraudGPT, which has been advertised on Telegram as having no limitations, and WormGPT, a privacy-focused, “uncensored” LLM capable of developing malware.
Financially motivated actors are using LLMs to help augment business email compromise (BEC) operations. GTIG has noted evidence of financially motivated actors using manipulated video and voice content in BEC scams, and media reports indicate that such actors have used WormGPT to create more persuasive BEC messages.
Findings: Information Operations (IO) Actors Misusing Gemini
At a Glance: Information Operations Actors
IO actors attempted to use Gemini for research, content generation, translation and localization, and to find ways to increase their reach.
Iran: Iranian IO actors used Gemini for a wide range of tasks, accounting for three quarters of all IO prompts. They used Gemini for content creation and manipulation, including generating articles, rewriting text with a specific tone, and optimizing it for better reach. Their activity also focused on translation and localization, adapting content for different audiences, and for general research into news, current events, and political issues.
China: Pro-China IO actors used Gemini primarily for general research on various topics, including a variety of topics of strategic interest to the Chinese government. The most prolific IO actor we track, DRAGONBRIDGE, was responsible for the majority of this activity. They also used Gemini to research current events and politics, and in a few cases, they used Gemini to generate articles or content on specific topics.
Russia: Russian IO actors used Gemini primarily for general research, content creation, and translation. For example, their use involved assistance drafting content, rewriting article titles, and planning social media campaigns. Some activity demonstrated an interest in developing AI capabilities, asking for information on tools for creating online AI chatbots, developer tools for interacting with LLMs, and options for textual content analysis.
IO actors used Gemini for research, content generation (including developing personas and messaging), translation and localization, and to find ways to increase their reach. Common use cases include general research into news and current events as well as specific research into individuals and organizations. In addition to creating content for campaigns, including personas and content, the actors researched ways to increase the efficacy of campaigns, such as automating distribution, using search engine optimization (SEO) to optimize reach, and increasing operational security. As with government-backed groups, IO actors also used Gemini for translation and localization and for understanding the meanings or context of content.
Iran-Linked Information Operations Actors
Iran-based information operations (IO) groups used Gemini for a wide range of tasks, including general research, translation and localization, content creation and manipulation, and generating content with a specific bias or tone. We also observed Iran-based IO actors engage with Gemini about news events and ask Gemini to provide details on economic and political issues in Iran, the US, the Middle East, and Europe.
In line with their practice of mixing original and borrowed content, Iranian IO actors translated existing material, including news-like articles. They then used Gemini to explain the context and meaning of particular phrases within the given text.
Iran-based IO actors also used Gemini to localize the content, seeking human-like translation and asking Gemini for help with tasks like making the text sound like a native English speaker. They used Gemini to manipulate text (e.g., asking for help rewriting existing text on immigration and crime in a specific style or tone).
Iran’s activity also included research about improving the reach of their campaigns. For example, they attempted to generate SEO-optimized content, likely in an effort to reach a larger audience. Some actors also used Gemini to suggest strategies for increasing engagement on social media.
At a Glance: Iran-Linked IO Actors Using Gemini
Eight Iran-linked IO groups observed using Gemini
Example use cases:
Content creation – text
Generate article titles
Generate SEO-optimized content and titles
Draft a report critical of Bahrain
Draft titles and hashtags in English and Farsi for videos that are catchy or create urgency to watch the content
Draft titles and descriptions promoting Islam
Translation – content in / out of native language
Translate provided texts about a variety of topics into Farsi, including the Iranian election, human rights, international law, and Islam
Translate Farsi-language idioms and proverbs to other languages
Translate news about the US economy, US government, and politics into Farsi, using a specified tone
Draft a French-language headline to get viewers to engage with specific content
Content manipulation – copy editing to refine content
Reformulate specific text about Sharia law
Paraphrase content describing specific improvements to Iran’s export economy
Rewrite a provided text about diplomacy and economic challenges with countries like China and Germany
Provide synonyms for specific words or phrases
Rewrite provided text about Islam and Iraq in different styles or tones
Proofread provided content
Content creation – biased text
Generate or reformulate text to criticize a government minister and other individuals for failures or other actions
Describe how a popular American TV show perpetuates harmful stereotypes
Generate Islam-themed titles for thumbnail previews on social media
General research – news and events
Provide an overview of current events in specific regions
Research about the Iran-Iraq war
Define specific terms
Suggest social media channels for information about Islam and the Quran
Provide information on countries’ policies toward the Middle East
Create persona – photo generation
Create a logo
PRC-Linked Information Operations Actors
IO actors linked to the People’s Republic of China (PRC) used Gemini primarily for general research on a wide variety of topics. The most prolific IO actor we track, the pro-China group DRAGONBRIDGE, was responsible for approximately three quarters of this activity, most of it general research ranging from details about the features of various social media platforms to topics of strategic interest to the PRC government. Actors researched information on current events and politics in other regions, with a focus on the US and Taiwan. They also showed interest in assessing the impact and risk of certain events. In a handful of cases, DRAGONBRIDGE used Gemini to generate articles or content on specific topics.
At a Glance: PRC-Linked IO Actors Using Gemini
Three PRC-linked IO groups observed using Gemini
Example use cases:
General research – political and social topics
Research about specific countries, organizations, and individuals
Research relations between specific countries and China
Research on topics sensitive to the Chinese government (e.g., five poisons)
Research on Taiwanese politicians and their actions toward China
Research on US politics and political figures and their attitudes on China
Research foreign press coverage about China
Summarize key takeaways from a video
General research – technology
Compare functionality and features of different social media platforms
Explain technical concepts and suggestions for useful tools
Translation – content in / out of native language
Translate and summarize text between Chinese and other languages
Content creation – text
Draft articles on topics such as the use of AI and social movements in specific regions
Generate a summary of a movie trailer about a Chinese dissident
Create persona – text generation
Generate a company profile for a media company
DRAGONBRIDGE has experimented with other generative AI tools to create synthetic content in support of their IO campaigns. As early as 2022, the group used a commercial AI service in videos on YouTube to depict AI-generated news presenters. Their use of AI-generated videos and images continued through 2024, but it has not resulted in significantly higher engagement from real viewers. Google detected and terminated the channels distributing this content immediately upon discovery.
Russia-Linked Information Operations Actors
Russian IO actors used Gemini for general research, content creation, and translation. Half of this activity was associated with the Russian IO actor we track as KRYMSKYBRIDGE, which is linked to a Russian consulting firm that works with the Russian government. Approximately 40% of activity was linked to actors associated with Russian state sponsored entities formerly controlled by the late Russian oligarch Yevgeny Prigozhin. We also observed usage by actors tracked publicly as Doppelganger.
The majority of Russian IO actor usage was related to general research tasks, ranging from the Russia-Ukraine war to details about various tools and online services. Russian IO actors also used Gemini for content creation, rewriting article titles and planning social media campaigns. Translation to and from Russian was also a common task.
Russian IO actors focused on the generative AI landscape, which may indicate an interest in developing native capabilities in AI on infrastructure they control. They researched tools that can be used to create an online AI chatbot and developer tools for interacting with LLMs. One Russian IO actor used Gemini to suggest options for textual content analysis.
Pro-Russia IO actors have used AI in their influence campaigns in the past. In 2024, the actor known as CopyCop likely used LLMs to generate content, and some stories on their sites included metadata indicating an LLM was prompted to rewrite articles from genuine news sources with a particular political perspective or tone. CopyCop’s inauthentic news sites pose as US- and Europe-based news outlets and post Kremlin-aligned views on Western policy, the war in Ukraine, and domestic politics in the US and Europe.
At a Glance: Russia-Linked IO Actors Using Gemini
Four Russia-linked IO groups observed using Gemini
Example use cases:
General research
Research into the Russia-Ukraine war
Explain subscription plans and API details for online services
Research on different generative AI platforms, software, and systems for interacting with LLMs
Research on tools and methods for creating an online chatbot
Research tools for content analysis
Translation – content in / out of native language
Translate technical and business terminology into Russian
Translate text to/from Russian
Content creation – text
Draft a proposal for a social media agency
Rewrite article titles to garner more attention
Plan and strategize campaigns
Develop content strategy for different social media platforms and regions
Brainstorm ideas for a PR campaign and accompanying visual designs
Building AI Safely and Responsibly
We believe our approach to AI must be both bold and responsible. To us, that means developing AI in a way that maximizes the positive benefits to society while addressing the challenges. Guided by our AI Principles, Google designs AI systems with robust security measures and strong safety guardrails, and we continuously test the security and safety of our models to improve them. Our policy guidelines and prohibited use policies prioritize safety and responsible use of Google’s generative AI tools. Google’s policy development process includes identifying emerging trends, thinking end-to-end, and designing for safety. We continuously enhance safeguards in our products to offer scaled protections to users across the globe.
At Google, we leverage threat intelligence to disrupt adversary operations. We investigate abuse of our products, services, users and platforms, including malicious cyber activities by government-backed threat actors, and work with law enforcement when appropriate. Moreover, our learnings from countering malicious activities are fed back into our product development to improve safety and security for our AI models. Google DeepMind also develops threat models for generative AI to identify potential vulnerabilities, and creates new evaluation and training techniques to address the misuse they enable. In conjunction with this research, DeepMind has shared how they’re actively deploying defenses within AI systems along with measurement and monitoring tools, one of which is a robust evaluation framework used to automatically red team an AI system’s vulnerability to indirect prompt injection attacks. Our AI development and Trust & Safety teams also work closely with our threat intelligence, security, and modeling teams to stem misuse.
The potential of AI, especially generative AI, is immense. As innovation moves forward, the industry needs security standards for building and deploying AI responsibly. That’s why we introduced the Secure AI Framework (SAIF), a conceptual framework to secure AI systems. We’ve shared a comprehensive toolkit for developers with resources and guidance for designing, building, and evaluating AI models responsibly. We’ve also shared best practices for implementing safeguards, evaluating model safety, and red teaming to test and secure AI systems.
About the Authors
Google Threat Intelligence Group brings together the Mandiant Intelligence and Threat Analysis Group (TAG) teams, and focuses on identifying, analyzing, mitigating, and eliminating entire classes of cyber threats against Alphabet, our users, and our customers. Our work includes countering threats from government-backed attackers, targeted 0-day exploits, coordinated information operations (IO), and serious cyber crime networks. We apply our intelligence to improve Google’s defenses and protect our users and customers.
Editor’s note: Today’s post is by Travis Naraine, IT Infrastructure Engineer, and Harel Shaked, Director of IT Services and Support, both for Outbrain, a leading technology platform that drives business results by engaging people across the open internet. Outbrain adopted Chrome Enterprise and integrations from Spin.AI to create policies for secure app and extension use and manage automatic updates for its dispersed workforce.
With a workforce as dispersed as ours, security is always a challenge. We standardized on Chrome Enterprise browser two years ago, and it’s become the linchpin of our cloud-first strategy, giving us a way to manage all of our users and stay secure. But we had concerns about browser extensions and we felt it was time to find a solution.
The value of extension management
We know people like to use browser extensions to improve their productivity and to access the tools and features they need to do their jobs. We also know there are malicious extensions available online. But vetting, testing, and blocking extensions manually was time-consuming and not 100% effective because it didn’t give us visibility into which extensions and apps were already in our environment.
Our process was reactive instead of proactive, raising concerns over missed opportunities to detect and block risky extensions. We needed a more automated way to enable employees to safely install Chrome Enterprise extensions.
Tools for extension risk assessment
As we explored solutions for another security project, we came across Spin.AI’s SpinOne platform, which includes the SaaS Security Posture Management (SSPM) solution for third-party application security. SSPM had several points in its favor, including features for continuous app assessment for browser extensions and the ability to easily integrate with Chrome Enterprise. The SpinOne platform met several of our SaaS security needs, and we like to stay with one vendor whenever possible.
Now we use Chrome Enterprise extension risk assessment, powered by Spin.AI, to generate risk scores and comprehensive risk assessment reports that assist in decisions about allowing or blocking extensions. In addition, with Chrome Enterprise Core’s extension workflow, Outbrain employees can easily submit extension requests for IT and security teams to review and allow or deny use of the extensions.
The automated process through Chrome Enterprise saves significant time compared with manual reviews. The new policies and the combined Chrome Enterprise and Spin.AI solution have created an environment that nudges users to think more carefully about anything they install, whether extensions or other apps.
Using extensions securely and safely
Chrome Enterprise makes management and control easy, enforcing policies for the browser and extensions with less complexity. We even develop our own in-house extensions for Chrome Enterprise for tasks like inspecting widgets within the company.
In addition to setting browser policies through the Google Admin console, we can manage automatic updates to ensure our employees are using the newest version of Chrome with the latest security patches, further reducing our exposure to vulnerabilities.
We definitely have fewer worries about browser security today. We know that Spin.AI and Chrome Enterprise are doing their job in the background, so we’re not constantly concerned that a user is installing something malicious. We can set it and forget it.
Since 2022, Google Threat Intelligence Group (GTIG) has been tracking multiple cyber espionage operations conducted by China-nexus actors utilizing POISONPLUG.SHADOW. These operations employ a custom obfuscating compiler that we refer to as “ScatterBrain,” facilitating attacks against various entities across Europe and the Asia Pacific (APAC) region. ScatterBrain appears to be a substantial evolution of ScatterBee, an obfuscating compiler previously analyzed by PWC.
GTIG assesses that POISONPLUG is an advanced modular backdoor used by multiple distinct, but likely related, threat groups based in the PRC; however, we assess that POISONPLUG.SHADOW usage is further restricted to clusters associated with APT41.
GTIG currently tracks three known POISONPLUG variants:
POISONPLUG
POISONPLUG.DEED
POISONPLUG.SHADOW
POISONPLUG.SHADOW—often referred to as “Shadowpad,” a malware family name first introduced by Kaspersky—stands out due to its use of a custom obfuscating compiler specifically designed to evade detection and analysis. Its complexity is compounded not only by the extensive obfuscation mechanisms employed but also by the attackers’ highly sophisticated threat tactics. These elements collectively make analysis exceptionally challenging and complicate efforts to identify, understand, and mitigate the threats it poses.
In addressing these challenges, GTIG collaborates closely with the FLARE team to dissect and analyze POISONPLUG.SHADOW. This partnership utilizes state-of-the-art reverse engineering techniques and comprehensive threat intelligence capabilities required to mitigate the sophisticated threats posed by this threat actor. We remain dedicated to advancing methodologies and fostering innovation to adapt to and counteract the ever-evolving tactics of threat actors, ensuring the security of Google and our customers against sophisticated cyber espionage operations.
Overview
In this blog post, we present our in-depth analysis of the ScatterBrain obfuscator, which has led to the development of a complete stand-alone static deobfuscator library independent of any binary analysis frameworks. Our analysis is based solely on the obfuscated samples we have successfully recovered, as we do not possess the obfuscating compiler itself. Despite this limitation, we have been able to comprehensively infer every aspect of the obfuscator and the necessary requirements to break it. Our analysis further reveals that ScatterBrain is continuously evolving, with incremental changes identified over time, highlighting its ongoing development.
This publication begins by exploring the fundamental primitives of ScatterBrain, outlining all its components and the challenges they present for analysis. We then detail the steps required to subvert and remove each protection mechanism, culminating in our deobfuscator. Our library takes protected binaries generated by ScatterBrain as input and produces fully functional deobfuscated binaries as output.
By detailing the inner workings of ScatterBrain and sharing our deobfuscator, we hope to provide valuable insights into developing effective countermeasures. Our blog post is intentionally exhaustive, drawing from our experience in dealing with obfuscation for clients, where we observed a significant lack of clarity in understanding modern obfuscation techniques. Similarly, analysts often struggle with understanding even relatively simplistic obfuscation methods primarily because standard binary analysis tooling is not designed to account for them. Therefore, our goal is to alleviate this burden and help enhance the collective understanding against commonly seen protection mechanisms.
For general questions about obfuscating compilers, we refer to our previous work on the topic, which provides an introduction and overview.
ScatterBrain Obfuscator
Introduction
ScatterBrain is a sophisticated obfuscating compiler that integrates multiple operational modes and protection components to significantly complicate the analysis of the binaries it generates. Designed to render modern binary analysis frameworks and defender tools ineffective, ScatterBrain disrupts both static and dynamic analyses.
Protection Modes: ScatterBrain operates in three distinct modes, each determining the overall structure and intensity of the applied protections. These modes allow the compiler to adapt its obfuscation strategies based on the specific requirements of the attack.
Protection Components: The compiler employs key protection components that include the following:
Selective or Full Control Flow Graph (CFG) Obfuscation: This technique restructures the program’s control flow, making it very difficult to analyze and create detection rules for.
Instruction Mutations: ScatterBrain alters instructions to obscure their true functionality without changing the program’s behavior.
Complete Import Protection: ScatterBrain employs a complete protection of a binary’s import table, making it extremely difficult to understand how the binary interacts with the underlying operating system.
These protection mechanisms collectively make it extremely challenging for analysts to deconstruct and understand the functionality of the obfuscated binaries. As a result, ScatterBrain poses a formidable obstacle for cybersecurity professionals attempting to dissect and mitigate the threats it generates.
Modes of Operation
A mode refers to how ScatterBrain will transform a given binary into its obfuscated representation. It is distinct from the actual core obfuscation mechanisms themselves and is more about the overall strategy of applying protections. Our analysis further revealed a consistent pattern in applying various protection modes at specific stages of an attack chain:
Selective: A group of individually selected functions are protected, leaving the remainder of the binary in its original state. Any import references within the selected functions are also obfuscated. This mode was observed to be used strictly for dropper samples of an attack chain.
Complete: The entirety of the code section and all imports are protected. This mode was applied solely to the plugins embedded within the main backdoor payload.
Complete “headerless”: This is an extension of the Complete mode with added data protections and the removal of the PE header. This mode was exclusively reserved for the final backdoor payload.
Selective
The selective mode of protection allows users of the obfuscator to selectively target individual functions within the binary for protection. Protecting an individual function involves keeping the function at its original starting address (produced by the original compiler and linker) and substituting the first instruction with a jump to the obfuscated code. The generated obfuscations are stored linearly from this starting point up to a designated “end marker” that signifies the ending boundary of the applied protection. This entire range constitutes a protected function.
The disassembly of a call site to a protected function can take the following form:
Figure 1: Disassembly of a call to a protected function
The start of the protected function:
.text:180001039 PROTECTED_FUNCTION
.text:180001039 jmp loc_18000DF97 ; jmp into obfuscated code
.text:180001039 sub_180001039 endp
.text:000000018000103E db 48h ; H. ; garbage data
.text:000000018000103F db 0FFh
.text:0000000180001040 db 0C1h
Figure 2: Disassembly inside of a protected function
The “end marker” consists of two sets of padding instructions: a run of int3 instructions followed by a single multi-byte nop instruction:
END_MARKER:
.text:18001A95C CC CC CC CC CC CC CC CC CC CC 66
66 0F 1F 84 00 00 00 00 00
.text:18001A95C int 3
.text:18001A95D int 3
.text:18001A95E int 3
.text:18001A95F int 3
.text:18001A960 int 3
.text:18001A961 int 3
.text:18001A962 int 3
.text:18001A963 int 3
.text:18001A964 int 3
.text:18001A965 int 3
.text:18001A966 db 66h, 66h ; @NOTE: IDA doesn't disassemble properly
.text:18001A966 nop word ptr [rax+rax+00000000h]
; -------------------------------------------------------------------------
; next, original function
.text:18001A970 ; [0000001F BYTES: COLLAPSED FUNCTION
__security_check_cookie. PRESS CTRL-NUMPAD+ TO EXPAND]
Figure 3: Disassembly listing of an end marker
Complete
The complete mode protects every function within the .text section of the binary, with all protections integrated directly into a single code section. There are no end markers to signify protected regions; instead, every function is uniformly protected, ensuring comprehensive coverage without additional sectioning.
This mode effectively forces the use of deobfuscation tooling. Whereas selective mode only protects the selected functions and leaves everything else in its original state, this mode makes the output binary extremely difficult to analyze without accounting for the obfuscation.
Complete Headerless
Headerless mode extends the complete approach by adding further data obfuscations alongside the code protections. It is the most comprehensive mode of protection and was observed to be exclusively limited to the final payloads of an attack chain. It incorporates the following properties:
Full PE header of the protected binary is removed.
Custom loading logic (a loader) is introduced.
Becomes the entry point of the protected binary
Responsible for ensuring the protected binary is functional
Includes the option of mapping the final payload within a separate memory region distinct from the initial memory region it was loaded in
Metadata is protected via hash-like integrity checks.
The metadata is utilized by the loader as part of its initialization sequence.
Import protection will require relocation adjustments.
Done through an “import fixup table”
The loader’s entry routine crudely merges with the original entry of the binary by inserting multiple jmp instructions to bridge the two together. The following is what the entry point looks like after running our deobfuscator against a binary protected in headerless mode.
Figure 4: Deobfuscated loader entry
The loader’s metadata is stored in the .data section of the protected binary. It is found via a memory scan that applies bitwise XOR operations against predefined constants. The use of these not only locates the metadata but also serves a dual purpose of verifying its integrity. By checking that the data matches expected patterns when XORed with these constants, the loader ensures that the metadata has not been altered or tampered with.
Figure 5: Memory scan to identify the loader’s metadata inside the .data section
The metadata contains the following (in order):
Import fixup table (fully explained in the Import Protection section)
Integrity-hash constants
Relative virtual address (RVA) of the .data section
Offset to the import fixup table from the start of the .data section
Size, in bytes, of the fixup table
Global pointer to the memory address that the backdoor is at
Encrypted and compressed data specific to the backdoor
Backdoor config and plugins
Figure 6: Loader’s metadata
Core Protection Components
Instruction Dispatcher
The instruction dispatcher is the central protection component that transforms the natural control flow of a binary (or individual function) into scattered basic blocks that end with a unique dispatcher routine that dynamically guides the execution of the protected binary.
Figure 7: Illustration of the control flow instruction dispatchers induce
Each call to a dispatcher is immediately followed by a 32-bit encoded displacement positioned at what would normally be the return address for the call. The dispatcher decodes this displacement to calculate the destination target for the next group of instructions to execute. A protected binary can easily contain thousands or even tens of thousands of these dispatchers, making manual analysis of them practically infeasible. Additionally, the dynamic dispatching and decoding logic employed by each dispatcher effectively disrupts CFG reconstruction methods used by all binary analysis frameworks.
The decoding logic is unique for each dispatcher and is carried out using a combination of add, sub, xor, and, or, and lea instructions. The decoded offset value is then either subtracted from or added to the expected return address of the dispatcher call to determine the final destination address. This calculated address directs execution to the next block of instructions, which will similarly end with a dispatcher that uniquely decodes and jumps to subsequent instruction blocks, continuing this process iteratively to control the program flow.
The following screenshot illustrates what a dispatcher instance looks like when constructed in IDA Pro. Notice the scattered addresses present even within instruction dispatchers, which result from the obfuscator transforming fallthrough instructions—instructions that naturally follow the preceding instruction—into pairs of conditional branches that use opposite conditions. This ensures that one branch is always taken, effectively creating an unconditional jump. Additionally, a mov instruction that functions as a no-op is inserted to split these branches, further obscuring the control flow.
Figure 8: Example of an instruction dispatcher and all of its components
The core logic for any dispatcher can be categorized into the following four phases:
Preservation of Execution Context
Each dispatcher selects a single working register (e.g., RSI as depicted in the screenshot) during the obfuscation process. This register is used in conjunction with the stack to carry out the intended decoding operations and dispatch.
The RFLAGS register in turn is safeguarded by employing pushfq and popfq instructions before carrying out the decoding sequence.
Retrieval of Encoded Displacement
Each dispatcher retrieves a 32-bit encoded displacement located at the return address of its corresponding call instruction. This encoded displacement serves as the basis for determining the next destination address.
Decoding Sequence
Each dispatcher employs a unique decoding sequence composed of the following arithmetic and logical instructions: xor, sub, add, mul, imul, div, idiv, and, or, and not. This variability ensures that no two dispatchers operate identically, significantly increasing the complexity of the control flow.
Termination and Dispatch
The ret instruction is strategically used to simultaneously signal the end of the dispatcher function and redirect the program’s control flow to the previously calculated destination address.
It is reasonable to infer that the obfuscator utilizes a template similar to the one illustrated in Figure 9 when applying its transformations to the original binary:
Figure 9: Instruction dispatcher template
Opaque Predicates
ScatterBrain uses a series of seemingly trivial opaque predicates (OP) that appear straightforward to analysts but significantly challenge contemporary binary analysis frameworks, especially when used collectively. These opaque predicates effectively disrupt static CFG recovery techniques not specifically designed to counter their logic. Additionally, they complicate symbolic execution approaches as well by inducing path explosions and hindering path prioritization. In the following sections, we will showcase a few examples produced by ScatterBrain.
test OP
This opaque predicate is constructed around the behavior of the test instruction when paired with an immediate zero value. Given that the test instruction effectively performs a bitwise AND operation, the obfuscator exploits the fact that any value bitwise AND-ed with zero invariably results in zero.
Here are some abstracted examples found in a protected binary—abstracted in the sense that the instructions are not guaranteed to follow one another directly; other forms of mutations, as well as instruction dispatchers, can appear between them.
test bl, 0
jnp loc_56C96 ; we never satisfy these conditions
------------------------------
test r8, 0
jo near ptr loc_3CBC8
------------------------------
test r13, 0
jnp near ptr loc_1A834
------------------------------
test eax, 0
jnz near ptr loc_46806
Figure 10: Test opaque predicate examples
To grasp the implementation logic of this opaque predicate, one must understand the semantics of the test instruction and its effects on the processor’s flags register. The instruction can affect six different flags in the following manner:
Overflow Flag (OF): Always cleared
Carry Flag (CF): Always cleared
Sign Flag (SF): Set if the most significant bit (MSB) of the result is set; otherwise cleared
Zero Flag (ZF): Set if the result is 0; otherwise cleared
Parity Flag (PF): Set if the number of set bits in the least significant byte (LSB) of the result is even; otherwise cleared
Auxiliary Carry Flag (AF): Undefined
Applying this understanding to the sequences produced by ScatterBrain, it is evident that the generated conditions can never be logically satisfied:
Sequence | Condition Description
test <reg>, 0; jo | OF is always cleared
test <reg>, 0; jnae/jc/jb | CF is always cleared
test <reg>, 0; js | Resulting value will always be zero; therefore, SF can never be set
test <reg>, 0; jnp/jpo | The number of set bits in zero is zero, which is even; therefore, PF will always be set
test <reg>, 0; jne/jnz | Resulting value will always be zero; therefore, ZF will always be set
Table 1: Test opaque predicate understanding
jcc OP
This opaque predicate is designed to statically obscure the original immediate branch targets for conditional branch (jcc) instructions. Consider the following examples:
test eax, eax
ja loc_3BF9C
ja loc_2D154
test r13, r13
jns loc_3EA84
jns loc_53AD9
test eax, eax
jnz loc_99C5
jnz loc_121EC
cmp eax, FFFFFFFF
jz loc_273EE
jz loc_4C227
Figure 11: jcc opaque predicate examples
The implementation is straightforward: each original jcc instruction is duplicated with a bogus branch target. Since both jcc instructions are functionally identical except for their respective branch destinations, we can determine with certainty that the first jcc in each pair is the original instruction. This original jcc dictates the correct branch target to follow when the respective condition is met, while the duplicated jcc serves to confuse analysis tools by introducing misleading branch paths.
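To make this concrete, the following minimal sketch flags such duplicated jcc pairs in an already linearized instruction stream using the Capstone disassembler. It is an illustrative example rather than the deobfuscator’s own logic, and the code buffer and base address are placeholders.

from capstone import Cs, CS_ARCH_X86, CS_MODE_64

# Conditional branch mnemonics as normalized by Capstone
JCC = {"ja", "jae", "jb", "jbe", "je", "jz", "jg", "jge", "jl", "jle",
       "jne", "jnz", "jno", "jnp", "jns", "jo", "jp", "js"}

def find_duplicated_jcc(code: bytes, base: int) -> list[int]:
    """Return addresses of bogus duplicate jcc instructions to discard."""
    md = Cs(CS_ARCH_X86, CS_MODE_64)
    bogus, prev = [], None
    for insn in md.disasm(code, base):
        if prev is not None and insn.mnemonic in JCC and insn.mnemonic == prev.mnemonic:
            # If execution falls through the first jcc, its condition was false,
            # so an identical jcc immediately after it can never be taken.
            bogus.append(insn.address)
        prev = insn
    return bogus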
Stack-Based OP
The stack-based opaque predicate is designed to check whether the current stack pointer (rsp) is below a predetermined immediate threshold—a condition that can never be true. It is consistently implemented by pairing the cmp rsp instruction with a jb (jump if below) condition immediately afterward.
cmp rsp, 0x8d6e
jb near ptr unk_180009FDA
Figure 12: Stack-based opaque predicate example
This technique inserts conditions that are always false, causing CFG algorithms to follow both branches and thereby disrupt their ability to accurately reconstruct the control flow.
Import Protection
The obfuscator implements a sophisticated import protection layer. This mechanism conceals the binary’s dependencies by routing each original call or jmp instruction directed at an import through a unique stub dispatcher routine that knows how to dynamically resolve and invoke the import in question.
Figure 13: Illustration of all the components involved in the import protection
It consists of the following components:
Import-specific encrypted data: Each protected import is represented by a unique dispatcher stub and a scattered data structure that stores RVAs to both the encrypted dynamic-link library (DLL) and application programming interface (API) names. We refer to this structure as obf_imp_t. Each dispatcher stub is hardcoded with a reference to its respective obf_imp_t.
Dispatcher stub: This is an obfuscated stub that dynamically resolves and invokes the intended import. While every stub shares an identical template, each contains a unique hardcoded RVA that identifies and locates its corresponding obf_imp_t.
Resolver routine: Called from the dispatcher stub, this obfuscated routine resolves the import and returns it to the dispatcher, which facilitates the final call to the intended import. It begins by locating the encrypted DLL and API names based on the information in obf_imp_t. After decrypting these names, the routine uses them to resolve the memory address of the API.
Import decryption routine: Called from the resolver routine, this obfuscated routine is responsible for decrypting the DLL and API name blobs through a custom stream cipher implementation. It uses a hardcoded 32-bit salt that is unique per protected sample.
Fixup Table: Present only in headerless mode, this is a relocation fixup table that the loader in headerless mode uses to correct all memory displacements to the following import protection components:
Encrypted DLL names
Encrypted API names
Import dispatcher references
Dispatcher Stub
The core of the import protection mechanism is the dispatcher stub. Each stub is tailored to an individual import and consistently employs a lea instruction to access its respective obf_imp_t, which it passes as the only input to the resolver routine.
push rcx ; save RCX
lea rcx, [rip+obf_imp_t] ; fetch import-specific obf_imp_t
push rdx ; save all other registers the stub uses
push r8
push r9
sub rsp, 28h
call ObfImportResolver ; resolve the import and return it in RAX
add rsp, 28h
pop r9 ; restore all saved registers
pop r8
pop rdx
pop rcx
jmp rax ; invoke resolved import
Figure 14: Deobfuscated import dispatcher stub
Each stub is obfuscated through the mutation mechanisms outlined earlier. This applies to the resolver and import decryption routines as well. The following is what the execution flow of a stub can look like. Note the scattered addresses, which, while presented sequentially, actually jump all around the code segment due to the instruction dispatchers.
obf_imp_t is the central data structure that contains the relevant information to resolve each import. It has the following form:
struct obf_imp_t { // sizeof=0x18
uint32_t CryptDllNameRVA; // NOTE: will be 64-bits, due to padding
uint32_t CryptAPINameRVA; // NOTE: will be 64-bits, due to padding
uint64_t ResolvedImportAPI; // Where the resolved address is stored
};
Figure 16: obf_imp_t in its original C struct source form
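Given that layout, pulling an obf_imp_t out of a mapped image is a simple fixed-size read. The snippet below is an illustrative sketch, assuming img holds the mapped image bytes and rva is the structure’s location recovered from a dispatcher stub’s lea operand; the Python type names are ours, not the deobfuscator’s.

import struct
from typing import NamedTuple

class ObfImp(NamedTuple):
    crypt_dll_rva: int    # RVA of the encrypted DLL name
    crypt_api_rva: int    # RVA of the encrypted API name
    resolved_addr: int    # slot where the resolved import address is stored

def parse_obf_imp_t(img: bytes, rva: int) -> ObfImp:
    # Both RVA fields occupy 64 bits due to padding, giving 0x18 bytes total.
    dll_rva, api_rva, resolved = struct.unpack_from("<QQQ", img, rva)
    return ObfImp(dll_rva & 0xFFFFFFFF, api_rva & 0xFFFFFFFF, resolved)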
It is processed by the resolver routine, which uses the embedded RVAs to locate the encrypted DLL and API names, decrypting each in turn. After decrypting each name blob, it uses LoadLibraryA to ensure the DLL dependency is loaded in memory and leverages GetProcAddress to retrieve the address of the import.
The import decryption logic is implemented using a Linear Congruential Generator (LCG) algorithm to generate a pseudo-random key stream, which is then used in a XOR-based stream cipher for decryption. It operates on the following formula:
X(n+1) = (a * Xn + c) mod 2^32
where:
a is always hardcoded to 17 and functions as the multiplier
c is a unique 32-bit constant determined by the encryption context and is unique per protected sample
We refer to it as the imp_decrypt_const
mod 2^32 confines the sequence values to a 32-bit range
The decryption logic initializes with a value from the encrypted data and iteratively generates new values using the outlined LCG formula. Each iteration produces a byte derived from the calculated value, which is then XOR’ed with the corresponding encrypted byte. This process continues byte-by-byte until it reaches a termination condition.
A fully recovered Python implementation for the decryption logic is provided in Figure 18.
Figure 18: Complete Python implementation of the import string decryption routine
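To make the scheme concrete, here is a minimal re-implementation sketch of the decryption as described. It is not the recovered routine itself: the multiplier of 17 and the per-sample imp_decrypt_const come from the description above, while the seed placement (first dword of the blob) and the NUL terminator are assumptions made for illustration.

def lcg_xor_decrypt(blob: bytes, imp_decrypt_const: int) -> bytes:
    """Sketch of the LCG-driven XOR stream cipher described above."""
    state = int.from_bytes(blob[:4], "little")   # assumed: seed taken from the data
    out = bytearray()
    for enc in blob[4:]:
        state = (17 * state + imp_decrypt_const) & 0xFFFFFFFF  # X = (a*X + c) mod 2^32
        dec = enc ^ (state & 0xFF)               # XOR with the low byte of the key stream
        if dec == 0:                             # assumed terminator
            break
        out.append(dec)
    return bytes(out)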
Import Fixup Table
The import relocation fixup table is a fixed-size array of entries, each composed of two 32-bit RVAs. The first RVA represents the memory displacement of where the data is referenced from. The second RVA points to the actual data in question. The entries in the fixup table can be categorized into three distinct types, each corresponding to a specific import component:
Encrypted DLL names
Encrypted API names
Import dispatcher references
Figure 19: Illustration of the import fixup table
The location of the fixup table is determined by the loader’s metadata, which specifies an offset from the start of the .data section to the start of the table. During initialization, the loader is responsible for applying the relocation fixups for each entry in the table.
Figure 20: Loader metadata that shows the Import fixup table entries and metadata used to find it
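For static recovery it is usually enough to read this table rather than apply it the way the loader does at runtime. The following sketch parses it into a map from each referencing location to the RVA of the component it resolves to; the parameter names mirror the metadata fields described above, and the zero-entry sentinel check is an assumption.

import struct

def parse_import_fixup_table(img: bytes, data_section_rva: int,
                             fixup_offset: int, fixup_size: int) -> dict[int, int]:
    """Map each reference-site RVA to the RVA of the import component it uses."""
    table_rva = data_section_rva + fixup_offset
    fixups = {}
    for off in range(0, fixup_size, 8):
        # Each entry is a pair of 32-bit RVAs: where the reference is made from,
        # and the data (encrypted DLL/API name or dispatcher) it points to.
        ref_rva, target_rva = struct.unpack_from("<II", img, table_rva + off)
        if ref_rva == 0 and target_rva == 0:     # assumed end-of-table sentinel
            break
        fixups[ref_rva] = target_rva
    return fixups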
Recovery
Effective recovery from an obfuscated binary necessitates a thorough understanding of the protection mechanisms employed. While deobfuscation often benefits from working with an intermediate representation (IR) rather than the raw disassembly—an IR provides more granular control in undoing transformations—this obfuscator preserves the original compiled code, merely enveloping it with additional protection layers. Given this context, our deobfuscation strategy focuses on stripping away the obfuscator’s transformations from the disassembly to reveal the original instructions and data. This is achieved through a series of hierarchical phases, where each subsequent phase builds upon the previous one to ensure comprehensive deobfuscation.
We categorize this approach into three distinct categories that we eventually integrate:
CFG Recovery
Restoring the natural control flow by removing obfuscation artifacts at the instruction and basic block levels. This involves two phases:
Accounting for instruction dispatchers: Addressing the core of the control flow protection that obscures the execution flow
Function identification and recovery: Cataloging scattered instructions and reassembling them into their original function counterparts
Import Recovery
Original Import Table: The goal is to reconstruct the original import table, ensuring that all necessary library and function references are accurately restored.
Binary Rewriting
Generating Deobfuscated Executables: This process entails creating a new, deobfuscated executable that maintains the original functionality while removing ScatterBrain’s modifications.
Given the complexity of each category, we concentrate on the core aspects necessary to break the obfuscator by providing a guided walkthrough of our deobfuscator’s source code and highlighting the essential logic required to reverse these transformations. This step-by-step examination demonstrates how each obfuscation technique is methodically undone, ultimately restoring the binary’s original structure.
Our directory structure reflects this organized approach:
Figure 21: Directory structure of our deobfuscator library
This comprehensive recovery process not only restores the binaries to their original state but also equips analysts with the tools and knowledge necessary to combat similar obfuscation techniques in the future.
CFG Recovery
The primary obstacle disrupting the natural control flow graph is the use of instruction dispatchers. Eliminating these dispatchers is our first priority in obtaining the CFG. Afterward, we need to reorganize the scattered instructions back into their original function representations—a problem known as function identification, which is notoriously difficult to generalize. Therefore, we approach it using our specific knowledge about the obfuscator.
Linearizing the Scattered CFG
Our initial step in recovering the original CFG is to eliminate the scattering effect induced by instruction dispatchers. We will transform all dispatcher call instructions into direct branches to their resolved targets. This transformation linearizes the execution flow, making it straightforward to statically pursue the second phase of our CFG recovery. This will be implemented via brute-force scanning, static parsing, emulation, and instruction patching.
Function Identification and Recovery
We leverage a recursive descent algorithm that employs a depth-first search (DFS) strategy applied to known entry points of code, attempting to exhaust all code paths by “single-stepping” one instruction at a time. We add additional logic to the processing of each instruction in the form of “mutation rules” that stipulate how each individual instruction needs to be processed. These rules aid in stripping away the obfuscator’s code from the original.
Removing Instruction Dispatchers
Eliminating instruction dispatchers involves identifying each dispatcher location and its corresponding dispatch target. Recall that the target is a uniquely encoded 32-bit displacement located at the return address of the dispatcher call. To remove instruction dispatchers, it is essential to first understand how to accurately identify them. We begin by categorizing the defining properties of individual instruction dispatchers:
Target of a Near Call
Dispatchers are always the destination of a near call instruction, represented by the E8 opcode followed by a 32-bit displacement.
References Encoded 32-Bit Displacement at Return Address
Dispatchers reference the encoded 32-bit displacement located at the return address on the stack by performing a 32-bit read from the stack pointer. This displacement is essential for determining the next execution target.
Pairing of pushfq and popfq Instructions to Safeguard Decoding
Dispatchers use a pair of pushfq and popfq instructions to preserve the state of the RFLAGS register during the decoding process. This ensures that the dispatcher does not alter the original execution context, maintaining the integrity of register contents.
End with a ret Instruction
Each dispatcher concludes with a ret instruction, which not only ends the dispatcher function but also redirects control to the next set of instructions, effectively continuing the execution flow.
Leveraging the aforementioned categorizations, we implement the following approach to identify and remove instruction dispatchers:
Brute-Force Scanner for Near Call Locations
Develop a scanner that searches for all near call instructions within the code section of the protected binary. This scanner generates a huge array of potential call locations that may serve as dispatchers.
Implementation of a Fingerprint Routine
The brute-force scan yields a large number of false positives, requiring an efficient method to filter them. While emulation can filter out false positives, it is computationally expensive to do it for the brute-force results.
Introduce a shallow fingerprinting routine that traverses the disassembly of each candidate to identify key dispatcher characteristics, such as the presence of pushfq and popfq sequences. This significantly improves performance by eliminating most false positives before concretely verifying them through emulation.
Emulation of Targets to Recover Destinations
Emulate execution starting from each verified call site to accurately recover the actual dispatch targets. Emulating from the call site ensures that the emulator processes the encoded offset data at the return address, abstracting away the specific decoding logic employed by each dispatcher.
A successful emulation also serves as the final verification step to confirm that we have identified a dispatcher.
Identification of Dispatch Targets via ret Instructions
Utilize the terminating ret instruction to accurately identify the dispatch target within the binary.
The ret instruction is a definitive marker indicating the end of a dispatcher function and the point at which control is redirected, making it a reliable indicator for target identification.
Brute-Force Scanner
The following Python code implements the brute-force scanner, which performs a comprehensive byte signature scan within the code segment of a protected binary. The scanner systematically identifies all potential call instruction locations by scanning for the 0xE8 opcode associated with near call instructions. The identified addresses are then stored for subsequent analysis and verification.
Figure 22: Python implementation of the brute-force scanner
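A minimal equivalent of such a scan takes only a few lines. This is an illustrative sketch rather than the library’s own code, with code holding the raw bytes of the code section and base its virtual address.

def scan_near_calls(code: bytes, base: int) -> list[int]:
    """Brute-force scan for candidate near call (E8 rel32) locations."""
    candidates = []
    offset = code.find(b"\xE8")
    while offset != -1:
        # A valid near call needs 4 displacement bytes after the opcode.
        if offset + 5 <= len(code):
            candidates.append(base + offset)
        offset = code.find(b"\xE8", offset + 1)
    return candidates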
Fingerprinting Dispatchers
The fingerprinting routine leverages the unique characteristics of instruction dispatchers, as detailed in the Instruction Dispatchers section, to statically identify potential dispatcher locations within a protected binary. This identification process utilizes the results from the prior brute-force scan. For each address in this array, the routine disassembles the code and examines the resulting disassembly listing to determine if it matches known dispatcher signatures.
This method is not intended to guarantee 100% accuracy, but rather serve as a cost-effective approach to identifying call locations with a high likelihood of being instruction dispatchers. Subsequent emulation will be employed to confirm these identifications.
Successful Decoding of a call Instruction
The identified location must successfully decode to a call instruction. Dispatchers are always invoked via a call instruction. Additionally, dispatchers utilize the return address from the call site to locate their encoded 32-bit displacement.
Absence of Subsequent call Instructions
Dispatchers must not contain any call instructions within their disassembly listing. The presence of any call instructions within a presumed dispatcher range immediately disqualifies the call location as a dispatcher candidate.
Absence of Privileged Instructions and Indirect Control Transfers
As with call instructions, the dispatcher cannot include privileged instructions or indirect unconditional jmps. The presence of any such instructions invalidates the call location.
Detection of pushfq and popfq Guard Sequences
The dispatcher must contain pushfq and popfq instructions to safeguard the RFLAGS register during decoding. These sequences are unique to dispatchers and suffice for a generic identification without worrying about the differences that arise between how the decoding takes place.
Figure 23 is the fingerprint verification routine that incorporates all the aforementioned characteristics and validation checks given a potential call location:
Figure 23: The dispatch fingerprint routine
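For illustration, the sketch below applies the same checks with the Capstone disassembler: the candidate must be an E8 near call, and the window at its call target must contain the pushfq/popfq guard, contain no further call, privileged, or indirect unconditional jmp instructions, and terminate in a ret. The window size and the privileged-instruction set are arbitrary choices for this sketch, not values taken from the actual routine.

from capstone import Cs, CS_ARCH_X86, CS_MODE_64

MAX_SCAN = 64                                     # arbitrary instruction budget
PRIVILEGED = {"hlt", "in", "out", "cli", "sti", "int", "iretq"}  # abbreviated set

def looks_like_dispatcher(img: bytes, base: int, call_ea: int) -> bool:
    """Cheap static fingerprint of an instruction dispatcher candidate."""
    md = Cs(CS_ARCH_X86, CS_MODE_64)
    # Candidates come from the E8 scan, so the dispatcher entry is call_ea + 5 + rel32.
    rel32 = int.from_bytes(img[call_ea - base + 1:call_ea - base + 5], "little", signed=True)
    target = call_ea + 5 + rel32
    if not (base <= target < base + len(img)):
        return False
    saw_pushfq = saw_popfq = False
    window = img[target - base:target - base + 1024]
    for count, insn in enumerate(md.disasm(window, target)):
        if count > MAX_SCAN:
            return False
        m = insn.mnemonic
        if m == "pushfq":
            saw_pushfq = True
        elif m == "popfq":
            saw_popfq = True
        elif m == "call" or m in PRIVILEGED:
            return False                          # dispatchers never contain these
        elif m == "jmp" and not insn.op_str.startswith("0x"):
            return False                          # register/memory-indirect jmp
        elif m == "ret":
            return saw_pushfq and saw_popfq       # RFLAGS must have been guarded
    return False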
Emulating Dispatchers to Resolve Destination Targets
After filtering potential dispatchers using the fingerprinting routine, the next step is to emulate them in order to recover their destination targets.
Figure 24: Emulation sequence used to recover dispatcher destination targets
The Python code in Figure 24 performs this logic and operates as follows:
Initialization of the Emulator
Creates the core engine for simulating execution (EmulateIntel64), maps the protected binary image (imgbuffer) into the emulator’s memory space, maps the Thread Environment Block (TEB) as well to simulate a realistic Windows execution environment, and creates an initial snapshot to facilitate fast resets before each emulation run without needing to reinitialize the entire emulator each time.
MAX_DISPATCHER_RANGE specifies the maximum number of instructions to emulate for each dispatcher. The value 45 is chosen arbitrarily, sufficient given the limited instruction count in dispatchers even with the added mutations.
A try/except block is used to handle any exceptions during emulation. It is assumed that exceptions result from false positives among the potential dispatchers identified earlier and can be safely ignored.
Emulating Each Potential Dispatcher
For each potential dispatcher address (call_dispatch_ea), the emulator’s context is restored to the initial snapshot. The program counter (emu.pc) is set to the address of each dispatcher. emu.stepi() executes one instruction at the current program counter, after which the instruction is analyzed to determine whether we have finished.
If the instruction is a ret, the emulation has reached the dispatch point.
The dispatch target address is read from the stack using emu.parse_u64(emu.rsp).
The results are captured by d.dispatchers_to_target, which maps the dispatcher address to the dispatch target. The dispatcher address is additionally stored in the d.dispatcher_locs lookup cache.
The break statement exits the inner loop, proceeding to the next dispatcher.
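Put together, the loop looks roughly like the sketch below. EmulateIntel64, emu.pc, emu.stepi(), emu.parse_u64(), MAX_DISPATCHER_RANGE, and the d result object are the names used above, while context_save/context_restore and disasm_at are assumptions about the emulator wrapper rather than its confirmed interface.

MAX_DISPATCHER_RANGE = 45    # generous upper bound on dispatcher instruction count

def resolve_dispatch_targets(emu, d, candidates):
    """Emulate each candidate call site to recover its dispatch target."""
    snapshot = emu.context_save()                  # assumed wrapper method
    for call_dispatch_ea in candidates:
        emu.context_restore(snapshot)              # fast reset between runs
        emu.pc = call_dispatch_ea                  # begin at the call itself
        try:
            for _ in range(MAX_DISPATCHER_RANGE):
                insn = emu.disasm_at(emu.pc)       # assumed helper: peek at next insn
                if insn.mnemonic == "ret":
                    # The decoded destination sits at the top of the stack right
                    # before the ret transfers control to it.
                    target = emu.parse_u64(emu.rsp)
                    d.dispatchers_to_target[call_dispatch_ea] = target
                    d.dispatcher_locs.add(call_dispatch_ea)
                    break
                emu.stepi()                        # execute one instruction
        except Exception:
            continue                               # assume false positive; ignore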
Patching and Linearization
After collecting and verifying every captured instruction dispatcher, the final step is to replace each call location with a direct branch to its respective destination target. Since both near call and jmp instructions occupy 5 bytes in size, this replacement can be seamlessly performed by merely patching the jmp instruction over the call.
Figure 25: Patching sequence to transform instruction dispatcher calls to unconditional jmps to their destination targets
We utilize the dispatchers_to_target map, established in the previous section, which associates each dispatcher call location with its corresponding destination target. By iterating through this map, we identify each dispatcher call location and replace the original call instruction with a jmp. This substitution redirects the execution flow directly to the intended target addresses.
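Because the patch is a fixed five-byte overwrite, the rewrite itself is trivial. The sketch below assumes img is a mutable copy of the mapped image and dispatchers_to_target is the map described above; it is illustrative rather than the deobfuscator’s exact code.

def patch_dispatchers_to_jmps(img: bytearray, base: int,
                              dispatchers_to_target: dict[int, int]) -> None:
    """Overwrite each dispatcher call (E8 rel32) with a direct jmp (E9 rel32)."""
    for call_ea, target_ea in dispatchers_to_target.items():
        rel32 = (target_ea - (call_ea + 5)) & 0xFFFFFFFF   # relative to the next instruction
        off = call_ea - base
        img[off] = 0xE9                                     # near jmp opcode
        img[off + 1:off + 5] = rel32.to_bytes(4, "little")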
This removal is pivotal to our deobfuscation strategy as it removes the intended dynamic dispatch element that instruction dispatchers were designed to provide. Although the code is still scattered throughout the code segment, the execution flow is now statically deterministic, making it immediately apparent which instruction leads to the next one.
When we compare these results to the initial screenshot from the Instruction Dispatcher section, the blocks still appear scattered. However, their execution flow has been linearized. This progress allows us to move forward to the second phase of our CFG recovery.
Figure 26: Linearized instruction dispatcher control flow
Function Identification and Recovery
By eliminating the effects of instruction dispatchers, we have linearized the execution flow. The next step involves assimilating the dispersed code and leveraging the linearized control flow to reconstruct the original functions that comprised the unprotected binary. This recovery phase involves several stages, including raw instruction recovery, normalization, and the construction of the final CFG.
Function identification and recovery is encapsulated in the following two abstractions:
Recovered instruction (RecoveredInstr): The fundamental unit for representing individual instructions recovered from an obfuscated binary. Each instance encapsulates not only the raw instruction data but also metadata essential for relocation, normalization, and analysis within the CFG recovery process.
Recovered function (RecoveredFunc): The end result of successfully recovering an individual function from an obfuscated binary. It aggregates multiple RecoveredInstr instances, representing the sequence of instructions that constitute the unprotected function. The complete CFG recovery process results in an array of RecoveredFunc instances, each corresponding to a distinct function within the binary. We will utilize these results in the final Building Relocations in Deobfuscated Binaries section to produce fully deobfuscated binaries.
We do not utilize a basic block abstraction for our recovery approach given the following reasons. Properly abstracting basic blocks presupposes complete CFG recovery, which introduces unnecessary complexity and overhead for our purposes. Instead, it is simpler and more efficient to conceptualize a function as an aggregation of individual instructions rather than a collection of basic blocks in this particular deobfuscation context.
Figure 27: RecoveredInstr type definition
Figure 28: RecoveredFunc type definition
DFS Rule-Guided Stepping Introduction
We opted for a recursive descent, depth-first algorithm for the following reasons:
Natural fit for code traversal: DFS allows us to infer function boundaries based solely on the flow of execution. It mirrors the way functions call other functions, making it intuitive to implement and reason about when reconstructing function boundaries. It also simplifies following the flow of loops and conditional branches.
Guaranteed execution paths: We concentrate on code that is definitely executed. Given we have at least one known entry point into the obfuscated code, we know execution must pass through it in order to reach other parts of the code. While other parts of the code may be more indirectly invoked, this entry point serves as a foundational starting point.
By recursively exploring from this known entry, we will almost certainly encounter and identify virtually all code paths and functions during our traversal.
Adapts to instruction mutations: We tailor the logic of the traversal with callbacks or “rules” that stipulate how we process each individual instruction. This helps us account for known instruction mutations and aids in stripping away the obfuscator’s code.
The core data structures involved in this process are the following: CFGResult, CFGStepState, and RuleHandler:
CFGResult: Container for the results of the CFG recovery process. It aggregates all pertinent information required to represent the CFG of a function within the binary, which it primarily consumes from CFGStepState.
CFGStepState: Maintains the state throughout the CFG recovery process, particularly during the controlled-step traversal. It encapsulates all necessary information to manage the traversal state, track progress, and store intermediate results.
Recovered cache: Stores instructions that have been recovered for a protected function without any additional cleanup or verification. This initial collection is essential for preserving the raw state of the instructions as they exist within the obfuscated binary before any normalization or validation processes are applied after. It is always the first pass of recovery.
Normalized cache: The final pass in the CFG recovery process. It transforms the raw instructions stored in the recovered cache into a fully normalized CFG by removing all obfuscator-introduced instructions and ensuring the creation of valid, coherent functions.
Exploration stack: Manages the set of instruction addresses that are pending exploration during the DFS traversal for a protected function. It determines the order in which instructions are processed and utilizes a visited set to ensure that each instruction is processed only once.
Obfuscator backbone: A mapping to preserve essential control flow links introduced by the obfuscator
RuleHandler: Mutation rules are merely callbacks that adhere to a specific function signature and are invoked during each instruction step of the CFG recovery process. They take as input the current protected binary, CFGStepState, and the current step-in instruction. Each rule contains specific logic designed to detect particular types of instruction characteristics introduced by the obfuscator. Based on the detection of these characteristics, the rules determine how the traversal should proceed. For instance, a rule might decide to continue traversal, skip certain instructions, or halt the process based on the nature of the mutation.
Figure 29: CFGResult type definition
Figure 30: CFGStepState type definition
Figure 31: RuleHandler type definition
The following figure is an example of a rule that is used to detect the patched instruction dispatchers we introduced in the previous section and differentiate them from standard jmp instructions:
Figure 32: RuleHandler example that identifies patched instruction dispatchers and differentiates them from standard jmp instructions
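As an illustration of the shape such a rule takes, the sketch below follows the RuleHandler idea described above. The return-value convention, the dispatcher_locs and backbone field names, and the operand access are assumptions made for this example, not the library’s actual interface.

def rule_patched_dispatcher(binary, state, insn):
    """Mutation rule: treat a jmp we patched over a dispatcher call as obfuscator
    glue rather than original code.

    Assumed return convention for this sketch:
      ("keep", None)     -> keep insn and continue sequentially
      ("skip", next_ea)  -> drop insn and continue traversal at next_ea
    """
    if insn.mnemonic != "jmp":
        return ("keep", None)
    if insn.address not in binary.dispatcher_locs:
        return ("keep", None)                  # a genuine jmp from the original code
    target = insn.operands[0].imm              # destination of the patched jmp
    # Record the control-flow link the obfuscator introduced so later passes
    # (e.g., walking the backbone) can follow it.
    state.backbone[insn.address] = target
    return ("skip", target)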
DFS Rule-Guided Stepping Implementation
The remaining component is a routine that orchestrates the CFG recovery process for a given function address within the protected binary. It leverages the CFGStepState to manage the DFS traversal and applies mutation rules to decode and recover instructions systematically. The result will be an aggregate of RecoveredInstr instances that constitute the first pass of raw recovery:
Figure 33: Flow chart of our DFS rule-guided stepping algorithm
The following Python code directly implements the algorithm outlined in Figure 33. It initializes the CFG stepping state and commences a DFS traversal starting from the function’s entry address. During each step of the traversal, the current instruction address is retrieved from the to_explore exploration stack and checked against the visited set to prevent redundant processing. The instruction at the current address is then decoded, and a series of mutation rules are applied to handle any obfuscator-induced instruction modifications. Based on the outcomes of these rules, the traversal may continue, skip certain instructions, or halt entirely.
Recovered instructions are appended to the recovered cache, and their corresponding mappings are updated within the CFGStepState. The to_explore stack is subsequently updated with the address of the next sequential instruction to ensure systematic traversal. This iterative process continues until all relevant instructions have been explored, culminating in a CFGResult that encapsulates the fully recovered CFG.
With the raw instructions successfully recovered, the next step is to normalize the control flow. While the raw recovery process ensures that all original instructions are captured, these instructions alone do not form a cohesive and orderly function. To achieve a streamlined control flow, we must filter and refine the recovered instructions—a process we refer to as normalization. This stage involves several key tasks:
Updating branch targets: Once all of the obfuscator-introduced code (instruction dispatchers and mutations) is fully removed, all branch instructions must be redirected to their correct destinations. The scattering effect introduced by obfuscation often leaves branches pointing to unrelated code segments.
Merging overlapping basic blocks: Contrary to the idea of a basic block as a strictly single-entry, single-exit structure, compilers can produce code in which one basic block begins within another. This overlapping of basic blocks commonly appears in loop structures. As a result, these overlaps must be resolved to ensure a coherent CFG.
Proper function boundary instruction: Each function must begin and end at well-defined boundaries within the binary’s memory space. Correctly identifying and enforcing these boundaries is essential for accurate CFG representation and subsequent analysis.
Simplifying with Synthetic Boundary Jumps
Rather than relying on traditional basic block abstractions—which can impose unnecessary overhead—we employ synthetic boundary jumps to simplify CFG normalization. These artificial jmp instructions link otherwise disjointed instructions, allowing us to avoid splitting overlapping blocks and ensuring that each function concludes at a proper boundary instruction. This approach also streamlines our binary rewriting process when reconstructing the recovered functions into the final deobfuscated output binary.
Merging overlapping basic blocks and ensuring functions have proper boundary instructions amount to the same problem—determining which scattered instructions should be linked together. To illustrate this, we will examine how synthetic jumps effectively resolve this issue by ensuring that functions conclude with the correct boundary instructions. The exact same approach applies to merging basic blocks together.
Synthetic Boundary Jumps to Ensure Function Boundaries
Consider an example where we have successfully recovered a function using our DFS-based rule-guided approach. Inspecting the recovered instructions in the CFGState reveals a mov instruction as the final operation. If we were to reconstruct this function in memory as-is, the absence of a subsequent fallthrough instruction would compromise the function’s logic.
Figure 35: Example of a raw recovery that does not end with a natural function boundary instruction
To address this, we introduce a synthetic jump whenever the last recovered instruction is not a natural function boundary (e.g., ret, jmp, int3).
Figure 36: Simple Python routine that identifies function boundary instructions
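A minimal equivalent check needs only a small set of terminating mnemonics; the set below (ret, jmp, int3, per the examples above) is illustrative rather than exhaustive.

BOUNDARY_MNEMONICS = {"ret", "retn", "jmp", "int3"}

def is_boundary_instruction(insn) -> bool:
    """True if insn can legitimately terminate a function's byte range."""
    return insn.mnemonic in BOUNDARY_MNEMONICS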
We determine the fallthrough address, and if it points to an obfuscator-introduced instruction, we continue forward until reaching the first regular instruction. We call this traversal “walking the obfuscator’s backbone”:
Figure 37: Python routine that implements walking the obfuscator’s backbone logic
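Conceptually, the traversal reduces to following the obfuscator backbone mapping, built up during recovery, until it no longer applies. The sketch below is illustrative and reuses the backbone dictionary shape assumed in the earlier rule example.

def walk_backbone(backbone: dict[int, int], ea: int) -> int:
    """Follow obfuscator-introduced links until reaching a regular instruction."""
    # backbone maps the address of an obfuscator-introduced instruction
    # (e.g., a patched dispatcher jmp) to the address it forwards execution to.
    while ea in backbone:
        ea = backbone[ea]
    return ea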
We then link these points with a synthetic jump. The synthetic jump inherits the original address as metadata, effectively indicating which instruction it is logically connected to.
Figure 38: Example of adding a synthetic boundary jmp to create a natural function boundary
Updating Branch Targets
After normalizing the control flow, adjusting branch targets becomes a straightforward process. Each branch instruction in the recovered code may still point to obfuscator-introduced instructions rather than the intended destinations. By iterating through the normalized_flow cache (generated in the next section), we identify branching instructions and verify their targets using the walk_backbone routine.
This ensures that all branch targets are redirected away from the obfuscator’s artifacts and correctly aligned with the intended execution paths. Notice we can ignore call instructions given that any non-dispatcher call instruction is guaranteed to always be legitimate and never part of the obfuscator’s protection. These will, however, need to be updated during the final relocation phase outlined in the Building Relocations in Deobfuscated Binaries section.
Once recalculated, we reassemble and decode the instructions with updated displacements, preserving both correctness and consistency.
Figure 39: Python routine responsible for updating all branch targets
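A simplified version of that pass is sketched below. The rec.instr field name, the operand access, and the reassemble helper (for example, a Keystone-based re-encoder) are assumptions for illustration rather than the library’s actual interface; walk_backbone is the sketch shown earlier.

def update_branch_targets(normalized_flow, backbone, reassemble):
    """Point every conditional and unconditional branch at real, original code."""
    for rec in normalized_flow:
        insn = rec.instr
        if not insn.mnemonic.startswith("j"):      # calls and non-branches are left alone
            continue
        target = insn.operands[0].imm              # current branch destination
        fixed = walk_backbone(backbone, target)    # first non-obfuscator instruction
        if fixed != target:
            rec.instr = reassemble(insn, fixed)    # re-encode with the new displacement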
Putting It All Together
Putting it all together, we developed the following algorithm that builds upon the previously recovered instructions, ensuring that each instruction, branch, and block is properly connected, resulting in a completely recovered and deobfuscated CFG for an entire protected binary. We utilize the recovered cache to construct a new, normalized cache. The algorithm employs the following steps:
Iterate Over All Recovered Instructions
Traverse all recovered instructions produced from our DFS-based stepping approach.
Add Instruction to Normalized Cache
For each instruction, add it to the normalized cache, which captures the results of the normalization pass.
Identify Boundary Instructions
Determine whether the current instruction is a boundary instruction.
If it is a boundary instruction, skip further processing of this instruction and continue to the next one (return to Step 1).
Calculate Expected Fallthrough Instruction
Determine the expected fallthrough instruction by identifying the sequential instruction that follows the current one in memory.
Verify Fallthrough Instruction
Compare the calculated fallthrough instruction with the next instruction in the recovered cache.
If the fallthrough instruction is not the next sequential instruction in memory, check whether it’s a recovered instruction we already normalized:
If it is, add a synthetic jump to link the two together in the normalized cache.
If it is not, obtain the connecting fallthrough instruction from the recovery cache and append it to the normalized cache.
If the fallthrough instruction matches the next instruction in the recovered cache:
Do nothing, as the recovered instruction already correctly points to the fallthrough. Proceed to Step 6.
Handle Final Instruction
Check if the current instruction is the final instruction in the recovered cache.
If it is the final instruction:
Add a final synthetic boundary jump, because if we reach this stage, we failed the check in Step 3.
Continue iteration, which will cause the loop to exit.
If it is not the final instruction:
Continue iteration as normal (return to Step 1).
Figure 40: Flow chart of our normalization algorithm
The Python code in Figure 41 directly implements these normalization steps. It iterates over the recovered instructions and adds them to a normalized cache (normalized_flow), creates a linear mapping, and identifies where synthetic jumps are required. When a branch target points to obfuscator-injected code, it walks the backbone (walk_backbone) to find the next legitimate instruction. If the end of a function is reached without a natural boundary, a synthetic jump is created to maintain proper continuity. After the completion of the iteration, every branch target is updated (update_branch_targets), as illustrated in the previous section, to ensure that each instruction is correctly linked, resulting in a fully normalized CFG:
Figure 41: Python implementation of our normalization algorithm
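The full implementation is in Figure 41; the simplified sketch below captures only the synthetic-jump bookkeeping described in the steps above (the recovery-cache lookups and backbone walking are omitted, and the attribute and helper names are assumptions):
def normalize_flow(recovered, is_boundary, make_synthetic_jmp):
    """Build a normalized, linearly connected instruction cache.
    `recovered` is an ordered list of instructions with `.ea` and `.size`;
    `make_synthetic_jmp(src_ea, dest_ea)` builds a synthetic jump instruction."""
    normalized = []
    for i, instr in enumerate(recovered):
        normalized.append(instr)
        if is_boundary(instr):
            continue  # natural boundary, nothing to link
        fallthrough = instr.ea + instr.size
        if i + 1 == len(recovered):
            # Final instruction without a natural boundary: close it off.
            normalized.append(make_synthetic_jmp(instr.ea, fallthrough))
        elif recovered[i + 1].ea != fallthrough:
            # Next recovered instruction is not sequential in memory:
            # link the two explicitly with a synthetic jump.
            normalized.append(make_synthetic_jmp(instr.ea, recovered[i + 1].ea))
    return normalized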
Observing the Results
After applying our two primary passes, we have nearly eliminated all of the protection mechanisms. Although import protection remains to be addressed, our approach effectively transforms an incomprehensible mess into a perfectly recovered CFG.
For example, Figure 42 and Figure 43 illustrate the before and after of a critical function within the backdoor payload, which is a component of its plugin manager system. Through additional analysis of the output, we can identify functionalities that would have been impossible to delineate, much less in such detail, without our deobfuscation process.
Figure 42: Original obfuscated shadow::PluginProtocolCreateAndConfigure routine
Figure 43: Completely deobfuscated and functional shadow::PluginProtocolCreateAndConfigure routine
Import Recovery
Recovering and restoring the original import table revolves around identifying which import location is associated with which import dispatcher stub. From the stub dispatcher, we can parse the respective obf_imp_t reference in order to determine the protected import that it represents.
We pursue the following logic:
Identify each valid call/jmp location associated with an import
The memory displacement for these will point to the respective dispatcher stub.
For HEADERLESS mode, we need to first resolve the fixup table to ensure the displacement points to a valid dispatcher stub.
For each valid location traverse the dispatcher stub to extract the obf_imp_t
The obf_imp_t contains the RVAs to the encrypted DLL and API names.
Implement the string decryption logic
We need to reimplement the decryption logic in order to recover the DLL and API names.
This was already done in the initial Import Protection section.
We encapsulate the recovery of imports with the following RecoveredImport data structure:
Figure 44: RecoveredImport type definition
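As an illustration only, a comparable container might look like the dataclass below; the field names here are assumptions and not the library’s actual definition:
from dataclasses import dataclass

@dataclass
class RecoveredImportSketch:
    """Illustrative stand-in for the RecoveredImport type in Figure 44."""
    call_ea: int        # address of the protected call/jmp import site
    dispatcher_ea: int  # dispatcher stub the site's displacement resolves to
    obf_imp_ea: int     # address of the obf_imp_t metadata structure
    dll_name: str       # decrypted DLL name, e.g. "kernel32.dll"
    api_name: str       # decrypted API name, e.g. "CreateFileW"
    is_jmp: bool        # True if the import site is a jmp rather than a call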
RecoveredImport serves as the result produced for each import that we recover. It contains all the relevant data that we will use to rebuild the original import table when producing the deobfuscated image.
Locate Protected Import CALL and JMP Sites
Each protected import location will be reflected as either an indirect near call (FF/2) or an indirect near jmp (FF/4):
Figure 45: Disassembly of import calls and jmps representation
Indirect near calls and jmps fall under the FF group opcode where the Reg field within the ModR/M byte identifies the specific operation for the group:
/2: corresponds to CALL r/m64
/4: corresponds to JMP r/m64
Taking an indirect near call as an example, the encoding breaks down as follows:
FF: group opcode.
15: ModR/M byte specifying CALL r/m64 with RIP-relative addressing.
15 is encoded in binary as 00010101
Mod (bits 6-7): 00
Indicates either a direct RIP-relative displacement or memory addressing with no displacement.
Reg (bits 3-5): 010
Identifies the call operation for the group
R/M (bits 0-2): 101
In 64-bit mode with Mod 00 and R/M 101, this indicates RIP-relative addressing.
<32-bit displacement>: added to RIP to compute the absolute address.
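As a quick sanity check of this breakdown, the ModR/M bit fields can be extracted directly in Python:
modrm = 0x15
mod = (modrm >> 6) & 0b11    # bits 6-7 -> 0b00
reg = (modrm >> 3) & 0b111   # bits 3-5 -> 0b010, i.e. /2 (CALL r/m64)
rm  = modrm & 0b111          # bits 0-2 -> 0b101 (RIP-relative when Mod is 00)
assert (mod, reg, rm) == (0b00, 0b010, 0b101)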
To find each protected import location and their associated dispatcher stubs we implement a trivial brute force scanner that locates all potential indirect near call/jmps via their first two opcodes.
Figure 46: Brute-force scanner to locate all possible import locations
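A sketch of such a scanner is shown below; it simply looks for the two-byte patterns of RIP-relative indirect near calls (FF 15) and jmps (FF 25), leaving all further validation to the later steps:
def scan_import_sites(code: bytes, base_rva: int) -> list:
    """Collect candidate protected-import locations by opcode pattern."""
    candidates = []
    for pattern in (b"\xff\x15", b"\xff\x25"):  # call [rip+disp32], jmp [rip+disp32]
        idx = code.find(pattern)
        while idx != -1:
            candidates.append(base_rva + idx)
            idx = code.find(pattern, idx + 1)
    return sorted(candidates)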
The provided code scans the code section of a protected binary to identify and record all locations with opcode patterns associated with indirect call and jmp instructions. This is the first step we take, upon which we apply additional verifications to guarantee it is a valid import site.
Resolving the Import Fixup Table
We have to resolve the fixup table when we recover imports for the HEADERLESS protection in order to identify which import location is associated with which dispatcher. The memory displacement at the protected import site will be paired with its resolved location inside the table. We use this displacement as a lookup into the table to find its resolved location.
Let’s take a jmp instruction to a particular import as an example.
Figure 47: Example of a jmp import instruction including its entry in the import fixup table and the associated dispatcher stub
The jmp instruction’s displacement references the memory location 0x63A88, which points to garbage data. When we inspect the entry for this import in the fixup table using the memory displacement, we can identify the location of the dispatcher stub associated with this import at 0x295E1. The loader will update the referenced data at 0x63A88 with 0x295E1, so that when the jmp instruction is invoked, execution is appropriately redirected to the dispatcher stub.
Figure 48 is the deobfuscated code in the loader responsible for resolving the fixup table. We need to mimic this behavior in order to associate which import location targets which dispatcher.
$_Loop_Resolve_ImpFixupTbl:
mov ecx, [rdx+4] ; fixup , either DLL, API, or ImpStub
mov eax, [rdx] ; target ref loc that needs to be "fixed up"
inc ebp ; update the counter
add rcx, r13 ; calculate fixup fully (r13 is imgbase)
add rdx, 8 ; next pair entry
mov [r13+rax+0], rcx ; update the target ref loc w/ full fixup
movsxd rax, dword ptr [rsi+18h] ; fetch imptbl total size, in bytes
shr rax, 3 ; account for size as a pair-entry
cmp ebp, eax ; check if done processing all entries
jl $_Loop_Resolve_ImpFixupTbl
Figure 48: Deobfuscated disassembly of the algorithm used to resolve the import fixup table
Resolving the import fixup table requires us to have first identified the data section within the protected binary and the metadata that identifies the import table (IMPTBL_OFFSET, IMPTBL_SIZE). The offset to the fixup table is from the start of the data section.
Figure 49: Python re-implementation of the algorithm used to resolve the import fixup table
Having the start of the fixup table, we simply iterate one entry at a time and identify which import displacement (location) is associated with which dispatcher stub (fixup).
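A sketch of that walk might look like the following, assuming the table offset and size have already been recovered from the metadata (IMPTBL_OFFSET, IMPTBL_SIZE):
import struct

def resolve_import_fixup_table(data: bytes, tbl_off: int, tbl_size: int) -> dict:
    """Mirror the loader loop in Figure 48: map each target reference
    location (the RVA a protected import's displacement points at) to its
    fixup, e.g. the RVA of the associated dispatcher stub."""
    fixups = {}
    for off in range(tbl_off, tbl_off + tbl_size, 8):  # 8 bytes per pair entry
        target_rva, fixup_rva = struct.unpack_from("<II", data, off)
        fixups[target_rva] = fixup_rva
    return fixups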
Recovering the Import
Having obtained all potential import locations from the brute-force scan and accounted for relocations in HEADERLESS mode, we can proceed with the final verifications to recover each protected import. The recovery process is conducted as follows:
Decode the location into a valid call or jmp instruction
Any failure in decoding indicates that the location does not contain a valid instruction and can be safely ignored.
Use the memory displacement to locate the stub for the import
In HEADERLESS mode, each displacement serves as a lookup key into the fixup table for the respective dispatcher.
Extract the obf_imp_t structure within the dispatcher
This is achieved by statically traversing a dispatcher’s disassembly listing.
The first lea instruction encountered will contain the reference to the obf_imp_t.
Process the obf_imp_t to decrypt both the DLL and API names
Utilize the two RVAs contained within the structure to locate the encrypted blobs for the DLL and API names.
Decrypt the blobs using the outlined import decryption routine.
Figure 50: Loop that recovers each protected import
The Python code iterates through every potential import location (potential_stubs) and attempts to decode each presumed call or jmp instruction to an import. A try/except block is employed to handle any failures, such as instruction decoding errors or other exceptions that may arise. The assumption is that any error invalidates our understanding of the recovery process and can be safely ignored. In the full code, these errors are logged and tracked for further analysis should they arise.
Next, the code invokes a GET_STUB_DISPLACEMENT helper function that obtains the RVA to the dispatcher associated with the import. Depending on the mode of protection, one of the following routines is used:
Figure 51: Routines that retrieve the stub RVA based on the protection mode
The recover_import_stub function is utilized to reconstruct the control flow graph (CFG) of the import stub, while _extract_lea_ref examines the instructions in the CFG to locate the lea reference to the obf_imp_t. The GET_DLL_API_NAMES function operates similarly to GET_STUB_DISPLACEMENT, accounting for slight differences depending on the protection mode:
Figure 52: Routines that decrypt the DLL and API blobs based on the protection mode
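Taken together, the recovery loop can be sketched roughly as follows; decode, get_stub_rva, and get_dll_api_names are hypothetical stand-ins for the instruction decoder, GET_STUB_DISPLACEMENT, and GET_DLL_API_NAMES:
def recover_imports(potential_sites, decode, get_stub_rva, get_dll_api_names):
    """Attempt to recover every candidate import site, skipping failures."""
    imports = {}           # import site address -> recovered details
    imp_dict_builder = {}  # DLL name -> set of API names
    for site_ea in potential_sites:
        try:
            instr = decode(site_ea)          # must decode to a valid call/jmp
            stub_rva = get_stub_rva(instr)   # dispatcher stub for this site
            dll, api = get_dll_api_names(stub_rva)
        except Exception:
            continue  # any failure invalidates the site, so skip it
        imports[site_ea] = (stub_rva, dll, api)
        imp_dict_builder.setdefault(dll, set()).add(api)
    return imports, imp_dict_builder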
After obtaining the decrypted DLL and API names, the code possesses all the necessary information to reveal the import that the protection conceals. The final individual output of each import entry is captured in a RecoveredImport object and two dictionaries:
d.imports
This dictionary maps the address of each protected import to its recovered state. It allows for the association of the complete recovery details with the specific location in the binary where the import occurs.
d.imp_dict_builder
This dictionary maps each DLL name to a set of its corresponding API names. It is used to reconstruct the import table, ensuring a unique set of DLLs and the APIs utilized by the binary.
This systematic collection and organization prepare the necessary data to facilitate the restoration of the original functionality in the deobfuscated output. In Figure 53 and Figure 54, we can observe these two containers to showcase their structure after a successful recovery:
Figure 53: Output of the d.imports dictionary after a successful recovery
Figure 54: Output of the d.imp_dict_builder dictionary after a successful recovery
Observing the Final Results
This final step—rebuilding the import table using this data—is performed by the build_import_table function in the pefile_utils.py source file. This part is omitted from the blog post due to its unavoidable length and the numerous tedious steps involved. However, the code is well-commented and structured to thoroughly address and showcase all aspects necessary for reconstructing the import table.
Nonetheless, the following figure illustrates how we generate a fully functional binary from a headerless-protected input. Recall that a headerless-protected input is a raw, headerless PE binary, almost analogous to a shellcode blob. From this blob we produce an entirely new, functioning binary with the entirety of its import protection completely restored. And we can do the same for all protection modes.
Figure 55: Display of completely restored import table for a binary protected in HEADERLESS mode
Building Relocations in Deobfuscated Binaries
Now that we can fully recover the CFG of protected binaries and provide complete restoration of the original import tables, the final phase of the deobfuscator involves merging these elements to produce a functional deobfuscated binary. The code responsible for this process is encapsulated within the recover_output64.py and pefile_utils.py Python files.
The rebuild process comprises two primary steps:
Building the Output Image Template
Building Relocations
1. Building the Output Image Template
Creating an output image template is essential for generating the deobfuscated binary. This involves two key tasks:
Template PE Image: A Portable Executable (PE) template that serves as the container for the output binary, incorporating the restoration of all obfuscated components. We also need to be cognizant of the differences between in-memory PE executables and on-disk PE executables.
Handling Different Protection Modes: Different protection modes and inputs impose different requirements.
Headerless variants have their file headers stripped. We must account for these variations to accurately reconstruct a functioning binary.
Selective protection preserves the original imports to maintain functionality as well as includes a specific import protection for all the imports leveraged within the selected functions.
2. Building Relocations
Building relocations is a critical and intricate part of the deobfuscation process. This step ensures that all address references within the deobfuscated binary are correctly adjusted to maintain functionality. It generally revolves around the following two phases:
Calculating Relocatable Displacements: Identifying all memory references within the binary that require relocation, and calculating the new addresses these references will point to. The technique we use is generating a lookup table that maps original memory references to their new relocatable addresses.
Applying Fixups: Modifying the binary’s code to reflect the new relocatable addresses. This utilizes the aforementioned lookup table to apply the necessary fixups to all instruction displacements that reference memory, ensuring that all memory references within the binary correctly point to their intended locations.
We intentionally omit the details of rebuilding the output binary image because, while essential to the deobfuscation process, it is straightforward and too tedious to be worth examining in any depth. Instead, we focus exclusively on relocations, as they are more nuanced and reveal important characteristics that are not as apparent but must be understood when rewriting binaries.
Overview of the Relocation Process
Rebuilding relocations is a critical step in restoring a deobfuscated binary to an executable state. This process involves adjusting memory references within the code so that all references point to the correct locations after the code has been moved or modified. On the x86-64 architecture, this primarily concerns instructions that use RIP-relative addressing, a mode where memory references are relative to the instruction pointer.
Relocation is necessary when the layout of a binary changes, such as when code is inserted, removed, or shifted during deobfuscation. Given our deobfuscation approach extracts the original instructions from the obfuscator, we are required to relocate each recovered instruction appropriately into a new code segment. This ensures that the deobfuscated state preserves the validity of all memory references and that the accuracy of the original control and data flow is sustained.
Understanding Instruction Relocation
Instruction relocation revolves around the following:
Instruction’s memory address: the location in memory where an instruction resides.
Instruction’s memory references: references to memory locations used by the instruction’s operands.
Consider the following two instructions as illustrations:
Figure 56: Illustration of two instructions that require relocation
Unconditional jmp instruction: This instruction is located at memory address 0x1000. It references its branch target at address 0x4E22. The displacement encoded within the instruction is 0x3E1D, which is used to calculate the branch target relative to the instruction’s position. Since it employs RIP-relative addressing, the destination is calculated by adding the displacement to the length of the instruction and its memory address.
lea instruction: This is the branch target for the jmp instruction, located at 0x4E22. It also contains a memory reference to the data segment, with an encoded displacement of 0x157.
When relocating these instructions, we must address both of the following aspects:
Changing the instruction’s address: When we move an instruction to a new memory location during the relocation process, we inherently change its memory address. For example, if we relocate this instruction from 0x1000 to 0x2000, the instruction’s address becomes 0x2000.
Adjusting memory displacements: The displacement within the instruction (0x3E1D for the jmp, 0x157 for the lea) is calculated based on the instruction’s original location and the location of its reference. If the instruction moves, the displacement no longer points to the correct target address. Therefore, we must recalculate the displacement to reflect the instruction’s new position, as shown in the worked example below.
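The displacement arithmetic itself is small enough to verify directly; the sketch below uses the jmp from the illustration (a 5-byte near jmp) and, for simplicity, assumes the branch target stays at 0x4E22:
def rip_relative_disp(instr_ea: int, instr_len: int, target_ea: int) -> int:
    """Displacement so an instruction at `instr_ea` of length `instr_len`
    references `target_ea` relative to the next instruction pointer."""
    return target_ea - (instr_ea + instr_len)

assert rip_relative_disp(0x1000, 5, 0x4E22) == 0x3E1D  # original location
assert rip_relative_disp(0x2000, 5, 0x4E22) == 0x2E1D  # after moving the jmp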
Figure 57: Updated illustration demonstrating what relocation would look like
When relocating instructions during the deobfuscation process, we must ensure accurate control flow and data access. This requires us to adjust both the instruction’s memory address and any displacements that reference other memory locations. Failing to update these values invalidates the recovered CFG.
What Is RIP-Relative Addressing?
RIP-relative addressing is a mode where the instruction references memory at an offset relative to the RIP (instruction pointer) register, which points to the next instruction to be executed. Instead of using absolute addresses, the instruction encapsulates the referenced address via a signed 32-bit displacement from the current instruction pointer.
Addressing relative to the instruction pointer exists on x86 as well, but only for control-transfer instructions that support a relative displacement (e.g., JCC conditional instructions, near CALLs, and near JMPs). The x64 ISA extended this to account for almost all memory references being RIP-relative. For example, most data references in x64 Windows binaries are RIP-relative.
An excellent tool to visualize the intricacies of a decoded Intel x64 instruction is ZydisInfo. Here we use it to illustrate how a LEA instruction (encoded as 48 8D 15 1B 51 06 00) references RIP-relative memory at 0x6511b.
Figure 58: ZydisInfo output for the lea instruction
For most instructions, the displacement is encoded in the final four bytes of the instruction. When an immediate value is stored at a memory location, the immediate follows the displacement. Immediate values are restricted to a maximum of 32 bits, meaning 64-bit immediates cannot be used following a displacement. However, 8-bit and 16-bit immediate values are supported within this encoding scheme.
Figure 59: ZydisInfo output for the mov instruction storing an immediate operand
Displacements for control-transfer instructions are encoded as immediate operands, with the RIP register implicitly acting as the base. This is evident when decoding a jnz instruction, where the displacement is directly embedded within the instruction and calculated relative to the current RIP.
Figure 60: ZydisInfo output for the jnz instruction with an immediate operand as the displacement
Steps in the Relocation Process
For rebuilding relocations we take the following approach:
Rebuilding the code section and creating a relocation map: With the recovered CFG and imports, we commit the changes to a new code section that contains the fully deobfuscated code. We do this by:
Function-by-function processing: rebuild each function one at a time. This allows us to manage the relocation of each instruction within its respective function.
Tracking instruction locations: As we rebuild each function, we track the new memory locations of each instruction. This involves maintaining a global relocation dictionary that maps original instruction addresses to their new addresses in the deobfuscated binary. This dictionary is crucial for accurately updating references during the fixup phase.
Applying fixups: After rebuilding the code section and establishing the relocation map, we proceed to modify the instructions so that their memory references point to the correct locations in the deobfuscated binary. This restores the binary’s complete functionality and is achieved by adjusting any memory references to code or data that an instruction may have.
Rebuilding the Code Section and Creating a Relocation Map
To construct the new deobfuscated code segment, we iterate over each recovered function and copy all instructions sequentially, starting from a fixed offset—for example, 0x1000. During this process, we build a global relocation dictionary (global_relocs) that maps each instruction to its relocated address. This mapping is essential for adjusting memory references during the fixup phase.
The global_relocs dictionary uses a tuple as the key for lookups, and each key is associated with the relocated address of the instruction it represents. The tuple consists of the following three components:
Original starting address of the function: The address where the function begins in the protected binary. It identifies the function to which the instruction belongs.
Original instruction address within the function: The address of the instruction in the protected binary. For the first instruction in a function, this will be the function’s starting address.
Synthetic boundary JMP flag: A boolean value indicating whether the instruction is a synthetic boundary jump introduced during normalization. These synthetic instructions were not present in the original obfuscated binary, and we need to account for them specifically during relocation because they have no original address.
Figure 61: Illustration of how the new code segment and relocation map are generated
The following Python code implements the logic outlined in Figure 61. Error handling and logging code has been stripped for brevity.
Figure 62: Python logic that implements the building of the code segment and generation of the relocation map
Initialize current offset: Set the starting point in the new image buffer where the code section will be placed. The variable curr_off is initialized to starting_off, which is typically 0x1000. This represents the conventional start address of the .text section in PE files. For SELECTIVE mode, this will be the offset to the start of the protected function.
Iterate over recovered functions: Loop through each recovered function in the deobfuscated control flow graph (d.cfg). func_ea is the original function entry address, and rfn is a RecoveredFunc object encapsulating the recovered function’s instructions and metadata.
Handle the function start address first
Set function’s relocated start address: Assign the current offset to rfn.reloc_ea, marking where this function will begin in the new image buffer.
Update global relocation map: Add an entry to the global relocation map d.global_relocs to map the original function address to its new location.
Iterate over each recovered instruction: Loop through the normalized flow of instructions within the function. We use the normalized_flow as it allows us to iterate over each instruction linearly as we apply it to the new image.
Set instruction’s relocated address: Assign the current offset to r.reloc_ea, indicating where this instruction will reside in the new image buffer.
Update global relocation map: Add an entry to d.global_relocs for the instruction, mapping its original address to the relocated address.
Update the output image: Write the instruction bytes to the new image buffer d.newimgbuffer at the current offset. If the instruction was modified during deobfuscation (r.updated_bytes), use those bytes; otherwise, use the original bytes (r.instr.bytes).
Advance the offset: Increment curr_off by the size of the instruction to point to the next free position in the buffer and move on to the next instruction until the remainder are exhausted.
Align the current offset to a 16-byte boundary: After processing all instructions in a function, align curr_off to the next 16-byte boundary. We pad with 8 bytes (an arbitrary pointer-sized value) after the last instruction so that the next function does not conflict with the final instruction of the previous one. This further ensures proper memory alignment for the next function, which is essential for performance and correctness on x86-64 architectures. Then repeat the process from Step 2 until all functions have been exhausted.
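A condensed sketch of this pass is shown below. It assumes each recovered function is available as an ordered list of (original address, instruction bytes, synthetic flag) tuples; the real library operates on richer RecoveredFunc and RecoveredInstr objects:
def rebuild_code_section(functions, imgbuf: bytearray, starting_off: int = 0x1000) -> dict:
    """Lay out every recovered function sequentially and record the mapping
    from original instruction addresses to their relocated offsets."""
    global_relocs = {}
    curr_off = starting_off
    for func_ea, instrs in functions.items():
        # Map the function entry point to its relocated start address.
        global_relocs[(func_ea, func_ea, False)] = curr_off
        for orig_ea, instr_bytes, is_synthetic in instrs:
            global_relocs[(func_ea, orig_ea, is_synthetic)] = curr_off
            imgbuf[curr_off:curr_off + len(instr_bytes)] = instr_bytes
            curr_off += len(instr_bytes)
        # Pad past the last instruction, then align the next function to 16 bytes.
        curr_off = (curr_off + 8 + 15) & ~15
    return global_relocs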
This step-by-step process accurately rebuilds the deobfuscated binary’s executable code section. By relocating each instruction, the code prepares the output template for the subsequent fixup phase, where references are adjusted to point to their correct locations.
Applying Fixups
After building the deobfuscated code section and relocating each recovered function in full, we apply fixups to correct addresses within the recovered code. This process adjusts the instruction bytes in the new output image so that all references point to the correct locations. It is the final step in reconstructing a functional deobfuscated binary.
We categorize fixups into three distinct categories, based primarily on whether they apply to control flow or data flow instructions. We further distinguish between two types of control flow instructions: standard branching instructions and those introduced by the obfuscator through the import protection. Each type has specific nuances that require tailored handling, allowing us to apply precise logic to each category.
Import Relocations: These involve calls and jumps to recovered imports.
Control Flow Relocations: All standard control-flow branching instructions.
Data Flow Relocations: Instructions that reference static memory locations.
Using these three categorizations, the core logic boils down to the following two phases:
Resolving displacement fixups
Differentiate between displacements encoded as immediate operands (branching instructions) and those in memory operands (data accesses and import calls).
Calculate the correct fixup values for these displacements using the d.global_relocs map generated prior.
Update the output image buffer
Once the displacements have been resolved, write the updated instruction bytes into the new code segment to reflect the changes permanently.
To achieve this, we utilize several helper functions and lambda expressions. The following is a step-by-step explanation of the code responsible for calculating the fixups and updating the instruction bytes.
Figure 63: Helper routines that aid in applying fixups
Define lambda helper expressions
PACK_FIXUP: packs a 32-bit fixup value into a little-endian byte array.
CALC_FIXUP: calculates the fixup value by computing the difference between the destination address (dest) and the end of the current instruction (r.reloc_ea + size), ensuring it fits within 32 bits.
IS_IN_DATA: checks if a given address is within the data section of the binary. We do not relocate these addresses, as we preserve the data section at its original location.
Resolve fixups for each instruction
Import and data flow relocations
Utilize the resolve_disp_fixup_and_apply helper function as both encode the displacement within a memory operand.
Control flow relocations
Use the resolve_imm_fixup_and_apply helper as the displacement is encoded in an immediate operand.
During our CFG recovery, we transformed each jmp and jcc instruction to its near jump equivalent (from 2 bytes to 6 bytes) to avoid the shortcomings of 1-byte short branches.
We force a 32-bit displacement for each branch to guarantee a sufficient range for every fixup.
Update the output image buffer
Decode the updated instruction bytes so that the change is reflected in the RecoveredInstr that represents it.
Write the updated bytes to the new image buffer
updated_bytes reflects the final opcodes for a fully relocated instruction.
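As a rough sketch of the helpers described above, they might look like the following; the exact signatures and the data-section bounds are assumptions:
import struct

PACK_FIXUP = lambda value: struct.pack("<i", value)  # 32-bit little-endian bytes

def CALC_FIXUP(dest: int, reloc_ea: int, size: int) -> int:
    """Fixup value: destination minus the end of the relocated instruction."""
    delta = dest - (reloc_ea + size)
    assert -(1 << 31) <= delta < (1 << 31), "fixup must fit in 32 bits"
    return delta

def IS_IN_DATA(addr: int, data_start: int, data_end: int) -> bool:
    """True if `addr` falls inside the preserved data section."""
    return data_start <= addr < data_end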
With the helpers in place, the following Python code implements the final processing for each relocation type.
Figure 64: The three core loops that address each relocation category
Import Relocations: The first for loop handles fixups for import relocations, utilizing data generated during the Import Recovery phase. It iterates over every recovered instruction r within the rfn.relocs_imports cache and does the following:
Prepare updated instruction bytes: initialize r.updated_bytes with a mutable copy of the original instruction bytes to prepare it for modification.
Retrieve import entry and displacement: obtain the import entry from the imports dictionary d.imports and retrieve the new RVA from d.import_to_rva_map using the import’s API name.
Apply fixup: use the resolve_disp_fixup_and_apply helper to calculate and apply the fixup for the new RVA. This adjusts the instruction’s displacement to correctly reference the imported function.
Update image buffer: write r.updated_bytes back into the new image using update_reloc_in_img. This finalizes the fixup for the instruction in the output image.
Control Flow Relocations: The second for loop handles fixups for control flow branching relocations (call, jmp, jcc). Iterating over each entry in rfn.relocs_ctrlflow, it does the following:
Retrieve destination: extract the original branch destination target from the immediate operand.
Get relocated address: reference the relocation dictionary d.global_relocs to obtain the branch target’s relocated address. If it’s a call target, then we specifically look up the relocated address for the start of the called function.
Apply fixup: use resolve_imm_fixup_and_apply to adjust the branch target to its relocated address.
Update buffer: finalize the fixup by writing r.updated_bytes back into the new image using update_reloc_in_img.
Data Flow Relocations: The final loop handles the resolution of all static memory references stored within rfn.relocs_dataflow. First, we establish a list of KNOWN instructions that require data reference relocations. Given the extensive variety of such instructions, this categorization simplifies our approach and ensures a comprehensive understanding of all possible instructions present in the protected binaries. Following this, the logic mirrors that of the import and control flow relocations, systematically processing each relevant instruction to accurately adjust their memory references.
After reconstructing the code section and establishing the relocation map, we proceeded to adjust each instruction categorized for relocation within the deobfuscated binary. This was the final step in restoring the output binary’s full functionality, as it ensures that each instruction accurately references the intended code or data segments.
Observing the Results
To demonstrate our deobfuscation library for ScatterBrain, we conduct a test study showcasing its functionality, selecting three samples: a POISONPLUG.SHADOW headerless backdoor and two embedded plugins.
We develop a Python script, example_deobfuscator.py, that consumes our library and implements all of the recovery techniques outlined earlier. Figure 65 and Figure 66 showcase the code within our example deobfuscator:
Figure 65: The first half of the Python code in example_deobfuscator.py
Figure 66: The second half of the Python code in example_deobfuscator.py
Running example_deobfuscator.py, we can see the following. Note that it takes a while, given that we have to emulate more than 16,000 instruction dispatchers found within the headerless backdoor.
Figure 67: Output from running example_deobfuscator.py
Focusing on the headerless backdoor, both for brevity and because it is the most involved to deobfuscate, we first observe its initial state inside the IDA Pro disassembler before we inspect the output from our deobfuscator. We can see that it is virtually impenetrable to analysis.
Figure 68: Observing the obfuscated headerless backdoor in IDA Pro
After running our example deobfuscator and producing a brand new deobfuscated binary, we can see the drastic difference in output. All the original control flow has been recovered, all of the protected imports have been restored, and all required relocations have been applied. We also account for the PE header that ScatterBrain deliberately strips from the headerless backdoor.
Figure 69: Observing the deobfuscated headerless backdoor in IDA Pro
Given we produce functional binaries as part of the output, the resulting deobfuscated binary can either be run directly or debugged within the debugger of your choice.
Figure 70: Debugging the deobfuscated headerless backdoor in everyone’s favorite debugger
Conclusion
In this blog post, we delved into the sophisticated ScatterBrain obfuscator used by POISONPLUG.SHADOW, an advanced modular backdoor leveraged by specific China-nexus threat actors GTIG has been tracking since 2022. Our exploration of ScatterBrain highlighted the intricate challenges it poses for defenders. By systematically outlining and addressing each protection mechanism, we demonstrated the significant effort required to create an effective deobfuscation solution.
Ultimately, we hope that our work provides valuable insights and practical tools for analysts and cybersecurity professionals. Our dedication to advancing methodologies and fostering collaborative innovation ensures that we remain at the forefront of combating sophisticated threats like POISONPLUG.SHADOW. Through this exhaustive examination and the introduction of our deobfuscator, we contribute to the ongoing efforts to mitigate the risks posed by highly obfuscated malware, reinforcing the resilience of cybersecurity defenses against evolving adversarial tactics.
Special thanks to Conor Quigley and Luke Jenkins from the Google Threat Intelligence Group for their contributions to both Mandiant and Google’s efforts in understanding and combating the POISONPLUG threat. We also appreciate the ongoing support and dedication of the teams at Google, whose combined efforts have been crucial in enhancing our cybersecurity defenses against sophisticated adversaries.
Google Kubernetes Engine (GKE) provides users with a lot of options when it comes to configuring their cluster networks. But with today’s highly dynamic environments, GKE platform operators tell us that they want more flexibility when it comes to changing up their configurations. To help, today we are excited to announce a set of features and capabilities designed to make GKE cluster and control-plane networking more flexible and easier to configure.
Specifically, we’ve decoupled GKE control-plane access from node-pool IP configuration, providing you with granular control over each aspect. Furthermore, we’ve introduced enhancements to each sub-component, including:
Cluster control-plane access
Added a DNS-based approach to accessing the control plane. In addition, you can now enable or disable IP-based or DNS-based access to control-plane endpoints at any time.
Node-pool configuration
Each node-pool now has its own configuration, and you can detach or attach a public IP for each node-pool independently at any time during the node-pool’s lifecycle.
You can now change a cluster’s default configuration for attaching a public IP to newly provisioned node pools at any time. This configuration change doesn’t require you to re-create your cluster.
Regardless of how you configure a cluster’s control-plane access, or attach and detach a public IP from a node pool, the traffic between nodes to the cluster’s control plane always remains private, no matter what.
With these new changes, going forward:
GKE platform admins and operators can now easily switch between less restrictive networking configurations (e.g., control plane and/or nodes accessible from the internet) and the most restrictive configurations, where only authorized users can access the control plane, and nodes are not exposed to the internet. The decision to make a cluster public or private is no longer immutable, giving customers more flexibility without having to make upfront decisions.
There are more ways to connect to the GKE control plane. In addition to IP-based access, we now introduce DNS-based access to the control plane. You can use IAM and authentication-based policies to add policy-based, dynamic security to access the GKE control plane.
Previous challenges
Due to the complexities and varieties of customers’ workloads and use cases, it is important to provide a simple and flexible way for customers to configure and operate the connectivity to GKE control plane and GKE nodes.
Control-plane connectivity and node-pool configuration are a key part of configuring GKE. We’ve continuously enhanced GKE’s networking capabilities to address customer concerns, providing more options for secure and flexible connectivity, including capabilities such as private clusters, VPC Peering-based connectivity, Private Service Connect-based connectivity, and private/public node pools.
While there have been a lot of improvements in configuration, usability and secure connectivity, there were still certain configuration challenges when it comes to complexity, usability and scale, such as:
Inflexible GKE control plane and node access configuration: GKE customers needed to make an upfront, one-way decision on whether to create a private or public cluster during the cluster creation process. This configuration could not be changed unless the cluster was re-created.
The node pool network IP/type configuration could not be changed once a cluster was created.
Confusing terminology, such as “public” and “private” clusters, which made it unclear whether the configuration applied to control-plane access or to node-pool configuration.
Benefits of the new features
With these changes to GKE networking, we hope you will see benefits in the following areas.
Flexibility:
Clusters now have unified and flexible configuration. Clusters with or without external endpoints all share the same architecture and support the same functionality. You can secure access to clusters based on controls and best practices that meet your needs. All communication between the nodes in your cluster and the control plane use a private internal IP address.
You can change the control plane access and cluster node configuration settings at any time without having to re-create the cluster.
Security:
DNS-based endpoints with VPC Service Controls provide a multi-layer security model that protects your cluster against unauthorized networks as well as from unauthorized identities accessing the control plane. VPC Service Controls integrate with Cloud Audit Logs to monitor access to the control plane.
Private nodes and the workloads running on them are not directly accessible from the public internet, significantly reducing the potential for external attacks targeting your workloads.
You can block control plane access from Google Cloud external IP addresses or from external IP addresses to fully isolate the cluster control plane and reduce exposure to potential security threats.
Compliance: If you work in an industry with strict data-access and storage regulations, private nodes help ensure that sensitive data remains within your private network.
Control: Private nodes give you granular control over how traffic flows in and out of your cluster. You can configure firewall rules and network policies to allow only authorized communication. If you operate across a multi-cloud environment, private nodes can help you establish secure and controlled communication between different environments.
Getting started
Accessing the cluster control plane
There are now several ways to access a cluster’s control plane: via traditional public or private IP-based endpoints, and via the new DNS-based endpoint. Whereas IP-based endpoints entail tedious IP address configuration (including static authorized network configuration, allowing private access from any region, etc.), DNS-based endpoints offer a simplified, IAM policy-based, dynamic, flexible and more secure way to access a cluster’s control plane.
With these changes, you can now configure the cluster’s control plane to be reachable by all three endpoints (DNS-based, public or private IP-based) at same time, locking the cluster down to the granularity of a single endpoint in any permutation that you would like. You can apply your desired configuration at cluster creation time or adjust it later.
Here’s how to configure access for GKE node-pools.
GKE Standard Mode: In GKE Standard mode of operation, a private IP is always attached to every node no matter what. This private IP is used for private connectivity to the cluster’s control plane.
You can add or remove a public IP to all nodes in a node-pool at node-pool creation time. This configuration can be performed on each node-pool independently.
Each cluster has a default behavior flag that is used at node-pool creation time if the setting is not explicitly specified for the node-pool.
Note: Mutating a cluster’s default state does not change behavior of existing node pools. The new state is used only when a new node-pool is being created.
GKE Autopilot mode of operation: All workloads running on nodes with or without a public IP are based on the cluster’s default behavior. You can override the cluster’s default behavior on each workload independently by adding the following nodeSelector to your Pod specification:
However, overriding a cluster’s default behavior causes all workloads for which the behavior hasn’t been explicitly set to be rescheduled to run on nodes that match the cluster’s default behavior.
Conclusion
Given the complexity and variety of workloads that run on GKE, it’s important to have a simple and flexible way to configure and operate the connectivity to the GKE control plane and nodes. We hope these enhancements to GKE control-plane connectivity and node-pool configuration will bring new levels of flexibility and simplicity to GKE operations. For further details and documentation, please see:
More and more customers deploy their workloads on Google Cloud. But what if your workloads are sitting in another cloud? Planning, designing, and implementing a migration of your workloads, data, and processes is not an easy task. It gets even harder if you have to meet requirements that have an impact on the migration, such as avoiding downtime (also known as a zero-downtime migration). Moreover, some migrations require a certain amount of refactoring, for example, adapting your workloads to a new environment. This opens up a series of challenges, especially if you’re dealing with third-party or legacy software. You might also need to adapt your deployment and operational processes to work with your new environment.
And what if you don’t want to migrate all your workloads? Even if you’re not moving everything to Google Cloud, adopting a multicloud approach is still a migration. Many organizations choose to keep some workloads in their current cloud provider while moving others to Google Cloud.
Although managing workloads across multiple clouds has its own challenges, particularly when it comes to workload distribution and inter-cloud connectivity, a well-executed multicloud strategy lets you maintain flexibility, avoid vendor lock-in, and improve system resilience.
To help you in your migration journey, we published a series of reference guides about migrating from Amazon Web Services (AWS) to Google Cloud. This series aims to help you design, plan, and implement a migration process from AWS to Google Cloud. It can also help decision makers who are evaluating migration opportunities and want to explore what it looks like to migrate. For example, the series includes guides that cover migration journeys, such as:
These guides follow the phases of the Google Cloud migration framework (assess, plan, migrate, optimize) in the context of specific AWS to Google Cloud migration use cases.
This approach helps you avoid risky, big-bang migrations when working on each migration plan task. For details about completing each task of this migration plan, see the AWS to Google Cloud migration guides.
Ready to learn more? Learn more about migrating to Google Cloud and discover how Google Cloud Consulting can help you learn, build, operate and succeed.
Organizations are increasingly using Confidential Computing to help protect their sensitive data in use as part of their data protection efforts. Today, we are excited to highlight new Confidential Computing capabilities that make it easier for organizations of all sizes to adopt this important privacy-preserving technology.
1. Confidential GKE Nodes on the general-purpose C3D machine series for GKE Standard mode, generally available
Previously, Confidential GKE Nodes were only available on two machine series powered by the 2nd and 3rd Gen AMD EPYC™ processors: the general-purpose N2D machine series and the compute-optimized C2D machine series. Today, Confidential GKE Nodes are also generally available on the newer, more performant C3D machine series with AMD SEV in GKE Standard mode.
The general-purpose C3D machine series is powered by 4th Gen AMD EPYC™ (Genoa) processors to deliver optimal, reliable, and consistent performance. Customers often use Confidential GKE Nodes to address potential concerns about cloud provider risk, especially since no code changes are required to enable it.
2. Confidential GKE Nodes on GKE Autopilot mode, generally available
Google Kubernetes Engine (GKE) offers two modes of operation: Standard and Autopilot. In Standard mode, you manage the underlying infrastructure, including configuring the individual nodes. In Autopilot mode, GKE manages the underlying infrastructure such as node configuration, autoscaling, auto-upgrades, baseline security configurations, and baseline networking configuration.
Previously, Confidential GKE Nodes were only offered on GKE Standard mode. Today, Confidential GKE Nodes are generally available on GKE Autopilot mode with the general purpose N2D machine series running with AMD Secure Encryption Virtualization (AMD SEV). This means that you can now use Confidential GKE Nodes to protect your data in use without having to manage the underlying infrastructure.
Confidential GKE Nodes can be enabled on new GKE Autopilot clusters with no code changes. Simply add the --enable-confidential-nodes flag during new cluster creation. Additional pricing does apply, and this new offering is available in all regions that offer the N2D machine series. Go here to get started today.
3. Confidential Space with Intel TDX-based Confidential VMs, in preview
Confidential Space allows multiple parties to securely collaborate on computations using their combined data without revealing their individual datasets to each other or to the operator enabling this collaboration. This is achieved by isolating data within a Trusted Execution Environment (TEE).
We are seeing adoption and need for these capabilities that are putting sensitive data to use in a private and compliant manner in financial services, Web3, and other industries.
Confidential Space is built on Confidential VMs. Previously, Confidential Space was only available on Confidential VMs with AMD Secure Encryption Virtualization (AMD SEV) enabled. Today, Confidential Space is also available on Confidential VMs with Intel Trust Domain Extensions (Intel TDX) enabled in preview.
Confidential Space with Intel TDX enabled offers data confidentiality, data integrity, and hardware-rooted attestation, further enhancing security. Confidential Space with Intel TDX runs on the general purpose C3 machine series, which are powered by 4th Gen Intel Xeon Scalable CPUs.
These performant C3 VMs also have Intel Advanced Matrix Extensions (Intel AMX), a new built-in accelerator that helps improve the performance of deep-learning training and inference on the CPU, on by default. Confidential Space supporting the additional confidential computing type provides users greater flexibility in selecting the right CPU platform based on performance, cost, and security requirements. Learn more about Confidential Space or check out this new Youtube video about Intel TDX.
4. Confidential VMs with NVIDIA H100 GPUs, in preview
We expanded our capabilities for secure computation last year when we unveiled Confidential VMs on the accelerator-optimized A3 machine series with NVIDIA H100 GPUs. This offering extends hardware-based data protection from the CPU to GPUs, helping to ensure that artificial intelligence (AI), machine learning (ML), and scientific simulation workloads that leverage GPUs remain protected while data is in use.
Today, these confidential GPUs are available in preview. Confidential VMs on the A3 machine series protect data and code in use, which means sensitive training data or data labels, proprietary models or model weights, and top-secret queries remain protected even during compute-intensive operations like training, fine-tuning, or serving.
This groundbreaking technology combines the power of Confidential Computing and accelerated computing to enable customers to harness the potential of AI while helping to maintain high levels of data security and IP protection, which can open new possibilities for innovation in regulated industries and collaborative AI development.
You can sign up here to try Confidential VMs with NVIDIA H100 GPUs. To learn more, check out our previous announcements on this offering here and here.
What’s coming in 2025
Google Cloud is committed to expanding Confidential Computing to more products and services because we want customers to have easy access to the latest in security innovation. Whether that’s adding Confidential Computing support to newer hardware or on accelerators or to services like GKE Autopilot, we aim to provide our customers with a comprehensive set of Confidential Computing solutions.
Confidential Computing is an essential technology for protecting sensitive data in the cloud, and we look forward to innovating with you in this space. You can explore the Confidential Computing products here.
Today, an increasing number of organizations are using GPUs to run inference1 on their AI/ML models. Since the number of GPUs needed to serve a single inference workload varies, organizations need more granularity in the number of GPUs in their virtual machines (VMs) to keep costs low while scaling with user demand.
You can use A3 High VMs powered by NVIDIA H100 80GB GPUs in multiple generally available machine types with 1, 2, 4, and 8 GPUs; the 1-, 2-, and 4-GPU machine types are new.
Accessing smaller H100 machine types
All A3 machine types are available through the fully managed Vertex AI, as nodes through Google Kubernetes Engine (GKE), and as VMs through Google Compute Engine.
Vertex AI Model Garden and Online Prediction (Spot)
Spot
DWS Flex Start mode
a3-highgpu-2g (new; 2 GPUs, 160 GB)
Vertex AI Model Garden and Online Prediction (On-demand*, Spot)
a3-highgpu-4g (new; 4 GPUs, 320 GB)
a3-highgpu-8g (8 GPUs, 640 GB)
Vertex AI Online Prediction (On-Demand, Spot)
Vertex AI Training (On-demand, Spot, DWS Flex Start mode )
On-demand
Spot
DWS Flex Start mode
DWS Calendar mode
a3-megagpu-8g (8 GPUs, 640 GB)
* Available only through Model Garden owned capacity.
Google Kubernetes Engine
For almost a decade, GKE has been the platform-of-choice for running web applications and microservices, and now it provides a cost efficient, highly scalable, and open platform for training and serving AI workloads. GKE Autopilot reduces operational cost and offers workload-level SLAs, and is a fantastic choice for inference workloads — bring your workload and let Google do the rest. You can use the 1, 2, and 4 A3 High GPU machine types through both GKE Standard and GKE Autopilot modes of operation.
Below are two examples of creating node pools in your GKE cluster with a3-highgpu-1g machine type using Spot VMs and Dynamic Workload Scheduler Flex Start mode.
Using Spot VMs with GKE
Here’s how to request and deploy an a3-highgpu-1g Spot VM on GKE using the gcloud API.
Vertex AI is Google Cloud’s fully managed, unified AI development platform for building and using predictive and generative AI. With the new 1, 2, and 4 A3 High GPU machine types, Model Garden customers can deploy hundreds of open models cost-effectively and with strong performance.
What our customers are saying
“We use Google Kubernetes Engine to run the backend for our AI-assisted software development product. Smaller A3 machine types have enabled us to reduce the latency of our real-time code assist models by 36% compared to A2 machine types, significantly improving user experience.” – Eran Dvey Aharon, VP R&D, Tabnine
Get started today
At Google Cloud, our goal is to provide you with the flexibility you need to run inference for your AI and ML models cost-effectively as well as with great performance. The availability of A3 High VMs using NVIDIA H100 80GB GPUs in smaller machine types provides you with the granularity you need to scale with user demand while keeping costs in check.
1. AI or ML inference is the process by which a trained AI model uses its training data to calculate output data or make predictions about new data points or scenarios.
What goes into your Kubernetes software? Understanding the origin of the software components you deploy is crucial for mitigating risks and ensuring the trustworthiness of your applications. To do this, you need to know your software supply chain.
Google Cloud is committed to providing tools and features that enhance software supply chain transparency, and today we’re excited to announce that you can now verify the integrity of Google Kubernetes Engine components with SLSA, the Supply-chain Levels for Software Artifacts framework.
SLSA is a set of standards that can help attest the integrity of software components. We’ve begun to publish SLSA Verification Summary Attestations (VSAs) for GKE’s Container-Optimized OS (COS) virtual machine (VM) images to GitHub. We’ve also enhanced Google Compute Engine (GCE) audit logs to include VM image identifiers, and begun to route GKE Kubernetes Control Plane GCE audit logs to customer projects. This allows you to use SLSA VSAs to authenticate the VM images used in your GKE clusters.
GCE audit logs improvements
Google Compute Engine audit logs now include the GCE image ID in records related to instance creation events (such as insert, bulk insert, and update operations) when an instance is created from an image. This allows you to trace the precise image used to launch each instance, even if an image is deleted and recreated with the same name, as each image has a unique, immutable ID.
The ID is used to uniquely identify the image when verifying its provenance and integrity using the SLSA VSAs described below. This can provide an invaluable audit trail for security and compliance purposes.
We introduced a new attachDisks field under usedResources in the metadata field that records, for each attached disk, the source image name, the source image ID, and whether it was used as the boot disk. You can find this information in the Logs Explorer using a query like:
GCE instance insert audit log record with VM image id field
GKE Control Plane audit and integrity logs now forwarded to your project
New GKE clusters running version 1.29 or later now forward their Control Plane GCE audit log records for insert, bulk insert, and update operations, and their Shielded VM integrity logs, to the customer project hosting the GKE cluster.
You can identify Control Plane VM instance log records by the presence of a new metadata field. To view the logs, use a Logs Explorer query like:
resource.type="gce_instance" AND (jsonPayload.metadata.isKubernetesControlPlaneVM="true" OR protoPayload.metadata.isKubernetesControlPlaneVM="true")
Additionally, the forwarded logs now include a new parentResource map under metadata with two fields: parentResourceType, whose value is “gke_cluster”, and parentResourceId, whose value is the cluster hash. This lets you tell which cluster the VMs in the forwarded log records belong to if you have more than one cluster in a project.
This enhancement allows you to gain visibility into the VM images used to create the Control Plane VMs, and the integrity status of the instances, further strengthening your ability to audit and secure your GKE clusters.
Kubernetes Control Plane VM instance audit log record forwarded to the customer project
GKE bolsters VM image verification with SLSA VSAs
Google Kubernetes Engine (GKE) is taking a significant step forward in supply chain security by publishing SLSA Verification Summary Attestations (VSAs) for GKE Container-Optimized OS (COS)-based VM images. These attestations are available in the Google Cloud GKE VSA GitHub repository. This initiative can provide you with cryptographic proof of the integrity and provenance of the GKE VM images you’re using, helping ensure that they haven’t been tampered with and that they originate from a trusted source.
To locate the VSA for the COS VM image used in your GKE VM instances, look in the folders at the root of the GitHub repository:
The folder gke-master-images:78064567238 contains VSAs for the Kubernetes control plane VM images.
The folder gke-node-images:238739202978 contains the VSAs for the node VM images.
Using the image ID found in the audit logs, you can locate the matching VSA. For example, gke-node-images:238739202978/gke-12811-gke1044000-cos-109-17800-218-52-c-pre:3031893369549136349.intoto.jsonl is the VSA for the node VM image with an ID of 3031893369549136349.
Independent verification with slsa-verifier
You can independently verify the authenticity of GKE VM images using the open-source slsa-verifier tool. This tool allows you to validate the integrity of your GKE VM images by combining the GCE image name and ID, the VSA, and Google’s VSA public signing key.
The public key is:

-----BEGIN PUBLIC KEY-----
MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEeGa6ZCZn0q6WpaUwJrSk+PPYEsca
3Xkk3UrxvbQtoZzTmq0zIYq+4QQl0YBedSyy+XcwAMaUWTouTrB05WhYtg==
-----END PUBLIC KEY-----
To verify a VM image, run slsa-verifier with the following inputs (an example invocation is shown after this list):
VM_IMAGE_PROJECT_NAME is the name of the project hosting the VM image (e.g., gke-node-images)
VM_IMAGE_NAME is the image name (e.g., gke-12811-gke1044000-cos-109-17800-218-52-c-pre)
VM_IMAGE_ID is the image ID (e.g., 3031893369549136349)
KEY_PATH is the path to the saved public key
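A sketch of the invocation, using slsa-verifier’s verify-vsa subcommand with the placeholders above, might look like the following. Treat it as illustrative rather than definitive: flag names can vary between slsa-verifier releases, VSA_PATH stands for the downloaded .intoto.jsonl attestation, and VERIFIER_ID stands for the verifier identity documented in the GKE VSA user guide.

# VSA_PATH is the .intoto.jsonl attestation downloaded from the GitHub repository;
# VERIFIER_ID is the verifier identity documented in the GKE VSA user guide.
slsa-verifier verify-vsa \
    --attestation-path=VSA_PATH \
    --resource-uri=gce_image://VM_IMAGE_PROJECT_NAME:VM_IMAGE_NAME \
    --subject-digest=gce_image_id:VM_IMAGE_ID \
    --verifier-id=VERIFIER_ID \
    --verification-key-path=KEY_PATH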
Next steps
These enhancements reflect Google Cloud’s commitment to providing you with the tools and capabilities needed to help build and manage secure, transparent software supply chains. To learn more about how to verify the integrity of the GKE control plane, check out the user guide. You can find more information on securing your GKE cluster in the documentation.
Comprehensive agent evaluation is essential for building the next generation of reliable AI. It’s not enough to simply check the outputs; we need to understand the “why” behind an agent’s actions – its reasoning, decision-making process, and the path it takes to reach a solution.
That’s why today we’re thrilled to announce that the Vertex AI Gen AI evaluation service is now in public preview. This new feature empowers developers to rigorously assess and understand their AI agents. It includes a powerful set of evaluation metrics specifically designed for agents built with different frameworks, and provides native agent inference capabilities to streamline the evaluation process.
In this post, we’ll explore how the evaluation metrics work and share an example of how you can apply them to your agents.
Evaluate agents using Vertex AI Gen AI evaluation service
Our evaluation metrics can be grouped into two categories: final response and trajectory evaluation.
Final response asks a simple question: does your agent achieve its goals? You can define custom final response criteria to measure success according to your specific needs. For example, you can assess whether a retail chatbot provides accurate product information or if a research agent summarizes findings effectively, using appropriate tone and style.
To look below the surface, we offer trajectory evaluation to analyze the agent’s decision-making process. Trajectory evaluation is crucial for understanding your agent’s reasoning, identifying potential errors or inefficiencies, and ultimately improving performance. We offer six trajectory evaluation metrics to help you answer these questions:
1. Exact match: Requires the AI agent to produce a sequence of actions (a “trajectory”) that perfectly mirrors the ideal solution.
2. In-order match: The agent’s trajectory needs to include all the necessary actions in the correct order, but it might also include extra, unnecessary steps. Imagine following a recipe correctly but adding a few extra spices along the way.
3. Any-order match: Even more flexible, this metric only cares that the agent’s trajectory includes all the necessary actions, regardless of their order. It’s like reaching your destination, regardless of the route you take.
4. Precision: This metric focuses on the accuracy of the agent’s actions. It calculates the proportion of actions in the predicted trajectory that are also present in the reference trajectory. A high precision means the agent is making mostly relevant actions.
5. Recall: This metric measures the agent’s ability to capture all the essential actions. It calculates the proportion of actions in the reference trajectory that are also present in the predicted trajectory. A high recall means the agent is unlikely to miss crucial steps. (An illustrative sketch of how precision and recall are computed appears after this list.)
6. Single-tool use: This metric checks for the presence of a specific action within the agent’s trajectory. It’s useful for assessing whether an agent has learned to utilize a particular tool or capability.
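To make the precision and recall definitions concrete, here is a minimal, illustrative Python sketch of the two ratios computed over lists of tool-call names; it is a simplification for intuition, not the service’s implementation.

# Illustrative only: trajectory precision and recall over lists of tool-call names.
def trajectory_precision(predicted, reference):
    # Fraction of predicted actions that also appear in the reference trajectory.
    if not predicted:
        return 0.0
    return sum(1 for action in predicted if action in reference) / len(predicted)

def trajectory_recall(predicted, reference):
    # Fraction of reference actions that also appear in the predicted trajectory.
    if not reference:
        return 0.0
    return sum(1 for action in reference if action in predicted) / len(reference)

predicted = ["get_product_details", "get_product_price"]
reference = ["get_product_details", "get_product_price", "escalate_to_human"]
print(trajectory_precision(predicted, reference))  # 1.0: every predicted action is in the reference
print(trajectory_recall(predicted, reference))     # ~0.67: the escalate_to_human step was missed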
Compatibility meets flexibility
Vertex AI Gen AI evaluation service supports a variety of agent architectures.
With today’s launch, you can evaluate agents built with Reasoning Engine (LangChain on Vertex AI), the managed runtime for your agentic applications on Vertex AI. We also support agents built with open-source frameworks, including LangChain, LangGraph, and CrewAI, and we plan to support upcoming Google Cloud services for building agents.
For maximum flexibility, you can evaluate agents using a custom function that processes prompts and returns responses. To make your evaluation experience easier, we offer native agent inference and automatically log all results in Vertex AI experiments.
Agent evaluation in action
Let’s say you have the following LangGraph customer support agent, and you aim to assess both the responses it generates and the sequence of actions (or “trajectory”) it undertakes to produce those responses.
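For illustration, a minimal, hypothetical sketch of such an agent might look like the following, assuming a single get_product_details tool and a Gemini model served through Vertex AI; a real customer support agent would define more tools and routing logic.

# A minimal, hypothetical LangGraph customer support agent (illustration only).
from langchain_core.tools import tool
from langchain_google_vertexai import ChatVertexAI
from langgraph.prebuilt import create_react_agent

@tool
def get_product_details(product_name: str) -> str:
    """Gathers basic details about a product."""
    details = {
        "smartphone": "A cutting-edge smartphone with advanced camera features.",
        "coffee": "A rich, aromatic blend of ethically sourced coffee beans.",
    }
    return details.get(product_name, "Product details not found.")

# The prebuilt ReAct agent decides when to call the tool and writes the final answer.
model = ChatVertexAI(model_name="gemini-1.5-pro")
agent = create_react_agent(model, tools=[get_product_details])

response = agent.invoke({"messages": [("user", "Tell me about your coffee.")]})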
To assess an agent using Vertex AI Gen AI evaluation service, you start by preparing an evaluation dataset. This dataset should ideally contain the following elements:
User prompt: This represents the input that the user provides to the agent.
Reference trajectory: This is the expected sequence of actions that the agent should take to provide the correct response.
Generated trajectory: This is the actual sequence of actions that the agent took to generate a response to the user prompt.
Response: This is the generated response, given the agent’s sequence of actions.
A sample evaluation dataset is shown below.
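As a hedged sketch, such a dataset can be assembled as a pandas DataFrame; the column names (prompt, reference_trajectory, predicted_trajectory, response) and the tool_name/tool_input structure are assumptions based on the agent evaluation documentation and may need adapting to your setup.

import pandas as pd

# Hypothetical evaluation dataset with one row; trajectories are lists of tool calls.
byod_eval_sample_dataset = pd.DataFrame({
    "prompt": ["Tell me about your coffee."],
    "reference_trajectory": [
        [{"tool_name": "get_product_details", "tool_input": {"product_name": "coffee"}}]
    ],
    "predicted_trajectory": [
        [{"tool_name": "get_product_details", "tool_input": {"product_name": "coffee"}}]
    ],
    "response": ["Our coffee is a rich, aromatic blend of ethically sourced beans."],
})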
After you gather your evaluation dataset, define the metrics that you want to use to evaluate the agent. For a complete list of metrics and their interpretations, refer to Evaluate Gen AI agents. Some metrics you can define are listed here:
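The snippet below is an illustrative sketch rather than an exhaustive listing: the built-in metric names follow the agent evaluation documentation, and the custom metric’s prompt template is an abbreviated placeholder.

from vertexai.preview.evaluation import PointwiseMetric

# Custom metric: does the final response logically follow from the tool calls made?
# The prompt template below is an abbreviated placeholder for illustration.
response_follows_trajectory_metric = PointwiseMetric(
    metric="response_follows_trajectory",
    metric_prompt_template=(
        "Evaluate whether the agent's response logically follows from its tool calls.\n"
        "User prompt: {prompt}\n"
        "Tool calls: {predicted_trajectory}\n"
        "Response: {response}\n"
        "Score 1 if the response follows from the tool calls, 0 otherwise."
    ),
)

# Built-in trajectory and response metrics, plus the custom metric above.
response_tool_metrics = [
    "trajectory_exact_match",
    "trajectory_in_order_match",
    "trajectory_precision",
    "trajectory_recall",
    "safety",
    response_follows_trajectory_metric,
]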
Notice that the response_follows_trajectory_metric is a custom metric that you can define to evaluate your agent.
Standard text generation metrics, such as coherence, may not be sufficient when evaluating AI agents that interact with environments, as these metrics primarily focus on text structure. Agent responses should be assessed based on their effectiveness within the environment. Vertex AI Gen AI Evaluation service allows you to define custom metrics, like response_follows_trajectory_metric, that assess whether the agent’s response logically follows from its tool choices. For more information on these metrics, please refer to the official notebook.
With your evaluation dataset and metrics defined, you can now run your first agent evaluation job on Vertex AI. Please see the code sample below.
# Import libraries
import vertexai
from vertexai.preview.evaluation import EvalTask

# Initiate Vertex AI session
vertexai.init(project="my-project-id", location="my-location", experiment="evaluate-langgraph-agent")

# Define an EvalTask
response_eval_tool_task = EvalTask(
    dataset=byod_eval_sample_dataset,
    metrics=response_tool_metrics,
)

# Run evaluation
response_eval_tool_result = response_eval_tool_task.evaluate(
    experiment_run_name="response-over-tools"
)
To run the evaluation, initiate an `EvalTask` using the predefined dataset and metrics. Then, run an evaluation job using the evaluate method. Vertex AI Gen AI evaluation tracks the resulting evaluation as an experiment run within Vertex AI Experiments, the managed experiment tracking service on Vertex AI. The evaluation results can be viewed both within the notebook and the Vertex AI Experiments UI. If you’re using Colab Enterprise, you can also view the results in the Experiment side panel as shown below.
Vertex AI Gen AI evaluation service offers summary and metrics tables, providing detailed insights into agent performance. This includes individual user input, trajectory results, and aggregate results for all user input and trajectory pairs across all requested metrics.
Access to these granular evaluation results enables you to create meaningful visualizations of agent performance, including bar and radar charts like the one below:
Get started today
Explore the Vertex AI Gen AI evaluation service in public preview and unlock the full potential of your agentic applications.
Tchibo, a well-known coffee retailer and lifestyle brand based in Germany, needed a faster, smarter way to manage and interpret vast amounts of customer feedback across its diverse product offerings and sales channels. To meet this need, they adopted the AlloyDB for PostgreSQL database, harnessing its advanced analytics and AI capabilities to streamline data retrieval and provide real-time insights.
In this guest post from Henning Kosmalla and Dominik Nowatschin, we learn how Tchibo’s migration accelerated feedback analysis by a factor of 10, empowering Tchibo’s teams to respond quickly to customer needs and reinforcing the company’s commitment to customer-centric innovation.
At Tchibo, we’re about more than just coffee — we’re constantly brewing new ways to connect with our customers.
We’ve grown from a coffee-focused business to a multi-channel retail model, spanning our own stores, e-commerce, and shop-in-shop sections in grocery stores. This setup allows us to serve a diverse customer base, each with unique needs and preferences, while offering an evolving selection of non-food items — from apparel to kitchenware — delivering “a new world every week.”
But it’s not always a smooth pour. Rising global challenges, from inflation to new AI-driven customer expectations, require us to make data-driven decisions quickly to stay competitive. Our previous cloud database solution could handle basic data retrieval, but it couldn’t keep up with the scale and complexity of the data we rely on across our three sales channels. As our data needs grew, we faced a number of issues: query speeds slowed, delaying access to customer data; compiling feedback was labor-intensive; and extracting actionable insights from diverse data sources was difficult.
Queries often exceeded 10 seconds, even for straightforward insights. And compiling customer feedback reports required up to three days of manual work to sort, categorize, and analyze. We also lacked the flexibility to support advanced, AI-driven applications. This limited our ability to implement innovative tools like retrieval-augmented generation (RAG) workflows, which combine structured and unstructured data for deeper context in AI queries. That’s why we turned to AlloyDB — to power the insights that keep customers at the center of every decision we make.
Finding the perfect blend of speed and scale
AlloyDB provided a powerful solution to the limitations we faced with our old database. Its advanced analytics capabilities, built-in vector search, and familiar PostgreSQL foundation offered the speed, adaptability, and usability we needed to serve up insights as fresh and fast as our coffee. One of our most impactful applications, Customer Voice, gives employees instant access to relevant customer feedback. The tool compiles data from product reviews and other sources, answering questions like, “How do customers feel about our new coffee pad machine?” with concise, actionable summaries.
AlloyDB serves as the foundation of our Customer Voice application, managing a complete data pipeline to support real-time feedback analysis. Its architecture efficiently handles data storage, search, and query processing, so Tchibo teams can gain fresh perspectives from customer insights. Here’s how AlloyDB supports our specific needs:
Data storage: AlloyDB organizes customer feedback and product meta-information in a flexible structure, supporting both standard and advanced queries. This setup allows us to run traditional queries (e.g., “return all reviews with a positive sentiment”) as well as nearest-neighbor (NN) searches using embedding columns to add depth and relevance to the data.
Query interpretation: When employees pose questions to the Customer Voice assistant, a large language model (LLM)—currently Claude 3.5 Sonnet on Vertex AI—interprets the query, identifying core topics like product or category to deliver targeted, relevant answers.
Retrieval and filtering: AlloyDB combines structured queries, NN searches, and reranking/filtering steps to retrieve relevant reviews. The LLM further enriches the data with clustering and summary statistics, providing a full view of customer opinions.
Presentation: Customer Voice delivers these insights through a streamlined interface that highlights individual reviews, key statistics, and summaries, making it easy for employees to act on the information.
Serving data to perfection to fuel decision-making
AlloyDB has transformed Tchibo’s approach to data by enabling faster, deeper, and more scalable access to the customer feedback and analytics we rely on for decision-making.
Supporting high-performance analytics and RAG workflows, AlloyDB now delivers nearly instant insights. Complex queries that once took up to 10 seconds now return results in about a second, enabling faster, data-driven decisions across teams. Generating detailed customer feedback reports previously took days of manual effort. With AlloyDB, this process now takes seconds. This leap has strengthened our commitment to staying connected to customer needs and preferences in real time.
Furthermore, the fully managed operation of AlloyDB has reduced operational overhead, simplifying our ability to scale as data demands grow. Although continuity wasn’t our primary consideration in choosing AlloyDB, its 99.99% availability SLA provides valuable reliability for supporting long-term goals.
Beyond Customer Voice, AlloyDB also supports broader AI initiatives, such as an internal chatbot for intranet queries, giving us the flexibility to scale various retrieval-augmented generation (RAG) use cases efficiently across the organization. Looking forward, we’re exploring expanded AlloyDB capabilities to integrate more structured and unstructured data into our analytics. Partnering with Google Cloud, we’re positioned to explore new data solutions to serve up richer insights, driving growth and innovation at Tchibo.
Ready to get started with AlloyDB in your own environment? Check out the following resources:
It’s a new year, which means new beginnings, fresh starts, ambitious resolutions, and the sinking feeling that outdated tech is still slowing down your business and creating unnecessary costs. Thankfully, there’s a no-cost, easy-to-deploy New Year’s resolution to add to your tech stack.
With just a USB stick and a side of enthusiasm, you can install ChromeOS Flex and breathe new life into your existing hardware, transforming aging laptops, kiosks, and more into fast, secure, and modern devices. It’s the perfect solution for businesses hoping to refresh devices, improve security, and embrace sustainability – all while saving money. And going into 2025, we’ve certified over 600 devices to work effortlessly with ChromeOS Flex, so almost every business can benefit from it.
Many organizations have already benefited from ChromeOS Flex, from Mercado Libre, which upgraded devices to improve contact center productivity by 25%, to Strawberry Hotels, which deployed ChromeOS Flex to 2,000 devices in under 48 hours to evade ransomware and bolster security. And as ChromeOS Flex helps businesses modernize, we’re always looking for ways to improve. In fact, we’ve made a few updates to support the ever-evolving needs of businesses as we head into the new year.
Boosting security, streamlining deployment
We’re bringing the convenience of zero-touch enrollment to ChromeOS Flex. This means that devices installing ChromeOS Flex for the first time can be automatically enrolled into your business domain. With automatic enrollment, IT admins can quickly configure devices with the necessary policies and applications, saving valuable time and resources while helping end-users get up and running more quickly.
Additionally, admins can opt in to allow ChromeOS Flex devices to receive manufacturer provided firmware updates. By keeping firmware up to date, businesses can ensure their devices are protected against the latest threats and benefit from performance optimizations and bug fixes.
Keep your devices running
While we continue to invest in ChromeOS Flex, we also acknowledge that it’s easy for hardware deployments to fall behind. Later this year, it’s expected that hundreds of millions of Windows 10 PCs will lose support. If this is affecting your business, that doesn’t mean your only option is to invest in new devices.
Instead of scrapping perfectly good hardware, ChromeOS Flex can be a tool to modernize without the need to purchase an entirely new device. This gives your deployment a new lease on life, with much of the speed, security, and reliability that comes with ChromeOS. And this not only saves on substantial device costs, but also contributes to environmental sustainability by reducing e-waste.
With ChromeOS device management, it’s easy to manage policies for your entire fleet, all while robust security features protect you from the latest threats. For example, data loss prevention keeps sensitive information secure and remotely wiping a lost or stolen device is as easy as a few clicks. Best of all, you can manage both ChromeOS Flex and ChromeOS devices side by side with the Google Admin console.
Flex your tech in 2025
There has never been a better time to embrace solutions that support your business with enhanced security, lower costs, and sustainability in mind.
ChromeOS Flex delivers on all fronts, and we’re continuing to invest in it as a secure, easy-to-manage, and sustainable solution for businesses. And stay tuned, as we have even more improvements to share with you later this year!
Want to learn more? Visit our website to see how ChromeOS Flex can breathe new life into your devices.