Amazon GuardDuty Malware Protection for Amazon S3 provides a fully managed offering to scan new object uploads to S3 buckets for malware. Starting February 1, 2025, we are lowering the price for the data scanned dimension by 85%. Over the past few months we have made improvements to our scanning infrastructure and data processing efficiencies, enabling us to reduce the price as part of our commitment to pass savings back to customers.
GuardDuty Malware Protection for S3 is priced based on two dimensions: the number of objects evaluated and the amount of data scanned. We are lowering the price for the data scanned dimension; for example, in US East (N. Virginia) it drops from $0.60 to $0.09 per GB. The price for objects evaluated remains unchanged. With this price reduction, it becomes even more cost-effective to build secure data pipelines on Amazon S3 for applications with untrusted uploads across the enterprise.
The price reduction applies automatically in all AWS Regions where GuardDuty Malware Protection for S3 is available and requires no action from customers. For additional information, visit the Amazon GuardDuty pricing page. To receive programmatic updates on new GuardDuty features and threat detections, subscribe to the Amazon GuardDuty Amazon Simple Notification Service (SNS) topic.
The AWS Toolkit for Visual Studio Code (VS Code) now includes support for Amazon DocumentDB (with MongoDB compatibility). This integration allows customers to use VS Code to view, create, and manage Amazon DocumentDB resources.
To get started, download the latest AWS Toolkit extension from the VS Code Marketplace, or update the extension if you already have it installed in your VS Code environment. Then configure the extension with credentials for your AWS account and connect to the AWS Explorer in VS Code. Once connected, you can use the DocumentDB menu item to create DocumentDB clusters or manage your existing clusters. To learn more, see Amazon DocumentDB support in VS Code.
Amazon DocumentDB is a fully managed, native JSON database that makes it simple and cost-effective to operate critical document workloads at virtually any scale without managing infrastructure. To learn more about Amazon DocumentDB, please visit the Amazon DocumentDB product page, pricing page and documentation.
Cost Optimization Hub now supports idle EC2 Auto Scaling group recommendations and rightsizing recommendations for EC2 Auto Scaling groups with scaling policies and multiple instance types. It also allows you to easily filter and aggregate EC2 Auto Scaling group cost optimization opportunities separately from standalone EC2 instances, making it easier to identify EC2 Auto Scaling groups with the highest cost-saving opportunities.
With this launch, you can view, consolidate, and prioritize cost optimization opportunities for EC2 Auto Scaling groups across your organization’s member accounts and AWS Regions through a single dashboard. Cost Optimization Hub helps you quantify estimated savings from these recommendations, taking into account your specific discounts, such as Reserved Instances and Savings Plans, enabling you to easily compare and prioritize recommendations.
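If you prefer to retrieve these recommendations programmatically rather than through the console, a minimal sketch using the AWS SDK for Python (boto3) might look like the following. Treat the resource-type value ("Ec2AutoScalingGroup") and the response field names as assumptions; confirm the exact names against the Cost Optimization Hub ListRecommendations API reference.

```python
import boto3

# Cost Optimization Hub aggregates recommendations across member accounts and Regions.
client = boto3.client("cost-optimization-hub", region_name="us-east-1")

# Assumed filter value: restrict results to EC2 Auto Scaling group recommendations.
response = client.list_recommendations(
    filter={"resourceTypes": ["Ec2AutoScalingGroup"]}
)

# Print each recommendation's resource and its estimated savings (field names assumed).
for rec in response.get("items", []):
    print(rec.get("resourceId"), rec.get("estimatedMonthlySavings"))
```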
The new EC2 Auto Scaling group experience and recommendations are available in Cost Optimization Hub across all AWS Regions where Cost Optimization Hub and AWS Organizations are supported.
Amazon MSK has added support for Express brokers in all AWS Regions where Amazon MSK offers Graviton3-based M7g instances. You can now benefit from Amazon MSK Express brokers in ten additional AWS Regions: Asia Pacific (Mumbai), Europe (Paris), South America (São Paulo), Asia Pacific (Hyderabad), Asia Pacific (Seoul), Asia Pacific (Malaysia), Europe (London), Canada (Central), Europe (Spain), and Middle East (Bahrain).
Express brokers are a new broker type for Amazon MSK Provisioned designed to deliver up to 3x more throughput per broker, scale up to 20x faster, and reduce recovery time by 90% compared to standard Apache Kafka brokers. Express brokers come preconfigured with Kafka best practices, support all Kafka APIs, and provide the same low-latency performance that Amazon MSK customers expect, so customers can continue using existing client applications without any changes.
You can now create an MSK cluster with Express brokers in these AWS Regions from the Amazon MSK console. To learn more, check out this blog.
One of the most compelling aspects of cloud computing is being able to automatically scale resources up and, almost as importantly, to scale them back down to manage costs and performance. This is standard practice with virtual machines, for instance with Compute Engine managed instance groups, but because of their inherent complexity, it is less common with stateful services such as databases.
Memorystore for Redis Cluster capacity is determined by the number of shards in your cluster, which can be increased or decreased without downtime, and by your cluster’s shard size, which maps to the underlying node type. At this time, the node type of the cluster is immutable, so to scale capacity in or out, you modify the number of shards in your cluster. To automate this process, you can deploy the Memorystore Cluster Autoscaler to monitor your cluster metrics and rightsize your cluster based on that information. The Autoscaler performs the necessary resource adjustments using rulesets that evaluate memory and CPU utilization, without impacting cluster availability.
The following chart shows the Autoscaler in action, with a Memorystore for Redis Cluster instance automatically scaling out as memory utilization increases. The green line represents data being written to the cluster at the rate of one gigabyte every five minutes. The blue line represents the number of shards in the cluster. You can see that the cluster scales out, with the number of shards increasing in proportion to the memory utilization, then plateaus when the writes stop, and finally scales back in when the keys are flushed at the end of the test.
Experience and deployment
To use the Autoscaler, deploy it to one of your Google Cloud projects. The Autoscaler is flexible and supports several deployment models, so the repository contains example Terraform deployment configurations as well as documentation that describes each of them.
Once you’ve deployed the Autoscaler, configure it according to the scaling requirements of the Memorystore instances being managed, to suit your workloads’ characteristics. You do this by setting Autoscaler configuration parameters for each of the Memorystore instances. Once configured, the Autoscaler autonomously manages and scales the Memorystore instances. You can read more about these parameters later in this post, and in the Autoscaler documentation.
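As an illustration, the configuration for a single Memorystore instance might look like the following sketch. It is shown here as a Python data structure for readability; the real Autoscaler expects an equivalent JSON document. The parameter names are the ones discussed in this post, while fields marked as assumed in the comments are placeholders, so check the exact schema against the Autoscaler configuration documentation.

```python
# Hypothetical Autoscaler configuration for one Memorystore for Redis Cluster instance.
memorystore_autoscaler_config = [
    {
        "projectId": "my-project",           # placeholder: project that owns the cluster
        "instanceId": "my-redis-cluster",    # placeholder: Memorystore cluster to manage
        "minSize": 3,                        # never scale in below 3 shards
        "maxSize": 30,                       # assumed field: upper bound on shards
        "scalingProfile": "CPU_AND_MEMORY",  # default profile; CPU, MEMORY, or CUSTOM also possible
        "scalingMethod": "LINEAR",           # LINEAR, STEPWISE, or DIRECT
        "scaleOutCoolingMinutes": 10,        # wait time between scale-out events
    }
]
```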
Autoscaler architecture
The Autoscaler consists of two main components, the Poller and the Scaler. You can deploy these to either Cloud Run functions or Google Kubernetes Engine (GKE) via Terraform, and configure them so that the Autoscaler runs according to a user-defined schedule. The Poller queries the Memorystore metrics in Cloud Monitoring at a pre-defined interval to determine utilization, and passes them to the Scaler. The Scaler then compares the metrics against the recommended thresholds specified in the rule set, and determines if the instance should be scaled in or out, and if so, by how many shards. You can modify the sample configuration to determine minimum and maximum cluster sizes and any other thresholds suitable for your environment.
Throughout the flow, the Autoscaler writes a step-by-step summary of its recommendations and actions to Cloud Logging for tracking and auditing, as well as metrics to Cloud Monitoring to provide insight into its actions.
Scaling rubrics
Memorystore performance is most commonly limited by in-memory storage and by CPU. The Autoscaler is configured by default to take both of these factors into consideration when scaling, by utilizing the CPU_AND_MEMORY profile. This is a good place to start your deployment, and can be replaced with a custom configuration, if required, to best suit your needs.
Defaults
| Metric | Average Default Setting | Max Default Setting |
| --- | --- | --- |
| CPU scale OUT | CPU > 70% | Max CPU > 80% and average CPU > 50% |
| CPU scale IN | CPU < 50% * | Max CPU < 60% and average CPU < 40% * |
| Memory scale OUT | Usage > 70% | Max usage > 80% and average usage > 50% |
| Memory scale IN | Usage < 50% * | Max usage < 60% and average usage < 40% * |
* Scale-in will be blocked if there are ongoing key evictions, which occur when the keyspace is full and keys are removed from the cache to make room. Scale-in is enabled by default, but this behavior can be configured using a custom scaling profile. Refer to the Scaling Profiles section of the documentation for more information on how to do this.
Scaling scenarios and methods
Let’s take a look at some typical scenarios and their specific utilization patterns, and the Autoscaler configurations best suited to each of them. You can read more about the options described in the following section in the configuration documentation.
Standard workloads
With many applications backed by Memorystore, users interact with the application at certain times of day more than others, in a regular pattern — think a banking application where users check their accounts in the morning, make transactions during the afternoon and early evening, but don’t use the application much at night.
We refer to this fairly typical scenario as a “standard workload” whose time series shows:
Large utilization increase or decrease at certain points of the day
Small spikes over and under the threshold
A recommended base configuration for these types of workload should include:
The LINEAR scalingMethod to cover large scale events
A small value for scaleOutCoolingMinutes — between 5 and 10 minutes — to minimize the Autoscaler’s reaction time.
Plateau workloads
Another common scenario is applications with more consistent utilization during the day such as global apps, games, or chat applications. User interactions with these applications are more consistent, so the jumps in utilization are less pronounced than for a standard workload.
These scenarios create a “plateau workload” whose time series shows:
A pattern composed of various plateaus during the day
Some larger spikes within the same plateau
A recommended base configuration for these types of workload should include:
The STEPWISE scalingMethod, with a stepSize sufficient to cover the largest utilization jump using only a few steps during a normal day, OR
The LINEAR scalingMethod, if there is likely to be a considerable increase or reduction in utilization at certain times, for example when breaking news is shared. Use this method together with a scaleInLimit to avoid reducing the capacity of your instance too quickly
Batch workloads
Customers often need increased capacity for their Memorystore clusters to handle batch processes or a sales event, where the timing is usually known in advance. These scenarios comprise a “batch workload” with the following properties:
A scheduled, well-known peak that requires additional compute capacity
A drop in utilization when the process or event is over
A recommended base configuration for these types of workloads should include two separate scheduled jobs:
One for the batch process or event, that includes an object in the configuration that uses the DIRECT scalingMethod, and a minSize value of the peak number of shards/nodes to cover the process or event
One for regular operations, that includes configuration with the same projectId and instanceId, but using the LINEAR or STEPWISE method. This job will take care of decreasing the capacity when the process or event is over
Be sure to choose an appropriate scaling schedule so that the two configurations don’t conflict. For both Cloud Run functions and GKE deployments, make sure the batch operation starts before the Autoscaler starts to scale the instance back in again. You can use the scaleInLimit parameter to slow the scale-in operation down if needed.
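Sketching the batch pattern described above (again as an illustrative Python structure rather than the Autoscaler’s exact JSON schema): one scheduled job pins capacity ahead of the event using the DIRECT method, and a second job on the regular schedule brings the cluster back down afterwards with LINEAR scaling and a scaleInLimit.

```python
# Illustrative pairing of two Autoscaler configurations for a known batch event.
# Parameter names come from this post; treat the overall shape and values as assumptions.

# Job 1: scheduled just before the batch event; pins capacity at the known peak.
batch_event_config = {
    "projectId": "my-project",
    "instanceId": "my-redis-cluster",
    "scalingMethod": "DIRECT",   # jump straight to minSize, without metric evaluation
    "minSize": 40,               # peak number of shards needed for the event
}

# Job 2: the regular schedule; scales capacity back in once the event is over.
regular_operations_config = {
    "projectId": "my-project",
    "instanceId": "my-redis-cluster",
    "scalingMethod": "LINEAR",
    "minSize": 5,
    "scaleInLimit": 10,          # cap how much capacity is removed per scaling event (see docs for units)
}
```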
Spiky workloads
Depending on load, it can take several minutes for Memorystore to update the cluster topology and fully utilize new capacity. Therefore, if your workload is characterized by very spiky traffic or sudden-onset load patterns, the Autoscaler might not be able to provision capacity quickly enough to avoid latency, or efficiently enough to yield cost savings.
For these spiky workloads, a base configuration should:
Set a minSize that slightly over-provisions the usual instance workload
Use the LINEAR scalingMethod, in combination with a scaleInLimit to avoid further latency when the spike is over
Choose scaling thresholds large enough to smooth out some smaller spikes, while still being reactive to large ones
Advanced usage
As described above, the Autoscaler is preconfigured with scaling rules designed to optimize cluster size based on CPU and memory utilization. However, depending on your workload(s), you may find that you need to modify these rules to suit your utilization, performance and/or budget goals.
There are several ways to customize the rule sets that are used for scaling, in increasing order of effort required:
Choose to scale on only memory or only CPU metrics. This can help if you find your clusters flapping, i.e., alternating rapidly between sizes. You can do this by specifying a scalingProfile of either CPU or MEMORY to override the default CPU_AND_MEMORY in the Autoscaler configuration.
Use your own custom scaling rules by specifying a scalingProfile of CUSTOM, and supplying a custom rule set in the Autoscaler configuration as shown in the example here.
Create your own custom rule sets and make them available for everyone in your organization to use as part of a scaling profile. You can do this by customizing one of the existing scaling profiles to suit your needs. We recommend starting by looking at the existing scaling rules and profiles, and creating your own customizations.
Next steps
The OSS Autoscaler comes with a Terraform configuration to get you started, which can be integrated into your codebase for production deployments. We recommend starting with non-production environments, and progressing through to production when you are confident with the behavior of the Autoscaler alongside your application(s). Some more tips for production deployments are here in the documentation.
If there are additional features you would like to see in the Autoscaler — or would like to contribute to it yourself — please don’t hesitate to raise an issue via the GitHub issues page. We’re looking forward to hearing from you.
Today, we are thrilled to announce the public beta launch of Gen AI Toolbox for Databases in partnership with LangChain, the leading orchestration framework for developers building large language model (LLM) applications.
Gen AI Toolbox for Databases (Toolbox) is an open-source server that empowers application developers to connect production-grade, agent-based generative AI (gen AI) applications to databases. It streamlines the creation, deployment, and management of sophisticated gen AI tools capable of querying databases with secure access, robust observability, scalability, and comprehensive manageability. It also provides connectivity to popular open-source databases such as PostgreSQL and MySQL, as well as Google’s industry-leading cloud databases like AlloyDB, Spanner, and Cloud SQL for SQL Server. We are open to contributions that add support for databases outside of Google Cloud.
In this post, we’ll explore how Gen AI Toolbox for Databases works, and how to get started.
Challenges in gen AI tool management
Building AI agents requires using different tools and frameworks and connecting to various data sources. This process creates several challenges for developers, particularly when these tools need to query databases. These include:
Scaling tool management: Current approaches to tool integration often require extensive, repetitive code and modifications across multiple locations for each tool. This complexity hinders consistency, especially when tools are shared across multiple agents or services. A more streamlined framework integration is needed to simplify tool management and ensure consistency across agents and applications.
Complex database connections: Databases require configuration, connection pooling, and caching for optimal performance at scale.
Security vulnerabilities: Ensuring secure access from gen AI models to sensitive data requires complex integration with auth services, databases and the application, which can be error-prone and introduce security risks.
Inflexible tool updates: Adding new tools or updating existing ones often requires a complete rebuild and redeployment of the application, potentially leading to downtime.
Limited workflow observability: Current solutions lack built-in support for comprehensive monitoring and troubleshooting, making it difficult to gain insights into gen AI workflows with databases.
Components
Gen AI Toolbox for Databases improves how gen AI tools interact with data, addressing common challenges in gen AI tool management. By acting as an intermediary between the application’s orchestration layer and data sources/databases, it enables faster development and more secure data access, improving the production-quality of tools.
Toolbox comprises two components: a server specifying the tools for application use, and a client interacting with this server to load these tools onto orchestration frameworks. This centralizes tool deployment and updates, incorporating built-in production best practices to enhance performance and security and to simplify deployments.
Benefits
Toolbox offers various features that provide better manageability, security, and observability for AI agents. Some of the benefits for application developers are as follows:
Simplified development – Reduced boilerplate code and consolidated integration simplify tool development and sharing across agents.
Built-in performance and scale – Built-in connection pooling and optimized connectors for popular databases handle connection management efficiently.
Zero downtime deployment – A config-driven approach enables seamless deployment of new tools and updates without any service interruption and supports incremental rollouts.
Enhanced security – Built-in support for common auth providers, using OAuth2 and OIDC, enables control over agents’ access to tools and data.
End-to-end observability – Toolbox integrates with OpenTelemetry, providing day-one insights via logging, metrics, and tracing for better operations.
Compatibility with LangChain
LangChain is the most popular developer framework for building LLM applications, and we’re excited to announce Toolbox compatibility with the LangChain ecosystem from day one. Together with Toolbox, LangGraph can leverage LLMs like Gemini on Vertex AI to build powerful agentic workflows.
LangGraph extends LangChain’s capabilities by providing a framework for building stateful, multi-actor applications with LLMs. Its support for cycles, state management, and coordination enables the development of complex and dynamic AI agents. All of these capabilities integrate seamlessly with Toolbox.
Tool calling is essential for building agents. Agents need to call tools in a controlled and specified way, run the tool reliably, and then pass the correct context back to the LLM. LangGraph provides a low-level agent framework for managing how tools are called and how their responses are integrated, ensuring precision and control. Toolbox then handles the execution itself, seamlessly running the tool and returning results. Together, they create a powerful solution for tool calling in agent workflows.
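To make that flow concrete, here is a minimal sketch of loading Toolbox-served tools into a LangGraph agent. It assumes a Toolbox server already running locally on port 5000 with a toolset named "my-toolset" defined in its configuration; the package and method names (toolbox_langchain, ToolboxClient, load_toolset) reflect the Toolbox LangChain client as we understand it, so check the Toolbox documentation for the current API.

```python
# Sketch: wire Toolbox-managed database tools into a LangGraph ReAct agent.
from langchain_google_vertexai import ChatVertexAI
from langgraph.prebuilt import create_react_agent
from toolbox_langchain import ToolboxClient

# Load the tools that the Toolbox server exposes for this toolset (names assumed).
client = ToolboxClient("http://127.0.0.1:5000")
tools = client.load_toolset("my-toolset")

# LangGraph manages how tools are called; Toolbox executes them against the database.
model = ChatVertexAI(model_name="gemini-1.5-pro")
agent = create_react_agent(model, tools)

result = agent.invoke(
    {"messages": [("user", "Which records match my search criteria?")]}
)
print(result["messages"][-1].content)
```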
“The integration of Gen AI Toolbox for Databases with the LangChain ecosystem is a boon for all developers” says Harrison Chase, CEO of LangChain. “In particular, the tight integration between Toolbox and LangGraph will allow developers to build more reliable agents than ever before.”
Get started with Gen AI Toolbox for Databases
Gen AI Toolbox for Databases simplifies gen AI tool development and deployment by automating the entire lifecycle. Here are some resources to get you started:
Last year, we offered our first ever “Google Launchpad for Women” series to empower women within our customer ecosystem to grow their cloud and AI skills. The response from our customers has been tremendous: more than 11,000 women across a breadth of roles – sales, leadership, marketing, finance, and more – have completed previous editions of the program. As a result, they are building critical skills that help them put AI to work in their jobs, grow their careers, and help transform their businesses.
This year, in honor of International Women’s Day, we are opening “Google Launchpad for Women” to thousands more customer participants, providing them with no-cost training, exam prep, and access to Google experts. Registration is now open to Google Cloud customers in the Americas, EMEA, and Japan, with the three-week program beginning on March 4th in Japan and March 6th in the Americas and EMEA. Program benefits include:
Expert-led training: Two days of in-depth, instructor-led training covering key cloud concepts and best practices.
Industry insights: Engage with Google Cloud experts through panel discussions on topics such as Generative AI.
Exam preparation: Dedicated sessions to prepare for the Cloud Digital Leader certification exam.
Complimentary exam voucher: Participants will receive a voucher for the $99 exam fee.
Why these trainings are critical
Harnessing the power of cloud computing and AI is essential for all job roles, not just IT. As more businesses adopt AI, people across business roles utilize this technology every day and often make purchasing decisions about new AI platforms and tools. However, a talent gap remains, and is particularly pronounced for women, who represent about 14% of the global cloud workforce according to recent data from the World Economic Forum.
We aim to help our customers reduce this gap, ensure they have access to the skilled experts they need to advance their digital and AI transformations, and give more people opportunities to grow their careers and lead these transformations. Ultimately, those who complete the Google Launchpad for Women program will be well-equipped to achieve the Cloud Digital Leader certification, putting them at the forefront of the cloud and AI era.
Google Launchpad for Women is open to all Google Cloud customers, regardless of prior technical experience or role. We welcome women from all professional backgrounds who are eager to develop their cloud skills and advance their careers. While this initiative is specifically focused on women, we invite everyone to participate.
Sign up today
Visit the links below to learn more about each regional session and contact your sales rep to sign up today.
Mobile devices have become the go-to for daily tasks like online banking, healthcare management, and personal photo storage, making them prime targets for malicious actors seeking to exploit valuable information. Bad actors often turn to publishing and distributing malware via apps as a lucrative channel for generating illegal and/or unethical profits.
Android takes a multi-layered approach to combating malware to help keep users safe (more on that later in this post), but while we continuously strengthen our defenses against malware, threat actors persistently update their malware to evade detection. Malware developers used to implement their entire malicious logic using the common Android app development toolkits in Java, which is relatively easy to detect by reversing the Java bytecode. In recent years, malware developers have increasingly used native code to obfuscate some of the critical malware behaviors, pinning their hopes on the obscurity of compiled, symbol-stripped Executable and Linkable Format (ELF) files, whose true intentions can be more difficult and time-consuming to reveal.
To combat these new challenges, the Android Security and Privacy Team is partnering with Mandiant FLARE to extend the open-source binary analysis tool capa to analyze native ARM ELF files targeting Android. Together, we improved existing capa rules and developed new ones to detect capabilities observed in Android malware, used the capa rule matches to highlight highly suspicious code in native files, and prompted Gemini with the highlighted code behaviors for summarization, enhancing our review processes for faster decisions.
In this blog post, we will describe how we leverage capa behavior-detection capabilities and state-of-art Gemini summarization by:
Showcasing a malware sample that used various anti-analysis tricks to evade detections
Explaining how our existing and new capa rules identify and highlight those behaviors
Presenting how Gemini summarizes the highlighted code for security reviews
An Illegal Gambling App Under a Music App Façade
Google Play Store ensures all published apps conform to local laws and regulations. This includes gambling apps, which are prohibited or require licenses in some areas. Developing and distributing illegal gambling apps in such areas can generate significant illicit profits, which are sometimes associated with organized crime. To bypass Google Play Store’s security-screening procedures, some gambling apps disguise themselves with harmless façades like music or casual games. These apps only reveal their gambling portals in certain geographic markets using various anti-analysis tricks. Unfortunately, dynamic analysis, such as emulation and sandbox detonation, relies on specific device configurations, and threat actors keep trying different combinations of settings to evade our detections. It’s an ongoing game of cat and mouse!
In response, the Android Security and Privacy Team has evolved static analysis techniques, such as those that evaluate the behavior of a complete program and all its conditional logic. So, let’s describe an app that violated Google Play Store rules and show how we can better detect and block other apps like it.
We received reports of a music app opening gambling websites for users in certain geographical areas. It used an interesting trick of hiding key behaviors in a native ELF file that has most symbols (except the exported ones) stripped and is loaded at runtime to evade detection.
When we decompiled the app into Java source code, using a tool like JEB Decompiler, we found that the app has a song-playing functionality as shown in “MainActivity” of Figure 1. This looks like benign behavior and is fully within the limits of Google Play Store policies.
However, there was a small region of initialization code that loads an ELF file as soon as the app is initialized when calling the onCreate function, as shown in com.x.y.z class of Figure 1. To fully understand the behavior of the entire app, we also had to reverse engineer the ELF file, which requires a completely different toolset.
Figure 1: How the app applies anti-analysis techniques
Using a tool like Ghidra, we decompiled the ARM64 ELF file into C source code and found that this app estimates the user’s geographic location using timezone information (“Code Section 1” in Figure 1). The code implements a loop that compares the user’s timezone with a list of target regions (“Data Section” in Figure 1).
If the user’s location matches a value in the list (“Data Section” in Figure 1), this malware:
Downloads an encrypted DEX file from a remote server (“Code Section 2” in Figure 1)
Decrypts the downloaded DEX file (“Code Section 3” in Figure 1)
Loads the decrypted DEX file into memory (“Code Section 4” in Figure 1)
The loaded DEX file uses further server-side cloaking techniques and finally presents a gambling website (Figure 3) to the app's users, an obvious mismatch with the app’s advertised functionality and icon (Figure 2).
Figure 2: The app icon as published
Figure 3: The loaded gambling website in app
While there are many detection technologies, such as YARA, available for identifying malware distributed in ELF files, they are less resilient to app updates or variations introduced by threat actors. Fortunately, the Android Security and Privacy Team has developed new techniques for detecting malicious Android apps by inspecting their native ELF components. For example, in the gambling app in Figure 3, there are many API calls dynamically resolved via the Java Native Interface (JNI) that interact with the Android runtime. Our detection systems recognize these cross-runtime interactions and reason about their intent. We’ve enumerated behaviors commonly seen in Android malware, such as making ptrace API calls, extracting device information, downloading code from remote servers to local storage, and performing various cryptographic operations via JNI, and turned them into capa detections we can use to identify and block Google Play Store threats.
Let’s now talk a little more about how this works.
Android capa Rules
capa is a tool that detects capabilities in executable files. You run it against a compiled program, and it tells you what it thinks the program can do. For example, capa might suggest that a file is a backdoor, is capable of installing services, or relies on HTTP to communicate.
Mandiant FLARE extended capa to support BinExport2, an architecture-agnostic representation of disassembled programs. This enables capa to match capabilities for additional architectures and file formats, such as those supported by Ghidra and its BinExport2 plugin, with an initial focus on ARM64 ELF files. The Android Security and Privacy Team then created new capa rules focused specifically on detecting capabilities observed in ARM64 ELF files used by various Android malware samples. These proprietary rules, alongside capa’s open-source rules, are used to detect malware capabilities as part of internal Android malware analysis pipelines.
Referring back to the gambling app in Figure 3, the following Google proprietary rules and open-source capa rules matched the malicious functions performing the cloaking techniques, flagging them for further inspection.
Proprietary rules:
Make ptrace API calls
Extract device configuration information via JNI on Android
Extract timezone via JNI on Android
Encode or decode data using Base64 via JNI on Android
Encrypt or decrypt data using Cipher API via JNI on Android
Instead of browsing hundreds of thousands of lines of obfuscated code, our analysts were able to quickly identify evidence of the app’s wrongdoing using the addresses of the functions matching those rules, and took enforcement action on the app.
Gemini Summaries of capa Rule Matches
Safeguarding the Android ecosystem, our Android malware analysis pipelines scan millions of ELF files in depth every day, each containing thousands to millions of lines of decompiled code. On top of the fast-evolving Gemini capabilities in malware analysis, capa rules select the most interesting code for Gemini summarization, sharpening the focus to a much smaller set of the most suspicious functions.
Figure 4: How capa rules and Gemini help our reviewers and automation
We asked Gemini to summarize the functions matched on capa rules from the earlier gambling app with the following prompt:
You are a world-class Android reverse engineer with malware analysis specialities. You are given a few functions from an ARM/AArch64 ELF file as part of an Android App, with their decompiled pseudo-C codes. You need to analyze these functions to
1. summarize what these functions try to achieve
2. decide whether any of them has code/behavior patterns that would be found in Android malware.
{some more instructions…}
…
These functions are decompiled by Ghidra into pseudo-C format as follows:
[function_1 content]
[function_2 content]
…
In your response, please give your overall verdict assessing the risk level using HIGH / MEDIUM / LOW in the beginning, and explain your rationale for each potential malicious behavior.
Gemini responded with the following suggestions:
VERDICT: HIGH RISK
The provided Android application code exhibits several concerning behaviors strongly indicative of malicious intent. The risk level is assessed as HIGH due to the presence of multiple red flags consistent with sophisticated Android malware techniques. The analysis below details the specific malicious behavior patterns observed:
1. Dynamic Code Loading (HIGH): The function comeOn demonstrates a pattern of dynamically loading and executing code. The app fetches files (potentially DEX or ELF) using getCacheFilePath and downloadFile, then attempts to execute them using getDexClassLoader. This is a classic technique for delivering malicious payloads after initial installation, allowing the malware to easily update its functionality and evade static analysis. The use of custom decryption (decryptFile, decryptAES) further enhances the obfuscation and anti-analysis capabilities, indicating a deliberate attempt to conceal the loaded code’s true nature.
2. Custom Decryption and Execution (HIGH): The decryptFile function performs custom decryption of a file, likely to obfuscate the malicious payload. The decryption utilizes an AES algorithm, which adds a layer of complexity to the reversal process. In combination with the dynamic loading, this points strongly towards a sophisticated malware strategy. The decrypted file is then written to disk (fopen, fwrite) and executed using a custom class loader.
3. Time Cloaking (MEDIUM): thread_function incorporates time-based cloaking. It checks the user’s timezone (getTimeZoneInfo) and potentially alters behavior if the timezone is not “America/Sao Paulo”. More importantly, it appends a timestamp ("?time=" + current time) to a string used for downloading further files. This suggests an attempt to evade detection by changing behavior based on the device’s time and potentially making analysis more challenging by dynamically creating different network requests.
4. Network Cloaking (MEDIUM): thread_function exhibits additional cloaking behavior, though details are limited. The function interacts with network resources to download additional resources, showing its network-active nature. The data used in network operations are likely obfuscated through string manipulation and URL obfuscation, making identification and analysis difficult. The precise cloaking mechanisms used are not fully clear from the decompiled code, but the presence of this behavior is a significant concern.
5. Obfuscation Techniques (HIGH): The entire codebase uses extensive obfuscation techniques. The code is heavily reliant on complex data structures (deques, hash tables, trees) and intricate control flow. Function and variable names are nonsensical, hindering analysis. The use of multiple layers of indirection, combined with aggressive heap memory management (operator_new, operator_delete), makes the code difficult to understand and analyze. These obfuscation methods are typical of malware aiming to evade reverse engineering.
6. Anti-Debugging/Anti-Analysis Techniques (HIGH): The JNI_OnLoad function includes a call to ptrace(PTRACE_TRACEME, 0, 0, 0). This is a clear indicator of an attempt to detect debugging or analysis tools. If a debugger is attached, this function’s behavior will likely differ; it could crash, stall the execution, or even hide its malicious code.
Conclusion:
The combination of dynamic code loading, custom decryption, sophisticated obfuscation, and anti-debugging techniques creates a high likelihood of malicious activity. The incomplete understanding of the role of several custom functions further aggravates the risk. The app’s behavior is heavily contingent on received data and time-sensitive factors, raising serious concerns about its potential for wide-ranging malicious acts. Immediate action should be taken to contain and investigate this application further. A detailed sandbox analysis is critical to identify any network communications, file system manipulation, or other abusive activities.
As seen in the Gemini output, the behaviors of the Android ELF functions matched by capa rules are explained clearly.
In this particular example, Gemini helped to:
Highlight the function call sequences that perform dynamic code loading, where our analysts can easily inspect the key function calls getCacheFilePath and getDexClassLoader
Identify the timezone extraction with the additional URL parameter hint, where our analysts may try to probe the malicious payload quickly and accurately
Describe more potential suspicious behaviors (e.g. getDexClassLoader JNI call, URL obfuscation) for further rule-writing ideas
capa rules in Android together with Gemini summarization show great potential for detecting malware that uses more advanced techniques. Our analysts are closely monitoring malware trends and techniques in the market and writing up-to-date capa rules to catch bad actors in the wild.
Android’s Multi-Layered Security Approach
Android’s ever-evolving, multi-layered security approach includes integrating advanced features and working with developers and device implementers to keep the Android platform and ecosystem safe. This includes, but is not limited to:
Advanced built-in protections: Google Play Protect automatically scans every app on Android devices with Google Play Services, no matter the download source. This built-in protection, enabled by default, provides crucial security against malware and unwanted software. Google Play Protect scans more than 200 billion apps daily and performs real-time scanning at the code-level on novel apps to combat emerging and hidden threats, like polymorphic malware. In 2024, Google Play Protect’s real-time scanning identified more than 13 million new malicious apps from outside Google Play.
Google Play and developer protections from malware: To create a trusted experience for everyone on Google Play, we use our SAFE principles as a guide, incorporating multi-layered protections that are always evolving to help keep Google Play safe. These protections start with the developers themselves, who play a crucial role in building secure apps. We provide developers with best-in-class tools, best practices, and on-demand training resources for building safe, high-quality apps. Every app undergoes rigorous review and testing, with only approved apps allowed to appear in the Play Store. Before a user downloads an app from Play, users can explore its user reviews, ratings, and Data safety section on Google Play to help them make an informed decision.
Engagement with the security research community: Google works closely with the security community on multiple levels, including the App Defense Alliance, to advance app safety standards. Android also collaborates with Google Threat Intelligence Group (GTIG) to address emerging threats and safeguard Android users worldwide.
Equipped with the fast-evolving Gemini, our analysts are able to spend less time on these sophisticated samples, minimizing users’ exposure to malicious apps and helping ensure the safety of the Android ecosystem.
Acknowledgement
Special thanks to Willi Ballenthin, Yannis Gasparis, Mike Hunhoff, and Moritz Raabe for their support.
Amazon Elastic Container Service (Amazon ECS) has launched its networking capability, ECS Service Connect, in the AWS GovCloud (US-West) and AWS GovCloud (US-East) Regions.
Amazon ECS is a fully managed container orchestration service that makes it easier for you to deploy, manage, and scale containerized applications. With ECS Service Connect, customers can easily configure service discovery, connectivity, traffic observability, and encryption for services running in Amazon ECS. This enables more efficient application development by allowing you to focus on writing application code instead of managing complex networking infrastructure.
To learn more about how to get started with Amazon ECS Service Connect and how it works, see our documentation.
Starting today, you can connect to your Amazon DocumentDB (with MongoDB compatibility) clusters with a single click through AWS CloudShell directly from the AWS Management Console. This new feature allows our customers to easily connect to their DocumentDB clusters directly from their browser, without requiring any local setup.
One-click connectivity with CloudShell automatically configures VPCs, security groups, and MongoDB shell installation, eliminating the need for manual tasks like setting up security groups and ingress/egress rules. This streamlines the process for database administrators and developers, enabling them to connect quickly to their DocumentDB cluster to preview data or execute ad hoc queries.
This feature is available in all AWS Regions where both Amazon DocumentDB and AWS CloudShell services are supported. For more information about region availability, see the AWS Region table. To get started, follow Get started with Amazon DocumentDB to create an Amazon DocumentDB cluster, then click the “Connect to cluster” button to connect via CloudShell.
Starting today, you can use Amazon SageMaker AI to build, train, and deploy machine learning (ML) models in the Asia Pacific (Malaysia) Region.
Amazon SageMaker AI is a fully managed platform that provides every developer and data scientist with the ability to build, train, and deploy machine learning (ML) models quickly. SageMaker AI removes the heavy lifting from each step of the machine learning process to make it easier to develop high quality models.
Starting today, Amazon Elastic Compute Cloud (Amazon EC2) M6in and M6idn instances are available in the AWS Europe (Zurich) Region. These sixth-generation network-optimized instances, powered by 3rd Generation Intel Xeon Scalable processors and built on the AWS Nitro System, deliver up to 200 Gbps of network bandwidth, 2x more network bandwidth than comparable fifth-generation instances. Customers can use M6in and M6idn instances to scale the performance and throughput of network-intensive workloads such as high-performance file systems, distributed web-scale in-memory caches, caching fleets, real-time big data analytics, and Telco applications such as 5G User Plane Function.
M6in and M6idn instances are available in 10 different instance sizes including metal, offering up to 128 vCPUs and 512 GiB of memory. They deliver up to 100 Gbps of Amazon Elastic Block Store (EBS) bandwidth and up to 400K IOPS. M6in and M6idn instances offer Elastic Fabric Adapter (EFA) networking support on 32xlarge and metal sizes. M6idn instances offer up to 7.6 TB of high-speed, low-latency instance storage.
With this regional expansion, M6in and M6idn instances are available in the following AWS Regions: US East (Ohio, N. Virginia), US West (N. California, Oregon), Europe (Ireland, Frankfurt, Spain, Stockholm, Zurich), Asia Pacific (Mumbai, Singapore, Tokyo, Sydney), Canada (Central), and AWS GovCloud (US-West). Customers can purchase the new instances through Savings Plans, On-Demand, and Spot instances. To learn more, see M6in and M6idn instances page.
Amazon Q Business, the generative AI-powered assistant for finding information, gaining insight, and taking action at work, has introduced an intelligent orchestration feature that simplifies the conversational experience by understanding user queries and automatically directing them to the right data sources and plugins to generate relevant responses.
Previously, users needed to manually switch between different work applications to complete their tasks and gain insights from data sources. Enabling orchestration simplifies the user conversational experience by automatically directing user queries to indexed enterprise data sources and plugins, eliminating the need for manual selection. Orchestration works with retrieval augmented generation (RAG) data for enterprise-specific knowledge, and with both built-in and custom plugins, allowing users to interact with popular business tools like Jira, Salesforce, ServiceNow, Smartsheet, and PagerDuty – all through a single conversation in Amazon Q Business.
The new orchestration feature is available in all AWS Regions where Amazon Q Business is available. To get started with the feature, customers can access the setting directly from the Amazon Q Business console. To learn more, please visit Using global controls in Amazon Q Business. To learn more about Amazon Q Business plugins and how they can enhance your organization’s productivity, visit the Amazon Q Business webpage and explore the Amazon Q Business plugin documentation.
AWS Wickr is excited to announce the launch of “Wickr Files,” a dedicated space that allows you to access and manage files in conversations. Moderators of Wickr rooms and users in self-moderated group conversations can now upload and organize files in folders. Users can toggle between the “Messages” and “Files” tabs to access relevant content and streamline collaboration.
AWS Wickr is a security-first messaging and collaboration service with features designed to help keep your internal and external communications secure, private, and compliant. AWS Wickr protects one-to-one and group messaging, voice and video calling, file sharing, screen sharing, and location sharing with end-to-end encryption. Customers have full administrative control over data, which includes addressing information governance policies, configuring ephemeral messaging options, and deleting credentials for lost or stolen devices. You can log both internal and external conversations in an AWS Wickr network to a private data store that you manage, for data retention and auditing purposes.
AWS Wickr is available in commercial AWS Regions that include US East (N. Virginia), AWS Canada (Central), AWS Asia Pacific (Malaysia, Singapore, Sydney, and Tokyo), and AWS Europe (London, Frankfurt, Stockholm, and Zurich). It is also available in AWS GovCloud (US-West) as Department of Defense Impact Level 5 (DoD IL5)-authorized AWS WickrGov.
To learn more and get started, see the following resources:
AWS Database Migration Service Serverless (AWS DMSS) now supports Amazon S3 source endpoints. Using AWS DMSS, you can now migrate data in CSV or Parquet format from S3 to any supported AWS DMSS target endpoint. AWS DMSS supports both full load and change data capture (CDC) migration modes for CSV and Parquet files with S3 source endpoints, using the AWS DMS console, AWS CLI, or AWS SDKs, in all Regions where DMSS is available.
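As a rough illustration, an S3 source endpoint for Parquet data might be defined with the AWS SDK for Python as in the sketch below. The bucket, role, and table-definition values are placeholders, and the exact S3Settings fields you need should be confirmed against the DMS API reference for S3 source endpoints.

```python
import boto3

dms = boto3.client("dms")

# Sketch of an S3 source endpoint for Parquet files; all values are placeholders.
response = dms.create_endpoint(
    EndpointIdentifier="s3-source-parquet",
    EndpointType="source",
    EngineName="s3",
    S3Settings={
        "ServiceAccessRoleArn": "arn:aws:iam::123456789012:role/dms-s3-access",
        "BucketName": "my-source-bucket",
        "DataFormat": "parquet",
        # S3 sources require an external table definition describing the schema.
        "ExternalTableDefinition": '{"TableCount": "1", "Tables": [ ... ]}',
        # Optional: folder that holds change data capture (CDC) files.
        "CdcPath": "cdc-changes",
    },
)
print(response["Endpoint"]["EndpointArn"])
```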
For information about AWS DMS S3 source endpoints see DMS S3. Visit AWS DMS to get started.
For AWS DMS regional availability, please refer to the AWS Region Table.
Today, AWS Identity and Access Management (IAM) announced support for encrypted SAML assertions. SAML is an open standard that many identity providers (IdPs) use for federated single sign-on (SSO), enabling users or applications in your company to log into the AWS Management Console or call AWS API operations. You can now configure your identity provider to encrypt the SAML assertions that it sends to IAM. This ensures that your assertions are encrypted when passed through intermediaries (for example, the end user’s web browser).
You can use the AWS IAM console, APIs, or CLI to configure SAML encryption for your identity providers that support encryption. Please refer to IAM and your identity provider’s product documentation for detailed configuration steps.
This feature is available in all AWS commercial and AWS GovCloud (US) Regions. To learn more, please refer to IAM documentation.
Amazon Redshift announces the general availability of Query Editor V2 with Amazon Redshift in the Asia Pacific (Malaysia) region. Amazon Redshift Query Editor V2 makes data in your Amazon Redshift data warehouse and data lake more accessible with a web-based tool for SQL users such as data analysts, data scientists, and database developers. With Amazon Redshift Query Editor V2, users can explore, analyze, and collaborate on data. It reduces the operational costs of managing query tools by providing a web-based application that allows you to focus on exploring your data without managing your infrastructure.
The Amazon Redshift Query Editor V2 is a separate web-based SQL client application that you use to author and run queries on your Amazon Redshift data warehouse. You can use it to edit and run queries, visualize results, and share your work with your team. With Amazon Redshift Query Editor V2, you can create databases, schemas, tables, and user-defined functions (UDFs). In a tree-view panel, you can view the schemas in each of your databases; for each schema, you can view its tables, views, UDFs, and stored procedures. The Amazon Redshift Query Editor V2 also comes with sample data and notebooks that can be loaded into a sample database and corresponding schema. You can use it to load data into a database in an Amazon Redshift cluster or workgroup.
Today, AWS is announcing the general availability of a new smaller size of Amazon EC2 F2 instances, F2.6xlarge. Amazon EC2 F2 instances are the second-generation of FPGA-powered instances, purpose built for developing and deploying reconfigurable hardware in the cloud. The new F2.6xlarge instance size has 1 FPGA, compared to the currently available F2 sizes with 4 and 8 FPGAs.
Amazon EC2 F2 instances are ideal for FPGA-accelerated solutions in genomics, multimedia processing, big data, network acceleration, and more. AWS customers can use the new instance size, F2.6xlarge, for applications that are more cost-efficient or performant when running on a single FPGA. F2.6xlarge also provides a lower cost entry point to using F2 instances.
F2 instances are the first to feature an FPGA with 16 GB of high bandwidth memory (HBM). The new F2.6xlarge size features a 3rd generation AMD EPYC processor with 24 vCPUs, 256 GB of system memory, 950 GB of NVMe SSD local storage, and 12.5 Gbps of network bandwidth.
The F2.6xlarge size is now available in the US East (N. Virginia), Europe (London), and US West (Oregon) AWS Regions. To learn more, please visit the Amazon EC2 F2 Instances page.
AWS Step Functions has increased the default quota for the maximum number of registered state machines and activities from 10,000 to 100,000. This tenfold increase enables customers to create and manage significantly more workflows within a single AWS account, providing greater scalability and flexibility.
AWS Step Functions is a visual workflow service capable of orchestrating more than 14,000 API actions from over 220 AWS services to build distributed applications and orchestrate data processing workloads. With this quota increase, customers can now create up to 100,000 state machines and activities, unblocking use cases that require a large number of workflows, such as applications that dynamically create new workflows. This new quota is automatically applied to all AWS accounts and requires no action on your part.
This quota increase is available in all AWS Regions where Step Functions is available. To learn more, see the AWS Step Functions Developer Guide.
As AI continues to unlock new opportunities for business growth and societal benefits, we’re working to reduce the carbon intensity of AI systems — including by optimizing software, improving hardware efficiency, and powering AI models with carbon-free energy.
Today we’re releasing a first-of-its-kind study1 on the lifetime emissions of our Tensor Processing Unit (TPU) hardware. Over two generations — from TPU v4 to Trillium — more efficient TPU hardware design has led to a 3x improvement in the carbon-efficiency of AI workloads.2
Our life-cycle assessment (LCA) provides the first detailed estimate of emissions from an AI accelerator, using observational data spanning raw material extraction, manufacturing, and energy consumption during operation. These measurements provide a snapshot of the average, chip-level carbon intensity of Google’s TPU hardware and enable us to compare efficiency across generations.
Introducing Compute Carbon Intensity (CCI)
Our study examined five models of TPUs to estimate their full life-cycle emissions and understand how hardware design decisions have impacted their carbon-efficiency. To measure emissions relative to computational performance and enable apples-to-apples comparisons between chips, we developed a new metric — Compute Carbon Intensity (CCI) — that we believe can enable greater transparency and innovation across the industry.
CCI quantifies an AI accelerator chip’s carbon emissions per unit of computation (measured in grams of CO2e per Exa-FLOP).3 Lower CCI scores mean lower emissions from the AI hardware platform for a given AI workload — for example training an AI model. We’ve used CCI to track the progress we’ve made in increasing the carbon-efficiency of our TPUs, and we’re excited to share the results.
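Written as a formula (a simplified reading of the definition above and of footnote 3, which notes that CCI combines lifetime embodied and operational emissions):

$$\mathrm{CCI} = \frac{E_{\text{embodied}} + E_{\text{operational}}}{\text{lifetime compute}} \quad \left[\frac{\text{g CO}_2\text{e}}{\text{ExaFLOP}}\right]$$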
Key takeaways
Google’s TPUs have become significantly more carbon-efficient. Our study found a 3x improvement in the CCI of our TPU chips over 4 years, from TPU v4 to Trillium. By choosing newer generations of TPUs — like our 6th-generation TPU, Trillium — our customers not only get cutting-edge performance, but also generate fewer carbon emissions for the same AI workload.
Operational electricity emissions are key. Today, operational electricity emissions comprise the vast majority (70%+) of a Google TPU’s lifetime emissions. This underscores the importance of improving the energy efficiency of AI chips and reducing the carbon intensity of the electricity that powers them. Google’s effort to run on 24/7 carbon-free energy (CFE) on every grid where we operate by 2030 aims directly at reducing the largest contributor to TPU emissions — operational electricity consumption.
Manufacturing matters. While operational emissions dominate an AI chip’s lifetime emissions, emissions associated with chip manufacturing are still notable — and their share of total emissions will increase as we reduce operational emissions with carbon-free energy. The study’s detailed manufacturing LCA helps us target our manufacturing decarbonization efforts towards the highest-impact initiatives. We’re actively working with our supply chain partners to reduce these emissions through more sustainable manufacturing processes and materials.
Our significant improvements in AI hardware carbon-efficiency in this paper complement rapid advancements in AI model and algorithm design. Outside of this study, continued optimization of AI models is reducing the number of computations required for a given model performance. Some models that once required a supercomputer to run can now be run on a laptop, and at Google we’re using techniques like Accurate Quantized Training and speculative decoding to further increase model efficiency. We expect model advancements to continue unlocking carbon-efficiency gains, and are working to quantify the impact of software design on carbon-efficiency in future studies.
Partnering for a sustainable AI future
The detailed approach we’ve taken here allows us to target our efforts to continue increasing the carbon-efficiency of our TPUs.
This life-cycle analysis of AI hardware is an important first step in quantifying and sharing the carbon-efficiency of our AI systems, but it’s just the beginning. We will continue to analyze other aspects of AI’s emissions footprint — for example AI model emissions and software efficiency gains — and share our insights with customers and the broader industry.
1. The authors would like to thank and acknowledge the co-authors for their important contributions: Ian Schneider, Hui Xu, Stephan Benecke, Tim Huang, and Cooper Elsworth.
2. A February 2025 Google case study quantified the full lifecycle emissions of TPU hardware as a point-in-time snapshot across Google’s generations of TPUs. To estimate operational emissions from electricity consumption of running workloads, we used a one month sample of observed machine power data from our entire TPU fleet, applying Google’s 2023 average fleetwide carbon intensity. To estimate embodied emissions from manufacturing, transportation, and retirement, we performed a life-cycle assessment of the hardware. Data center construction emissions were estimated based on Google’s disclosed 2023 carbon footprint. These findings do not represent model-level emissions, nor are they a complete quantification of Google’s AI emissions. Based on the TPU location of a specific workload, CCI results of specific workloads may vary.
3. CCI includes both estimates of lifetime embodied and operational emissions in order to understand the impact of improved chip design on our TPUs. In this study, we hold the impact of carbon-free energy on carbon intensity constant across generations, by using Google’s 2023 average fleetwide carbon intensity. We did this purposefully to remove the impact of deployment location on the results.