Earlier this week, we released Go 1.24, the latest version of Google’s open-source programming language for productively building scalable, production-ready backend and cloud-based systems.
There’s a lot to love about Go 1.24, including support for post-quantum cryptography, a weak pointer implementation, and substantial performance improvements to the Go runtime. Go 1.24 also significantly expands its capabilities for WebAssembly (Wasm), a binary instruction format that lets low-level code run at near-native speeds. With a new `go:wasmexport` compiler directive and the ability to build a reactor for the WebAssembly System Interface (WASI), developers can now export functions from their Go code to Wasm — including in long-running applications — fostering deeper integrations with Wasm hosts and unlocking new possibilities for Go-based Wasm applications.
These additions represent a significant step forward in Go’s Wasm story. For some types of applications, such as those running at the edge, Wasm is essential to meeting performance-critical requirements. Now, developers can leverage Go’s signature capabilities to ensure those use cases are also scalable, secure, and production-ready.
How does it work?
Go first added support for compiling to Wasm in Go 1.11 via the `js/wasm` port, and added a new port for the WASI preview 1 syscall API in Go 1.21. Now, with Go 1.24, the new `go:wasmexport` compiler directive makes Go functions accessible to a Wasm host, enabling the host to call into a Go application as it would a plugin or other extension mechanism. And, with the new WASI reactor build flag, a Go application remains live after its initialization function finishes, helping to ensure that exported functions remain callable without requiring reinitialization — an important feature for long-running applications and services.
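To make this concrete, here is a minimal sketch of an exported function, with illustrative names, along the lines of the Go 1.24 release materials:

```go
package main

// add is made callable from the Wasm host under the export name "add".
// Exported parameter and result types are limited to Wasm-compatible
// types such as int32.
//
//go:wasmexport add
func add(a, b int32) int32 {
	return a + b
}

// main is required by the compiler, but in a WASI reactor build the
// runtime stays initialized after startup so the host can keep calling
// exported functions without re-running initialization.
func main() {}
```

Building it with `GOOS=wasip1 GOARCH=wasm go build -buildmode=c-shared -o add.wasm` produces a WASI reactor module whose exported `add` function the host can invoke repeatedly.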
For more details, be sure to check out this post from the Go blog and read more in the Go docs.
Run Wasm at the edge with Google Cloud
Starting today, you can run Go-compiled Wasm plugins for applications built on Google Cloud at the edge. To do so, you use Service Extensions with Google Cloud’s Application Load Balancers. Service Extensions lets you run your own custom code directly in the request/response path in a fully managed Google environment with optimal latency, so you can customize load balancers to meet your business requirements. All you need to do is provide the code — Google Cloud manages the rest.
To get started with Service Extensions plugins and Go, take a look at our growing samples repository with a local testing toolkit and follow our quickstart guide in the documentation.
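As a rough illustration of what such a plugin looks like, here is a minimal sketch built on the open-source Proxy-Wasm Go SDK. The import path, header name, and scaffolding are assumptions for illustration; the samples repository shows the exact conventions Service Extensions expects.

```go
package main

import (
	// Community Proxy-Wasm SDK; the path is an assumption, check the samples repo.
	"github.com/proxy-wasm/proxy-wasm-go-sdk/proxywasm"
	"github.com/proxy-wasm/proxy-wasm-go-sdk/proxywasm/types"
)

func main() {
	proxywasm.SetVMContext(&vmContext{})
}

type vmContext struct{ types.DefaultVMContext }

func (*vmContext) NewPluginContext(contextID uint32) types.PluginContext {
	return &pluginContext{}
}

type pluginContext struct{ types.DefaultPluginContext }

func (*pluginContext) NewHttpContext(contextID uint32) types.HttpContext {
	return &httpContext{}
}

type httpContext struct{ types.DefaultHttpContext }

// OnHttpRequestHeaders runs in the load balancer's request path and adds
// a custom header before the request is forwarded to the backend.
func (*httpContext) OnHttpRequestHeaders(numHeaders int, endOfStream bool) types.Action {
	if err := proxywasm.AddHttpRequestHeader("x-edge-plugin", "hello"); err != nil {
		proxywasm.LogErrorf("failed to add header: %v", err)
	}
	return types.ActionContinue
}
```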
As organizations rush to adopt generative AI-driven chatbots and agents, it’s important to reduce the risk of exposure to threat actors who force AI models to create harmful content.
We want to highlight two powerful capabilities of Vertex AI that can help manage this risk — content filters and system instructions. Today, we’ll show how you can use them to ensure consistent and trustworthy interactions.
Content filters: Post-response defenses
Content filters analyze generated text and block responses that meet specific criteria, helping to prevent the output of harmful content. They function independently from Gemini models as part of a layered defense against threat actors who attempt to jailbreak the model.
Gemini models on Vertex AI use two types of content filters:
Non-configurable safety filters automatically block outputs containing prohibited content, such as child sexual abuse material (CSAM) and personally identifiable information (PII).
Configurable content filters allow you to define blocking thresholds in four harm categories (hate speech, harassment, sexually explicit, and dangerous content) based on probability and severity scores. These filters are off by default, but you can configure them according to your needs, as sketched below.
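As an illustration, here is a minimal sketch of configuring these thresholds with the Vertex AI Go SDK. The project, location, and model names are placeholders, and the thresholds are arbitrary examples rather than recommendations:

```go
package main

import (
	"context"
	"fmt"
	"log"

	"cloud.google.com/go/vertexai/genai"
)

func main() {
	ctx := context.Background()
	// Placeholder project and location.
	client, err := genai.NewClient(ctx, "my-project", "us-central1")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	model := client.GenerativeModel("gemini-1.5-flash") // placeholder model name

	// Configurable content filters: one blocking threshold per harm category.
	model.SafetySettings = []*genai.SafetySetting{
		{Category: genai.HarmCategoryHateSpeech, Threshold: genai.HarmBlockLowAndAbove},
		{Category: genai.HarmCategoryHarassment, Threshold: genai.HarmBlockMediumAndAbove},
		{Category: genai.HarmCategorySexuallyExplicit, Threshold: genai.HarmBlockMediumAndAbove},
		{Category: genai.HarmCategoryDangerousContent, Threshold: genai.HarmBlockOnlyHigh},
	}

	resp, err := model.GenerateContent(ctx, genai.Text("Summarize our refund policy."))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(resp.Candidates[0].Content.Parts[0])
}
```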
It’s important to note that, like any automated system, these filters can occasionally produce false positives, incorrectly flagging benign content. This can negatively impact user experience, particularly in conversational settings. System instructions (below) can help mitigate some of these limitations.
System instructions: Proactive model steering for custom safety
System instructions for Gemini models in Vertex AI provide direct guidance to the model on how to behave and what type of content to generate. By providing specific instructions, you can proactively steer the model away from generating undesirable content to meet your organization’s unique needs.
You can craft system instructions to define content safety guidelines, such as prohibited and sensitive topics, and disclaimer language, as well as brand safety guidelines to ensure the model’s outputs align with your brand’s voice, tone, values, and target audience.
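Extending the sketch above (same imports and `model` variable), a safety-oriented system instruction can be attached directly to the model object; the policy wording here is purely illustrative:

```go
// Attach a custom safety policy that applies to every request made
// through this model. The instruction text is an invented example.
model.SystemInstruction = &genai.Content{
	Parts: []genai.Part{genai.Text(
		"You are a customer-support assistant for a retail bank. " +
			"Do not provide financial, legal, or medical advice. " +
			"Decline requests that involve violence, self-harm, or illegal activity, " +
			"and respond with a brief disclaimer instead.")},
}
```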
System instructions have the following advantages over content filters:
You can define specific harms and topics you want to avoid, so you’re not restricted to a small set of categories.
You can be prescriptive and detailed. For example, instead of just saying “avoid nudity,” you can define what you mean by nudity in your cultural context and outline allowed exceptions.
You can iterate instructions to meet your needs. For example, if you notice that the instruction “avoid dangerous content” leads to the model being excessively cautious or avoiding a wider range of topics than intended, you can make the instruction more specific, such as “don’t generate violent content” or “avoid discussion of illegal drug use.”
However, system instructions have the following limitations:
They are theoretically more susceptible to zero-shot and other complex jailbreak techniques.
They can cause the model to be overly cautious on borderline topics.
In some situations, a complex system instruction for safety may inadvertently impact overall output quality.
We recommend using both content filters and system instructions.
Evaluate your safety configuration
You can create your own evaluation sets, and test model performance with your specific configurations ahead of time. We recommend creating separate harmful and benign sets, so you can measure how effective your configuration is at catching harmful content and how often it incorrectly blocks benign content.
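One way to operationalize this is sketched below, under the assumption that `model` is the configured `*genai.GenerativeModel` from the earlier examples: run each labeled prompt set through the model and measure how often responses are blocked.

```go
package eval

import (
	"context"

	"cloud.google.com/go/vertexai/genai"
)

// blockRate returns the fraction of prompts the configured model blocks.
// Error handling is deliberately simplified for the sketch: any API error
// is counted as a block, since blocked prompts can surface as errors.
func blockRate(ctx context.Context, model *genai.GenerativeModel, prompts []string) float64 {
	blocked := 0
	for _, p := range prompts {
		resp, err := model.GenerateContent(ctx, genai.Text(p))
		if err != nil || len(resp.Candidates) == 0 ||
			resp.Candidates[0].FinishReason == genai.FinishReasonSafety {
			blocked++
		}
	}
	return float64(blocked) / float64(len(prompts))
}
```

Running this once over the harmful set (where a higher rate is better) and once over the benign set (where a lower rate is better) gives a first-order view of catch rate versus false-positive rate.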
Investing in an evaluation set can help reduce the time it takes to test the model when implementing changes in the future.
How to get started
Both content filters and system instructions play a role in ensuring safe and responsible use of Gemini. The best approach depends on your specific requirements and risk tolerance. To get started, check out content filters and system instructions for safety documentation.
Generative AI is now well beyond the hype and into the realm of practical application. But while organizations are eager to build enterprise-ready gen AI solutions on top of large language models (LLMs), they face challenges in managing, securing, and scaling these deployments, especially when it comes to APIs. As part of the platform team, you may already be building a unified gen AI platform. Some common questions you might have are:
How do you ensure security and safety for your organization? As with any API, LLM APIs represent an attack vector. What are the LLM-specific considerations you need to worry about?
How do you stay within budget when your LLM adoption grows, while ensuring that each team has appropriate LLM capacity they need to continue to innovate and make your business more productive?
How do you put the right observability capabilities in place to understand your usage patterns, help troubleshoot issues, and capture compliance data?
How do you give end users of your gen AI applications the best possible experience, i.e., provide responses from the most appropriate models with minimal downtime?
Apigee, Google Cloud’s API management platform, has enabled our customers to address API challenges like these for over a decade. Here is an overview of the AI-powered digital value chain leveraging Apigee API Management.
Figure 1: AI-powered digital value chain
Gen AI, powered by AI agents and LLMs, is changing how customers interact with businesses, creating a large opportunity. Apigee streamlines the integration of gen AI agents into applications by bolstering their security, scalability, and governance through features like authentication, traffic control, analytics, and policy enforcement. It also manages interactions with LLMs, improving security and efficiency. Additionally, Application Integration, an Integration-Platform-as-a-Service solution from Google Cloud, offers pre-built connectors that allow gen AI agents to easily connect with databases and external systems, helping them fulfill user requests.
This blog details how Apigee’s customers have been using the product to address challenges specific to LLM APIs. We’re also releasing a comprehensive set of reference solutions that enable you to get started on addressing these challenges yourself with Apigee. You can also view a webinar on the same topic, complete with product demos.
Apigee as a proxy for agents
AI agents leverage capabilities from LLMs to accomplish tasks for end users. These agents can be built using a variety of tools, from no-code and low-code platforms to full-code frameworks like LangChain or LlamaIndex. Apigee acts as an intermediary between your AI application and its agents. It enhances security by allowing you to defend your LLM APIs against the OWASP Top 10 API Security risks, manages user authentication and authorization, and optimizes performance through features like semantic caching. Additionally, Apigee enforces token limits to control costs and can even orchestrate complex interactions between multiple AI agents for advanced use cases.
Apigee as a gateway between LLM application and models
Depending on the task at hand, your AI agents might need to tap into the power of different LLMs. Apigee simplifies this by intelligently routing and managing failover of requests to the most suitable LLM using Apigee’s flexible configurations and templates. It also streamlines the onboarding of new AI applications and agents while providing robust access control for your LLMs. Beyond LLMs, agents often need to connect with databases and external systems to fully address users’ needs. Apigee’s robust API Management platform enables these interactions via managed APIs, and for more complex integrations, where custom business logic is required, you can leverage Google Cloud’s Application Integration platform.
It’s important to remember that these patterns aren’t one-size-fits-all. Your specific use cases will influence the architecture pattern for an agent and LLM interaction. For example, you might not always need to route requests to multiple LLMs. In some scenarios, you could connect directly to databases and external systems from the Apigee agent proxy layer. The key is flexibility — Apigee lets you adapt the architecture to match your exact needs.
Now let’s break down the specific areas where Apigee helps one by one:
AI safety
For any API managed with Apigee, you can call out to Model Armor, Google Cloud’s model safety offering that allows you to inspect every prompt and response to protect you against potential prompt attacks and help your LLMs respond within the guardrails you set. For example, you can specify that your LLM application does not provide answers about financial or political topics.
Latency and cost
Model response latency continues to be a major factor when building LLM-powered applications, and it will only grow as more reasoning happens during inference. With Apigee, you can implement a semantic cache that lets you cache responses from any model for semantically similar questions. This dramatically reduces the time end users need to wait for a response.
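Conceptually, a semantic cache keys responses by prompt embeddings rather than by exact strings. The sketch below shows the core idea in plain Go with an arbitrary similarity threshold; Apigee’s managed implementation handles this for you:

```go
package semcache

import "math"

type entry struct {
	embedding []float64 // embedding of the original prompt
	response  string    // cached model response
}

type Cache struct {
	entries   []entry
	threshold float64 // e.g., 0.95: how similar two prompts must be to match
}

func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// Lookup returns a cached response when a previously seen prompt is
// semantically close enough to the query, avoiding a model call entirely.
func (c *Cache) Lookup(queryEmbedding []float64) (string, bool) {
	for _, e := range c.entries {
		if cosine(queryEmbedding, e.embedding) >= c.threshold {
			return e.response, true
		}
	}
	return "", false
}

// Store caches a response under its prompt embedding for future lookups.
func (c *Cache) Store(embedding []float64, response string) {
	c.entries = append(c.entries, entry{embedding, response})
}
```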
Performance
Different models are good at different things. For example, Gemini Pro models provide the highest quality answers, while Gemini Flash models excel at speed and efficiency. You can route users’ prompts to the best model for the job, depending on the use case or application.
You can decide which model to use by specifying it in your API call, and Apigee routes the request to your desired model while keeping a consistent API contract. See this reference solution to get started.
Distribution and usage limits
With Apigee, you can create a unified portal with self-service access to all the models in your organization. You can also set usage limits for individual apps and developers to maintain capacity for those who need it, while also controlling overall costs. See how you can set up usage limits in Apigee using LLM token counts here.
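As an illustration of token-based limits, the sketch below debits a per-app budget using the usage metadata returned with each Vertex AI response. The app names and budgets are invented, and in practice Apigee’s quota policies enforce this pattern in a managed, concurrency-safe way:

```go
package quota

import (
	"fmt"

	"cloud.google.com/go/vertexai/genai"
)

// Per-app monthly token budgets (illustrative names and numbers).
var tokenBudget = map[string]int64{
	"team-alpha": 5_000_000,
	"team-beta":  1_000_000,
}

// chargeTokens debits an app's budget by the tokens a response consumed,
// using the usage metadata returned with every generation response.
// Not concurrency-safe; this is a sketch of the accounting only.
func chargeTokens(app string, resp *genai.GenerateContentResponse) error {
	used := int64(resp.UsageMetadata.TotalTokenCount)
	if tokenBudget[app] < used {
		return fmt.Errorf("app %q has exhausted its token quota", app)
	}
	tokenBudget[app] -= used
	return nil
}
```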
Availability
Due to the high computational demands of LLM inference, model providers regularly restrict the number of tokens you can use in a given time window. If you reach a model’s limit, requests from your applications are throttled, which could lock your end users out of the model. To prevent this, you can implement a circuit breaker in Apigee that re-routes requests to a model with available capacity. See this reference solution to get started.
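The underlying failover logic is conceptually simple. Here is a hand-rolled sketch in which the model names and the `callModel` helper are hypothetical; Apigee expresses this as gateway configuration rather than application code:

```go
package failover

import (
	"context"
	"errors"
	"net/http"
)

// callModel is a hypothetical helper that forwards a prompt to a named
// model endpoint and reports the HTTP status of the attempt.
func callModel(ctx context.Context, model, prompt string) (string, int, error) {
	// ... issue the request to the model's endpoint ...
	return "", http.StatusTooManyRequests, nil // stubbed for the sketch
}

// generateWithFailover tries models in priority order, re-routing to the
// next one whenever the current model is throttled for capacity.
func generateWithFailover(ctx context.Context, prompt string) (string, error) {
	for _, m := range []string{"model-primary", "model-fallback"} {
		body, status, err := callModel(ctx, m, prompt)
		if err != nil {
			return "", err // only capacity errors trigger failover
		}
		if status == http.StatusTooManyRequests {
			continue // this model is at capacity; try the next
		}
		return body, nil
	}
	return "", errors.New("all models are at capacity")
}
```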
Reporting
As a platform team, you need visibility into usage of the various models you support, as well as which apps are consuming how many tokens. You might want to use this data for internal cost reporting or optimization. Whatever your motivation, with Apigee you can build dashboards that show usage based on actual token counts — the currency of LLM APIs. This way you can see the true usage volume across your applications. See this reference solution to get started.
Auditing and troubleshooting
Perhaps you need to log all interactions with LLMs (prompts, responses, RAG data) to meet compliance or troubleshooting requirements. Or perhaps you want to analyze response quality to continue to improve your LLM applications. With Apigee, you can safely log any LLM interaction with Cloud Logging, de-identify it, and inspect it from a familiar interface. Get started here.
Security
With APIs increasingly seen as an attack surface, security is paramount to any API program. Apigee can act as a secure gateway for LLM APIs, allowing you to control access with API keys, OAuth 2.0, and JWT validation. This helps you enforce enterprise security standards when authenticating the users and applications that interact with your models. Apigee can also help prevent abuse and overload by enforcing rate limits and quotas, safeguarding LLMs from malicious attacks and unexpected traffic spikes.
In addition to these security controls, you can use Apigee to control which model providers and models can be used, by creating policies that define which models each user or application can access. For example, you could create a policy that allows only certain users to access your most powerful LLMs, or one that allows certain applications to use your LLMs only for specific tasks. This gives you granular control over how your LLMs are used, so they are only used for their intended purposes. A minimal sketch of that kind of policy check follows.
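In this sketch, the app and model names are invented for illustration; in Apigee this mapping lives in policy configuration rather than application code:

```go
package gateway

import "slices"

// Per-application model allowlists (illustrative names).
var allowedModels = map[string][]string{
	"support-bot":   {"gemini-1.5-flash"},
	"research-tool": {"gemini-1.5-pro", "gemini-1.5-flash"},
}

// modelAllowed reports whether an application may call the requested model.
func modelAllowed(app, model string) bool {
	return slices.Contains(allowedModels[app], model)
}
```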
By integrating Apigee with your LLM architecture, you create a secure and reliable environment for your AI applications to thrive.
Ready to unlock the full potential of gen AI?
Explore Apigee’s comprehensive capabilities for operationalizing AI and start building secure, scalable, and efficient gen AI solutions today! Visit our Apigee generative AI samples page to learn more and get started, watch a webinar with more details, or contact us here!
Google Cloud Next 2025 is coming up fast, and it’s shaping up to be a must-attend event for the cybersecurity community and anyone passionate about learning more about the threat landscape. We’re going to offer an immersive experience packed with opportunities to connect with experts, explore innovative technologies, and hone your skills in the ever-evolving world of cloud security and governance, frontline threat intelligence, enterprise compliance and resilience, AI risk management, and incident response.
Whether you’re a seasoned security pro or just starting your security journey, Next ’25 has something for you.
Immerse yourself in the Security Hub
The heart of our security presence at Next ’25 will be the Security Hub, a dynamic space designed for engagement and exploration. Here, you can dive deep into the full portfolio of Google Cloud Security products, experience expanded demos, and get your most pressing questions answered by the engineers who build them.
Experience the SOC Arena
Step into our Security Operations Center (SOC) Arena for a front-row seat to real-world attack scenarios. Witness the latest hacker tactics and learn how Google Cloud equips cybersecurity teams with the data, AI, and scalable analytics needed to quickly detect and remediate attacks. Between SOC sessions, security experts and key partners will deliver lightning talks, sharing foundational insights and valuable resources to bolster your security knowledge.
Sharpen your skills in the Security Situation Room
The Situation Room offers two unique avenues for boosting your security expertise:
Security Tabletop Workshop: Prepare your organization for challenging security incidents by participating in a realistic cybersecurity tabletop exercise. Role-play different personas in simulated incidents such as a data breach or ransomware attack, and explore potential responses to gain insight into how your team might react, learn from varied perspectives, and refine your approach through collaborative exploration. This exercise can help you identify vulnerabilities, evaluate incident response strategies, address gaps, foster collaboration, clarify roles, and ultimately reduce the potential impact of future attacks.
Birds of a Feather Sessions: These no-slide, discussion-focused sessions offer invaluable opportunities to connect with peers and Google Cloud Security experts. Dive into topics including securing AI, identity and access management, network security, and protection against fraud and abuse. Share challenges, discuss best practices, and explore cutting-edge trends in a collaborative environment as you network, learn, and contribute to the vibrant Google Cloud Security community.
Get hands-on in the Security Sandbox
The Security Sandbox is where the action happens. Two interactive experiences await:
Capture the Flag (CTF): Test your cybersecurity prowess in Google Threat Intelligence’s CTF challenge. This unique game blends real-world data from CISA advisories, ransom notes, and Dark Web intelligence into a simulated threat hunt.
Use industry-standard tools and data to navigate clues, analyze evidence, and solve puzzles. This CTF is designed for all skill levels, offering a chance to learn valuable techniques, experience the thrill of an investigation, and even win prizes.
ThreatSpace: Step into Google Cloud’s ThreatSpace, a digital training ground where you can experience real cyberattacks and practice your incident response skills in a safe environment. Mandiant’s red team will simulate attacks while their incident response team guides you through the investigation. Use Google Cloud Security tools including Security Operations and Threat Intelligence to uncover the attacker’s methods and prevent further damage.
Connect and recharge at Coffee Talk
Grab a coffee, snag a copy of “The Defender’s Advantage,” and chat with Google Cloud Security experts. Learn how our products and services can empower your security strategy across the domains of intelligence, detection, response, validation, hunting, and mission control, and get personalized advice for your organization.
Register today
Next ’25 is your chance to immerse yourself in the world of cybersecurity, connect with industry leaders, and gain the knowledge and skills you need to stay ahead of the curve. To join us, register here.
Cybercrime makes up a majority of the malicious activity online and occupies the majority of defenders’ resources. In 2024, Mandiant Consulting responded to almost four times more intrusions conducted by financially motivated actors than state-backed intrusions. Despite this overwhelming volume, cybercrime receives much less attention from national security practitioners than the threat from state-backed groups. While the threat from state-backed hacking is rightly understood to be severe, it should not be evaluated in isolation from financially motivated intrusions.
A hospital disrupted by a state-backed group using a wiper and a hospital disrupted by a financially motivated group using ransomware have the same impact on patient care. Likewise, sensitive data stolen from an organization and posted on a data leak site can be exploited by an adversary in the same way as data exfiltrated in an espionage operation. These examples are particularly salient today, as criminals increasingly target and leak data from hospitals. Healthcare’s share of posts on data leak sites has doubled over the past three years, even as the number of data leak sites tracked by Google Threat Intelligence Group has increased by nearly 50% year over year. The impact of these attacks means they must be taken seriously as a national security threat, no matter the motivation of the actors behind them.
Cybercrime also facilitates state-backed hacking by allowing states to purchase cyber capabilities, or to co-opt criminals to conduct state-directed operations to steal data or engage in disruption. Russia has drawn on criminal capabilities to fuel cyber support for its war in Ukraine. GRU-linked APT44 (aka Sandworm), a unit of Russian military intelligence, has employed malware available from cybercrime communities to conduct espionage and disruptive operations in Ukraine, while CIGAR (aka RomCom), a group that historically focused on cybercrime, has conducted espionage operations against the Ukrainian government since 2022. However, this is not limited to Russia. Iranian threat groups deploy ransomware to raise funds while simultaneously conducting espionage, and Chinese espionage groups often supplement their income with cybercrime. Most notably, North Korea uses state-backed groups to directly generate revenue for the regime. North Korea has heavily targeted cryptocurrencies, compromising exchanges and individual victims’ crypto wallets.
Despite the overlaps in effects and collaboration with states, tackling the root causes of cybercrime requires fundamentally different solutions. Cybercrime involves collaboration between disparate groups often across borders and without respect to sovereignty. Any solution requires international cooperation by both law enforcement and intelligence agencies to track, arrest, and prosecute these criminals. Individual takedowns can have important temporary effects, but the collaborative nature of cybercrime means that the disrupted group will be quickly replaced by others offering the same service. Achieving broader success will require collaboration between countries and public and private sectors on systemic solutions such as increasing education and resilience efforts.
Stand-Alone Cybercrime is a Threat to Countries’ National Security
Financially motivated cyber intrusions, even those without any ties to state goals, harm national security. A single incident can be impactful enough on its own to have a severe consequence on the victim and disrupt citizens’ access to critical goods and services. The enormous volume of financially motivated intrusions occurring every day also has a cumulative impact, hurting national economic competitiveness and placing huge strain on cyber defenders, leading to decreased readiness and burnout.
A Single Financially Motivated Operation Can Have Severe Effects
Cybercrime, particularly ransomware attacks, is a serious threat to critical infrastructure. Disruptions to energy infrastructure, such as the 2021 Colonial Pipeline attack, a 2022 incident at the Amsterdam-Rotterdam-Antwerp refining hub, and the 2023 attack on Petro-Canada, have disrupted citizens’ ability to access vital goods. While the impacts in these cases were temporary and recoverable, a ransomware attack during a weather emergency or other acute situation could have devastating consequences.
Beyond energy, ransomware attacks on the healthcare sector have had the most severe consequences for everyday people. At the height of the pandemic in early 2020, it appeared that ransomware groups might steer clear of hospitals, with multiple groups making statements to that effect, but the forbearance did not hold. Healthcare organizations’ critical missions and the high impact of disruptions have led them to be perceived as more likely to pay a ransom, and have led some groups to increase their focus on targeting healthcare. The healthcare industry, especially hospitals, almost certainly continues to be a lucrative target for ransomware operators given the sensitivity of patient data and the criticality of the services it provides.
Since 2022, Google Threat Intelligence Group (GTIG) has observed a notable increase in the number of data leak site (DLS) victims from within the hospital subsector. Data leak sites, which are used to release victim data following data theft extortion incidents, are intended to pressure victims to pay a ransom demand or give threat actors additional leverage during ransom negotiations.
In July 2024, the Qilin (aka “AGENDA”) DLS announced upcoming attacks targeting US healthcare organizations. They followed through with this threat by adding a regional medical center to their list of claimed victims on the DLS the following week, and adding multiple healthcare and dental clinics in August 2024. The ransomware operators have purportedly stated that they focus their targeting on sectors that pay well, and one of those sectors is healthcare.
In March 2024, the RAMP forum actor “badbone,” who has been associated with INC ransomware, sought illicit access to Dutch and French medical, government, and educational organizations, stating that they were willing to pay 2–5% more for hospitals, particularly ones with emergency services.
Studies from academics and internal hospital reviews have shown that the disruptions from ransomware attacks go beyond inconvenience and have led to life-threatening consequences for patients. Disruptions can impact not just individual hospitals but also the broader healthcare supply chain. Cyberattacks on companies that manufacture critical medications and life-saving therapies can have far-reaching consequences worldwide.
A recent study from researchers at the University of Minnesota – Twin Cities School of Public Health showed that among patients already admitted to a hospital when a ransomware attack takes place, “in-hospital mortality increases by 35–41%.”
Public reporting stated that UK National Health Service data showed a June 2024 ransomware incident at a contractor led to multiple cases of “long-term or permanent impact on physical, mental or social function or shortening of life-expectancy,” with more numerous cases of less severe effects.
Ransomware operators are aware that their attacks on hospitals will have severe consequences and will likely increase government attention on them. Although some have devised strategies to mitigate the blowback from these operations, the potential monetary rewards associated with targeting hospitals continue to drive attacks on the healthcare sector.
The actor “FireWalker,” who has recruited partners for REDBIKE (aka Akira) ransomware operations, indicated a willingness to accept access to government and medical targets, but in those cases a different ransomware called “FOULFOG” would be used.
Leaked private communications broadly referred to as the “ContiLeaks” reveal that the actors expected their plan to target the US healthcare system in the fall of 2020 to cause alarm, with one actor stating “there will be panic.”
Economic Disruption
On May 8, 2022, Costa Rican President Rodrigo Chaves declared a national emergency caused by CONTI ransomware attacks against several Costa Rican government agencies the month prior. These intrusions caused widespread disruptions in government medical, tax, pension, and customs systems. With imports and exports halted, ports were overwhelmed, and the country reportedly experienced millions of dollars of losses. The remediation costs extended beyond Costa Rica; Spain supported the immediate response efforts, and in 2023, the US announced $25 million USD in cybersecurity aid to Costa Rica.
While the Costa Rica incident was exceptional, responding to a cybercrime incident can involve significant expenses for the affected entity, such as paying multi-million dollar ransom demands, loss of income due to system downtime, providing credit monitoring services to impacted clients, and paying remediation costs and fines. In just one example, a US healthcare organization reported $872 million USD in “unfavorable cyberattack effects” after a disruptive incident. In the most extreme cases, these costs can contribute to organizations ceasing operations or declaring bankruptcy.
In addition to the direct impacts to individual organizations, financial impacts often extend to taxpayers and can have significant impacts on the national economy due to follow-on effects of the disruptions. The US Federal Bureau of Investigation’s Internet Crime Complaint Center (IC3) has indicated that between October 2013 and December 2023, business email compromise (BEC) operations alone led to $55 billion USD in losses. The cumulative effect of these cybercrime incidents can have an impact on a country’s economic competitiveness. This can be particularly severe for smaller or developing countries, especially those with a less diverse economy.
Data Leak Sites Add Additional Threats
In addition to deploying ransomware to interfere with business operations, criminal groups have added the threat of leaking data stolen from victims to bolster their extortion operations. This now standard tactic has increased the volume of sensitive data being posted by criminals and created an opportunity for it to be obtained and exploited by state intelligence agencies.
Threat actors post proprietary company data—including research and product designs—on data leak sites where they are accessible to the victims’ competitors. GTIG has previously observed threat actors sharing tips for targeting valuable data for extortion operations. In our research, GTIG identified Conti “case instructions” indicating that actors should prioritize certain types of data to use as leverage in negotiations, including files containing confidential information, document scans, HR documents, company projects, and information protected by the General Data Protection Regulation (GDPR).
The number of data leak sites has proliferated, with the number of sites tracked by GTIG almost doubling since 2022. Leaks of confidential business and personal information by extortion groups can cause embarrassment and legal consequences for the affected organization, but they also pose national security threats. If a company’s confidential intellectual property is leaked, it can undermine the firm’s competitive position in the market and undermine the host country’s economic competitiveness. The wide-scale leaking of personally identifiable information (PII) also creates an opportunity for foreign governments to collect this information to facilitate surveillance and tracking of a country’s citizens.
Cybercrime Directly Supporting State Activity
Since the earliest computer network intrusions, financially motivated actors have conducted operations for the benefit of hostile governments. While this pattern has been consistent, the surge of cyber activity surrounding Russia’s war in Ukraine has shown that, in times of heightened need, the latent talent pool of cybercriminals can be paid or coerced to support state goals. Operations carried out in support of the state, but by criminal actors, have numerous benefits for their sponsors, including lower cost and increased deniability. As the volume of financially motivated activity increases, so does the potential danger it presents.
States as a Customer in Cybercrime Ecosystems
Modern cybercriminals are likely to specialize in a particular area of cybercrime and partner with other entities with diverse specializations to conduct operations. The specialization of cybercrime capabilities presents an opportunity for state-backed groups to simply show up as another customer for a group that normally sells to other criminals. Purchasing malware, credentials, or other key resources from illicit forums can be cheaper for state-backed groups than developing them in-house, while also providing some ability to blend in to financially motivated operations and attract less notice.
Russian State Increasingly Leveraging Malware, Tooling Sourced from Crime Marketplaces
Google assesses that resource constraints and operational demands have contributed to Russian cyber espionage groups’ increasing use of free or publicly available malware and tooling, including those commonly employed by criminal actors to conduct their operations. Following Russia’s full-scale invasion of Ukraine, GTIG has observed groups suspected to be affiliated with Russian military intelligence services adopt this type of “low-equity” approach to managing their arsenal of malware, utilities, and infrastructure. The tools procured from financially motivated actors are more widespread and lower cost than those developed by the government. This means that if an operation using this malware is discovered, the cost of developing a new tool will not be borne by the intelligence agency; additionally, the use of such tools may assist in complicating attribution efforts. Notably, multiple threat clusters with links to Russian military intelligence have leveraged disruptive malware adapted from existing ransomware variants to target Ukrainian entities.
APT44 (Sandworm, FROZENBARENTS)
APT44, a threat group sponsored by Russian military intelligence, almost certainly relies on a diverse set of Russian companies and criminal marketplaces to source and sustain its more frequently operated offensive capabilities. The group has used criminally sourced tools and infrastructure as a source of disposable capabilities that can be operationalized on short notice without immediate links to its past operations. Since Russia’s full-scale invasion of Ukraine, APT44 has increased its use of such tooling, including malware such as DARKCRYSTALRAT (DCRAT), WARZONE, and RADTHIEF (“Rhadamanthys Stealer”), and bulletproof hosting infrastructure such as that provided by the Russian-speaking actor “yalishanda,” who advertises in cybercriminal underground communities.
APT44 campaigns in 2022 and 2023 deployed RADTHIEF against victims in Ukraine and Poland. In one campaign, spear-phishing emails targeted a Ukrainian drone manufacturer and leveraged SMOKELOADER, a publicly available downloader popularized on a Russian-language underground forum and still frequently used in criminal operations, to load RADTHIEF.
APT44 also has a history of deploying disruptive malware built upon known ransomware variants. In October 2022, a cluster we assessed with moderate confidence to be APT44 deployed PRESSTEA (aka Prestige) ransomware against logistics entities in Poland and Ukraine, a rare instance in which APT44 deployed disruptive capabilities against a NATO country. In June 2017, the group conducted an attack leveraging ETERNALPETYA (aka NotPetya), a wiper disguised as ransomware, timed to coincide with Ukraine’s Constitution Day marking its independence from Russia. Nearly two years earlier, in late 2015, the group used a modified BLACKENERGY variant to disrupt the Ukrainian power grid. BLACKENERGY originally emerged as a distributed denial-of-service (DDoS) tool, with later versions sold in criminal marketplaces.
UNC2589 (FROZENVISTA)
UNC2589, a threat cluster whose activity has been publicly attributed to the Russian General Staff Main Intelligence Directorate (GRU)’s 161st Specialist Training Center (Unit 29155), has conducted full-spectrum cyber operations, including destructive attacks, against Ukraine. The actor is known to rely on non-military elements including cybercriminals and private-sector organizations to enable their operations, and GTIG has observed the use of a variety of malware-as-a-service tools that are prominently sold in Russian-speaking cybercrime communities.
In January 2022, a month prior to the invasion, UNC2589 deployed PAYWIPE (also known as WHISPERGATE) and SHADYLOOK wipers against Ukrainian government entities in what may have been a preliminary strike, using the GOOSECHASE downloader and FINETIDE dropper to drop and execute SHADYLOOK on the target machine. US Department of Justice indictments identified a Russian civilian, who GTIG assesses was a likely criminal contractor, as managing the digital environments used to stage the payloads used in the attacks. Additionally, CERT-UA corroborated GTIG’s findings of strong similarities between SHADYLOOK and WhiteBlackCrypt ransomware (also tracked as WARYLOOK). GOOSECHASE and FINETIDE are also publicly available for purchase on underground forums.
Turla (SUMMIT)
In September 2022, GTIG identified an operation leveraging a legacy ANDROMEDA infection to gain initial access to selective targets conducted by Turla, a cyber espionage group we assess to be sponsored by Russia’s Federal Security Service (FSB). Turla re-registered expired command-and-control (C&C or C2) domains previously used by ANDROMEDA, a common commodity malware that was widespread in the early 2010s, to profile victims; it then selectively deployed KOPILUWAK and QUIETCANARY to targets in Ukraine. The ANDROMEDA backdoor whose C2 was hijacked by Turla was first uploaded to VirusTotal in 2013 and spreads from infected USB keys.
While GTIG has continued to observe ANDROMEDA infections across a wide variety of victims, GTIG has only observed suspected Turla payloads delivered in Ukraine. However, Turla’s tactic of piggybacking on widely distributed, financially motivated malware to enable follow-on compromises is one that can be used against a wide range of organizations. Additionally, the use of older malware and infrastructure may cause such a threat to be overlooked by defenders triaging a wide variety of alerts.
In December 2024, Microsoft reported on the use of Amadey bot malware related to cyber criminal activity to target Ukrainian military entities by Secret Blizzard, an actor that aligns approximately with what we track as Turla. While we are unable to confirm this activity, Microsoft’s findings suggest that Turla has continued to leverage the tactic of using cybercrime malware.
APT29 (ICECAP)
In late 2021, GTIG reported on a campaign conducted by APT29, a threat group assessed to be sponsored by the Russian Foreign Intelligence Service (SVR), in which operators used credentials likely procured from an infostealer malware campaign conducted by a third-party actor to gain initial access to European entities. Infostealers are a broad classification of malware that have the capability or primary goal of collecting and stealing a range of sensitive user information such as credentials, browser data and cookies, email data, and cryptocurrency wallets. An analysis of workstations belonging to the target revealed that some systems had been infected with the CRYPTBOT infostealer shortly before a stolen session token used to gain access to the targets’ Microsoft 365 environment was generated.
Figure: An example of the sale of government credentials on an underground forum
Use of Cybercrime Tools by Iran and China
While Russia is the country that has most frequently been identified drawing on resources from criminal forums, they are not the only ones. For instance, in May 2024, GTIG identified a suspected Iranian group, UNC5203, using the aforementioned RADTHIEF backdoor in an operation using themes associated with the Israeli nuclear research industry.
In multiple investigations, the Chinese espionage operator UNC2286 was observed ostensibly carrying out extortion operations, including using STEAMTRAIN ransomware, possibly to mask its activities. The ransomware dropped a JPG file named “Read Me.jpg” that largely copies the ransomware note delivered with DARKSIDE. However, no links have been established with the DARKSIDE ransomware-as-a-service (RaaS), suggesting the similarities are largely superficial and intended to lend credibility to the extortion attempt. Deliberately mixing ransomware activities with espionage intrusions supports the Chinese Government’s public efforts to confound attribution by conflating cyber espionage activity and ransomware operations.
Criminals Supporting State Goals
In addition to purchasing tools for state-backed intrusion groups to use, countries can directly hire or co-opt financially motivated attackers to conduct espionage and attack missions on behalf of the state. Russia, in particular, has leveraged cybercriminals for state operations.
Current and Former Russian Cybercriminal Actors Engage in Targeted Activity Supporting State Objectives
Russian intelligence services have increasingly leveraged pre-existing or new relationships with cybercriminal groups to advance national objectives and augment intelligence collection, particularly since the beginning of Russia’s full-scale invasion of Ukraine. GTIG judges that this reflects a combination of new efforts by the Russian state and the continuation of relationships with financially motivated, Russia-based threat actors that predated the invasion. In at least some cases, current and former members of Russian cybercriminal groups have carried out intrusion activity likely in support of state objectives.
CIGAR (UNC4895, RomCom)
CIGAR (also tracked as UNC4895 and publicly reported as RomCom) is a dual financial and espionage-motivated threat group. Active since at least 2019, the group historically conducted financially motivated operations before expanding into espionage activity that GTIG judges fulfills espionage requirements in support of Russian national interests following the start of Russia’s full-scale invasion of Ukraine. CIGAR’s ongoing engagement in both types of activity differentiates the group from threat actors like APT44 or UNC2589, which leverage cybercrime actors and tooling toward state objectives. While the precise nature of the relationship between CIGAR and the Russian state is unclear, the group’s high operational tempo, constant evolution of its malware arsenal and delivery methods, and its access to and exploitation of multiple zero-day vulnerabilities suggest a level of sophistication and resourcefulness unusual for a typical cybercrime actor.
Targeted intrusion activity from CIGAR dates back to late 2022, targeting Ukrainian military and government entities. In October 2022, CERT-UA reported on a phishing campaign that distributed emails allegedly on behalf of the Press Service of the General Staff of the Armed Forces of Ukraine, which led to the deployment of the group’s signature RomCom malware. Two months later, in December 2022, CERT-UA highlighted a RomCom operation targeting users of DELTA, a situational awareness and battlefield management system used by the Ukrainian military.
CIGAR activity in 2023 and 2024 included the leveraging of zero-day vulnerabilities to conduct intrusion activity. In late June 2023, a phishing operation targeting European government and military entities used lures related to the Ukrainian World Congress, a nonprofit involved in advocacy for Ukrainian interests, and a then-upcoming NATO summit, to deploy the MAGICSPELL downloader, which exploited CVE-2023-36884 as a zero-day in Microsoft Word. In 2024, the group was reported to exploit the Firefox vulnerability CVE-2024-9680, chained together with the Windows vulnerability CVE-2024-49039, to deploy RomCom.
CONTI
At the outset of Russia’s full-scale invasion of Ukraine, the CONTI ransomware group publicly announced its support for the Russian government, and subsequent leaks of server logs allegedly containing chat messages from members of the group revealed that at least some individuals were interested in conducting targeted attacks, and may have been taking targeting directions from a third party. GTIG further assessed that former CONTI members comprise part of an initial access broker group conducting targeted attacks against Ukraine tracked by CERT-UA as UAC-0098.
UAC-0098 historically delivered the IcedID banking trojan, leading to human-operated ransomware attacks, and GTIG assesses that the group previously acted as an initial access broker for various ransomware groups including CONTI and Quantum. In early 2022, however, the actor shifted its focus to Ukrainian entities in the government and hospitality sectors as well as European humanitarian and nonprofit organizations.
UNC5174
UNC5174 uses the “Uteus” hacktivist persona and has claimed to be affiliated with China’s Ministry of State Security, working as an access broker and possible contractor who conducts for-profit intrusions. UNC5174 has weaponized multiple vulnerabilities soon after they were publicly announced, attempting to compromise numerous devices before they could be patched. For example, in February 2024, UNC5174 was observed exploiting CVE-2024-1709 in ConnectWise ScreenConnect to compromise hundreds of institutions, primarily in the US and Canada, and in April 2024, GTIG confirmed UNC5174 had weaponized CVE-2024-3400 in an attempt to exploit Palo Alto Networks’ (PAN’s) GlobalProtect appliances. In both cases, multiple China-nexus clusters were identified leveraging the exploits, underscoring how UNC5174 may enable additional operators.
Hybrid Groups Enable Cheap Capabilities
Another form of financially motivated activity supporting state goals comes from groups whose main mission may be state-sponsored espionage but that are, either tacitly or explicitly, allowed to conduct financially motivated operations to supplement their income. This can allow a government to offset the direct costs that would be required to maintain groups with robust capabilities.
Moonlighting Among Chinese Contractors
APT41
APT41 is a prolific cyber operator working out of the People’s Republic of China and most likely a contractor for the Ministry of State Security. In addition to state-sponsored espionage campaigns against a wide array of industries, APT41 has a long history of conducting financially motivated operations. The group’s cybercrime activity has mostly focused on the video game sector, including ransomware deployment. APT41 has also enabled other Chinese espionage groups, with digital certificates stolen by APT41 later employed by other Chinese groups. APT41’s cybercrime has continued since GTIG’s 2019 report, with the United States Secret Service attributing an operation that stole millions in COVID relief funds to APT41, and GTIG identifying an operation targeting state and local governments.
Iranian Groups Deploy Ransomware for Disruption and Profit
Over the past several years, GTIG has observed Iranian espionage groups conducting ransomware operations and disruptive hack-and-leak operations. Although much of this activity is likely primarily driven by disruptive intent, some actors working on behalf of the Iranian government may also be seeking ways to monetize stolen data for personal gain, and Iran’s declining economic climate may serve as an impetus for this activity.
UNC757
In August 2024, the US Federal Bureau of Investigation (FBI), Cybersecurity and Infrastructure Security Agency (CISA), and Department of Defense Cyber Crime Center (DC3) released a joint advisory indicating that a group of Iran-based cyber actors known as UNC757 collaborated with ransomware affiliates including NoEscape, RansomHouse, and ALPHV to gain network access to organizations across various sectors and then help the affiliates deploy ransomware for a percentage of the profits. The advisory further indicated that the group stole data from targeted networks, likely in support of the Iranian government, and that their ransomware operations were likely not sanctioned by the Government of Iran.
GTIG is unable to independently corroborate UNC757’s reported collaboration with ransomware affiliates. However, the group has historical, suspected ties to the persona “nanash” that posted an advertisement in mid-2020 on a cybercrime forum claiming to have access to various networks, as well as hack-and-leak operations associated with the PAY2KEY ransomware and corresponding persona that targeted Israeli firms.
Examples of Dual Motive (Financial Gain and Espionage)
In multiple incidents, individuals who have conducted cyber intrusions on behalf of the Iranian government have also been identified conducting financially motivated intrusions.
A 2020 US Department of Justice indictment indicated that two Iranian nationals conducted cyber intrusion operations targeting data “pertaining to national security, foreign policy intelligence, non-military nuclear information, aerospace data, human rights activist information, victim financial information and personally identifiable information, and intellectual property, including unpublished scientific research.” The intrusions in some cases were conducted at the behest of the Iranian government, while in other instances, the defendants sold hacked data for financial gain.
In 2017, the US DoJ indicted an Iranian national who attempted to extort HBO by threatening to release stolen content. The individual had previously worked on behalf of the Iranian military to conduct cyber operations targeting military and nuclear software systems and Israeli infrastructure.
DPRK Cyber Threat Actors Conduct Financially Motivated Operations to Generate Revenue for Regime, Fund Espionage Campaigns
Financially motivated operations are broadly prevalent among threat actors linked to the Democratic People’s Republic of Korea (DPRK). These include groups focused on generating revenue for the regime as well as those that use the illicit funds to support their intelligence-gathering efforts. DPRK cybercrime focuses on the cryptocurrency sector and blockchain-related platforms, leveraging tactics including, but not limited to, the creation and deployment of malicious applications posing as cryptocurrency trading platforms and the airdropping of malicious non-fungible tokens (NFTs) that redirect the user to wallet-stealing phishing websites. A March 2024 United Nations (UN) report estimated North Korean cryptocurrency theft between 2017 and 2023 at approximately $3 billion.
APT38
APT38, a financially motivated group aligned with the Reconnaissance General Bureau (RGB), was responsible for the attempted theft of vast sums of money from institutions worldwide, including via compromises targeting SWIFT systems. Public reporting has associated the group with the use of money mules and casinos to withdraw and launder funds from fraudulent ATM and SWIFT transactions. In publicly reported heists alone, APT38’s attempted thefts from financial institutions totaled over $1.1 billion USD, and by conservative estimates, successful operations have amounted to over $100 million USD. The group has also deployed destructive malware against target networks to render them inoperable following theft operations. While APT38 now appears to be defunct, we have observed evidence of its operators regrouping into other clusters, including those heavily targeting cryptocurrency and blockchain-related entities and other financial institutions.
UNC1069 (CryptoCore), UNC4899 (TraderTraitor)
Limited indicators suggest that threat clusters GTIG tracks as UNC1069 (publicly referred to as CryptoCore) and UNC4899 (also reported as TraderTraitor) are successors to the now-defunct APT38. These clusters focus on financial gain, primarily by targeting cryptocurrency and blockchain entities. In December 2024, a joint statement released by the US FBI, DC3, and National Police Agency of Japan (NPA) reported on TraderTraitor’s theft of cryptocurrency then valued at $308 million USD from a Japan-based company.
APT43 (Kimsuky)
APT43, a prolific cyber actor whose collection requirements align with the mission of the RGB, funds itself through cybercrime operations to support its primary mission of collecting strategic intelligence, in contrast to groups focused primarily on revenue generation like APT38. While the group’s espionage targeting is broad, it has demonstrated a particular interest in foreign policy and nuclear security, leveraging moderately sophisticated technical capabilities coupled with aggressive social engineering tactics against government organizations, academia, and think tanks. Meanwhile, APT43’s financially motivated operations focus on stealing and laundering cryptocurrency to buy operational infrastructure.
UNC3782
UNC3782, a suspected North Korean threat actor active since at least 2022, conducts both financial crime operations against the cryptocurrency sector and espionage activity, including the targeting of South Korean organizations attempting to combat cryptocurrency-related crimes, such as law firms and related government and media entities. UNC3782 has targeted users on cryptocurrency platforms including Ethereum, Bitcoin, Arbitrum, Binance Smart Chain, Cronos, Polygon, TRON, and Solana; Solana in particular constitutes a target-rich environment for criminal actors due to the platform’s rapid growth.
APT45 (Andariel)
APT45, a North Korean cyber operator active since at least 2009, has conducted espionage operations focusing on government, defense, nuclear, and healthcare and pharmaceutical entities. The group has also expanded its remit to financially motivated operations, and we suspect that it engaged in the development of ransomware, distinguishing it from other DPRK-nexus actors.
DPRK IT Workers
DPRK IT workers pose as non-North Korean nationals seeking employment at a wide range of organizations globally to generate revenue for the North Korean regime, enabling it to evade sanctions and fund its weapons of mass destruction (WMD) and ballistic missile programs. IT workers have also increasingly leveraged their privileged access at employer organizations to engage in or enable malicious intrusion activity and, in some cases, extort those organizations with threats of data leaks or sales of proprietary company information following the termination of their employment.
While DPRK IT worker operations are widely reported to target US companies, they have increasingly expanded to Europe and other parts of the world. Tactics to evade detection include the use of front companies and the services of “facilitators,” non-North Korean individuals who provide services such as money and/or cryptocurrency laundering, assistance during the hiring process, and receiving and hosting company laptops to enable the workers’ remote access, in exchange for a percentage of the workers’ incomes.
A Comprehensive Approach is Required
We believe tackling this challenge will require a new and stronger approach recognizing the cybercriminal threat as a national security priority requiring international cooperation. While some welcome enhancements have been made in recent years, more must—and can—be done. The structure of the cybercrime ecosystem makes it particularly resilient to takedowns. Financially motivated actors tend to specialize in a single facet of cybercrime and regularly work with others to accomplish bigger schemes. While some actors may repeatedly team up with particular partners, actors regularly have multiple suppliers (or customers) for a given service.
If a single ransomware-as-a-service provider is taken down, many others are already in place to fill in the gap that has been created. This resilient ecosystem means that while individual takedowns can disrupt particular operations and create temporary inconveniences for cybercriminals, these methods need to be paired with wide-ranging efforts to improve defense and crack down on these criminals’ ability to carry out their operations. We urge policymakers to consider taking a number of steps:
Demonstrably elevate cybercrime as a national security priority: Governments must recognize cybercrime as a pernicious national security threat and allocate resources accordingly. This includes prioritizing intelligence collection and analysis on cybercriminal organizations, enhancing law enforcement capacity to investigate and prosecute cybercrime, and fostering international cooperation to dismantle these transnational networks.
Strengthen cybersecurity defenses: Policymakers should promote the adoption of robust cybersecurity measures across all sectors, particularly critical infrastructure. This includes incentivizing the implementation of security best practices, investing in research and development of advanced security technologies, enabling digital modernization and uptake of new technologies that can advantage defenders, and supporting initiatives that enhance the resilience of digital systems against attacks and related deceptive practices.
Disrupt the cybercrime ecosystem: Targeted efforts are needed to disrupt the cybercrime ecosystem by targeting key enablers such as malware developers, bulletproof hosting providers, and financial intermediaries such as cryptocurrency exchanges. This requires a combination of legal, technical, and financial measures to dismantle the infrastructure that supports cybercriminal operations and coordinated international efforts to enable the same.
Enhance international cooperation: Cybercrime transcends national borders, necessitating strong international collaboration to effectively combat this threat. Policymakers should prioritize and resource international frameworks for cyber threat information sharing, joint investigations, and coordinated takedowns of cybercriminal networks, including by actively contributing to the strengthening of international organizations and initiatives dedicated to combating cybercrime, such as the Global Anti-Scams Alliance (GASA). They should also prioritize collective efforts to publicly decry malicious cyber activity through joint public attribution and coordinated sanctions, where appropriate.
Empower individuals and businesses: Raising awareness about cyber threats and promoting cybersecurity education is crucial to building a resilient society. Policymakers should support initiatives that educate individuals and businesses about online safety, encourage the adoption of secure practices, empower service providers to take action against cybercriminals including through enabling legislation, and provide resources for reporting and recovering from cyberattacks.
Elevate strong private sector security practices: Ransomware and other forms of cybercrime predominantly exploit insecure, often legacy technology architectures. Policymakers should consider steps to prioritize technology transformation, including the adoption of technologies/products with a strong security track record; diversifying vendors to mitigate risk resulting from overreliance on a single technology; and requiring interoperability across the technology stack.
About the Authors
Google Threat Intelligence Group brings together the Mandiant Intelligence and Threat Analysis Group (TAG) teams, and focuses on identifying, analyzing, mitigating, and eliminating entire classes of cyber threats against Alphabet, our users, and our customers. Our work includes countering threats from government-backed attackers, targeted 0-day exploits, coordinated information operations (IO), and serious cybercrime networks. We apply our intelligence to improve Google’s defenses and protect our users and customers.
The recent explosion of machine learning (ML) applications has created unprecedented demand for power delivery in the data center infrastructure that underpins those applications. Unlike server clusters in the traditional data center, where tens of thousands of workloads coexist with uncorrelated power profiles, large-scale batch-synchronized ML training workloads exhibit substantially different power usage patterns. Under these new usage conditions, it is increasingly challenging to ensure the reliability and availability of the ML infrastructure, as well as to improve data-center goodput and energy efficiency.
Google has been at the forefront of data center infrastructure design for several decades, with a long list of innovations to our name. In this blog post, we highlight one of the key innovations that allowed us to manage unprecedented power and thermal fluctuations in our ML infrastructure. This innovation underscores the power of full codesign across the stack — from ASIC chip to data center, across both hardware and software. We also discuss the implications of this approach and propose a call to action for the broader industry.
New ML workloads lead to new ML power challenges
Today’s ML workloads require synchronized computation across tens of thousands of accelerator chips, together with their hosts, storage, and networking systems; these workloads often occupy one entire data-center cluster — or even multiples of them. The peak power utilization of these workloads could approach the rated power of all the underlying IT equipment, making power oversubscription much more difficult. Furthermore, power consumption rises and falls between idle and peak utilization levels much more steeply, because the entire cluster’s power usage is now dominated by no more than a few large ML workloads. You can observe these power fluctuations when a workload launches or finishes, or when it is halted, then resumed or rescheduled. You may also observe a similar pattern when the workload is running normally, mostly attributable to alternating compute- and networking-intensive phases of the workload within a training step. Depending on the workload’s characteristics, these inter- and intra-job power fluctuations can occur very frequently. This can result in multiple unintended consequences on the functionality, performance, and reliability of the data center infrastructure.
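The scale effect here is easy to underestimate. As a rough intuition, here is a toy Python simulation (all numbers are illustrative, not measurements): thousands of hosts with uncorrelated phases average out to a nearly flat aggregate, while the same hosts phase-aligned by one batch-synchronized job swing the whole cluster’s power together.

```python
# Toy model: aggregate power of 10,000 hosts that each alternate between a
# compute-heavy phase (1.0 kW) and a networking-heavy phase (0.3 kW).
# All numbers are illustrative, not measured values.
import numpy as np

rng = np.random.default_rng(0)
steps, n_hosts, period, duty = 200, 10_000, 20, 10
t = np.arange(steps)

def cluster_power(offsets):
    # (n_hosts, steps) grid: 1.0 kW in the compute half of each period.
    in_compute = ((t[None, :] + offsets[:, None]) % period) < duty
    return np.where(in_compute, 1.0, 0.3).sum(axis=0)

random_offsets = rng.integers(0, period, size=n_hosts)  # uncorrelated jobs
aligned_offsets = np.zeros(n_hosts, dtype=int)          # one synchronized job

print("uncorrelated swing (MW):", np.ptp(cluster_power(random_offsets)) / 1000)
print("synchronized swing (MW):", np.ptp(cluster_power(aligned_offsets)) / 1000)
```

With uncorrelated phases the peaks and troughs cancel; with a single synchronized job the full 0.7 kW per-host swing appears at cluster scale — several megawatts even in this toy setup.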
Fig. 1. Large power fluctuations observed on cluster level with large-scale synchronized ML workloads
In fact, in our latest batch-synchronous ML workloads running on dedicated ML clusters, we observed power fluctuations in the tens of megawatts (MW), as shown in Fig. 1. And compared to a traditional load variation profile, the ramp speed could be almost instantaneous, repeat as frequently as every few seconds, and last for weeks… or even months!
Fluctuations of this kind pose the following risks:
Functionality and long-term reliability issues with rack and data center equipment (including rectifiers, transformers, generators, cables, and busways), resulting in hardware-induced outages, reduced energy efficiency, and increased operational and maintenance costs
Damage, outage, or throttling at the upstream utility, including violation of contractual commitments to the utility on power usage profiles, and corresponding financial costs
Unintended and frequent triggering of the uninterruptible power supply (UPS) system from large power fluctuations, resulting in a shortened lifetime of the UPS system
Large power fluctuations may also impact hardware reliability at a much smaller per-chip or per-system scale. Although the maximum temperature is well under control, power fluctuations may still translate into large and frequent temperature fluctuations, triggering various failure mechanisms, including warpage, changes to thermal interface material properties, and electromigration.
A full-stack approach to proactive power shaping
Due to the high complexity and large scale of our data-center infrastructure, we posited that proactively shaping a workload’s power profile could be more efficient than simply adapting to it. Google’s full codesign across the stack — from chip to data center, from hardware to software, and from instruction set to realistic workload — provides us with all the knobs we need to implement highly efficient end-to-end power management features to regulate our workloads’ power profiles and mitigate detrimental fluctuations.
Specifically, we instrumented the TPU compiler to detect workload signatures that are linked with power fluctuations, such as sync flags. We then dynamically balance the activities of the TPU’s major compute blocks around these flags to smooth out their utilization over time. This achieves our goal of mitigating power and thermal fluctuations with negligible performance overhead. In the future, we may also apply a similar approach to the workload’s starting and completion phases, resulting in a gradual, rather than abrupt, change in power levels.
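The exact compiler changes are internal to the TPU toolchain, but the effect on the power profile is akin to the following toy ramp limiter, which caps how fast utilization may change from one step to the next (purely illustrative, not the actual mechanism):

```python
def shape_power(samples, max_step=0.05):
    """Toy ramp limiter over a utilization trace (values in 0..1).

    The real mitigation rebalances TPU compute-block activity around sync
    flags inside the compiler; this sketch only mimics the smoothed
    profile that results.
    """
    shaped = [samples[0]]
    for target in samples[1:]:
        prev = shaped[-1]
        step = max(-max_step, min(max_step, target - prev))
        shaped.append(prev + step)
    return shaped

# Square-wave utilization: compute bursts alternating with network-bound lulls.
raw = ([1.0] * 10 + [0.2] * 10) * 3
smoothed = shape_power(raw)
print(max(raw) - min(raw))                       # 0.8 swing before shaping
print(round(max(smoothed) - min(smoothed), 2))   # ~0.5 swing after shaping
```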
We’ve now implemented this compiler-based approach to shaping the power profile and applied it on realistic workloads. We measured the system’s total power consumption and a single chip’s hotspot temperature with, and without, the mitigation, as plotted in Fig. 2 and Fig. 3, respectively. In the test case, the magnitude of power fluctuations dropped by nearly 50% from the baseline case to the mitigation case. The magnitude of temperature fluctuations also dropped from ~20°C in the baseline case to ~10°C in the mitigation case. We measured the cost of the mitigation by the increase in average power consumption and the length of the training step. With proper tuning of the mitigation parameters, we can achieve the benefits of our design with a small increase in average power and less than 1% performance impact.
Fig. 2. Power fluctuation with and without the compiler-based mitigation
Fig. 3. Chip temperature fluctuation with and without the compiler-based mitigation
A call to action
ML infrastructure is growing rapidly and is expected to surpass traditional server infrastructure in terms of total power demand in the coming years. At the same time, ML infrastructure’s power and temperature fluctuations are unique and tightly coupled with the ML workload’s characteristics. Mitigating these fluctuations is just one example of many innovations we need to ensure reliable and high-performance infrastructure. In addition to the method described above, we’ve been investing in an array of innovative techniques to take on ever-increasing power and thermal challenges, including data center water cooling, vertical power delivery, power-aware workload allocation, and many more.
But these challenges aren’t unique to Google. Power and temperature fluctuations in ML infrastructure are becoming a common issue for many hyperscalers and cloud providers as well as infrastructure providers. We need partners at all levels of the system to help:
Utility providers to set forth a standardized definition of acceptable power quality metrics — especially in scenarios where multiple data centers with large power fluctuations co-exist within the same grid and interact with one another
Power and cooling equipment suppliers to offer quality and reliability enhancements for electronics components, particularly for use-conditions with large and frequent power and thermal fluctuations
Hardware suppliers and data center designers to create a standardized suite of solutions such as rack-level capacitor banks (RLCB) or on-chip features, to help establish an efficient supplier base and ecosystem
ML model developers to consider the energy-consumption characteristics of their models, and to add low-level software mitigations that help address energy fluctuations
Google has been leading and advocating for industry-wide collaboration on these issues through forums such as Open Compute Project (OCP) to benefit the data center infrastructure industry as a whole. We look forward to continuing to share our learnings and collaborating on innovative new solutions together.
A special thanks to Denis Vnukov, Victor Cai, Jianqiao Liu, Ibrahim Ahmed, Venkata Chivukula, Jianing Fan, Gaurav Gandhi, Vivek Sharma, Keith Kleiner, Mudasir Ahmad, Binz Roy, Krishnanjan Gubba Ravikumar, Ashish Upreti and Chee Chung from Google Cloud for their contributions.
At Google Cloud, we strive to make it easy to deploy AI models onto our infrastructure. In this blog we explore how the Cross-Cloud Network solution supports your AI workloads.
Managed and Unmanaged AI options
Google Cloud provides both managed (Vertex AI) and do-it-yourself (DIY) approaches for running AI workloads.
Vertex AI: A fully managed machine learning platform. Vertex AI offers both pre-trained Google models and access to third-party models through Model Garden. As a managed service, Vertex AI handles infrastructure management, allowing you to concentrate on training, tuning, and inferencing your AI models; a minimal usage sketch follows this list.
Custom infrastructure deployments: These deployments utilize various compute, storage, and networking options based on the type of workload the user is running. AI Hypercomputer is one way to deploy both HPC workloads that may not require GPUs or TPUs, and AI workloads that run on TPUs or GPUs.
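Here is a minimal sketch of the managed path; the project, location, and model name are illustrative placeholders, and the model you choose will depend on your use case:

```python
# Minimal sketch: calling a managed Gemini model on Vertex AI.
# Project, location, and model name below are illustrative placeholders.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")

model = GenerativeModel("gemini-1.5-flash")
response = model.generate_content("Draft a product description for a trail shoe.")
print(response.text)
```

Because the service is fully managed, there is no cluster to provision: the same few calls work whether you are prototyping or serving production traffic.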
Networking for managed AI
With Vertex AI, you don’t have to worry about the underlying infrastructure. For network connectivity, the service is accessible via public API by default. Enterprises that want to use private connectivity have a choice of Private Service Access, Private Google Access, Private Service Connect endpoints, and Private Service Connect for Google APIs. The option you choose will vary based on the specific Vertex AI service you are using. You can learn more in the Accessing Vertex AI from on-premises and multicloud documentation.
Networking AI infrastructure deployments
Let’s look at a sample case: an organization has data located in another cloud and would like to deploy an AI cluster with GPUs on Google Cloud.
Based on this scenario, you need to analyze the networking requirements across planning, data ingestion, training, and inference.
Planning: This crucial initial phase involves defining your requirements: the size of the cluster (number of GPUs), the type of GPUs needed, the desired region and zone for deployment, storage, and the anticipated network bandwidth for transfers. This planning informs the subsequent steps. For instance, training large language models like Llama, which have billions of parameters, requires a significantly larger cluster than fine-tuning smaller models.
Data ingestion: Since the data is located in another cloud, you need a high-speed connection so that the data can be accessed directly or transferred to a storage option in Google Cloud. To facilitate this, Cross-Cloud Interconnect offers a direct connection at high bandwidth with a choice of 10 Gbps or 100 Gbps per link. Alternatively, if the data is located on-premises, you can use Cloud Interconnect.
Training: Training workloads demand high-bandwidth, low-latency, lossless cluster networking. You can achieve GPU-to-GPU communication that bypasses the system OS with Remote Direct Memory Access (RDMA). Google Cloud networking supports the RDMA over Converged Ethernet (RoCE) protocol in special network VPCs using the RDMA network profile. Proximity is important, so nodes and clusters need to be as close to each other as possible for best performance.
Threat actors who target cloud environments are increasingly focusing on exploiting compromised cloud identities. A compromise of human or non-human identities can lead to increased risks, including cloud resource abuse and sensitive data exfiltration. These risks are exacerbated by the sheer number of identities in most organizations; as they grow, the attack surface they represent also grows.
As described in the latest Google Cloud Threat Horizons Report, organizations should prioritize measures that can strengthen identity protection.
“We recommend that organizations incorporate automation and awareness strategies such as strong password policies, mandatory multi-factor authentication, regular reviews of user access and cloud storage bucket security, leaked credential monitoring on the dark web, and account lockout mechanisms,” said Iain Mulholland, senior director, Security Engineering, in last week’s Cloud CISO Perspectives newsletter.
Today, we are detailing key risk mitigations from Google Cloud security experts that you can quickly act on. Every organization should evaluate these mitigations as part of their efforts to protect their cloud deployments.
Google Cloud’s built-in protections
Google Cloud provides always-on account protection measures that help mitigate credential theft. Many of these protections are based on heuristics that detect likely credential theft and terminate an attacker’s session. Others limit the use of suspected stolen cookies to minutes, instead of hours.
Google Cloud requires users to reauthenticate to confirm the validity of their credentials before allowing many sensitive actions in the Cloud Console. This reauthentication can happen deterministically or based on a risk score.
Google Cloud sets default Organization Policies on newly created organizations to guard against common risks of service credential theft and sharing of resources.
However, as attacker tactics evolve, it’s important to have additional layers of defense in place spanning multi-factor authentication (MFA), protecting sessions, protecting service credentials, identity and access controls, and security monitoring.
Google Cloud customers are encouraged to adopt the following measures to help increase protection against credential theft:
Multi-factor authentication (MFA): As part of our shared fate approach to help customers, we recently described our plans to make MFA mandatory for all Google Cloud users this year. If you have not enabled MFA yet, you can take these steps in advance of mandatory enforcement:
Enable MFA on your primary Identity Provider (IdP). For Google Cloud customers who use Google Cloud Identity as their primary IdP, follow these instructions.
Add an MFA instrument to Google Cloud Identity accounts for re-authentication. If Google Cloud Identity is not your primary IdP, this provides an independent layer of verification prior to allowing sensitive actions. Follow these instructions.
Configure your IdP to always challenge (ideally with MFA) when accessing Google. When Google Cloud customers use Cloud Identity with their own IdP through SAML or OIDC, Cloud Identity queries the IdP for an attestation when the session expires or when Google Cloud requires re-authentication. In the default configuration, IdPs silently approve all these attestations to minimize user friction. However, most IdPs can be configured to always require re-entering credentials, and even to always require MFA whenever Google Cloud requests an attestation. This configuration can be set up to only apply to the app representing Google Cloud, and not for all apps that the IdP federates for a smoother user and administrative experience.
Protecting sessions: We recommend four controls that can help increase session protection:
Limiting session length can reduce the usefulness of stolen cookies. The default session length is 16 hours, and is user-configurable. Here are instructions for setting session length, and you can read more on session length management.
Limiting IPs allowed to access Cloud Console and APIs with Context-Aware Access (CAA) can make stolen credentials useless (unless the attacker has access to allowlisted IPs, such as the corporate network or VPN IPs.)
Certificate-based access can be used to require mTLS certificates to access Cloud Console and Google Cloud APIs. mTLS provides strong protection against cookie theft, requiring users to present an mTLS certificate in addition to existing credentials such as cookies. mTLS certificates are typically stored in the Trusted Platform Module (TPM) of the user’s device, making them extremely difficult for an attacker to steal. Many enterprises already deploy mTLS certificates to their users, and Google Cloud allows customers to either reuse their existing mTLS certificates, or use new ones just for Google Cloud.
Contextual-access restrictions can be configured with Access Context Manager, which allows Google Cloud organization administrators to define fine-grained, attribute-based access control for projects and resources. Access levels can be configured to require additional device and user attributes to be met in order for a resource request to be successful. For example, you can require that a corporate-managed device be used to access and configure resources.
Protecting service credentials: Organizations should also build layered protection for non-human identities. Google Cloud offers detailed best practices for managing, using, and securing service account keys and API keys. Three important controls to consider:
Disable creation of service account keys: This Organization Policy setting prevents users from creating persistent keys for service accounts. Instead of allowing unqualified use of service account keys, choose the right authentication method for your use case, and allow exceptions for service account keys only for scenarios that cannot use more secure alternatives.
Disable leaked service account keys automatically: Google Cloud regularly scans public repositories (including GitHub and GitLab) for leaked service account keys. If Google Cloud detects an exposed key, it will automatically disable the key. It also creates a Cloud Audit Logs event and sends a notification about the exposed key to project owners and security contacts. We strongly recommend not modifying the DISABLE_KEY option (which is on by default).
Binding service account keys to trusted networks: Context Aware Access for service accounts enables customers to bind service accounts to an IP-range or specific VPC networks, and enforce that service accounts can access Google Cloud services and APIs only from these trusted networks. Customers can request early access to this control using this form.
Identity and access controls: Adhering to the principle of least privilege can help limit the impact of credential compromise; use these controls to limit access and privileges to only what users need to perform their job functions.
Google Cloud Identity and Access Management (IAM) lets you grant granular access to specific Google Cloud resources and can help prevent access to other resources. Permissions are grouped into roles, and roles are granted to authenticated principals. You should regularly review and right-size permissions using tools such as IAM Recommender; a minimal sketch of a narrowly scoped role grant follows this list. The Google Cloud Architecture Framework provides additional best practices for managing identity and access.
VPC Service Controls enable a powerful, context-aware approach to control access for your cloud resources. You can create granular access control policies based on attributes such as user identity and IP address. These policies ensure specific security controls are in place before granting access to cloud resources from untrusted networks. By allowing access only from authorized networks, VPC Service Controls helps protect against the risk of data exfiltration presented by clients using stolen OAuth or service account credentials.
Principal access boundaries can precisely define the resources that a principal is eligible to access. If a policy makes a principal ineligible to access a resource, then their access to that resource is limited regardless of the roles they’ve been granted.
Restrict identities by domain using domain-restricted sharing to limit role grants to users belonging to a specific domain or organization. When domain restricted sharing is active, only principals that belong to allowed domains or organizations can be granted IAM roles in your Google Cloud organization.
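To make the least-privilege idea concrete, here is a minimal sketch of the standard read-modify-write pattern for granting a narrowly scoped role with the Resource Manager v1 API; the project ID, role, and principal are placeholders:

```python
# Sketch: grant a narrowly scoped role using the Resource Manager v1 API.
# Requires google-api-python-client and Application Default Credentials.
# The project ID, role, and principal below are illustrative placeholders.
from googleapiclient import discovery

crm = discovery.build("cloudresourcemanager", "v1")
project_id = "my-project"

# Read the current policy (the returned etag guards against lost updates).
policy = crm.projects().getIamPolicy(resource=project_id, body={}).execute()

# Add a binding for exactly the access the principal needs, nothing broader.
policy.setdefault("bindings", []).append({
    "role": "roles/bigquery.dataViewer",
    "members": ["user:analyst@example.com"],
})

crm.projects().setIamPolicy(
    resource=project_id, body={"policy": policy}
).execute()
```

Pairing narrow grants like this with periodic IAM Recommender reviews keeps permissions from drifting beyond what each principal actually uses.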
Security monitoring: In addition to implementing preventative controls, you should proactively monitor your cloud environment for signs of compromise. Early detection can help limit the business impact of a compromise.
Security Command Center (SCC) is Google Cloud’s built-in security and risk management platform. It provides comprehensive security posture management, threat detection, and compliance monitoring.
With SCC’s Cloud Infrastructure Entitlement Management (CIEM) capabilities, you can manage which identities have access to which resources in your deployments, mitigate potential vulnerabilities that result from misconfigurations, and enforce the principle of least privilege. The Sensitive Actions Service within SCC automatically detects and alerts on potentially damaging actions occurring across your cloud organization, folders, and projects. SCC’s Virtual Red Teaming capability continuously detects if high value resources are exposed and surfaces the identities and access paths that could lead to compromise.
Next steps
Maintaining a strong security posture requires ongoing evaluation of the risks your organization faces, and the controls you have in place to address them. These recommendations can help you strengthen your cloud estate against the growing risks associated with credential compromise.
You can learn more about protecting your Google Cloud deployments in our security Best Practices Center.
2025 is off to a racing start. From announcing strides in the new Gemini 2.0 model family to retailers accelerating with Cloud AI, we spent January investing in our partner ecosystem, open-source, and ways to make AI more useful. We’ve heard from people everywhere, from developers to CMOs, about the pressure to adopt the latest in AI with efficiency and speed – and the delicate balance of being both conservative and forward-thinking. We’re here to help. Each month, we’ll post a retrospective that recaps Google Cloud’s latest announcements in AI – and importantly, how to make the most of these innovations.
Top announcements: Bringing AI to you
This month, we announced agent evaluation in Vertex AI. A surprise to nobody, AI agents are top of mind for many industries looking to deploy their AI and boost productivity. But closing the gap between impressive model demos and real-world performance is crucial for successfully deploying generative AI. That’s why we announced Vertex AI’s RAG Engine, a fully managed service that helps you build and deploy RAG implementations with your data and methods. Together, these new innovations can help you build reliable, trustworthy models.
From an infrastructure perspective, we announced new updates to AI Hypercomputer. We wanted to make it easier for you to run large multi-node workloads on GPUs by launching A3 Ultra VMs and Hypercompute Cluster, our new highly scalable clustering system. This builds on multiple advancements in AI infrastructure, including Trillium, our sixth-generation TPU.
At the same time, we shared several important announcements in the world of open-source. We announced Mistral AI’s Mistral Large 24.11 and Codestral 25.01 models on Vertex AI. These models will help developers write code and build faster – from high-complexity tasks to reasoning tasks, like creative writing. To help you get started, we provided sample code and documentation.
And, most recently, we announced the public beta of Gen AI Toolbox for Databases in partnership with LangChain, the leading orchestration framework for developers building LLM applications. Toolbox is an open-source server that empowers application developers to connect production-grade, agent-based generative AI applications to databases. You can get started here.
Industry news: Google Cloud at the National Retail Federation (NRF)
The National Retail Federation kicked off the year with their annual NRF conference, where Google Cloud showed how AI agents and AI-powered search are already helping retailers operate more efficiently, create personalized shopping experiences, and use AI to get the latest products and experiences to their customers. Check out our new AI tools that help retailers build gen AI search and agents.
As an example, Google Cloud worked with NVIDIA to empower retailers to boost their customer engagements in exciting new ways, deliver more hyper-personalized recommendations, and build their own AI applications and agents. Now with NVIDIA’s AI Enterprise software available on Google Cloud, retailers can handle more data and more complex AI tasks without their systems getting bogged down.
News you can use
This month, we shared several ways to better implement fast-moving AI, from a comprehensive guide on Supervised Fine Tuning (SFT), to how developers can help their LLMs deliver more accurate, relevant, and contextually aware responses, minimizing hallucinations and building trust in AI applications by optimizing their RAG retrieval.
We also published new documentation to use open models in Vertex AI Studio. Model selection isn’t limited to Google’s Gemini anymore. Now, choose models from Anthropic, Meta, and more when writing or comparing prompts.
Hear from our leaders
We closed out the month with The Prompt, our monthly column that brings observations from the field of AI. This month, we heard from Warren Barkley, AI product leader, who shares some best practices and essential guidance to help organizations successfully move AI pilots to production. Here’s a snippet:
More than 60% of enterprises are now actively using gen AI in production, helping to boost productivity and business growth, bolster security, and improve user experiences. In the last year alone, we witnessed a staggering 36x increase in Gemini API usage and a nearly 5x increase of Imagen API usage on Vertex AI — clear evidence that our customers are making the move towards bringing gen AI to their real-world applications.
Stay tuned for monthly updates on Google Cloud’s AI announcements, news, and best practices. For a deeper dive into the latest from Google Cloud, read our weekly updates, The Overwhelmed Person’s Guide to Google Cloud.
We are excited to announce the availability of datasets on Google Cloud Marketplace through BigQuery Analytics Hub, opening up new avenues for organizations to power innovative analytics use cases and procure data for enterprise business needs. As a centralized procurement platform, Google Cloud Marketplace offers access to a wide array of enterprise applications, foundational AI models, LLMs, and now, commercial and free datasets from third-party data providers and Google. BigQuery Analytics Hub enables cross-organizational zero-copy sharing at scale, with governance, security, and encryption all built in natively.
This deep integration between Google Cloud Marketplace and Analytics Hub not only simplifies data procurement for customers, but also helps data providers extend reach to a global audience and unlock additional business opportunities. Let’s delve into the various benefits this development brings.
Streamlined data procurement for customers
The introduction of BigQuery datasets on Google Cloud Marketplace offers numerous advantages for customers looking to access high-quality datasets to power analytics, AI and to optimize business applications. We offer a wide variety of datasets, including commercial data products from leading providers such as Dun & Bradstreet, Equifax, and Weather Source, a Pelmorex company. Data teams can now easily find, buy, and consume datasets from a centralized, comprehensive catalog — the same place where they discover generative AI, analytics and business applications that integrate with or run on Google Cloud. By simplifying the data discovery and procurement process, businesses can allocate their resources more efficiently, reduce administrative burden, and accelerate data and AI-driven initiatives. Datasets purchased from Google Cloud Marketplace can draw down the customer’s Google Cloud commitment.
Immediate access to purchased data
Upon purchasing a dataset, customers can gain instant access to it within their BigQuery environment through Analytics Hub. By subscribing to a purchased BigQuery dataset in Analytics Hub, a linked dataset is immediately created in the customer’s own Google Cloud project. This allows businesses to swiftly integrate procured data with their own data without requiring data movement or replication, expedite analytical processes, and accelerate time-to-value. By eliminating the delays commonly associated with data procurement and by streamlining data delivery time, organizations can quickly leverage the acquired data to inform strategic decisions and drive innovation.
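As a sketch of what this looks like in practice, once a subscription has created a linked dataset in your project, you query it like any other BigQuery dataset; the project, dataset, and table names below are placeholders for whatever you subscribed to:

```python
# Sketch: querying a linked dataset created by an Analytics Hub subscription.
# The project, dataset, and table names are illustrative placeholders.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

query = """
    SELECT station_id, AVG(temperature_c) AS avg_temp_c
    FROM `my-project.linked_weather_data.daily_observations`
    GROUP BY station_id
    ORDER BY avg_temp_c DESC
    LIMIT 10
"""

for row in client.query(query).result():
    print(row.station_id, row.avg_temp_c)
```

Because the linked dataset is a live pointer to the provider’s shared data, there is no copy to refresh: provider updates are visible on your next query.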
Cost control, security and governance
Customers procuring datasets through Google Cloud Marketplace can benefit significantly from cost savings, as linked datasets in Analytics Hub are live pointers to shared data and require no data copying, and there are no extra replication or storage costs to account for. In addition, customers can reduce billing sprawl with consolidated billing for Google Cloud services, third-party ISV solutions, and now datasets. A recent Google Cloud commissioned IDC study1 found that Google Cloud Marketplace can help customers lower spending on third-party solutions by 21.2% on average, largely due to avoiding unnecessary purchases, reducing duplicative spend, and leveraging committed spend discounts. Customers gain cost efficiencies and improved time-to-value opportunities by consolidating contracts across their entire organization.
On the security front, Google Cloud provides robust features to support data protection. Analytics Hub natively supports provider and subscriber project isolation, helping to ensure that commercial data can be safely shared across organizational boundaries. Customers can also apply specific security configurations via BigQuery and Analytics Hub, including Virtual Private Cloud Service Controls support, allowing for tailored access controls to help safeguard from unauthorized access.
Furthermore, organizations can maintain governance and control over the solutions in use by turning on the Google Cloud Private Marketplace capability, enabling a curated collection of trusted products — including datasets — that can be discovered, procured and used by their data analyst teams. With Private Marketplace, administrators can maintain control over which datasets are used, yet also ensure that governance controls do not hinder productivity by turning on the ability for end-users to request additional products be made available. The same IDC study found that managing third-party software purchases through Google Cloud Marketplace can result in 31% productivity gains for compliance teams1.
Data providers extend reach to customers
Data provider partners get significant advantages by listing their offerings on Google Cloud Marketplace, gaining access to a wider customer base, facilitating market expansion and business growth. With a streamlined onboarding process, data providers can create new revenue channels by efficiently making their datasets available to new customers.
Once the transaction is completed in Google Cloud Marketplace, Analytics Hub automatically enables customer access to the data provider’s data, minimizing friction for sellers and customers. In addition, the integration with Analytics Hub means data updates are propagated instantly, so that end users have access to the most current information, enhancing customer satisfaction and loyalty. Google Cloud Marketplace supports dataset transactions via the agency model, which at the time of this announcement is enabled for customers and partners based in France, Germany, the United Kingdom, and the United States.
Unlock monetization opportunities
Google Cloud Marketplace opens up various monetization opportunities for data provider partners. Those who already have data in BigQuery can quickly share at scale with Analytics Hub, commercialize, list, and unlock new income streams through Google Cloud Marketplace. Integration opportunities between Analytics Hub and Google Cloud Marketplace further enable partners to capitalize on the intrinsic value of their data, expanding their monetization strategies and maximizing revenue potential.
Partners have the flexibility to transact with customers via public, off-the-shelf pricing or through custom-negotiated private offers. They can set up fixed-fee subscriptions and customize payment schedules for data offerings without needing complex technical integrations, simplifying the process of generating revenue. Leverage Google Cloud’s standard agreements or provide your own. Finally, with Analytics Hub usage metrics and subscription management, data providers can easily analyze usage behavior, identify patterns, and add or revoke subscriptions, all within a single pane of glass. And if they execute campaigns to drive traffic to Google Cloud Marketplace dataset offerings, they can track traffic and conversion in the Analytics dashboard within Google Cloud Marketplace Producer Portal. Whether it’s through fixed subscriptions or through offering advanced data services, partners have numerous ways to monetize data effectively on our platform.
Data provider partners are excited about the business opportunities and customer use cases that BigQuery datasets on Google Cloud Marketplace can help deliver.
“Driving adoption of Dun & Bradstreet data through joint-go-to-market is a key pillar of our partnership with Google Cloud. We are excited about the ability for our mutual customers to seamlessly transact Dun & Bradstreet’s high-quality and trusted data on the Google Cloud Marketplace and immediately unlock powerful analytics and real-time insights. Having more of our AI-ready data on BigQuery helps organizations be deliberate about their data strategy.” – Isabel Gomez Vidal, Chief Revenue Officer, Dun & Bradstreet
“Our collaboration with Google Cloud to make Equifax data available on Google Cloud Marketplace and Analytics Hub represents a significant step forward in data accessibility. By leveraging this platform, our customers can now integrate Equifax insights seamlessly into their existing workflows, driving innovation and informed decision-making.” – Felipe Castillo, Chief Product Officer, US Information Solutions, Equifax
“We are proud to be an early adopter of the Google Cloud Marketplace and we are looking forward to building upon our initial success leveraging the integrated functionality in BigQuery. Google Cloud Marketplace has accelerated lead capturing, procurement, and delivery of our data assets, allowing our teams to focus on unlocking business opportunities with our mutual customers.” – Craig Stelmach, Senior Vice President of Business Development and Sales, Weather Source, a Pelmorex Company
Analytics Hub and Google Cloud Marketplace are helping to reshape the landscape of how customers and data providers make the most out of data to power the next generation of AI and enterprise use cases. Learn more about Analytics Hub and explore datasets on Google Cloud Marketplace.
One of the most compelling aspects of cloud computing is being able to automatically scale resources up and, almost as importantly, to scale them back down to manage costs and performance. This is standard practice with virtual machines, for instance with Compute Engine Managed Instance Groups, but because of their inherent complexity, less so with stateful services such as databases.
Memorystore for Redis Cluster capacity is determined by the number of shards in your cluster, which can be increased or decreased without downtime, and your cluster’s shard size, which maps to the underlying node type. At this time, the node type of the cluster is immutable, so to scale capacity in or out, you modify the number of shards in your cluster. To automate this process, you can deploy the Memorystore Cluster Autoscaler to monitor your cluster metrics and rightsize your cluster based on that information. The Autoscaler performs the necessary resource adjustments using rulesets that evaluate memory and CPU utilization, without impacting cluster availability.
The following chart shows the Autoscaler in action, with a Memorystore for Redis Cluster instance automatically scaling out as memory utilization increases. The green line represents data being written to the cluster at the rate of one gigabyte every five minutes. The blue line represents the number of shards in the cluster. You can see that the cluster scales out, with the number of shards increasing in proportion to the memory utilization, then plateaus when the writes stop, and finally scales back in when the keys are flushed at the end of the test.
Experience and deployment
To use the Autoscaler, deploy it to one of your Google Cloud projects. The Autoscaler is very flexible and there are multiple options for its deployment, so the repository contains multiple example Terraform deployment configurations, as well as documentation that describes the various deployment models.
Once you’ve deployed the Autoscaler, configure it according to the scaling requirements of the Memorystore instances being managed, to suit your workloads’ characteristics. You do this by setting Autoscaler configuration parameters for each of the Memorystore instances. Once configured, the Autoscaler autonomously manages and scales the Memorystore instances. You can read more about these parameters later in this post, and in the Autoscaler documentation.
Autoscaler architecture
The Autoscaler consists of two main components, the Poller and the Scaler. You can deploy these to either Cloud Run functions or Google Kubernetes Engine (GKE) via Terraform, and configure them so that the Autoscaler runs according to a user-defined schedule. The Poller queries the Memorystore metrics in Cloud Monitoring at a pre-defined interval to determine utilization, and passes them to the Scaler. The Scaler then compares the metrics against the recommended thresholds specified in the rule set, and determines if the instance should be scaled in or out, and if so, by how many shards. You can modify the sample configuration to determine minimum and maximum cluster sizes and any other thresholds suitable for your environment.
Throughout the flow, the Autoscaler writes a step-by-step summary of its recommendations and actions to Cloud Logging for tracking and auditing, as well as metrics to Cloud Monitoring to provide insight into its actions.
Scaling rubrics
Memorystore performance is most commonly limited by in-memory storage and by CPU. The Autoscaler is configured by default to take both of these factors into consideration when scaling, by utilizing the CPU_AND_MEMORY profile. This is a good place to start your deployment, and can be replaced with a custom configuration, if required, to best suit your needs.
Defaults:

| Metric | Average Default Setting | Max Default Setting |
| --- | --- | --- |
| CPU scale OUT | CPU > 70% | Max CPU > 80% and average CPU > 50% |
| CPU scale IN | CPU < 50% * | Max CPU < 60% and average CPU < 40% * |
| Memory scale OUT | Usage > 70% | Max usage > 80% and average usage > 50% |
| Memory scale IN | Usage < 50% * | Max usage < 60% and average usage < 40% * |
* Scale-in will be blocked if there are ongoing key evictions, which occur when the keyspace is full and keys are removed from the cache to make room. Scale-in is enabled by default, but can be configured using a custom scaling profile. Refer to the Scaling Profiles section of the documentation for more information on how to do this.
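To make the defaults concrete, here is a minimal Python sketch of how a scaler could evaluate these rules. The thresholds come from the table above, but the combination logic and names are illustrative assumptions, not the Autoscaler’s actual source (which is open source and worth reading directly):

```python
# Illustrative evaluation of the default CPU_AND_MEMORY thresholds above.
# Threshold values are from the defaults table; the way the rules combine
# is a simplifying assumption, not the Autoscaler's actual implementation.

def wants_scale_out(avg: float, mx: float) -> bool:
    # OUT if average utilization > 70%, or max > 80% with average > 50%.
    return avg > 0.70 or (mx > 0.80 and avg > 0.50)

def allows_scale_in(avg: float, mx: float) -> bool:
    # IN if average utilization < 50%, or max < 60% with average < 40%.
    return avg < 0.50 or (mx < 0.60 and avg < 0.40)

def decide(cpu_avg, cpu_max, mem_avg, mem_max, evicting: bool) -> str:
    if wants_scale_out(cpu_avg, cpu_max) or wants_scale_out(mem_avg, mem_max):
        return "SCALE_OUT"   # pressure on either resource scales out
    if (not evicting         # scale-in is blocked during key evictions
            and allows_scale_in(cpu_avg, cpu_max)
            and allows_scale_in(mem_avg, mem_max)):
        return "SCALE_IN"    # both resources must be comfortably low
    return "HOLD"

print(decide(0.75, 0.85, 0.40, 0.55, evicting=False))  # SCALE_OUT
```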
Scaling scenarios and methods
Let’s take a look at some typical scenarios and their specific utilization patterns, and the Autoscaler configurations best suited to each of them. You can read more about the options described in the following section in the configuration documentation.
Standard workloads
With many applications backed by Memorystore, users interact with the application at certain times of day more than others, in a regular pattern — think a banking application where users check their accounts in the morning, make transactions during the afternoon and early evening, but don’t use the application much at night.
We refer to this fairly typical scenario as a “standard workload” whose time series shows:
Large utilization increase or decrease at certain points of the day
Small spikes over and under the threshold
A recommended base configuration for these types of workload should include the following (see the sketch after this list):
The LINEAR scalingMethod to cover large scale events
A small value for scaleOutCoolingMinutes — between 5 and 10 minutes — to minimize Autoscaler’s reaction time.
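Expressed as an Autoscaler configuration, that might look like the following sketch. The parameter names are the ones described in this post, but the values and exact schema are illustrative, so consult the configuration documentation for the authoritative format:

```python
# Illustrative Autoscaler configuration for a standard workload.
# Parameter names follow this post; values and the surrounding schema
# are assumptions -- see the Autoscaler configuration documentation.
standard_workload_config = {
    "projectId": "my-project",          # placeholder project
    "instanceId": "my-redis-cluster",   # placeholder Memorystore instance
    "minSize": 3,                       # shards at quiet times (overnight)
    "maxSize": 30,                      # ceiling for the daily peak
    "scalingMethod": "LINEAR",          # covers large scale events
    "scaleOutCoolingMinutes": 5,        # react quickly to the morning ramp
}
```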
Plateau workloads
Another common scenario is applications with more consistent utilization during the day such as global apps, games, or chat applications. User interactions with these applications are more consistent, so the jumps in utilization are less pronounced than for a standard workload.
These scenarios create a “plateau workload” whose time series shows:
A pattern composed of various plateaus during the day
Some larger spikes within the same plateau
A recommended base configuration for these types of workload should include:
The STEPWISE scalingMethod, with a stepSize sufficient to cover the largest utilization jump using only a few steps during a normal day, OR
The LINEAR scalingMethod, if there is likely to be a considerable increase or reduction in utilization at certain times, for example when breaking news is shared. Use this method together with a scaleInLimit to avoid reducing the capacity of your instance too quickly
Batch workloads
Customers often need increased capacity for their Memorystore clusters to handle batch processes or a sales event, where the timing is usually known in advance. These scenarios comprise a “batch workload” with the following properties:
A scheduled, well-known peak that requires additional compute capacity
A drop in utilization when the process or event is over
A recommended base configuration for these types of workloads should include two separate scheduled jobs:
One for the batch process or event, that includes an object in the configuration that uses the DIRECT scalingMethod, and a minSize value of the peak number of shards/nodes to cover the process or event
One for regular operations, that includes configuration with the same projectId and instanceId, but using the LINEAR or STEPWISE method. This job will take care of decreasing the capacity when the process or event is over
Be sure to choose an appropriate scaling schedule so that the two configurations don’t conflict. For both Cloud Run functions and GKE deployments, make sure the batch operation starts before the Autoscaler starts to scale the instance back in again. You can use the scaleInLimit parameter to slow the scale-in operation down if needed.
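A sketch of such a pair of scheduled configurations, with the same caveats as above (parameter names follow this post; the exact schema may differ), might look like:

```python
# Illustrative pair of configurations for a batch workload: one scheduled
# job pins capacity for the known event, the other restores normal scaling.
# Values and the exact schema are assumptions -- see the Autoscaler docs.
batch_event_config = {
    "projectId": "my-project",
    "instanceId": "my-redis-cluster",
    "scalingMethod": "DIRECT",   # jump straight to the known peak size
    "minSize": 40,               # peak shard count for the event
}

regular_operations_config = {
    "projectId": "my-project",
    "instanceId": "my-redis-cluster",
    "scalingMethod": "LINEAR",   # scale back in once the event is over
    "minSize": 5,
    "scaleInLimit": 20,          # slows scale-in; see the docs for units
}
```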
Spiky workloads
Depending on load, it can take several minutes for Memorystore to update the cluster topology and fully utilize new capacity. Therefore, if your workload is characterized by very spiky traffic or sudden-onset load patterns, the Autoscaler might not be able to provision capacity quickly enough to avoid latency, or efficiently enough to yield cost savings.
For these spiky workloads, a base configuration should do the following (see the sketch after this list):
Set a minSize that slightly over-provisions the usual instance workload
Use the LINEAR scalingMethod, in combination with a scaleInLimit to avoid further latency when the spike is over
Choose scaling thresholds large enough to smooth out some smaller spikes, while still being reactive to large ones
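A corresponding configuration sketch, again using the parameter names from this post with illustrative values:

```python
# Illustrative Autoscaler configuration for a spiky workload.
# Values and the exact schema are assumptions -- see the Autoscaler docs.
spiky_workload_config = {
    "projectId": "my-project",
    "instanceId": "my-redis-cluster",
    "minSize": 10,               # slight over-provision of the usual load
    "scalingMethod": "LINEAR",
    "scaleInLimit": 10,          # damp scale-in so short spikes don't thrash
}
```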
Advanced usage
As described above, the Autoscaler is preconfigured with scaling rules designed to optimize cluster size based on CPU and memory utilization. However, depending on your workload(s), you may find that you need to modify these rules to suit your utilization, performance and/or budget goals.
There are several ways to customize the rule sets that are used for scaling, in increasing order of effort required:
Choose to scale on only memory or only CPU metrics. This can help if you find your clusters flapping, i.e., alternating rapidly between sizes. You can do this by specifying a scalingProfile of either CPU or MEMORY to override the default CPU_AND_MEMORY in the Autoscaler configuration.
Use your own custom scaling rules by specifying a scalingProfile of CUSTOM, and supplying a custom rule set in the Autoscaler configuration as shown in the example here.
Create your own custom rule sets and make them available for everyone in your organization to use as part of a scaling profile. You can do this by customizing one of the existing scaling profiles to suit your needs. We recommend starting by looking at the existing scaling rules and profiles, and creating your own customizations.
Next steps
The OSS Autoscaler comes with a Terraform configuration to get you started, which can be integrated into your codebase for production deployments. We recommend starting with non-production environments, and progressing through to production when you are confident with the behavior of the Autoscaler alongside your application(s). Some more tips for production deployments are here in the documentation.
If there are additional features you would like to see in the Autoscaler — or would like to contribute to it yourself — please don’t hesitate to raise an issue via the GitHub issues page. We’re looking forward to hearing from you.
Today, we are thrilled to announce the public beta launch of Gen AI Toolbox for Databases in partnership with LangChain, the leading orchestration framework for developers building large language model (LLM) applications.
Gen AI Toolbox for Databases (Toolbox) is an open-source server that empowers application developers to connect production-grade, agent-based generative AI (gen AI) applications to databases. It streamlines the creation, deployment, and management of sophisticated gen AI tools capable of querying databases with secure access, robust observability, scalability, and comprehensive manageability. It also provides connectivity to popular open-source databases such as PostgreSQL and MySQL, as well as Google’s industry-leading cloud databases like AlloyDB, Spanner, and Cloud SQL for SQL Server. We are open to contributions that add support for databases outside of Google Cloud.
In this post, we’ll explore how Gen AI Toolbox for Databases works, and how to get started.
Challenges in gen AI tool management
Building AI agents requires using different tools and frameworks and connecting to various data sources. This process creates several challenges for developers, particularly when these tools need to query databases. These include:
Scaling tool management: Current approaches to tool integration often require extensive, repetitive code and modifications across multiple locations for each tool. This complexity hinders consistency, especially when tools are shared across multiple agents or services. A more streamlined framework integration is needed to simplify tool management and ensure consistency across agents and applications.
Complex database connections: Databases require configuration, connection pooling, and caching for optimal performance at scale.
Security vulnerabilities: Ensuring secure access from gen AI models to sensitive data requires complex integration with auth services, databases and the application, which can be error-prone and introduce security risks.
Inflexible tool updates: Adding new tools or updating existing ones often requires a complete rebuild and redeployment of the application, potentially leading to downtime.
Limited workflow observability: Current solutions lack built-in support for comprehensive monitoring and troubleshooting, making it difficult to gain insights into gen AI workflows with databases.
Components
Gen AI Toolbox for Databases improves how gen AI tools interact with data, addressing common challenges in gen AI tool management. By acting as an intermediary between the application’s orchestration layer and data sources/databases, it enables faster development and more secure data access, improving the production-quality of tools.
Toolbox comprises two components: a server specifying the tools for application use, and a client interacting with this server to load these tools onto orchestration frameworks. This centralizes tool deployment and updates, incorporating built-in production best practices to enhance performance, security, and simplify deployments.
Benefits
Toolbox offers various features that provide better manageability, security, and observability for AI agents. Some of the benefits for application developers are as follows:
Simplified development – Reduced boilerplate code and consolidated integration simplifies tool development and sharing across other agents.
Built-in performance and scale – Built-in connection pooling and optimized connectors for popular databases to handle connection management efficiency.
Zero downtime deployment – A config-driven approach enables seamless deployment of new tools and updates without any service interruption and supports incremental rollouts.
Enhanced security – Using OAuth 2.0 and OIDC, built-in support for common auth providers enables control over agents’ access to tools and data.
End-to-end observability – Toolbox integrates with OpenTelemetry, providing day-one insights via logging, metrics, and tracing, offering end-to-end observability for better operations.
Compatibility with LangChain
LangChain is the most popular developer framework for building LLM applications, and we’re excited to announce Toolbox compatibility with the LangChain ecosystem from day one. Together with Toolbox, LangGraph can leverage LLMs like Gemini on Vertex AI to build powerful agentic workflows.
LangGraph extends LangChain’s capabilities by providing a framework for building stateful, multi-actor applications with LLMs. Its support for cycles, state management, and coordination enables the development of complex and dynamic AI agents. All of these capabilities integrate seamlessly with Toolbox.
Tool calling is essential for building agents. Agents need to call tools in a controlled and specified way, run the tool reliably, and then pass the correct context back to the LLM. LangGraph provides a low-level agent framework for managing how tools are called and how their responses are integrated, ensuring precision and control. Toolbox then handles the execution itself, seamlessly running the tool and returning results. Together, they create a powerful solution for tool calling in agent workflows.
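As a sketch of how the pieces can fit together, the snippet below hands Toolbox-loaded tools to a prebuilt LangGraph ReAct agent backed by Gemini on Vertex AI; the model name and question are placeholders, and it assumes the `tools` list from the earlier sketch.

```python
# Sketch: a LangGraph agent that calls Toolbox-managed database tools.
from langchain_google_vertexai import ChatVertexAI
from langgraph.prebuilt import create_react_agent

model = ChatVertexAI(model_name="gemini-1.5-pro")  # placeholder model ID

# LangGraph decides when to call a tool and feeds results back to the
# LLM; Toolbox executes each tool call against the database.
agent = create_react_agent(model, tools)  # `tools` from the Toolbox client

result = agent.invoke(
    {"messages": [("user", "Which rooms are available next weekend?")]}
)
print(result["messages"][-1].content)
```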
“The integration of Gen AI Toolbox for Databases with the LangChain ecosystem is a boon for all developers,” says Harrison Chase, CEO of LangChain. “In particular, the tight integration between Toolbox and LangGraph will allow developers to build more reliable agents than ever before.”
Get started with Gen AI Toolbox for Databases
Gen AI Toolbox for Databases simplifies gen AI tool development and deployment by automating the entire lifecycle. Here are some resources to get you started:
Last year, we offered our first ever “Google Launchpad for Women” series to empower women within our customer ecosystem to grow their cloud and AI skills. The response from our customers has been tremendous: more than 11,000 women across a breadth of roles – sales, leadership, marketing, finance, and more – have completed previous editions of the program. As a result, they are building critical skills that help them put AI to work in their jobs, grow their careers, and help transform their businesses.
This year, in honor of International Women’s Day, we are opening “Google Launchpad for Women” to thousands more customer participants, providing them with no-cost training, exam prep, and access to Google experts. Registration is now open to Google Cloud customers in the Americas, EMEA, and Japan, with the three-week program beginning on March 4th in Japan and March 6th in the Americas and EMEA. Program benefits include:
Expert-led training: Two days of in-depth, instructor-led training covering key cloud concepts and best practices.
Industry insights: Engage with Google Cloud experts through panel discussions on topics such as Generative AI.
Exam preparation: Dedicated sessions to prepare for the Cloud Digital Leader certification exam.
Complimentary exam voucher: Participants will receive a voucher for the $99 exam fee.
Why these trainings are critical
Harnessing the power of cloud computing and AI is essential for all job roles, not just IT. As more businesses adopt AI, people across business roles utilize this technology every day and often make purchasing decisions about new AI platforms and tools. However, a talent gap remains, and is particularly pronounced for women, who represent about 14% of the global cloud workforce according to recent data from the World Economic Forum.
We aim to help our customers reduce this gap, ensure they have access to the skilled experts they need to advance their digital and AI transformations, and give more people opportunities to grow their careers and lead these transformations. Ultimately, those who complete the Google Launchpad for Women program will be well-equipped to achieve the Cloud Digital Leader certification, putting them at the forefront of the cloud and AI era.
Google Launchpad for Women is open to all Google Cloud customers, regardless of prior technical experience or role. We welcome women from all professional backgrounds who are eager to develop their cloud skills and advance their careers. While this initiative is specifically focused on women, we invite everyone to participate.
Sign up today
Visit the links below to learn more about each regional session and contact your sales rep to sign up today.
Mobile devices have become the go-to for daily tasks like online banking, healthcare management, and personal photo storage, making them prime targets for malicious actors seeking to exploit valuable information. Bad actors often turn to publishing and distributing malware via apps as a lucrative channel for generating illegal or unethical profits.
Android takes a multi-layered approach to combating malware to help keep users safe (more on that later in this post), but while we continuously strengthen our defenses, threat actors persistently update their malware to evade detection. Malware developers used to implement their entire malicious logic with common Android app development toolkits in Java, which is relatively easy to detect by reversing the Java bytecode. In recent years, they have increasingly moved critical malware behaviors into native code, counting on the obscurity of compiled, symbol-stripped Executable and Linkable Format (ELF) files, whose true intentions can be far more difficult and time-consuming to reveal.
To combat these new challenges, the Android Security and Privacy Team is partnering with Mandiant FLARE to extend the open-source binary analysis tool capa to analyze native ARM ELF files targeting Android. Together, we improved existing capa rules and developed new ones to detect capabilities observed in Android malware, used the rule matches to highlight highly suspicious code in native files, and prompted Gemini with the highlighted code behaviors for summarization, enhancing our review processes for faster decisions.
In this blog post, we will describe how we leverage capa behavior-detection capabilities and state-of-art Gemini summarization by:
Showcasing a malware sample that used various anti-analysis tricks to evade detection
Explaining how our existing and new capa rules identify and highlight those behaviors
Presenting how Gemini summarizes the highlighted code for security reviews
An Illegal Gambling App Under a Music App Façade
Google Play Store ensures all published apps conform to local laws and regulations. This includes gambling apps, which are prohibited or require licenses in some areas. Developing and distributing illegal gambling apps in such areas can generate significant illicit profits, which sometimes are associated with organized crime. To bypass Google Play Store’s security-screening procedures, some gambling apps disguise themselves with harmless façades like music or casual games. These apps only reveal their gambling portals in certain geographic markets using various anti-analysis tricks. Unfortunately, dynamic analysis, such as emulation and sandbox detonation, relies on specific device configurations, and threat actors keep trying different combinations of settings to evade our detection. It’s an ongoing game of cat and mouse!
In response, the Android Security and Privacy Team has evolved static analysis techniques, such as those that evaluate the behavior of a complete program and all its conditional logic. So, let’s describe an app that violated Google Play Store rules and show how we can better detect and block other apps like it.
We received reports of a music app opening gambling websites for users in certain geographical areas. It used an interesting trick of hiding key behaviors in a native ELF file that has most symbols (except the exported ones) stripped and is loaded at runtime to evade detection.
When we decompiled the app into Java source code, using a tool like JEB Decompiler, we found that the app has a song-playing functionality as shown in “MainActivity” of Figure 1. This looks like benign behavior and is fully within the limits of Google Play Store policies.
However, there was a small region of initialization code that loads an ELF file as soon as the app is initialized when calling the onCreate function, as shown in com.x.y.z class of Figure 1. To fully understand the behavior of the entire app, we also had to reverse engineer the ELF file, which requires a completely different toolset.
Figure 1: How the app applies anti-analysis techniques
Using a tool like Ghidra, we decompiled the ARM64 ELF file into C source code and found that this app estimates the user’s geographic location using timezone information (“Code Section 1” in Figure 1). The code implements a loop that compares the user’s timezone with a list of target regions (“Data Section” in Figure 1).
If the user’s location matches a value in the list (“Data Section” in Figure 1), this malware:
Downloads an encrypted DEX file from a remote server (“Code Section 2” in Figure 1)
Decrypts the downloaded DEX file (“Code Section 3” in Figure 1)
Loads the decrypted DEX file into memory (“Code Section 4” in Figure 1)
The loaded DEX file uses further server-side cloaking techniques and finally presents a gambling website (Figure 3) to the app’s users. Compared to the app icon in Figure 2, this is an obvious mismatch with the app’s advertised functionality.
Figure 2: The app icon as published
Figure 3: The loaded gambling website in app
While there are many detection technologies, such as YARA, available for identifying malware distributed in ELF files, they are less resilient to app updates or variations introduced by threat actors. Fortunately, the Android Security and Privacy Team has developed new techniques for detecting malicious Android apps by inspecting their native ELF components. For example, in the gambling app in Figure 3, there are many API calls dynamically resolved via the Java Native Interface (JNI) that interact with the Android runtime. Our detection systems recognize these cross-runtime interactions and reason about their intent. We’ve enumerated behaviors commonly seen in Android malware, such as making ptrace API calls, extracting device information, downloading code from remote servers to local storage, and performing various cryptographic operations via JNI, turning them into capa detections we can use to identify and block Google Play Store threats.
Let’s now talk a little more about how this works.
Android capa Rules
capa is a tool that detects capabilities in executable files. You run it against a compiled program, and it tells you what it thinks the program can do. For example, capa might suggest that a file is a backdoor, is capable of installing services, or relies on HTTP to communicate.
Mandiant FLARE extended capa to support BinExport2, an architecture-agnostic representation of disassembled programs. This enables capa to match capabilities for additional architectures and file formats, such as those supported by Ghidra and its BinExport2 plugin, with an initial focus on ARM64 ELF files. The Android Security and Privacy Team then created new capa rules focused specifically on detecting capabilities observed in ARM64 ELF files used by various Android malware samples. These proprietary rules alongside capa’s open-source rules are used to detect malware capabilities as part of internal Android malware analysis pipelines.
Referring back to the gambling app in Figure 3, the following Google proprietary rules and open-source capa rules matched the malicious functions performing the cloaking techniques, flagging them for further inspection.
Proprietary rules:
Make ptrace API calls
Extract device configuration information via JNI on Android
Extract timezone via JNI on Android
Encode or decode data using Base64 via JNI on Android
Encrypt or decrypt data using Cipher API via JNI on Android
Instead of browsing hundreds of thousands of lines of obfuscated code, our analysts were able to quickly identify evidence of the app’s wrongdoing from the addresses of the functions matching those rules, and take enforcement action on the app.
Gemini Summaries of capa Rule Matches
Safeguarding the Android ecosystem, our Android malware analysis pipelines scan millions of ELF files in depth every day, each containing thousands to millions of lines of decompiled code. On top of Gemini’s fast-evolving malware-analysis capabilities, capa rules select the most interesting code for Gemini summarization, sharpening the focus to a much smaller set of the most suspicious functions.
Figure 4: How capa rules and Gemini help our reviewers and automation
We asked Gemini to summarize the functions matched on capa rules from the earlier gambling app with the following prompt:
You are a world-class Android reverse engineer with malware analysis specialities. You are given a few functions from an ARM/AArch64 ELF file as part of an Android App, with their decompiled pseudo-C codes. You need to analyze these functions to
1. summarize what these functions try to achieve
2. decide whether any of them has code/behavior patterns that would be found in Android malware.
{some more instructions…}
…
These functions are decompiled by Ghidra into pseudo-C format as follows:
[function_1 content]
[function_2 content]
…
In your response, please give your overall verdict assessing the risk level using HIGH / MEDIUM / LOW in the beginning, and explain your rationale for each potential malicious behavior.
Gemini responded with the following suggestions:
VERDICT: HIGH RISK
The provided Android application code exhibits several concerning behaviors strongly indicative of malicious intent. The risk level is assessed as HIGH due to the presence of multiple red flags consistent with sophisticated Android malware techniques. The analysis below details the specific malicious behavior patterns observed:
1. Dynamic Code Loading (HIGH): The function comeOn demonstrates a pattern of dynamically loading and executing code. The app fetches files (potentially DEX or ELF) using getCacheFilePath and downloadFile, then attempts to execute them using getDexClassLoader. This is a classic technique for delivering malicious payloads after initial installation, allowing the malware to easily update its functionality and evade static analysis. The use of custom decryption (decryptFile, decryptAES) further enhances the obfuscation and anti-analysis capabilities, indicating a deliberate attempt to conceal the loaded code’s true nature.
2. Custom Decryption and Execution (HIGH): The decryptFile function performs custom decryption of a file, likely to obfuscate the malicious payload. The decryption utilizes an AES algorithm, which adds a layer of complexity to the reversal process. In combination with the dynamic loading, this points strongly towards a sophisticated malware strategy. The decrypted file is then written to disk (fopen, fwrite) and executed using a custom class loader.
3. Time Cloaking (MEDIUM): thread_function incorporates time-based cloaking. It checks the user’s timezone (getTimeZoneInfo) and potentially alters behavior if the timezone is not “America/Sao Paulo”. More importantly, it appends a timestamp ("?time=" + current time) to a string used for downloading further files. This suggests an attempt to evade detection by changing behavior based on the device’s time and potentially making analysis more challenging by dynamically creating different network requests.
4. Network Cloaking (MEDIUM): thread_function exhibits additional cloaking behavior, though details are limited. The function interacts with network resources to download additional resources, showing its network-active nature. The data used in network operations are likely obfuscated through string manipulation and URL obfuscation, making identification and analysis difficult. The precise cloaking mechanisms used are not fully clear from the decompiled code, but the presence of this behavior is a significant concern.
5. Obfuscation Techniques (HIGH): The entire codebase uses extensive obfuscation techniques. The code is heavily reliant on complex data structures (deques, hash tables, trees) and intricate control flow. Function and variable names are nonsensical, hindering analysis. The use of multiple layers of indirection, combined with aggressive heap memory management (operator_new, operator_delete), makes the code difficult to understand and analyze. These obfuscation methods are typical of malware aiming to evade reverse engineering.
6. Anti-Debugging/Anti-Analysis Techniques (HIGH): The JNI_OnLoad function includes a call to ptrace(PTRACE_TRACEME, 0, 0, 0). This is a clear indicator of an attempt to detect debugging or analysis tools. If a debugger is attached, this function’s behavior will likely differ; it could crash, stall the execution, or even hide its malicious code.
Conclusion:
The combination of dynamic code loading, custom decryption, sophisticated obfuscation, and anti-debugging techniques creates a high likelihood of malicious activity. The incomplete understanding of the role of several custom functions further aggravates the risk. The app’s behavior is heavily contingent on received data and time-sensitive factors, raising serious concerns about its potential for wide-ranging malicious acts. Immediate action should be taken to contain and investigate this application further. A detailed sandbox analysis is critical to identify any network communications, file system manipulation, or other abusive activities.
As seen in the Gemini output, the behaviors of the functions matched by capa rules are explained clearly.
In this particular example, Gemini helped to:
Highlight the function call sequences that perform dynamic code loading, so our analysts can easily inspect the key function calls getCacheFilePath and getDexClassLoader
Identify the timezone extraction with the additional URL parameter hint, where our analysts may try to probe the malicious payload quickly and accurately
Describe more potential suspicious behaviors (e.g. getDexClassLoader JNI call, URL obfuscation) for further rule-writing ideas
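To make this step concrete, here’s a minimal sketch of how capa matches might be handed to Gemini for summarization, assuming the Vertex AI Python SDK and a hypothetical `matched_functions` mapping from capa-matched function addresses to their decompiled pseudo-C; the project, model ID, and prompt are placeholders condensed from the prompt shown earlier.

```python
# Sketch: summarize only the capa-matched functions with Gemini.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")  # placeholders
model = GenerativeModel("gemini-1.5-pro")  # placeholder model ID

def summarize_matches(matched_functions: dict[str, str]) -> str:
    """matched_functions maps a function address to its pseudo-C body."""
    bodies = "\n\n".join(
        f"// function at {addr}\n{code}"
        for addr, code in matched_functions.items()
    )
    prompt = (
        "You are a world-class Android reverse engineer. The following "
        "functions were decompiled by Ghidra into pseudo-C:\n\n"
        f"{bodies}\n\n"
        "Give an overall verdict (HIGH / MEDIUM / LOW risk) and explain "
        "the rationale for each potentially malicious behavior."
    )
    return model.generate_content(prompt).text
```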
capa rules for Android, together with Gemini summarization, show great potential for detecting malware that uses ever more advanced techniques. Our analysts closely monitor malware trends and techniques in the market and write up-to-date capa rules to catch bad actors in the wild.
Android’s Multi-Layered Security Approach
Android’s ever-evolving, multi-layered security approach includes integrating advanced features and working with developers and device implementers to keep the Android platform and ecosystem safe. This includes, but is not limited to:
Advanced built-in protections: Google Play Protect automatically scans every app on Android devices with Google Play Services, no matter the download source. This built-in protection, enabled by default, provides crucial security against malware and unwanted software. Google Play Protect scans more than 200 billion apps daily and performs real-time scanning at the code-level on novel apps to combat emerging and hidden threats, like polymorphic malware. In 2024, Google Play Protect’s real-time scanning identified more than 13 million new malicious apps from outside Google Play.
Google Play and developer protections from malware: To create a trusted experience for everyone on Google Play, we use our SAFE principles as a guide, incorporating multi-layered protections that are always evolving to help keep Google Play safe. These protections start with the developers themselves, who play a crucial role in building secure apps. We provide developers with best-in-class tools, best practices, and on-demand training resources for building safe, high-quality apps. Every app undergoes rigorous review and testing, with only approved apps allowed to appear in the Play Store. Before a user downloads an app from Play, users can explore its user reviews, ratings, and Data safety section on Google Play to help them make an informed decision.
Engagement with the security research community: Google works closely with the security community on multiple levels, including the App Defense Alliance, to advance app safety standards. Android also collaborates with Google Threat Intelligence Group (GTIG) to address emerging threats and safeguard Android users worldwide.
Equipped with the fast-evolving Gemini, our analysts are able to spend less time on sophisticated samples, minimizing users’ exposure to malicious apps and helping to ensure the safety of the Android ecosystem.
Acknowledgement
Special thanks to Willi Ballenthin, Yannis Gasparis, Mike Hunhoff, and Moritz Raabe for their support.
As AI continues to unlock new opportunities for business growth and societal benefits, we’re working to reduce the carbon intensity of AI systems — including by optimizing software, improving hardware efficiency, and powering AI models with carbon-free energy.
Today we’re releasing a first-of-its-kind study[1] on the lifetime emissions of our Tensor Processing Unit (TPU) hardware. Over two generations — from TPU v4 to Trillium — more efficient TPU hardware design has led to a 3x improvement in the carbon-efficiency of AI workloads.[2]
Our life-cycle assessment (LCA) provides the first detailed estimate of emissions from an AI accelerator, using observational data spanning raw material extraction and manufacturing through energy consumption during operation. These measurements provide a snapshot of the average, chip-level carbon intensity of Google’s TPU hardware, and enable us to compare efficiency across generations.
Introducing Compute Carbon Intensity (CCI)
Our study examined five models of TPUs to estimate their full life-cycle emissions and understand how hardware design decisions have impacted their carbon-efficiency. To measure emissions relative to computational performance and enable apples-to-apples comparisons between chips, we developed a new metric — Compute Carbon Intensity (CCI) — that we believe can enable greater transparency and innovation across the industry.
CCI quantifies an AI accelerator chip’s carbon emissions per unit of computation (measured in grams of CO2e per Exa-FLOP).[3] Lower CCI scores mean lower emissions from the AI hardware platform for a given AI workload — for example, training an AI model. We’ve used CCI to track the progress we’ve made in increasing the carbon-efficiency of our TPUs, and we’re excited to share the results.
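Reading that definition together with footnote 3, the metric can be written as (our paraphrase, not notation from the study):

$$\mathrm{CCI} = \frac{E_{\text{embodied}} + E_{\text{operational}}}{C_{\text{lifetime}}} \quad \left[\frac{\mathrm{g\,CO_2e}}{\mathrm{ExaFLOP}}\right]$$

where \(E_{\text{embodied}}\) and \(E_{\text{operational}}\) are a chip’s lifetime embodied and operational emissions, and \(C_{\text{lifetime}}\) is the total compute the chip delivers over its lifetime, in ExaFLOPs. Under this definition, a 3x improvement in CCI means a given workload emits one third as much CO2e for the same number of FLOPs.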
Key takeaways
Google’s TPUs have become significantly more carbon-efficient. Our study found a 3x improvement in the CCI of our TPU chips over 4 years, from TPU v4 to Trillium. By choosing newer generations of TPUs — like our 6th-generation TPU, Trillium — our customers not only get cutting-edge performance, but also generate fewer carbon emissions for the same AI workload.
Operational electricity emissions are key. Today, operational electricity emissions comprise the vast majority (70%+) of a Google TPU’s lifetime emissions. This underscores the importance of improving the energy efficiency of AI chips and reducing the carbon intensity of the electricity that powers them. Google’s effort to run on 24/7 carbon-free energy (CFE) on every grid where we operate by 2030 aims directly at reducing the largest contributor to TPU emissions — operational electricity consumption.
Manufacturing matters. While operational emissions dominate an AI chip’s lifetime emissions, emissions associated with chip manufacturing are still notable — and their share of total emissions will increase as we reduce operational emissions with carbon-free energy. The study’s detailed manufacturing LCA helps us target our manufacturing decarbonization efforts towards the highest-impact initiatives. We’re actively working with our supply chain partners to reduce these emissions through more sustainable manufacturing processes and materials.
The significant improvements in AI hardware carbon-efficiency shown in this study complement rapid advancements in AI model and algorithm design. Outside of this study, continued optimization of AI models is reducing the number of computations required for a given level of model performance. Some models that once required a supercomputer to run can now run on a laptop, and at Google we’re using techniques like Accurate Quantized Training and speculative decoding to further increase model efficiency. We expect model advancements to continue unlocking carbon-efficiency gains, and we are working to quantify the impact of software design on carbon-efficiency in future studies.
Partnering for a sustainable AI future
The detailed approach we’ve taken here allows us to target our efforts to continue increasing the carbon-efficiency of our TPUs.
This life-cycle analysis of AI hardware is an important first step in quantifying and sharing the carbon-efficiency of our AI systems, but it’s just the beginning. We will continue to analyze other aspects of AI’s emissions footprint — for example AI model emissions and software efficiency gains — and share our insights with customers and the broader industry.
1. The authors would like to thank and acknowledge the co-authors for their important contributions: Ian Schneider, Hui Xu, Stephan Benecke, Tim Huang, and Cooper Elsworth.
2. A February 2025 Google case study quantified the full lifecycle emissions of TPU hardware as a point-in-time snapshot across Google’s generations of TPUs. To estimate operational emissions from electricity consumption of running workloads, we used a one-month sample of observed machine power data from our entire TPU fleet, applying Google’s 2023 average fleetwide carbon intensity. To estimate embodied emissions from manufacturing, transportation, and retirement, we performed a life-cycle assessment of the hardware. Data center construction emissions were estimated based on Google’s disclosed 2023 carbon footprint. These findings do not represent model-level emissions, nor are they a complete quantification of Google’s AI emissions. Based on the TPU location of a specific workload, CCI results for specific workloads may vary.
3. CCI includes estimates of both lifetime embodied and operational emissions in order to understand the impact of improved chip design on our TPUs. In this study, we hold the impact of carbon-free energy on carbon intensity constant across generations by using Google’s 2023 average fleetwide carbon intensity. We did this purposefully to remove the impact of deployment location on the results.
Last year we announced Imagen 3, our highest quality image generation model. Imagen 3 is available to Vertex AI customers, which means businesses can create high quality images that reflect their own brand style and logos for use in marketing, advertising, or product design.
Today, we’ll share how you can build your brand style with a logo using Imagen 3, Gemini, and the Python library Pillow.
First, use Imagen 3 to generate visual options
Imagen 3 generates the most realistic and highest quality images from simple text prompts, surpassing previous versions of Imagen in detail, lighting, and artifact reduction. The new Imagen 3 generation model (002) delivers even higher visual appeal, prompt alignment, and overall preference.
Here’s how it works: Imagen 3 generates the initial images, Gemini selects and refines them, while Pillow enables precise integration and manipulation. This collaborative workflow allows for a high degree of customization and efficiency in building your brand identity.
Imagen 3 uses natural language processing (NLP) to transform text descriptions into high-quality images. But here’s the secret to getting the right image: combine Imagen with Gemini’s selection process.
Let’s take an example. Imagine you’re opening a coffee shop named “Layo Cafe.” You want a logo that embodies your brand’s modern, inviting aesthetic.
Here’s how you can use Imagen and Gemini to help:
Describe your vision: Provide Imagen with a prompt, for example: “Create an image for a new coffee shop campaign.” Gemini will rephrase your prompt to generate a better prompt for image generation, for example: “Photorealistic image of a bright, modern coffee shop interior, showcasing a steaming cup of coffee on a minimalist table, bathed in warm, natural light. Focus on the coffee and the inviting atmosphere.”
Generate options: Imagen will generate multiple variations based on your description.
Gemini’s selection: Gemini, Google’s next-generation large language model, steps in to analyze each image. It considers factors like aesthetics, readability, and brand alignment to select the most suitable option.
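Here’s a minimal sketch of that generate-then-select loop using the Vertex AI Python SDK; the project, model IDs, file names, and selection prompt are placeholders, not the exact code behind this example.

```python
# Sketch: generate candidates with Imagen 3, then ask Gemini to pick one.
import vertexai
from vertexai.preview.vision_models import ImageGenerationModel
from vertexai.generative_models import GenerativeModel, Part

vertexai.init(project="my-project", location="us-central1")  # placeholders

imagen = ImageGenerationModel.from_pretrained("imagen-3.0-generate-002")
response = imagen.generate_images(
    prompt=(
        "Photorealistic image of a bright, modern coffee shop interior, "
        "showcasing a steaming cup of coffee on a minimalist table, "
        "bathed in warm, natural light."
    ),
    number_of_images=4,
)

# Save each candidate locally and wrap it as a multimodal part for Gemini.
parts = []
for i, image in enumerate(response.images):
    path = f"option_{i}.png"
    image.save(location=path)
    with open(path, "rb") as f:
        parts.append(Part.from_data(f.read(), mime_type="image/png"))

gemini = GenerativeModel("gemini-1.5-pro")  # placeholder model ID
verdict = gemini.generate_content(
    parts
    + [
        "Which image best fits a modern, inviting cafe brand? "
        "Answer with the image number and a short rationale."
    ]
)
print(verdict.text)
```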
In this example, Imagen generated four images.
When asked which one performs the best, Gemini chose the first one. Why? Based on the provided instructions, it showed the best balance of elements. It shows a steaming cup of latte art coffee, in a bright, modern setting with warm natural light streaming in from large windows. The background is nicely blurred, keeping the focus on the coffee. The overall aesthetic is inviting and appealing, likely to attract customers. The other images either lack the latte art (important for showcasing the cafe’s offerings) or the lighting isn’t as warm and inviting.
Adjust or add instructions to the Gemini prompt based on the desired output to ensure the best generated image is selected, as each use case and expectation may vary.
Next, build your logo
Now that you have the right image, it’s time to integrate it with your marketing visuals. This works through three tools working together: Gemini, Imagen, and the Pillow library.
Set the scene: Provide Imagen with a prompt describing the desired image, for example: “Create an image for a new coffee shop campaign.” Gemini will rephrase your prompt to generate a better prompt for image generation, for example: “Photorealistic image of a bright, modern coffee shop interior, showcasing a steaming cup of coffee on a minimalist table, bathed in warm, natural light. Focus on the coffee and the inviting atmosphere.”
Ask Gemini to curate a selection based on your brand needs: Gemini analyzes the generated images and selects the one that best represents your brand and aligns with the desired aesthetic. Repeat the process for creating a new logo or, if you already have a logo, proceed to the next step.
Integrate with Pillow: The Pillow library adds your Layo Cafe logo to the chosen image, ensuring optimal placement and size for maximum impact.
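The Pillow step can be as simple as the sketch below; the file names, scale, and margin are placeholders you’d tune for your own assets.

```python
# Sketch: composite a transparent-background logo onto the chosen image.
from PIL import Image

base = Image.open("selected_campaign_image.png").convert("RGBA")
logo = Image.open("layo_cafe_logo.png").convert("RGBA")

# Scale the logo to roughly 15% of the base width, keeping aspect ratio.
target_w = base.width * 15 // 100
logo = logo.resize((target_w, logo.height * target_w // logo.width))

# Anchor it in the bottom-right corner with a small margin, using the
# logo's own alpha channel as the paste mask so edges stay clean.
margin = 24
position = (base.width - logo.width - margin,
            base.height - logo.height - margin)
base.paste(logo, position, mask=logo)
base.save("campaign_with_logo.png")
```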
In this case, this was the preferred logo option:
Finally, land your message
Amplify your message by overlaying text with visuals. Whether it’s a catchy tagline or a special offer, integrating text into your AI-generated images is a powerful way to engage your audience.
Craft your message: Decide on the text you want to overlay on your image. For example, “Layo Cafe: Your daily dose of inspiration.”
Apply text overlay using the Pillow library: This Python Imaging Library acts as the artist’s brush, expertly adding the text to the image according to Gemini’s recommendations. With Pillow, the integration of text becomes seamless, allowing for a polished final product.
Reach a global audience: One of the most exciting features of this process is the ability to overlay text in any language on your generated images. This multilingual support broadens your creative horizons, enabling you to reach diverse audiences with tailored messages.
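A text overlay with Pillow might look like this sketch; the font file, size, and placement are placeholders, and any TrueType font with the right glyphs works, which is what makes the multilingual overlays mentioned above straightforward.

```python
# Sketch: overlay a tagline on the final image with ImageDraw/ImageFont.
from PIL import Image, ImageDraw, ImageFont

img = Image.open("campaign_with_logo.png")
draw = ImageDraw.Draw(img)
font = ImageFont.truetype("Roboto-Regular.ttf", size=48)  # placeholder font

text = "Layo Cafe: Your daily dose of inspiration."

# Measure the rendered text so it can be centered horizontally.
left, top, right, bottom = draw.textbbox((0, 0), text, font=font)
x = (img.width - (right - left)) // 2
y = img.height - (bottom - top) - 60  # near the bottom, with padding

# Draw a subtle shadow first for readability, then the text itself.
draw.text((x + 2, y + 2), text, font=font, fill="black")
draw.text((x, y), text, font=font, fill="white")
img.save("campaign_final.png")
```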
Let’s bring everything together. Here is the logo with text on Imagen’s best-generated image.
Get started today
By combining the creative ability of Imagen with the intelligent selection and design capabilities of Gemini, you can generate a logo, create branded marketing materials, and enhance your visual storytelling. Want to see the code and examples? Check out the code on GitHub.
Artificial intelligence (AI) is no longer a futuristic concept. In 2025, it’s poised to become a cornerstone of public sector operations, transforming how agencies make decisions and serve constituents. By leading with AI-powered innovation, government agencies can improve efficiency, enhance decision-making, and provide better services to citizens.
According to a recent Google-commissioned IDC study that surveyed 161 federal Chief AI Officers and other AI decision makers, 50% of agencies reported high levels of AI maturity, which corresponds to mature behaviors like embracing an innovation culture and piloting generative AI use cases to drive mission impact.
Let’s dive into five AI trends we believe will shape the future of the public sector in the year ahead.
Trend #1: Multimodal AI: Unleashing the power of context
Imagine a world where AI can understand and analyze information from various sources like text, images, video and even audio. This is the power of multimodal AI. This year will be pivotal for AI adoption, driven largely by multimodal learning and the contextual awareness it enables. Looking ahead, multimodal AI will enable agencies to analyze local and state-level data and combine it with data from other sources like Google Earth Engine, Google Maps, Waze, and public data sets to improve decision-making, pre-empt climate-related risks and improve public infrastructure.
Google AI empowers Hawaii Department of Transportation (HDOT) to address climate challenges. By leveraging Google Earth Engine and Google Cloud to deploy a Climate Resilience Platform, they are able to assess risk and prioritize investment decisions based on multiple climate risks, asset conditions, and community impact.
Trend #2: AI agents: The evolution from chatbots to multi-agent systems
We’ve gone from the simple chatbots of the past to sophisticated AI agents capable of handling the most complex tasks. These agents can reason, plan, and even learn from their interactions. In the future, AI agents will help government employees work and code more efficiently, manage their applications, gain deeper data insights, identify and resolve security threats, and bring their best ideas to life.
Sullivan County, NY is using virtual agents that use Google AI to serve more citizens faster, at any hour of the day. With limited staff and budget, Sullivan County created a solution that supports citizens beyond traditional work hours while freeing up government workers to focus on strategic work.
Trend #3: Assistive search: The next frontier for knowledge work
AI is already transforming the way we access and understand information. With generative AI, governments can improve the accuracy and efficiency of searching vast datasets. By investing in semantic search, automated metadata tools, and advanced document transcription, agencies can unlock the value of their data and make it more accessible.
The Air Force Research Laboratory (AFRL) is leveraging Google Cloud’s cutting-edge artificial intelligence (AI) and machine learning (ML) capabilities to tackle complex challenges across various domains, from materials science and bioinformatics to human performance optimization. AFRL is embracing the power of AI and cloud computing to accelerate its mission of developing and transitioning advanced technologies to the air, space, and cyberspace forces.
Trend #4: AI-powered constituent experience: seamless and personalized
In 2025, AI will help improve citizen experiences. AI-powered tools and technologies can build trust and foster closer citizen-government relationships by enabling citizens to quickly and easily navigate government websites and services—such as applying for permits and licenses—offered in multiple languages and available 24/7.
Working with Google AI, the Wisconsin Department of Workforce Development (DWD) was able to scale the state’s response to unemployment insurance (UI) claims and speed up overall response time. They were also successful in screening out fraudulent claims so that the UI program could be administered, with integrity, to Wisconsinites who needed financial assistance.
Trend #5: Security gets tighter — and tougher — with AI
As AI becomes more prevalent, so do the threats it poses. However, AI can also be a powerful tool for enhancing security. By automating threat detection, analyzing vast amounts of data, and responding to incidents quickly, AI can help protect government systems and sensitive information. As AI increasingly permeates government and citizen services, agencies must prioritize security to combat threats like deepfakes and disinformation.
New York City is hit by 90 billion cyber events every single week. “We have to distill those 90 billion events down to less than 50 or 60 things we look at. We couldn’t do that without a lot of artificial intelligence and automated decision-making tools.” – Matthew Fraser, Chief Technology Officer, New York City.
The opportunity ahead
We are already seeing incredible energy around AI in public sector agencies across the country, which are using it to improve efficiency, enhance decision-making, and better serve their constituents. Going forward, we expect agencies to appoint and invest in AI leaders who bolster AI governance, build public trust, and invest in an effective workforce strategy with AI and security at the forefront. This will be crucial to ensure the responsible and secure development and deployment of AI across all levels of government.
Download the AI Trends 2025 in Public Sector report to learn more about how AI can help support your mission and accelerate your impact.
Cloud SQL Enterprise Plus edition provides high performance and availability for demanding applications with enhanced read and write performance. And high-performance applications often require that you tune the underlying database services.
To help application developers and DBAs build and deploy high performing applications, we’ve added new capabilities to query insights for Cloud SQL Enterprise Plus edition. This new database observability offering builds on top of the existing query insights capabilities in Cloud SQL Enterprise edition, and provides a unified and comprehensive observability experience that helps developers and database teams optimize performance faster than ever. Query insights for Cloud SQL Enterprise Plus edition captures and analyzes query telemetry and statistics to surface key performance indicators, diagnostic reports, and recommendations to enhance performance in an easy-to-consume format in the Google Cloud console. These signals and recommendations help application developers and database teams observe and tune overall database performance quickly, easily, and efficiently.
With the new enhanced capabilities in query insights for Cloud SQL Enterprise Plus edition, you can:
Solve nuanced database performance issues faster than ever before. Access fine-grained database metrics such as wait events to conduct deeper root-cause analysis. With richer, near-real-time diagnostics, you can easily analyze query executions at a granular level. Query insights for Cloud SQL Enterprise Plus edition also helps you detect query plan regressions by capturing plans for all unique query executions and highlighting the rate-determining step for each execution.
Control your active query executions. Gain visibility into the queries currently running in your system. You can also choose to terminate sub-optimally running queries to unblock other critical queries and manage system resources better.
Enhance database performance with intelligent recommendations tailored for dynamic workloads. Query insights for Cloud SQL Enterprise Plus edition automatically analyzes workloads, highlights performance issues, and provides recommendations to solve them. It looks for common problems like missing indexes, missing or incorrect flags, etc. to help optimize queries and tune databases.
Use an AI-powered chat interface to ask your performance-related questions. Query insights for Cloud SQL Enterprise Plus edition comes with an AI-powered natural language interface that provides advanced troubleshooting tips and tailored recommendations, making the resolution of complex database problems easier.
Let’s look at these capabilities in more detail.
Detailed query plans with 30-day telemetry
With 30 days of telemetry data, you can analyze long-term trends and identify recurring issues in query performance. By reviewing detailed query execution plans and comparing them over time, you can pinpoint inefficiencies and make data-driven optimizations for sustained database improvements.
Optimize query performance with historical analysis and execution plans over time
Wait events
Wait events in query insights help you identify where your database is stuck, such as on disk I/O or locks. This enables faster diagnosis of performance bottlenecks and smarter resource optimization.
Diagnose and optimize query performance with wait events
Index recommendations
Index recommendations help you identify performance bottlenecks by detecting missing indexes and providing precise, actionable recommendations to improve query performance. These recommendations offer specific SQL index-creation commands, showing their potential performance impact and highlighting the impacted queries, thereby streamlining the process of database performance optimization.
Index recommendations for enhanced query performance
Get started
Query insights for Cloud SQL Enterprise Plus edition democratizes access to enterprise-grade observability for tier-1 workloads on Google Cloud managed databases. Empowered by these advanced performance management capabilities, get ready to simplify complex workflows, monitor database health, write better queries, and meaningfully optimize system performance. This reduces mean-time-to-resolution for database performance issues, and allows application developers and database teams to focus more on the core business logic.
Query insights for Cloud SQL Enterprise Plus edition is now available in preview. Simply access query insights within the console and begin monitoring and managing your database performance. To get started, please visit our documentation.
In the federal government, organizations face unique challenges in meeting strict security and compliance requirements. FedRAMP, IL4, and IL5 standards set forth rigorous guidelines to ensure the protection of sensitive data and systems. Google Cloud understands the importance of meeting these accreditation needs and provides a comprehensive suite of tools and capabilities to support federal agencies.
Security-first development with Google Cloud
Google Cloud offers a suite of developer tools to streamline your workflow and enhance security:
Cloud Workstations: Enables secure, cloud-based workstations for remote work without compromising security.
Gemini Code Assist: Provides AI-powered code completion and suggestions, helping you write better code faster.
Assured Open Source Software: Offers a curated catalog of audited and approved open-source software, helping ensure compliance and security.
Cloud Build: Provides an automated build and deployment service that streamlines the development process and reduces the risk of errors.
Artifact Registry: Provides secure storage and management of container images, making it easy to track and manage software versions.
The Google Cloud DevOps platform creates a strong foundation for a secure software supply chain by addressing various aspects of security throughout the development lifecycle. Powered by Cloud Workstations, Cloud Build, Artifact Registry, Google Kubernetes Engine, and Cloud Run, this robust approach can help federal agencies protect their applications from supply chain attacks and ensure the integrity of their software.
Meet FedRAMP, IL4, and IL5 compliance with ease
Google Cloud offers security controls and certifications to help federal agencies meet various compliance regime requirements, such as FedRAMP, IL4 and IL5. These include:
Comprehensive encryption: Data is encrypted at rest and in transit, both in Google Cloud and when transferred to and from on-premises systems.
Vulnerability management: Google Cloud provides tools and services to identify and remediate vulnerabilities in a timely manner.
Continuous monitoring: Security Command Center provides real-time monitoring and threat detection.
Accelerating federal security and compliance with Google Cloud
Google Cloud provides a comprehensive platform that enables federal agencies to meet security accreditation needs while enhancing developer productivity. With a range of tools and capabilities, Google Cloud is the ideal partner for public sector organizations looking to innovate and transform their operations.
At Google Cloud, we’re deeply invested in making AI helpful to organizations everywhere — not just for our valued customers, but for our equally important partners.
Today, we’re thrilled to introduce a significant leap forward in how we enable our partners to co-market with us: Gemini-powered content creation within Partner Marketing Studio. These AI features are designed to streamline marketing efforts across our entire ecosystem, empowering our partners to unlock new levels of success, efficiency, and impact.
The evolving landscape of partner marketing and the need for AI
Today’s marketers are faced with a complex landscape of digital channels, diverse customer segments, and an ever-increasing demand for personalized experiences. In this dynamic environment, the ability to create high-quality, targeted content quickly and efficiently is more critical than ever.
For our partners, this challenge is amplified by the need to not only promote their own unique services and solutions, but also to tell a joint story with Google Cloud. This requires a delicate balance of creativity, strategic thinking, and operational efficiency. That’s where AI comes in.
With the addition of generative AI content creation, Partner Marketing Studio — our complimentary marketing automation tool for partners — evolves from a content repository into an extension of a partner’s marketing team, making it easier than ever to build campaigns that generate awareness and demand for partner services and solutions in three ways:
#1 Customize campaigns with precision
Partner Marketing Studio is where Google Cloud partners can access a curated library of customizable assets, tap into Google’s marketing expertise, and streamline campaign execution. Our new AI editing feature sits on top of this curated campaign library to give partners the ability to customize campaigns for their specific audience, whether it’s by organizational maturity, target persona, or industry.
Imagine you’re a partner launching a new campaign to promote your expertise in data analytics. You find a pre-built campaign in Partner Marketing Studio that aligns with your goals, but you need to tailor it for the healthcare industry. With the editing feature, you can easily refine the existing content with just a few prompts, ensuring your message resonates with your target audience.
#2: Generate original campaigns from scratch with ease
Often partners need to create co-marketing content entirely from scratch. That’s where the AI creation feature comes in. This powerful new capability empowers partners to generate original marketing content quickly and efficiently within a co-branded template, without the need for extensive design or copywriting resources.
Imagine a partner is launching a new service that leverages Google Cloud’s AI capabilities. They need a fresh email campaign to promote this service to their target audience. With the creation feature, they can simply provide the objective of the asset they are creating and any relevant filters (like industry or audience), and in just moments, they will get a compelling, ready-to-use, co-branded email, complete with subject line, body copy, and a clear call to action—all infused with Google’s marketing expertise.
With the creation feature, partners can generate original marketing content, tailored to the needs of their organization, saving time and resources while ensuring the message is on point for the target audience.
#3: Built on Google’s foundation of AI innovation and expertise
These new features are powered by Gemini and grounded in Google’s content and brand standards. Built on Google’s deep understanding of AI and trained on our vast knowledge base of best practices and marketing insights, these features aren’t just providing access to generative AI capabilities. Partners gain access to Google’s marketing expertise, distilled into an intuitive, easy-to-use interface.
We’ve leveraged the power of Gemini to ensure these tools are not only powerful but also aligned with our brand standards and best practices. This means you can trust that the content generated by the editing and creation features is not only effective but also consistent with the Google Cloud brand.
Bringing AI marketing training to our partners
We’re committed to providing training and support to help partners effectively leverage new AI capabilities across Google products. That’s why we’re thrilled to announce that in 2025, we’ll be making one of our most successful internal training series available to our partners: AI Boost Bites.
This video training series is designed to provide practical, on-the-job training in just 5 minutes. Each bite-sized episode features Google marketers who’ve successfully integrated AI into their daily work, using tools like Gemini, Gemini for Workspace, NotebookLM, and AI Studio. Partners will get to see firsthand how Google marketers use these tools to produce compelling content (from text and graphics to video and audio), develop market insights, and solve real-world marketing challenges. To help put these learnings into practice, each video is followed by a “challenge” to apply the concepts. AI Boost Bites has been instrumental in upskilling Google marketers in AI, fostering a culture of continuous learning and innovation that we are excited to extend to our partners. To find the AI Boost Bites training series and start using the AI features, log in to Partner Marketing Studio.
Investing in our partners through comprehensive co-marketing benefits
Partners at the Partner and Premier levels of the Partner Advantage Program can access Partner Marketing Studio and get started today. Partner Marketing Studio is more than just an AI-powered content creation tool: it’s a comprehensive platform designed to support the marketing efforts of Google Cloud partners every step of the way. In addition to generative AI features, partners can also access:
Global content library: Localized assets like Google-authored reports, targeted emails, social posts, banner ads, one-pagers, and pitch decks support our partners across the globe.
Automation tools: Monitor the performance of your campaigns and make data-driven decisions.
Marketing resources: Access a wealth of co-marketing resources, including brand guidelines, logos, and marketing playbooks.
Campaigns: Explore our collection of pre-built campaigns, designed to help you promote your services and solutions.
Messaging: Use Google Cloud product and solution messaging to train your sales and marketing teams and develop your unique joint message with Google Cloud.
Templates: Design and build effective marketing and sales assets with ease using pre-built templates and plug-and-play editing.
Live support: Get personalized guidance from Google Cloud marketing experts. Our team hosts regular, live webinars, weekly orientation sessions for those new to Partner Marketing Studio, and one-on-one support to help you develop effective marketing strategies and maximize your results.
Google Cloud speakers: Request a Google Cloud speaker for your events. Elevate your events with the expertise and insights of our industry-leading speakers.
Google-vetted agencies: Connect with trusted marketing agencies with proven success supporting Google Cloud partners in their marketing initiatives. We’ve curated a list of top-tier agencies that specialize in Google Cloud marketing, ensuring you have access to the resources you need.
What our pilot partners are saying
“With the new AI features in Partner Marketing Studio, we can create more targeted industry and persona-based versions of our Google Cloud marketing campaigns automatically. I’m excited that this efficiency will enable my team to focus on more strategic marketing activities and close more deals.” – Elissa Robins, Head of Marketing, SADA, An Insight company
Ready to unlock AI for your marketing?
We invite our partners and their marketing teams to join us for our upcoming webinar on February 18th to learn more about Partner Marketing benefits, and discover how Google Cloud is empowering partners to achieve unprecedented marketing success. This is your opportunity to get a firsthand look at these powerful new tools, including the new AI features, and learn how you can leverage them to drive your business forward.