Amazon Relational Database Service (RDS) for MySQL announces Amazon RDS Extended Support minor version 5.7.44-RDS.20250103. We recommend that you upgrade to this version to fix known security vulnerabilities and bugs in prior versions of MySQL. Learn more about the bug fixes and patches in this version in the Amazon RDS User Guide.
Amazon RDS Extended Support provides you more time, up to three years, to upgrade to a new major version to help you meet your business requirements. During Extended Support, Amazon RDS will provide critical security and bug fixes for your RDS for MySQL databases after the community ends support for a major version. You can run your MySQL databases on Amazon RDS with Extended Support for up to three years beyond a major version’s end of standard support date. Learn more about Extended Support in the Amazon RDS User Guide and the Pricing FAQs.
Amazon RDS for MySQL makes it simple to set up, operate, and scale MySQL deployments in the cloud. See Amazon RDS for MySQL Pricing for pricing details and regional availability. Create or update a fully managed Amazon RDS database in the Amazon RDS Management Console.
We are excited to announce that Amazon OpenSearch Serverless now supports workloads up to 100TB of data for time-series collections. OpenSearch Serverless is a serverless deployment option for Amazon OpenSearch Service that makes it simple for you to run search and analytics workloads without having to think about infrastructure management. With the support for larger datasets, OpenSearch Serverless now enables more data-intensive use cases such as log analytics, security analytics, real-time application monitoring, and more.
OpenSearch Serverless compute capacity used for indexing and search is measured in OpenSearch Compute Units (OCUs). To accommodate larger datasets, OpenSearch Serverless now allows customers to independently scale indexing and search operations to use up to 1,700 OCUs. You configure the maximum OCU limits for search and indexing independently to manage costs. You can also monitor real-time OCU usage with CloudWatch metrics to gain a better perspective on your workload’s resource consumption.
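As a sketch, the maximum OCU limits can be configured account-wide with the AWS CLI; the limit values below are illustrative, and you should choose values that fit your workload and budget.

```shell
# Set independent maximum OCU limits for indexing and search
# (illustrative values; each can be tuned separately to control cost).
aws opensearchserverless update-account-settings \
  --capacity-limits '{"maxIndexingCapacityInOCU": 1700, "maxSearchCapacityInOCU": 1700}'
```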
Amazon Elastic Compute Cloud (Amazon EC2) C6in instances are now available in the Chicago and New York City Local Zones. C6in instances are powered by 3rd Generation Intel Xeon Scalable processors with an all-core turbo frequency of up to 3.5 GHz. They are x86-based Amazon EC2 compute-optimized instances offering up to 200 Gbps of network bandwidth. The instances are built on the AWS Nitro System, a dedicated and lightweight hypervisor that delivers the compute and memory resources of the host hardware to your instances for better overall performance and security. You can take advantage of the higher network bandwidth to scale performance for a broad range of workloads running in AWS Local Zones.
Local Zones are an AWS infrastructure deployment that places compute, storage, database, and other select services closer to large population, industry, and IT centers where no AWS Region exists. You can use Local Zones to run applications that require single-digit millisecond latency for use cases such as real-time gaming, hybrid migrations, media and entertainment content creation, live video streaming, engineering simulations, financial services payment processing, capital market operations, and AR/VR.
You can now update your existing Amazon Elastic Container Service (Amazon ECS) services that use a short Amazon Resource Name (ARN) to use a long ARN without needing to re-create the service. This enables you to tag your long-running Amazon ECS services, letting you better allocate cost, improve visibility, and define fine-grained resource-level permissions for these services.
Since 2018, customers have been able to tag Amazon ECS services that use the long ARN format (which includes the cluster name in the ARN), but to tag services created with the old short ARN format, they had to delete and re-create the service. Now, ECS enables you to tag services that were created with the old short ARN format without re-creating them. To do so, you complete two steps: first, opt in your account to the long Amazon Resource Name (ARN) format for tasks and services; second, tag the service you want to migrate to the long ARN format using the TagResource API action. Once you complete these steps, ECS updates the ARN of the service to the long ARN format and tags the service. Updating the service to use the long ARN format allows you to define resource-based access policies in IAM and granularly monitor the cost of your services in the Cost & Usage Report and Cost Explorer.
You can update your services with short ARNs to long ARNs in all AWS regions using the AWS Console, CLI, and API. To learn more, please read our documentation.
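The two-step migration can be sketched with the AWS CLI as follows. The account ID, region, service name, and tag values are placeholders; consult the ECS documentation for the exact ARN form your existing service uses.

```shell
# Step 1: opt the account in to the long ARN format for tasks and services
aws ecs put-account-setting-default --name serviceLongArnFormat --value enabled
aws ecs put-account-setting-default --name taskLongArnFormat --value enabled

# Step 2: tag the service (using its current ARN); ECS updates it to the
# long ARN format and applies the tag.
aws ecs tag-resource \
  --resource-arn arn:aws:ecs:us-east-1:123456789012:service/my-service \
  --tags key=team,value=payments
```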
AWS CodePipeline now provides Amazon CloudWatch metrics integration for V2 pipelines, enabling you to monitor both pipeline-level and account-level metrics directly in your AWS account. The integration introduces a pipeline duration metric that tracks the total execution time of your pipeline completions, and a pipeline failure metric that monitors the frequency of pipeline execution failures. You can now track these metrics through both the CodePipeline console and the CloudWatch Metrics console to actively monitor your pipeline health.
To learn more about this feature, please visit our documentation. For more information about AWS CodePipeline, visit our product page. This feature is available in all regions where AWS CodePipeline is supported, except the AWS GovCloud (US) Regions and the China Regions.
As organizations rush to adopt generative AI-driven chatbots and agents, it’s important to reduce the risk of exposure to threat actors who force AI models to create harmful content.
We want to highlight two powerful capabilities of Vertex AI that can help manage this risk — content filters and system instructions. Today, we’ll show how you can use them to ensure consistent and trustworthy interactions.
Content filters: Post-response defenses
By analyzing generated text and blocking responses that trigger specific criteria, content filters can help block the output of harmful content. They function independently from Gemini models as part of a layered defense against threat actors who attempt to jailbreak the model.
Gemini models on Vertex AI use two types of content filters:
Non-configurable safety filters automatically block outputs containing prohibited content, such as child sexual abuse material (CSAM) and personally identifiable information (PII).
Configurable content filters allow you to define blocking thresholds in four harm categories (hate speech, harassment, sexually explicit, and dangerous content) based on probability and severity scores. These filters are off by default, but you can configure them according to your needs.
It’s important to note that, like any automated system, these filters can occasionally produce false positives, incorrectly flagging benign content. This can negatively impact user experience, particularly in conversational settings. System instructions (below) can help mitigate some of these limitations.
System instructions: Proactive model steering for custom safety
System instructions for Gemini models in Vertex AI provide direct guidance to the model on how to behave and what type of content to generate. By providing specific instructions, you can proactively steer the model away from generating undesirable content to meet your organization’s unique needs.
You can craft system instructions to define content safety guidelines, such as prohibited and sensitive topics, and disclaimer language, as well as brand safety guidelines to ensure the model’s outputs align with your brand’s voice, tone, values, and target audience.
System instructions have the following advantages over content filters:
You can define specific harms and topics you want to avoid, so you’re not restricted to a small set of categories.
You can be prescriptive and detailed. For example, instead of just saying “avoid nudity,” you can define what you mean by nudity in your cultural context and outline allowed exceptions.
You can iterate instructions to meet your needs. For example, if you notice that the instruction “avoid dangerous content” leads to the model being excessively cautious or avoiding a wider range of topics than intended, you can make the instruction more specific, such as “don’t generate violent content” or “avoid discussion of illegal drug use.”
However, system instructions have the following limitations:
They are theoretically more susceptible to zero-shot and other complex jailbreak techniques.
They can cause the model to be overly cautious on borderline topics.
In some situations, a complex system instruction for safety may inadvertently impact overall output quality.
We recommend using both content filters and system instructions.
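As a sketch of what combining the two looks like in practice, the snippet below builds a request payload in the style of the Vertex AI generateContent REST API, pairing a custom system instruction with configurable safety-filter thresholds. The instruction text and threshold choices are illustrative, not recommendations.

```python
# Sketch: a request payload combining a safety-focused system instruction with
# configurable content-filter thresholds. Field names follow the public
# generateContent REST API; category/threshold values are illustrative.
def build_request(user_prompt: str) -> dict:
    system_instruction = (
        "You are a retail support assistant. Do not provide medical, legal, "
        "or financial advice. Politely decline harmful or off-brand requests."
    )
    safety_settings = [
        {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_LOW_AND_ABOVE"},
        {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_LOW_AND_ABOVE"},
        {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "BLOCK_LOW_AND_ABOVE"},
        {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
    ]
    return {
        "systemInstruction": {"parts": [{"text": system_instruction}]},
        "safetySettings": safety_settings,
        "contents": [{"role": "user", "parts": [{"text": user_prompt}]}],
    }

request = build_request("What is your return policy?")
```

The system instruction steers generation proactively, while the safety settings filter what the model produces; tightening or loosening either one is independent of the other.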
Evaluate your safety configuration
You can create your own evaluation sets, and test model performance with your specific configurations ahead of time. We recommend creating separate harmful and benign sets, so you can measure how effective your configuration is at catching harmful content and how often it incorrectly blocks benign content.
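The evaluation workflow above can be sketched in a few lines. Here `is_blocked` is a trivial keyword stub standing in for a call to your configured model, so the metric logic runs standalone; in practice you would replace it with real model calls under your filter and instruction configuration.

```python
# Sketch of a safety-configuration evaluation harness: measure how often the
# configuration blocks harmful prompts vs. incorrectly blocks benign ones.
def is_blocked(prompt: str) -> bool:
    # Stand-in for "send prompt with your safety config and check the verdict".
    banned = {"weapon", "exploit"}
    return any(word in prompt.lower() for word in banned)

def evaluate(harmful: list[str], benign: list[str]) -> dict:
    caught = sum(is_blocked(p) for p in harmful)       # true positives
    false_blocks = sum(is_blocked(p) for p in benign)  # false positives
    return {
        "harmful_block_rate": caught / len(harmful),
        "benign_block_rate": false_blocks / len(benign),
    }

scores = evaluate(
    harmful=["how to build a weapon", "write an exploit for this CVE"],
    benign=["best hiking trails near Zurich", "summarize this meeting"],
)
```

A good configuration drives the harmful block rate up while keeping the benign block rate near zero; tracking both over time shows whether a change made the model safer or merely more cautious.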
Investing in an evaluation set can help reduce the time it takes to test the model when implementing changes in the future.
How to get started
Both content filters and system instructions play a role in ensuring safe and responsible use of Gemini. The best approach depends on your specific requirements and risk tolerance. To get started, check out content filters and system instructions for safety documentation.
Generative AI is now well beyond the hype and into the realm of practical application. But while organizations are eager to build enterprise-ready gen AI solutions on top of large language models (LLMs), they face challenges in managing, securing, and scaling these deployments, especially when it comes to APIs. As part of the platform team, you may already be building a unified gen AI platform. Some common questions you might have are:
How do you ensure security and safety for your organization? As with any API, LLM APIs represent an attack vector. What are the LLM-specific considerations you need to worry about?
How do you stay within budget when your LLM adoption grows, while ensuring that each team has appropriate LLM capacity they need to continue to innovate and make your business more productive?
How do you put the right observability capabilities in place to understand your usage patterns, help troubleshoot issues, and capture compliance data?
How do you give end users of your gen AI applications the best possible experience, i.e., provide responses from the most appropriate models with minimal downtime?
Apigee, Google Cloud’s API management platform, has enabled our customers to address API challenges like these for over a decade. Here is an overview of the AI-powered digital value chain leveraging Apigee API Management.
Figure 1: AI-powered Digital Value chain
Gen AI, powered by AI agents and LLMs, is changing how customers interact with businesses, creating a large opportunity for any business. Apigee streamlines the integration of gen AI agents into applications by bolstering their security, scalability, and governance through features like authentication, traffic control, analytics, and policy enforcement. It also manages interactions with LLMs, improving security and efficiency. Additionally, Application Integration, an Integration-Platform-as-a-Service solution from Google Cloud, offers pre-built connectors that allow gen AI agents to easily connect with databases and external systems, helping them fulfill user requests.
This blog details how Apigee’s customers have been using the product to address challenges specific to LLM APIs. We’re also releasing a comprehensive set of reference solutions that enable you to get started on addressing these challenges yourself with Apigee. You can also view a webinar on the same topic, complete with product demos.
Apigee as a proxy for agents
AI agents leverage capabilities from LLMs to accomplish tasks for end-users. These agents can be built using a variety of tools — from no-code and low-code platforms, to full-code frameworks like LangChain or LlamaIndex. Apigee acts as an intermediary between your AI application and its agents. It enhances security by allowing you to defend your LLM APIs against the OWASP Top 10 API Security risks, manages user authentication and authorization, and optimizes performance through features like semantic caching. Additionally, Apigee enforces token limits to control costs and can even orchestrate complex interactions between multiple AI agents for advanced use cases.
Apigee as a gateway between LLM application and models
Depending on the task at hand, your AI agents might need to tap into the power of different LLMs. Apigee simplifies this by intelligently routing and managing failover of requests to the most suitable LLM using Apigee’s flexible configurations and templates. It also streamlines the onboarding of new AI applications and agents while providing robust access control for your LLMs. Beyond LLMs, agents often need to connect with databases and external systems to fully address users’ needs. Apigee’s robust API Management platform enables these interactions via managed APIs, and for more complex integrations, where custom business logic is required, you can leverage Google Cloud’s Application Integration platform.
It’s important to remember that these patterns aren’t one-size-fits-all. Your specific use cases will influence the architecture pattern for an agent and LLM interaction. For example, you might not always need to route requests to multiple LLMs. In some scenarios, you could connect directly to databases and external systems from the Apigee agent proxy layer. The key is flexibility — Apigee lets you adapt the architecture to match your exact needs.
Now let’s break down the specific areas where Apigee helps one by one:
AI safety
For any API managed with Apigee, you can call out to Model Armor, Google Cloud’s model safety offering that allows you to inspect every prompt and response to protect you against potential prompt attacks and help your LLMs respond within the guardrails you set. For example, you can specify that your LLM application does not provide answers about financial or political topics.
Latency and cost
Model response latency continues to be a major factor when building LLM-powered applications, and this will only get worse as more reasoning happens during inference. With Apigee, you can implement a semantic cache that allows you to cache responses to any model for semantically similar questions. This dramatically reduces the time end users need to wait for a response.
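The core idea of a semantic cache can be sketched in a few lines: reuse a stored response when a new prompt's embedding is close enough to a previously answered one. The toy bag-of-words `embed` function below stands in for a real embedding model, and the similarity threshold is illustrative.

```python
import math

# Minimal semantic-cache sketch: a cached response is reused when a new
# prompt's embedding is sufficiently similar to an earlier prompt's.
def embed(text: str) -> dict:
    # Toy bag-of-words "embedding"; a real system would call an embedding model.
    vec = {}
    for tok in text.lower().split():
        vec[tok] = vec.get(tok, 0) + 1
    return vec

def cosine(a: dict, b: dict) -> float:
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold
        self.entries = []  # list of (embedding, response) pairs

    def get(self, prompt: str):
        qv = embed(prompt)
        for ev, response in self.entries:
            if cosine(qv, ev) >= self.threshold:
                return response  # cache hit: the model call is skipped
        return None              # cache miss: forward the prompt to the model

    def put(self, prompt: str, response: str):
        self.entries.append((embed(prompt), response))

cache = SemanticCache()
cache.put("what is the refund policy", "Refunds are issued within 14 days.")
hit = cache.get("what is the refund policy please")  # near-duplicate: hit
```

On a hit, the end user gets an answer without waiting on model inference at all, which is where the latency and cost savings come from.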
Performance
Different models are good at different things. For example, Gemini Pro models provide the highest quality answers, while Gemini Flash models excel at speed and efficiency. You can route users’ prompts to the best model for the job, depending on the use case or application.
You can decide which model to use by specifying it in your API call and Apigee routes it to your desired model while keeping a consistent API contract. See this reference solution to get started.
Distribution and usage limits
With Apigee you can create a unified portal with self-service access to all the models in your organization. You can also set up usage limits by individual apps and developers to maintain capacity for those who need it, while also controlling overall costs. See how you can set up usage limits in Apigee using LLM token counts here.
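A token-based usage limit can be sketched as a per-app quota over a time window, analogous to enforcing LLM token counts at the gateway. The limit and window values below are illustrative.

```python
import time
from typing import Optional

# Sketch of a per-app LLM token quota: each app gets a budget of tokens per
# time window, and requests that would exceed the budget are throttled.
class TokenQuota:
    def __init__(self, tokens_per_window: int, window_seconds: float = 60.0):
        self.limit = tokens_per_window
        self.window = window_seconds
        self.usage = {}  # app_id -> (window_start, tokens_used)

    def allow(self, app_id: str, tokens: int, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        start, used = self.usage.get(app_id, (now, 0))
        if now - start >= self.window:       # new window: reset the counter
            start, used = now, 0
        if used + tokens > self.limit:
            return False                      # over quota: throttle this request
        self.usage[app_id] = (start, used + tokens)
        return True

quota = TokenQuota(tokens_per_window=1000)
ok = quota.allow("checkout-app", 800, now=0.0)         # within budget
throttled = quota.allow("checkout-app", 300, now=1.0)  # would exceed 1000
```

Because quotas are keyed by app, one noisy application exhausting its budget does not starve the others.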
Availability
Due to the high computational demands of LLM inference, model providers regularly restrict the number of tokens you can use in a certain time window. If you reach a model limit, requests from your applications will get throttled, which could lead to your end users being locked out of the model. In order to prevent this, you can implement a circuit breaker in Apigee so that requests are re-routed to a model with available capacity. See this reference solution to get started.
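The circuit-breaker pattern described above can be sketched as follows: after a model returns too many throttling errors, traffic is routed to the next model in priority order until the primary recovers. Model names and the failure threshold are illustrative.

```python
# Circuit-breaker sketch for LLM failover: repeated throttling errors open the
# circuit for a model, and requests fall through to a model with capacity.
class ModelCircuitBreaker:
    def __init__(self, models: list[str], failure_threshold: int = 3):
        self.models = models                 # in priority order
        self.threshold = failure_threshold
        self.failures = {m: 0 for m in models}

    def pick(self) -> str:
        for model in self.models:
            if self.failures[model] < self.threshold:
                return model                 # first model whose circuit is closed
        raise RuntimeError("no model with available capacity")

    def record(self, model: str, throttled: bool) -> None:
        if throttled:
            self.failures[model] += 1        # count consecutive throttles
        else:
            self.failures[model] = 0         # a success closes the circuit

breaker = ModelCircuitBreaker(["gemini-pro", "gemini-flash"])
for _ in range(3):
    breaker.record("gemini-pro", throttled=True)
fallback = breaker.pick()  # primary is open, so traffic shifts to the fallback
```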
Reporting
As a platform team, you need visibility into usage of the various models you support as well as which apps are consuming how many tokens. You might want to use this data for internal cost reporting or to optimize. Whatever your motivation, with Apigee, you can build dashboards that let you see usage based on actual token counts — the currency of LLM APIs. This way you can see the true usage volume across your applications. See this reference solution to get started.
Auditing and troubleshooting
Perhaps you need to log all interactions with LLMs (prompts, responses, RAG data) to meet compliance or troubleshooting requirements. Or perhaps you want to analyze response quality to continue to improve your LLM applications. With Apigee you can safely log any LLM interaction with Cloud Logging, de-identify it, and inspect it from a familiar interface. Get started here.
Security
With APIs increasingly seen as an attack surface, security is paramount to any API program. Apigee can act as a secure gateway for LLM APIs, allowing you to control access with API keys, OAuth 2.0, and JWT validation. This helps you enforce enterprise security standards to authenticate the users and applications that interact with your models. Apigee can also help prevent abuse and overload by enforcing rate limits and quotas, safeguarding LLMs from malicious attacks and unexpected traffic spikes.
In addition to these security controls, you can also use Apigee to control the model providers and models that can be used. You can do this by creating policies that define the models that can be accessed by which users or applications. For example, you could create a policy that only allows certain users to access your most powerful LLMs, or you could create a policy that only allows certain applications to access your LLMs for specific tasks. This gives you granular control over how your LLMs are used, so they are only used for their intended purposes.
By integrating Apigee with your LLM architecture, you create a secure and reliable environment for your AI applications to thrive.
Ready to unlock the full potential of gen AI?
Explore Apigee’s comprehensive capabilities for operationalizing AI and start building secure, scalable, and efficient gen AI solutions today! Visit our Apigee generative AI samples page to learn more and get started, watch a webinar with more details, or contact us here!
Amazon FSx for Lustre, a service that provides high-performance, cost-effective, and scalable file storage for compute workloads, now enables you to upgrade the Lustre version of your FSx for Lustre file systems. This feature allows you to benefit from the enhancements available in newer Lustre versions on your existing file systems.
FSx for Lustre provides fully-managed file systems built on Lustre, the world’s most popular open-source high performance file system. FSx for Lustre supports multiple long-term support Lustre versions released by the Lustre community. Newer Lustre versions provide benefits such as performance enhancements, new features, and support for the latest Linux kernel versions for your client instances. Starting today, you can upgrade your file systems to newer Lustre versions within minutes using the AWS Management Console or the AWS CLI/SDK.
The feature is now available on all file systems at no additional cost in all AWS Regions where FSx for Lustre is available. For more information, see Amazon FSx for Lustre documentation.
AWS AppSync, a fully managed GraphQL service that helps customers build scalable APIs, announces improvements to its EvaluateCode and EvaluateMappingTemplate APIs. This update enables developers to comprehensively mock all properties of the context object during resolver and function unit testing, including identity information, stash variables, and error handling. The enhancement also introduces improved JSON input validation with clear, actionable error messages, making it easier for developers to identify and fix issues in their context setup.
These improvements simplify the setup and configuration requirements. Developers can now efficiently test functions and resolvers by accessing and validating resolver stash (ctx.stash) and error tracking (ctx.outErrors) in their test environments. The update also simplifies identity mocking by allowing developers to include only the relevant caller information in ctx.identity. The updated console experience provides better visibility into the resolver test results, helping developers troubleshoot and optimize their resolver implementations more effectively.
This enhancement is available in all AWS Regions where AWS AppSync is currently supported.
To learn more about these new features, visit the AWS AppSync documentation and explore the context object reference. You can also explore examples and best practices in the AWS AppSync Developer Guide or get started by visiting the AWS AppSync console.
Amazon Elastic Block Store (Amazon EBS) now displays the full snapshot size for EBS Snapshots. With this enhancement, customers can now retrieve full snapshot sizes programmatically through the DescribeSnapshots API using the new field, full-snapshot-size-in-bytes. The full snapshot size is also displayed in the EBS Snapshots console under the new ‘Full snapshot size’ column.
Since EBS Snapshots are incremental in nature, if you take multiple snapshots of a volume over time, each snapshot only stores the new or modified blocks while maintaining references to unchanged blocks from previous snapshots. The ‘full snapshot size’ field shows you the total size of all blocks that make up a snapshot, including both the blocks stored directly in that snapshot and all blocks referenced from previous snapshots. For instance, if you have a 100 GB volume with 50 GB of data, the ‘full snapshot size’ would show 50 GB regardless of whether it’s the first snapshot or a subsequent one.
The ‘full snapshot size’ field provides crucial information about your EBS snapshot storage, such as the total size of the snapshot in the archived tier or the amount of data written to the source volume at the time the snapshot was created. Please note that this is different from the incremental snapshot size, which only refers to the size of newly changed blocks stored in that specific snapshot.
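The distinction between incremental and full snapshot size can be illustrated with a small model: each snapshot stores only its changed blocks and references unchanged blocks from earlier snapshots, and the full size is the size of the union of all blocks the snapshot resolves to. Block and snapshot layouts here are illustrative.

```python
# Illustration of incremental vs. full snapshot size for an incremental
# snapshot chain. Each snapshot maps block index -> data version; later
# snapshots store only changed blocks.
BLOCK_SIZE_GIB = 1  # toy block size to keep the arithmetic readable

def full_snapshot_size(chain: list[dict]) -> int:
    """chain: snapshots oldest-first, ending at the snapshot of interest."""
    resolved = {}
    for snapshot in chain:
        resolved.update(snapshot)  # changed blocks override referenced ones
    return len(resolved) * BLOCK_SIZE_GIB

snap1 = {i: "v1" for i in range(50)}  # 50 GiB written to a 100 GiB volume
snap2 = {i: "v2" for i in range(10)}  # 10 GiB changed since snap1

incremental_size = len(snap2) * BLOCK_SIZE_GIB  # blocks stored in snap2 alone
full_size = full_snapshot_size([snap1, snap2])  # all blocks snap2 resolves to
```

Here the second snapshot stores only 10 GiB of changed blocks, but its full snapshot size is still 50 GiB, matching the example in the announcement.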
This feature is now generally available in all commercial AWS regions and the AWS GovCloud (US) Regions. To get started, see the EBS Snapshots user guide and API specification.
AWS HealthScribe is a generative AI-powered service that automatically generates summarized clinical notes and transcripts from patient-clinician conversations. Documentation for behavioral health encounters follows a goal-centric GIRPP (Goal, Intervention, Response, Progress, Plan) format. With this launch, AWS HealthScribe customers can directly convert a behavioral health patient-clinician conversation into a GIRPP-format note, potentially saving clinicians hours of manual documentation daily for behavioral health encounters.
Customers using the HealthScribe StartMedicalScribeJob and StartMedicalScribeStream APIs can set the note template type parameter to “GIRPP” in the ClinicalNoteGenerationSettings for both async and streaming jobs; HealthScribe then returns the output note in GIRPP format when the conversation ends.
This feature is available in the US East (N. Virginia) Region. To learn more, refer to our documentation.
Google Cloud Next 2025 is coming up fast, and it’s shaping up to be a must-attend event for the cybersecurity community and anyone passionate about learning more about the threat landscape. We’re going to offer an immersive experience packed with opportunities to connect with experts, explore innovative technologies, and hone your skills in the ever-evolving world of cloud security and governance, frontline threat intelligence, enterprise compliance and resilience, AI risk management, and incident response.
Whether you’re a seasoned security pro or just starting your security journey, Next ’25 has something for you.
Immerse yourself in the Security Hub
The heart of our security presence at Next ‘25 will be the Security Hub, a dynamic space designed for engagement and exploration. Here, you can dive deep into the full portfolio of Google Cloud Security products, experience expanded demos, and get your most pressing questions answered by the engineers who build them.
Experience the SOC Arena
Step into our Security Operations Center (SOC) Arena for a front-row seat to real-world attack scenarios. Witness the latest hacker tactics and learn how Google Cloud equips cybersecurity teams with the data, AI, and scalable analytics needed to quickly detect and remediate attacks. Between SOC sessions, security experts and key partners will deliver lightning talks, sharing foundational insights and valuable resources to bolster your security knowledge.
Sharpen your skills in the Security Situation Room
The Situation Room offers two unique avenues for boosting your security expertise:
Security Tabletop Workshop: Prepare your organization for challenging security incidents by participating in a realistic cybersecurity tabletop exercise. Role-play different personas in simulated incidents such as a data breach or ransomware attack, and explore potential responses to see how your team might react, learning from varied perspectives and refining your approach through collaborative exploration. This exercise can help you identify vulnerabilities, evaluate incident response strategies, address gaps, foster collaboration, clarify roles, and ultimately reduce the potential impact of future attacks.
Birds of a Feather Sessions: These no-slide, discussion-focused sessions offer invaluable opportunities to connect with peers and Google Cloud Security experts. Dive into topics including securing AI, identity and access management, network security, and protection against fraud and abuse. Share challenges, discuss best practices, and explore cutting-edge trends in a collaborative environment as you network, learn, and contribute to the vibrant Google Cloud Security community.
Get hands-on in the Security Sandbox
The Security Sandbox is where the action happens. Two interactive experiences await:
Capture the Flag (CTF): Test your cybersecurity prowess in Google Threat Intelligence’s CTF challenge. This unique game blends real-world data from CISA advisories, ransom notes, and Dark Web intelligence into a simulated threat hunt.
Use industry-standard tools and data to navigate clues, analyze evidence, and solve puzzles. This CTF is designed for all skill levels, offering a chance to learn valuable techniques, experience the thrill of an investigation, and even win prizes.
ThreatSpace: Step into Google Cloud’s ThreatSpace, a digital training ground where you can experience real cyberattacks and practice your incident response skills in a safe environment. Mandiant’s red team will simulate attacks while their incident response team guides you through the investigation. Use Google Cloud Security tools including Security Operations and Threat Intelligence to uncover the attacker’s methods and prevent further damage.
Connect and recharge at Coffee Talk
Grab a coffee, snag a copy of “Defenders Advantage,” and chat with Google Cloud Security experts. Learn how our products and services can empower your security strategy across the domains of intelligence, detection, response, validation, hunting, and mission control and get personalized advice for your organization.
Register today
Next ’25 is your chance to immerse yourself in the world of cybersecurity, connect with industry leaders, and gain the knowledge and skills you need to stay ahead of the curve. To join us, register here.
Today, we are excited to announce the general availability of Jasmine, a new Singaporean English Neural Text-to-Speech (NTTS) female voice for Amazon Polly.
Amazon Polly is a service that turns text into lifelike speech, allowing you to create applications that talk and to build entirely new categories of speech-enabled products.
Jasmine is our first voice for the Singaporean variant of English. Even though Singaporean English is reported to be close to British English, there are some unique pronunciation patterns that we captured while training this voice, such as pronunciation of telephone numbers or postal codes, to make sure that Jasmine sounds like a local speaker. With this launch, we continue building a variety of voice and language options for Amazon Polly customers.
AWS AppSync GraphQL now offers operation-level caching, a new feature that allows customers to cache entire GraphQL query operation responses. This enhancement enables developers to optimize read-heavy GraphQL APIs, delivering faster response times and improved application performance.
Operation-level caching in AWS AppSync GraphQL streamlines the caching process by storing complete query responses. This approach is particularly beneficial for complex queries or high-traffic scenarios, where it can significantly reduce latency and enhance the overall user experience. By caching at the operation level, developers can easily boost API efficiency and create more responsive applications without additional code changes.
Operation-level caching is now available in all AWS Regions where AWS AppSync is offered.
To learn more about operation-level caching in AWS AppSync GraphQL, visit the AWS AppSync documentation. You can start using this feature today by configuring caching settings in the AWS AppSync GraphQL console or through the AWS CLI.
Cybercrime makes up a majority of the malicious activity online and occupies the majority of defenders’ resources. In 2024, Mandiant Consulting responded to almost four times more intrusions conducted by financially motivated actors than state-backed intrusions. Despite this overwhelming volume, cybercrime receives much less attention from national security practitioners than the threat from state-backed groups. While the threat from state-backed hacking is rightly understood to be severe, it should not be evaluated in isolation from financially motivated intrusions.
A hospital disrupted by a state-backed group using a wiper and a hospital disrupted by a financially motivated group using ransomware have the same impact on patient care. Likewise, sensitive data stolen from an organization and posted on a data leak site can be exploited by an adversary in the same way data exfiltrated in an espionage operation can be. These examples are particularly salient today, as criminals increasingly target and leak data from hospitals. Healthcare’s share of posts on data leak sites has doubled over the past three years, even as the number of data leak sites tracked by Google Threat Intelligence Group has increased by nearly 50% year over year. The impact of these attacks means that they must be taken seriously as a national security threat, no matter the motivation of the actors behind them.
Cybercrime also facilitates state-backed hacking by allowing states to purchase cyber capabilities, or co-opt criminals to conduct state-directed operations to steal data or engage in disruption. Russia has drawn on criminal capabilities to fuel cyber support for its war in Ukraine. GRU-linked APT44 (aka Sandworm), a unit of Russian military intelligence, has employed malware available from cybercrime communities to conduct espionage and disruptive operations in Ukraine, and CIGAR (aka RomCom), a group that historically focused on cybercrime, has conducted espionage operations against the Ukrainian government since 2022. However, this is not limited to Russia. Iranian threat groups deploy ransomware to raise funds while simultaneously conducting espionage, and Chinese espionage groups often supplement their income with cybercrime. Most notably, North Korea uses state-backed groups to directly generate revenue for the regime. North Korea has heavily targeted cryptocurrencies, compromising exchanges and individual victims' crypto wallets.
Despite the overlaps in effects and collaboration with states, tackling the root causes of cybercrime requires fundamentally different solutions. Cybercrime involves collaboration between disparate groups often across borders and without respect to sovereignty. Any solution requires international cooperation by both law enforcement and intelligence agencies to track, arrest, and prosecute these criminals. Individual takedowns can have important temporary effects, but the collaborative nature of cybercrime means that the disrupted group will be quickly replaced by others offering the same service. Achieving broader success will require collaboration between countries and public and private sectors on systemic solutions such as increasing education and resilience efforts.
Download the full report: "Cybercrime: A Multifaceted National Security Threat" (https://services.google.com/fh/files/misc/cybercrime-multifaceted-national-security-threat.pdf)
Stand-Alone Cybercrime Is a Threat to Countries' National Security
Financially motivated cyber intrusions, even those without any ties to state goals, harm national security. A single incident can be impactful enough on its own to have a severe consequence on the victim and disrupt citizens’ access to critical goods and services. The enormous volume of financially motivated intrusions occurring every day also has a cumulative impact, hurting national economic competitiveness and placing huge strain on cyber defenders, leading to decreased readiness and burnout.
A Single Financially-Motivated Operation Can Have Severe Effects
Cybercrime, particularly ransomware attacks, is a serious threat to critical infrastructure. Disruptions to energy infrastructure, such as the 2021 Colonial Pipeline attack, a 2022 incident at the Amsterdam-Rotterdam-Antwerp refining hub, and the 2023 attack on Petro-Canada, have disrupted citizens' ability to access vital goods. While the impacts in these cases were temporary and recoverable, a ransomware attack during a weather emergency or other acute situation could have devastating consequences.
Beyond energy, ransomware attacks on the healthcare sector have had the most severe consequences for everyday people. At the height of the pandemic in early 2020, it appeared that ransomware groups might steer clear of hospitals, with multiple groups making statements to that effect, but the forbearance did not hold. Healthcare organizations' critical missions and the high impact of disruptions have led ransomware operators to perceive them as more likely to pay a ransom, prompting some groups to increase their focus on targeting healthcare. The healthcare industry, especially hospitals, almost certainly continues to be a lucrative target for ransomware operators given the sensitivity of patient data and the criticality of the services it provides.
Since 2022, Google Threat Intelligence Group (GTIG) has observed a notable increase in the number of data leak site (DLS) victims from within the hospital subsector. Data leak sites, which are used to release victim data following data theft extortion incidents, are intended to pressure victims to pay a ransom demand or give threat actors additional leverage during ransom negotiations.
In July 2024, the Qilin (aka “AGENDA”) DLS announced upcoming attacks targeting US healthcare organizations. They followed through with this threat by adding a regional medical center to their list of claimed victims on the DLS the following week, and adding multiple healthcare and dental clinics in August 2024. The ransomware operators have purportedly stated that they focus their targeting on sectors that pay well, and one of those sectors is healthcare.
In March 2024, the RAMP forum actor “badbone,” who has been associated with INC ransomware, sought illicit access to Dutch and French medical, government, and educational organizations, stating that they were willing to pay 2–5% more for hospitals, particularly ones with emergency services.
Studies from academics and internal hospital reviews have shown that the disruptions from ransomware attacks go beyond inconvenience and have led to life-threatening consequences for patients. Disruptions can impact not just individual hospitals but also the broader healthcare supply chain. Cyberattacks on companies that manufacture critical medications and life-saving therapies can have far-reaching consequences worldwide.
A recent study from researchers at the University of Minnesota – Twin Cities School of Public Health showed that among patients already admitted to a hospital when a ransomware attack takes place, "in-hospital mortality increases by 35–41%."
Public reporting stated that UK National Health Service data showed a June 2024 ransomware incident at a contractor led to multiple cases of “long-term or permanent impact on physical, mental or social function or shortening of life-expectancy,” with more numerous cases of less severe effects.
Ransomware operators are aware that their attacks on hospitals will have severe consequences and will likely increase government attention on them. Although some have devised strategies to mitigate the blowback from these operations, the potential monetary rewards associated with targeting hospitals continue to drive attacks on the healthcare sector.
The actor “FireWalker,” who has recruited partners for REDBIKE (aka Akira) ransomware operations, indicated a willingness to accept access to government and medical targets, but in those cases a different ransomware called “FOULFOG” would be used.
Leaked private communications broadly referred to as the “ContiLeaks” reveal that the actors expected their plan to target the US healthcare system in the fall of 2020 to cause alarm, with one actor stating “there will be panic.”
Economic Disruption
On May 8, 2022, Costa Rican President Rodrigo Chaves declared a national emergency caused by CONTI ransomware attacks against several Costa Rican government agencies the month prior. These intrusions caused widespread disruptions in government medical, tax, pension, and customs systems. With imports and exports halted, ports were overwhelmed, and the country reportedly experienced millions of dollars of losses. The remediation costs extended beyond Costa Rica; Spain supported the immediate response efforts, and in 2023, the US announced $25 million USD in cybersecurity aid to Costa Rica.
While the Costa Rica incident was exceptional, responding to a cybercrime incident can involve significant expenses for the affected entity, such as multi-million-dollar ransom payments, income lost to system downtime, credit monitoring services for impacted clients, and remediation costs and fines. In just one example, a US healthcare organization reported $872 million USD in "unfavorable cyberattack effects" after a disruptive incident. In the most extreme cases, these costs can contribute to organizations ceasing operations or declaring bankruptcy.
In addition to the direct impacts on individual organizations, financial costs often extend to taxpayers and can ripple through the national economy due to follow-on effects of the disruptions. The US Federal Bureau of Investigation's Internet Crime Complaint Center (IC3) has indicated that between October 2013 and December 2023, business email compromise (BEC) operations alone led to $55 billion USD in losses. The cumulative effect of these cybercrime incidents can have an impact on a country's economic competitiveness. This can be particularly severe for smaller or developing countries, especially those with a less diverse economy.
Data Leak Sites Add Additional Threats
In addition to deploying ransomware to interfere with business operations, criminal groups have added the threat of leaking data stolen from victims to bolster their extortion operations. This now standard tactic has increased the volume of sensitive data being posted by criminals and created an opportunity for it to be obtained and exploited by state intelligence agencies.
Threat actors post proprietary company data—including research and product designs—on data leak sites where they are accessible to the victims’ competitors. GTIG has previously observed threat actors sharing tips for targeting valuable data for extortion operations. In our research, GTIG identified Conti “case instructions” indicating that actors should prioritize certain types of data to use as leverage in negotiations, including files containing confidential information, document scans, HR documents, company projects, and information protected by the General Data Protection Regulation (GDPR).
The number of data leak sites has proliferated, with the number of sites tracked by GTIG almost doubling since 2022. Leaks of confidential business and personal information by extortion groups can cause embarrassment and legal consequences for the affected organization, but they also pose national security threats. If a company's confidential intellectual property is leaked, it can undermine the firm's competitive position in the market and erode the host country's economic competitiveness. The wide-scale leaking of personally identifiable information (PII) also creates an opportunity for foreign governments to collect this information to facilitate surveillance and tracking of a country's citizens.
Cybercrime Directly Supporting State Activity
Since the earliest computer network intrusions, financially motivated actors have conducted operations for the benefit of hostile governments. While this pattern has been consistent, the surge in cyber activity following Russia's war in Ukraine has shown that, in times of heightened need, the latent talent pool of cybercriminals can be paid or coerced to support state goals. Operations carried out in support of the state, but by criminal actors, have numerous benefits for their sponsors, including lower cost and increased deniability. As the volume of financially motivated activity increases, so does the potential danger it presents.
States as a Customer in Cybercrime Ecosystems
Modern cybercriminals are likely to specialize in a particular area of cybercrime and partner with other entities with diverse specializations to conduct operations. The specialization of cybercrime capabilities presents an opportunity for state-backed groups to simply show up as another customer for a group that normally sells to other criminals. Purchasing malware, credentials, or other key resources from illicit forums can be cheaper for state-backed groups than developing them in-house, while also providing some ability to blend into financially motivated operations and attract less notice.
Russian State Increasingly Leveraging Malware, Tooling Sourced from Crime Marketplaces
Google assesses that resource constraints and operational demands have contributed to Russian cyber espionage groups’ increasing use of free or publicly available malware and tooling, including those commonly employed by criminal actors to conduct their operations. Following Russia’s full-scale invasion of Ukraine, GTIG has observed groups suspected to be affiliated with Russian military intelligence services adopt this type of “low-equity” approach to managing their arsenal of malware, utilities, and infrastructure. The tools procured from financially motivated actors are more widespread and lower cost than those developed by the government. This means that if an operation using this malware is discovered, the cost of developing a new tool will not be borne by the intelligence agency; additionally, the use of such tools may assist in complicating attribution efforts. Notably, multiple threat clusters with links to Russian military intelligence have leveraged disruptive malware adapted from existing ransomware variants to target Ukrainian entities.
APT44 (Sandworm, FROZENBARENTS)
APT44, a threat group sponsored by Russian military intelligence, almost certainly relies on a diverse set of Russian companies and criminal marketplaces to source and sustain its more frequently operated offensive capabilities. The group has used criminally sourced tools and infrastructure as a source of disposable capabilities that can be operationalized on short notice without immediate links to its past operations. Since Russia’s full-scale invasion of Ukraine, APT44 has increased its use of such tooling, including malware such as DARKCRYSTALRAT (DCRAT), WARZONE, and RADTHIEF (“Rhadamanthys Stealer”), and bulletproof hosting infrastructure such as that provided by the Russian-speaking actor “yalishanda,” who advertises in cybercriminal underground communities.
APT44 campaigns in 2022 and 2023 deployed RADTHIEF against victims in Ukraine and Poland. In one campaign, spear-phishing emails targeted a Ukrainian drone manufacturer and leveraged SMOKELOADER, a publicly available downloader popularized in a Russian-language underground forum that is still frequently used in criminal operations, to load RADTHIEF.
APT44 also has a history of deploying disruptive malware built upon known ransomware variants. In October 2022, a cluster we assessed with moderate confidence to be APT44 deployed PRESSTEA (aka Prestige) ransomware against logistics entities in Poland and Ukraine, a rare instance in which APT44 deployed disruptive capabilities against a NATO country. In June 2017, the group conducted an attack leveraging ETERNALPETYA (aka NotPetya), a wiper disguised as ransomware, timed to coincide with Ukraine’s Constitution Day marking its independence from Russia. Nearly two years earlier, in late 2015, the group used a modified BLACKENERGY variant to disrupt the Ukrainian power grid. BLACKENERGY originally emerged as a distributed denial-of-service (DDoS) tool, with later versions sold in criminal marketplaces.
UNC2589 (FROZENVISTA)
UNC2589, a threat cluster whose activity has been publicly attributed to the Russian General Staff Main Intelligence Directorate (GRU)’s 161st Specialist Training Center (Unit 29155), has conducted full-spectrum cyber operations, including destructive attacks, against Ukraine. The actor is known to rely on non-military elements including cybercriminals and private-sector organizations to enable their operations, and GTIG has observed the use of a variety of malware-as-a-service tools that are prominently sold in Russian-speaking cybercrime communities.
In January 2022, a month prior to the invasion, UNC2589 deployed PAYWIPE (also known as WHISPERGATE) and SHADYLOOK wipers against Ukrainian government entities in what may have been a preliminary strike, using the GOOSECHASE downloader and FINETIDE dropper to drop and execute SHADYLOOK on the target machine. US Department of Justice indictments identified a Russian civilian, who GTIG assesses was a likely criminal contractor, as managing the digital environments used to stage the payloads used in the attacks. Additionally, CERT-UA corroborated GTIG's findings of strong similarities between SHADYLOOK and WhiteBlackCrypt ransomware (also tracked as WARYLOOK). GOOSECHASE and FINETIDE are also publicly available for purchase on underground forums.
Turla (SUMMIT)
In September 2022, GTIG identified an operation leveraging a legacy ANDROMEDA infection to gain initial access to selective targets conducted by Turla, a cyber espionage group we assess to be sponsored by Russia’s Federal Security Service (FSB). Turla re-registered expired command-and-control (C&C or C2) domains previously used by ANDROMEDA, a common commodity malware that was widespread in the early 2010s, to profile victims; it then selectively deployed KOPILUWAK and QUIETCANARY to targets in Ukraine. The ANDROMEDA backdoor whose C2 was hijacked by Turla was first uploaded to VirusTotal in 2013 and spreads from infected USB keys.
While GTIG has continued to observe ANDROMEDA infections across a wide variety of victims, GTIG has only observed suspected Turla payloads delivered in Ukraine. However, Turla’s tactic of piggybacking on widely distributed, financially motivated malware to enable follow-on compromises is one that can be used against a wide range of organizations. Additionally, the use of older malware and infrastructure may cause such a threat to be overlooked by defenders triaging a wide variety of alerts.
In December 2024, Microsoft reported on the use of Amadey bot malware related to cyber criminal activity to target Ukrainian military entities by Secret Blizzard, an actor that aligns approximately with what we track as Turla. While we are unable to confirm this activity, Microsoft’s findings suggest that Turla has continued to leverage the tactic of using cybercrime malware.
APT29 (ICECAP)
In late 2021, GTIG reported on a campaign conducted by APT29, a threat group assessed to be sponsored by the Russian Foreign Intelligence Service (SVR), in which operators used credentials likely procured from an infostealer malware campaign conducted by a third-party actor to gain initial access to European entities. Infostealers are a broad classification of malware that have the capability or primary goal of collecting and stealing a range of sensitive user information such as credentials, browser data and cookies, email data, and cryptocurrency wallets. An analysis of workstations belonging to the target revealed that some systems had been infected with the CRYPTBOT infostealer shortly before a stolen session token used to gain access to the targets' Microsoft 365 environment was generated.
[Image: An example of the sale of government credentials on an underground forum]
Use of Cybercrime Tools by Iran and China
While Russia is the country most frequently identified drawing on resources from criminal forums, it is not alone. For instance, in May 2024, GTIG identified a suspected Iranian group, UNC5203, using the aforementioned RADTHIEF backdoor in an operation using themes associated with the Israeli nuclear research industry.
In multiple investigations, the Chinese espionage operator UNC2286 was observed ostensibly carrying out extortion operations, including using STEAMTRAIN ransomware, possibly to mask its activities. The ransomware dropped a JPG file named "Read Me.jpg" that largely copies the ransom note delivered with DARKSIDE. However, no links have been established with the DARKSIDE ransomware-as-a-service (RaaS), suggesting the similarities are largely superficial and intended to lend credibility to the extortion attempt. Deliberately mixing ransomware activities with espionage intrusions supports the Chinese Government's public efforts to confound attribution by conflating cyber espionage activity and ransomware operations.
Criminals Supporting State Goals
In addition to purchasing tools for state-backed intrusion groups to use, countries can directly hire or co-opt financially motivated attackers to conduct espionage and attack missions on behalf of the state. Russia, in particular, has leveraged cybercriminals for state operations.
Current and Former Russian Cybercriminal Actors Engage in Targeted Activity Supporting State Objectives
Russian intelligence services have increasingly leveraged pre-existing or new relationships with cybercriminal groups to advance national objectives and augment intelligence collection, particularly since the beginning of Russia's full-scale invasion of Ukraine. GTIG judges that this reflects a combination of new efforts by the Russian state and the continuation of relationships between the Russian intelligence services and financially motivated, Russia-based threat actors that predated the invasion. In at least some cases, current and former members of Russian cybercriminal groups have carried out intrusion activity likely in support of state objectives.
CIGAR (UNC4895, RomCom)
CIGAR (also tracked as UNC4895 and publicly reported as RomCom) is a dual financial and espionage-motivated threat group. Active since at least 2019, the group historically conducted financially motivated operations before expanding into espionage activity that GTIG judges fulfills espionage requirements in support of Russian national interests following the start of Russia’s full-scale invasion of Ukraine. CIGAR’s ongoing engagement in both types of activity differentiates the group from threat actors like APT44 or UNC2589, which leverage cybercrime actors and tooling toward state objectives. While the precise nature of the relationship between CIGAR and the Russian state is unclear, the group’s high operational tempo, constant evolution of its malware arsenal and delivery methods, and its access to and exploitation of multiple zero-day vulnerabilities suggest a level of sophistication and resourcefulness unusual for a typical cybercrime actor.
Targeted intrusion activity from CIGAR dates back to late 2022, targeting Ukrainian military and government entities. In October 2022, CERT-UA reported on a phishing campaign that distributed emails allegedly on behalf of the Press Service of the General Staff of the Armed Forces of Ukraine, which led to the deployment of the group’s signature RomCom malware. Two months later, in December 2022, CERT-UA highlighted a RomCom operation targeting users of DELTA, a situational awareness and battlefield management system used by the Ukrainian military.
CIGAR activity in 2023 and 2024 included the leveraging of zero-day vulnerabilities to conduct intrusion activity. In late June 2023, a phishing operation targeting European government and military entities used lures related to the Ukrainian World Congress, a nonprofit involved in advocacy for Ukrainian interests, and a then-upcoming NATO summit, to deploy the MAGICSPELL downloader, which exploited CVE-2023-36884 as a zero-day in Microsoft Word. In 2024, the group was reported to exploit the Firefox vulnerability CVE-2024-9680, chained together with the Windows vulnerability CVE-2024-49039, to deploy RomCom.
CONTI
At the outset of Russia's full-scale invasion of Ukraine, the CONTI ransomware group publicly announced its support for the Russian government, and subsequent leaks of server logs allegedly containing chat messages from members of the group revealed that at least some individuals were interested in conducting targeted attacks, and may have been taking targeting directions from a third party. GTIG further assessed that former CONTI members comprise part of an initial access broker group conducting targeted attacks against Ukraine tracked by CERT-UA as UAC-0098.
UAC-0098 historically delivered the IcedID banking trojan, leading to human-operated ransomware attacks, and GTIG assesses that the group previously acted as an initial access broker for various ransomware groups including CONTI and Quantum. In early 2022, however, the actor shifted its focus to Ukrainian entities in the government and hospitality sectors as well as European humanitarian and nonprofit organizations.
UNC5174
UNC5174 uses the "Uteus" hacktivist persona who has claimed to be affiliated with China's Ministry of State Security, working as an access broker and possible contractor who conducts for-profit intrusions. UNC5174 has weaponized multiple vulnerabilities soon after they were publicly announced, attempting to compromise numerous devices before they could be patched. For example, in February 2024, UNC5174 was observed exploiting CVE-2024-1709 in ConnectWise ScreenConnect to compromise hundreds of institutions primarily in the US and Canada, and in April 2024, GTIG confirmed UNC5174 had weaponized CVE-2024-3400 in an attempt to exploit Palo Alto Networks' (PAN) GlobalProtect appliances. In both cases, multiple China-nexus clusters were identified leveraging the exploits, underscoring how UNC5174 may enable additional operators.
Hybrid Groups Enable Cheap Capabilities
Another form of financially motivated activity supporting state goals involves groups whose main mission may be state-sponsored espionage but that are, either tacitly or explicitly, allowed to conduct financially motivated operations to supplement their income. This can allow a government to offset the direct costs that would be required to maintain groups with robust capabilities.
Moonlighting Among Chinese Contractors
APT41
APT41 is a prolific cyber operator working out of the People's Republic of China and most likely a contractor for the Ministry of State Security. In addition to state-sponsored espionage campaigns against a wide array of industries, APT41 has a long history of conducting financially motivated operations. The group's cybercrime activity has mostly focused on the video game sector, including ransomware deployment. APT41 has also enabled other Chinese espionage groups, with digital certificates stolen by APT41 later employed by other Chinese groups. APT41's cybercrime has continued since GTIG's 2019 report, with the United States Secret Service attributing an operation that stole millions in COVID relief funds to APT41, and GTIG identifying an operation targeting state and local governments.
Iranian Groups Deploy Ransomware for Disruption and Profit
Over the past several years, GTIG has observed Iranian espionage groups conducting ransomware operations and disruptive hack-and-leak operations. Although much of this activity is likely primarily driven by disruptive intent, some actors working on behalf of the Iranian government may also be seeking ways to monetize stolen data for personal gain, and Iran’s declining economic climate may serve as an impetus for this activity.
UNC757
In August 2024, the US Federal Bureau of Investigation (FBI), Cybersecurity and Infrastructure Security Agency (CISA), and Department of Defense Cybercrime Center (DC3) released a joint advisory indicating that a group of Iran-based cyber actors known as UNC757 collaborated with ransomware affiliates including NoEscape, Ransomhouse, and ALPHV to gain network access to organizations across various sectors and then help the affiliates deploy ransomware for a percentage of the profits. The advisory further indicated that the group stole data from targeted networks likely in support of the Iranian government, and their ransomware operations were likely not sanctioned by the Government of Iran.
GTIG is unable to independently corroborate UNC757’s reported collaboration with ransomware affiliates. However, the group has historical, suspected ties to the persona “nanash” that posted an advertisement in mid-2020 on a cybercrime forum claiming to have access to various networks, as well as hack-and-leak operations associated with the PAY2KEY ransomware and corresponding persona that targeted Israeli firms.
Examples of Dual Motive (Financial Gain and Espionage)
In multiple incidents, individuals who have conducted cyber intrusions on behalf of the Iranian government have also been identified conducting financially motivated intrusions.
A 2020 US Department of Justice indictment indicated that two Iranian nationals conducted cyber intrusion operations targeting data “pertaining to national security, foreign policy intelligence, non-military nuclear information, aerospace data, human rights activist information, victim financial information and personally identifiable information, and intellectual property, including unpublished scientific research.” The intrusions in some cases were conducted at the behest of the Iranian government, while in other instances, the defendants sold hacked data for financial gain.
In 2017, the US DoJ indicted an Iranian national who attempted to extort HBO by threatening to release stolen content. The individual had previously worked on behalf of the Iranian military to conduct cyber operations targeting military and nuclear software systems and Israeli infrastructure.
DPRK Cyber Threat Actors Conduct Financially Motivated Operations to Generate Revenue for Regime, Fund Espionage Campaigns
Financially motivated operations are broadly prevalent among threat actors linked to the Democratic People's Republic of Korea (DPRK). These include groups focused on generating revenue for the regime as well as those that use the illicit funds to support their intelligence-gathering efforts. DPRK cybercrime focuses on the cryptocurrency sector and blockchain-related platforms, leveraging tactics including but not limited to the creation and deployment of malicious applications posing as cryptocurrency trading platforms and the airdropping of malicious non-fungible tokens (NFTs) that redirect the user to wallet-stealing phishing websites. A March 2024 United Nations (UN) report estimated North Korean cryptocurrency theft between 2017 and 2023 at approximately $3 billion.
APT38
APT38, a financially motivated group aligned with the Reconnaissance General Bureau (RGB), was responsible for the attempted theft of vast sums of money from institutions worldwide, including via compromises targeting SWIFT systems. Public reporting has associated the group with the use of money mules and casinos to withdraw and launder funds from fraudulent ATM and SWIFT transactions. In publicly reported heists alone, APT38's attempted thefts from financial institutions totaled over $1.1 billion USD, and by conservative estimates, successful operations have amounted to over $100 million USD. The group has also deployed destructive malware against target networks to render them inoperable following theft operations. While APT38 now appears to be defunct, we have observed evidence of its operators regrouping into other clusters, including those heavily targeting cryptocurrency and blockchain-related entities and other financial institutions.
UNC1069 (CryptoCore), UNC4899 (TraderTraitor)
Limited indicators suggest that threat clusters GTIG tracks as UNC1069 (publicly referred to as CryptoCore) and UNC4899 (also reported as TraderTraitor) are successors to the now-defunct APT38. These clusters focus on financial gain, primarily by targeting cryptocurrency and blockchain entities. In December 2024, a joint statement released by the US FBI, DC3, and National Police Agency of Japan (NPA) reported on TraderTraitor’s theft of cryptocurrency then valued at $308 million USD from a Japan-based company.
APT43 (Kimsuky)
APT43, a prolific cyber actor whose collection requirements align with the mission of the RGB, funds itself through cybercrime operations to support its primary mission of collecting strategic intelligence, in contrast to groups focused primarily on revenue generation like APT38. While the group’s espionage targeting is broad, it has demonstrated a particular interest in foreign policy and nuclear security, leveraging moderately sophisticated technical capabilities coupled with aggressive social engineering tactics against government organizations, academia, and think tanks. Meanwhile, APT43’s financially motivated operations focus on stealing and laundering cryptocurrency to buy operational infrastructure.
UNC3782
UNC3782, a suspected North Korean threat actor active since at least 2022, conducts both financial crime operations against the cryptocurrency sector and espionage activity, including the targeting of South Korean organizations attempting to combat cryptocurrency-related crimes, such as law firms and related government and media entities. UNC3782 has targeted users on cryptocurrency platforms including Ethereum, Bitcoin, Arbitrum, Binance Smart Chain, Cronos, Polygon, TRON, and Solana; Solana in particular constitutes a target-rich environment for criminal actors due to the platform’s rapid growth.
APT45 (Andariel)
APT45, a North Korean cyber operator active since at least 2009, has conducted espionage operations focusing on government, defense, nuclear, and healthcare and pharmaceutical entities. The group has also expanded its remit to financially motivated operations, and we suspect that it engaged in the development of ransomware, distinguishing it from other DPRK-nexus actors.
DPRK IT Workers
DPRK IT workers pose as non-North Korean nationals seeking employment at a wide range of organizations globally to generate revenue for the North Korean regime, enabling it to evade sanctions and fund its weapons of mass destruction (WMD) and ballistic missiles programs. IT workers have also increasingly leveraged their privileged access at employer organizations to engage in or enable malicious intrusion activity and, in some cases, extort those organizations with threats of data leaks or sales of proprietary company information following the termination of their employment.
While DPRK IT worker operations are widely reported to target US companies, they have increasingly expanded to Europe and other parts of the world. Tactics to evade detection include the use of front companies and services of “facilitators,” non-North Korean individuals who provide services such as money and/or cryptocurrency laundering, assistance during the hiring process, and receiving and hosting company laptops to enable the workers’ remote access in exchange for a percentage of the workers’ incomes.
A Comprehensive Approach is Required
We believe tackling this challenge will require a new and stronger approach recognizing the cybercriminal threat as a national security priority requiring international cooperation. While some welcome enhancements have been made in recent years, more must—and can—be done. The structure of the cybercrime ecosystem makes it particularly resilient to takedowns. Financially motivated actors tend to specialize in a single facet of cybercrime and regularly work with others to accomplish bigger schemes. While some actors may repeatedly team up with particular partners, actors regularly have multiple suppliers (or customers) for a given service.
If a single ransomware-as-a-service provider is taken down, many others are already in place to fill in the gap that has been created. This resilient ecosystem means that while individual takedowns can disrupt particular operations and create temporary inconveniences for cybercriminals, these methods need to be paired with wide-ranging efforts to improve defense and crack down on these criminals’ ability to carry out their operations. We urge policymakers to consider taking a number of steps:
Demonstrably elevate cybercrime as a national security priority: Governments must recognize cybercrime as a pernicious national security threat and allocate resources accordingly. This includes prioritizing intelligence collection and analysis on cybercriminal organizations, enhancing law enforcement capacity to investigate and prosecute cybercrime, and fostering international cooperation to dismantle these transnational networks.
Strengthen cybersecurity defenses: Policymakers should promote the adoption of robust cybersecurity measures across all sectors, particularly critical infrastructure. This includes incentivizing the implementation of security best practices, investing in research and development of advanced security technologies, enabling digital modernization and uptake of new technologies that can advantage defenders, and supporting initiatives that enhance the resilience of digital systems against attacks and related deceptive practices.
Disrupt the cybercrime ecosystem: Targeted efforts are needed to disrupt the cybercrime ecosystem by targeting key enablers such as malware developers, bulletproof hosting providers, and financial intermediaries such as cryptocurrency exchanges. This requires a combination of legal, technical, and financial measures to dismantle the infrastructure that supports cybercriminal operations and coordinated international efforts to enable the same.
Enhance international cooperation: cybercrime transcends national borders, necessitating strong international collaboration to effectively combat this threat. Policymakers should prioritize and resource international frameworks for cyber threat information sharing, joint investigations, and coordinated takedowns of cybercriminal networks, including by actively contributing to the strengthening of international organizations and initiatives dedicated to combating cybercrime, such as the Global Anti-Scams Alliance (GASA). They should also prioritize collective efforts to publicly decry malicious cyber activity through joint public attribution and coordinated sanctions, where appropriate.
Empower individuals and businesses: Raising awareness about cyber threats and promoting cybersecurity education is crucial to building a resilient society. Policymakers should support initiatives that educate individuals and businesses about online safety, encourage the adoption of secure practices, empower service providers to take action against cybercriminals including through enabling legislation, and provide resources for reporting and recovering from cyberattacks.
Elevate strong private sector security practices: Ransomware and other forms of cybercrime predominantly exploit insecure, often legacy technology architectures. Policymakers should consider steps to prioritize technology transformation, including the adoption of technologies/products with a strong security track record; diversifying vendors to mitigate risk resulting from overreliance on a single technology; and requiring interoperability across the technology stack.
About the Authors
Google Threat Intelligence Group brings together the Mandiant Intelligence and Threat Analysis Group (TAG) teams, and focuses on identifying, analyzing, mitigating, and eliminating entire classes of cyber threats against Alphabet, our users, and our customers. Our work includes countering threats from government-backed attackers, targeted 0-day exploits, coordinated information operations (IO), and serious cybercrime networks. We apply our intelligence to improve Google’s defenses and protect our users and customers.
Today, AWS Secrets Manager announces that AWS Secrets and Configuration Provider (ASCP) now integrates with Amazon Elastic Kubernetes Service (Amazon EKS) Pod Identity. This integration simplifies IAM authentication for Amazon EKS when retrieving secrets from AWS Secrets Manager or parameters from AWS Systems Manager Parameter Store. With this new capability, you can manage IAM permissions for Kubernetes applications more efficiently and securely, enabling granular access control through role session tags on secrets.
ASCP is a plugin for the industry-standard Kubernetes Secrets Store CSI Driver. It enables applications running in Kubernetes pods to retrieve secrets from AWS Secrets Manager easily, without the need for custom code or restarting containers when secrets are rotated. Amazon EKS Pod Identity streamlines the process of configuring IAM permissions for Kubernetes applications in a more efficient and secure way. This integration combines the strengths of both components, enhancing secret management in Amazon EKS environments.
Previously, ASCP relied on IAM Roles for Service Accounts (IRSA) for authentication. Now, you can choose between IRSA and Pod Identity for IAM authentication using the new optional parameter “usePodIdentity”. This flexibility allows you to adopt the authentication method that best suits your security requirements and operational needs.
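As an illustrative sketch (all names here are hypothetical, not taken from the announcement), opting in to Pod Identity is done by setting the `usePodIdentity` parameter in the `SecretProviderClass` that the CSI driver mounts:

```yaml
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: my-app-secrets        # hypothetical name
spec:
  provider: aws
  parameters:
    usePodIdentity: "true"    # use EKS Pod Identity instead of IRSA
    objects: |
      - objectName: "MyAppSecret"     # hypothetical secret name
        objectType: "secretsmanager"
```

Omitting `usePodIdentity` (or setting it to `"false"`) keeps the previous IRSA-based behavior, so existing deployments continue to work unchanged.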
The integration of ASCP with Pod Identity is available in all AWS Regions where AWS Secrets Manager and Amazon EKS Pod Identity are supported. To get started with this new feature, see the following resources: the AWS Secrets Manager documentation, the Amazon EKS Pod Identity documentation, and the launch blog post.
Amazon Connect Contact Lens now enables managers to create rules based on patterns of customer hold time and agent interaction duration, to take automated actions such as categorizing contacts, evaluating agent performance and notifying supervisors. With this launch, managers can create rules to check how well agents comply with guidelines on placing customers on hold. For example, did the agent set expectations on hold duration, before placing the customer on hold for more than 5 minutes? In addition, managers can check if the agent interaction lasted long enough to warrant assessment of complex agent behaviors such as building customer rapport, customer issue root cause analysis, etc. By excluding contacts that were too short, such as less than 30 seconds, managers can get more meaningful insights from automated contact categorization and agent performance evaluations.
This feature is available in all regions where Contact Lens performance evaluations are already available. To learn more, please visit our documentation and our webpage. For information about Contact Lens pricing, please visit our pricing page.
Contact Lens now provides managers with an agent performance evaluation dashboard, to view aggregations of agent performance, and insights across cohorts of agents over time. With this launch, managers can access a unified dashboard on agent performance across evaluation scores, productivity (e.g., contacts handled, average handle time, etc.) and operational metrics. Through detailed performance scorecards at both team and individual levels, managers can dive deep into specific performance criteria, and compare performance with similar cohorts and over time, to identify agent strengths and improvement opportunities. The dashboard also provides managers with insights into agent time allocation and contact handling efficiency, so they can drive improvements in agent productivity.
This feature is available in all regions where Contact Lens performance evaluations are already available. To learn more, please visit our documentation and our webpage. For information about Contact Lens pricing, please visit our pricing page.
You can now request Amazon DynamoDB account-level and table-level throughput quota adjustments using AWS Service Quotas in all AWS Commercial Regions and the AWS GovCloud (US) Regions, and have them auto-approved within minutes.
Previously, when requesting a quota adjustment, Service Quotas let you indicate the Amazon DynamoDB quota and the desired value. AWS Support would then review your request, approve it, and make the adjustment. With this launch, when you update your DynamoDB account-level and table-level throughput quotas using AWS Service Quotas, your adjustments are automatically approved and applied with just a few clicks. AWS Service Quotas is available at no additional charge.
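As a CLI sketch of the same flow (the quota code and value below are placeholders, not from the announcement; requires configured AWS credentials), you can look up the relevant quota and then request the increase:

```
# List DynamoDB quotas to find the quota code you want to adjust
aws service-quotas list-service-quotas --service-code dynamodb

# Request the new value; eligible throughput quotas are now auto-approved
aws service-quotas request-service-quota-increase \
    --service-code dynamodb \
    --quota-code L-XXXXXXXX \
    --desired-value 80000
```

The same adjustment can be made from the Service Quotas console with a few clicks.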
To learn more about Amazon DynamoDB, the Serverless, NoSQL, fully managed database with single-digit millisecond performance at any scale, please visit the Amazon DynamoDB website.
The recent explosion of machine learning (ML) applications has created unprecedented demand for power delivery in the data center infrastructure that underpins those applications. Unlike server clusters in the traditional data center, where tens of thousands of workloads coexist with uncorrelated power profiles, large-scale batch-synchronized ML training workloads exhibit substantially different power usage patterns. Under these new usage conditions, it is increasingly challenging to ensure the reliability and availability of the ML infrastructure, as well as to improve data-center goodput and energy efficiency.
Google has been at the forefront of data center infrastructure design for several decades, with a long list of innovations to our name. In this blog post, we highlight one of the key innovations that allowed us to manage unprecedented power and thermal fluctuations in our ML infrastructure. This innovation underscores the power of full codesign across the stack — from ASIC chip to data center, across both hardware and software. We also discuss the implications of this approach and propose a call to action for the broader industry.
New ML workloads lead to new ML power challenges
Today’s ML workloads require synchronized computation across tens of thousands of accelerator chips, together with their hosts, storage, and networking systems; these workloads often occupy one entire data-center cluster — or even multiples of them. The peak power utilization of these workloads could approach the rated power of all the underlying IT equipment, making power oversubscription much more difficult. Furthermore, power consumption rises and falls between idle and peak utilization levels much more steeply, due to the fact that the entire cluster’s power usage is now dominated by no more than a few large ML workloads. You can observe these power fluctuations when a workload launches or finishes, or when it is halted, then resumed or rescheduled. You may also observe a similar pattern when the workload is running normally, mostly attributable to alternating compute- and networking-intensive phases of the workload within a training step. Depending on the workload’s characteristics, these inter- and intra-job power fluctuations can occur very frequently. This can result in multiple unintended consequences on the functionality, performance, and reliability of the data center infrastructure.
Fig. 1. Large power fluctuations observed on cluster level with large-scale synchronized ML workloads
In fact, in our latest batch-synchronous ML workloads running on dedicated ML clusters, we observed power fluctuations in the tens of megawatts (MW), as shown in Fig. 1. And compared to a traditional load variation profile, the ramp speed could be almost instantaneous, repeat as frequently as every few seconds, and last for weeks… or even months!
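The aggregation effect described above can be sketched with a toy simulation (a hypothetical model, not production data): each job draws a square-wave power profile alternating between a compute-intensive phase and a near-idle networking phase. When thousands of jobs run with random phase offsets, the aggregate is smooth; when they are batch-synchronized, the whole cluster swings together.

```python
import random

def aggregate_power(num_jobs, steps, phase_offsets):
    """Sum per-job square-wave power profiles (peak=1.0, near-idle=0.3)
    over `steps` time steps, each job shifted by its phase offset."""
    period = 10  # time steps per training step (compute + networking phase)
    totals = []
    for t in range(steps):
        total = 0.0
        for j in range(num_jobs):
            # First half of the period: compute-intensive (peak power);
            # second half: networking-intensive (near idle).
            in_compute = ((t + phase_offsets[j]) % period) < period // 2
            total += 1.0 if in_compute else 0.3
        totals.append(total)
    return totals

random.seed(0)
num_jobs, steps = 1000, 100

# Traditional cluster: uncorrelated jobs with random phases.
uncorrelated = aggregate_power(
    num_jobs, steps, [random.randrange(10) for _ in range(num_jobs)])
# Batch-synchronized ML training: every job shares the same phase.
synchronized = aggregate_power(num_jobs, steps, [0] * num_jobs)

swing = lambda xs: max(xs) - min(xs)
print(f"uncorrelated swing: {swing(uncorrelated):.0f} power units")
print(f"synchronized swing: {swing(synchronized):.0f} power units")
```

The synchronized cluster swings through its full idle-to-peak range every few steps, while the uncorrelated one stays near a steady average — which is exactly why oversubscription that worked for traditional server fleets breaks down here.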
Fluctuations of this kind pose the following risks:
Functionality and long-term reliability issues with rack and data center equipment, resulting in hardware-induced outages, reduced energy efficiency and increased operational/maintenance costs, including but not limited to rectifiers, transformers, generators, cables and busways
Damage, outage, or throttling at the upstream utility, including violation of contractual commitments to the utility on power usage profiles, and corresponding financial costs
Unintended and frequent triggering of the uninterruptible power supply (UPS) system from large power fluctuations, resulting in shortened lifetime of the UPS system
Large power fluctuations may also impact hardware reliability at a much smaller per-chip or per-system scale. Although the maximum temperature is well under control, power fluctuations may still translate into large and frequent temperature fluctuations, triggering various forms of interactions including warpage, changes to thermal interface material property, and electromigration.
A full-stack approach to proactive power shaping
Due to the high complexity and large scale of our data-center infrastructure, we posited that proactively shaping a workload’s power profile could be more efficient than simply adapting to it. Google’s full codesign across the stack — from chip to data center, from hardware to software, and from instruction set to realistic workload — provides us with all the knobs we need to implement highly efficient end-to-end power management features to regulate our workloads’ power profiles and mitigate detrimental fluctuations.
Specifically, we installed instrumentation in the TPU compiler to check on signatures in the workload that are linked with power fluctuations, such as sync flags. We then dynamically balance the activities of major compute blocks of the TPU around these flags to smooth out their utilization over time. This achieves our goal of mitigating power and thermal fluctuations with negligible performance overhead. In the future, we may also apply a similar approach to the workload’s starting and completion phases, resulting in a gradual, rather than abrupt, change in power levels.
We’ve now implemented this compiler-based approach to shaping the power profile and applied it to realistic workloads. We measured the system’s total power consumption and a single chip’s hotspot temperature with, and without, the mitigation, as plotted in Fig. 2 and Fig. 3, respectively. In the test case, the magnitude of power fluctuations dropped by nearly 50% from the baseline case to the mitigation case. The magnitude of temperature fluctuations also dropped from ~20°C in the baseline case to ~10°C in the mitigation case. We measured the cost of the mitigation by the increase in average power consumption and the length of the training step. With proper tuning of the mitigation parameters, we can achieve the benefits of our design with a small increase in average power and <1% performance impact.
Fig. 2. Power fluctuation with and without the compiler-based mitigation
Fig. 3. Chip temperature fluctuation with and without the compiler-based mitigation
A call to action
ML infrastructure is growing rapidly and is expected to surpass traditional server infrastructure in terms of total power demand in the coming years. At the same time, ML infrastructure’s power and temperature fluctuations are unique and tightly coupled with the ML workload’s characteristics. Mitigating these fluctuations is just one example of many innovations we need to ensure reliable and high-performance infrastructure. In addition to the method described above, we’ve been investing in an array of innovative techniques to take on ever-increasing power and thermal challenges, including data center water cooling, vertical power delivery, power-aware workload allocation, and many more.
But these challenges aren’t unique to Google. Power and temperature fluctuations in ML infrastructure are becoming a common issue for many hyperscalers and cloud providers as well as infrastructure providers. We need partners at all levels of the system to help:
Utility providers to set forth a standardized definition of acceptable power quality metrics — especially in scenarios where multiple data centers with large power fluctuations co-exist within the same grid and interact with one another
Power and cooling equipment suppliers to offer quality and reliability enhancements for electronics components, particularly for use-conditions with large and frequent power and thermal fluctuations
Hardware suppliers and data center designers to create a standardized suite of solutions such as rack-level capacitor banks (RLCB) or on-chip features, to help establish an efficient supplier base and ecosystem
ML model developers to consider the energy consumption characteristics of the model, and consider adding low-level software mitigations to help address energy fluctuations
Google has been leading and advocating for industry-wide collaboration on these issues through forums such as Open Compute Project (OCP) to benefit the data center infrastructure industry as a whole. We look forward to continuing to share our learnings and collaborating on innovative new solutions together.
A special thanks to Denis Vnukov, Victor Cai, Jianqiao Liu, Ibrahim Ahmed, Venkata Chivukula, Jianing Fan, Gaurav Gandhi, Vivek Sharma, Keith Kleiner, Mudasir Ahmad, Binz Roy, Krishnanjan Gubba Ravikumar, Ashish Upreti and Chee Chung from Google Cloud for their contributions.