Amazon Connect Cases now allows agents and supervisors to filter cases in the agent workspace by custom field values, making it easier to narrow down search results and find relevant cases. Users can also customize the case list view and search results layout by adding custom columns, hiding or rearranging existing columns, and adjusting the number of cases per page. These enhancements enable users to tailor the case list view to meet their needs and manage their case workloads more effectively.
The Amazon EventBridge console now displays the source and detail type of all available AWS service events when you create a rule in the EventBridge console. This makes it easier for customers to discover and utilize the full range of AWS service events when building event-driven architectures. Additionally, the EventBridge documentation now includes an automatically updated list of all AWS service events, facilitating access to the most current information.
Amazon EventBridge Event Bus is a serverless event router that enables you to create highly scalable event-driven applications by routing events between your own applications, third-party SaaS applications, and other AWS services. With this update, developers can quickly search and filter through all available AWS service events, including event types, in the EventBridge console when configuring event patterns in the sandbox and rules, as well as in the documentation. This enables customers to create event-driven integrations more efficiently and reduce misconfiguration.
This feature in the EventBridge console is available in all commercial AWS Regions. To learn more about discovering and using AWS service events in Amazon EventBridge, see the updated list of AWS service events in the documentation here.
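As a minimal sketch of how a discovered source and detail type feed into a rule, the following uses the AWS SDK for Python; the rule name is a placeholder, and the S3 event shown is just one example of the service events listed in the console and documentation.

import json
import boto3

events = boto3.client("events")

# Event pattern built from a source and detail type discovered in the console
pattern = {
    "source": ["aws.s3"],
    "detail-type": ["Object Created"],
}

# Create (or update) a rule on the default event bus using that pattern
events.put_rule(
    Name="s3-object-created-rule",  # placeholder rule name
    EventPattern=json.dumps(pattern),
    State="ENABLED",
)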
Amazon Managed Service for Prometheus collector, a fully-managed agentless collector for Prometheus metrics, adds support for cross-account ingestion. Starting today, you can agentlessly scrape metrics from Amazon Elastic Kubernetes Service clusters in different accounts than your Amazon Managed Service for Prometheus workspace.
While it was previously possible to apply AWS multi-account best practices for centralized observability with Amazon Managed Service for Prometheus workspaces, you had to use self-managed collection. This meant running, scaling, and patching telemetry agents yourself to scrape metrics from Amazon Elastic Kubernetes Service clusters in various accounts and ingest them into a central Amazon Managed Service for Prometheus workspace in a different account. With this launch, you can use the Amazon Managed Service for Prometheus collector to eliminate this heavy lifting and ingest metrics in a cross-account setup without running a collector yourself. In addition, you can now also use the Amazon Managed Service for Prometheus collector to scrape metrics from Amazon Elastic Kubernetes Service clusters and ingest them into Amazon Managed Service for Prometheus workspaces created with customer managed keys.
Amazon Managed Service for Prometheus collector is available in all regions where Amazon Managed Service for Prometheus is available. To learn more about Amazon Managed Service for Prometheus collector, visit the user guide or product page.
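As a rough sketch (not the full cross-account walkthrough, which also involves the IAM role setup described in the user guide), creating a managed scraper with the AWS SDK for Python might look like the following; the ARNs, subnet ID, and scrape configuration are placeholders, and the exact parameter shapes should be checked against the current SDK reference.

import boto3

amp = boto3.client("amp")

# Placeholder Prometheus scrape configuration (YAML) for the managed collector
scrape_config = b"""
global:
  scrape_interval: 30s
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
"""

# The EKS cluster and the workspace can live in different accounts once the
# cross-account IAM roles described in the user guide are in place.
amp.create_scraper(
    alias="cross-account-eks-scraper",
    scrapeConfiguration={"configurationBlob": scrape_config},
    source={
        "eksConfiguration": {
            "clusterArn": "arn:aws:eks:us-east-1:111111111111:cluster/workload-cluster",
            "subnetIds": ["subnet-0123456789abcdef0"],
        }
    },
    destination={
        "ampConfiguration": {
            "workspaceArn": "arn:aws:aps:us-east-1:222222222222:workspace/ws-EXAMPLE"
        }
    },
)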
For developers who want to use the PyTorch deep learning framework with Cloud TPUs, the PyTorch/XLA Python package is key, offering developers a way to run their PyTorch models on Cloud TPUs with only a few minor code changes. It does so by leveraging OpenXLA, developed by Google, which gives developers the ability to define their model once and run it on many different types of machine learning accelerators (e.g., GPUs and TPUs).
The latest release of PyTorch/XLA comes with several improvements for developers:
A new experimental scan operator to speed up compilation for repetitive blocks of code (e.g., for loops)
Host offloading to move TPU tensors to the host CPU’s memory to fit larger models on fewer TPUs
Improved goodput for tracing-bound models through a new base Docker image compiled with the C++11 application binary interface (C++11 ABI) flags
In addition to these improvements, we've also reorganized the documentation to make it easier to find what you're looking for!
Let’s take a look at each of these features in greater depth.
Experimental scan operator
Have you ever experienced long compilation times, for example when working with large language models in PyTorch/XLA, especially when dealing with models with numerous decoder layers? During graph tracing, where we traverse the graph of all the operations being performed by the model, these iterative loops are completely “unrolled” — i.e., each loop iteration is copied and pasted for every cycle — resulting in large computation graphs. These larger graphs lead directly to longer compilation times. Now there’s a solution: the new experimental scan function, inspired by jax.lax.scan.
The scan operator works by changing how loops are handled during compilation. Instead of compiling each iteration of the loop independently, which creates redundant blocks, scan compiles only the first iteration. The resulting compiled high-level operation (HLO) is then reused for all subsequent iterations. This means that there is less HLO or intermediate code that is being generated for each subsequent loop. Compared to a for loop, scan compiles in a fraction of the time since it only compiles the first loop iteration. This improves the developer iteration time when working on models with many homogeneous layers, such as LLMs.
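Here is a rough sketch of the lower-level scan operator itself; the combine function, initial carry, and inputs below are illustrative, and the exact torch_xla.experimental.scan signature should be checked against the current documentation.

import torch
import torch_xla
from torch_xla.experimental.scan import scan

def step(carry, x):
    # One loop iteration: accumulate into the carry and emit a per-step output
    new_carry = carry + x
    y = new_carry * 2
    return new_carry, y

with torch_xla.device():
    init = torch.zeros(4)
    xs = torch.randn(10, 4)  # 10 iterations over the leading dimension

# Compiles the loop body once and reuses it for all 10 iterations
final_carry, ys = scan(step, init, xs)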
Building on top of torch_xla.experimental.scan, the torch_xla.experimental.scan_layers function offers a simplified interface for looping over sequences of nn.Modules. Think of it as a way to tell PyTorch/XLA “These modules are all the same, just compile them once and reuse them!” For example:
import torch
import torch.nn as nn
import torch_xla
from torch_xla.experimental.scan_layers import scan_layers

class DecoderLayer(nn.Module):
    def __init__(self, size):
        super().__init__()
        self.linear = nn.Linear(size, size)

    def forward(self, x):
        return self.linear(x)

with torch_xla.device():
    layers = [DecoderLayer(1024) for _ in range(64)]
    x = torch.randn(1, 1024)

# Instead of a for loop, we can scan_layers once:
# for layer in layers:
#     x = layer(x)
x = scan_layers(layers, x)
One thing to note is that custom Pallas kernels do not yet support scan. Here is a complete example of using scan_layers in an LLM for reference.
Host offloading
Another powerful tool for memory optimization in PyTorch/XLA is host offloading. This technique allows you to temporarily move tensors from the TPU to the host CPU’s memory, freeing up valuable device memory during training. This is especially helpful for large models where memory pressure is a concern. You can use torch_xla.experimental.stablehlo_custom_call.place_to_host to offload a tensor and torch_xla.experimental.stablehlo_custom_call.place_to_device to retrieve it later. A typical use case involves offloading intermediate activations during the forward pass and then bringing them back during the backward pass. Here’s an example of host offloading for reference.
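Below is a minimal sketch of the idea using placeholder tensors; the linked example shows the same calls inside a real forward/backward pass.

import torch
import torch_xla
from torch_xla.experimental.stablehlo_custom_call import (
    place_to_host,
    place_to_device,
)

with torch_xla.device():
    activation = torch.randn(8192, 8192)  # placeholder intermediate activation

# Forward pass: move the intermediate activation off the TPU to host memory
activation_on_host = place_to_host(activation)

# Backward pass (later): bring it back to the TPU when it is needed again
activation_on_device = place_to_device(activation_on_host)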
Strategic use of host offloading, such as when you’re working with limited memory and are unable to use the accelerator continuously, may significantly improve your ability to train large and complex models within the memory constraints of your hardware.
Alternative base Docker image
Have you ever encountered a situation where your TPUs are sitting idle while your host CPU is heavily loaded tracing your model execution graph for just-in-time compilation? This suggests your model is “tracing bound,” meaning performance is limited by the speed of tracing operations.
The C++11 ABI image offers a solution. Starting with this release, PyTorch/XLA offers a choice of C++ ABI flavors for both Python wheels and Docker images, so you can pick which C++ ABI you’d like to use with PyTorch/XLA. You’ll now find builds with both the pre-C++11 ABI, which remains the default to match PyTorch upstream, and the more modern C++11 ABI.
Switching to the C++11 ABI wheels or Docker images can lead to noticeable improvements in the above-mentioned scenarios. For example, we observed a 20% relative improvement in goodput with the Mixtral 8x7B model on v5p-256 Cloud TPU (with a global batch size of 1024) when we switched from the pre-C++11 ABI to the C++11 ABI! ML Goodput gives us an understanding of how efficiently a given model utilizes the hardware. So if we have a higher goodput measurement for the same model on the same hardware, that indicates better performance of the model.
An example of using a C++11 ABI docker image in your Dockerfile might look something like:
# Use the C++11 ABI PyTorch/XLA image as the base
FROM us-central1-docker.pkg.dev/tpu-pytorch-releases/docker/xla:r2.6.0_3.10_tpuvm_cxx11

# Install any additional dependencies here
# RUN pip install my-other-package

# Copy your code into the container
COPY . /app
WORKDIR /app

# Run your training script
CMD ["python", "train.py"]
Alternatively, if you are not using Docker images, because you’re testing locally for instance, you can install the C++11 ABI wheels for version 2.6 with pip. The installation commands for Python 3.10 and other versions are in our documentation.
The flexibility to choose between C++ ABIs lets you choose the optimal build for your specific workload and hardware, ultimately leading to better performance and efficiency in your PyTorch/XLA projects!
So, what are you waiting for? Go try out the latest version of PyTorch/XLA! For additional information, check out the latest release notes.
A note on GPU support
We aren’t offering a PyTorch/XLA:GPU wheel in the PyTorch/XLA 2.6 release. We understand this is important and plan to reinstate GPU support by the 2.7 release. PyTorch/XLA remains an open-source project and we welcome contributions from the community to help maintain and improve the project. To contribute, please start with the contributors guide.
The latest stable version where a PyTorch/XLA:GPU wheel is available is torch_xla 2.5.
Modern AI workloads require powerful accelerators and high-speed interconnects to run sophisticated model architectures on an ever-growing diverse range of model sizes and modalities. In addition to large-scale training, these complex models need the latest high-performance computing solutions for fine-tuning and inference.
Today, we’re excited to bring the highly-anticipated NVIDIA Blackwell GPUs to Google Cloud with the preview of A4 VMs, powered by NVIDIA HGX B200. The A4 VM features eight Blackwell GPUs interconnected by fifth-generation NVIDIA NVLink, and offers a significant performance boost over the previous generation A3 High VM. Each GPU delivers 2.25 times the peak compute and 2.25 times the HBM capacity, making A4 VMs a versatile option for training and fine-tuning for a wide range of model architectures, while the increased compute and HBM capacity makes it well-suited for low-latency serving.
The A4 VM integrates Google’s infrastructure innovations with Blackwell GPUs to bring the best cloud experience for Google Cloud customers, from scale and performance, to ease-of-use and cost optimization. Some of these innovations include:
Enhanced networking: A4 VMs are built on servers with our Titanium ML network adapter, optimized to deliver a secure, high-performance cloud experience for AI workloads, building on NVIDIA ConnectX-7 network interface cards (NICs). Combined with our datacenter-wide 4-way rail-aligned network, A4 VMs deliver non-blocking 3.2 Tbps of GPU-to-GPU traffic with RDMA over Converged Ethernet (RoCE). Customers can scale to tens of thousands of GPUs with our Jupiter network fabric with 13 Petabits/sec of bi-sectional bandwidth.
Google Kubernetes Engine: With support for up to 65,000 nodes per cluster, GKE is the most scalable and fully automated Kubernetes service for customers to implement a robust, production-ready AI platform. Out of the box, A4 VMs are natively integrated with GKE. Integrating with other Google Cloud services, GKE facilitates a robust environment for the data processing and distributed computing that underpin AI workloads.
Vertex AI: A4 VMs will be accessible through Vertex AI, our fully managed, unified AI development platform for building and using generative AI, and which is powered by the AI Hypercomputer architecture under the hood.
Open software: In addition to PyTorch and CUDA, we work closely with NVIDIA to optimize JAX and XLA, enabling the overlap of collective communication and computation on GPUs. Additionally, we added optimized model configurations and example scripts for GPUs with XLA flags enabled.
Hypercompute Cluster: Our new highly scalable clustering system streamlines infrastructure and workload provisioning, and ongoing operations of AI supercomputers with tight GKE and Slurm integration.
Multiple consumption models: In addition to the On-demand, Committed use discount, and Spot consumption models, we reimagined cloud consumption for the unique needs of AI workloads with Dynamic Workload Scheduler, which offers two modes for different workloads: Flex Start mode for enhanced obtainability and better economics, and Calendar mode for predictable job start times and durations.
Hudson River Trading, a multi-asset-class quantitative trading firm, will leverage A4 VMs to train its next generation of capital market model research. The A4 VM, with its enhanced inter-GPU connectivity and high-bandwidth memory, is ideal for the demands of larger datasets and sophisticated algorithms, accelerating Hudson River Trading’s ability to react to the market.
“We’re excited to leverage A4, powered by NVIDIA’s Blackwell B200 GPUs. Running our workload on cutting edge AI Infrastructure is essential for enabling low-latency trading decisions and enhancing our models across markets. We’re looking forward to leveraging the innovations in Hypercompute Cluster to accelerate deployment of training our latest models that deliver quant-based algorithmic trading.” – Iain Dunning, Head of AI Lab, Hudson River Trading
“NVIDIA and Google Cloud have a long-standing partnership to bring our most advanced GPU-accelerated AI infrastructure to customers. The Blackwell architecture represents a giant step forward for the AI industry, so we’re excited that the B200 GPU is now available with the new A4 VM. We look forward to seeing how customers build on the new Google Cloud offering to accelerate their AI mission.” – Ian Buck, Vice-President and General Manager of Hyperscale and HPC, NVIDIA
Better together: A4 VMs and Hypercompute Cluster
Effectively scaling AI model training requires precise and scalable orchestration of infrastructure resources. These workloads often stretch across thousands of VMs, pushing the limits of compute, storage, and networking.
Hypercompute Cluster enables you to deploy and manage these large clusters of A4 VMs with compute, storage and networking as a single unit. This makes it easy to manage complexity while delivering exceptionally high performance and resilience for large distributed workloads. Hypercompute Cluster is engineered to:
Deliver high performance through co-location of A4 VMs densely packed to enable optimal workload placement
Optimize resource scheduling and workload performance with GKE and Slurm, packed with intelligent features like topology-aware scheduling
Increase reliability with built-in self-healing capabilities, proactive health checks, and automated recovery from failures
Enhance observability and monitoring for timely and customized insights
Automate provisioning, configuration, and scaling, integrated with GKE and Slurm
We’re excited to be the first hyperscaler to announce preview availability of an NVIDIA Blackwell B200-based offering. Together, A4 VMs and Hypercompute Cluster make it easier for organizations to create and deliver AI solutions across all industries. If you’re interested in learning more, please reach out to your Google Cloud representative.
Amazon S3 announces schema definition support for the CreateTable API to programmatically create tables with pre-defined columns. This enhancement simplifies table creation for data analytics applications, making it easier to get started and ingest data in S3 table buckets.
To use this feature, you can specify column names and their data types as new request headers in the CreateTable API to define a table’s schema in an S3 table bucket. You can also define a table’s schema when you create tables using the AWS CLI or the AWS SDK.
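As a hedged sketch of what that might look like with the AWS SDK for Python, the bucket ARN, namespace, table name, and field definitions below are placeholders, and the exact request shape should be confirmed against the current CreateTable reference.

import boto3

s3tables = boto3.client("s3tables")

# Create a table with a pre-defined Iceberg schema instead of an empty table
s3tables.create_table(
    tableBucketARN="arn:aws:s3tables:us-east-1:111111111111:bucket/analytics-bucket",
    namespace="sales",
    name="daily_orders",
    format="ICEBERG",
    metadata={
        "iceberg": {
            "schema": {
                "fields": [
                    {"name": "order_id", "type": "long", "required": True},
                    {"name": "order_date", "type": "date"},
                    {"name": "amount", "type": "decimal(10,2)"},
                ]
            }
        }
    },
)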
Amazon S3 Tables now support creating up to 10,000 tables in each S3 table bucket. With this higher quota, you can scale up to 100,000 tables across 10 table buckets within an AWS Region per AWS Account. The higher table quota is available by default on all table buckets at no additional cost.
S3 Tables deliver the first cloud object store with built-in Apache Iceberg support, and the easiest way to store tabular data at scale. You can use S3 Tables with AWS Analytics services through the preview integration with Amazon SageMaker Lakehouse, as well as Apache Iceberg-compatible open source engines like Apache Spark and Apache Flink.
The Amazon Q Developer Pro tier now offers automated email notifications for newly subscribed users. When a new user is subscribed by an administrator, they will automatically receive a welcome email within 24 hours containing important information to help them get started quickly and efficiently with their new subscription. This automation streamlines the onboarding process and saves administrators valuable time by eliminating the need to manually notify each new user.
In the welcome email, users will find guidance on accessing the Q Console chat and details on downloading and installing the Q Developer plugin in their Integrated Development Environment (IDE). The email includes their unique Start URL and AWS region for authentication. Additionally, it provides quick-start steps for using Q Developer in their IDE.
To learn more about this new feature and other Amazon Q Developer Pro tier subscription management features, visit the AWS Console.
We are thrilled to announce the collaboration between Google Cloud, AWS, and Azure on Kube Resource Orchestrator, or kro (pronounced “crow”). kro introduces a Kubernetes-native, cloud-agnostic way to define groupings of Kubernetes resources. With kro, you can group your applications and their dependencies as a single resource that can be easily consumed by end users.
Challenges of Kubernetes resource orchestration
Platform and DevOps teams want to define standards for how application teams deploy their workloads, and they want to use Kubernetes as the platform for creating and enforcing these standards. Each service needs to handle everything from resource creation to security configurations, monitoring setup, defining the end-user interface, and more. There are client-side templating tools that can help with this (e.g., Helm, Kustomize), but Kubernetes lacks a native way for platform teams to create custom groupings of resources for consumption by end users.
Before kro, platform teams needed to invest in custom solutions such as building custom Kubernetes controllers, or using packaging tools like Helm, which can’t leverage the benefits of Kubernetes CRDs. These approaches are costly to build, maintain, and troubleshoot, and complex for non-Kubernetes experts to consume. This is a problem many Kubernetes users face. Rather than developing vendor-specific solutions, we’ve partnered with Amazon and Microsoft on making K8s APIs simpler for all Kubernetes users.
How kro simplifies the developer experience
kro is a Kubernetes-native framework that lets you create reusable APIs to deploy multiple resources as a single unit. You can use it to encapsulate a Kubernetes deployment and its dependencies into a single API that your application teams can use, even if they aren’t familiar with Kubernetes. You can use kro to create custom end-user interfaces that expose only the parameters an end user should see, hiding the complexity of Kubernetes and cloud-provider APIs.
kro does this by introducing the concept of a ResourceGraphDefinition, which specifies how a standard Kubernetes Custom Resource Definition (CRD) should be expanded into a set of Kubernetes resources. End users create a single custom resource, which kro then expands into the set of Kubernetes resources defined in the ResourceGraphDefinition.
kro can be used to group and manage any Kubernetes resources. Tools like ACK, KCC, or ASO define CRDs to manage cloud provider resources from Kubernetes (these tools enable cloud provider resources, like storage buckets, to be created and managed as Kubernetes resources). kro can also be used to group resources from these tools, along with any other Kubernetes resources, to define an entire application deployment and the cloud provider resources it depends on.
Example use cases
Below, you’ll find some examples of kro being used with Google Cloud. You can find additional examples on the kro website.
Example 1: GKE cluster definition
Imagine that a platform administrator wants to give end users in their organization self-service access to create GKE clusters. The platform administrator creates a kro ResourceGraphDefinition called GKEclusterRGD that defines the required Kubernetes resources and a CRD called GKEcluster that exposes only the options they want to be configurable by end users. In addition to creating a cluster, the platform team also wants clusters to deploy administrative workloads such as policies, agents, etc. The ResourceGraphDefinition defines the following resources, using KCC to provide the mappings from K8s CRDs to Google Cloud APIs:
GKE cluster, Container Node Pools, IAM ServiceAccount, IAM PolicyMember, Services, Policies
The platform administrator would then define the end-user interface so that users can create a new cluster by creating an instance of the CRD, specifying only the options exposed to them.
Everything related to policy, service accounts, and service activation (and how these resources relate to each other) is hidden from the end user, simplifying their experience.
Example 2: Web application definition
In this example, a DevOpsEngineer wants to create a reusable definition of a web application and its dependencies. They create a ResourceGraphDefinition called WebAppRGD, which defines a new Kubernetes CRD called WebApp. This new resource encapsulates all the necessary resources for a web application environment, including:
Deployments, service, service accounts, monitoring agents, and cloud resources like object storage buckets.
The WebAppRGD ResourceGraphDefinition can set a default configuration, and also define which parameters can be set by the end user at deployment time (kro gives you the flexibility to decide what is immutable, and what an end user is able to configure). A developer then creates an instance of the WebApp CRD, inputting any user-facing parameters. kro then deploys the desired Kubernetes resource.
Key benefits of kro
We believe kro is a big step forward for platform engineering teams, delivering a number of advantages:
Kubernetes-native: kro leverages Kubernetes Custom Resource Definitions (CRDs) to extend Kubernetes, so it works with any Kubernetes resource and integrates with existing Kubernetes tools and workflows.
Lets you create a simplified end-user experience: kro makes it easy to define end-user interfaces for complex groups of Kubernetes resources, so that people who are not Kubernetes experts can consume services built on Kubernetes.
Enables standardized services for application teams: kro templates can be reused across different projects and environments, promoting consistency and reducing duplication of effort.
Get started with kro
kro is available as an open-source project on GitHub. The GitHub organization is currently jointly owned by teams from Google, AWS, and Microsoft, and we welcome contributions from the community. We also have a website with documentation on installing and using kro, including example use cases. As an early-stage project, kro is not yet ready for production use, but we still encourage you to test it out in your own Kubernetes development environments!
Amazon SageMaker Unified Studio is now available in preview in seven additional AWS Regions: Asia Pacific (Seoul, Singapore, and Sydney), Europe (Frankfurt and London), South America (São Paulo), and Canada (Central).
Amazon SageMaker Unified Studio (preview) is an integrated data and AI development environment that enables collaboration and helps teams build data products faster. It brings together familiar tools from AWS analytics and AI/ML services for data processing, SQL analytics, machine learning model development, and generative AI application development into a single experience. SageMaker Unified Studio provides unified data access through Amazon SageMaker Lakehouse, and enhanced governance features are built in to help you meet enterprise security requirements. With the availability of new Regions, customers who have data sovereignty and low latency requirements can now use SageMaker Unified Studio while keeping their data and workloads closer to their primary operational Regions.
For more information on AWS Regions where SageMaker Unified Studio is available in preview, see Supported Regions.
You can create an Amazon SageMaker Unified Studio domain by visiting the Amazon SageMaker console. To get started, see the Amazon SageMaker Unified Studio documentation.
Amazon SES announces that the SES Mail Manager product is now available in 11 new commercial AWS Regions. This expands coverage from the original six commercial AWS Regions where Mail Manager first launched, meaning that Mail Manager is now offered in all non-opt-in commercial Regions where SES offers sending and receiving services.
SES Mail Manager allows customers to configure email routing and delivery mechanisms for their domains, or for private use, and to have a single view of email governance, risk, and compliance solutions for all email workloads. Mail Manager is most often deployed to replace legacy hosted mail relays, or to simplify integration alongside third-party mailbox providers or external email content security solutions. In addition, Mail Manager allows customers to perform onward delivery to WorkMail mailboxes, archive content to built-in archiving and search/export features, and to interoperate with third-party security add-ons offered directly within the Mail Manager console experience.
The new Regions include US East (Ohio), US West (N. California), Asia Pacific (Mumbai), Asia Pacific (Osaka), Asia Pacific (Seoul), Canada (Central), Europe (London), Europe (Paris), Europe (Stockholm), and South America (São Paulo).
They join US East (N. Virginia), US West (Oregon), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), and Europe (Ireland).
Customers can learn more about SES Mail Manager here and explore the new Regions in their SES consoles for any of the available Regions listed above.
CloudWatch Database Insights now supports the analysis of historical snapshots of operating system (OS) processes running on your databases, allowing you to correlate a spike in database load with OS process metrics.
Database administrators (DBAs) leverage OS metrics to understand how different processes or threads use system resources on their database instances. With this new Database Insights feature, DBAs can now access historical snapshots of OS processes running on their databases, including key metrics like memory and CPU utilization for each running process. OS process snapshots in Database Insights help DBAs understand how each running process is using system resources on their databases at a given timestamp, making it easy to correlate OS process metrics with database load.
OS process snapshots are now available for both Aurora PostgreSQL and Aurora MySQL in all regions where Database Insights is available. To learn more about OS process snapshots in Database Insights, please refer to the public documentation. To learn more about Database Insights pricing, refer to the CloudWatch pricing page.
To get started with OS process snapshots in Database Insights, ensure you have enabled RDS Enhanced Monitoring and Database Insights Advanced mode. From the Database Instance dashboard, navigate to Database Telemetry and click on the OS processes tab. To correlate OS process metrics with database load, click on any data point on the database load chart, and a snapshot of OS processes will populate accordingly with key metrics per running process for the selected timestamp.
Today, we are announcing the general availability of AWS Wavelength in partnership with Orange in Casablanca, Morocco. With this first Wavelength Zone in North Africa, Independent Software Vendors (ISVs), enterprises, and developers can now use AWS infrastructure and services to support applications with data residency, low latency, and resiliency requirements.
AWS Wavelength, in partnership with Orange, delivers on-demand AWS compute and storage services to customers in North Africa, enabling them to build and deploy applications that meet their data residency, low-latency, and resiliency requirements. AWS Wavelength offers operational consistency, industry-leading cloud security practices, and the same familiar automation tools as an AWS Region. With AWS Wavelength in partnership with Orange, developers can now build applications for use cases such as AI/ML inference at the edge, gaming, and fraud detection.
Amazon EMR Serverless is a serverless option in Amazon EMR that makes it simple for data engineers and data scientists to run open-source big data analytics frameworks without configuring, managing, and scaling clusters or servers. Today, we are excited to announce support for Public Subnets that allow you to use EMR Serverless for cost effective outbound data transfer from the cloud for big data processing workloads.
EMR Serverless applications allow you to enable VPC connectivity for use cases that need to connect to VPC resources or for outbound data transfer from the cloud to access resources on the Internet or other cloud providers. Previously, VPC connectivity supported only Private Subnets, hence you needed to configure a NAT (network address translation) Gateway for outbound connectivity from the cloud, which adds additional charges based on the amount of data transferred. Now, you can configure VPC connectivity for EMR Serverless applications on Public Subnets, which have a direct route to an internet gateway. This allows you to eliminate the NAT Gateway charges and use EMR Serverless for cost-effective outbound data transfer from the cloud for big data processing workloads.
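As a minimal sketch with the AWS SDK for Python, attaching an EMR Serverless application to public subnets is just a matter of which subnet IDs you pass in the network configuration; the subnet, security group, and release label values below are placeholders.

import boto3

emr_serverless = boto3.client("emr-serverless")

# Attach the application to public subnets (subnets with a direct route to an
# internet gateway) so outbound traffic does not need a NAT Gateway
emr_serverless.create_application(
    name="spark-public-subnet-app",
    releaseLabel="emr-7.5.0",  # placeholder release label
    type="SPARK",
    networkConfiguration={
        "subnetIds": ["subnet-0aaaa1111bbbb2222", "subnet-0cccc3333dddd4444"],
        "securityGroupIds": ["sg-0123456789abcdef0"],
    },
)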
Amazon EMR Serverless Public Subnet support is available in all supported EMR releases and in all AWS Regions where EMR Serverless is available, including the AWS GovCloud (US) Regions. To learn more, visit Configuring VPC Access in the EMR Serverless documentation.
SES announces that Mail Manager now supports defined email address and domain lists which are used as part of the Mail Manager rules engine to distinguish between known and unknown addresses. This functionality adds both the mechanisms to upload and manage email address and domain lists, and the rules engine controls to make routing decisions based on whether a given address in a message envelope is on such a list or not. Customers are therefore able to ensure trusted delivery for known internal recipients while implementing catch-all behaviors for directory harvesting attacks, mistyped addresses, and standard behaviors for other domains owned and managed by the customer.
SES recipient lists allow customers to upload email addresses individually or in batches via CSV files. They can then configure one or more lists with different routing preferences in the Mail Manager rules engine. This provides immediate changes to mail routing simply by adding another address to an existing list. For example, a list of “Retired Employees” might have new names added with some frequency, but the handling rule — attached to the list name itself — remains the same throughout.
SES Mail Manager recipient lists increase the flexibility and security of customers using Mail Manager to handle incoming mail by increasing resistance to email-based reconnaissance efforts without disclosing list names or aliases externally. SES Mail Manager recipient lists are available in every region where Mail Manager is launched. Customers can learn more about SES Mail Manager here.
We are excited to announce the launch of storage scaling for Amazon Timestream for InfluxDB, allowing you to scale your allocated storage and change your storage tier as needed. With storage scaling, a few simple steps give you greater flexibility and control over your time-series data processing and analysis.
Timestream for InfluxDB is used in applications that require high-performance time-series data processing and analysis. You can quickly respond to changes in data ingestion rates, query volumes, or other workload fluctuations by moving to a faster, more performant storage tier or extending your allocated storage capacity, ensuring that your Timestream for InfluxDB instances always have the necessary resources to handle your workload cost-effectively. This means you can focus on building and deploying your applications, rather than worrying about storage sizing and management.
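As a hedged sketch of what scaling an existing instance might look like with the AWS SDK for Python, the instance identifier, storage size, and storage type below are placeholders, and the exact parameter names should be checked against the current UpdateDbInstance reference.

import boto3

influxdb = boto3.client("timestream-influxdb")

# Grow allocated storage and move to a faster storage tier on an existing instance
influxdb.update_db_instance(
    identifier="my-influxdb-instance",   # placeholder instance identifier
    allocatedStorage=400,                # GiB; assumed parameter name
    dbStorageType="InfluxIOIncludedT2",  # assumed tier value for a faster included-IOPS tier
)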
Support for Storage Scaling is available in all Regions where Timestream for InfluxDB is available. See here for a full listing of our Regions. To learn more about Amazon Timestream for InfluxDB, please refer to our user guide.
You can create an Amazon Timestream for InfluxDB instance from the Amazon Timestream console, AWS Command Line Interface (CLI), AWS SDKs, or AWS CloudFormation. To learn more about storage scaling for Amazon Timestream for InfluxDB, visit the product page, documentation, and pricing page.
CloudWatch Synthetics now allows canaries running in a VPC to make outbound requests to IPv6 endpoints, allowing monitoring of IPv6-only and dual-stack endpoints over IPv6. You can also access CloudWatch Synthetics APIs over both IPv4 and IPv6 through new dual-stack compatible regional endpoints. Additionally, PrivateLink access to Synthetics within VPCs is now available over IPv6 connections.
Using CloudWatch Synthetics, you can now monitor the availability and performance of websites or microservices accessible via IPv6 endpoints, ensuring that end users can use the applications seamlessly irrespective of their network protocol. You can create IPv6-enabled canaries in your VPC using the CLI, CDK, CloudFormation, or the AWS console, and update existing VPC canaries to support dual-stack connectivity without making any script changes. You can monitor endpoints external to your VPC by giving the canary internet access and configuring the VPC subnets appropriately. You can now manage Synthetics resources in environments with IPv6-only networking policies, or access Synthetics APIs via IPv6 without traffic traversing the internet by using PrivateLink, helping you meet security and regulatory requirements.
IPv6 support for Synthetics is available in all commercial regions where CloudWatch Synthetics is present at no additional cost to the users.
To learn how to configure an IPv6 canary in a VPC, see the documentation, or click here to find dual-stack API management endpoints for Synthetics. See the user guide and One Observability Workshop to get started with CloudWatch Synthetics.
Amazon Lex has expanded Assisted Slot Resolution to additional AWS regions and enhanced its capabilities through integration with newer Amazon Bedrock foundation models. Bot developers can now select from allowlisted foundation models in their account to enhance slot resolution capabilities, while maintaining the same simplified permission model through bot Service Linked Role updates.
When enabled, this feature helps chatbots better understand user responses during slot collection, activating during slot retries and fallback scenarios. The feature supports AMAZON.City, AMAZON.Country, AMAZON.Number, AMAZON.Date, AMAZON.AlphaNumeric (without regex), and AMAZON.PhoneNumber slot types, with the ability to enable improvements for individual slots during build time.
Assisted Slot Resolution is now available in Europe (Frankfurt, Ireland, London), Asia Pacific (Sydney, Singapore, Seoul, Tokyo), and Canada (Central) regions, in addition to US East (N. Virginia) and US West (Oregon). While there are no additional Amazon Lex charges for this feature, standard Amazon Bedrock pricing applies for foundation model usage.
To learn more about implementing these enhancements, please refer to our documentation on Assisted Slot Resolution. You can enable the feature through the Amazon Lex console or APIs.
Welcome to the second Cloud CISO Perspectives for January 2025. Iain Mulholland, senior director, Security Engineering, shares insights on the state of ransomware in the cloud from our new Threat Horizons Report. The research and intelligence in the report should prove helpful to all cloud providers and security professionals. Similarly, the recommended risk mitigations will work well with Google Cloud, but are generally applicable to all clouds.
As with all Cloud CISO Perspectives, the contents of this newsletter are posted to the Google Cloud blog. If you’re reading this on the website and you’d like to receive the email version, you can subscribe here.
–Phil Venables, VP, TI Security & CISO, Google Cloud
How cloud security can adapt to ransomware threats in 2025
By Iain Mulholland, senior director, Security Engineering, Google Cloud
How should cloud providers and cloud customers respond to the threat of ransomware? Cloud security strategies in 2025 should prioritize protecting against data exfiltration and identity access abuse, we explain in our new Threat Horizons Report.
Research and intelligence in the report shows that threat actors have made stealing data and exploiting weaknesses in identity security top targets. We’ve seen recent adaptations from some threat actor groups, where they’ve started using new ransomware families to achieve their goals. We’ve also observed them rapidly adapt their tactics to evade detection and attribution, making it harder to accurately identify the source of attacks — and increasing the likelihood that victims will pay ransom demands.
As part of our shared fate approach, where we are active partners with our customers in helping them secure their cloud use by sharing our expertise, best practices, and detailed guidance, this edition of Threat Horizons provides all cloud security professionals with a deeper understanding of the threats they face, coupled with actionable risk mitigations from Google’s security and threat intelligence experts.
These mitigations will work well with Google Cloud, but are generally applicable for other clouds, too.
Evolving ransomware and data-theft risks in the cloud
Ransomware and data threats in the cloud are not new, and investigation and analysis of the threats and risks they pose have been a key part of previous Threat Horizons Reports. Notably, Google Cloud security and intelligence experts exposed ransomware-related trends in the Threat Horizons Report published in February 2024, which included threat actors prioritizing data exfiltration over encryption and exploiting server-side vulnerabilities.
We observed in the second half of 2024 a concerning shift that threat actors were becoming more adept at obscuring their identities. This latest evolution in their tactics, techniques, and procedures makes it harder for defenders to counter their attacks and increases the likelihood of ransom payments — which totalled $1.1 billion in 2023. We also saw threat actors adapt by relying more on ransomware-as-a-service (RaaS) to target cloud services, which we detail in the full report.
We recommend that organizations incorporate automation and awareness strategies such as strong password policies, mandatory multi-factor authentication (MFA), regular reviews of user access and cloud storage bucket security, leaked credential monitoring on the dark web, and account lockout mechanisms. Importantly, educate employees about security best practices to help prevent credential compromise.
Government insights can help here, too. Guidance from the Cybersecurity and Infrastructure Security Agency’s Ransomware Vulnerability Warning Pilot can proactively identify and warn about vulnerabilities that could be exploited by ransomware actors.
I’ve summarized risk mitigations to enhance your Google Cloud security posture and better protect against threats, including account takeover, which could lead to ransomware and data extortion operations.
To help prevent cloud account takeover, your organization can:
Enroll in MFA: Google Cloud’s phased approach to mandatory MFA can make it harder for attackers to compromise accounts even if they have stolen credentials and authentication cookies.
Implement robust Identity and Access Management (IAM) policies: Use IAM policies to grant users only the necessary permissions for their jobs. Google Cloud offers a range of tools to help implement and manage IAM policies, including Policy Analyzer.
To help mitigate ransomware and extortion risks, your organization can:
Establish a cloud-specific backup strategy: Disaster recovery testing should include configurations, templates, and full infrastructure redeployment, and backups should be immutable for maximum protection.
Enable proactive virtual machine scanning: Part of Security Command Center (SCC), Virtual Machine Threat Detection (VMTD) scans virtual machines for malicious applications to detect threats, including ransomware.
Monitor and control unexpected costs: With Google Cloud, you can identify and manage unusual spending patterns across all projects linked to a billing account, which could indicate unauthorized activity.
Organizations can use multiple Google Cloud products to enhance protection against ransomware and data theft extortion. Security Command Center can help establish a multicloud security foundation for your organization that can help detect data exfiltration and misconfigurations. Sensitive Data Protection can help protect against potential data exfiltration by identifying and classifying sensitive data in your Google Cloud environment, and also help you monitor for unauthorized access and movement of data.
Threats beyond ransomware
There’s much more to the cloud threat landscape than ransomware, and also more that organizations can do to mitigate the risks they face. As above, I’ve summarized here five more threat landscape trends that we identify in the report, and our suggested mitigations on how you can improve your organization’s security posture.
Service account risks, including over-privileged service accounts exploited with lateral movement tactics.
What you should do: Investigate and protect service accounts to help prevent exploitation of overprivileged accounts and reduce detection noise from false positives.
Identity exploitation, including compromised user identities in hybrid environments exploited with lateral movement between on-premises and cloud environments.
What you should do: Combine strong authentication with attribute-based validation, and modernize playbooks and processes for comprehensive identity incident response (including enforcing mandatory MFA).
Attacks against cloud databases, including active vulnerability exploits and exploiting weak credentials that guard sensitive information.
Diversified attack methods, including privilege escalation that allows threat actors to charge against victim billing accounts in an effort to maximize their profits from compromised accounts.
What you should do: As discussed above, enroll in MFA, use automated sensitive data monitoring and alerting, and implement robust IAM policies.
Data theft and extortion attacks, including MFA bypass techniques and aggressive communication strategies with victims, use increasingly sophisticated tactics against cloud-based services to compromise accounts and maximize profits.
What you should do: Use a defense-in-depth strategy that includes strong password policies, MFA, regular reviews of user access, leaked credential monitoring, account lockout mechanisms, and employee education. Robust tools such as SCC can help monitor for data exfiltration and unauthorized access of data.
We provide more detail on each of these in the full report.
How Threat Horizons Reports can help
The Threat Horizons report series is intended to present a snapshot of the current state of threats to cloud environments, and how we can work together to mitigate those risks and improve cloud security for all. The reports provide decision-makers with strategic threat intelligence that cloud providers, customers, cloud security leaders, and practitioners can use today.
Threat Horizons reports are informed by Google Threat Intelligence Group (GTIG), Mandiant, Google Cloud’s Office of the CISO, Product Security Engineering, and Google Cloud intelligence, security, and product teams.
The Threat Horizons Report for the first half of 2025 can be read in full here. Previous Threat Horizons reports are available here.
In case you missed it
Here are the latest updates, products, services, and resources from our security teams so far this month:
Get ready for a unique, immersive security experience at Next ‘25: Here’s why Google Cloud Next is shaping up to be a must-attend event for security experts and the security-curious alike. Read more.
How Google secures its own cloud use: Take a peek under the hood at how we use and secure our own cloud environments, as part of our new “How Google Does It” series. Read more.
Privacy-preserving Confidential Computing now on even more machines and services: Confidential Computing is available on even more machine types than before. Here’s what’s new. Read more.
Use custom Org Policies to enforce CIS benchmarks for GKE: Many CIS recommendations for GKE can be enforced with custom Organization Policies. Here’s how. Read more.
Making GKE more secure with supply-chain attestation and SLSA: You can now verify the integrity of Google Kubernetes Engine components with SLSA, the Supply-chain Levels for Software Artifacts framework. Read more.
Office of the CISO 2024 year in review: Google Cloud’s Office of the CISO shared insights in 2024 on how to approach generative AI securely, featured industry experts on the Cloud Security Podcast, published research papers, and examined security lessons learned across many sectors. Read more.
Celebrating one year of AI bug bounties at Alphabet: What we learned in the first year of AI bug bounties, and how those lessons will inform our approach to vulnerability rewards going forward. Read more.
Please visit the Google Cloud blog for more security stories published this month.
Threat Intelligence news
How to stop cryptocurrency heists: Many factors are spurring a spike in cryptocurrency heists, including the lucrative nature of their rewards and the challenges associated with attribution to malicious actors. In our new Securing Cryptocurrency Organizations guide, we detail the defense measures organizations should take to stop cryptocurrency heists. Read more.
Please visit the Google Cloud blog for more threat intelligence stories published this month.
Now hear this: Google Cloud Security and Mandiant podcasts
How the modern CISO balances risk, innovation, business strategy, and cloud: John Rogers, CISO, MSCI, talks about the biggest cloud security challenges CISOs are facing today — and they’re evolving — with host Anton Chuvakin and guest co-host Marina Kaganovich from Google Cloud’s Office of the CISO. Listen here.
Slaying the ransomware dragon: Can startups succeed where others have failed, and once and for all end ransomware? Bob Blakley, co-founder and chief product officer of ransomware defense startup Mimic, tells hosts Anton Chuvakin and Tim Peacock his personal reasons for joining the fight against ransomware, and how his company can help. Listen here.
Behind the Binary: How a gamer became a renowned hacker: Stephen Eckels, from Google Mandiant’s FLARE team, discusses how video game modding helped kickstart his cybersecurity career. Listen here.
To have our Cloud CISO Perspectives post delivered twice a month to your inbox, sign up for our newsletter. We’ll be back in February with more security-related updates from Google Cloud.
In today’s complex digital world, building truly intelligent applications requires more than just raw data — you need to understand the intricate relationships within that data. Graph analysis helps reveal these hidden connections, and when combined with techniques like full-text search and vector search, enables you to deliver a new class of AI-enabled application experiences. However, traditional approaches based on niche tools result in data silos, operational overhead, and scalability challenges. That’s why we introduced Spanner Graph, and today we’re excited to announce that it’s generally available.
In a previous post, we described how Spanner Graph reimagines graph data management with a unified database that integrates graph, relational, search, and gen AI capabilities with virtually unlimited scalability. With Spanner Graph, you gain access to an intuitive ISO Standard Graph Query Language (GQL) interface that simplifies pattern matching and relationship traversal. You also benefit from full interoperability between GQL and SQL, for tight integration between graph and tabular data. Powerful vector and full-text search enable fast data retrieval using semantic meaning and keywords. And you can rely on Spanner’s scalability, availability, and consistency to provide a solid data foundation. Finally, integration with Vertex AI gives you access to powerful AI models directly within Spanner Graph.
What’s new in Spanner Graph
Since the preview, we have added exciting new capabilities and partner integrations to make it easier for you to build with Spanner Graph. Let’s take a closer look.
1) Spanner Graph Notebook: Graph visualization is key to developing with graphs. The new open-source Spanner Graph Notebook tool provides an efficient way to query Spanner Graph visually. This tool is natively installed in Google Colab, meaning you can use it directly within that environment. You can also leverage it in notebook environments like Jupyter Notebook. With this tool, you can use magic commands with GQL to visualize query results and graph schemas with multiple layout options, inspect node and edge properties, and analyze neighbor relationships.
Open-source Spanner Graph Notebook.
2) GraphRAG with LangChain integration: Spanner Graph’s integration with LangChain allows for quick prototyping of GraphRAG applications. Conventional RAG, while capable of grounding the LLM by providing relevant context from your data using vector search, cannot leverage the implicit relationships present in your data. GraphRAG overcomes this limitation by constructing a graph from your data that captures these complex relationships. At retrieval time, GraphRAG uses the combined power of graph queries with vector search to provide a richer context to the LLM, enabling it to generate more accurate and relevant answers.
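As a rough sketch of wiring Spanner Graph into a LangChain GraphRAG pipeline, the following assumes the SpannerGraphStore interface from the langchain-google-spanner package; the class, parameter names, and identifiers are assumptions to verify against that package's documentation.

# A rough sketch; class and parameter names are assumptions to verify against
# the langchain-google-spanner documentation.
from langchain_google_spanner import SpannerGraphStore

graph_store = SpannerGraphStore(
    instance_id="my-instance",   # placeholder Spanner instance
    database_id="my-database",   # placeholder database
    graph_name="FinGraph",       # placeholder graph defined in the database schema
)

# At indexing time, write LLM-extracted nodes and edges into the graph store;
# at retrieval time, combine graph queries with vector search to build the
# context that is passed to the LLM.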
3) Graph schema in Spanner Studio: The Spanner Studio Explorer panel now displays a list of defined graphs, their nodes and edges, and the associated labels and properties. You can explore and understand the structure of your graph data at a glance, making it easier to design, debug, and optimize your applications.
4) Graph query improvements: Spanner Graph now supports the path data type and functions, allowing you to retrieve and analyze the specific sequence of nodes and relationships that connect two nodes in your graph. For example, you can bind a path variable in a path pattern, use the IS_ACYCLIC function to check whether the path has repeating nodes, and return the entire path:
GRAPH FinGraph
MATCH p = (:Account)-[:Transfers]->{2,5}(:Account)
RETURN IS_ACYCLIC(p) AS is_acyclic_path, TO_JSON(p) AS full_path;
5) Graph visualization partner integrations: Spanner Graph is now integrated with leading graph visualization partners. For example, Spanner Graph customers can use GraphXR, Kineviz’s flagship product, which combines cutting-edge visualization technology with advanced analytics to help organizations make sense of complex, connected data.
“We are thrilled to partner with Google Cloud to bring graph analytics to big data. By integrating GraphXR with Spanner Graph, we’re empowering businesses to visualize and interact with their data in ways that were previously unimaginable.” – Weidong Yang, CEO, Kineviz
“Businesses can finally handle graph data with both speed and scale. By combining Graphistry’s GPU-accelerated graph visualization and AI with Spanner Graph’s global-scale querying, teams can now easily go all the way from raw data to graph-informed action. Whether detecting fraud, analyzing journeys, hunting hackers, or surfacing risks, this partnership is enabling teams to move with confidence.” – Leo Meyerovich, Founder and CEO, Graphistry
Visual analytics capabilities in Graphistry, including zooming, clustering, filtering, histograms, time-bar filtering, and node styling (colors), allow point-and-click analysis to quickly understand the data and identify clusters, patterns, anomalies, and other insights.
Furthermore, you can use G.V(), a quick-to-install graph database client, with Spanner Graph to perform day-to-day graph visualization and data analytics tasks with ease. Data professionals benefit from high-performance graph visualization, no-code data exploration, and highly customizable data visualization options.
“Graphs thrive on connections, which is why I’m so excited about this new partnership between G.V() and Google Cloud Spanner Graph. Spanner Graph turns big data into graphs, and G.V() effortlessly turns graphs into interactive data visualizations. I’m keen to see what data professionals build combining both solutions.” – Arthur Bigeard, Founder, gdotv Ltd.
Visually querying and exploring Spanner Graph with G.V().
What customers are saying
Through our road to GA, we have also been working with multiple customers to help them innovate with Spanner Graph:
“The Commercial Financial Network manages commercial credit data for more than 30 million U.S. businesses. Managing the hierarchy of these businesses can be complex due to the volume of these hierarchies, as well as the dynamic nature driven by mergers and acquisitions. Equifax is committed to providing lenders with the accurate, reliable and timely information they need as they make financial decisions. Spanner Graph helps us manage these rapidly changing, dynamic business hierarchies easily at scale.” – Yuvaraj Sankaran, Chief Architect of Global Platforms, Equifax
“As we strive to enhance our fraud detection capabilities, having a robust, multi-model database like Google Spanner is crucial for our success. By integrating SQL for transactional data management with advanced graph data analysis, we can efficiently manage and analyze evaluated fraud data. Spanner’s new capabilities significantly improve our ability to maintain data integrity and uncover complex fraud patterns, ensuring our systems are secure and reliable.” – Hai Sadon, Data Platform Group Manager, Transmit Security
“Spanner Graph has provided a novel and performant way for us to query this data, allowing us to deliver features faster and with greater peace of mind. Its flexible data modeling and high-performance querying have made it far easier to leverage the vast amount of data we have in our online applications.” – Aaron Tang, Senior Principal Engineer, U-NEXT