Amazon Keyspaces (for Apache Cassandra) now supports Logged Batches, enabling you to perform multiple write operations as a single atomic transaction. With Logged Batches, you can ensure that either all operations (INSERT, UPDATE, DELETE) within a batch succeed or none of them do, maintaining data consistency across multiple rows and tables within a keyspace. This capability is particularly valuable for applications that require strong data consistency, such as financial systems, inventory management, and user profile updates that span multiple data entities.
Amazon Keyspaces (for Apache Cassandra) is a scalable, highly available, and managed Apache Cassandra–compatible database service. Amazon Keyspaces is serverless, so you pay for only the resources that you use and you can build applications that serve thousands of requests per second with virtually unlimited throughput and storage.
Logged Batches in Amazon Keyspaces provide the same atomicity guarantees as Apache Cassandra while eliminating the operational complexity of managing transaction logs across distributed clusters. It’s designed to scale automatically with your workload and maintain consistent performance regardless of transaction volume. The feature integrates seamlessly with existing Cassandra Query Language (CQL) statements, allowing for adoption in both new and existing applications.
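As an illustration, here is a minimal sketch using the open-source Python cassandra-driver; the keyspace, table, and column names are hypothetical, and the Amazon Keyspaces-specific TLS and authentication settings are omitted for brevity.

from cassandra.cluster import Cluster
from cassandra.query import BatchStatement, BatchType, SimpleStatement

# Connect to a CQL endpoint (Amazon Keyspaces also requires TLS and
# service-specific credentials, omitted here for brevity).
session = Cluster(["localhost"]).connect("demo_keyspace")

# A logged batch: either both writes are applied, or neither is.
batch = BatchStatement(batch_type=BatchType.LOGGED)
batch.add(SimpleStatement(
    "INSERT INTO orders (order_id, status) VALUES (%s, %s)"), ("o-1001", "PLACED"))
batch.add(SimpleStatement(
    "UPDATE inventory SET quantity = %s WHERE item_id = %s"), (9, "i-42"))
session.execute(batch)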
Logged Batches are available today in all AWS Commercial and AWS GovCloud (US) Regions where Amazon Keyspaces is available. You pay only for the standard write operations processed within each batch. To learn more about Logged Batches, please visit our blog post or refer to our Amazon Keyspaces documentation.
Starting today, the general-purpose Amazon EC2 M8a instances are available in US East (N. Virginia) and Asia Pacific (Tokyo) regions. M8a instances are powered by 5th Gen AMD EPYC processors (formerly code named Turin) with a maximum frequency of 4.5 GHz, and deliver up to 30% higher performance and up to 19% better price-performance compared to M7a instances.
M8a instances deliver 45% more memory bandwidth compared to M7a instances, making them ideal even for latency-sensitive workloads. M8a instances deliver even higher performance gains for specific workloads: they are up to 60% faster on the GroovyJVM benchmark and up to 39% faster on the Cassandra benchmark compared to Amazon EC2 M7a instances. M8a instances are SAP-certified and offer 12 sizes, including 2 bare metal sizes. This range of instance sizes allows customers to precisely match their workload requirements.
M8a instances are built using the latest sixth-generation AWS Nitro Cards and are ideal for applications that benefit from high performance and high throughput, such as financial applications, gaming, rendering, application servers, simulation modeling, mid-size data stores, application development environments, and caching fleets.
To get started, sign in to the AWS Management Console. Customers can purchase these instances via Savings Plans, On-Demand instances, and Spot instances. For more information visit the Amazon EC2 M8a instance page.
India’s developer community, vibrant startup ecosystem, and leading enterprises are embracing AI with incredible speed. To meet this moment for India, we are investing in powerful, locally available tools that can help foster a diverse ecosystem and ensure our platform delivers the controls you need for compliance and AI sovereignty.
Today, we’re announcing a significant expansion of our local AI hardware capacity for customers in India. This increase in local compute, powered by Google’s AI Hypercomputer architecture with the latest Trillium TPUs, will help more businesses and public sector organizations train and serve their most advanced Gemini models in India.
By unlocking new opportunities for high-performance, low-latency AI applications, we can help customers meet India’s data residency and sovereignty requirements.
Enabling models and control: AI tools built for India’s context
While infrastructure is the foundation for digital sovereignty, it also requires control over the data and the models built on it. We’re committed to bringing our latest AI advancements to India faster than ever, with the controls you need.
Our new services enable you to build, tune, and deploy models that understand India’s unique business logic and rich cultural context.
Next-generation models, here in India: Earlier this year, Google Cloud made Gemini available to regulated Indian customers by deploying Gemini 2.5 Flash with local machine-learning processing support. Now, we’re opening early testing for our latest and most advanced Gemini models to Indian customers. We’re also committing to launching the most powerful Gemini models in India with full data residency support. This is a first for Google Cloud, and a direct response to help meet the needs of our Indian customers.
More AI capabilities, available locally: We’re providing additional consumption models and pre-built AI-powered applications tailored for local context by launching a suite of new capabilities with data residency support in India:
Batch support for Gemini 2.5 Flash: Now generally available, this allows organizations to run high-volume, non-real-time AI tasks at a lower cost, all in India.
Document AI: Now in preview, we’re providing local support to help Indian businesses automate document processing.
More local context in your AI: Grounding on Google Maps is a new capability that grounds model responses in real-time data from Google Maps, ensuring AI applications can provide accurate, location-aware answers.
A sovereign AI ecosystem: Building for India, with India
The most durable and decisive factor for long-term digital sovereignty lies in cultivating the “human element” — the skilled talent and innovation ecosystem. A sovereign AI future depends on building a strong local ecosystem.
Our strategy is to support India’s ecosystem-led approach by investing in the researchers, developers, and startups who are building for India’s specific needs.
Collaboration with IIT Madras: Google Cloud and Google DeepMind are thrilled to collaborate with IIT Madras to support the launch of Indic Arena. Run independently by the renowned AI4Bharat center at IIT Madras, this platform will allow users from all over India to anonymously evaluate and rank AI models on tasks unique to India’s rich multilingual landscape. To support this initiative, we are providing cloud credits to power this critical, community-driven resource.
“At AI4Bharat, our mission is to build AI for India’s specific needs. A critical part of this is having a neutral, standardized benchmark to understand how models are performing across our many languages,” said Mitesh Khapra, associate professor, IIT Madras. “Indic Arena will be that platform. We are delighted to have Google Cloud’s support to provide the initial compute power to bring this independent, public-facing project to life for the entire Indian AI community.”
We encourage all developers, researchers, and organizations in India to explore the Indic Arena platform and contribute to building a more inclusive AI future.
We invite the entire Indian ecosystem, from startups and universities to government bodies and enterprises, to take advantage of this new, dedicated capacity for Gemini in Vertex AI and our sovereign-ready infrastructure to build the next generation of AI that is built by Indians, for Indians.
AWS Backup now supports Amazon Elastic Kubernetes Service (EKS), providing a fully-managed, centralized solution for backing up EKS cluster state and persistent application data. You can now use AWS Backup to help protect your entire EKS environments through a centralized, policy-driven backup service.
You now get comprehensive data protection capabilities through AWS Backup across your Amazon EKS clusters, including automated scheduling, retention management, immutable vaults, and cross-Region and cross-account copies. AWS Backup delivers a new agent-free solution that works natively with AWS, replacing the custom scripts or third-party tools previously needed to back up each cluster. You can restore entire EKS clusters, specific namespaces, or individual persistent volumes. Use AWS Backup to protect your clusters for disaster recovery, to help meet your compliance requirements, or for additional protection before EKS cluster upgrades.
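As a hedged sketch of how this might look with the AWS SDK for Python (boto3), the plan name, schedule, vault, IAM role, and cluster ARN below are all placeholders:

import boto3

backup = boto3.client("backup")

# Create a simple daily backup plan (names, schedule, and retention are illustrative).
plan = backup.create_backup_plan(BackupPlan={
    "BackupPlanName": "eks-daily",
    "Rules": [{
        "RuleName": "daily",
        "TargetBackupVaultName": "Default",
        "ScheduleExpression": "cron(0 5 * * ? *)",
        "Lifecycle": {"DeleteAfterDays": 35},
    }],
})

# Assign an EKS cluster to the plan by its ARN (placeholder account and cluster name).
backup.create_backup_selection(
    BackupPlanId=plan["BackupPlanId"],
    BackupSelection={
        "SelectionName": "my-eks-cluster",
        "IamRoleArn": "arn:aws:iam::111122223333:role/aws-backup-service-role",
        "Resources": ["arn:aws:eks:us-east-1:111122223333:cluster/my-cluster"],
    },
)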
AWS Backup for EKS is available in all AWS Regions where both AWS Backup and Amazon EKS are available. For the most up-to-date information on Regional availability, refer to the AWS Backup Regional availability page.
The best feedback during a code review is specific, consistent, and informed by the history of a project.
However, AI code review agents today are often stateless; they have no memory of past interactions. This means you might find the same feedback on new pull requests that you’ve rejected before, because the agent can’t learn from your team’s guidance, leading to frustration and repeated work.
Today, we’re releasing a new memory capability for Gemini Code Assist on GitHub for both enterprises and individual developers. Now, you can create a dynamic, evolving memory of your team’s coding standards, style, and best practices, all derived from your direct interactions and feedback within pull requests. The memory is stored securely in a Google-managed project specific to your installation, isolating it from other users.
Here’s how memory works
Memory transforms the code review agent from a stateless tool into a long-term project contributor that learns and adapts to your team.
Automated vs. manual memory
Gemini Code Assist on GitHub already supports memory in the form of styleguide.md files. These rules are always added to the agent’s prompt, which makes this approach suitable for static, universal guidelines.
In contrast, persistent memory introduces a more dynamic and automated approach. It automatically extracts rules from pull request interactions, requiring no manual effort. These learned rules are stored efficiently and are only retrieved and applied when they are relevant to the specific code being reviewed. This creates a smarter, more scalable memory that adapts to your team.
The process is built on three key pillars:
1. It learns from your interactions
The process begins when you and your team do what you already do today: conducting code reviews. When a pull request is merged, Gemini Code Assist on GitHub will analyze the comment threads for feedback. For instance, if Gemini Code Assist on GitHub suggests “do not line-wrap import statements” in a .java file, and the author disagrees in their comment, the agent sees this interaction as a valuable piece of feedback and will store it. By waiting until a PR is merged, we ensure the conversation is complete and the code is a valuable source of truth.
2. It intelligently creates, updates and stores rules
From that simple interaction, persistent memory uses the powerful Gemini model to infer a generalized, reusable rule. In the example above, it would generate a natural language rule like: “In Java, import statements could be line-wrapped”.
3. It applies rules to future reviews
Once rules are stored in memory, the agent uses them in two critical ways:
To guide the initial review: Before it even begins analyzing a new pull request, the agent will query the persistent memory for a broad set of relevant rules for the repository. This helps shape its initial analysis to be more in line with your team’s established patterns.
To filter its own suggestions: After generating a set of draft review comments, the agent performs a second check. It retrieves highly specific rules related to its own comments and evaluates them. This acts as a filter to ensure its suggestions don’t violate a previously learned best practice, allowing it to drop or modify comments before you ever see them.
As more rules are accrued, the team’s tribal knowledge is shared across the codebase through code reviews.
Getting started
New to the app?
If you are an individual developer or OSS maintainer, install Gemini Code Assist on GitHub from the GitHub Marketplace.
As Large Language Models (LLMs) evolve, Reinforcement Learning (RL) is becoming the crucial technique for aligning powerful models with human preferences and complex task objectives.
However, enterprises that need to implement and scale RL for LLMs are facing infrastructure challenges. The primary hurdles include memory contention from concurrently hosting multiple large models (such as the actor, critic, reward, and reference models) and the iterative switching between high-latency inference (generation) phases and high-throughput training phases.
This blog details Google Cloud’s full-stack, integrated approach, from custom TPU hardware to the GKE orchestration layer — and shares how you can solve the hybrid, high-stakes demands of RL at scale.
A quick primer: Reinforcement Learning (RL) for LLMs
RL is a continuous feedback loop that combines elements of both training and inference. At a high level, the RL loop for LLMs functions as follows:
The LLM generates a response to a given prompt.
A “reward model” (often trained on human preferences) assigns a quantitative score, or reward, to the output.
An RL algorithm (e.g., DPO, GRPO) uses this reward signal to update the LLM’s parameters, adjusting its policy to generate higher-rewarding outputs in subsequent interactions.
This cycle of generation, evaluation, and optimization continually improves the LLM’s performance based on predefined objectives.
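Conceptually, one step of this loop looks like the following sketch; every function here is a placeholder for the policy model, reward model, and optimizer rather than a specific library API.

# Schematic RL fine-tuning step for an LLM; all objects are placeholders.
def rl_step(policy, reward_model, optimizer, prompts):
    # 1. Generation (inference-like phase): sample responses from the current policy.
    responses = [policy.generate(p) for p in prompts]

    # 2. Evaluation: the reward model scores each (prompt, response) pair.
    rewards = [reward_model.score(p, r) for p, r in zip(prompts, responses)]

    # 3. Optimization (training-like phase): a policy-gradient-style update
    #    (e.g., GRPO or DPO in practice) nudges the policy toward higher reward.
    loss = policy.policy_loss(prompts, responses, rewards)
    optimizer.apply(loss)
    return sum(rewards) / len(rewards)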
RL workloads are hybrid and cyclical. The main goal of RL is not to minimize error (as in training) or to predict quickly (as in inference), but to maximize reward through iterative interaction. The primary constraint for the RL workload is not just computational power, but also system-wide efficiency, specifically minimizing aggregate sampler latency and maximizing the speed of weight copying for efficient end-to-end step time.
Google Cloud’s full-stack approach to RL
Solving these system-wide challenges requires an integrated approach. You can’t just have fast hardware or a good orchestrator; you need every layer of the stack to work together. Here is how our full-stack approach is built to solve the specific demands of RL:
1. Flexible, high-performance compute (TPUs and GPUs): Instead of locking customers into one path, we provide two high-performance options. Our TPU stack is a vertically integrated, JAX-native solution where our custom hardware (excelling at matrix operations) is co-designed with our post-training libraries (MaxText and Tunix). In parallel, we fully support the NVIDIA GPU ecosystem, partnering with NVIDIA on optimized NeMo RL recipes so customers can leverage their existing expertise directly on GKE.
2. Holistic, full-stack optimization: We integrate optimization from the bare metal up. This includes our custom TPU accelerators, high-throughput storage (Managed Lustre, Google Cloud Storage), and — critically — the orchestration and scheduling that GKE provides. By optimizing the entire stack, we can attack the system-wide latencies that bottleneck hybrid RL workloads.
3. Leadership in open-source: RL infrastructure is complex and built on a wide range of tools. Our leadership starts with open-sourcing Kubernetes and extends to active partnerships with orchestrators like Ray. We contribute to key projects like vLLM, develop open-source solutions like llm-d for cost-effective serving, and open-source our own high-performance MaxText and Tunix libraries. This helps ensure you can integrate the best tools for the job, not just the ones from a single vendor.
4. Proven, mega-scale orchestration: Post-training RL can require compute resources that rival pre-training. This requires an orchestration layer that can manage massive, distributed jobs as a single unit. GKE AI mega-clusters support up to 65,000 nodes today, and we are heavily investing in multi-cluster solutions like MultiKueue to scale RL workloads beyond the limits of a single cluster.
Running RL workloads on GKE
Existing GKE infrastructure is well-suited for demanding RL workloads and provides several infrastructure-level efficiencies.
The image below outlines the architecture and key recommendations for implementing RL at scale.
Figure: GKE infrastructure for running RL
At the base, the infrastructure layer provides the foundational hardware, including supported compute types (CPUs, GPUs, and TPUs). You can use the Run:ai model streamer to accelerate model streaming for all three compute types. High-performance storage (Managed Lustre, Cloud Storage) covers the storage needs for RL.
The middle layer is the managed Kubernetes layer powered by GKE, which handles resource orchestration, resource obtainability (using Spot VMs or Dynamic Workload Scheduler), autoscaling, placement, job queuing, job scheduling, and more at mega scale.
Finally, the open frameworks layer runs on top of GKE, providing the application and execution environment. This includes managed support for open-source tools such as KubeRay and Slurm, and the gVisor sandbox for secure, isolated task execution.
Building an RL workflow
Before creating an RL workload, you must first identify a clear use case. With that objective defined, you then architect the core components: selecting the algorithm (e.g., DPO, GRPO), the model server (like vLLM or SGLang), the target GPU/TPU hardware, and other critical configurations.
Next, you can provision a GKE cluster configured with Workload Identity, GCS FUSE, and DCGM metrics. For robust batch processing, install the Kueue and JobSet APIs. We recommend deploying Ray as the orchestrator on top of this GKE stack. From there, you can launch the NeMo RL container, configure it for your GRPO job, and begin monitoring its execution. For the detailed implementation steps and source code, please refer to this repository.
Partner with the open-source ecosystem: Our leadership in AI is built on open standards like Kubernetes, llm-d, Ray, MaxText, and Tunix. We invite you to partner with us to build the future of AI together. Come contribute to llm-d! Join the llm-d community, check out the repository on GitHub, and help us define the future of open-source LLM serving.
In today’s competitive environment, IT leaders are faced with supporting application scale, rolling out more features, and enabling high-bar customer experiences. This creates a direct and complex challenge: finding the right balance between performance and total cost of ownership (TCO) for the general-purpose workloads that power everyday business operations.
Today, we are announcing the general availability of the N4D machine series, the latest addition to Google Compute Engine’s cost-optimized, general-purpose portfolio. Addressing a wide range of workloads, such as web and application servers, data analytics platforms, and containerized microservices, N4D provides a flexible and price-performant solution.
The N4D machine series combines Google’s Titanium infrastructure with 5th Gen AMD EPYC™ “Turin” processors, delivering up to 3.5x the throughput for web-serving workloads vs. the previous-generation N2D. N4D offers predefined shapes of up to 96 vCPUs and 768 GB of DDR5 memory, up to 50 Gbps of networking bandwidth, and Hyperdisk Balanced and Throughput storage. To deliver blended cost savings, N4D allows you to move beyond rigid instance sizing for both compute and storage, with Custom Machine Types to independently configure the exact number of vCPUs and amount of memory, complemented with Hyperdisk for tuning disk storage performance and capacity. For the most demanding general-purpose workloads, pair N4D with the consistently high performance of C4D.
Google Cloud provides workload-optimized infrastructure to ensure the right resources are available for every task. Titanium in particular, with its multi-tier offloads and security capabilities, is foundational to that infrastructure. Titanium offloads networking and storage processing to free up the CPU, and its dedicated SmartNIC manages all I/O, ensuring the AMD EPYC cores are reserved exclusively for your application. Titanium is part of Google Cloud’s vertically integrated stack — from the custom silicon in our servers to our planet-scale network traversing 7.75 million kilometers of terrestrial and subsea fiber across 42 regions — that is engineered to maximize efficiency and provide the ultra-low latency and high bandwidth to customers at global scale.
A new standard for price-performance
N4D machine series doesn’t just inch past the previous N2D generation; it sprints, delivering up to 50% higher price-performance for general computing workloads and up to 70% better price-performance for Java workloads. For web-serving workloads, N4D leverages Titanium and AMD’s Turin processors to drive incredible throughput. This results in up to 3.5x the price-performance vs N2D, driving faster response times and a better overall experience for your end-users.
As of October 2025. Performance based on the estimated SPECrate®2017_int_base, estimated SPECjbb2015, and Google internal Nginx Reverse Proxy benchmark scores run in production. Price-performance claims based on published and estimated list prices for Google Cloud.
“Our edge proxy fleet and internal data pipelines observed a 3-4x performance improvement on Google Cloud’s N4D instances compared to N2D. Our benchmarks also show N4D processes the same workload with significantly greater consistency while using just a fraction of the CPU. This leap in price-performance allows us to efficiently scale our general-purpose workloads, and fits neatly in our fleet alongside more specific Google compute products we leverage.” – Matt Schallert, Member of Technical Staff, Chronosphere
“A 10% increase in throughput while cutting costs by up to 50% is a massive win for TCO optimization. That’s what we achieved on Google Cloud’s N4D machine series. For MediaGo, this efficiency is critical. It allows our AI-driven advertising platform to scale more cost-effectively, directly supporting our mission to maximize ROI for our global partners.” – MediaGo
“The move from N2D to N4D is a significant generational leap. This 144.14% performance uplift over 152 tests is a testament to Google’s Titanium, unlocking the full potential of the new AMD EPYC ‘Turin’ processors. For those looking for the best possible price-performance in Google Cloud, the N4D instances are a clear winner.” – Michael Larabel, Founder and Principal Author, Phoronix (Read the full study here.)
“With the launch of the new N4D instances, Google Cloud now offers the most comprehensive portfolio based on our 5th Gen AMD EPYC processors, marking a significant milestone in our strategic partnership. The N4D machine series combines the leading performance of AMD CPUs with the uniqueness of Google’s Custom Machine Types to deliver a remarkable uplift in price-performance, flexibility, and cost-optimization for everyday workloads. Our benchmark tests confirm this, showing measured performance gains of up to 75% over the previous generation N2D machine series for media encode and transcode workloads.” – Ryan Rodman, Sr Director, Cloud Business Group, AMD
Complementing C4D machine series
Earlier this year, we introduced our general-purpose C4D machine series built on the same underlying processor as N4D. Its consistently high performance and enterprise features like advanced maintenance support, larger shapes, and our next-gen Titanium Local SSDs, make C4D a great fit for critical workloads. In fact, customers such as Silk and Chess.com report greater than 40% improvement in performance with C4D over prior generations.
But critical applications are only part of the story. A modern cloud architecture must also run countless general-purpose workloads where flexibility and price-performance are key. That’s why we designed N4D — as a complement to C4D. By leveraging C4D and N4D in tandem, you unlock the full spectrum of enterprise features, performance, flexibility, and cost-optimization, choosing:
C4D for consistent performance: This is your solution for the most demanding, latency-sensitive applications. With up to 200 Gbps networking, Local SSD support along with larger shapes up to 384 vCPUs and bare metal options, C4D delivers predictable, high-end performance for large databases, high-traffic ad and game servers, and demanding AI/ML inference workloads.
N4D for flexible cost-optimization: This is the engine for the vast majority of your general-purpose workloads. N4D’s leading price-performance, low cost, and flexibility allow you to slash TCO for applications like web servers, microservices, and development environments.
This approach is already delivering real-world results, allowing customers like Verve to optimize their business from both ends.
“With Google’s Gen4 AMD portfolio, we can optimize for both revenue and cost simultaneously. C4D provides the consistent peak performance we need for our core ad servers — 81% faster than C3D — which directly translates to more revenue from higher fill-rates (successful bid/ask matching). Meanwhile, N4D delivers an incredible 2x performance and price-performance over N2D for everyday workloads, including scale-out microservices with GKE, enabling us to grow while slashing our overall TCO. This ‘Better Together’ strategy allows us to use the consistently peak performance of C4D for our mission-critical services and the flexible, cost-efficient N4D to aggressively reduce TCO everywhere else — a level of optimization that simply isn’t possible with a single VM type elsewhere.” – Pablo Loschi, Principal Systems Engineer at Verve
The Custom Machine Type and Hyperdisk advantage
Custom Machine Types are a key differentiator for Google Cloud, letting you go beyond predefined “T-shirt sizes”. Instead of forcing your workload into a box, you can tailor the infrastructure to fit your workload’s needs, saving on cost. For instance, a memory-intensive workload requiring 16 vCPUs and 70 GB of RAM might typically be placed on a predefined N4D-highmem-16 shape, forcing you to pay for unused resources. With CMTs, you provision the exact 16 vCPU and 70 GB configuration, eliminating that waste and achieving up to 17% cost savings.
With shapes of up to 96 vCPUs and 768 GB of DDR5 memory, the combination of Custom Machine Types and N4D lets you dial in the exact resources you need with flexible vCPU-to-memory ratios along with extended memory support.
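As an illustrative sketch with the google-cloud-compute Python client, assuming N4D custom shapes follow the standard family-custom-vCPUs-memoryMB machine type naming; the project, zone, and image are placeholders.

from google.cloud import compute_v1

project, zone = "my-project", "us-central1-a"

instance = compute_v1.Instance(
    name="n4d-custom-demo",
    # 16 vCPUs and 70 GB (71,680 MB); the "n4d-custom-16-71680" name assumes the
    # generic custom machine type convention applies to N4D.
    machine_type=f"zones/{zone}/machineTypes/n4d-custom-16-71680",
    disks=[compute_v1.AttachedDisk(
        boot=True,
        auto_delete=True,
        initialize_params=compute_v1.AttachedDiskInitializeParams(
            source_image="projects/debian-cloud/global/images/family/debian-12",
        ),
    )],
    network_interfaces=[compute_v1.NetworkInterface(network="global/networks/default")],
)

compute_v1.InstancesClient().insert(project=project, zone=zone, instance_resource=instance)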
“At Symbotic, our vision is to revolutionize the global supply chain with an AI-powered robotics platform built for scale and efficiency. This demands an infrastructure that is both powerful and scalable. Google Cloud’s N4D VMs, powered by AMD’s latest EPYC processors, delivered exactly that. We observed a significant 40% performance uplift compared to the previous N2D generation, allowing us to cut our CPU footprint in half with no change in simulation speed or fidelity. The ability to pair these gains with Custom Machine Types — a capability unique to Google Cloud — is a game-changer. It allows us to precisely sculpt our infrastructure to our workloads and gain a significant TCO advantage versus other cloud offerings.” – Dan Inbar, Chief Information Officer, Symbotic
This granular control and TCO advantage extends beyond compute to your storage. Just as Custom Machine Types let you break free from fixed vCPU-to-memory ratios, Hyperdisk unbundles storage performance from capacity, letting you independently tune capacity and performance to precisely match your workload’s block storage requirements.
This is further enhanced by Hyperdisk Storage Pools for Hyperdisk Balanced volumes, which let you provision performance and capacity in aggregate, rather than managing each volume individually. The result is simpler management, higher efficiency, and an easier path for modernizing SAN workloads — all while helping you lower your storage TCO by as much as 30-50%.
Get started with N4D today
Adopting the latest N4D VM series is easy, particularly if you use Google Kubernetes Engine (GKE), where our custom compute classes remove the operational hurdles of migrating workloads to new hardware. Just add N4D to your prioritized list of VM types to ensure your workloads have the performance and flexibility they need to scale.
N4D is now available in us-central1 (Iowa), us-east1 (South Carolina), us-west1 (Oregon), us-west4 (Las Vegas), europe-west1 (Belgium), and europe-west4 (Netherlands).
1. 9xx5C-044 – Testing by AMD Performance Labs as of 10/21/2025. N4D-standard-16 score comparison to N2D-standard-16 running the FFmpeg v6.1.1 benchmark (average of 2x encode and 2x transcode) on Ubuntu 24.04 LTS OS with 6.8.0-1021-gcp kernel, SMT On.
Cloud performance results presented are based on the configuration as of the test date. Results may vary due to changes to the underlying configuration and other conditions, such as the placement of the VM and its resources, optimizations by the cloud service provider, accessed cloud regions, co-tenants, and the types of other workloads exercised at the same time on the system.
In today’s fast-paced, data-driven landscape, the ability to process, analyze, and act on vast amounts of data in real time is paramount. For businesses aiming to deliver personalized customer experiences and optimize operations, the choice of database technology is a critical decision.
At Zeotap — a leading Customer Data Platform (CDP) — we empower enterprises to unify their data from disparate sources to build a comprehensive, unified view of their customers. This enables businesses to activate data across various channels for marketing, customer support, and analytics. Zeotap handles more than 10 billion new data points a day from more than 500 data sources across our clients, while orchestrating through more than 2,000 workflows — one-third of those in real time with millisecond latency. To meet stringent SLAs for data freshness and end-to-end latencies, performance is crucial.
However, as Zeotap grew, our ScyllaDB-based infrastructure faced scaling challenges, especially as the business needed to evolve towards real-time use cases and increasingly spiky workloads. We needed a more flexible, performant, cost-effective, and operationally efficient solution, which led us to Bigtable, a low-latency, NoSQL database service from Google Cloud for machine learning, operational analytics, and high-throughput applications. The migration resulted in significant benefits, including a 46% reduction in Total Cost of Ownership (TCO).
The challenge of scaling real-time analytics
Zeotap’s platform demands a database capable of handling a high write throughput of over 300,000 writes per second and nearly triple that in reads during peaks.
As our platform evolved, the initial architecture presented several hurdles:
Scalability limitations: We initially self-managed ScyllaDB on-premises, and later in the cloud. We use Spark and BigQuery for analytical batch processing, but managing these different tools and pipelines across our own environment and customer environments reached a point where scaling became increasingly difficult.
Operational overhead: Managing and scaling our previous database infrastructure required significant operational effort. We had to run scripts in the background to add nodes when resource alerts came up and had to map hardware to different kinds of workloads.
Deployment complexity: Embedding third-party technology in our stack complicated deployment. The commercial procurement process was also cumbersome.
Cost predictability: Ensuring predictable costs for us and our clients was a growing concern as our business grew.
These challenges drove us to re-evaluate our data infrastructure and seek a cloud-native solution that could meet our streaming-first, “zero-touch” ops philosophy, while supporting our demanding OLAP and OLTP workloads.
Why Bigtable? Performance, scalability, and efficiency
Zeotap’s decision to migrate to Bigtable was driven by five key requirements:
Operational simplicity: Moving from a ScyllaDB cluster to Bigtable meant eliminating a significant operational burden and achieving “zero-touch” ops. Bigtable abstracts away hardware mapping and node management, eliminating the need for maintenance windows and handling data rebalancing automatically.
Performance: Zeotap needed predictable performance, even in the face of regularly unpredictable workloads to meet our stringent SLAs. Bigtable’s ability to deliver low latencies for both reads and writes at scale was crucial — especially with spiky traffic patterns.
Efficient scalability: Managing ScyllaDB cluster scaling, rebalancing, and hotspots was operationally intensive. Zeotap handles very spiky and bursty workloads, at times exceeding 300,000 writes per second. Bigtable disaggregates compute and storage, allowing for rapid scaling (further enhanced by autoscaling, which automatically adjusts cluster size in response to demand). This leads to greater cost efficiency and helps eliminate idle resources.
Total cost of ownership (TCO): A significant driver of this migration was the need for cost efficiency and predictability. By moving from ScyllaDB to Bigtable, we achieved a significant 46% reduction in our TCO. This stems from Bigtable’s efficient storage and the ability to combine use cases, such as using Bigtable as a hot store and BigQuery as a warm store.
Tight integration: Bigtable’s integration with other Google Cloud services, particularly BigQuery, was a major advantage in reducing operational overhead. Features like reverse ETL directly into Bigtable greatly simplify data pipelines and reduce Zeotap’s operational footprint by 20%.
Zeotap’s architectural evolution to cloud-native
Zeotap’s transition to Bigtable wasn’t an overnight lift-and-shift, but part of a strategic plan to build a streaming real-time analytics platform that could meet the needs of an ever more demanding customer landscape:
2020: After running one of the largest graphs with JanusGraph-on-ScyllaDB and a heavy processing operation with Spark on AWS, we made the strategic move to migrate to Google Cloud.
2022: Adopted a Lambda architecture, heavily pivoting into BigQuery and moving away from graph due to performance issues. ScyllaDB was now acting as a pure key-value store.
2023: Shifted to a Kappa architecture, prioritizing real-time ingestion and streaming. This was a major network redesign to meet the needs of clients for real-time use cases.
2024: Fully committed to a cloud-native model with Bigtable and BigQuery as its core, while eliminating Spark from our stack.
In our current architecture, Zeotap’s ingestion layer runs via Dataflow and a home-grown streaming engine, with a combination of Memorystore and Bigtable powering inline enrichment, transformation, and ingestion. We use Memorystore as a lightning-fast cache layer to speed up read-heavy workloads while helping to reduce strain on Bigtable. Bigtable serves as the hot store for real-time ingestion and the data API for low-latency point lookups, while BigQuery acts as the warm and cold store for analytics, inferencing, and batch processing.
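For the point-lookup path, a minimal sketch with the google-cloud-bigtable Python client looks like the following; the instance, table, and row-key layout are illustrative rather than Zeotap’s actual schema.

from google.cloud import bigtable

# Bigtable as the hot store: a single-row read by key serves a low-latency profile lookup.
client = bigtable.Client(project="my-project")
table = client.instance("profiles-instance").table("customer_profiles")

row = table.read_row(b"customer#42")  # row key layout is illustrative
if row is not None:
    for family, columns in row.cells.items():
        for qualifier, cells in columns.items():
            print(family, qualifier.decode(), cells[0].value)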
This architectural transformation, with Bigtable at its heart, enables us to:
Consolidate fragmented data: Bigtable handles the complex multi-read/write operations required to build single customer views. The data derives from hundreds of different channels: ERP, CRM, web apps, and data warehouses. It carries different types of IDs that need to be stitched together as records are consolidated into Bigtable.
Deliver real-time customer 360: Serve comprehensive customer profiles, including identities, attributes, streaming events, calculated attributes, and consent data — all through our Bigtable-backed data API. This makes the same unified assets available across the entire customer lifecycle, empowering customer support, marketers, and data analysts alike.
Optimize AI pipelines: The synergy between Bigtable as a feature store and BigQuery as our inferencing platform (leveraging BQML) has dramatically shrunk our time to market for deploying AI models for clients — down from multiple weeks to less than a week.
Results and looking forward
Migrating to Bigtable has delivered substantial, quantifiable benefits for Zeotap. Most notably, we achieved a 46% decrease in Total Cost of Ownership (TCO) compared to our previous infrastructure. This cost efficiency was paired with a 20% reduction in overall operational tasks and overhead — a direct result of the tight integration between Bigtable and BigQuery. Beyond resource savings, the platform now offers enhanced performance and reliability — with lower latencies — enabling us to confidently meet our stringent Service Level Agreement (SLA) commitments. Furthermore, Bigtable has improved our agility, allowing for faster deployment of AI/ML models across various environments with efficient resource utilization, such as reading batch workloads off our Disaster Recovery (DR) cluster.
Transform your data infrastructure with Bigtable
Zeotap’s migration is a compelling example of how choosing the right database can address the challenges of scale, performance, and operational complexity in the era of real-time data and AI. By leveraging Bigtable’s capabilities for high throughput, low-latency reads, and efficient handling of demanding workloads, coupled with its seamless integration with BigQuery, Zeotap built a more flexible, efficient, and cost-effective platform that empowers customers’ real-time data initiatives.
Learn more
Check out the power of Bigtable and begin planning your migration today.
Discover Bigtable’s Cassandra API and tools for no-downtime, no-code-change migrations from ScyllaDB and Cassandra.
Effective today, all new Amazon MSK Provisioned clusters with Express brokers will support Intelligent Rebalancing at no additional cost. This new capability makes it effortless for customers to execute automatic partition balancing operations when scaling their Kafka clusters up or down. Intelligent Rebalancing maximizes the capacity utilization of MSK Express-based clusters by optimally rebalancing Kafka resources on them for better performance, eliminating the need for customers to manage partitions themselves or via third-party tools. Intelligent Rebalancing performs these operations up to 180 times faster compared to Standard brokers.
MSK Express brokers are designed to deliver up to three times more throughput per-broker, scale up to 20 times faster, and reduce recovery time by 90 percent as compared to Standard brokers running Apache Kafka. With Intelligent Rebalancing, MSK Express-based clusters are continuously monitored for resource imbalance or overload based on intelligent Amazon MSK defaults to maximize cluster performance. When required, brokers are efficiently scaled, without affecting cluster availability for clients to produce and consume data. Customers can now take full advantage of the scaling and performance benefits of MSK Provisioned clusters for Express brokers while simplifying cluster management operations.
Intelligent Rebalancing is being rolled out for all new MSK Provisioned clusters with Express brokers in all AWS Regions where Express brokers are available. Intelligent Rebalancing does not require any additional configuration or setup to get started. To learn more, see the Amazon MSK Developer Guide.
AWS Control Tower customers can now simply move their accounts to an Organizational Unit (OU) to enroll them under AWS Control Tower governance. This feature helps customers maintain consistency across their AWS environment and simplifies the account creation and enrollment processes. When enrolled, member accounts receive best practice configurations, controls, and baseline resources required for AWS Control Tower governance.
Customers are no longer required to manually update accounts or re-register OUs when migrating accounts or making changes to their OU structure. When an account is moved to a new OU, AWS Control Tower automatically enrolls the account, applying the baseline configurations and controls from the new OU and removing those from the original OU. With this feature, customers can further simplify their new account provisioning workflows by creating an account and then moving it into the right OU using the AWS Organizations console or the CreateAccount and MoveAccount APIs.
Customers on landing zone version 3.1 and higher can opt in to this feature by toggling the automatically enroll accounts flag in their Landing Zone settings or using the Create or UpdateLandingZone APIs by setting the value of the RemediationTypes parameter to Inheritance_Drift. To learn more about this functionality, review Move and enroll accounts with auto-enrollment. For a list of AWS Regions where AWS Control Tower is available, see the AWS Region Table.
Claude Sonnet 4.5 currently leads the SWE-bench Verified benchmarks with enhanced instruction following, better code improvement identification, stronger refactoring judgment, and more effective production-ready code generation. This model excels at powering long-running agents that tackle complex, multi-step tasks requiring peak accuracy—like autonomously managing multi-channel marketing campaigns or orchestrating cross-functional enterprise workflows. In cybersecurity, it can help teams shift from reactive detection to proactive defense by autonomously patching vulnerabilities. For financial services, it can handle everything from analysis to advanced predictive modeling.
Through the Amazon Bedrock API, Claude can now automatically edit context to clear stale information from past tool calls, allowing you to maximize the model’s context. A new memory tool lets Claude store and consult information outside the context window to boost accuracy and performance.
To get started with Claude Sonnet 4.5 in Amazon Bedrock, read the News Blog, visit the AWS GovCloud (US) console, Anthropic’s Claude in Amazon Bedrock product page, and the Amazon Bedrock pricing page.
Amazon Braket notebook instances now come with native support for CUDA-Q, streamlining access to NVIDIA’s quantum computing platform for hybrid quantum-classical applications. This enhancement is enabled by upgrading the underlying operating system to Amazon Linux 2023, which delivers improved performance, security, and compatibility for quantum development workflows.
Quantum researchers and developers can now seamlessly build and test hybrid quantum-classical algorithms using CUDA-Q’s GPU-accelerated quantum circuit simulation alongside access to quantum processing units (QPUs) from IonQ, Rigetti, and IQM, all within a single managed environment. With this release, developers can now access CUDA-Q directly within the managed notebook environment, simplifying workflows that previously required local deployment or needed to be run via Hybrid Jobs.
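For example, a minimal CUDA-Q sketch that a notebook instance could run builds a Bell-pair kernel and samples it; the "nvidia" simulator target assumes a GPU is attached, and the targets actually available depend on the environment.

import cudaq

# Select a GPU-accelerated simulator if one is available in the notebook environment.
cudaq.set_target("nvidia")

@cudaq.kernel
def bell():
    qubits = cudaq.qvector(2)
    h(qubits[0])
    x.ctrl(qubits[0], qubits[1])
    mz(qubits)

# Sample the circuit; expect roughly equal counts of "00" and "11".
counts = cudaq.sample(bell, shots_count=1000)
print(counts)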
CUDA-Q support in Amazon Braket notebook instances is available in all AWS Regions where Amazon Braket is available. To get started, see the Amazon Braket Developer Guide and visit the Amazon Braket product page to learn more about quantum computing on AWS.
Amazon S3 Express One Zone now supports Internet Protocol version 6 (IPv6) addresses for gateway Virtual Private Cloud (VPC) endpoints. S3 Express One Zone is a high-performance storage class designed for latency-sensitive applications.
Organizations are adopting IPv6 networks to mitigate IPv4 address exhaustion in their private networks or to comply with regulatory requirements. You can now access your data in S3 Express One Zone over IPv6 or DualStack VPC endpoints. You don’t need additional infrastructure to handle IPv6 to IPv4 address translation.
S3 Express One Zone support for IPv6 is available in all AWS Regions where the storage class is available at no additional cost. You can set up IPv6 for new and existing VPC endpoints using the AWS Management Console, AWS CLI, AWS SDK, or AWS CloudFormation. To get started using IPv6 on S3 Express One Zone, visit the S3 User Guide.
Written by: Stallone D’Souza, Praveeth DSouza, Bill Glynn, Kevin O’Flynn, Yash Gupta
Welcome to the Frontline Bulletin Series
Straight from Mandiant Threat Defense, the “Frontline Bulletin” series brings you the latest on the threats we are seeing in the wild right now, equipping our community to understand and respond.
Introduction
Mandiant Threat Defense has uncovered exploitation of an unauthenticated access vulnerability within Gladinet’s Triofox file-sharing and remote access platform. This now-patched n-day vulnerability, assigned CVE-2025-12480, allowed an attacker to bypass authentication and access the application configuration pages, enabling the upload and execution of arbitrary payloads.
As early as Aug. 24, 2025, a threat cluster tracked by Google Threat Intelligence Group (GTIG) as UNC6485 exploited the unauthenticated access vulnerability and chained it with the abuse of the built-in anti-virus feature to achieve code execution.
The activity discussed in this blog post leveraged a vulnerability in Triofox version 16.4.10317.56372, which was mitigated in release 16.7.10368.56560.
Gladinet engaged with Mandiant on our findings, and Mandiant has validated that this vulnerability is resolved in new versions of Triofox.
Initial Detection
Mandiant leverages Google Security Operations (SecOps) for detecting, investigating, and responding to security incidents across our customer base. As part of Google Cloud Security’s Shared Fate model, SecOps provides out-of-the-box detection content designed to help customers identify threats to their enterprise. Mandiant uses SecOps’ composite detection functionality to enhance our detection posture by correlating the outputs from multiple rules.
For this investigation, Mandiant received a composite detection alert identifying potential threat actor activity on a customer’s Triofox server. The alert identified the deployment and use of remote access utilities (using PLINK to tunnel RDP externally) and file activity in potential staging directories (file downloads to C:\WINDOWS\Temp).
Within 16 minutes of beginning the investigation, Mandiant confirmed the threat and initiated containment of the host. The investigation revealed an unauthenticated access vulnerability that allowed access to configuration pages. UNC6485 used these pages to run the initial Triofox setup process to create a new native admin account, Cluster Admin, and used this account to conduct subsequent activities.
Triofox Unauthenticated Access Control Vulnerability
Figure 1: CVE-2025-12480 exploitation chain
During the Mandiant investigation, we identified an anomalous entry in the HTTP log file – a suspicious HTTP GET request with an HTTP Referer URL containing localhost. The presence of the localhost host header in a request originating from an external source is highly irregular and typically not expected in legitimate traffic.
Within a test environment, Mandiant noted that standard HTTP requests issued to AdminAccount.aspx result in a redirect to the Access Denied page, indicative of access controls being in place on the page.
Figure 3: Redirection to AccessDenied.aspx when attempting to browse AdminAccount.aspx
Access to the AdminAccount.aspx page is granted as part of setup from the initial configuration page at AdminDatabase.aspx. The AdminDatabase.aspx page is automatically launched after first installing the Triofox software. This page allows the user to set up the Triofox instance, with options such as database selection (Postgres or MySQL), connecting LDAP accounts, or creating a new native cluster admin account, in addition to other details.
Attempts to browse to the AdminDatabase.aspx page resulted in a similar redirect to the Access Denied page.
Figure 4: Redirection to AccessDenied.aspx when attempting to browse AdminDatabase.aspx
Mandiant validated the vulnerability by testing the workflow of the setup process. The Host header field is provided by the web client and can be easily modified by an attacker. This technique is referred to as an HTTP host header attack. Changing the Host value to localhost grants access to the AdminDatabase.aspx page.
Figure 5: Access granted to AdminDatabase.aspx by changing Host header to localhost
By following the setup process and creating a new database via the AdminDatabase.aspx page, access is granted to the admin initialization page, AdminAccount.aspx, which then redirects to the InitAccount.aspx page to create a new admin account.
Figure 6: Successful access to the AdminCreation page InitAccount.aspx
Figure 7: Admin page
Analysis of the code base revealed that the main access control check for the AdminDatabase.aspx page is controlled by the function CanRunCriticalPage(), located within the GladPageUILib.GladBasePage class found in C:\Program Files (x86)\Triofox\portal\bin\GladPageUILib.dll.
public bool CanRunCriticalPage()
{
    Uri url = base.Request.Url;
    string host = url.Host;
    bool flag = string.Compare(host, "localhost", true) == 0; // Access to the page is granted if Request.Url.Host equals 'localhost', immediately skipping all other checks if true
    bool result;
    if (flag)
    {
        result = true;
    }
    else
    {
        // Check for a pre-configured trusted IP in the web.config file. If configured, compare the client IP with the trusted IP to grant access
        string text = ConfigurationManager.AppSettings["TrustedHostIp"];
        bool flag2 = string.IsNullOrEmpty(text);
        if (flag2)
        {
            result = false;
        }
        else
        {
            string ipaddress = this.GetIPAddress();
            bool flag3 = string.IsNullOrEmpty(ipaddress);
            if (flag3)
            {
                result = false;
            }
            else
            ...
Figure 8: Vulnerable code in the function CanRunCriticalPage()
As noted in the code snippet, the code presents several vulnerabilities:
Host Header attack – ASP.NET builds Request.Url from the HTTP Host header, which can be modified by an attacker.
No Origin Validation – No check for whether the request came from an actual localhost connection versus a spoofed header.
Configuration Dependence – If TrustedHostIp isn’t configured, the only protection is the Host header check.
Triofox Anti-Virus Feature Abuse
To achieve code execution, the attacker logged in using the newly created Admin account. The attacker uploaded malicious files to execute them using the built-in anti-virus feature. To set up the anti-virus feature, the user is allowed to provide an arbitrary path for the selected anti-virus. The file configured as the anti-virus scanner location inherits the Triofox parent process account privileges, running under the context of the SYSTEM account.
The attacker was able to run their malicious batch script by configuring the path of the anti-virus engine to point to their script. The folder path on disk of any shared folder is displayed when publishing a new share within the Triofox application. Then, when an arbitrary file is uploaded to any published share within the Triofox instance, the configured script is executed.
Figure 9: Anti-virus engine path set to a malicious batch script
SecOps telemetry recorded command-line execution of the attacker script, which performed the following actions:
Download a payload from http://84.200.80[.]252/SAgentInstaller_16.7.10368.56560.zip, which hosted a disguised executable despite the ZIP extension
Save the payload to: C:\Windows\appcompat\SAgentInstaller_16.7.10368.56560.exe
Execute the payload silently
The executed payload was a legitimate copy of the Zoho Unified Endpoint Management System (UEMS) software installer. The attacker used the UEMS agent to then deploy the Zoho Assist and Anydesk remote access utilities on the host.
Reconnaissance and Privilege Escalation
The attacker used Zoho Assist to run various commands to enumerate active SMB sessions and specific local and domain user information.
Additionally, they attempted to change passwords for existing accounts and add the accounts to the local administrators and the “Domain Admins” group.
Defense Evasion
The attacker downloaded sihosts.exe and silcon.exe (sourced from the legitimate domain the.earth[.]li) into the directory C:\windows\temp.
sihosts.exe (originally Plink, or PuTTY Link): a common command-line utility for creating SSH connections
silcon.exe (originally PuTTY): an SSH and Telnet client
These tools were used to set up an encrypted tunnel, connecting the compromised host to their command-and-control (C2 or C&C) server over port 433 via SSH. The C2 server could then forward all traffic over the tunnel to the compromised host on port 3389, allowing inbound RDP traffic. The commands were run with parameters consistent with this reverse-tunnel configuration.
While this vulnerability is patched in Triofox version 16.7.10368.56560, Mandiant recommends upgrading to the latest release. In addition, Mandiant recommends auditing admin accounts and verifying that Triofox’s Anti-virus Engine is not configured to execute unauthorized scripts or binaries. Security teams should also hunt for attacker tools using the hunting queries listed at the bottom of this post, and monitor for anomalous outbound SSH traffic.
Acknowledgements
Special thanks to Elvis Miezitis, Chris Pickett, Moritz Raabe, Angelo Del Rosario, and Lampros Noutsos
Detection Through Google SecOps
Google SecOps customers have access to these broad category rules and more under the Mandiant Windows Threats rule pack. The activity discussed in this blog post is detected in Google SecOps under the rule names:
Gladinet or Triofox IIS Worker Spawns CMD
Gladinet or Triofox Suspicious File or Directory Activity
Gladinet Cloudmonitor Launches Suspicious Child Process
Powershell Download and Execute
File Writes To AppCompat
Suspicious Renamed Anydesk Install
Suspicious Activity In Triofox Directory
Suspicious Execution From Appcompat
RDP Protocol Over SSH Reverse Tunnel Methodology
Plink EXE Tunneler
Net User Domain Enumeration
SecOps Hunting Queries
The following UDM queries can be used to identify potential compromises within your environment.
GladinetCloudMonitor.exe Spawns Windows Command Shell
Identify the legitimate GladinetCloudMonitor.exe process spawning a Windows Command Shell.
Renamed Plink or PuTTY Reverse SSH Tunnel
Identify the execution of a renamed Plink executable (sihosts.exe) or a renamed PuTTY executable (silcon.exe) attempting to establish a reverse SSH tunnel.
metadata.event_type = "PROCESS_LAUNCH"
target.process.command_line = /-Rb/
(
target.process.file.full_path = /(silcon.exe|sihosts.exe)/ nocase or
(target.process.file.sha256 = "50479953865b30775056441b10fdcb984126ba4f98af4f64756902a807b453e7" and target.process.file.full_path != /plink.exe/ nocase) or
(target.process.file.sha256 = "16cbe40fb24ce2d422afddb5a90a5801ced32ef52c22c2fc77b25a90837f28ad" and target.process.file.full_path != /putty.exe/ nocase)
)
Google Public Sector is committed to supporting the critical missions of the U.S. Department of Defense (DoD) by delivering cutting-edge cloud, AI, and data services securely. Today, we are marking an important milestone in that commitment: we have successfully achieved Cybersecurity Maturity Model Certification (CMMC) Level 2 certification under the DoD’s CMMC program.
This certification, validated by a certified third-party assessment organization (C3PAO), affirms that Google Public Sector’s internal systems used to handle Controlled Unclassified Information (CUI) meet the DoD’s rigorous cybersecurity standards for protecting CUI.
Enabling a secure partnership
This CMMC Level 2 certification is a key enabler for our partnership with the DoD. It ensures our teams can operate and collaborate within the defense ecosystem while fully supporting the new DoD requirements, allowing us to serve as a trusted partner and support the mission without compromise.
Helping the Defense Industrial Base on their CMMC journey
While this certification does not extend to customer environments, we are also dedicated to helping our partners and customers across the Defense Industrial Base (DIB) on their own CMMC journeys.
Our FedRAMP-authorized cloud services, including Google Workspace, are designed to support DIB suppliers in building their CMMC-compliant solutions with secure, cutting-edge cloud, AI, and data capabilities. You can find all of our compliance resources, including guides for both Google Cloud and Google Workspace, on our central CMMC compliance page. As an example, our Google Workspace CMMC Implementation Guide provides specific configuration details and control mappings, and our recent blog details how Google Workspace can help you achieve CMMC 2.0 compliance. These resources are designed to help DIB companies accelerate their own assessments and build their CMMC-compliant solutions on a secure, verified foundation.
Understanding CMMC and the DFARS connection
The CMMC program is a DoD initiative to enhance cybersecurity across the DIB. Its purpose is to verify that contractors have implemented the required security controls, based heavily on NIST Special Publication (SP) 800-171, to protect CUI and Federal Contract Information (FCI).
Many contractors are already familiar with DFARS 252.204-7012, which has long required the implementation of NIST SP 800-171. The new CMMC program is being implemented into contracts via the clause DFARS 252.204-7021. When this clause appears in a solicitation, it makes having achieved a specific CMMC level a mandatory condition for contract award.
A continued commitment to the mission
Our CMMC Level 2 certification is a direct reflection of our commitment to meeting the DoD’s stringent security requirements. It ensures we can continue to support the Department’s mission responsibly and compliantly. We remain committed to our partnership with the DoD, empowering the Defense Industrial Base with cutting-edge cloud, AI, and data services to build a more secure and resilient future.
Catch the highlights from our recent Google Public Sector Summit where we shared how Google Cloud’s AI and security technologies can help advance your mission.
Amazon CloudWatch agent now supports collection of shared memory utilization metrics from Linux hosts running on Amazon EC2 or on-premises environments. This new capability enables you to monitor total shared memory usage in CloudWatch, alongside existing memory metrics like free memory, used memory, and cached memory.
Enterprise applications such as SAP HANA and Oracle RDBMS make extensive use of shared memory segments that were previously not captured in standard memory metrics. By enabling shared memory metric collection in your CloudWatch agent configuration file, you can now accurately assess total memory utilization across your hosts, helping you optimize host and application configurations and make informed decisions about instance sizing.
Amazon CloudWatch agent is supported in all commercial AWS Regions and AWS GovCloud (US) Regions. For Amazon CloudWatch custom metrics pricing, see the CloudWatch Pricing page.
Amazon SageMaker Unified Studio now provides real-time notifications for data catalog activities, enabling data teams to stay informed of subscription requests, dataset updates, and access approvals. With this launch, customers receive real-time notifications for catalog events including new dataset publications, metadata changes, and access approvals directly within the SageMaker Unified Studio notification center. This launch streamlines collaboration by keeping teams updated as datasets are published or modified.
The new notification experience in SageMaker Unified Studio is accessible from a “bell” icon in the top right corner of the project home page. From here, you can access a short list of recent notifications including subscription requests, updates, comments, and system events. To see the full list of all notifications, you can click on “notification center” to see all notifications in a tabular view that can be filtered based on your preferences for data catalogs, projects and event types.
Amazon Web Services (AWS) now supports Internet Protocol version 6 (IPv6) addresses for AWS PrivateLink Gateway and Interface Virtual Private Cloud (VPC) endpoints for Amazon S3.
The continued growth of the internet is exhausting available Internet Protocol version 4 (IPv4) addresses. IPv6 increases the number of available addresses by several orders of magnitude, so customers no longer need to manage overlapping address spaces in their VPCs. To get started with IPv6 connectivity on a new or existing S3 gateway or interface endpoint, configure the endpoint’s IP address type as IPv6 or Dualstack. When enabled, Amazon S3 automatically updates the routing tables with IPv6 addresses for gateway endpoints and sets up an elastic network interface (ENI) with IPv6 addresses for interface endpoints.
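For example, a minimal boto3 sketch that creates a dual-stack S3 gateway endpoint; the Region, VPC, and route table IDs are placeholders.

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Create an S3 gateway endpoint that serves both IPv4 and IPv6 traffic.
ec2.create_vpc_endpoint(
    VpcId="vpc-0123456789abcdef0",            # placeholder
    VpcEndpointType="Gateway",
    ServiceName="com.amazonaws.us-east-1.s3",
    RouteTableIds=["rtb-0123456789abcdef0"],  # placeholder
    IpAddressType="dualstack",
)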
IPv6 support for VPC endpoints for Amazon S3 is now available in all AWS Commercial Regions and the AWS GovCloud (US) Regions, at no additional cost. You can set up IPv6 for new and existing VPC endpoints using the AWS Management Console, AWS CLI, AWS SDK, or AWS CloudFormation. To learn more, please refer to the service documentation.
AWS Private Certificate Authority (AWS Private CA) now enables you to create certificate authorities (CAs) and issue certificates that use Module Lattice-based Digital Signature Algorithm (ML-DSA). This feature enables you to begin transitioning your public key infrastructure (PKI) towards post-quantum cryptography, allowing you to put protections in place now to protect the security of your data against future quantum computing threats. ML-DSA is a post-quantum digital signature algorithm standardized by National Institute of Standards and Technology (NIST) as Federal Information Processing Standards (FIPS) 204.
With this feature, you can now test ML-DSA in your environment for certificate issuance, identity verification, and code signing. You can create CAs, issue certificates, create certificate revocation lists (CRLs), and configure Online Certificate Status Protocol (OCSP) responders using ML-DSA. A cryptographically relevant quantum computer (CRQC) will be able to break current digital signature algorithms, like Rivest–Shamir–Adleman (RSA) or the Elliptic Curve Digital Signature Algorithm (ECDSA), which are expected to be phased out over the next decade.
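As a hedged boto3 sketch of creating a root CA with a post-quantum algorithm, note that the "ML_DSA_65" and "ML_DSA" identifiers below are assumed names used for illustration; check the AWS Private CA API reference for the exact values.

import boto3

acmpca = boto3.client("acm-pca")

# Create a root CA using a post-quantum signature algorithm.
# NOTE: "ML_DSA_65" / "ML_DSA" are assumed identifiers for illustration only;
# consult the current AWS Private CA API reference for the exact enum values.
response = acmpca.create_certificate_authority(
    CertificateAuthorityType="ROOT",
    CertificateAuthorityConfiguration={
        "KeyAlgorithm": "ML_DSA_65",
        "SigningAlgorithm": "ML_DSA",
        "Subject": {"CommonName": "example-pq-root"},
    },
)
print(response["CertificateAuthorityArn"])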
AWS Private CA support for ML-DSA is available in all commercial AWS Regions, the AWS GovCloud (US) Regions, and the China Regions.
Starting today, Amazon Elastic Compute Cloud (Amazon EC2) C7i-flex instances, which deliver up to 19% better price performance compared to C6i instances, are available in the Middle East (UAE) Region. C7i-flex instances provide the easiest way for you to get price-performance benefits for a majority of compute-intensive workloads. The new instances are powered by 4th generation Intel Xeon Scalable custom processors (Sapphire Rapids) that are available only on AWS, and offer 5% lower prices compared to C7i instances.
C7i-flex instances offer the most common sizes, from large to 16xlarge, and are a great first choice for applications that don’t fully utilize all compute resources. With C7i-flex instances, you can seamlessly run web and application servers, databases, caches, Apache Kafka, Elasticsearch, and more. For compute-intensive workloads that need larger instance sizes (up to 192 vCPUs and 384 GiB memory) or continuous high CPU usage, you can leverage C7i instances.