Cloud

2025 07 08

GCP – Accelerate your AI workloads with the Google Cloud Managed Lustre

Today, we’re making it even easier to achieve breakthrough performance for your AI/ML workloads: Google Cloud Managed Lustre is now GA, and available in four distinct performance tiers that deliver throughput ranging from 125 MB/s, 250 MB/s, 500 MB/s, to 1000 MB/s per TiB of capacity — with the ability to scale up to 8 PB of storage capacity. The Managed Lustre solution is powered by DDN’s EXAScaler, combining DDN’s decades of leadership in high-performance storage with Google Cloud’s expertise in cloud infrastructure.

Managed Lustre provides a POSIX-compliant, parallel file system that delivers consistently high throughput and low latency, essential for:

High-throughput inference: For applications that require near-real-time inference on large datasets, Lustre provides high parallel throughput and sub-millisecond read latency.
Large-scale model training: Accelerate the training cycles of deep learning models by providing rapid access to petabytes-sized datasets. Lustre’s parallel architecture ensures GPUs and TPUs are fed with data, minimizing idle time.
Checkpointing and restarting large models: Save and restore the state of large models during training faster, improving goodput and allowing for more efficient experimentation.
Data preprocessing and feature engineering: Process raw data, extract features, and prepare datasets for training, reducing the time spent on data pipelines.
Scientific simulations and research: Beyond AI/ML, Lustre excels in traditional HPC scenarios like computational fluid dynamics, genomic sequencing, and climate modeling, where massive datasets and high-concurrency access are critical.

Lustre is designed for the highly parallel and random I/O that characterizes many AI/ML training and inference tasks. This parallel processing capability across multiple clients ensures your compute resources are never starved for data.

Performance tiers and pricing

Managed Lustre offers flexible pricing and performance tiers designed to meet the diverse needs of your workloads, whether you’re focused on capacity or highest throughput density.

Throughput MB/s per TiB of storage capacity	Storage pricing per GiB per month
125	$0.145
250	$0.21
500	$0.34
1000	$0.60

Please see more details at the Managed Lustre pricing page.

Irrespective of the aggregate throughput, all tiers come with sub-millisecond read latency, high single-stream throughput, and are perfect for parallel access to many small files.

Driving innovation together: partnering with DDN

Google Cloud’s Managed Lustre is powered by DDN’s EXAScaler, bringing together two industry leaders in high-performance computing and elastic cloud infrastructure. This partnership represents a joint commitment to simplifying the deployment and management of large-scale AI and HPC workloads in the cloud, thanks to:

Trusted leaders: By combining DDN’s decades of expertise in high-performance Lustre with Google Cloud’s global infrastructure and AI ecosystem, we are delivering a foundational capability that removes storage bottlenecks and helps our customers solve their most complex challenges in AI and HPC.
Fully managed and supported solution: Enjoy the benefits of a fully managed service from Google, with comprehensive support from both Google and DDN, for seamless operations and peace of mind.
Global availability and ecosystem integration: Managed Lustre is now globally accessible in multiple Google Cloud regions and integrates with the broader Google Cloud ecosystem, including Google Kubernetes Engine (GKE) and TPUs.

These benefits caught the attention of one of our largest partners, NVIDIA, who is looking forward to having it as part of its NVIDIA AI platform.

“Enterprises today demand AI infrastructure that combines accelerated computing with high-performance storage solutions to deliver uncompromising speed, seamless scalability and cost efficiency at scale. Google and DDN’s collaboration on Google Cloud Managed Lustre creates a better-together solution uniquely suited to meet these needs. By integrating DDN’s enterprise-grade data platforms and Google’s global cloud capabilities, organizations can readily access vast amounts of data and unlock the full potential of AI with the NVIDIA AI platform (or NVIDIA accelerated computing platform) on Google Cloud — reducing time-to-insight, maximizing GPU utilization, and lowering total cost of ownership.” – Dave Salvator, Director of Accelerated Computing Products, NVIDIA

Get started today!

Ready to supercharge your AI/ML and HPC workloads? Getting started with Managed Lustre is simple:

Navigate to Managed Lustre in the Google Cloud console.
Provision your Managed Lustre instance, choosing the performance tier and size that best fits your needs.
Connect your compute instances, GKE clusters to your new high-performance file system.

For detailed instructions and documentation, please visit the Managed Lustre documentation. And if needed, reach out to Google Cloud sales specialists.

Watch the Fireside Chat

Don’t miss the opportunity to learn more about the strategic partnership between Google Cloud and DDN, and the unique capabilities of Managed Lustre. Read the official DDN press release here.

Watch the fireside chat with Sameet Agarwal, VP/GM Storage and Sven Oehme, CTO of DDN, here.

Read More for the details.

2025 07 08

AWS – AWS Parallel Computing Service (PCS) is now available in the AWS Europe (London) Region

Tibor Kiss AWS, Cloud AWS

Today, AWS launches AWS Parallel Computing Service (PCS) in the AWS Europe (London) region, enabling you to easily build and manage High Performance Computing (HPC) clusters using the Slurm workload manager.

AWS PCS is a managed service that makes it easier for you to run and scale your high performance computing (HPC) workloads and build scientific and engineering models on AWS using Slurm. You can use AWS PCS to build complete, elastic environments that integrate compute, storage, networking, and visualization tools. AWS PCS simplifies cluster operations with managed updates and built-in observability features, helping to remove the burden of maintenance. You can work in a familiar environment, focusing on your research and innovation instead of worrying about infrastructure.

To get started, visit the AWS PCS page and the AWS PCS documentation.

Read More for the details.

2025 07 08

GCP – Expanding Z3 family with 9 new VMs and a bare metal instance for storage and I/O intensive workloads

Tibor Kiss Cloud, Google Cloud gcp

Today, we are thrilled to announce the expansion of the Z3 Storage Optimized VM family with the general availability of nine new Z3 virtual machines that offer local SSD capacity ranging from 3 TiB to 18 TiB per VM, complementing existing Z3 VMs which offer 36TiB of Local SSD per VM. We are also very pleased to launch a Z3 bare metal instance, which includes up to 72 TiB of Local SSDs. Z3 VMs enable customers like Shopify, Tenderly and ScyllaDB to achieve impressive performance improvements for their high performance storage workloads by reducing the IO access latency by up to 35% compared to VM instances using previous-generation local SSDs.

Z3 VMs are designed to run I/O-intensive workloads that require large local storage capacity and high storage performance, including SQL, NoSQL, and vector databases, data analytics, semantic data search and retrieval, and distributed file systems. The Z3 bare metal instance provides direct access to the physical server CPUs and is ideal for workloads that require low-level system access like private and hybrid cloud platforms, custom hypervisors, container platforms, or applications with specialized performance or licensing needs.

Both Z3 VMs and the bare metal instance are based on Titanium SSDs, which offload local storage processing from CPU resources to deliver real-time data processing, low-latency, high-throughput storage performance and enhanced storage security. Z3 VMs with Titanium SSD offer up to 36 GiB/s of read throughput and up to 9M IOPS, increasing write storage performance by up to 25% compared to previous generation Local SSDs¹.

aside_block: <ListValue: [StructValue([(‘title’, ‘$300 in free credit to try Google Cloud infrastructure’), (‘body’, <wagtail.rich_text.RichText object at 0x3e5fa49db850>), (‘btn_text’, ‘Start building for free’), (‘href’, ‘http://console.cloud.google.com/freetrial?redirectPath=/compute’), (‘image’, None)])]>

Based on the 4th Gen Intel Xeon scalable processor, Z3 VMs come with up to 176 vCPUs, 1,408 GiB of memory, and 36 TiB of local storage in 11 virtual machine shapes. The Z3 bare metal instance offers 192 vCPUs, 1,536 GiB of memory and 72 TiB of local storage. Z3 VMs and the bare metal instance deliver the connectivity and storage performance that enterprise workloads need, with up to 100 Gbps in standard bandwidth and up to 200 Gbps with Tier1 networking for high-traffic applications.

The expanded Z3 virtual machine portfolio lets you rightsize your infrastructure and scale your clusters to meet workloads requirements by providing larger total local SSD capacity and higher local SSD capacity per vCPU. Z3 offers two different VM types: the standardlssd VM types, which include five VM shapes that offer about 200 GiB of local SSD per vCPU. They are optimized for data analytics (OLAP), and SQL databases like MySQL and Postgres workloads. The highlssd VM types include six different VM shapes and the Z3 bare metal instance. They offer about 400 GiB of local SSD per vCPU and are optimized for distributed databases, data streaming, large parallel file systems and data search.

What our customers and partners are saying

“We are thrilled to announce Nutanix Cloud Clusters coming to Google Cloud at the end of CY25 as part of Nutanix’s commitment to delivering flexible, hybrid cloud solutions. Google Cloud’s Z3 instance types represent a perfect foundation for Nutanix to enable performance and resilience for enterprise applications. We’re excited about our partnership with Google Cloud in empowering our joint customers with greater choice and simplicity in their cloud journey.” – Saveen Pakala, Vice President of Product Management, Nutanix

“OP Labs contributes to the Optimism protocol, which enables orders of magnitude of improved performance and scalability for Ethereum. Z3 reduces p99 block insertion tail latencies by 30-50% for our most I/O-demanding blockchain nodes compared to N2. By migrating our solution to Z3, we will be able to scale our blockchain nodes to handle L2 state growth in a more performant and cost-effective way.” – Zach Howard Senior Staff Engineer, OP Labs

The launch of Google Cloud’s Z3 storage optimized instances with smaller VM shapes represents a leap forward in performance for high-traffic NoSQL environments. In internal tests and customer projects, ScyllaDB has impressively leveraged the advantages of Z3 including extremely low latencies under high read and write loads, high IOPS capacity enabling the processing of massive amounts of data and excellent cost-performance ratio for large-scale production systems. We are very excited to offer Z3 family servers in ScyllaDB Cloud, including Bring Your Own Account (BYOA).” – Avi Kivity, Co-founder and CTO, ScyllaDB

“Shopify has found Z3s to be an excellent platform to build our most performance sensitive storage systems on. We experienced a critical need for both large data volumes while remaining sensitive to latency and throughput on the storage side. While Google has a lot of options, local SSD was really the best fit, and Z3s allowed us to achieve the best price/performance along with enhanced stability appropriate for a source of truth Storage workload. Right now we see these storage optimized VMs as our platform of choice for the future.” – Mattie Toia, VP Infrastructure, Shopify

“Tenderly is built to be your go-to for Web3 production and development, bringing all the necessary infrastructure into one place. This allows teams to operate with speed and confidence, making blockchain technology accessible. We’ve seen impressive results running blockchain workloads on Z3 instances, with a 40% improvement on read latency compared to N2 and N2D instances.” – Ilija Petrovic, SRE Lead, Tenderly

“The VAST AI Operating System gives organizations a unified platform to reason over all of their data – structured, unstructured, and streaming through a global namespace that spans cloud and on-prem environments – enabling intelligent agents and applications to operate with full context and real-time speed. ,For customers running on Google Cloud, Z3 VMs complement this vision by providing the ideal storage infrastructure to accelerate these workloads, ensuring AI pipelines run fast and scale effortlessly in the cloud.” – Renen Hallak, Founder & CEO, VAST Data

Z3 VMs are also the physical foundation of AlloyDB, our flagship PostgreSQL-compatible database service, delivering sophisticated multi-level caching. AlloyDB uses Z3’s expansive local SSDs as an ultra-fast cache, holding datasets up to 25x larger than can be stored in memory. Database queries can access these large, cached datasets at latencies that closely approach in-memory performance, particularly when factoring in overall end-to-end application response times. This is a significant advantage for very large databases, including real-time analytical workloads, as AlloyDB’s high-performance columnar engine operates entirely within this massive cache. AlloyDB on Z3 VMs will soon be available in preview, delivering up to 3x better performance than N-series VMs for transactional workloads, particularly for large datasets.

Enhanced maintenance experience

Z3 instances make it easier for you to plan ahead and schedule maintenance operations at a time of your choosing by providing notice from the system several days in advance of a required maintenance. The new Z3 VMs further enhance the maintenance experience by allowing you to live-migrate an instance during maintenance events for VMs with 18 TiB or less of local SSD storage. For Z3 VMs with 36 TiB of local SSD and for Z3 bare metal instances, you’ll also receive in-place upgrades that preserve your data through the planned maintenance events.

Support for Hyperdisk

Z3 VMs support Hyperdisk, Google Cloud’s workload-optimized block storage that lets you optimize the performance for each workload by independently tuning the storage performance and capacity for each instance.

Z3 VMs are compatible with Hyperdisk Balanced, Hyperdisk Throughput, and Extreme Hyperdisk storage for scalable, high-performance network-attached storage, supporting up to 512 TiB of capacity per instance. For general-purpose workloads, Hyperdisk Balanced, with up to 160K IOPS per instance, offers a mix of performance and cost-efficiency. Hyperdisk Extreme delivers ultra-low latency and supports up to 350K IOPS and 5,000 MiB/s throughput per Z3 VM instance and up 500K IOPS and 10,000 MiB/s throughput for the Z3 bare metal instance — making it well-suited for demanding workloads like databases. Using Hyperdisk for persistent storage and Z3 Local SSD for caching creates an optimal storage architecture for high end databases and mission critical workloads

Get started with Z3 today

Z3 VMs and bare metal instances are available today in most regions worldwide. To start using Z3 instances, select Z3 under the new Storage-Optimized machine family when creating a new VM or GKE node pool in the Google Cloud console. Learn more at the Z3 machine series page. Contact your Google Cloud sales representative for more information on regional availability.

^{1. Results are based on Google Cloud’s internal benchmarking}

Read More for the details.

2025 07 08

GCP – Announcing Vertex AI Agent Engine Memory Bank available for everyone in preview

Tibor Kiss Cloud, Google Cloud gcp

Developers are racing to productize agents, but a common limitation is the absence of memory. Without memory, agents treat each interaction as the first, asking repetitive questions and failing to recall user preferences. This lack of contextual awareness makes it difficult for an agent to personalize their assistance–and leaves developers frustrated.

How we normally mitigate memory problems: So far, a common approach to this problem has been to leverage the LLM’s context window. However, directly inserting entire session dialogues into an LLM’s context window is both expensive and computationally inefficient, leading to higher inference costs and slower response times. Also, as the amount of information fed into an LLM grows, especially with irrelevant or misleading details, the quality of the model’s output significantly declines, leading to issues like “lost in the middle” and “context rot”.

How we can solve it now: Today, we’re excited to announce the public preview of Memory Bank, the newest managed service of the Vertex AI Agent Engine, to help you build highly personalized conversational agents to facilitate more natural, contextual, and continuous engagements. Memory Bank helps us address memory problems in four ways:

Personalize interactions: Go beyond generic scripts. Remember user preferences, key events, and past choices to tailor every response.
Maintain continuity: Pick up conversations seamlessly where they left off across multiple sessions, even if days or weeks have passed.
Provide better context: Arm your agent with the necessary background on a user, leading to more relevant, insightful, and helpful responses.
Improve user experience: Eliminate the frustration of repeating information and create more natural, efficient, and engaging conversations.

aside_block: <ListValue: [StructValue([(‘title’, ‘$300 in free credit to try Google Cloud AI and ML’), (‘body’, <wagtail.rich_text.RichText object at 0x3e5fa0277220>), (‘btn_text’, ‘Start building for free’), (‘href’, ‘http://console.cloud.google.com/freetrial?redirectPath=/vertex-ai/’), (‘image’, None)])]>

Where you can access it: Memory Bank is integrated with the Agent Development Kit (ADK) and Agent Engine Sessions. You can define an agent using ADK, enable Agent Engine Sessions to store and manage conversation history within individual sessions. Now, you can enable Memory Bank to provide long-term memory for agents to store, retrieve, and manage relevant information across multiple sessions. You can also use Memory Bank to manage your memories with other agent frameworks including LangGraph and CrewAI.

Here’s how Memory Bank works

It understands and extracts memories from interactions: Using Gemini models, Memory Bank can analyze a user’s conversation history with the agent (stored in Agent Engine Sessions) to extract key facts, preferences, and context to generate new memories. This happens asynchronously in the background, without you needing to build complex extraction pipelines.
It stores and updates memories intelligently: Key information—like “My preferred temperature is 71 degrees,” or “I prefer aisle seats on flights” — is stored persistently and organized by your defined scope, such as user ID. When new information arises, Memory Bank (using Gemini) can consolidate it with existing memories, resolving contradictions and keeping the memories up to date.
It recalls relevant information: When a user starts a new conversation (session), the agent can retrieve these stored memories. This can be a simple retrieval of all facts or a more advanced similarity search (using embeddings) to find the memories most relevant to the current topic, ensuring the agent is always equipped with the right context.

1 - Memory Bank system — A diagram illustrating how an AI agent uses conversation history from Agent Engine Sessions to generate and retrieve persistent memories about the user from Memory Bank.

This entire process is grounded in Google Research’s novel research method (accepted by ACL 2025), which enables an intelligent, topic-based approach to how agents learn and recall information, setting a new standard for agent memory performance.

Let’s take an example. Imagine you’re a retailer in the beauty industry. You have a personal beauty companion equipped with memory that recommends products and skincare routines.

As shown in the illustration, the agent is able to remember the user’s skin type (maintaining context) even after it evolves over time and be able to make personalized recommendations. This is the power of an agent with long-term memory.

Get started today with Memory Bank

You can integrate Memory Bank into your agent in two primary ways:

Develop an agent with Google Agent Development Kit (ADK) for an out-of-the-box experience
Develop an agent that orchestrates API calls to Memory Bank if you are building your agent with any other framework.

To get started, please refer to the official user guide and the developer blog. For hands-on examples, the Google Cloud Generative AI repository on GitHub offers a variety of sample notebooks, including integration with ADK and deployment to the Agent Engine runtime. For those wishing to try Memory Bank with third-party frameworks, we also provide notebook samples for LangGraph and CrewAI.

If you’re a developer using Agent Development Kit (ADK) but have never used Google Cloud before, you can still start by using our new express mode registration for Agent Engine Sessions and Memory Bank. Here’s how it works:

Sign up with your Gmail account to receive an API key
Use the key to access Agent Engine Sessions and Memory Bank
Build and test your agent within the free tier usage quotas
Seamlessly upgrade to a full Google Cloud project when you are ready for production

If you want to know more about Memory Bank, join the Vertex AI Google Cloud community to share your experiences, ask questions, and collaborate on new projects.

Read More for the details.

2025 07 08

GCP – Google is a Leader in the 2025 Gartner® Magic Quadrant™ for Search and Product Discovery

Tibor Kiss Cloud, Google Cloud gcp

We’re thrilled to announce that Google has been named a Leader in the Gartner® Magic Quadrant™ for Search and Product Discovery.

We believe this recognition affirms Google’s evolving commitment to delivering AI solutions tuned to unique industry challenges, empowering businesses to transform the digital commerce experience and deliver high ROI-generating product discovery, through the use of AI in relevance, ranking, and personalization.

1 - MQ Graphic — Download the complimentary 2025 Gartner Magic Quadrant for Search and Product Discovery

Built upon Google’s deep expertise & knowledge of the way users search, interact with & purchase products across a broad landscape of commerce domains, Vertex AI Search for commerce is a fully-managed, AI-first solution tailored for e-commerce.

Redefining product discovery with generative AI

As digital channels continue to increase in traffic and adoption, businesses in e-commerce need to navigate the challenges involved in providing great digital experiences. Customers expect relevance, convenience, and personalization.

Vertex AI Search for commerce uses the best of Google AI to drive personalized product discovery at scale, optimized for revenue-per visitor. This enables businesses to not only understand user intent, but proactively surface the right products and content to shoppers through a variety of multimodal capabilities that leverage the latest advancements in generative AI.

Search and browse: Powering digital commerce sites and applications with Google-quality capabilities highly tuned for e-commerce and revenue maximization.
Conversational search: Enabling real-time, back and forth conversation, powered by Gemini models to guide users through their shopping journeys, while optimizing for conversion.
Recommendations: Deliver highly personalized recommendations at scale.
Semantic image search: Users can search by image (“find this blouse”) or by a combination of text & image (“find the shoes that would look great with this blouse”).

In particular, our customers have noted our leadership in conversational search through one of our latest offerings, Conversational Commerce, which helps retailers provide assisted shopping on any digital channel, to engage with customers in a more natural and human-like conversation. This includes helping customers find their desired products online or helping a store associate answer questions using data from multiple sources to increase buying confidence for customers worldwide.

aside_block: <ListValue: [StructValue([(‘title’, ‘$300 in free credit to try Google Cloud AI and ML’), (‘body’, <wagtail.rich_text.RichText object at 0x3e5f90eb37c0>), (‘btn_text’, ‘Start building for free’), (‘href’, ‘http://console.cloud.google.com/freetrial?redirectPath=/vertex-ai/’), (‘image’, None)])]>

Intuitive search at the speed of intent, tuned for ecommerce

Search is changing as consumers are leveraging longer queries, seeking more comprehensive answers to questions.

Over the past decade, we’ve made significant strides in understanding user intent and refining our proprietary algorithms, now enhanced with gen AI. This has become increasingly important as user search behavior and the shopping journey have evolved, transforming the retail landscape. Unlike traditional models that are not AI-native, Vertex AI Search for commerce nearly eliminates the need for manual overriding to re-rank results and compensate for suboptimal search quality. Our solution is developed to help businesses, from digitally native e-commerce companies to traditional retailers, adapt to this new era of buying and selling by leveraging AI to maximize revenue.

Access to cutting edge AI/ML models: Vertex AI Search for commerce redefines e-commerce search using powerful AI / ML models like Gemini for unparalleled accuracy and relevance.
Leverage Google’s proprietary intelligence: Vertex AI Search for commerce utilizes Google Shopping’s vast query & click datasets, along with our most advanced knowledge graphs, to train our industry-leading AI Relevance models.
Advanced intent detection: Interprets complex queries and nuances in user intent beyond simple keyword matching, focusing on semantic meaning for intuitive results.
AI-based catalog optimization: Retail catalog search results are enhanced by combining Google Shopping’s effective web crawlers with Google’s extensive understanding of web content and our unique AI models.
Deep and scalable personalization: Analyzes individual shopper behavior, preferences, and historical data to deliver tailored product recommendations and search rankings, boosting satisfaction, conversions, and loyalty.
Managed deployment: Offers a fully managed solution for seamless integration and launch, minimizing engineering overhead.
Flexible customization and control: Businesses can customize the search experience with specific business rules and optimization objectives, aligning with unique KPI goals and other unique metrics.

The future of e-commerce is personalized and simplified

We believe being positioned as a Leader in the Gartner® Magic Quadrant™ for Search and Product Discovery underscores Google’s proven ability to deliver real business value. Vertex AI Search for commerce provides a comprehensive, AI-first solution that guides your customers through the buying journey, ensuring they find exactly what they need, every time.

To download the full 2025 Gartner Magic Quadrant for Search and Product Discovery report, click here, and for more information on Vertex AI Search for commerce, see here or register for our upcoming product roadmap webinar.

^{Gartner, Magic Quadrant for Search and Product Discovery – Mike Lowndes, Noam Dorros, Sandy Shen, Aditya Vasudevan, June 24, 2025}

^{Disclaimer: Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose. This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from Google.}

^{GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally, and MAGIC QUADRANT is a registered trademark of Gartner, Inc. and/or its affiliates and are used herein with permission. All rights reserved.}

Read More for the details.

2025 07 08

GCP – Google Public Sector supports AI-optimized HPC infrastructure for researchers at Caltech

Tibor Kiss Cloud, Google Cloud gcp

For decades, institutions like Caltech, have been at the forefront of large-scale artificial intelligence (AI) research. As high-performance computing (HPC) clusters continue to evolve, researchers across disciplines have been increasingly equipped to process massive datasets, run complex simulations, and train generative models with billions of parameters. With this powerful combination, researchers have accelerated scientific discovery across diverse use cases including genomic analysis, drug discovery, weather forecasting, and beyond.

Modern research workloads, driven by AI and HPC, demand processing of structured and unstructured data at an unprecedented scale, while maintaining sub-millisecond storage latency, enterprise-level security and compliance, and reproducibility despite varying hardware and software configurations. This presents significant technical challenges for both researchers and supporting departments.

To accelerate scientific discovery in the AI era, Google Public Sector has announced it will support AI-optimized HPC infrastructure for researchers at Caltech. This new initiative furthers Caltech’s mission to expand human knowledge and benefit society through research integrated with education.

This support provides Caltech researchers four key resources:

A workhorse of diverse processor types, including Cloud GPUs and Google’s custom-design Arm-based processors (Axion) and Cloud Tensor Processing Units (TPUs) for AI acceleration and intense workloads.
Access to Google’s first party datasets like AlphaFold, Earth Engine, Google Maps Platform and VirusTotal to accelerate discoveries across all disciplines.
A fully-managed, unified AI development platform, Vertex AI Platform, which includes Vertex AI Agent Builder and 200+ first-party (Gemini, Imagen 3), third-party, and open (Gemma) foundation models in Model Garden.
Dedicated campus training and workshops for students, researchers, and supporting teams, enabling them to increase AI literacy and adoption.

Google Public Sector and Caltech will integrate this AI-optimized infrastructure with Caltech’s existing HPC research environments to provide researchers instant access while maintaining their existing data and workloads.

One of the first initiatives that will be powered by this new AI infrastructure will be led by Dr. Babak Hassibi, Mose and Lillian S. Bohn Professor of Electrical Engineering and Computing and Mathematical Sciences at Caltech. Dr. Hassibi’s research focuses on making AI models more efficient, which is crucial for the advancement of the field. Current large language models (LLMs) can have billions or trillions of parameters. In efforts to make models even more efficient and useful. Dr. Hassibi’s proposal uses Vertex AI and has the potential to make AI more accessible and sustainable.

“We will be using Vertex AI to develop training methods on TPUs that incorporate pruning, quantization, and distillation, as well as considerations regarding resilience to attacks, into the training process itself. The former has the potential to significantly reduce inference time costs of the trained models, making AI much more accessible and sustainable. The latter can markedly improve the safety of systems that employ AI. Both will allow AI to move to the edge. In addition to the practical benefits, the work will inform the theoretical studies of AI models, in particular, their generalization performance and the limits of their compressibility, which is a major focus of my research group,” said Dr. Babak Hassibi.

By providing access to advanced AI and planet-scale infrastructure, this support enables Caltech researchers to continue to push scientific boundaries, investigate complex problems, and develop innovative solutions.

“We are living a time when we need to answer bigger questions faster, and do more with less. Google Public Sector is excited to support Caltech to build an AI-optimized infrastructure that will lead scientific discoveries across all domains for the best of all our constituents,” said Reymund Dumlao, Director of State & Local Government and Education at Google Public Sector.

At Google Public Sector, we’re passionate about applying the latest cloud, AI and security innovations to help you meet your mission. Subscribe to our Google Public Sector Newsletter to stay informed and stay ahead with the latest updates, announcements, events and more.

For more information about Caltech’s research heritage and centers, visit https://www.caltech.edu/research

Read More for the details.

2025 07 08

GCP – Formula E accelerates its work with Google Cloud Storage and Google Workspace

Tibor Kiss Cloud, Google Cloud gcp

In the high-speed world of global motorsport, operational efficiency and technological innovation are as critical off the track as they are on it. And when it comes to innovating in the field, Formula E, with its focus on the future of mobility and sustainability, regularly takes the checkered flag.

Formula E orchestrates thrilling races with super-electric-power cars in cities worldwide. A central part of this experience is the management and distribution of immense volumes of media content. Over the course of a decade of high-octane racing, Formula E has accumulated a vast media archive, comprising billions of files and petabytes of data.

This valuable collection of content, previously housed in AWS S3 storage, necessitated a more robust, high-performance, and globally accessible solution to fully harness its potential for remote production and content delivery. These needs became especially acute as AI offers more opportunities to leverage archives and historic data.

At the same time, the organization faced internal operational challenges, including common issues like disconnected systems, which limited collaboration, alongside escalating costs from multiple software subscriptions. Formula E sought a unified solution to foster greater integration and prepare for the future of work.

This led to the widespread adoption of Google Cloud’s ecosystem, including Google Workspace, to address both its large-scale data storage needs and internal collaboration and productivity workflows.

A decade of data and disconnected operations

Formula E’s media archive represented a significant challenge due to its sheer scale. With 10 years of racing history captured, the archive contained billions of files and petabytes of data, making any migration a formidable task.

The demands of its global operations, which include taking production to challenging race locations across the globe, require seamless connectivity and high throughput. The organization had previously encountered difficulties connecting to its S3 storage from remote geographical locations. These experiences prompted successful experiments with Google Cloud, underscoring the critical need for a cloud solution that could deliver consistent performance, even in areas with less reliable internet infrastructure.

Internally, Formula E’s use of disparate systems, including Office 365 and multiple communication tools, resulted in disjointed workflows and limited collaborative capabilities. The reliance on multiple software subscriptions for communication and collaboration was also driving up operational costs. The team needed a more integrated and cost-effective environment that could streamline processes and foster a more collaborative culture, setting the stage for future advancements in workplace technology.

A pitstop in the cloud drives better performance

To address these multifaceted challenges, Formula E embarked on a strategic, two-pronged migration. This involved moving its massive media archive to Google Cloud Storage and transitioning its entire staff to Google Workspace.

For the monumental task of migrating its petabyte-scale archive, Formula E collaborated closely with Google Cloud and Mimir, its media management system provider. Following meticulous planning with Mimir and technical support from Google Cloud, the team chose the Google Cloud Storage Transfer Service (STS).

STS is a fully managed service engineered for rapid data transfer, which crucially required no effort from Formula E’s team during the migration itself. A pivotal element that ensured a smooth transition was Cloud Storage’s comprehensive support for S3 APIs. This compatibility enabled Formula E to switch cloud providers without any interruption to its business, guaranteeing continuity for its critical media operations.

To revolutionize its internal operations, Formula E partnered with Cobry, a Google Workspace migration specialist, for a two-phased migration process. The initial phase focused on migrating Google Drive content, which was soon followed by the full migration to Workspace. Cobry not only helped with the technical migration, the company also provided support for change management and training.

To ensure a smooth transition and encourage strong user adoption, Google champions from across the business received in-person training at Google’s offices. The project successfully migrated 220 staff members along with 9 million emails and calendar invites.

The need for speed, collaboration, and savings

The migration of the media archive to Google Cloud proved as successful as a clever hairpin maneuver at the track.

The Storage Transfer Service performed beyond expectations, moving data out of S3 at an impressive rate of 240 gigabits per second — which translates to approximately 100 terabytes per hour. The entire sprawling archive was transferred in less than a day, a level of throughput that impressed even Google Cloud’s internal teams and confirmed that STS could deliver results at major scale. Such efficiency meant that Formula E experienced no business interruption and maintained continuous access to its valuable media assets.

Beyond the rapid migration, Formula E now benefits significantly from Google Cloud’s global network. By leveraging this infrastructure, the organization enjoys lower latency and higher throughput, which are critical for its remote production studios when operating worldwide.

A core technology behind this enhanced performance is Google Cloud’s “Cold Potato Routing.” This strategy keeps data on Google’s private, internal backbone network for as long as possible, using the public internet for only the shortest final leg of the journey. This approach guarantees improved throughput and latency, effectively resolving the connectivity challenges Formula E previously faced in remote race locations.

Google Cloud’s commitment to an open cloud ecosystem, demonstrated by its full support for S3 APIs, was instrumental in facilitating a smooth transition without vendor lock-in.

The transition to Google Workspace, meanwhile, has transformed Formula E’s internal operations, fostering a more integrated and collaborative work environment. Developed with collaboration at its core, Google Workspace has seamlessly integrated Gemini into the team’s workflow. Initially adopted by office-based teams for automating repetitive tasks, formatting, and summarizing data, Gemini is now accessible across the entire business, empowering staff to work more intelligently.

“Moving from Office 365 to Google Workspace has been a great step for our team in productivity,” said Hayley Mann, chief people officer at Formula E. “The enhanced collaboration features have been really beneficial, and we’re excited about how Google’s integrated Al capabilities will empower and enable our people to work smarter and even more efficiently every day.”

The migration is also delivering financial benefits, as Formula E was also able to replace their usage of Slack and Zoom with Chat and Meet in Google Workspace. Further savings are anticipated as other existing software contracts expire.

Racing toward an open ecosystem

Looking to the future, Formula E is also positioned to realize significant cost savings by using Cloud Storage’s Autoclass feature, which intelligently manages storage classes based on access patterns to optimize costs.

With its petabyte-scale media archive now residing on Google Cloud, Formula E is well positioned to continue innovating in media management, ensuring its content is high-performance, cost-effective, and globally accessible for years to come. This includes leveraging new AI capabilities emerging across the Google ecosystem.

And by embracing integrated AI through Gemini and fostering a truly collaborative environment with Google Workspace, Formula E is accelerating its journey towards peak operational efficiency — mirroring the innovation it champions on the racetrack.

Read More for the details.

2025 07 07

GCP – This migration from Snowflake to BigQuery accelerated model building and cut costs in half

Tibor Kiss Cloud, Google Cloud gcp

In 2024, retail sales for consumer packaged goods were worth $7.5 trillion globally. Their sheer variety — from cosmetics to clothing, frozen vegetables to vitamins — is hard to fathom. And distribution channels have multiplied: Think big box stores in the brick-and-mortar world and mega ecommerce sites online. Most importantly, jury-rigged digital tools can no longer keep pace with the ever-growing web of regulations designed to protect consumers and the environment.

SmarterX uses AI to untangle that web. Our Smarter1 AI model aggregates and triangulates publicly available datapoints — hundreds of millions UPCs and SKUs, as well as product composition and safety information — from across the internet. By matching specific products to applicable regulatory information and continuously updating our models for a particular industry or client, SmarterX helps retailers make fully compliant decisions about selling, shipping, and disposing of regulated consumer packaged goods.

And just like our clients, we needed to accelerate and expand our capabilities to keep pace with that data deluge and build better AI models faster. Migrating to Google Cloud and BigQuery gave us the power, speed, and agility to do so.

Embracing BigQuery: a flexible, easy-to-use, AI-enabled data platform

Because we deal with data from so many sources, we needed a cloud-based enterprise data platform to handle multiple formats and schemas natively. That’s exactly what BigQuery gives us. Since data is the foundation of our company and products, we began by migrating all our data — including the data housed in Snowflake — to BigQuery.

With other data platforms, the data has to be massaged before you can work with it: a time-consuming, often manual process. BigQuery is built to quickly ingest and aggregate data in many different formats, and its query engine allows us to work with data right away in whatever format it lands. Coupled with Looker, we can create easy-to-understand visualizations of the complex data in BigQuery without ever leaving Google Cloud.

In addition, because Gemini Code Assist is integrated with BigQuery, even our less-technical team members can do computational and analytical work.

aside_block: <ListValue: [StructValue([(‘title’, ‘$300 in free credit to try Google Cloud data analytics’), (‘body’, <wagtail.rich_text.RichText object at 0x3e1bb6ef6160>), (‘btn_text’, ‘Start building for free’), (‘href’, ‘http://console.cloud.google.com/freetrial?redirectPath=/bigquery/1’), (‘image’, None)])]>

An integrated tech stack unleashes productivity and creativity

After 10 years in business, SmarterX was also suffering from system sprawl.

Just as migrating data between platforms is inefficient, developers become less efficient when they have to bounce around different tools. Even with the increase in AI agents to help with coding and development, the tools struggle, too: When hopping among multiple systems, they pick up noise along the way. And governing identity and access management (IAM) individually for all those systems was time-consuming and left us vulnerable to potential security risks caused by misapplied access privileges.

Google Cloud provides a fully integrated tech stack that consolidates our databases, data ingestion and processing pipelines, models, compute power — even our email, documents, and office apps — in a single, unified ecosystem. And its LLMs are integrated throughout that ecosystem, from the Chrome browser to the SQL variants themselves. This obviates building custom pipelines for most new data sets and allows us to work more efficiently and coherently:

We’re now releasing new products 10-times faster than we were prior to migrating to Google Cloud.
We onboard new customers in less than a week instead of six months.
Our data pipelines handle 100 times the data they did previously, so we can maintain hundreds of customer deployments with a relatively small staff.

Consolidating on Google Cloud also lowered our overhead by 50% because we deprecated several of other SaaS platforms and teams can easily engage with Google’s tools without specialized expertise. Our entire team now lives in Google Cloud: Not an hour goes by that we aren’t using some form of the platform’s technology.

Eliminating system sprawl also means we no longer need to maintain security protocols for separate platforms. Permissioning and identity and access management are handled centrally, and Google Cloud makes it easy to stay current on compliance requirements like SOC-2.

A vision for AI in tune with our own: Gemini

The value SmarterX provides our customers relies heavily on our platform’s AI-driven capabilities. Finding the right AI model development platform and AI models was therefore one of the driving forces behind our choice of a new data platform. And when it comes to creating AI models, philosophy matters.

Google’s philosophy dovetails with our own because they’ve always been at the forefront of understanding how people want to access information. Since the company’s expertise makes web data searchable on an enterprise scale, its Gemini models are tuned beautifully to do what SmarterX needs them to. Before switching to Vertex AI and Gemini, it took us months to release a new model; we can now do the same work in a matter of weeks.

When SmarterX hires new team members, we look for creative thinkers, not speakers of a specific coding language. And we want to give our developers the brainspace to focus on complex problem-solving rather than puzzling over syntax for SQL coding. Gemini Code Assist in BigQuery is easy to learn and can accurately handle the syntax for them. That leaves our developers more time for finding innovative solutions.

A smooth migration by a team that knows its stuff

We couldn’t have completed our migration without the support of the Google Technical Onboarding Center. They really know their way around their technology and had spelunking tools at the ready for tricky scenarios we encountered along the way.

In less than a month, we migrated terabytes of data from Snowflake to BigQuery: more than 80 databases and thousands of tables from 21 different sources. We used a two-pronged approach that leveraged external tables for rapid access to data and native tables for optimized query performance.

Prior to the migration, Google provided foundational training for managing and operating the Google Cloud Platform. They also took the time to understand SmarterX technology. So instead of being constrained by a cookie-cutter migration plan, the Google team helped us to design and schedule a migration — with minimal disruptions or downtime — in the way that made the most sense for SmarterX and our customers. Google’s expertise in best-practices for security and identity and access management further enhanced the security of our new cloud environment.

Even though we’re not a huge customer pumping petabytes of data through Google Cloud daily, the team treated us as if we were on par with larger organizations. When you’re literally moving the foundation of your entire business, it feels good to know that Google has your back.

Snowflake felt like a traditional enterprise data warehouse grafted onto the cloud, completely uninfluenced by the AI revolution, with a database that forced us to work in a specific, predetermined way. With BigQuery, we have a real information production system: a computing cloud with a built-in SQL-friendly data platform, a wide-ranging toolset, embedded AI and model development, and a single user interface for developing products our own way.

Unlimited imagination, not roadmaps

Many people are surprised when I tell them that SmarterX doesn’t have roadmaps — we make bets. We’re wagering that companies want AI to solve whatever real-world use cases arise. Rather than telling you what to do, AI has the ability to understand things that were previously impossible to understand, and to help people express ideas that were previously impossible to express.

SmarterX works with over 2,000 brands. Ultimately, what they’re purchasing is the speed at which we can help them solve their business challenges with artificial intelligence. In much the same way, Google Cloud is solving our own technology challenges, sometimes before we even know we have them, so we can deliver top-notch products to our customers.

Instead of doing battle with a growing sprawl of outdated technology, BigQuery and the rest of the Google Cloud integrated toolset is allowing us to relentlessly reinvent ourselves. Not a week goes by when I don’t hear someone say, “Oh, wow, we can do that with Google Cloud too?”

^{Company description}^{SmarterX helps retailers, manufacturers, and logistics companies minimize regulatory risk, maximize sales, and protect consumers and the environment by giving them AI-driven tools to safely and compliantly sell, ship, store, and dispose of their products. Its clients include global brands that are household names all across the world.}

Read More for the details.

2025 07 07

GCP – Chat with confidence: Unpacking security in Looker Conversational Analytics

Tibor Kiss Cloud, Google Cloud gcp

The landscape of business intelligence is evolving rapidly, with users expecting greater self-service and natural language capabilities, powered by AI. Looker’s Conversational Analytics empowers everyone in your organization to access the wealth of information within your data. Select the data you wish to explore, ask questions in natural language, as you would a colleague, and quickly receive insightful answers and visualizations that are grounded in truth, thanks to Looker’s semantic layer. This intuitive approach lowers the technical barriers to data analysis and fosters a more data-driven culture across your teams.

How does this work? At its core, Conversational Analytics understands the intent behind your questions. Enhanced by Gemini models, the process involves interpreting your natural language query, generating the appropriate data retrieval logic, and presenting the results in an easy-to-understand format, often as a visualization. This process benefits from Looker’s semantic model, which simplifies complex data with predefined business metrics, so that Gemini’s AI is grounded in a reliable and consistent understanding of your data.

Prioritizing privacy in Gemini

The rise of powerful generative AI models like Gemini brings incredible opportunities for innovation and efficiency. But you need a responsible and secure approach to data. When users use AI tools, questions about data privacy and security are top of mind. How are prompts and data used? Are they stored? Are they used to train the model?

At Google Cloud, the privacy of your data is a fundamental priority when you use Gemini models, and we designed our data governance practices to give you control and peace of mind. Specifically:

Your prompts and outputs are safe

Google Cloud does not train models on your prompts or your company’s data.

Conversational Analytics only uses your data to answer your business questions — making data queries, creating charts and summarizations, and providing answers. We store agent metadata, such as special instructions, to improve the quality of the agent’s answers, and so you can use the same agent in multiple chat sessions. We also store chat conversations so you can pick up where you left off. Both are protected via IAM and not shared with anyone outside your organization without permission.

aside_block: <ListValue: [StructValue([(‘title’, ‘Try Google Cloud for free’), (‘body’, <wagtail.rich_text.RichText object at 0x3e1bb9ae4f40>), (‘btn_text’, ‘Get started for free’), (‘href’, ‘https://console.cloud.google.com/freetrial?redirectPath=/welcome’), (‘image’, None)])]>

Your data isn’t used to train our models

The data processing workflow in Conversational Analytics involves multiple steps:

The agent sees the user’s question, identifies the specific context needed to answer it, and uses tools to retrieve useful context like sample data and column descriptions.
Using business and data context and the user question, the agent generates and executes a query to retrieve the data. The data is returned, and the resulting data table is generated.
Previously gathered information can then be used to create visualizations, text explanations, or suggested follow-up questions. Through this process, the system keeps track of the user’s question, the data samples, and the query results, to help formulate a final answer.
When the user asks a new question, such as a follow-up, the previous context of the conversation helps the agent understand the user’s new intent.

Enhancing trust and security in Conversational Analytics

To give you the confidence to rely on Conversational Analytics, we follow a comprehensive set of best practices within Google Cloud:

Leverage Looker’s semantic layer: By grounding Conversational Analytics in Looker’s semantic model, we ensure that the AI model operates on a trusted and consistent understanding of your business metrics. This not only improves the accuracy of insights but also leverages Looker’s established governance framework.
Secure data connectivity: Conversational Analytics connects to Google Cloud services like BigQuery, which have their own robust security measures and access controls. This helps ensure that your underlying data remains protected.
Use data encryption: Data transmitted to Gemini for processing is encrypted in-transit, safeguarding it from unauthorized access. Agent metadata and conversation history are also encrypted.
Continuously monitor and improve: Our teams continuously monitor the performance and security of Conversational Analytics and Gemini in Google Cloud.

Role-based access controls

In addition, Looker provides a robustrole-based access control (RBAC) framework that Conversational Analytics leverages to offer granular control over who can interact with specific data. When a Looker user initiates a chat with data, Looker Conversational Analytics respects their established Looker permissions. This means they can only converse with Looker Explores to which they already have access. For instance, while the user might have permission to view two Looker Explores, an administrator could restrict conversational capabilities to only one. As conversational agents become more prevalent, the user will only be able to use those agents to which they have been granted access. Agent creators also have the ability to configure the capabilities of Conversational Analytics agents, for example limiting the user to viewing charts while restricting advanced functionalities like forecasting.

Innovate with confidence

We designed Gemini to be a powerful partner for your business, helping you create, analyze, and automate with Google’s most capable AI models. We’re committed to providing you this capability without compromising your data’s security or privacy, or training on your prompts or data. By not storing your prompts, data, and model outputs or using them for training, you can leverage the full potential of generative AI with confidence, knowing your data remains under your control.

By following these security principles and leveraging Google Cloud’s robust infrastructure, Conversational Analytics offers a powerful, insightful experience that is also secure and trustworthy. By making data insights accessible to everyone securely, you can unlock new levels of innovation and productivity in your organization. Enable Conversational Analytics in Looker today, and start chatting with your data with confidence.

Read More for the details.

2025 07 07

GCP – Isolated Recovery Environments: A Critical Layer in Modern Cyber Resilience

Tibor Kiss Cloud, Google Cloud gcp

Written by: Jaysn Rye

Executive Summary

As adversaries grow faster, stealthier, and more destructive, traditional recovery strategies are increasingly insufficient. Mandiant’s M-Trends 2025 report reinforces this shift, highlighting that ransomware operators now routinely target not just production systems but also backups. This evolution demands that organizations re-evaluate their resilience posture. One approach gaining traction is the implementation of an isolated recovery environment (IRE)—a secure, logically separated environment built to enable reliable recovery even when an organization’s primary network has been compromised.

This blog post outlines why IREs matter, how they differ from conventional disaster recovery strategies, and what practical steps enterprises can take to implement them effectively.

The Backup Blind Spot

Most organizations assume that regular backups equal resilience; however, that assumption doesn’t hold up against today’s threat landscape. Ransomware actors and state-sponsored adversaries are increasingly targeting backup infrastructure directly, encrypting, deleting, or corrupting it to prevent recovery and increase leverage.

The M-Trends 2025 report reveals that in nearly half of ransomware intrusions, adversaries used legitimate remote management tools to disable security controls and gain persistence. In these scenarios, the compromise often extends to backup systems, especially those accessible from the main domain.

Initial infection vector — Figure 1: Observed tools in 2024 ransomware-related investigations (source: M-Trends 2025)

In short: your backup isn’t safe if it’s reachable from your production network. During an active incident, that makes it irrelevant.

What Is an Isolated Recovery Environment?

An isolated recovery environment (IRE) is a secure enclave designed to store immutable copies of backups and provide a secure space to validate restored workloads and rebuild in parallel while incident responders carry out the forensic investigation. Unlike traditional disaster recovery solutions, which often rely on replication between live environments, an IRE is logically and physically separated from production.

At its core, an IRE is about assuming a breach has occurred and planning for the moment when your primary environment is lost, ensuring you have a clean fallback that hasn’t been touched by the adversary.

Key Characteristics of an IRE

Separation of infrastructure and access: The IRE must be isolated from the corporate environment. No shared authentication, no shared tooling, no shared infrastructure or services, no persistent network links or direct TCP/IP connections between the production environment and the IRE.
Restricted administrative workflows: Day-to-day access is disallowed. Only break-glass, documented processes exist for access during validation or recovery.
Known-good, validated artifacts: Data entering the IRE must be scanned, verified, and stored with cryptographic integrity checks all while maintaining the isolation controls.
Validation environment and tools: The IRE must also include a secured network environment, which can be used by security teams to validate restored workloads and remove any identified attacker remnants.
Recovery-ready templates: Rather than restoring single machines, the IRE should support the rapid rebuild of critical systems in isolation with predefined procedures.

Implementation Strategy

Successfully implementing an IRE is not a checkbox exercise. It requires coordination between security, infrastructure, identity management, and business continuity teams. The following breaks down the major building blocks and considerations.

Infrastructure Segmentation and Physical Isolation

The foundational principle behind an IRE is separation. The environment must not share any critical infrastructure, identity, network, hypervisors, storage, or other services with the production environment. In most cases, this means:

Dedicated platforms (on-premises or cloud based) and tightly controlled virtualization platforms
No routable paths from production to the IRE network
Physical air-gaps or highly restricted one-way replication mechanisms
Independent DNS, DHCP, and identity services

Figure 2 illustrates the permitted flows into and within the IRE.

Identity and Access Control

Identity is the primary attack vector in many intrusions. An IRE must establish its own identity plane:

No trust relationships to production Active Directory
No shared local or domain accounts
All administrative access must require phishing resistant multi-factor authentication (MFA)
All administrative access should be via hardened Privileged Access Workstations (PAW) from within the IRE
Where possible, implement just-in-time (JIT) access with full audit logging

Accounts used to manage the IRE should not have any ties to the production environment; this includes being used from a device belonging to the production domain. These accounts must be used from a dedicated PAW.

Secure Administration Flows

Administrative access is often the weak link that attackers exploit. That’s why an IRE must be designed with tight control over how it’s managed, especially during a crisis.

In the following model, all administrative access is performed from a dedicated PAW. This workstation sits inside an isolated management zone and is the only system permitted to access the IRE’s core components.

Here’s how it works:

No production systems, including IT admin workstations, are allowed to directly reach the IRE. These paths are completely blocked.
The PAW manages the IRE’s:

Isolated Data Vault, where validated backups are stored.
Management Plane, which includes IRE services such as Active Directory, DNS, PAM, backup, and recovery systems.
Green VLAN, which hosts rebuilt Tier-0 and Tier-1 infrastructure.

Any restored services go first into a yellow staging VLAN, a controlled quarantine zone with no east-west traffic. Systems must be verified clean before moving into the production-ready green VLAN. Remote access to machines in the yellow VLAN is restricted to console only access (hypervisor or iLO consoles) from the PAW. No direct RDP/SSH is permitted.

This design ensures that even during a compromise of the production environment, attackers can’t pivot into the recovery environment. All privileged actions are audited, isolated, and console-restricted, giving defenders a clean space to rebuild from.

Figure 3: Permitted administration paths

One-Way Replication and Immutable Storage

How data enters the IRE is just as important as how it’s managed. Backups that are copied into the data transfer zone must be treated as potentially hostile until proven otherwise.

To mitigate risk:

Data must flow in only one direction, from production to IRE, never the other way around.*
This is typically achieved using data diodes or time-gated software replication that enforces unidirectional movement and session expiry.
Ingested data lands in a staging zone where it undergoes:

Hash verification against expected values.
Malware scanning, using both signature and behavioural analysis.
Cross-checks against known-good backup baselines (e.g., file structure, size, time delta).

Once validated, data is committed to immutable storage, often in the form of Write Once, Read Many (WORM) volumes or cloud object storage with compliance-mode object locking. Keys for encryption and retention are not shared with production and must be managed via an isolated KMS or HSM.

The goal is to ensure that even if an attacker compromises your primary backup system, they cannot alter or delete what’s been stored in the IRE.

*Depending on overall recovery strategies, it’s possible that restored workloads may need to move from the IRE back to a rebuilt production environment.

Recovery Workflows and Drills

An IRE is only useful if it enables recovery under pressure. That means planning and testing full restoration of core services. Effective IRE implementations include:

Templates for rebuilding domain controllers, authentication services, and core applications
Automated provisioning of VMs or containers within the IRE
Access to disaster recovery runbooks that can be followed by incident responders
Scheduled tabletop and full-scale recovery exercises (e.g., quarterly or bi-annually)

Many organizations discover during their first exercise that their documentation is out of date or their backups are incomplete. Recovery drills allow these issues to surface before a real incident forces them into view.

Hash Chaining and Log Integrity

If you’re relying on the IRE for forensic investigation as well as recovery, it’s essential to ensure the integrity of system logs and metadata. This is where hash chaining becomes important.

Implement hash chaining on logs stored in the IRE to detect tampering.
Apply digital signatures from trusted, offline keys.
Regularly verify the chain against trusted checkpoints.

This ensures that during an incident, you can prove not only what happened but also that your evidence hasn’t been modified, either by an attacker or by accident.

Choosing the Right IRE Deployment Model

The right model depends on your environment, compliance obligations, and team maturity.

Model	Advantages	Challenges
On-Premises	Full control, better for air-gapped environments	Higher CapEx, longer provisioning time, less flexibility
Cloud	Faster provisioning, built-in automation, easier to test	Requires strong cloud security maturity and IAM separation
Hybrid	Local speed + cloud resilience; ideal for large orgs with critical workloads	More complex design; requires secure identity split and replication paths

Common Pitfalls

Over-engineering for normal operations: The IRE is not a sandbox. Avoid mission creep.
Using the IRE beyond cyber recovery: The IRE is not for DR testing, HA, or daily operations. Any non-incident use risks breaking its isolation and trust model.
Assuming cloud equals isolation: Isolation requires deliberate configuration. Cloud tenancy is not enough.
Neglecting insider threats: The IRE must defend against sabotage from inside the organization, not just ransomware.

Closing Thoughts

As attackers accelerate and the blast radius of intrusions expands, the need for trusted, tamper-proof recovery options becomes clear. An isolated recovery environment is not just a backup strategy, it is a resilience strategy.

It assumes breach. It accepts that visibility may be lost during a crisis. And it gives defenders a place to regroup, investigate, and rebuild.

The Mandiant M-Trends 2025 report makes it clear; the cost of ransomware isn’t just in ransom paid, but in days or weeks of downtime, regulatory penalties, and reputation loss. The cost of building an IRE is less than a breach, and the peace of mind it offers is far greater.

For deeper technical guidance on building secure recovery workflows or assessing your current recovery posture, Mandiant Consulting offers strategic workshops and assessment services.

Acknowledgment

A special thanks to Glenn Staniforth for their contributions.

Read More for the details.

2025 07 03

AWS – Amazon SNS now supports delivery to Amazon Data Firehose in three additional AWS Regions

Tibor Kiss AWS, Cloud AWS

Amazon Simple Notification Services (Amazon SNS) now supports notification delivery to Amazon Data Firehose endpoints in three additional AWS Regions, Asia Pacific (Taipei), Asia Pacific (Thailand) and Mexico (Central) Regions.

You can now use Amazon SNS to deliver notifications to Amazon Data Firehose (Firehose) endpoints for storage and analysis. Through Firehose delivery streams, customers can deliver events to AWS destinations such as Amazon Simple Storage Service (Amazon S3), Amazon Redshift, and Amazon OpenSearch Service, or to third-party destinations such as Datadog, New Relic, MongoDB, and Splunk.

To get started, see the following resources:

Fanout to Firehose delivery streams in the Amazon SNS Developer Guide.
Create Firehose Stream in the Amazon Data Firehose Developer Guide.
SNS pricing for deliveries to Amazon Data Firehose in the Amazon SNS Pricing Page.

Read More for the details.

2025 07 03

AWS – AWS Fargate now supports SOCI Index Manifest v2 for greater deployment consistency

Tibor Kiss AWS, Cloud AWS

Amazon ECS customers using AWS Fargate launch mode now benefit from improved deployment consistency with SOCI Index Manifest v2 support. Seekable OCI (SOCI) accelerates Amazon ECS task launches by enabling containers to start running before the full container image is downloaded. SOCI Index Manifest v2 uses a cryptographic method to establish an explicit link between the image and its manifest, ensuring integrity and consistency during and across all deployment stages.

To get started, create a SOCI index using the new convert subcommand in the soci CLI, available from the SOCI Snapshotter GitHub repository. Once generated, push the container image along with the SOCI index to your Amazon ECR repository, and use it to launch Amazon ECS tasks on AWS Fargate.

As of today, SOCI Index Manifest v2 is the default mechanism for using SOCI with ECS and Fargate. If you’re still using the legacy Manifest v1 implementation, we recommend upgrading to take advantage of the improved reliability and consistency. For more information, see the documentation on using SOCI Index Manifest v2 with Amazon ECS and AWS Fargate and the blog post.

Read More for the details.

2025 07 03

AWS – Amazon Rekognition Face Liveness launches accuracy improvements and new challenge setting for improved UX

Tibor Kiss AWS, Cloud AWS

Today, AWS announces accuracy improvements and new settings for Amazon Rekognition Face Liveness. Amazon Rekognition Face Liveness is a feature of Amazon Rekognition that detects in real-time whether real users, not bad actors using spoofs, can access your services.

Customers across financial, gig economy, telecommunications, healthcare, and social media use Rekognition Face Liveness detection for workflows such as onboarding, authentication, and bot detection. Until now, Rekognition Liveness only offered a single experience with the ‘FaceMovementAndLightChallenge’ setting, which delivers the highest accuracy by requiring users to move their face toward the screen and hold still for a series of flashing lights. With this launch, the new ‘FaceMovementChallenge’ setting reduces the check time by 3 seconds by eliminating the flashing lights. While ‘FaceMovementAndLightChallenge’ remains the best setting to maximize accuracy, ‘FaceMovementChallenge’ allows customers to prioritize faster liveness checks when appropriate. For additional flexibility, ‘FaceMovementChallenge’, allows users to complete checks using the front or back facing camera. Lastly, this update also delivers improved accuracy across both settings to aid with fraud mitigation.

The new Face Liveness settings are available in all AWS commercial regions where Rekognition Liveness is offered at no additional cost. Customers can enable the ‘FaceMovementChallenge’ setting in the CreateFaceLivenessSession API call.

To get started with the new settings, visit the Amazon Rekognition Face Liveness page or refer to the Amazon Rekognition Developer Guide.

Read More for the details.

2025 07 03

AWS – Amazon Connect launches additional APIs to update and delete cases and related case items

Tibor Kiss AWS, Cloud AWS

Amazon Connect now provides APIs that allow you to delete cases, case comments, undo contact associations, and remove service level agreements (SLAs) from cases. These new capabilities enable you to programmatically remove sensitive customer information from cases or delete cases upon a customer’s request.

Amazon Connect Cases is available in the following AWS regions: US East (N. Virginia), US West (Oregon), Canada (Central), Europe (Frankfurt), Europe (London), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), and Asia Pacific (Tokyo) AWS regions. To learn more and get started, visit the Amazon Connect Cases webpage and documentation.

Read More for the details.

2025 07 03

AWS – Amazon Aurora PostgreSQL database clusters now support up to 256 TiB of storage volume

Tibor Kiss AWS, Cloud AWS

Amazon Aurora PostgreSQL-Compatible Edition now supports a maximum storage limit of 256 TiB, doubling the previous limit of 128 TiB. This enhancement allows customers to store and manage even larger datasets within a single Aurora database cluster simplifying data management for large-scale applications and supporting the growing data needs of modern applications. Customers only pay for the storage they use, with no need for upfront provisioning of the full 256 TiB.

To access the increased storage limit, upgrade your cluster to supported database versions. Once upgraded, Aurora storage will automatically scale up to 256 TiB capacity based on the amount of data in the cluster volume. Visit technical documentation to learn more about supported versions. This new storage volume capacity is available in all AWS regions where Aurora PostgreSQL is available.

Amazon Aurora is designed for unparalleled high performance and availability at global scale with full MySQL and PostgreSQL compatibility. It provides built-in security, continuous backups, serverless compute, up to 15 read replicas, automated multi-Region replication, and integrations with other AWS services. To get started with Amazon Aurora, take a look at our getting started page.

Read More for the details.

2025 07 03

AWS – Amazon Connect now provides enhanced flow designer UI editing features

Tibor Kiss AWS, Cloud AWS

Amazon Connect now provides new editing and accessibility enhancements for the drag-and-drop flow designer making it easier to build customer service experiences. These enhancements include keyboard navigation, auto arranging of blocks, screen reader support, and improved support for high zoom on browsers. Additionally, when editing a flow block in configuration side panel on the flow designer UI, you can view and edit all incoming and outgoing branch connections, create new flow blocks, and review all attached notes. Each of these capabilities can be accessed through new keyboard shortcuts which are visible on the canvas.

To learn more, see the Amazon Connect Administrator Guide. These features are available in all AWS regions where Amazon Connect is available. To learn more about Amazon Connect, the AWS contact center as a service solution on the cloud, please visit the Amazon Connect website.

Read More for the details.

2025 07 03

AWS – Amazon Aurora DSQL is now available in additional AWS Regions

Tibor Kiss AWS, Cloud AWS

Amazon Aurora DSQL is now available in Asia Pacific (Seoul) and supports multi-Region clusters within Asia Pacific Regions – Asia Pacific (Osaka), Asia Pacific (Tokyo), Asia Pacific (Seoul) as well as European Regions – Europe (Ireland), Europe (London), Europe (Paris). Aurora DSQL is the fastest serverless, distributed SQL database with active-active high availability and multi-Region strong consistency. Aurora DSQL enables you to build always available applications with virtually unlimited scalability, the highest availability, and zero infrastructure management. It is designed to make scaling and resilience effortless for your applications and offers the fastest distributed SQL reads and writes.

Aurora DSQL is available in the following AWS Regions: US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Osaka), Asia Pacific (Tokyo), Asia Pacific (Seoul), Europe (Ireland), Europe (London), and Europe (Paris).

Get started with Aurora DSQL for free with the AWS Free Tier. To learn more, visit the Aurora DSQL webpage and documentation.

Read More for the details.

2025 07 03

AWS – Amazon Neptune Graph Explorer Introduces Native Query Support for Gremlin and openCypher

Tibor Kiss AWS, Cloud AWS

Today, we are excited to announce the launch of a new feature in Graph Explorer that enables users to write and execute native Gremlin and openCypher queries directly within the interface.

This enhancement empowers data scientists, developers, and database administrators to seamlessly interact with their graph databases using their preferred query language, eliminating the need for additional tools or interfaces. With this update, users can now leverage the full expressive power of both Gremlin and openCypher to traverse complex relationships, perform advanced pattern matching, and extract valuable insights from their graph data while enjoying the intuitive visual environment of Graph Explorer.

To get started, create a new Notebook from the Amazon Neptune console, and start the Graph Explorer from the Notebook actions menu. You can also contribute to the graph-explorer GitHub project here. For more information on how graph-explorer works with Amazon Neptune, see the Amazon Neptune User Guide.

Read More for the details.

2025 07 03

AWS – Amazon EC2 R7i instances are now available in Asia Pacific (Hyderabad) Region

Tibor Kiss AWS, Cloud AWS

Starting today, Amazon Elastic Compute Cloud (Amazon EC2) R7i instances are available in Asia Pacific (Hyderabad) Region.

Amazon EC2 R7i instances are powered by custom 4th Generation Intel Xeon Scalable processors (code-named Sapphire Rapids), available only on AWS, and offer up to 15% better performance over comparable x86-based Intel processors utilized by other cloud providers.

R7i instances deliver up to 15% better price-performance versus R6i instances. These instances are SAP certified and are a great choice for memory-intensive workloads, such as SAP, SQL and NoSQL databases, distributed web scale in-memory caches, in-memory databases like SAP HANA, and real time big data analytics like Hadoop and Spark. They offer larger instance sizes, up to 48xlarge, and two bare metal sizes (metal-24xl, metal-48xl) for high-transaction and latency-sensitive workloads. These bare-metal sizes support built-in Intel accelerators: Data Streaming Accelerator, In-Memory Analytics Accelerator, and QuickAssist Technology, allowing customers to facilitate efficient offload and acceleration of data operations, and optimize performance for workloads.

R7i instances support the new Intel Advanced Matrix Extensions (AMX) that accelerate matrix multiplication operations for applications such as CPU-based ML. In addition, customers can now attach up to 128 EBS volumes to an R7i instance (vs 28 EBS volume attachments on R6i). This allows processing of larger amounts of data, scale workloads, and improve performance over R6i instances.

To learn more, visit Amazon EC2 R7i Instances.

Read More for the details.

2025 07 02

AWS – Amazon S3 Express One Zone now supports tags for cost allocation and attribute-based access control

Tibor Kiss AWS, Cloud AWS

Amazon S3 Express One Zone, a high performance S3 storage class, now supports tags for cost allocation and attribute-based access control (ABAC). You can add tags to S3 directory buckets to track and organize AWS costs using AWS Billing and Cost Management. Additionally, with ABAC support, you can extend your tag-based access control to new and existing users, roles, and directory buckets. This helps eliminate frequent AWS Identity and Access Management (IAM) or S3 bucket policy updates, simplifying how you scale access governance.

S3 Express One Zone supports tags on directory buckets in all AWS Regions where the storage class is available. You can get started with tagging using the AWS Management Console, S3 REST API, AWS CLI, or the AWS SDK. To learn more about using tags to simplify cost allocation or ABAC, visit the S3 User Guide.

Read More for the details.