Amazon VPC Route Server enhances connectivity monitoring and logging with new network metrics. These metrics allow customers to proactively monitor network health, troubleshoot network issues and gain visibility into route propagation and peer connectivity.
Previously, customers would manually track Border Gateway Protocol (BGP) and Bidirectional Forwarding Detection (BFD) session changes and often required AWS Support assistance for network troubleshooting. With these new logging capabilities, customers can independently monitor and diagnose network connectivity issues, leading to faster resolution times and improved network visibility. The enhancement offers real-time monitoring of BGP and BFD session states, historical peer-to-peer session data logging and flexible log delivery options through CloudWatch, S3, Data Firehose or AWS CLI.
Enhanced logging for VPC Route Server is available in all AWS commercial regions where VPC Route Server is supported. Data charges for vended logs will apply for Amazon CloudWatch, Amazon S3, and Amazon Data Firehose. To learn more about this feature, please refer to our documentation.
Amazon Web Services (AWS) announces the availability of Amazon EC2 I4i Instances in the AWS Europe (Spain) region. Designed for large I/O intensive storage workloads, these instances are powered by 3rd generation Intel Xeon Scalable processors (Ice Lake) and deliver high local storage performance within Amazon EC2 using 2nd generation AWS Nitro NVMe SSDs.
I4i instances offer up to 30 TB of NVMe storage from AWS Nitro SSDs and a Torn Write Prevention (TWP) feature that can improve database transactions per second without compromising resiliency. These instances deliver high performance and are ideal for databases such as MySQL, Oracle DB, and Microsoft SQL Server, and NoSQL databases such as MongoDB, Couchbase, Aerospike, and Redis, where low-latency local NVMe storage is needed to meet application service level agreements (SLAs). I4i instances offer up to 75 Gbps of network bandwidth and up to 40 Gbps of bandwidth to Amazon Elastic Block Store (Amazon EBS).
Customers can purchase I4i instances as On-Demand Instances, Reserved Instances, or Spot Instances, or through Savings Plans.
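As a quick illustration, a hypothetical AWS CLI launch of an I4i instance in the Europe (Spain) Region (eu-south-2) might look like the following; the AMI ID, key pair name, and instance size are placeholders.

```bash
# Launch an On-Demand i4i.2xlarge in Europe (Spain); values below are placeholders.
aws ec2 run-instances \
  --region eu-south-2 \
  --instance-type i4i.2xlarge \
  --image-id ami-0123456789abcdef0 \
  --key-name my-key-pair \
  --count 1
```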
Today, AWS Entity Resolution introduces near real-time rule-based matching to enable customers to match new and existing records within seconds.
With this launch, organizations can match records in near real-time to support use cases across multiple industries that require low-latency, time-sensitive matching on consumer records. For example, travel and hospitality companies can match consumer records in near real-time to prioritize callers in contact centers, recognize loyal customers, deliver tailored product recommendations, and personalize guest check-in experiences. Healthcare organizations can match patient records in near real-time, with appropriate patient consent, to provide clinicians with a complete view of medical history and improve care coordination. Additionally, financial institutions can match new and historical financial transactions within seconds to identify discrepancies and detect fraudulent transactions. When processing incremental records using rule-based matching, AWS Entity Resolution compares the incoming record against existing records, returns a match using a consistent match ID, or generates a new match ID, all within seconds.
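As a rough sketch of the incremental flow described above, the AWS CLI call below looks up the consistent match ID for a single incoming record against an existing rule-based matching workflow. The workflow name and record attributes are hypothetical; check the AWS Entity Resolution API reference for the exact operation and parameters used for near real-time processing.

```bash
# Hypothetical: resolve a single incoming record against a rule-based workflow
# and retrieve its consistent match ID within seconds.
aws entityresolution get-match-id \
  --workflow-name my-rule-based-workflow \
  --record '{"email": "jane.doe@example.com", "phone": "+1-555-0100", "last_name": "Doe"}'
```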
AWS Entity Resolution helps organizations match, link, and enhance related customer, product, business, or healthcare records stored across multiple applications, channels, and data stores. You can get started in minutes using matching workflows that are flexible, scalable, and can seamlessly connect to your existing applications, without any expertise in entity resolution or ML. AWS Entity Resolution is generally available in these AWS Regions. To learn more, visit AWS Entity Resolution.
Many organizations in regulated industries and the public sector that want to start using generative AI face significant challenges in adopting cloud-based AI solutions due to stringent regulatory mandates, sovereignty requirements, the need for low-latency processing, and the sheer scale of their on-premises data. Together, these can all present institutional blockers to AI adoption, and force difficult choices between using advanced AI capabilities and adhering to operational and compliance frameworks.
GDC Sandbox can help organizations harness Google’s gen AI technologies while maintaining control over data, meeting rigorous regulatory obligations, and unlocking a new era of on-premises AI-driven innovation. With flexible deployment models, a robust security architecture, and transformative AI applications like Google Agentspace search, GDC Sandbox enables organizations to accelerate innovation, enhance security, and realize the full potential of AI.
Secure development in isolated environments
For sovereign entities and regulated industries, a secure Zero Trust architecture via platforms like GDC Sandbox is a prerequisite for leveraging advanced AI. GDC Sandbox lets organizations implement powerful use cases — from agentic automation and secure data analysis to compliant interactions — while upholding sovereign Zero Trust mandates for security and compliance.
“GDC Sandbox provides Elastic with a unique opportunity to enable air-gapped gen AI app development with Elasticsearch, as well as enable customers to rapidly deploy our Security Incident & Event Management (SIEM) capabilities.” – Ken Exner, Chief Product Officer, Elastic
“Accenture is excited to offer Google Distributed Cloud air-gapped to customers worldwide as a unique solution for highly secure workloads. By using GDC Sandbox, an emulator for air-gapped workloads, we can expedite technical reviews, enabling end-customers to see their workloads running in GDC without the need for lengthy proofs of concept on dedicated hardware.” – Praveen Gorur, Managing Director, Accenture
Air-gapped environments are challenging
Public sector agencies, financial institutions, and other organizations that handle sensitive, secret, and top-secret data are intentionally isolated (air-gapped) from the public internet to enhance security. This physical separation prevents cyberattacks and unauthorized data access from external networks, helping to create a secure environment for critical operations and highly confidential information. However, this isolation significantly hinders the development and testing of cutting-edge technologies. Traditional air-gapped development often requires complex hardware setups, lengthy procurement cycles, and limits access to the latest tools and frameworks. These limitations hinder the rapid iteration cycles essential to development.
Video Analysis Application Built on GDC Sandbox
According to Gartner® analyst Michael Brown in the recent report U.S. Federal Government Context: Magic Quadrant for Strategic Cloud Platform Services, where Google Cloud is evaluated as a Notable Vendor, “Federal CIOs will need to consider cost and feature availability in selecting a GCC [government community cloud] provider. Careful review of available services within the compliance scope is necessary. A common pitfall is the use of commercially available services in early solution development and subsequently finding that some of those services are not available in the target government community environment. This creates technical debt requiring refactoring, which results in delays and additional expense.”
GDC Sandbox: A virtualized air-gapped environment
GDC Sandbox addresses these challenges head-on. This virtual environment emulates the experience of GDC air-gapped, allowing you to build, test, and deploy gen AI applications using popular development tools and CI/CD pipelines. With it, you don’t need to procure hardware or set up air-gapped infrastructure to test applications with stringent security requirements before moving them to production. Customers can leverage Vertex AI APIs for key integrations with GDC Sandbox – AI Optimized, including:
GPUs: Dedicated user-space GPUs for gen AI development
Interacting with GDC Sandbox
One of the things that sets GDC Sandbox apart is its consistent user interface. As seen above, developers familiar with Google Cloud will find themselves in a comfortable and familiar environment, which helps streamline the development process and reduces the learning curve. This means you can jump right into building and testing your gen AI applications without missing a beat.
“GDC Sandbox has proven to be an invaluable tool to develop and test our solutions for highly regulated customers who are looking to bring their air-gapped infrastructures into the cloud age.” – David Olivier, Defense and Homeland Security Director, Sopra Steria Group
“GDC Sandbox provides a secure playground for public sector customers and other regulated industries to prototype and test how Google Cloud and AI can solve their unique challenges. By ensuring consistency with other forms of compute, we simplify development and deployment, making it easier for our customers to bring their ideas to life. We’re excited to see how our customers use the GDC Sandbox to push the boundaries of what’s possible.” – Will Grannis, VP & CTO, Google Cloud
The GDC Sandbox architecture and experience
GDC Sandbox offers developers a familiar and intuitive environment by mirroring the API, UI, and CLI experience of GDC air-gapped and GDC air-gapped appliance. It offers a comprehensive suite of services, including virtual machines, Kubernetes clusters, storage, observability, and identity management. This allows developers to build and deploy a wide range of gen AI applications, and leverage the power of Google’s AI and machine learning expertise within a secure, dedicated environment.
GDC Sandbox – Product Architecture
Use cases for GDC Sandbox
GDC Sandbox offers numerous benefits for organizations with air-gapped environments. Let’s explore some compelling use cases:
Gen AI development: Develop and test Vertex AI and gen AI applications on GPUs, cost-effectively validating them for secure production environments.
Partner enablement: Empower partners to build applications, host GDC Marketplace offerings, train personnel, and prepare services for production.
Training and proof of concepts: Provide hands-on training for developers and engineers on GDC air-gapped technologies and best practices. Deliver ground-breaking new capabilities and showcase the art of the possible for customers and partners.
Building applications in GDC Sandbox
GDC Sandbox leverages containers and Kubernetes to host your applications. To get your application up and running, follow these steps:
Build and push: Build your application image locally using Docker and ensure your Dockerfile includes all necessary dependencies. Tag your image with the Harbor instance URI and push it to the provided Harbor repository (see the sketch after these steps).
Deploy with Kubernetes: Create a Kubernetes deployment YAML file that defines your application’s specifications, including the Harbor image URI and the necessary credentials to access the image. Apply this file using the kubectl command-line tool to deploy your application to the Kubernetes cluster within the Sandbox.
Expose and access: Create a Kubernetes service to expose your application within the air-gapped environment. Retrieve the service’s external IP using kubectl get svc to access your application.
Migrate and port: Move your solutions from GDC Sandbox to GDC air-gapped and appliance deployments.
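As a concrete illustration of the build, deploy, and expose steps above, here is a minimal, hypothetical sequence; the Harbor URI, project, image name, and manifest file are placeholders rather than actual GDC Sandbox values.

```bash
# 1. Build and push (Harbor URI and project are placeholders).
docker build -t my-app:v1 .
docker tag my-app:v1 harbor.example.gdc.internal/my-project/my-app:v1
docker push harbor.example.gdc.internal/my-project/my-app:v1

# 2. Deploy with Kubernetes (my-app.yaml references the Harbor image URI
#    and an imagePullSecret with credentials for the registry).
kubectl apply -f my-app.yaml

# 3. Expose and access: create a Service, then read its external IP.
kubectl expose deployment my-app --type=LoadBalancer --port=80 --target-port=8080
kubectl get svc my-app
```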
U.S. Federal Government Context: Magic Quadrant for Strategic Cloud Platform Services, By Michael Brown, 3 February 2025
GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally, and MAGIC QUADRANT is a registered trademark of Gartner, Inc. and/or its affiliates and are used herein with permission. All rights reserved. Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
As the first fully cloud-native private bank in Switzerland, Alpian stands at the forefront of digital innovation in the financial services sector. With its unique model blending personal wealth management and digital convenience, Alpian offers clients a seamless, high-value banking experience.
Through its digital-first approach built on the cloud, Alpian has achieved unprecedented agility, scalability, and compliance capabilities, setting a new standard for private banking in the 21st century. In particular, its use of generative AI gives us a glimpse of the future of banking.
The Challenge: Innovating in a Tightly Regulated Environment
The financial industry is one of the most regulated sectors in the world, and Switzerland’s banking system is no exception. Alpian faced a dual challenge: balancing the need for innovation to provide cutting-edge services while adhering to stringent compliance standards set by the Swiss Financial Market Supervisory Authority (FINMA).
Especially when it came to deploying a new technology like generative AI, the teams at Alpian and Google Cloud knew there was virtually no room for error.
Tools like Gemini have streamlined traditionally complex processes, allowing developers to interact with infrastructure through simple conversational commands. For instance, instead of navigating through multiple repositories and manual configurations, developers can now deploy a new service by simply typing their request into a chat interface.
This approach not only accelerates deployment times — reducing them from days to mere hours — it also empowers teams to focus on innovative work rather than repetitive tasks.
There are limits, to be sure, both to ensure security and compliance and to maintain focus on the part of teams.
Thanks to this platform with generative AI, we haven’t opened the full stack to our engineers, but we have created a defined scope where they can interact with different elements of our IT using a simplified conversational interface. It’s within these boundaries that they have the ability to be autonomous and put AI to work.
Faster deployment times translate directly into better client experiences, offering quicker access to new features like tailored wealth management tools and enhanced security. This integration of generative AI has not only optimized internal workflows but also set a new benchmark for operational excellence in the banking sector.
A Collaborative Journey to Success
Alpian worked closely with its team at Google Cloud to find just the right solutions to meet its evolving needs. Through strong trust, dedicated support, and expertise, they were able to optimize infrastructure, implement scalable solutions, and leverage AI-powered tools like Vertex AI and BigQuery.
“Google Cloud’s commitment to security, compliance, and innovation gave us the confidence to break new ground in private banking,” Damien Chambon, head of cloud at Alpian, said.
Key Results
Alpian’s cloud and AI work has already had a meaningful impact on the business:
Enhanced developer productivity with platform engineering, enabling more independence and creativity within teams.
Automated compliance workflows, aligning seamlessly with FINMA’s rigorous standards.
Simplified deployment processes, reducing infrastructure complexity with tools like Gemini.
These achievements have enabled Alpian to break down traditional operational silos, empowering cross-functional teams to work in harmony while delivering customer-focused solutions.
Shaping the Future of Private Banking
Alpian’s journey is just beginning. With plans to expand its AI capabilities further, the bank is exploring how tools like machine learning and data analytics can enhance client personalization and operational efficiency. By leveraging insights from customer interactions and integrating them with AI-driven workflows, Alpian aims to refine its offerings continually and remain a leader in the competitive digital banking space.
By aligning technological advancements with regulatory requirements, Alpian is creating a model for the future of banking — one where agility, security, and customer-centricity can come together seamlessly and confidently.
As an AI/ML developer, you have a lot of decisions to make when it comes to choosing your infrastructure — even if you’re running on top of a fully managed Google Kubernetes Engine (GKE) environment. While GKE acts as the central orchestrator for your AI/ML workloads — managing compute resources, scaling your workloads, and simplifying complex workflows — you still need to choose an ML framework, your preferred compute (TPUs or GPUs), a scheduler (Ray, Kueue, Slurm), and how you want to scale your workloads. By the time you have to configure storage, you’re facing decision fatigue!
You could simply choose Google’s Cloud Storage for its size, scale, and cost efficiency. However, Cloud Storage may not be a good fit for all use cases. For instance, you might benefit from a storage accelerator in front of Cloud Storage, such as Hyperdisk ML, for faster model weight load times. But to benefit from this acceleration, you would need to develop custom workflows to orchestrate data transfer across storage systems.
Introducing GKE Volume Populator
GKE Volume Populator is targeted at organizations that want to store their data in one data source and let GKE orchestrate the data transfers. To achieve this, GKE leverages the Kubernetes Volume Populator feature through the same PersistentVolumeClaim API that customers use today.
GKE Volume Populator, along with the relevant CSI drivers, dynamically provisions a new destination storage volume and transfers data from your Cloud Storage bucket to the destination storage volume. Your workload pods then wait to be scheduled until the data transfer is complete.
Using GKE Volume Populator provides a number of benefits:
Low management overhead: As part of a managed solution that’s enabled by default, GKE Volume Populator handles the data transfer, so you don’t need to build a bespoke solution for data hydration; you can leave it to GKE.
Optimized resource utilization: Because your workload pods are not scheduled until the data transfer completes, your GPUs/TPUs remain free for other tasks while data is being transferred.
Easy progress tracking: Monitor the data transfer progress by checking the event message on your PVC object.
Customers like Abridge AI report that GKE Volume Populator is helping them streamline their AI development processes.
“Abridge AI is revolutionizing clinical documentation by leveraging generative AI to summarize patient-clinician conversations in real time. By adopting Google Cloud Hyperdisk ML, we’ve accelerated model loading speeds by up to 76% and reduced pod initialization times. Additionally, the new GKE Volume Populator feature has significantly streamlined access to large models and LoRA adapters stored in Cloud Storage buckets. These performance improvements enable us to process and generate clinical notes with unprecedented efficiency — especially during periods of high clinician demand.” – Taruj Goyal, Software Engineer, Abridge
Accelerate your data via Hyperdisk ML
Let’s say you have an AI/ML inference workload with data stored in a Cloud Storage bucket, and you want to move that data to a Hyperdisk ML instance to accelerate the loading of model weights, scale up to 2,500 concurrent nodes, and reduce pod over-provisioning. Here’s how to do this with GKE Volume Populator:
1. Prepare your GKE Cluster: Create a GKE cluster with the corresponding CSI driver, and enable Workload Identity Federation for GKE.
2. Set up necessary permissions: Configure permissions so that GKE Volume Populator has read access to your Cloud Storage bucket.
3. Define your data source: Create a GCPDataSource resource (see the sketch after these steps). This specifies:
The URL of the Cloud Storage bucket that contains your data
The Kubernetes Service Account you created with read access to the bucket
4. Create your PersistentVolumeClaim: Create a PVC that refers to the GCPDataSource you created in step 3 and the corresponding StorageClass for the destination storage.
5. Deploy Your AI/ML workload: Create your inference workload with the PVC. Configure this workload to use the PVC you created in step 4.
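To make steps 3 and 4 concrete, here is a minimal sketch of the two resources. The apiVersion, field names, and StorageClass name are illustrative assumptions; check the GKE Volume Populator documentation for the exact schema.

```bash
# Hypothetical manifests for a GCPDataSource and a PVC that references it.
kubectl apply -f - <<'EOF'
apiVersion: datalayer.gke.io/v1   # assumption: verify the CRD group/version in the GKE docs
kind: GCPDataSource
metadata:
  name: model-weights-source
spec:
  cloudStorage:
    uri: gs://my-model-bucket          # bucket that holds your model weights
    serviceAccountName: gke-vp-reader  # Kubernetes SA with read access to the bucket
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: model-weights-pvc
spec:
  accessModes: ["ReadOnlyMany"]
  storageClassName: hyperdisk-ml       # StorageClass for the destination Hyperdisk ML volume
  resources:
    requests:
      storage: 300Gi
  dataSourceRef:                       # points the PVC at the GCPDataSource above
    apiGroup: datalayer.gke.io
    kind: GCPDataSource
    name: model-weights-source
EOF
```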
GKE Volume Populator is generally available, and support for Hyperdisk ML is in preview. To enable it in your console, reach out to your account team.
“There’s no red teaming on the factory floor” isn’t an OSHA safety warning, but it should be — and for good reason. Adversarial testing in most, if not all, manufacturing production environments is prohibited because the safety and productivity risks outweigh the value.
If resources were not a constraint, a security team could build another factory with identical equipment and systems and use it to conduct proactive security testing. Almost always, the costs outweigh the benefits, and most businesses simply cannot support the expense.
This is where digital twins can help. Digital twins are essentially IT stunt doubles, cloud-based replicas of physical systems that use real-time data to create a safe environment for security and resilience testing. The digital twin environment can be used to test for essential subsystem interactions and repercussions as the systems transition from secure states to insecure states.
Security teams can operationalize digital twins and resilience analysis using the following approach:
Gain a deep understanding of the correlations between the leading indicators of cyber resilience and the role of digital twins in becoming resilient. The table below offers this mapping.
Get buy-in from business leaders, including the CMO, CIO, and CTO. Security teams should be able to demonstrate the strategic value to the organization by using digital twins for adversarial security testing without disrupting production.
Identify the right mix of engineers and security experts, as well as appropriate technologies to execute the strategy. Google Cloud’s security and infrastructure stack is positioned to help security teams achieve operational digital twins for security (see table below).
Cyber resilience leading indicator | Role of digital twins
Hard-restart recovery time: Simulate various system failure scenarios on the digital twins and discover subsequent rebuild processes. Identify areas of improvement, optimal recovery procedures, and bottlenecks.
Cyber-physical modularity: Use digital twins to quantify the impact of single points of failure on the overall production process. Use the digital twin environment to measure metrics such as the mean operational capability of a service in a degraded state, and to track the number of modules impacted by each single point of failure.
Internet denial and communications resilience: Simulate the loss of internet connectivity to the digital twins and measure the proportion of critical services that continue operating successfully. Assess the effectiveness of the backup communication systems and the response speed. This process can also be applied to the twins of non-internet-facing systems.
Manual operations: Disrupt the automation controls on the digital twins and measure the degree to which simulated manual control can sustain a minimum viable operational delivery objective. Incorporate environmental and operational constraints, such as the time personnel need to take manual control.
Control pressure index (CPI): Model the enablement of security controls and dependencies on the digital twins to calculate CPI. Then, simulate failures of individual controls or a combination of controls to assess the impact. Discover defense-in-depth improvement opportunities.
Software reproducibility: Not applicable.
Preventative maintenance levels: Explore and test simulated failures to optimize and measure preventative maintenance effectiveness. Simulate the impact of maintenance activities and downtime reduction, and evaluate return on investment (ROI).
Inventory completeness: Inventory completeness will become apparent during the digital twin construction process.
Stress-testing vibrancy: Conduct red teaming, apply chaos engineering principles, and stress test the digital twin environment to assess the overall impact.
Common mode failures: In the twin environment, discover and map critical dependencies and identify potential common mode failures that could impact the production process. In a measurable manner, identify and test methods of reducing the risk of cascading failures during disruption events.
What digital twins architecture can look like with Google Cloud
To build an effective digital twin, the physics of the electrical and mechanical systems must be represented with sufficient accuracy.
The data needed for the construction of the twin should either come from the physical sensors or computed using mathematical representations of the physical process. The twin should be modeled across three facets:
Subsystems: Modeling the subsystems of the system, and pertinent interactions between the subsystems (such as a robotic arm, its controller, and software interactions).
Networks: Modeling the network of systems and pertinent interactions (such as plant-wide data flow and machine-to-machine communication).
Influencers: Modeling the environmental and operational parameters, such as temperature variations, user interactions, and physical anomalies causing system and network interruptions.
Developing digital twins in diverse OT environments requires secure data transmission, compatible data storage and processing, and digital engines using AI, physics modeling, applications, and visualization. This is where comprehensive end-to-end monitoring, detection, logging, and response processes using tools such as Google Security Operations and partner solutions come in.
The following outlines one potential architecture for building and deploying digital twins with Google Cloud:
Compute Engine to replicate physical systems on a digital plane
Cloud Storage to store data, simulate backup and recovery
Cloud Monitoring to emulate on-prem monitoring and evaluate recovery process
BigQuery to store, query, and analyze the data streams received from MDE and to perform post-mortem analysis of adversarial testing
Spanner Graph and partner solutions such as neo4j to build and enumerate the industrial process based on graph-based relationship modeling
Machine learning services (including Vertex AI, Gemini in Security, and partner models through Vertex AI Model Garden) to rapidly generate relevant failure scenarios and discover opportunities for secure, customized production optimization. Similarly, use Vision AI tools to enhance the digital twin environment, bringing it closer to the real-world physical environment.
Cloud Run functions, a serverless compute platform that can run failure-event-driven code and trigger actions based on digital twin insights
Looker to visualize and create interactive dashboards and reports based on digital twin and event data
Apigee to securely expose and manage APIs for the digital twin environment. This allows for controlled access to real-time data from on-prem OT applications and systems. For example, Apigee can manage APIs for accessing building OT sensor data, controlling HVAC systems, and integrating with third-party applications for energy management.
Google Distributed Cloud to run digital twins in an air-gapped, on-premises, containerized environment
An architectural reference for building and deploying digital twins with Google Cloud.
Security and engineering teams can use the above Google Cloud services illustration as a foundation and customize it to their specific requirements. While building and using digital twins, both security of the twins and security by the twins are critical. To ensure that the lifecycle of the digital twins is secure, cybersecurity hardening, logging, monitoring, detection, and response should be at the core of the design, build, and execution processes.
This structured approach enables modelers to identify essential tools and services, define in-scope systems and their data capabilities, map communication and network routes, and determine applications needed for business and engineering functions.
Getting started with digital twins
Digital twins are a powerful tool for security teams. They help us better understand and measure cyber-physical resilience through the safe application of its leading indicators. They also allow for adversarial testing and analysis of subsystem interactions, and of the effects of systems moving between secure and insecure conditions, without compromising safety or output.
Security teams can begin using Google Cloud right away to build and scale digital twins for security:
1. Identify the purpose and function that security teams would like to simulate, monitor, optimize, design, and maintain for resilience.
2. Select the right physical or industrial object, system, or process to replicate as the digital twin.
3. Identify pertinent data flows, interfaces, and dependencies for data collection and integration.
4. Be sure to understand the available IT and OT, cloud, and on-premises telemetry across the physical or industrial object, system, or process.
5. Create the virtual model that accurately represents its physical counterpart in all necessary aspects.
6. Connect the replica to its physical counterpart to facilitate real-time data flow to the digital twin. Use a secure on-premises connector such as MDE to make the secure connection between the physical and digital environments running on Google Cloud VPC.
7. To operationalize the digital twin, build the graph-based entity relationship model using Spanner Graph and partner solutions like neo4j. This uses the live data stream from the physical system and represents it on the digital twin.
8. Use a combination of Cloud Storage and BigQuery to store discrete and continuous IT and OT data such as system measurements, states, and file dumps from the source and digital twin.
9. Discover common mode failures based on the mapped processes, including internal and external dependencies.
10. Use at least one leading indicator with Google Threat Intelligence to perform threat modeling and evaluate the impact on the digital twin model.
11. Run Google’s AI models on the digital twins to further advance the complexity of cyber-resilience studies.
12. Look for security and observability gaps. Improve model fidelity. Recreate and update the digital twin environment. Repeat step 10 with a new leading indicator, new threat intelligence, or an updated threat model.
13. Based on the security discoveries from the resilience studies on the digital twin, design and implement security controls and risk mitigations in the physical counterpart.
To learn more about how to build a digital twin, you can read this ebook chapter and contact Google Cloud’s Office of the CISO.
In today’s digital world, we spend countless hours in our browsers. It’s where we work, collaborate, and access information. But have you ever stopped to consider whether you’re fully leveraging the browser security features available to protect your organization? We explore this in our new paper “The Security Blindspot: Real Attack Insights From Real Browser Attacks,” and the answer might surprise you.
Written in partnership with Mandiant Incident Response experts, the new paper highlights how traditional security measures often overlook available security features within the browser, leaving organizations vulnerable to sophisticated attacks that could be prevented with additional browser security policies. Phishing, data breaches, insider threats, and malicious browser extensions are just some of the risks. Attackers are increasingly using legitimate browser features to trick users and carry out their malicious activities, making them harder to detect.
The paper delves into real-world case studies where increased browser security could have prevented significant security breaches and financial losses. These examples underscore the urgent need for organizations to adopt proactive and comprehensive security strategies within the browser.
Key takeaways from the report include:
Browsers are a major entry point for attacks: Attackers exploit users working on the web to launch advanced attacks.
Traditional security often overlooks the browser: Focusing solely on network and endpoint security leaves a significant gap.
Real-world attacks demonstrate the risks: Case studies reveal the consequences of neglecting security at the browser layer.
Advanced threat and data protection within the browser is essential: Solutions like Chrome Enterprise Premium can help mitigate these risks.
Browser insights for your security teams: Leverage telemetry and advanced browser data to provide a detailed view of your environment, identify risks and enable proactive measures to protect data.
Organizations that don’t take advantage of security within the browser are open to an array of threats, including phishing, data breaches, insider attacks, and malicious browser extensions, making robust browser protection essential. Don’t let your unprotected browser be your biggest security blind spot. To learn more about how to protect your organization from browser-based attacks, read the full whitepaper.
AWS Private Certificate Authority (AWS Private CA) now supports Active Directory (AD) child domains through the Private CA Connector for AD. With this feature, customers get a consistent experience using AWS Private CA across parent and child AD domains. AD administrators can issue certificates to users, computers, and devices in a child domain independently of the parent domain and other child domains. This feature works with on-premises and self-hosted AD deployments that are connected to AWS through AWS Directory Service AD Connector.
Private CA Connector for AD allows you to replace your certificate authorities (CAs) with AWS Private CA, a highly-available, fully-managed cloud CA that secures private key material using hardware security modules (HSMs). Connector for AD supports auto-enrollment to ensure AD domain-joined users, computers, and devices get and maintain valid certificates automatically. In addition to Connector for AD, AWS Private CA provides connectors that enable integration with Kubernetes clusters and enterprise mobile device management (MDM) solutions.
AD child domain support is available in all regions where both AWS Private CA Connector for AD and AWS Directory Service are available. To learn more about using AWS Private CA with Active Directory child domains, visit the AWS Private CA User Guide.
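As a rough sketch of how a connector is wired up with the AWS CLI, the call below creates a Connector for AD against a private CA and an AD Connector directory; the ARN, directory ID, and security group are placeholders, and the exact parameters should be confirmed in the pca-connector-ad CLI reference.

```bash
# Hypothetical: create a Private CA Connector for AD for a directory
# (the same call applies to a child domain's AD Connector directory).
aws pca-connector-ad create-connector \
  --certificate-authority-arn arn:aws:acm-pca:us-east-1:111122223333:certificate-authority/EXAMPLE \
  --directory-id d-1234567890 \
  --vpc-information SecurityGroupIds=sg-0123456789abcdef0
```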
AWS Marketplace has expanded its global accessibility by introducing support for French, Spanish, Korean, and Japanese languages across both the website and AWS console. This enhancement allows customers to discover, evaluate, procure, and deploy solutions in their preferred language, reducing friction for global customers and enhancing their purchasing process.
For a localized experience, buyers select their preferred language out of 5 options in the language dropdown. The resulting language switch extends across the customer journey, allowing customers to browse the AWS Marketplace homepage, search for products, view details, and buy products and services in their chosen language. The localization covers SaaS products, AMI-based products, container-based products, and professional services.
For AWS Marketplace sellers, this launch expands their global reach. AWS Marketplace automatically translates product information into all supported languages, allowing the translated versions to become available to buyers with no additional seller effort. Sellers maintain control over their global presence and can opt out from this feature on a language or listing basis. Furthermore, sellers can now provide End User License Agreements (EULAs) in the primary language of the country for geo-fenced listings.
To get started, select your preferred language in the upper right corner of the website or console header. To learn more about AWS Marketplace’s language support, visit the AWS Marketplace Buyer Guide and Seller Guide.
Amazon FSx for NetApp ONTAP second-generation file systems are now available in additional AWS Regions: Asia Pacific (Mumbai) and Asia Pacific (Tokyo).
Amazon FSx makes it easier and more cost effective to launch, run, and scale feature-rich high-performance file systems in the cloud. Second-generation FSx for ONTAP file systems give you more performance scalability and flexibility over first-generation file systems by allowing you to create or expand file systems with up to 12 highly-available (HA) pairs of file servers, providing your workloads with up to 72 GBps of throughput and 1 PiB of provisioned SSD storage.
With this regional expansion, second-generation FSx for ONTAP file systems are available in the following AWS Regions: US East (N. Virginia, Ohio), US West (N. California, Oregon), Europe (Frankfurt, Ireland, Stockholm), and Asia Pacific (Mumbai, Singapore, Sydney, Tokyo). You can create second-generation Multi-AZ file systems with a single HA pair, and Single-AZ file systems with up to 12 HA pairs. To learn more, visit the FSx for ONTAP user guide.
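As a hedged illustration of creating a second-generation file system with multiple HA pairs via the AWS CLI, the sketch below uses placeholder subnet and capacity values; parameter names such as HAPairs and ThroughputCapacityPerHAPair should be verified against the current FSx API reference.

```bash
# Hypothetical: create a Single-AZ second-generation FSx for ONTAP file system
# with 2 HA pairs in Asia Pacific (Mumbai).
aws fsx create-file-system \
  --region ap-south-1 \
  --file-system-type ONTAP \
  --storage-capacity 20480 \
  --subnet-ids subnet-0123456789abcdef0 \
  --ontap-configuration "DeploymentType=SINGLE_AZ_2,HAPairs=2,ThroughputCapacityPerHAPair=3072"
```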
Today, we’re announcing a new and improved agentic Amazon Q Developer experience in the AWS Management Console, Microsoft Teams, and Slack. Amazon Q Developer can now answer more complex queries than ever before in the AWS Management Console chat, offering deeper resource introspection and a dynamic, more interactive troubleshooting experience for users.
Until today, Amazon Q Developer helped simplify the manual work needed from cloud engineering teams to monitor and troubleshoot resources by answering basic AWS questions and providing specialized guidance. Now, it combines deep AWS expertise with new multi-step reasoning capabilities that enable it to consult multiple information sources and resolve complex queries within the AWS Management Console and chat applications. Customers can ask questions about AWS services and their resources, leaving Amazon Q to automatically identify the appropriate tools for the task, selecting from AWS APIs across 200+ services. Amazon Q breaks queries into executable steps, asks for clarification when needed, and combines information from multiple services to solve the task. For example, you can ask, “Why am I getting 500 errors from my payment processing Lambda function?” and Q automatically gathers relevant CloudWatch logs, examines the function’s configuration and permissions, checks connected services like API Gateway and DynamoDB, and analyzes recent changes – all while showing its progress and reasoning to enable builders to work more efficiently.
These new capabilities are accessible in all AWS regions where Amazon Q Developer is available. Learn more in this deep-dive blog.
Amazon Lex now offers AWS CloudFormation support in AWS GovCloud (US-West), extending infrastructure-as-code capabilities to government agencies and their partners. Additionally, CloudFormation support now also includes composite slots and QnAIntent features across all AWS Regions where Amazon Lex operates, allowing developers to define, deploy, and manage these advanced conversational components through CloudFormation templates.
With CloudFormation support in AWS GovCloud (US-West), government agencies can now automate the deployment of Amazon Lex chatbots while maintaining compliance with strict security requirements. Additionally, two key conversational AI features are now supported via CloudFormation: composite slots, which enable more natural interactions by collecting multiple data points in a single prompt (e.g., “Please provide your city and state” instead of separate questions), and QnAIntent, which automatically answers user questions by searching configured knowledge bases and returning relevant information from documents and FAQs.
CloudFormation support for Amazon Lex, including composite slots and QnAIntent, is available in all AWS Regions where Amazon Lex operates.
Cost Optimization Hub now supports instance and cluster storage recommendations for Amazon Aurora databases. These recommendations help you identify idle database instances and choose the optimal DB instance class and storage configuration for your Aurora databases.
With this launch, you can view, filter, consolidate, and prioritize Aurora optimization recommendations across your organization’s member accounts and AWS Regions through a single dashboard. Cost Optimization Hub quantifies estimated savings from these recommendations, taking into account your specific discounts, such as Reserved Instances, enabling you to evaluate Aurora cost savings alongside other cost optimization opportunities.
The new Amazon Aurora recommendations are now available in Cost Optimization Hub across all AWS Regions where Cost Optimization Hub is supported.
Today, Amazon DataZone and Amazon SageMaker announced a new user interface (UI) capability allowing a DataZone domain to be upgraded and used directly in the next generation of Amazon SageMaker. This makes the investment customers have put into Amazon DataZone transferable to Amazon SageMaker. All content created and curated through Amazon DataZone, such as assets, metadata forms, glossaries, and subscriptions, is available to users through Amazon SageMaker Unified Studio after the upgrade.
As an Amazon DataZone administrator, you can choose which of your domains to upgrade to Amazon SageMaker via a UI-driven experience. The upgraded domain lets you leverage your existing Amazon DataZone implementation in the new Amazon SageMaker environment and expand to new SQL analytics, data processing, and AI use cases. Additionally, after upgrading, both the Amazon DataZone and Amazon SageMaker portals remain accessible. This provides administrators flexibility with the user rollout of Amazon SageMaker, while ensuring business continuity for users operating within Amazon DataZone. By upgrading to Amazon SageMaker, users can build on their investment in Amazon DataZone by utilizing Amazon SageMaker’s unified platform that serves as the central hub for all data, analytics, and AI needs.
The domain upgrade capability is available in all AWS Regions where Amazon DataZone and Amazon SageMaker are supported, including: US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Paris), Europe (Stockholm), and South America (São Paulo).
AWS Compute Optimizer now provides Amazon Aurora I/O-Optimized recommendations for Aurora DB cluster storage. These recommendations help you make informed decisions about adopting Aurora I/O-Optimized configurations to increase pricing predictability and achieve potential cost savings based on your cluster’s storage patterns and usage.
AWS Compute Optimizer automatically analyzes your Aurora DB clusters’ instance storage and I/O costs to provide detailed cost comparisons between Aurora I/O-Optimized and Aurora Standard configurations. By default, Compute Optimizer analyzes 14 days of metrics, which you can extend to 32 days for free or up to 93 days with enhanced infrastructure metrics enabled. With enhanced metrics, you can also view month-over-month I/O usage variations to better evaluate the benefits of each storage configuration.
This new feature is available in all AWS Regions where AWS Compute Optimizer is available except the AWS GovCloud (US) and the China Regions. To learn more about the new feature updates, please visit Compute Optimizer’s product page and user guide.
You can never be sure when you’ll be the target of a distributed denial-of-service (DDoS) attack. For investigative journalist Brian Krebs, that day came on May 12, when his site KrebsOnSecurity experienced one of the largest DDoS attacks seen to date.
At 6.3 terabits per second (Tbps), or roughly 63,000 times the speed of broadband internet in the U.S., the attack was 10 times the size of the DDoS attack Krebs faced in 2016 from the Mirai botnet. That 2016 incident took down KrebsOnSecurity.com for four days, and was so severe that his then-DDoS protection service asked him to find another provider, Krebs said in his report on the May attack.
Following the 2016 incident, Krebs signed up for Project Shield, a free Google service that offers at-risk, eligible organizations protection against DDoS attacks. Since then, his site has stayed reliably online in the face of attacks — including the latest incident.
The brunt of the May 12 attack lasted less than a minute and peaked above 6.3 Tbps, one of the largest DDoS attacks observed to date.
Organizations in eligible categories, including news publishers, government elections, and human rights defenders, can use the power of Google Cloud’s networking services in conjunction with Jigsaw to help keep their websites available and online.
Project Shield acts as a reverse proxy service — customers change their DNS settings to send traffic to an IP address provided by Project Shield, and configure Project Shield with information about their hosting server. The customer retains control over both their DNS settings and their hosting server, making it easy to enable or disable Project Shield at any time with a simple DNS switch.
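As a small, hypothetical illustration of that cutover, once the site’s A record points at the Project Shield-provided IP (a documentation placeholder address is used here), the change can be verified with a standard DNS lookup:

```bash
# Hypothetical check after switching DNS to the Project Shield IP.
dig +short www.example.com A
# 203.0.113.10   <- placeholder for the IP address provided by Project Shield
```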
Built on the strength of Google Cloud networking services, including Cloud Load Balancing, Cloud CDN, and Cloud Armor, Project Shield’s services can be configured through the Project Shield dashboard as a managed experience. These services work together to mitigate attacks and serve cached content from multiple points on Google’s edge network. It’s a combination that has protected KrebsOnSecurity before, and has successfully defended many websites against some of the world’s largest DDoS attacks.
In the May incident against Krebs, the attack was filtered instantly by Google Cloud’s network. Requests for websites protected by Project Shield pass through Google Cloud Load Balancing, which automatically blocks layer 3 and layer 4 volumetric DDoS attacks.
In the May incident, the attacker sent large data packets to random ports at a rate of approximately 585 million packets per second, which is over 1,000 times the usual rate for KrebsOnSecurity.
The attack came from infected devices all around the world.
Cloud Armor, which embeds protection into every load balancer deployment, blocked the attack at the load balancing level because Project Shield sits behind the Google Cloud Load Balancer, which proxies only HTTP/HTTPS traffic. Had the attack occurred with well-formed requests (such as at Layer 7, also known as the application layer), additional defenses from the Google Cloud global front end would have been ready to defend the site.
Cloud CDN, for example, makes it possible to serve content for sites like KrebsOnSecurity from cache, lessening the load on a site’s servers. Cloud Armor would have actively filtered incoming requests for any remaining traffic that may have bypassed the cache to allow only legitimate traffic through.
Additionally, Cloud Armor’s Adaptive Protection uses real-time machine learning, which helps identify attack signatures and dynamically tailor rate limits. These rate limits are actively and continuously refined, allowing Project Shield to harness Google Cloud’s capabilities to mitigate almost all DDoS attacks in seconds.
Project Shield defenses are automated, with no customer defense configuration needed. They’re optimized to capitalize on the powerful blend of defensive tools in Google Cloud’s networking arsenal, which are available to any Google Cloud customer.
As KrebsOnSecurity and others have experienced, DDoS attacks have been getting larger, more sophisticated, and more frequent in recent years. Let the power and scale of Google Cloud help protect your site against attacks when you least expect them. Eligible organizations can apply for Project Shield today, and all organizations can set up their own Cloud Networking configuration like Project Shield by following this guide.
Developers love Cloud Run, Google Cloud’s serverless runtime, for its simplicity, flexibility, and scalability. And today, we’re thrilled to announce that NVIDIA GPU support for Cloud Run is now generally available, offering a powerful runtime for a variety of use cases that’s also remarkably cost-efficient.
Now, you can enjoy the following benefits across both GPUs and CPUs:
Pay-per-second billing: You are only charged for the GPU resources you consume, down to the second.
Scale to zero: Cloud Run automatically scales your GPU instances down to zero when no requests are received, eliminating idle costs. This is a game-changer for sporadic or unpredictable workloads.
Rapid startup and scaling: Go from zero to an instance with a GPU and drivers installed in under 5 seconds, allowing your applications to respond to demand very quickly. For example, when scaling from zero (cold start), we achieved an impressive Time-to-First-Token of approximately 19 seconds for a gemma3:4b model (this includes startup time, model loading time, and running the inference).
Full streaming support: Build truly interactive applications with out-of-the box support for HTTP and WebSocket streaming, allowing you to provide LLM responses to your users as they are generated.
Support for GPUs in Cloud Run is a significant milestone, underscoring our leadership in making GPU-accelerated applications simpler, faster, and more cost-effective than ever before.
“Serverless GPU acceleration represents a major advancement in making cutting-edge AI computing more accessible. With seamless access to NVIDIA L4 GPUs, developers can now bring AI applications to production faster and more cost-effectively than ever before.” – Dave Salvator, director of accelerated computing products, NVIDIA
AI inference for everyone
One of the most exciting aspects of this GA release is that Cloud Run GPUs are now available to everyone for NVIDIA L4 GPUs, with no quota request required. This removes a significant barrier to entry, allowing you to immediately tap into GPU acceleration for your Cloud Run services. Simply use --gpu 1 from the Cloud Run command line, or check the “GPU” checkbox in the console; there is no need to request quota:
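For instance, a hedged sketch of a GPU-enabled deployment from the command line might look like the following; the service name, image, and resource sizes are placeholders, and flag spellings should be checked against your gcloud version.

```bash
# Hypothetical: deploy a service with one NVIDIA L4 GPU attached.
gcloud run deploy my-inference-service \
  --image=us-docker.pkg.dev/my-project/my-repo/my-inference:latest \
  --region=us-central1 \
  --gpu=1 \
  --gpu-type=nvidia-l4 \
  --cpu=4 \
  --memory=16Gi \
  --no-cpu-throttling
```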
Production-ready
With general availability, Cloud Run with GPU support is now covered by Cloud Run’s Service Level Agreement (SLA), providing you with assurances for reliability and uptime. By default, Cloud Run offers zonal redundancy, helping to ensure enough capacity for your service to be resilient to a zonal outage; this also applies to Cloud Run with GPUs. Alternatively, you can turn off zonal redundancy and benefit from a lower price for best-effort failover of your GPU workloads in case of a zonal outage.
Multi-regional GPUs
To support global applications, Cloud Run GPUs are available in five Google Cloud regions: us-central1 (Iowa, USA), europe-west1 (Belgium), europe-west4 (Netherlands), asia-southeast1 (Singapore), and asia-south1 (Mumbai, India), with more to come.
Cloud Run also simplifies deploying your services across multiple regions. For instance, you can deploy a service across the US, Europe, and Asia with a single command, providing global users with lower latency and higher availability. As an example, here’s how to deploy Ollama, one of the easiest ways to run open models, on Cloud Run across three regions:
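The exact multi-region command is not reproduced here; as a conservative sketch that relies only on flags known to exist, the loop below deploys the same Ollama service separately into three regions (the service name and regions are placeholders).

```bash
# Hypothetical per-region rollout of Ollama with an L4 GPU in each region.
for region in us-central1 europe-west1 asia-southeast1; do
  gcloud run deploy ollama-gemma \
    --image=ollama/ollama \
    --port=11434 \
    --region="$region" \
    --gpu=1 \
    --gpu-type=nvidia-l4 \
    --cpu=8 \
    --memory=32Gi \
    --no-cpu-throttling
done
```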
See it in action: 0 to 100 NVIDIA GPUs in four minutes
You can witness the incredible scalability of Cloud Run with GPUs for yourself with this live demo from Google Cloud Next 25, showcasing how we scaled from 0 to 100 GPUs in just four minutes.
Load testing a Stable Diffusion service running on Cloud Run GPUs to 100 GPU instances in four minutes.
Unlock new use cases with NVIDIA GPUs on Cloud Run jobs
The power of Cloud Run with GPUs isn’t just for real-time inference using request-driven Cloud Run services. We’re also excited to announce the availability of GPUs on Cloud Run jobs, unlocking new use cases, particularly for batch processing and asynchronous tasks:
Model fine-tuning: Easily fine-tune a pre-trained model on specific datasets without having to manage the underlying infrastructure. Spin up a GPU-powered job, process your data, and scale down to zero when it’s complete.
Batch AI inferencing: Run large-scale batch inference tasks efficiently. Whether you’re analyzing images, processing natural language, or generating recommendations, Cloud Run jobs with GPUs can handle the load.
Batch media processing: Transcode videos, generate thumbnails, or perform complex image manipulations at scale.
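As a hedged sketch of a GPU-backed batch job (the fine-tuning use case above, for example), the command below assumes Cloud Run jobs accept the same GPU flags as services, which may require the beta command surface; all names and values are placeholders.

```bash
# Hypothetical: a fine-tuning job that runs on one L4 GPU and exits when done.
gcloud beta run jobs deploy finetune-gemma \
  --image=us-docker.pkg.dev/my-project/my-repo/finetune:latest \
  --region=us-central1 \
  --gpu=1 \
  --gpu-type=nvidia-l4 \
  --cpu=8 \
  --memory=32Gi \
  --task-timeout=3600 \
  --args="--dataset=gs://my-bucket/train.jsonl"
```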
What Cloud Run customers are saying
Don’t just take our word for it. Here’s what some early adopters of Cloud Run GPUs are saying:
“Cloud Run helps vivo quickly iterate AI applications and greatly reduces our operation and maintenance costs. The automatically scalable GPU service also greatly improves the efficiency of our AI going overseas.” – Guangchao Li, AI Architect, vivo
“L4 GPUs offer really strong performance at a reasonable cost profile. Combined with the fast auto scaling, we were really able to optimize our costs and saw an 85% reduction in cost. We’ve been very excited about the availability of GPUs on Cloud Run.” – John Gill, Sr. Software Engineer, Wayfair, speaking at Next ’25
“At Midjourney, we have found Cloud Run GPUs to be incredibly valuable for our image processing tasks. Cloud Run has a simple developer experience that lets us focus more on innovation and less on infrastructure management. Cloud Run GPU’s scalability also lets us easily analyze and process millions of images.” – Sam Schickler, Data Team Lead, Midjourney
Amazon Managed Workflows for Apache Airflow (MWAA) now provides the option to update environments without interrupting running tasks on supported Apache Airflow versions (v2.4.3 or later).
Amazon MWAA is a managed service for Apache Airflow that lets you use the same familiar Apache Airflow platform as you do today to orchestrate your workflows and enjoy improved scalability, availability, and security without the operational burden of having to manage the underlying infrastructure. Amazon MWAA now allows you to update your environment without disrupting your ongoing workflow tasks. By choosing this option, you can update an MWAA environment in a graceful manner: MWAA will replace the Airflow Scheduler and Webserver components, provision new workers, and wait for ongoing worker tasks to complete before removing older workers. The graceful option is available only for supported Apache Airflow versions (v2.4.3 or later) on MWAA.
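A hedged sketch of requesting a graceful update with the AWS CLI is below; the parameter name reflects one reading of the UpdateEnvironment API and should be verified against the current Amazon MWAA API reference.

```bash
# Hypothetical: change the environment class while letting running tasks finish.
aws mwaa update-environment \
  --name my-airflow-environment \
  --environment-class mw1.medium \
  --worker-replacement-strategy GRACEFUL
```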
Apache, Apache Airflow, and Airflow are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.
Red Hat Enterprise Linux (RHEL) for AWS, starting with RHEL 10, is now generally available, combining Red Hat’s enterprise-grade Linux software with native AWS integration. RHEL for AWS is built to achieve optimum performance of RHEL running on AWS. This offering features pre-tuned images with AWS-specific performance profiles, built-in Amazon CloudWatch telemetry, an integrated AWS Command Line Interface (CLI), image mode using container-native tooling, enhanced security from boot to runtime, and optimized networking with Elastic Network Adapter (ENA) support.
For organizations looking to accelerate innovation and meet customer demands, RHEL for AWS combines the stability of RHEL with native AWS integration. This purpose-built solution is designed to deliver optimized performance, improved security, and simplified management through AWS-specific configurations and tooling. Whether migrating existing workloads or deploying new instances, RHEL for AWS provides a standardized, ready-to-use software that can help teams reduce operational overhead and focus on business initiatives rather than infrastructure management. Customers can save valuable time with built-in AWS service integration, automated monitoring, and streamlined deployment options.
Customers can access RHEL for AWS Amazon Machine Images (AMIs) through the Amazon EC2 Console or AWS Marketplace with flexible procurement options. Please visit Red Hat Enterprise Linux on Amazon EC2 FAQs page for more details.
The service is available across all AWS Commercial and AWS GovCloud (US) Regions. To get started with RHEL for AWS, visit EC2 console or AWS Marketplace.
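As a small, hedged example of finding a RHEL for AWS image from the CLI, the query below filters public AMIs by name; the owner account and name pattern are assumptions, so confirm them against the Red Hat and AWS documentation before use.

```bash
# Hypothetical: list RHEL 10 AMIs in a Region, newest first.
aws ec2 describe-images \
  --region us-east-1 \
  --owners 309956199498 \
  --filters "Name=name,Values=RHEL-10*" "Name=architecture,Values=x86_64" \
  --query "reverse(sort_by(Images, &CreationDate))[:5].{Name:Name,ImageId:ImageId}" \
  --output table
```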