With the rapid growth of AI/ML, data science teams need a better notebook experience to keep pace with the growing demand for their work and its importance in driving innovation. Scaling data science workloads also creates new challenges for infrastructure management. Allocating compute resources per user provides strong isolation (the technical separation of workloads, processes, and data from one another), but can lead to inefficiencies from siloed resources. Shared compute resources offer more opportunities for efficiency, but at the cost of isolation. The benefit of one comes at the expense of the other. There has to be a better way…
We are announcing a new Dataproc capability: multi-tenant clusters. This new feature provides a Dataproc cluster deployment model suitable for many data scientists running their notebook workloads at the same time. The shared cluster model allows infrastructure administrators to improve compute resource efficiency and cost optimization without compromising granular, per-user authorization to data resources, such as Google Cloud Storage (GCS) buckets.
This isn’t just about optimizing infrastructure; it’s about accelerating the entire cycle of innovation that your business depends on. When your data science platform operates with less friction, your teams can move directly from hypothesis to insight to production faster. This allows your organization to answer critical business questions faster, iterate on machine learning models more frequently, and ultimately, deliver data-powered features and improved experiences to your customers ahead of the competition. It helps evolve your data platform from a necessary cost center into a strategic engine for growth.
How it works
This new feature builds upon Dataproc’s previously established service account multi-tenancy. For clusters in this configuration, only a restricted set of users declared by the administrator may submit their workloads. Administrators also declare a mapping of users to service accounts. When a user runs a workload, all access to Google Cloud resources is authenticated only as their specific mapped service account. Administrators control authorization in Identity and Access Management (IAM), such as granting one service account access to a set of Cloud Storage buckets and another service account access to a different set of buckets.
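For example, an administrator might grant each mapped service account read access to its own bucket with a policy binding like the sketch below (the bucket, project, and service account names are placeholders, not values from this launch):

```shell
# Illustrative only: grant one team's mapped service account read-only access
# to its own bucket. Names are placeholders.
gcloud storage buckets add-iam-policy-binding gs://ds-team-a-data \
  --member="serviceAccount:team-a-sa@my-project.iam.gserviceaccount.com" \
  --role="roles/storage.objectViewer"
```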
As part of this launch, we’ve made several key usability improvements to service account multi-tenancy. Previously, the mapping of users to service accounts was established at cluster creation time and unmodifiable. We now support changing the mapping on a running cluster, so that administrators can adapt more quickly to changing organizational requirements. We’ve also added the ability to externalize the mapping to a YAML file for easier management of a large user base.
Jupyter notebooks establish connections to the cluster via the Jupyter Kernel Gateway. The gateway launches each user’s Jupyter kernels, distributed across the cluster’s worker nodes. Administrators can horizontally scale the worker nodes to meet end user demands either by manually adjusting the number of worker nodes or by using an autoscaling policy.
Notebook users can choose Vertex AI Workbench for a fully managed Google Cloud experience or bring their own third-party JupyterLab deployment. In either model, the BigQuery JupyterLab Extension integrates with Dataproc cluster resources. Vertex AI Workbench instances can deploy the extension automatically, or users can install it manually in their third-party JupyterLab deployments.
Under the hood
Dataproc multi-tenant clusters are automatically configured with additional hardening to isolate independent user workloads:
All containers launched by YARN run as a dedicated operating system user that matches the authenticated Google Cloud user.
Each OS user also has a dedicated Kerberos principal for authentication to Hadoop-based Remote Procedure Call (RPC) services, such as YARN.
Each OS user is restricted to accessing only the Google Cloud credentials of their mapped service account. The cluster’s compute service account credentials are inaccessible to end user notebook workloads.
Administrators use IAM policies to define least-privilege access authorization for each mapped service account.
How to use it
Step 1: Create a service account multi-tenancy mapping
Prepare a YAML file containing your user-to-service-account mapping, and store it in a Cloud Storage bucket. For example:
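A minimal sketch of such a mapping file, assuming a simple list of user-to-service-account entries (the field names here are illustrative, not the documented schema; see the Dataproc multi-tenancy documentation for the exact format):

```yaml
# Illustrative sketch only: field names are assumptions, not the documented schema.
user_mappings:
  - user: alice@example.com
    service_account: alice-sa@my-project.iam.gserviceaccount.com
  - user: bob@example.com
    service_account: bob-sa@my-project.iam.gserviceaccount.com
```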
Step 2: Create a Dataproc multi-tenant cluster
Create a new multi-tenant Dataproc cluster using the user mapping file and the new JUPYTER_KERNEL_GATEWAY optional component.
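A hedged sketch of the creation command is shown below. The `--optional-components=JUPYTER_KERNEL_GATEWAY` value matches the component named above, but the flag used to pass the mapping file is an assumption here; check the multi-tenant cluster documentation for the exact syntax.

```shell
# Sketch only: the mapping-file flag name is an assumption; consult the
# documentation for the exact flags. Other values are placeholders.
gcloud dataproc clusters create mt-notebook-cluster \
  --region=us-central1 \
  --optional-components=JUPYTER_KERNEL_GATEWAY \
  --enable-component-gateway \
  --identity-config-file=gs://my-bucket/user-mapping.yaml
```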
Step 3: Create a Vertex AI Workbench instance with Dataproc kernels enabled
For users of Vertex AI Workbench, create an instance with Dataproc kernels enabled. This automatically installs the BigQuery JupyterLab extension.
Step 4: Install the BigQuery JupyterLab extension in third-party deployments
For users of third-party JupyterLab deployments, such as one running on a local laptop, install the BigQuery JupyterLab extension manually.
Step 5: Launch kernels in the Dataproc cluster
Open the JupyterLab application either from a Vertex AI Workbench instance or on your local machine.
The JupyterLab Launcher page opens in your browser. It shows the Dataproc Cluster Notebooks section if you have access to Dataproc clusters with the Jupyter or Jupyter Kernel Gateway optional component.
To change the region and project:
Select Settings > Cloud Dataproc Settings.
On the Setup Config tab, under Project Info, change the Project ID and Region, and then click Save.
Restart JupyterLab to make the changes take effect.
Select the kernel spec corresponding to your multi-tenant cluster. The kernel launches and takes about 30 to 50 seconds to go from the Initializing to the Idle state; once it is Idle, it is ready to execute your code.
Get started with multi-tenant clusters
Stop choosing between security and efficiency. With Dataproc’s new multi-tenant clusters, you can empower your data science teams with a fast, collaborative environment while maintaining centralized control and optimizing costs. This new capability is more than just an infrastructure update; it’s a way to accelerate your innovation lifecycle.
This feature is now available in public preview. Get started today by exploring the technical documentation and creating your first multi-tenant cluster. Your feedback is crucial as we continue to evolve the platform, so please share your thoughts with us at dataproc-feedback@google.com.
The security operations centers of the future will use agentic AI to enable intelligent automation of routine tasks, augment human decision-making, and streamline workflows. At Google Cloud, we want to help prepare today’s security professionals to get the most out of tomorrow’s AI agents.
As we build our agentic vision, we’re also excited to invite you to the first Agentic SOC Workshop: Practical AI for Today’s Security Teams. This complimentary, half-day event series is designed for security practitioners looking to level up their AI skills and move beyond the marketing to unlock AI’s true potential. Ultimately, we believe that agentic AI will empower security professionals to focus more on complex investigations and strategic initiatives, and drive better security outcomes and operational efficiency.
Our vision is a future where every customer has a virtual security assistant — trained by the world’s leading security experts — that anticipates threats and recommends the best path to deliver on security goals. We are building the next class of security experts empowered by AI, and these workshops are your opportunity to become one of them.
How the Agentic SOC Workshop can boost your security skills
The Agentic SOC Workshop combines foundational security capabilities with AI to help security professionals develop the necessary skills for successful AI use. Attendees will:
Explore the agentic SOC future: Learn about Google Cloud’s vision for the future of security operations, where agentic AI systems automate complex workflows and empower analysts to focus on high-impact tasks.
Learn by doing: Dive into a practical, real-world AI workshop tailored for security practitioners. Learn how to use Google Cloud’s AI and threat intelligence to automate repetitive tasks, reduce alert fatigue, and improve your security skills.
Participate in a dynamic Capture the Flag challenge: Put your new skills to the test in an interactive game where you use the power of AI to solve challenges and race to the finish line.
Meet and network with peers: Gain valuable insights from industry peers and hear from other customers on their journey to modernize security operations. Connect with peers, partners, and Google experts during networking breaks, concluding with a happy hour.
Discover practical uses for AI: Learn how to use Gemini in Google Security Operations to respond to threats faster and more effectively.
Join us in a city near you
These free, half-day workshops are specifically designed for security professionals, including security architects, SOC managers, analysts, and security engineers, as well as security IT decision-makers including CISOs and VPs of security.
We’ll be holding Agentic SOC Workshops starting in Los Angeles on Wednesday, Sept. 17, and Chicago on Friday, Sept. 19. Workshops will continue in October in New York City and Toronto, with more cities to come. To register for a workshop near you, please check out our registration page.
Securing AI systems is a fundamental requirement for business continuity and customer trust, and Google Cloud is at the forefront of driving secure AI innovations and working with partners to meet the evolving needs of customers.
Our secure-by-design cloud platform and built-in security solutions are continuously updated with the latest defenses, helping security-conscious organizations confidently build and deploy AI workloads. We are also collaborating with our partners to give our customers choice and flexibility to secure their AI workloads, encompassing models, data, apps, and agents.
Many of our partners are using Google Cloud’s AI to build new defenses and automate security operations, transforming the security landscape. That’s hardly surprising, given that agentic AI is poised to create a nearly $1 trillion global market, according to a new study by the Boston Consulting Group for Google Cloud.
Today, we’re excited to announce new security solutions from our partners, also available in the Google Cloud Marketplace.
Apiiro has introduced its AutoFix AI Agent to streamline developer workflows. It integrates with Gemini Code Assist to automatically generate remediated code based on an organization’s software development lifecycle context.
Exabeam has provided unified visibility into both human and agent activity by applying behavioral analytics to data from Google Agentspace and Model Armor. Security teams can use it to gain a comprehensive understanding of how AI agents are operating and quickly identify any anomalous behavior that could signal a threat.
Fortinet has enhanced application security with FortiWeb, which can protect applications from intentional attacks while also incorporating Data Loss Prevention to safeguard personally identifiable information used in AI applications. Fortinet has also unveiled how its product ecosystem can help secure AI workloads on Google Cloud.
F5 has worked to address critical areas of API security for generative AI applications and large language models, including prompt injection and API sprawl, with its Application Delivery and Security Platform (ADSP).
Menlo Security has offered gen AI analysis enabled by Gemini with its HEAT Shield AI. This technology expands protection against social engineering and brand impersonation attacks that target users in the browser.
Netskope has introduced DLP On Demand, designed to expand data-loss prevention directly to AI-enabled applications. Integrated into Google Cloud’s Vertex AI and Gemini ecosystem, DLP On Demand offers sensitive data protection, can prevent data leakage, and can enable customers to safely adopt AI without compromising their security or compliance posture.
Palo Alto Networks has designed Prisma AIRS to protect AI and agentic workloads on Google Cloud, including Vertex AI and Agentspace. With Cloud WAN, Prisma Access can provide high-bandwidth and performant connectivity to AI and other cloud-based applications.
Ping Identity is releasing Gemini-based models on the Vertex AI platform to build identity-focused agents and applications that use Google’s powerful generative AI services. These models support reasoning as well as embedding models in RAG (retrieval-augmented generation) applications that assist administrators and users with contextually relevant answers to their prompts.
Transmit Security has used Google Cloud’s AI in its Mosaic platform to deliver identity and fraud prevention for the era of consumer AI agents. It can help enterprises improve the human–agent relationship across the full identity lifecycle, from login to high-risk account activity.
Wiz has added support for Gemini Code Assist, tackling the critical gap between AI-enabled development workflows and real-time security intelligence.
In addition, to make it easier to discover and deploy agentic tools, we’ve added the new Agent Tools category in the Google Cloud Marketplace.
Collaborating across cybersecurity, including security operations, application and agent security, and identity and data protection, our partners are extending the reach of AI-driven defense. They’re actively developing and integrating solutions that use our AI-native tools and platforms, and enabling end-to-end security throughout the entire AI lifecycle, from model to agent development.
Together, we invite you to explore these innovations further and join us in shaping a more secure, AI-driven future. You can learn more about our security partner ecosystem here.
While petabyte-scale data warehouses are becoming more common, getting the performance you need without escalating costs and effort remains a key challenge, even in a modern cloud data warehouse. While many data warehouse platform providers continue to work on these challenges, BigQuery has already moved past petabyte-scale data warehouses to petabyte-scale tables. In fact, some BigQuery users have single tables in excess of 200 petabytes and over 70 trillion rows.
At this scale, even metadata is big data that requires an (almost) infinitely scalable design and high performance. We presented the Column Metadata (CMETA) index in a 2021 VLDB paper; as the name implies, it acts like an index for metadata. Compared to existing techniques, CMETA proved to be superior, meeting both our scalability and performance requirements. Further, BigQuery’s implementation requires no user effort to maintain, and in addition to transparently improving query performance, CMETA may also reduce overall slot usage.
In this blog, we take a look at how CMETA works, the impact it can have on your workloads, and how to maximize its benefits. Let’s jump in.
How BigQuery stores data
All data in BigQuery tables is stored as data blocks that are organized in a columnar format. Data blocks also store metadata about all rows within the block. This includes min and max values for each column in the block and any other necessary properties that may be used for query optimization. This metadata allows BigQuery to perform fine-grained dynamic pruning to improve both query performance and resource efficiency.
This approach is well-known and commonly applied in the data management industry. However, as noted above, BigQuery operates on a vast scale, routinely handling tables that have over a hundred petabytes of data spread across billions of blocks in storage. Metadata for these tables frequently reaches terabyte scale — larger than many organizations’ entire data warehouses!
Enter the Column Metadata index
To optimize queries, especially when large tables are involved, BigQuery now leverages CMETA. This system table is automatically created and managed by BigQuery to maintain a snapshot of metadata for all data blocks of user tables that may benefit from the index. This provides additional data to BigQuery’s planner, allowing it to apply additional fine-grained pruning of data blocks, reducing both resource consumption (slots usage and/or bytes scanned) and query execution time.
CMETA relies on a few key techniques.
Index generation
CMETA is automatically generated and refreshed in the background at no additional cost, and does not impact user workloads. Creation and updates to the index occur automatically whenever BigQuery determines a table will benefit from the index, based on its size and/or the volume of changes to its data. BigQuery ensures the index remains up to date with block statistics and column-level attributes with no need for any user action. Using efficient storage and horizontally scalable techniques, BigQuery can maintain these indexes at scale, even for some of our performance-sensitive users with tables over 200 petabytes in size.
Figure 1
Query serving
To illustrate how the index serves queries in practice, let’s use the `natality` table from BigQuery’s public dataset. Imagine this table’s data is stored in three blocks (see Figure 1), committed at times 110, 120, and 130. Our column metadata index, with a snapshot taken at time 125, includes block- and column-level statistics for blocks 1 and 2.
```sql
SELECT * FROM samples.natality WHERE weight_pounds >= 7 and is_male = false
```
Considering the query above, BigQuery first scans the index to identify relevant blocks. Since the maximum value of `weight_pounds` in block 2 is 6.56 and the query filters on `weight_pounds` >= 7, we know we can safely skip that block without even inspecting it. The original query then runs only against block 1 and any newer block(s) that haven’t been indexed yet — in this case, block 3. The results are combined and returned to the user.
Figure 2
With rich column-level attributes in the index, BigQuery can prune efficiently at the early stage of query processing. Without the index, pruning occurs at later stages when BigQuery opens the data blocks, which involves more computing resources. For large tables, skipping data blocks with this technique significantly benefits selective queries, enabling BigQuery to support much larger tables. Consider the above example but with a table that has billions of blocks. Imagine the time and slot usage savings from pruning unnecessary blocks without even needing to access the block’s header.
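Conceptually, the block-skipping decision reduces to comparing each filter predicate against the indexed min/max statistics. Here is a minimal Python sketch of that idea (not BigQuery’s actual implementation; block 1’s statistics are invented for illustration, while block 2’s maximum matches the example above):

```python
# Conceptual sketch of min/max-based block pruning; not BigQuery's actual code.
blocks = [
    {"id": 1, "weight_pounds": (5.50, 9.12)},  # (min, max) -- illustrative values
    {"id": 2, "weight_pounds": (4.75, 6.56)},  # max below 7, so it can be skipped
]

def might_match(block, column, range_predicate):
    """Keep a block only if its [min, max] range could contain matching rows."""
    lo, hi = block[column]
    return range_predicate(lo, hi)

# WHERE weight_pounds >= 7: a block can only match if its max is >= 7.
candidates = [b["id"] for b in blocks
              if might_match(b, "weight_pounds", lambda lo, hi: hi >= 7)]
print(candidates)  # [1] -- block 2 is pruned without ever being opened
```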
BigQuery’s CMETA index is unique in a few ways:
Zero maintenance cost or effort: The CMETA index is a fully automated background operation
Applicable to all data tables: CMETA works transparently to improve performance regardless of whether the table size is measured in gigabytes or petabytes
Integrated with other Google Cloud services: Works with BigQuery tables and BigLake External Tables
Safe: Always returns correct results regardless of whether CMETA is available or up-to-date
Measuring CMETA’s impact
Early adopters of CMETA have reported up to 60x improvement in query performance and up to 10x reduction in slot usage for some queries. The benefits are particularly pronounced for queries with more selective filters, especially for filters on clustering columns, as CMETA minimizes the amount of data processed by query workers.
Figure 3
Maximizing CMETA’s benefits
BigQuery currently automatically manages CMETA at no additional cost to users and allocates resources to create or refresh the index in a round-robin fashion. If your tables grow or change very rapidly and you have strict performance requirements, you may choose to use your own resource pool (slots) for CMETA maintenance operations to maximize CMETA’s throughput. This will provide the most consistent experience in query performance improvement via CMETA. To do this, simply create a reservation assignment and allocate slots for background jobs, and CMETA maintenance jobs will automatically use it. More details are available in the documentation.
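As a sketch based on BigQuery’s reservation DDL, a background-job assignment might look like the following; the project, region, reservation name, and slot count are placeholders to adapt to your environment:

```sql
-- Placeholders throughout: replace project, region, reservation name, and slots.
CREATE RESERVATION `my-admin-project.region-us.background-maintenance`
  OPTIONS (slot_capacity = 100);

-- Assign background jobs (which CMETA maintenance uses) to that reservation.
CREATE ASSIGNMENT `my-admin-project.region-us.background-maintenance.bg-assignment`
  OPTIONS (assignee = 'projects/my-analytics-project',
           job_type = 'BACKGROUND');
```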
More to come
While this first iteration of CMETA is now generally available, we’re already working on future iterations to further improve BigQuery’s autonomous query processing capabilities, without any extra effort or cost on your part. Stay tuned for more to come.
Growing up in a Navy family instilled a strong sense of purpose in me. My father’s remarkable 42 years of naval service not only shaped my values, but inspired me to join the Navy myself. Today, as the leader of Google Public Sector, I’m honored to continue that tradition of service in support of our federal government.
Over the past year, Google Public Sector has made significant strides in delivering AI and secure cloud infrastructure to government agencies. We introduced a Google Workspace discount through GSA’s OneGov Strategy in April, and just last month expanded that partnership with the introduction of a “Gemini for Government” offering to provide a comprehensive set of AI tools to federal agencies at less than $0.50 per government agency for a year. In addition, Google Public Sector secured a $200 million contract with the Department of Defense’s (DoD) Chief Digital and Artificial Intelligence Office (CDAO) to accelerate the adoption of AI and cloud capabilities. These are just a few examples of Google’s commitment to the federal workforce.
In addition to these efforts, my role at Google Public Sector affords me the opportunity to work closely with a community I know well: our nation’s veterans. I’ve seen firsthand the immense value and leadership skills veterans bring to the table, yet the transition to civilian life can be difficult, with veterans facing challenges ranging from underemployment to difficulties securing meaningful work.
At Google Public Sector, we’ve long been committed to changing this narrative by empowering the veteran community with the skills and resources needed for successful career transitions.
Sign up today for Google Launchpad for Veterans
I’m excited to share that registration is officially open for the next cohort of Google Launchpad for Veterans. Introduced in 2024, this three-week, no-cost, virtual training program provides veterans with the foundational skills necessary to jump start rewarding careers with generative AI. The program is open to US and Canada military veterans and service members.
Last year, we trained over 4,000 veterans to help them transition into high-paying tech roles. This next cohort of learners will gain in-demand skills that enable them to thrive in both functional and technical positions, along with the knowledge to help drive digital transformation through AI within their organization.
Become a leader in AI digital transformation with this virtual, no-cost program
The Gen AI Leader training, which does not require previous technical experience, kicks off with a two-day virtual event on November 13th and 14th. You’ll enjoy interactive training sessions and a panel discussion with veterans from Google and learn:
Foundational generative AI knowledge: Grasp the core concepts of generative AI, including Large Language Models (LLMs), machine learning paradigms, and various data types.
AI ecosystem navigation: Learn to navigate the broader AI landscape, encompassing infrastructure, models, platforms, agents, and applications.
Practical business applications: Explore real-world uses of generative AI within business, with a focus on powerful Google Cloud tools like Gemini and NotebookLM.
Strategic perspective: Understand how generative AI agents can drive organizational transformation.
After completing the program, you’ll receive a complimentary voucher to take the Gen AI Leader exam. Attendees are encouraged to take the exam between November 21st and December 19th, 2025. After successful completion, you’ll receive Google’s industry-recognized Gen AI Leader certification, a valuable credential to help you advance your career.
If you’d like additional practice before taking the exam, we’re offering optional exam preparation sessions on November 17th and 21st. As a bonus, the first 500 individuals to pass the exam will receive a voucher for their very own pair of Google socks!
Register today and get ready to translate your military experience to a powerful career where you can apply the latest in AI, security, and cloud technologies to public sector missions.
Learn more about Google Public Sector
To learn more about how Google Public Sector, our partners, and your peers are building the future, join us for our Public Sector Summit, held in Washington, DC on October 29th. You can explore more content at publicsector.google.
The automotive industry is in the midst of a profound transformation, accelerating towards an era of software-defined vehicles (SDVs). This shift, however, presents significant challenges for manufacturers and suppliers alike. Their priority is making great vehicles, not great software, though the latter now contributes to, and is increasingly a necessity for, achieving the former. These OEMs must find ways to bring greater efficiency and quality to their software delivery and establish new collaboration models, among other hurdles, to achieve their visions for SDVs.
To help meet this moment, we’ve created Horizon, a new open-source software factory for platform development with Android Automotive OS — and beyond. With Horizon, we aim to support the software transformation of the automotive industry and tackle its most pressing challenges by providing a standardized development toolchain so OEMs can generate value by focusing on building products and experiences.
In early deployments at a half-dozen automotive partners, we’ve already seen 10x to 50x faster feedback for developers, leading to high-frequency releases and higher build quality. In this post we will outline how Horizon helps overcome the key impediments to automotive software transformation.
The Roadblocks to Innovation in Automotive Software Development
Today, traditional automotive manufacturers (OEMs) often approach software development from a hardware-centric perspective that lacks agility and often struggles to scale. This approach makes software lifecycle support burdensome and is often accompanied by inconsistent and unreliable tools, slowing down development.
OEMs face exploding development costs, quality issues and slow innovation, making it difficult to keep pace with new market entrants and the increasing demand for advanced features. Furthermore, most customers expect frequent, high-quality over-the-air (OTA) software updates similar to what they receive on other devices, such as their smartphones, forcing OEMs to mirror the consumer electronics experience.
But a car is not a television or refrigerator or even a rolling computer, as many now describe them. Vehicles are made up of many separate, highly complex systems, which typically require the integration of numerous components from multiple suppliers who often provide “closed box” solutions. Even as vehicles have become more connected, and dependent on these connective systems for everything from basic to advanced operations, the vehicle platform has actually become harder, not easier, to integrate and innovate with.
We knew there had to be a better way to keep up with the pace necessary to provide a great customer experience.
Introducing Horizon: A Collaborative Path Forward
To tackle these pressing industry challenges, Google and Accenture have initiated Horizon, an open-source reference development platform designed to help transform the automotive industry into a software-driven innovation market.
Our vision for Horizon is to enable automakers and OEMs to greatly accelerate their time to market and increase the agility of their teams while significantly reducing development costs. Horizon provides a holistic platform for the future of automotive software, enabling OEMs to invest more in innovation rather than just integration.
Key Capabilities Driving Software Excellence
Horizon offers a comprehensive suite of capabilities, establishing a developer-centric, cloud-powered, and easy-to-adopt open industry standard for embedded software.
1. Software-First Development with AAOS
Horizon champions a virtual-first approach to product design, deeply integrating with Android Automotive OS (AAOS) to empower software-led development cycles. This involves the effective use of the vehicle hardware abstraction layer (VHAL), virtio, and high-fidelity cloud-based virtual devices like Cuttlefish, which can scale to thousands of instances on demand. This approach allows for scalable automated software regression tests, elastic direct developer testing strategies, and can be seen as the initial step towards creating a complete digital twin of the vehicle.
2. Streamlined Code-Build-Test Pipeline
Horizon aims to introduce a standard for the entire software development lifecycle:
Code: It supports flexible and configurable code management using Gerrit, with the option to use the GerritForge managed service via the Google Cloud Marketplace for production deployments. With Gemini Code Assist, integrated in Cloud Workstations, you can supercharge development by leveraging code completion, bug identification, and test generation, while also aiding in explaining Android APIs.
Build: The platform features a scaled build process that leverages intelligent cloud usage and dynamic scaling. Key to this is the caching for AAOS platform builds based on warmed-up environments and the integration of the optimized Android Build File System (ABFS), which can reduce build times by more than 95% and allow full builds from scratch in one to two minutes with up to 100% cache hits. Horizon supports a wide variety of build targets, including Android 14 and 15, Cuttlefish, AVD, Raspberry Pi devices, and the Google Pixel Tablet. Build environments are containerized, ensuring reproducibility.
Test: Horizon enables scalable testing in Google Cloud with Android’s Compatibility Test Suite (CTS), utilizing Cuttlefish for virtualized runtime environments. Remote access to multiple physical build farms is facilitated by MTK Connect, which allows secure, low-latency interaction with hardware via a web browser, eliminating the need for hardware to be shipped to developers.
3. Cloud-Powered Infrastructure
Built on Google Cloud, Horizon ensures scalability and reliability. Deployment is simplified through tools like Terraform, GitOps and Helm charts, offering a plug-and-play toolchain and allowing for tracking the deployment of tools and applications to Kubernetes.
Unlocking Value for Auto OEMs and the Broader Industry
The Horizon reference platform delivers significant benefits for Auto OEMs:
Reduced costs: Horizon offers a reduction in hardware-related development costs and an overall decrease in rising development expenses.
Faster time to market: By accelerating development and enabling faster innovation cycles, Horizon helps OEMs reduce their time to market and feature cycle time.
Increased quality and productivity: The platform enables stable quality and boosts team productivity by providing standardized toolsets and fostering more effective team collaboration.
Enhanced customer experience: By enabling faster, more frequent and higher-quality builds, OEMs can change the way they develop vehicle software, thus offering enhanced customer experiences and unlocking new revenue streams through software-driven services.
Strategic focus: Horizon underpins the belief that efficient software development platforms should not be a point of differentiation for OEMs; instead, their innovation should be focused on the product itself. This allows OEMs to devote more time and resources to software development with greater quality, efficiency, and flexibility.
Robust ecosystem: To ensure scalable, secure, and future-ready deployments across diverse vehicle platforms, Horizon aims to foster collaboration between Google, integration partners, and platform adopters. While advancing the reference platform capabilities, Horizon also allows for tailored integration and compatibility with vehicle hardware, legacy systems and compliance standards.
The Horizon ecosystem
It’s been said that the best software is the one you don’t notice, so seamless and flawless is its functioning. This is especially true when it comes to the software-defined vehicle, where the focus should be on the road and the joy of the trip.
This is why we believe the platforms enabling efficient software development shouldn’t be differentiating for automakers — their vehicles should be. Like a solid set of tires or a good sound system, software is now essential, but it’s not the product itself. The product is the full package, put together through the combination of design, engineering, development, and production.
Because software development is now such an integral part of that process, we believe it should be an enabler, not a hindrance, for automakers. To that end, the Google Cloud, Android, and Accenture teams have continuously aimed to simplify access to and use of relevant toolchain components. The integration of OpenBSW and the Android Build File System (ABFS) are just the latest waypoints in a journey that started with GerritForge providing a managed Gerrit offering, and continues with additional partners in upcoming releases.
Please join us on this journey. We invite you to become a part of the community to receive early insights, provide feedback, and actively participate in shaping the future direction of Horizon. You can also explore our open-source releases on GitHub to evaluate and customize the Horizon platform by deploying it in your Google Cloud environment and running reference workloads.
Horizon is a new dawn for the future of automotive software, though we can only get there together, through open collaboration and cloud-powered innovation.
A special thanks to the village of Googlers and Accenture colleagues who delivered this: Mike Annau, Ulrich Gersch, Steve Basra, Taylor Santiago, Haamed Gheibi, James Brook, Ta’id Holmes, Sebastian Kunze, Philip Chen, Alistair Delva, Sam Lin, Femi Akinde, Casey Flynn, Milan Wiezorek, Marcel Gotza, Ram Krishnamoorthy, Achim Ramesohl, Olive Power, Christoph Horn, Liam Friel, Stefan Beer, Colm Murphy, Robert Colbert, Sarah Kern, Wojciech Kowalski, Wojciech Kobryn, Dave M. Smith, Konstantin Weber, Claudine Laukant, Lisa Unterhauser
—
Opening image created using Imagen 4 with the prompt: Generate a blog post header image for the following blog post, illustrating the concept of a software-defined vehicle <insert the first six paragraphs>.
Apache Spark is a fundamental part of most modern lakehouse architectures, and Google Cloud’s Dataproc provides a powerful, fully managed platform for running Spark applications. However, for data engineers and scientists, debugging failures and performance bottlenecks in distributed systems remains a universal challenge.
Manually troubleshooting a Spark job requires piecing together clues from disparate sources — driver and executor logs, Spark UI metrics, configuration files and infrastructure monitoring dashboards.
What if you had an expert assistant to perform this complex analysis for you in minutes?
Accessible directly in the Google Cloud console — either from the resource page (e.g., Serverless for Apache Spark Batch job list or Batch detail page) you are investigating or from the central Cloud Assist Investigations list — Gemini Cloud Assist offers several powerful capabilities:
For data engineers: Fix complex job failures faster. A prioritized list of intelligent summaries and cross-product root cause analyses helps in quickly narrowing down and resolving a problem.
For data scientists and ML engineers: Solve performance and environment issues without deep Spark knowledge. Gemini acts as your on-demand infrastructure and Spark expert so you can focus more on models.
For Site Reliability Engineers (SREs): Quickly determine if a failure is due to code or infrastructure. Gemini finds the root cause by correlating metrics and logs across different Google Cloud services, thereby reducing the time required to identify the problem.
For big data architects and technical managers: Boost team efficiency and platform reliability. Gemini helps new team members contribute faster, describe issues in natural language and easily create support cases.
Debugging Spark applications is inherently complex because failures can stem from anywhere in a highly distributed system. These issues generally fall into two categories. First are the outright job failures. Then, there are the more insidious, subtle performance bottlenecks. Additionally, cloud infrastructure issues can cause workload failures, complicating investigations.
Gemini Cloud Assist is designed to tackle all these challenges head-on:
Infrastructure problems: Gemini Cloud Assist analyzes and correlates a wide range of data, including metrics, configurations, and logs, across Google Cloud services, pinpoints the root cause of infrastructure issues, and provides a clear resolution.
Configuration problems: Gemini Cloud Assist automatically identifies incorrect or insufficient Spark and cluster configurations, and recommends the right settings for your workload.
Application problems (application logic issues, inefficient code and algorithms): Gemini Cloud Assist analyzes application logs, Spark metrics, and performance data to diagnose code errors and performance bottlenecks, and provides actionable recommendations to fix them.
Data problems (stage/task failures, data-related issues): Gemini Cloud Assist analyzes Spark metrics and logs to identify data-related issues like data skew, and provides actionable recommendations to improve performance and stability.
Gemini Cloud Assist: Your AI-powered operational expert
Let’s explore how Gemini transforms the investigation process in common, real-world scenarios.
Example 1: The slow job with performance bottlenecks
Some of the most challenging issues are not outright failures but performance bottlenecks. A job that runs slowly can impact service-level objectives (SLOs) and increase costs, but without error logs, diagnosing the cause requires deep Spark expertise.
Say a critical batch job succeeds but takes much longer than expected. There are no failure messages, just poor performance.
Manual investigation requires a deep-dive analysis in the Spark UI. You would need to manually search for “straggler” tasks that are slowing down the job. The process also involves analyzing multiple task-level metrics to find signs of memory pressure or data skew.
With Gemini assistance
By clicking Investigate, Gemini automatically performs this complex analysis of performance metrics, presenting a summary of the bottleneck.
Gemini acts as an on-demand performance expert, augmenting a developer’s workflow and empowering them to tune workloads without needing to be a Spark internals specialist.
Example 2: The silent infrastructure failure
Sometimes, a Spark job or cluster fails due to issues in the underlying cloud infrastructure or integrated services. These problems are difficult to debug because the root cause is often not in the application logs but in a single, obscure log line from the underlying platform.
Say a cluster configured to use GPUs fails unexpectedly.
The manual investigation begins by checking the cluster logs for application errors. If no errors are found, the next step is to investigate other Google Cloud services. This involves searching Cloud Audit Logs and monitoring dashboards for platform issues, like exceeded resource quotas.
With Gemini assistance
A single click on the Investigate button triggers a cross-product analysis that looks beyond the cluster’s logs. Gemini quickly pinpoints the true root cause, such as an exhausted resource quota, and provides mitigation steps.
Gemini bridges the gap between the application and the platform, saving hours of broad, multi-service investigation.
Get started today!
Spend less time debugging and more time building and innovating. Let Gemini Cloud Assist in Dataproc on Compute Engine and Google Cloud Serverless for Apache Spark be your expert assistant for big data operations.
We’re excited to announce an expansion to our Compute Flexible Committed Use Discounts (Flex CUDs), providing you with greater flexibility across your cloud environment. Your spend commitments now stretch further and cover a wider array of Google Cloud services and VM families, translating into greater savings for your workloads.
Flex CUDs are spend-based commitments that provide deep discounts on Google Cloud compute resources in exchange for a one or three-year term. This model offers maximum flexibility, automatically applying savings across a broad pool of eligible VM families and regions without being tied to a single resource.
More power, more savings with expanded coverage
We understand that modern applications are built on a diverse mix of services, from massive databases to nimble serverless functions. To better support the way you build, we’re expanding Flex CUDs to cover more of the specialized and serverless solutions you use every day:
Memory-optimized VM families: We’re bringing enhanced discounts to our memory-optimized M1, M2, M3 and the new M4 VM families. Now you can get more value from critical workloads like SAP HANA, in-memory analytics platforms, and high-performance databases.
High-performance computing (HPC) VM families: For compute-intensive workloads, Flex CUDs now apply to our HPC-optimized H3 and the new H4D VM families, perfect for complex simulations and scientific research.
Cloud Run and Cloud Functions: For developers and organizations that use Cloud Run’s fully managed platform, we are extending Flex CUDs’ coverage to Cloud Run request-based billing and Cloud Run functions.
Why this matters
This expansion of Compute Flex CUDs is designed with your growth and efficiency in mind:
Maximize your spend commitments: Instead of being tied to a specific resource type or region, your committed spend can now be applied across a larger portion of your Google Cloud usage. This means less “wasted” commitment and more active savings.
Enhanced financial predictability and control: With greater coverage, you gain a clearer picture of your anticipated cloud spend, making budgeting and financial planning more predictable.
Simplified cost management: A single, flexible commitment can now cover a more diverse set of services, streamlining your financial operations and reducing the complexity of managing multiple, granular commitments.
Fuel innovation: By reducing the cost of core compute and serverless services, you free up budget that can be reinvested into innovation.
An updated Billing model
Compute Flex CUDs’ expanded coverage is made possible by the new and improved spend-based CUDs model, which streamlines how discounts are applied and provides greater flexibility. Enabling this feature triggers some experience changes to the Billing user interface, Cloud Billing export to BigQuery schema, and Cloud Commerce Consumer Procurement API. This new billing model is simpler: we directly charge the discounted rate for CUD-eligible usage, reflecting the applicable discount, instead of using credits to offset usage and reflect savings. It’s also more flexible: we apply discounts to a wider range of products within spend-based CUDs. For more, this follow-up resource details the updates, including information on a sample export to preview your monthly bill in the new format, key CUD KPIs, new SKUs added to CUDs, and CUD product information. You can learn more about these changes in the documentation.
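As a simplified illustration (the 20% rate here is hypothetical, not an actual Flex CUD discount): under the previous credit-based model, $100 of eligible usage would appear as a $100 charge offset by a $20 commitment credit, while under the new model the same usage is billed directly at the discounted $80 rate, for the same net cost.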
Availability and next steps
At Google Cloud, we’re committed to providing you with the most flexible and cost-effective solutions for your evolving cloud needs. This expansion of Compute Flex CUDs is a testament to that commitment, enabling you to build, deploy, and scale your applications with even greater financial efficiency. Starting today, you can opt-in and begin enjoying Compute Flex CUDs’ expanded scope and improved billing model.
All customers who haven’t already opted in will be automatically transitioned to the new spend-based (multi-price CUD) model on January 21, 2026, to take advantage of these expanded Flex CUDs. New customers who create a Billing Account on or after July 15, 2025 are automatically on the new billing model for Flex CUDs. Stay tuned for more updates as we continue to enhance our offerings to support your success on Google Cloud.
For ten years, Google Kubernetes Engine (GKE) has been at the forefront of innovation, powering everything from microservices to cloud-native AI and edge computing. To honor this special birthday, we’re challenging you to catapult your microservices into the future with cutting-edge agentic AI. Are you ready to celebrate?
Hands-on learning with GKE: This is your shot to build the next evolution of applications by integrating agentic AI capabilities on GKE. We have everything you need to get started: our microservice applications, example agents on GitHub, documentation, quickstarts, tutorials, and a webinar hosted by our experts.
Showcase your skills: You’ll have the opportunity to elevate a sample microservices application into a unique use case. Feel free to get creative with non-traditional use cases and utilize Agent Development Kit (ADK), Model Context Protocol (MCP), and the Agent2Agent (A2A) protocol for extra powerful functionality!
Think you have what it takes to win? Build an app to showcase your agents and you could win:
Overall grand prize: $15,000 in USD, $3,000 in Google Cloud Credits for use with a Cloud Billing Account, a chance to win a maximum of two (2) KubeCon North America conference passes in Atlanta, Georgia (November 10-13, 2025), a one-year, no-cost Google Developer Program Premium subscription, guest interview on the Kubernetes Podcast, video feature with the GKE team, virtual coffee with a Google team member, and social promo
Regional winners: $8,000 in USD, $1,000 in Google Cloud Credits for use with a Cloud Billing Account, video feature with the GKE team on a Google Cloud social media channel, virtual coffee with a Google team member, and social promo
Honorable mentions: $1,000 in USD and $500 in Google Cloud Credits for use with a Cloud Billing Account
Unleash the power of agentic AI on GKE
GKE is built on open-source Kubernetes, but is also tightly integrated with the Google Cloud ecosystem. This makes it easy to get started with a simple application, while having the control you need for more complex application orchestration and management.
When you join the GKE Turns 10 Hackathon, your mission is to take pre-existing microservice applications (either Bank of Anthos or Online Boutique) and then integrate cutting-edge agentic AI capabilities. The goal is not to modify the core application code directly, but instead build new components that interact with its established APIs! Here is some inspiration:
Optimize important processes: Add a sophisticated AI chatbot to the Online Boutique that can query inventory, provide personalized product recommendations, or even check a user’s financial balance via an integrated Bank of Anthos API.
Streamline maintenance and mitigation: Develop an agent that intelligently monitors microservice performance on GKE, suggests troubleshooting steps, and even automates remediation.
Crucial note: Your project must be built using GKE and Google AI models such as Gemini, focusing on how the agents interact with your chosen microservice application. As long as GKE is the foundation, feel free to enhance your project by integrating other Google Cloud technologies!
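For a sense of what such an agent component might look like, here is a minimal sketch using the Agent Development Kit (ADK), where an agent wraps a microservice API call as a tool. The endpoint URL, tool name, and model string are hypothetical, and the exact ADK interfaces may differ from this outline; consult the ADK documentation and example agents on GitHub for working code.

```python
# Illustrative sketch only: endpoint, names, and model are hypothetical
# assumptions; see the ADK docs for the exact Agent interface.
import requests
from google.adk.agents import Agent

def get_product_inventory(product_id: str) -> dict:
    """Look up a product via a (hypothetical) Online Boutique REST gateway."""
    resp = requests.get(f"http://boutique-gateway/products/{product_id}")
    resp.raise_for_status()
    return resp.json()

shopping_assistant = Agent(
    name="boutique_assistant",
    model="gemini-2.0-flash",
    instruction="Answer shopper questions using the inventory tool.",
    tools=[get_product_inventory],
)
```

The key design point is the one stated above: the agent talks to the existing application only through its published APIs, so the core microservice code stays untouched.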
Ready to start building?
Head over to our hackathon website and watch our webinar to learn more, review the rules, and register.
Tata Steel is one of the world’s largest steel producers, with an annual crude steel capacity exceeding 35 million tons. With such a large and global output, we needed a way to improve asset availability, product quality, operational safety, and environmental monitoring. By centralizing data from diverse sources and implementing advanced analytics with Google Cloud, we’re driving a more proactive and comprehensive approach to worker safety and environmental stewardship.
To achieve these objectives, we designed and implemented a robust multi-cloud architecture. This setup unifies manufacturing data across various platforms, establishing the Tata Steel Data Lake on Google Cloud as the centralized repository for seamless data aggregation and analytics.
High-level IIoT data integration architecture
Building a unified data foundation on Google Cloud
Our comprehensive data acquisition framework spans multiple plant locations, including Jamshedpur, in the eastern Indian state of Jharkhand, where we leverage Litmus and ClearBlade — both available on Google Cloud Marketplace — to collect real-time telemetry data from programmable logic controllers (PLCs) via LAN, SIM cards, and process networks.
As alternatives, we employ an internal data staging setup using SAP BusinessObjects Data Services (BODS) and Web APIs. We have also developed in-house smart sensors that use LoRaWAN and Web APIs to upload data. These diverse approaches ensure seamless integration of both Operational Technology (OT) data from PLCs and Information Technology (IT) data from SAP into Google Cloud BigQuery, enabling unified and efficient data consumption.
Initially, Google Cloud IoT Core was used for ingesting crane data. Following its deprecation, we redesigned the data pipeline to integrate ClearBlade IoT Services, ensuring seamless and secure data ingestion into Google Cloud.
Our OT Data Lake is architected on Manufacturing Data Engine (MDE) and BigQuery, which provides decoupled storage and compute capabilities for scalable, cost-efficient data processing. We developed a visualization layer with hourly and daily table partitioning to support both real-time insights and long-term trend analysis, strategically archiving older datasets in Google Cloud Storage for cost optimization.
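As a simplified sketch of that partitioned layout (the table and column names are illustrative, not Tata Steel’s actual schema), an hourly-partitioned telemetry table in BigQuery can be declared like this; a daily layout would instead use `PARTITION BY DATE(event_ts)`:

```sql
-- Illustrative only: table and column names are not the actual schema.
CREATE TABLE ot_datalake.crane_telemetry_hourly (
  device_id STRING,
  metric    STRING,
  value     FLOAT64,
  event_ts  TIMESTAMP
)
PARTITION BY TIMESTAMP_TRUNC(event_ts, HOUR);
```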
We also implemented a secure, multi-path data ingestion architecture to ingest OT data with minimal latency, utilizing Litmus and ClearBlade IoT Core. Finally, we developed custom solutions to extract OPC Data Access and OPC Unified Access data from remote OPC servers, staging it through on-premise databases before secure transfer to Google Cloud.
Together, this comprehensive architecture provides immediate access to real-time device data while facilitating batch processing of information from SAP and other on-premise databases. This integrated approach to OT and IT data delivers a holistic view of operations, enabling more informed decision-making for critical initiatives like Asset Health Monitoring, Environment Canvas, and the Central Quality Management System, across all Tata Steel locations.
Crane health monitoring with IoT data
Monitoring health parameters of crane sub devices
Overcoming legacy challenges for real-time operations
Before deploying Industrial IoT with Google Cloud, high-velocity data was not readily accessible in our central storage. Instead, the data resided in local systems, such as mediation servers and IBA, where limited storage capacity led to automatic purging after a defined retention period. This approach, combined with legacy infrastructure, significantly constrained data availability and hindered informed business decision-making. Furthermore, edge analytics and visualization capabilities were limited, and data latency remained high due to processing bottlenecks at the mediation layer.
Our Google Cloud implementation has since enabled the seamless acquisition of high-volume and high-velocity data for analyzing manufacturing assets and processes, all while ensuring compliance with security protocols across both the IT and OT layers. This initiative has enhanced operational efficiency and delivered cost savings.
Our collaboration with Google Cloud to evaluate and implement secure, more resilient manufacturing operations solutions marks a key milestone in Tata Steel’s digital transformation journey. The new unified data foundation has empowered data-driven decision-making through AI-enabled capabilities, including:
Asset health monitoring
Event-based alerting mechanisms
Real-time data monitoring
Advanced data analytics for enhanced user experience
The iMEC: Powering predictive maintenance and efficiency
Tata Steel’s Integrated Maintenance Excellence Centre (iMEC) utilizes MDE to build and deploy monitoring solutions. This involves leveraging data analytics, predictive maintenance strategies, and real-time monitoring to enhance equipment reliability and enable proactive asset management.
MDE, which provides a zero-code, pre-configured set of Google Cloud infrastructure, acts as a central hub for ingesting, processing, and analyzing data from various sensors and systems across the steel plant, enabling the development and implementation of solutions for improved operational efficiency and reduced downtime.
With monitoring solutions helping to deliver real-time advice, maintenance teams can reduce the physical human footprint at hazardous shop floor locations while providing more ergonomic and comfortable working environments to employees compared to near-location control rooms. These solutions also help us centralize asset management and maintenance expertise, employing real-time data to enable significant operational improvements and cost-effectiveness goals, including:
Reducing unplanned outages and increasing equipment availability.
Transitioning from Time-Based Maintenance (TBM) to predictive maintenance.
Optimizing resource use, reducing power costs, and minimizing delays.
Driving safety with video analytics and cloud storage
To strengthen worker safety, we have also deployed a safety violation monitoring system powered by on-premise, in-house video analytics. Detected violation images are automatically uploaded to a Cloud Storage bucket for further analysis and reporting.
We developed and trained a video analytics model in-house, using specific samples of violations and non-violations tailored to each use case. This innovative approach has enabled us to efficiently store a growing catalog of safety violation images on Cloud Storage, harnessing its elastic storage capabilities.
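A minimal sketch of that upload step using the Cloud Storage Python client library is shown below; the bucket and object names are placeholders, not the actual values used in this deployment.

```python
# Sketch: upload a detected safety-violation frame to Cloud Storage.
# Bucket and object names are placeholders.
from google.cloud import storage

def upload_violation_image(local_path: str, camera_id: str) -> None:
    client = storage.Client()
    bucket = client.bucket("safety-violations-archive")
    blob = bucket.blob(f"violations/{camera_id}/{local_path.rsplit('/', 1)[-1]}")
    blob.upload_from_filename(local_path)

upload_violation_image("/tmp/frame_001.jpg", "camera-42")
```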
Our Central Quality Management System — which ensures our data is complete, accurate, consistent, and reliable — is also built on Google Cloud, utilizing BigQuery for scalable data storage and analysis, and Looker Studio for intuitive data visualization and reporting.
Google Cloud for environmental monitoring
Tata Steel’s commitment to sustainability is evident in our comprehensive environmental monitoring system, which operates entirely on Google Cloud. Our Environment Canvas system captures a wide array of environmental Key Performance Indicators (KPIs), including stack emissions and fugitive emissions.
Environment Canvas – Data office & visualization architecture
Environmental parameters
We capture the data for these KPIs through sensors, SAP, and manual entries. While some sensor data from certain plants is initially sent to a different cloud or on-premises systems, we eventually transfer it to Google Cloud for unified consumption and visualization.
By leveraging the power of Google Cloud’s data and AI technologies, we are advancing operational monitoring and safety through a unified data foundation, real-time monitoring, and predictive maintenance — all enabled by iMEC. At the same time, we are reinforcing our commitment to environmental responsibility with a Google Cloud-based system that enables comprehensive monitoring and real-time reporting of environmental KPIs, delivering actionable insights for responsible operations.
In Episode #6 of the Agent Factory podcast, Vlad Kolesnikov and I were joined by Keith Ballinger, VP and General Manager at Google Cloud, for a deep dive into the transformative future of software development with AI. We explore how AI agents are reshaping the developer’s role and boosting team productivity.
This post guides you through the key ideas from our conversation. Use it to quickly recap topics or dive deeper into specific segments with links and timestamps.
Keith Ballinger kicked off the discussion by redefining a term from his personal blog: “Impossible Computing.” For him, it isn’t about solving intractable computer science problems, but rather about making difficult, time-consuming tasks feel seamless and even joyful for developers.
He described it as a way to “make things that were impossible or at least really, really hard for people, much more easy and almost seamless for them.”
The conversation explored how AI’s impact extends beyond the individual developer to the entire team. Keith shared a practical example of how his teams at Google Cloud use the Gemini CLI as a GitHub action to triage issues and conduct initial reviews on pull requests, showcasing Google Cloud’s commitment to AI-powered software development.
This approach delegates the more mundane tasks, freeing up human developers to focus on higher-level logic and quality control, ultimately breaking down bottlenecks and increasing the team’s overall velocity.
The Developer’s New Role: A Conductor of an Orchestra
A central theme of the conversation was the evolution of the developer’s role. Keith suggested that developers are shifting from being coders who write every line to becoming “conductors of an orchestra.”
In this view, the developer holds the high-level vision (the system architecture) and directs a symphony of AI agents to execute the specific tasks. This paradigm elevates the developer’s most critical skills to high-level design and “context engineering”—the craft of providing AI agents with the right information at the right time for efficient software development.
The Factory Floor
The Factory Floor is our segment for getting hands-on. Here, we moved from high-level concepts to practical code with live demos from both Keith and Vlad.
Keith shared two of his open-source projects as tangible “demonstration[s] of vibe coding intended to provide a trustworthy and verifiable example that developers and researchers can use.”
Terminus: A Go framework for building web applications with a terminal-style interface. Keith described it as a fun, exploratory project he built over a weekend.
Aether: An experimental programming language designed specifically for LLMs. He explained his thesis that a language built for machines—highly explicit and deterministic—could allow an AI to generate code more effectively than with languages designed for human readability.
Keith provided a live demonstration of his vibe coding workflow. Starting with a single plain-English sentence, he guided the Gemini CLI to generate a user guide, technical architecture, and a step-by-step plan. This resulted in a functional command-line markdown viewer in under 15 minutes.
Vlad showcased a different application of AI agents: creative, multi-modal content generation. He walked through a workflow that used Gemini 2.5 Flash Image (also known as Nano Banana) and other AI tools to generate a viral video of a capybara for a fictional ad campaign. This demonstrated how to go from a simple prompt to a final video.
Inspired by Vlad’s Demo?
If you’re interested in learning how to build and deploy creative AI projects like the one Vlad showcased, the Accelerate AI with Cloud Run program is designed to help you take your ideas from prototype to production with workshops, labs, and more.
Keith explained that he sees a role for both major cloud providers and a “healthy ecosystem of startups” in solving challenges like GPU utilization. He was especially excited about how serverless platforms are adapting, highlighting that Cloud Run now offers GPUs to provide the same fast, elastic experience for AI workloads that developers expect for other applications.
In response to a question about a high-level service for orchestrating AI across multi-cloud and edge deployment, Keith was candid that he hasn’t heard a lot of direct customer demand for it yet. However, he called the area “untapped” and invited the question-asker to email him, showing a clear interest in exploring its potential.
Calling it the “billion-dollar question,” Keith emphasized that as AI accelerates development, the need for a mature and robust compliance regime becomes even more critical. His key advice was that the human review piece is more important than ever. He suggested the best place to start is using AI to assist and validate human work. For example, brainstorm a legal brief with an AI rather than having the AI write the final brief for court submission.
Baseten is one of a growing number of AI infrastructure providers, helping other startups run their models and experiments at speed and scale. Given the importance of those two factors to its customers, Baseten has just passed a significant milestone.
By leveraging the latest Google Cloud A4 virtual machines (VMs) based on NVIDIA Blackwell and Google Cloud’s Dynamic Workload Scheduler (DWS), Baseten has achieved 225% better cost-performance for high-throughput inference and 25% better cost-performance for latency-sensitive inference.
Why it matters: This breakthrough in performance and efficiency enables companies to move powerful agentic AI and reasoning models out of the lab and into production affordably. For technical leaders, this provides a blueprint for building next-generation AI products — such as real-time voice AI, search, and agentic workflows — at a scale and cost-efficiency that has been previously unattainable.
The big picture: Inference is the cornerstone of enterprise AI. As models for multi-step reasoning and decision-making demand exponentially greater compute, the challenge of serving them efficiently has become the primary bottleneck. Enter Baseten, a six-year-old Series C company that partners with Google Cloud and NVIDIA to provide enterprise companies a scalable inference platform for their proprietary models as well as open models like Gemma, DeepSeek, and Llama, with an emphasis on performance and cost efficiency. Their success hinges on a dual strategy: maximizing the potential of cutting-edge hardware and orchestrating it with a highly optimized, open software stack.
We wanted to share more about how Baseten architected its stack — and what this new level of cost-efficiency can unlock for your inference applications.
Hardware optimization with the latest NVIDIA GPUs
Baseten delivers production-grade inference by leveraging a wide range of NVIDIA GPUs on Google Cloud, from NVIDIA T4s through the recent A4 VMs (NVIDIA HGX B200). This access to the latest hardware is critical for achieving new levels of performance.
With A4 VMs, Baseten now serves three of the most popular open-source models — DeepSeek V3, DeepSeek R1, and Llama 4 Maverick — directly on their Model APIs with over 225% better cost-performance for high-throughput inference, and 25% better cost-performance for latency-sensitive inference.
In addition to its production-ready model APIs, Baseten provides additional flexibility with NVIDIA B200-powered dedicated deployments for customers seeking to run their own custom AI models with the same reliability and efficiency.
Advanced software for peak performance
Baseten’s approach is rooted in coupling the latest accelerated hardware with leading and open-source software to extract the most value possible from every chip. This integration is made possible with Google Cloud’s AI Hypercomputer, which includes a broad suite of advanced inference frameworks, including NVIDIA’s open-source software stack — NVIDIA Dynamo and TensorRT-LLM — as well as SGLang and vLLM.
Using TensorRT-LLM, Baseten optimizes and compiles custom LLMs for one of its largest AI customers, Writer. This has boosted their throughput by more than 60% for Writer’s Palmyra LLMs. The flexibility of TensorRT-LLM also enabled Baseten to develop a custom model builder that speeds up model compilation.
To serve reasoning models like DeepSeek R1 and Llama 4 on NVIDIA Blackwell GPUs, Baseten uses NVIDIA Dynamo. The combination of NVIDIA’s HGX B200 and Dynamo dramatically lowered latency and increased throughput, propelling Baseten to the top GPU performance spot on OpenRouter’s LLM ranking leaderboard.
The team leverages techniques such as kernel fusion, memory hierarchy optimization, and custom attention kernels to increase tokens per second, reduce time to first token, and support longer context windows and larger batch sizes — all while maintaining low latency and high throughput.
Building a backbone for high availability and redundancy
For mission-critical AI services, resilience is non-negotiable. Baseten runs globally across multiple clouds and regions, requiring an infrastructure that can handle ad hoc demand and outages. Flexible consumption models, such as the Dynamic Workload Scheduler within the AI Hypercomputer, help Baseten manage capacity similar to on-demand with additional price benefits. This allows them to scale up on Google Cloud if there are outages across other clouds.
“Baseten runs globally across multi-clouds and Dynamic Workload Scheduler has saved us more than once when we encounter a failure,” said Colin McGrath, head of infrastructure at Baseten. “Our automated system moves affected workloads to other resources including Google Cloud Dynamic Workload scheduler and within minutes, everyone is up and running again. It is impressive — by the time we’re paged and check-in, everything is back and healthy. This is amazing and would not be possible without DWS. It has been the backbone for us to run our business.”
Baseten’s collaboration with Google Cloud and NVIDIA demonstrates how a powerful combination of cutting-edge hardware and flexible, scalable cloud infrastructure can solve the most pressing challenges in AI inference through Google Cloud’s AI Hypercomputer.
This unique combination enables end-users across industries to bring new applications to market, such as powering agentic workflows in financial services, generating real-time audio and video content in media, and accelerating document processing in healthcare. And it’s all happening at a scale and cost that was previously unattainable.
Ever worry about your applications going down just when you need them most? The talk at Cloud Next 2025, Run high-availability multi-region services with Cloud Run, dives deep into building fault tolerant and reliable applications using Google Cloud’s serverless container platform: Cloud Run.
Google experts Shane Ouchi and Taylor Money, along with Seenuvasan Devasenan from Commerzbank, pull back the curtain on Cloud Run’s built-in resilience and walk you through a real-world scenario with the upcoming Cloud Run feature called Service Health.
For the Cloud Next 2025 presentation, Shane kicked things off by discussing the baseline resilience of Cloud Run through autoscaling, a decoupled data and control plane, and N+1 zonal redundancy. Let’s break that down, starting with autoscaling.
Autoscaling to Make Sure Capacity Meets Demand
Cloud Run automatically adds and removes instances based on the incoming request load, ensuring that the capacity of a Cloud Run service meets the demand. Shane calls this hyper-elasticity, referring to Cloud Run’s ability to rapidly add container instances. Rapid autoscaling prevents the failure mode where your application doesn’t have enough server instances to handle all requests.
Note: Cloud Run lets you prevent runaway scaling by limiting the maximum number of instances.
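For example, here is a minimal sketch of deploying a service with an instance ceiling (the service name, image, and limit are assumptions for illustration):

gcloud run deploy my-service \
  --image=us-docker.pkg.dev/my-project/my-repo/my-app \
  --region=us-central1 \
  --max-instances=100

With this in place, Cloud Run still scales with request load, but never beyond 100 instances.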
Decoupled Data and Control Planes Increase Resiliency
The control plane in Cloud Run is the part of the system responsible for management operations, such as deploying new revisions, configuring services, and managing infrastructure resources. It’s decoupled from the data plane. The data plane is responsible for receiving incoming user requests, routing them to container instances, and executing the application code. Because the data plane operates independently from the control plane, issues in the control plane typically don’t impact running services.
N+1 Redundancy for Both Control and Data Plane
Cloud Run is a regional service, and Cloud Run provides N+1 zonal redundancy by default. That means if any of the zones in a region experiences failures, the Cloud Run infrastructure has sufficient failover capacity (that’s the “+1”) in the same region to continue serving all workloads. This isolates your application from zone failures.
Container Probes Increase Availability
If you’re concerned with application availability, you should definitely configure liveness probes to make sure failing instances are shut down. You can configure two distinct types of container instance health checks on Cloud Run.
Startup probe: Confirms that a new instance has successfully started and is ready to receive requests
Liveness probe: Monitors if a running instance remains healthy and able to continue processing requests. This probe is optional, but enabling it allows Cloud Run to automatically remove faulty instances
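As a rough sketch, both probes can be declared in the service YAML and applied with gcloud; the service name, image, and probe paths below are assumptions for illustration:

cat > service.yaml <<'EOF'
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service
spec:
  template:
    spec:
      containers:
      - image: us-docker.pkg.dev/my-project/my-repo/my-app
        startupProbe:
          httpGet:
            path: /started
        livenessProbe:
          httpGet:
            path: /healthz
EOF
gcloud run services replace service.yaml --region=us-central1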
100% Availability is Unrealistic
Some applications are so important that you want them to always be available. While 100% availability is unrealistic, you can make them as fault tolerant as possible. Getting that right depends on your application architecture and on the underlying platforms and services you use. Cloud Run has several features that increase its baseline resilience, but there’s more you can do to make your application more resilient.
Going Beyond Zonal Redundancy
Since Cloud Run is a regional service, providing zonal redundancy, developers have to actively architect their application to be resilient against regional outages. Fortunately, Cloud Run already supports multi-regional deployments. Here’s how that works:
Deploy a Cloud Run service to multiple regions, each using the same container image and configuration.
Create a global external application load balancer, with one backend and a Serverless Network Endpoint Group (NEG) per Cloud Run service.
Use a single entrypoint with one global external IP address.
Here’s what that looks like in a diagram:
In case you’re not familiar, a Serverless Network Endpoint Group (NEG) is a load balancer backend configuration resource that points to a Cloud Run service or an App Engine app.
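As a minimal sketch of those load-balancing pieces (service names and regions are assumptions for illustration), each region gets a serverless NEG that is added as a backend to one global backend service:

# Serverless NEG pointing at the Cloud Run service in one region
gcloud compute network-endpoint-groups create my-service-neg-us \
  --region=us-central1 \
  --network-endpoint-type=serverless \
  --cloud-run-service=my-service

# One global backend service shared by all regions
gcloud compute backend-services create my-service-backend \
  --global \
  --load-balancing-scheme=EXTERNAL_MANAGED

# Attach the regional NEG to the global backend service
gcloud compute backend-services add-backend my-service-backend \
  --global \
  --network-endpoint-group=my-service-neg-us \
  --network-endpoint-group-region=us-central1

Repeat the NEG creation and add-backend steps for each region, then attach the backend service to a URL map, target proxy, and a global forwarding rule with a single external IP address.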
Architecting Applications for Regional Redundancy Can Be Challenging
While deploying in multiple regions is straightforward with Cloud Run, the challenge lies in architecting your application in such a way that individual regional services can fail without losing data or impacting services in other regions.
A Preview of Service Health for Automated Regional Failover
If you set up a multi-regional Cloud Run architecture today, requests are always routed to the region closest to the client, but they are not automatically routed away if a Cloud Run service becomes unavailable, as shown in the following illustration:
The upcoming Service Health feature adds automatic failover of traffic from one region to another if a service in one region becomes unavailable:
Enabling Service Health
As of August 2025, Service Health is not yet publicly available (it’s in private preview), but I’m hopeful that’ll change soon. One thing to keep in mind is that the feature might still change until it’s generally available. You can sign up to get access by filling in this request form.
Once you have access, you can enable Service Health on a multi-regional service in two steps:
Add a container instance readiness probe to each Cloud Run service.
Set minimum instances to 1 on each Cloud Run service.
That’s really all there is to it. No additional load balancer configuration is required.
Readiness Probes Are Coming to Cloud Run
As part of Service Health, readiness probes are introduced to Cloud Run. A readiness probe periodically checks each container instance via HTTP. If a readiness probe fails, Cloud Run stops routing traffic to that instance until the probe succeeds again. In contrast, a failing liveness probe causes Cloud Run to shut down the unhealthy instance.
Service Health uses the aggregate readiness state of all container instances in a service to determine if the service itself is healthy or not. If a large percentage of the containers is failing, it marks the service as unhealthy and routes traffic to a different region.
A Live Demo at Cloud Next 2025
In a live demo, Taylor deployed the same service to two regions (one near, one far away). He then sent a request via a Global External Application Load Balancer (ALB). The ALB correctly routed the request to the service in the closest region.
After configuring the closest service to flip between failing and healthy every 30 seconds, he demonstrated that traffic didn’t fail over. That’s the current behavior – so far nothing new.
The next step in his demo was enabling Service Health through enabling minimum instances and a readiness probe on each service. For deploying the config changes to the two services, Taylor used a new flag in the Cloud Run gcloud interface: the --regions flag in gcloud run deploy. It’s a great way to deploy the same container image and configuration to multiple regions at the same time.
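Based on what was shown in the demo, a multi-region deploy might look like this sketch (the service name, image, and regions are assumptions for illustration):

gcloud run deploy my-service \
  --image=us-docker.pkg.dev/my-project/my-repo/my-app \
  --regions=us-central1,europe-west1

A single command like this keeps both regional services on the same container image and configuration, which is exactly what the multi-region pattern described earlier requires.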
With the readiness probes in place and minimum instances set, Service Health started detecting the service failure and moved traffic over to the healthy service in the other region. I thought that was a great demo!
Next Steps
In this post, you learned about Cloud Run’s built-in fault tolerance mechanisms, such as autoscaling and zonal redundancy, how to architect multi-region services for higher availability, and got a preview of the upcoming Service Health feature for automated regional failover.
While the Service Health feature is still in private preview, you can sign up to get access by filling in this request form.
In an industry generating vast volumes of streaming data every day, ensuring precision, speed, and transparency in royalty tracking is a constant and evolving priority. For music creators, labels, publishers, and rights holders, even small gaps in data clarity can influence how and when income is distributed — making innovation in data processing and anomaly detection essential.
To stay ahead of these challenges, BMG partnered with Google Cloud to develop StreamSight, an AI-driven application that enhances digital royalty forecasting and detection of reporting anomalies. The tool uses machine learning models to analyze historical data and flag patterns that help predict future revenue — and catch irregularities that might otherwise go unnoticed.
The collaboration combines Google Cloud’s scalable technology, such as BigQuery, Vertex AI, and Looker, with BMG’s deep industry expertise. Together, they’ve built an application that demonstrates how cloud-based AI can help modernize royalty processing and further BMG’s and Google’s commitment to fairer and faster payout of artist share of label and publisher royalties.
“At BMG, we’re accelerating our use of AI and other technologies to continually push the boundaries of how we best serve our artists, songwriters, and partners. StreamSight reflects this commitment — setting a new standard for data clarity and confidence in digital reporting and monetization. Our partnership with Google Cloud has played a key role in accelerating our AI and data strategy.” – Sebastian Hentzschel, Chief Operating Officer, BMG
From Data to Insights: How StreamSight Works
At its core, StreamSight utilizes several machine learning models within Google BigQuery ML for its analytical power:
For Revenue Forecasting:
ARIMA_PLUS: This model is a primary tool for forecasting revenue patterns. It excels at capturing underlying sales trends over time and is well-suited for identifying and interpreting long-term sales trajectories rather than reacting to short-term volatility.
BOOSTED_TREE: This model is valuable for the exploratory analysis of past sales behavior. It can effectively capture past patterns, short-term fluctuations and seasonality, helping to understand historical dynamics and how sales responded to recent changes.
For Anomaly Detection & Exploratory Analysis:
K-means and ANOMALY_DETECT function: These are highly effective for identifying various anomaly types in datasets, such as sudden spikes, country-based deviations, missing sales periods, or sales reported without corresponding rights.
Together, these models provide a comprehensive approach: ARIMA_PLUS offers robust future trend predictions, while other models contribute to a deeper understanding of past performance and the critical detection of anomalies. This combination supports proactive financial planning and helps safeguard royalty revenues.
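As a rough illustration of the ARIMA_PLUS piece (the dataset, table, and column names below are assumptions, not BMG’s actual schema), a forecasting model can be trained and queried directly in BigQuery ML:

# Train a time-series model on monthly streaming revenue
bq query --use_legacy_sql=false '
CREATE OR REPLACE MODEL royalty_demo.revenue_forecast
  OPTIONS(
    model_type = "ARIMA_PLUS",
    time_series_timestamp_col = "report_month",
    time_series_data_col = "streaming_revenue"
  ) AS
SELECT report_month, streaming_revenue
FROM royalty_demo.monthly_royalties'

# Forecast the next 12 months with a 90% confidence interval
bq query --use_legacy_sql=false '
SELECT *
FROM ML.FORECAST(MODEL royalty_demo.revenue_forecast,
                 STRUCT(12 AS horizon, 0.9 AS confidence_level))'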
Data Flow in BigQuery:
Finding the Gaps: Smarter Anomaly Detection
StreamSight doesn’t just forecast earnings — it also flags when things don’t look right. Whether it’s a missing sales period, unexpected spikes or dips in specific markets, or mismatches between reported revenue and rights ownership, the system can highlight problems that would normally require hours of manual review. And now it’s done at the click of a button.
For example:
Missing sales periods: Gaps in data that could mean missing money.
Sales mismatched with rights: Revenue reported from a region where rights aren’t properly registered.
Global irregularities: Sudden increases in streams or sales that suggest a reporting error or unusual promotional impact.
With StreamSight, these issues are detected at scale, allowing teams to take faster and more consistent action.
The StreamSight Dashboard:
Built on Google Cloud for Scale and Simplicity
The technology behind StreamSight is just as innovative as its mission. Developed on Google Cloud, it uses:
BigQuery ML to run machine learning models directly on large datasets using SQL.
Vertex AI and Python for advanced analysis and model training.
Looker Studio to create dashboards that make results easy to interpret and share across teams.
This combination of tools made it possible to move quickly from concept to implementation, while keeping the system scalable and cost-effective.
A Foundation for the Future
While StreamSight is currently a proof of concept, its early success points to vast potential. Future enhancements could include:
Adding data from concert tours and marketing campaigns to refine predictions.
Including more Digital Service Providers (DSPs) that provide access to digital music, such as Amazon, Apple Music, or Spotify, to allow for better cross-platform comparisons.
Factoring in social media trends or fan engagement as additional inputs.
Segmenting analysis by genre, region, music creator type, or release format.
By using advanced technology for royalty processing, we’re not just solving problems — we’re building a more transparent ecosystem for the future, one that supports our shared commitment to the fairer and faster payout of the artist’s share of label and publisher royalties.
The collaboration between BMG and Google Cloud demonstrates the music industry’s potential to use advanced technology to create a future where data drives smarter decisions and where everyone involved can benefit from a clearer picture of where music earns its value.
We introduced Cross-Cloud Network to help organizations transform hybrid and multicloud connectivity, and today, many customers are using it to build distributed applications across multiple clouds, on-premises networks, and the internet. A key aspect of this evolution is the ability to scale with IPv6 addressing. However, the transition from IPv4 to IPv6 is a gradual process creating a coexistence challenge: How do IPv6-only devices reach services and content that still resides on IPv4 networks?
To ensure a smooth transition to IPv6, we’re expanding our toolkit. After launching IPv6 Private Service Connect endpoints that connect to IPv4 published services, we are now introducing DNS64 and NAT64. Together, DNS64 and NAT64 form a robust mechanism that intelligently translates communication, allowing IPv6-only environments in Google Cloud to interact with the legacy IPv4 applications on the internet. In this post, we explore the vital role DNS64 and NAT64 play in making IPv6 adoption practical and efficient, removing the dependency on migrating legacy IPv4 services to IPv6.
The importance of DNS64 and NAT64
While dual-stack networking assigns both IPv4 and IPv6 addresses to a network interface, it doesn’t solve the pressing issues of private IPv4 address exhaustion or the increasing push for native IPv6 compliance. For major enterprises, the path toward widespread IPv6 adoption of cloud workloads involves creating new single-stack IPv6 workloads without having to migrate legacy IPv4 applications and services to IPv6. Together, DNS64 and NAT64 directly address this requirement, facilitating IPv6-to-IPv4 communication while maintaining access to existing IPv4 infrastructure.
This IPv6-to-IPv4 translation mechanism supports several critical use cases.
Enabling IPv6-only networks: As IPv4 addresses become increasingly scarce and costly, organizations can build future-proof IPv6-only environments, with DNS64 and NAT64 providing the essential translation to access remaining IPv4 services on the internet.
Gradual migration to IPv6: This allows organizations to gradually phase out IPv4 while guaranteeing their IPv6-only clients can still reach vital IPv4-only services.
Supporting legacy applications: Many critical business applications still rely solely on IPv4; these new services ensure they remain accessible to IPv6-only clients, safeguarding ongoing business operations during the transition.
How does it work?
An IPv6-only workload begins communication by performing a DNS lookup for the specific service URL. If a AAAA record exists, then an IPv6 address is returned and the connection proceeds directly using IPv6.
However, if DNS64 is enabled but a AAAA record cannot be found, the system instead queries for an A record. Once an A record is found, DNS64 constructs a unique synthesized IPv6 address by combining the well-known 64:ff9b::/96 prefix with the IPv4 address obtained from the A record.
The NAT64 gateway recognizes that the destination address is a part of the 64:ff9b::/96 range. It extracts the original IPv4 address from the latter part of the IPv6 address and initiates a new IPv4 connection to the destination, using the NAT64 gateway’s own IPv4 address as the source. Upon receiving a response, the NAT64 gateway prepends the 64:ff9b::/96 prefix to the response packet’s source IP, providing communication back to the IPv6-only client.
Here’s a diagram of the above-mentioned scenario:
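To make the address construction concrete, here is a small illustration using the documentation example address 198.51.100.10: the four IPv4 octets become the low 32 bits appended to the 64:ff9b::/96 prefix.

IPV4="198.51.100.10"
printf "64:ff9b::%02x%02x:%02x%02x\n" $(echo "$IPV4" | tr '.' ' ')
# Prints 64:ff9b::c633:640a, the synthesized address DNS64 would return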
Getting started with DNS64 and NAT64
You can set up IPv6-only VMs with DNS64 and NAT64 in three steps:
Create VPC, subnets, VMs and firewall rules
Create a DNS64 server policy
Create a NAT64 gateway
Step 1: Create VPC, subnets, VMs, and firewall rules
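Here is a rough sketch of step 1; the names, region, and the IPV6_ONLY stack type are assumptions for illustration, so check the documentation for the exact flags supported in your project:

# Custom-mode VPC with an internal ULA IPv6 range
gcloud compute networks create v6-only-net \
  --subnet-mode=custom \
  --enable-ula-internal-ipv6

# IPv6-only subnet for the workloads
gcloud compute networks subnets create v6-only-subnet \
  --network=v6-only-net \
  --region=us-central1 \
  --stack-type=IPV6_ONLY \
  --ipv6-access-type=INTERNAL

# Allow SSH to instances in the new network
gcloud compute firewall-rules create v6-allow-ssh \
  --network=v6-only-net \
  --allow=tcp:22 \
  --source-ranges=::/0

You can then create your IPv6-only VMs in that subnet before moving on to the DNS64 server policy and the NAT64 gateway.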
And with that, we hope that you now understand how to connect your IPv6-only workloads to IPv4 destinations by using DNS64 and NAT64. To learn more about enabling DNS64 and NAT64 for IPv6-only workloads, check out the documentation.
Most businesses with mission-critical workloads have a two-fold disaster recovery solution in place that 1) replicates data to a secondary location, and 2) enables failover to that location in the event of an outage. For BigQuery, that solution takes the shape of BigQuery Managed Disaster Recovery. But the risk of data loss while testing a disaster recovery event remains a primary concern. Like traditional “hard failover” solutions, it forces a difficult choice: promote the secondary immediately and risk losing any data within the Recovery Point Objective (RPO), or delay recovery while you wait for a primary region that may never come back online.
Today, we’re addressing this directly with the introduction of soft failover in BigQuery Managed Disaster Recovery. Soft failover logic promotes the secondary region’s compute and datasets only after replication has been confirmed to be complete, providing you with full control over disaster recovery transitions, and minimizing the risk of data loss during a planned failover.
Figure 1: Comparing hard vs. soft failover
Summary of differences between hard failover and soft failover:

Use case
Hard failover: Unplanned outages, such as a region going down
Soft failover: Failover testing; requires both primary and secondary to be available

Failover timing
Hard failover: As soon as possible, ignoring any pending replication between primary and secondary; data loss is possible
Soft failover: Waits for the primary and secondary to quiesce, minimizing the potential for data loss

RPO/RTO
Hard failover: 15 minutes / 5 minutes*
Soft failover: N/A

*Supported objective depends on configuration
BigQuery soft failover in action
Imagine a large financial services company, “SecureBank,” which uses BigQuery for its mission-critical analytics and reporting. SecureBank requires a reliable Recovery Time Objective (RTO) and a 15-minute Recovery Point Objective (RPO) for its primary BigQuery datasets, as robust disaster recovery is a top priority. They regularly conduct DR drills with BigQuery Managed DR to ensure compliance and readiness for unforeseen outages.
Before the introduction of soft failover in BigQuery Managed DR, SecureBank faced a dilemma over how to perform their DR drills. While BigQuery Managed DR handled the failover of compute and associated datasets, conducting a full “hard failover” drill meant accepting the risk of up to 15 minutes of data loss if replication wasn’t complete when the failover was initiated — or significant operational disruption if they first manually verified data synchronization across regions. This often led to less realistic or more complex drills, consuming valuable engineering time and causing anxiety.
New solution:
With soft failover in BigQuery Managed DR, administrators have several options for failover procedures. Unlike hard failover for unplanned outages, soft failover initiates failover only after all data is replicated to the secondary region, to help guarantee data integrity.
Figure 2: Soft Failover Mode Selection
Figure 3: Disaster recovery reservations
Figure 4: Replication status / Failover details
The BigQuery soft failover feature is available today via the BigQuery UI, DDL, and CLI, providing enterprise-grade control for disaster recovery, confident simulations, and compliance — without risking data loss during testing. Get started today to maintain uptime, prevent data loss, and test scenarios safely.
Written by: Rommel Joven, Josh Fleischer, Joseph Sciuto, Andi Slok, Choon Kiat Ng
In a recent investigation, Mandiant Threat Defense discovered an active ViewState deserialization attack affecting Sitecore deployments leveraging sample machine keys that had been exposed in Sitecore deployment guides from 2017 and earlier. An attacker leveraged the exposed ASP.NET machine keys to perform remote code execution.
Mandiant worked directly with Sitecore to address this issue. Sitecore tracks this vulnerable configuration as CVE-2025-53690, which affects customers who deployed any version of multiple Sitecore products using sample keys exposed in publicly available deployment guides (specifically Sitecore XP 9.0 and Active Directory 1.4 and earlier versions). Sitecore has confirmed that its updated deployments automatically generate unique machine keys and that affected customers have been notified.
Refer to Sitecore’s advisory for more information on which products are potentially impacted.
Summary
Mandiant successfully disrupted the attack shortly after initiating rapid response, which ultimately prevented us from observing the full attack lifecycle. However, our investigation still provided insights into the adversary’s activity. The attacker’s deep understanding of the compromised product and the exploited vulnerability was evident in their progression from initial server compromise to privilege escalation. Key events in this attack chain included:
Initial compromise was achieved by exploiting the ViewState deserialization vulnerability CVE-2025-53690 on the affected internet-facing Sitecore instance, resulting in remote code execution.
A decrypted ViewState payload contained WEEPSTEEL, malware designed for internal reconnaissance.
Leveraging this access, the threat actor archived the root directory of the web application, indicating an intent to obtain sensitive files such as web.config. This was followed by host and network reconnaissance.
The threat actor staged tooling in a public directory, which included:
Open-source network tunnel tool, EARTHWORM
Open-source remote access tool, DWAGENT
Open-source Active Directory (AD) reconnaissance tool, SHARPHOUND
Local administrator accounts were created and used to dump SAM/SYSTEM hives in an attempt to compromise cached administrator credentials. The compromised credentials then enabled lateral movement via RDP.
DWAgent provided persistent remote access and was used for Active Directory reconnaissance.
Figure 1: Attack lifecycle
Initial Compromise
External Reconnaissance
The threat actor began their operation by probing the victim’s web server with HTTP requests to various endpoints before ultimately shifting their attention to the /sitecore/blocked.aspx page. This page is a legitimate Sitecore component that simply returns a message if a request was blocked due to licensing issues. The page’s use of a hidden ViewState form (a standard ASP.NET feature), combined with being accessible without authentication, made it a potential target for ViewState deserialization attacks.
ViewState Deserialization Attack
ViewStates are an ASP.NET feature designed to persist the state of webpages by storing it in a hidden HTML field named __VIEWSTATE. ViewState deserialization attacks exploit the server’s willingness to deserialize ViewState messages when validation mechanisms are either absent or circumvented. When machine keys (which protect ViewState integrity and confidentiality) are compromised, the application effectively loses its ability to differentiate between legitimate and malicious ViewState payloads sent to the server.
Local web server (IIS) logs recorded that the threat actor’s attack began by sending an HTTP POST request to the blocked.aspx endpoint, which was met with an HTTP 302 “Found” response. This web request coincided with a “ViewState verification failed” message in Windows application event logs (Event ID 1316) containing the crafted ViewState payload sent by the threat actor:
Log: Application
Source: ASP.NET 4.0.30319.0
EID: 1316
Type: Information
Event code: 4009
Event message: Viewstate verification failed. Reason: Viewstate was invalid.
<truncated>
ViewStateException information:
Exception message: Invalid viewstate.
Client IP: <redacted>
User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1;
Trident/5.0) chromeframe/10.0.648.205 Mozilla/5.0
(Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko)
Chrome/121.0.0.0 Safari/537.36
PersistedState: <27760 byte encrypted + base64 encoded payload>
Referer:
Path: /sitecore/blocked.aspx
Mandiant recovered a copy of the server’s machine keys, which were stored in the ASP.NET configuration file web.config. Like many other ViewState deserialization attacks, this particular Sitecore instance used compromised machine keys. Knowledge of these keys enabled the threat actor to craft malicious ViewStates that were accepted by the server by using tools like the public ysoserial.net project.
Initial Host Reconnaissance
Mandiant decrypted the threat actor’s ViewState payload using the server’s machine keys and found it contained an embedded .NET assembly named Information.dll. This assembly, which Mandiant tracks as WEEPSTEEL, functions as an internal reconnaissance tool and has similarities to the GhostContainer backdoor and an information-gathering payload previously observed in the wild.
About WEEPSTEEL
WEEPSTEEL is a reconnaissance tool designed to gather system, network, and user information. This data is then encrypted and exfiltrated to the attacker by disguising it as a benign __VIEWSTATE response.
The payload is designed to exfiltrate the following system information for reconnaissance:
// Code Snippet from Host Reconnaissance Function
Information.BasicsInfo basicsInfo = new Information.BasicsInfo
{
    Directories = new Information.Directories
    {
        CurrentWebDirectory = HostingEnvironment.MapPath("~/")
    },
    // Gather system information
    OperatingSystemInformation = Information.GetOperatingSystemInformation(),
    DiskInformation = Information.GetDiskInformation(),
    NetworkAdapterInformation = Information.GetNetworkAdapterInformation(),
    Process = Information.GetProcessInformation()
};

// Serialize the 'basicsInfo' object into a JSON string
JavaScriptSerializer javaScriptSerializer = new JavaScriptSerializer();
text = javaScriptSerializer.Serialize(basicsInfo);
WEEPSTEEL appears to borrow some functionality from ExchangeCmdPy.py, a public tool tailored for similar ViewState-related intrusions. This comparison was originally noted in Kaspersky’s write-up on the GhostContainer backdoor. Like ExchangeCmdPy, WEEPSTEEL sends its output through a hidden HTML field masquerading as a legitimate __VIEWSTATE parameter.
Subsequent HTTP POST requests to the blocked.aspx endpoint from the threat actor would result in HTTP 200 “OK” responses, which Mandiant assesses would have contained an output in the aforementioned format. As the threat actor continued their hands-on interaction with the server, Mandiant observed repeated HTTP POST requests with successful responses to the blocked.aspx endpoint.
Establish Foothold
Following successful exploitation, the threat actor gained NETWORK SERVICE privileges, equivalent to those of the IIS worker process w3wp.exe. This access provided the actor a starting point for further malicious activities.
Config Extraction
The threat actor then exfiltrated critical configuration files by archiving the contents of inetpub\sitecore\SitecoreCD\Website, a Sitecore Content Delivery (CD) instance’s web root. This directory contained sensitive files, such as the web.config file, that provide sensitive information about the application’s backend and its dependencies, which would help enable post-exploitation activities.
Host Reconnaissance
After obtaining the key server configuration files, the threat actor proceeded to fingerprint the compromised server through host and network reconnaissance, including but not limited to enumerating running processes, services, user accounts, TCP/IP configurations, and active network connections.
whoami
hostname
net user
tasklist
ipconfig /all
tasklist /svc
netstat -ano
nslookup <domain>
net group domain admins
net localgroup administrators
Staging Directory
The threat actor leveraged public directories such as Music and Video for staging and deploying their tooling. Files written into the Public directory include:
File: C:\Users\Public\Music\7za.exe
Description: command-line executable for the 7-Zip file archiver
EARTHWORM is an open-source tunneler that allows attackers to create a covert channel to and from a victim system over a separate protocol to avoid detection and network filtering, or to enable access to otherwise unreachable systems.
During our investigation, EARTHWORM was executed to initiate reverse SOCKS proxy connections back to the following command-and-control (C2) servers:
130.33.156[.]194:443
103.235.46[.]102:80
File: C:\Users\Public\Music\1.vbs
Description: Attack VBScript: Used to execute threat actor commands, its content varies based on the desired actions.
SHA-256: <hash varies>
In one instance where the file 1.vbs was retrieved, it contained simple VBScript code to launch EARTHWORM.
Following initial compromise, the threat actor elevated their access from NETWORK SERVICE privileges to the SYSTEM or ADMINISTRATOR level.
This involved creating local administrator accounts and obtaining access to domain administrator accounts. The threat actor was observed using additional tools to escalate privileges.
Adding Local Administrators
asp$: The threat actor leveraged a privilege escalation tool to create the local administrator account, asp$. The naming convention, mimicking an ASP.NET service account with the common $ suffix, suggests an attempt to blend in and evade detection.
sawadmin: At a later stage, the threat actor established a DWAGENT remote session to create a second local administrator account.
net user sawadmin {REDACTED} /add
net localgroup administrators sawadmin /add
Credential Dumping
The threat actor established RDP access to the host using the two newly created accounts and proceeded to dump the SYSTEM and SAM registry hives from both accounts. While redundant, this gave the attacker the information necessary to extract password hashes of local user accounts on the system. The activities associated with each account are as follows:
asp$
reg save HKLM\SYSTEM c:\users\public\system.hive
reg save HKLM\SAM c:\users\public\sam.hive
sawadmin: Prior to dumping the registry hives, the threat actor executed GoToken.exe. Unfortunately, the binary was not available for analysis.
GoToken.exe -h
GoToken.exe -l
GoToken.exe -ah
GoToken.exe -t
reg save HKLM\SYSTEM SYSTEM.hiv
reg save HKLM\SAM SAM.hiv
Maintain Presence
The threat actor maintained persistence through a combination of methods, leveraging both created and compromised administrator credentials for RDP access. Additionally, the threat actor issued commands to maintain long-term access to accounts. This included modifying settings to disable password expiration for administrative accounts of interest:
net user <AdminUser> /passwordchg:no /expires:never
wmic useraccount where name='<AdminUser>' set PasswordExpires=False
For redundancy and continued remote access, the DWAGENT tool was also installed.
Remote Desktop Protocol
The actor used the Remote Desktop Protocol extensively. The traffic was routed through a reverse SOCKS proxy created by EARTHWORM to bypass security controls and obscure their activities. In one RDP session, the threat actor, operating under the context of the account asp$, downloaded additional attacker tooling, dwagent.exe and main.exe, into C:\Users\asp$\Downloads.
File Path: C:\Users\asp$\Downloads\dwagent.exe
MD5: n/a
Description: DWAgent installer

File Path: C:\Users\asp$\Downloads\main.exe
MD5: be7e2c6a9a4654b51a16f8b10a2be175
Description: Downloaded from hxxp://130.33.156[.]194/main.exe

Table 1: Files written in the RDP session
Remote Access Tool: DWAGENT
DWAGENT is a legitimate remote access tool that enables remote control over the host. DWAGENT operates as a service with SYSTEM privilege and starts automatically, ensuring elevated and persistent access. During the DWAGENT remote session, the attacker wrote the file GoToken.exe. The commands executed suggest that the tool was used to aid in extracting the registry hives.
File Path: C:\Users\Public\Music\GoToken.exe
MD5: 62483e732553c8ba051b792949f3c6d0
Description: Binary executed prior to dumping of the SAM/SYSTEM hives

Table 2: File written in the DWAgent remote session
Internal Reconnaissance
Active Directory Reconnaissance
During a DWAGENT remote session, the threat actor executed commands to identify Domain Controllers within the target network. The actor then accessed the SYSVOL share on these identified DCs to search for cpassword within Group Policy Object (GPO) XML files. This is a well-known technique attackers employ to discover privileged credentials mistakenly stored in a weakly encrypted format within the domain.
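For context, this technique typically amounts to a recursive search of the domain’s SYSVOL share along these lines (the domain name is a placeholder):

findstr /S /I cpassword \\example.com\sysvol\example.com\policies\*.xml

Any cpassword value recovered this way is encrypted with a publicly documented AES key and can be trivially decrypted, which is why storing passwords in Group Policy Preferences has long been deprecated.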
The threat actor then transitioned to a new RDP session using a legitimate administrator account. From this session, SHARPHOUND, the data collection component for the Active Directory security analysis platform BLOODHOUND, was downloaded via a browser and saved to C:\Users\Public\Music\sh.exe.
Following the download, the threat actor returned to the DWAGENT remote session and executed sh.exe, performing extensive Active Directory reconnaissance.
sh.exe -c all
Once the reconnaissance concluded, the threat actor switched back to the RDP session (still using the compromised administrator account) to archive the SharpHound output, preparing it for exfiltration.
With administrator accounts compromised, the earlier-created asp$ and sawadmin accounts were removed, signaling a shift to more stable and covert access methods.
Move Laterally
The compromised administrator accounts were used to RDP to other hosts. On these systems, the threat actor executed commands to continue their reconnaissance and deploy EARTHWORM.
On one host, the threat actor logged in via RDP using a compromised admin account. Under the context of this account, the threat actor then continued to perform internal reconnaissance commands such as:
quser
whoami
net user <AdminUser> /domain
nltest /DCLIST:<domain>
nslookup <domain-controller>
Recommendations
Mandiant recommends implementing ASP.NET security best practices, including automated machine key rotation, enabling ViewState Message Authentication Code (MAC), and encrypting any plaintext secrets within the web.config file. For more details, refer to the following resources:
Google Security Operations Enterprise and Enterprise+ customers can leverage the following product threat detections and content updates to help identify and remediate threats. All detections have been automatically delivered to Google Security Operations tenants within the Mandiant Frontline Threats curated detections ruleset. To leverage these updated rules, access Content Hub and search on any of the rule names listed below, then View and Manage each rule you wish to implement or modify.
Earthworm Tunneling Indicators
User Account Created By Web Server Process
Cmd Launching Process From Users Music
Sharphound Recon
User Created With No Password Expiration Execution
Discovery of Privileged Permission Groups by Web Server Process
We would like to extend our gratitude to the Sitecore team for their support throughout this investigation. Additionally, we are grateful to Tom Bennett and Nino Isakovic for their assistance with the payload analysis. We also appreciate the valuable input and technical review provided by Richmond Liclican and Tatsuhiko Ito.
The Agent Development Kit (ADK) Hackathon is officially a wrap, closing with over 10,400 participants from 62 countries, 477 submitted projects, and over 1,500 agents built! Building on the excitement from our initial announcement, the hackathon proved to be an invaluable opportunity for developers to experiment with cutting-edge technologies and build the next generation of agents.
The hackathon focused on designing and orchestrating interactions between multiple agents using ADK to tackle complex tasks like automating processes, analyzing data, improving customer service, and generating content.
Now, let’s give a massive round of applause to our outstanding winners. These teams demonstrated exceptional skill, ingenuity, and a deep understanding of ADK.
Grand Prize
SalesShortcut, by Merdan Durdyyev and Sergazy Nurbavliyev. SalesShortcut is a comprehensive AI-powered Sales Development Representative (SDR) system built with a multi-agent architecture for automated lead generation, research, proposal generation, and outreach.
North America regional winner
Energy Agent AI, by David Babu. Energy Agent AI is a multi-agent AI transforming energy customer management through Google ADK orchestration.
Latin America regional winner
Edu.AI – Multi-Agent Educational System for Brazil, by Giovanna Moeller. Edu.AI democratizes Brazil’s education with autonomous AI agents that evaluate essays, generate personalized study plans, and create interdisciplinary mock exams, all in one intelligent system.
Asia Pacific regional winner
GreenOps, by Aishwarya Nathani and Nikhil Mankani. GreenOps automates sustainability as an AI team that continuously audits, forecasts, and optimizes cloud infrastructure.
Europe, Middle East, Africa regional winner
Nexora-AI, by Matthias Meierlohr, Luca Bozzetti, Erliassystems, and Markus Huber. Nexora is next-gen personalized education: learn through interactive lessons with visuals, quizzes, and smart AI support.
Honorable mention #1
Particle Physics Agent, by ZX Jin and Tianyu Zhang. Particle Physics Agent is an AI agent that converts natural language into validated Feynman diagrams, using real physical laws and high-fidelity data — bridging theory, automation, and symbolic reasoning.
Honorable mention #2
TradeSageAI, by Suds Kumar. TradeSage AI is an intelligent multi-agent financial analysis platform built using ADK, Agent Engine, Cloud Run, and Vertex AI that revolutionizes trading hypothesis evaluation.
Honorable mention #3
Bleach, by Vivek Shukla. Bleach is a visual AI agent builder built using Google ADK that lets you describe agents in plain English, design them visually, and test them instantly.
Inspired by the ADK Hackathon?
Learn more about ADK and continue the conversation in the Google Developer Program forums.
Ready for the next hackathon?
Google Kubernetes Engine (GKE) is turning 10, and we’re celebrating with a hackathon! Join us to build powerful AI agents that interact with microservice applications using Google Kubernetes Engine and Google AI models. Compete for over $50,000 in prizes and demonstrate the power of building agentic AI on GKE.
Submissions are open from August 18, 2025 to September 22, 2025. Learn more and register at our hackathon homepage.
Privacy-protecting Confidential Computing has come a long way since we introduced Confidential Virtual Machines (VMs) five years ago. The technology, which protects data while it is in use, closes a security gap left by encrypting data only at rest and in transit.
By isolating workloads in hardware-based Trusted Execution Environments (TEEs), Confidential Computing empowers customers to process their most sensitive information in the public cloud with assurance.
As part of the advancements we’ve made with Confidential Computing, we added even more security capabilities with the introduction of Confidential VMs with Intel Trust Domain Extensions (TDX) last year. Intel TDX creates an isolated trust domain (TD) in a VM, uses hardware extensions for managing and encrypting memory to protect cloud workloads, and offers hardware-based remote attestation for verification.
Google Cloud Console now offers Google Compute Engine (GCE) customers a new interface for Intel TDX — no code changes required. To get started, follow these steps:
Start at the GCE Create an instance page
Go to the Security tab and under Confidential VM service, click Enable
Then select Intel TDX from the dropdown menu and click Confirm.
It’s that simple to create a Confidential VM.
Create a new Confidential VM with Intel TDX in the Google Cloud console.
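If you prefer the command line, a roughly equivalent sketch with gcloud looks like this (the machine type, zone, and image below are assumptions; check the Confidential VM documentation for supported combinations):

gcloud compute instances create my-tdx-vm \
  --zone=us-central1-a \
  --machine-type=c3-standard-4 \
  --confidential-compute-type=TDX \
  --maintenance-policy=TERMINATE \
  --image-family=ubuntu-2204-lts \
  --image-project=ubuntu-os-cloud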
Get Confidential Computing in more regions and zones
Confidential VMs with Intel TDX were first available with support for three regions (and nine zones). To accommodate growing demand, we’ve expanded support for Intel TDX on the C3 machine series to 10 regions (and 21 zones), and we are planning more for the future. The full list is available here. As regional availability and scalability are critical, your account team is available to help you plan early to ensure your capacity needs are met.
Confidential GKE Nodes with Intel TDX, now generally available
Confidential GKE Nodes are built on top of Confidential VM and deliver hardware-based protections to your Google Kubernetes Engine (GKE) clusters and node pools to ensure that your containerized workloads remain encrypted in memory. Today, Confidential GKE Nodes are generally available with Intel TDX on GKE Standard and GKE Autopilot.
Confidential GKE Nodes with Intel TDX on the C3 machine series can be created on GKE Standard via CLI, API, UI, and Terraform. The confidential setting can be set at the cluster level or the node pool level with no code changes. You can learn more here.
Confidential GKE Nodes with Intel TDX on the C3 machine series can also be created on GKE Autopilot. It can be enabled through the use of custom compute classes. In GKE, a compute class is a profile that consists of a set of node attributes that GKE uses to provision the nodes that run your workloads during autoscaling events. Check out our documentation to get started.
Confidential Space with Intel TDX, now generally available
Also built on Confidential VM, our Confidential Space offering is a robust solution for many common issues including addressing insider threats, enabling joint machine-learning training and private gen AI inference, and fostering multi-party collaboration on sensitive data. Here are just a few examples of what our customers have built with Confidential Space:
Symphony demonstrated with its Confidential Cloud how SaaS companies can guarantee isolation of customer data from privileged insiders in the highly regulated financial industry.
Duality delivered privacy-preserving federated learning solutions for a broad range of use cases in healthcare, financial services, and the public sector.
Previously, Confidential Space was only available with AMD-based technology and hardware (on the N2D, C2D, C3D, and C4D machine series), but now it is also available with Intel-based technology and hardware. This is ideal for those wanting attestation guarantees with a hardware root of trust and for those focused on Intel’s C3 machine series.
Additionally, Confidential Space with Intel TDX is measured into runtime measurement registers (RTMR) and the measurements are verified by Google Cloud Attestation. Note that for Confidential VMs with Intel TDX, RTMRs are now populated as well. Confidential Space benefits are highlighted in the NCC Group’s latest independent security evaluation.
Confidential VM and Confidential GKE Nodes with NVIDIA H100 GPUs, now generally available
If you’re looking for performance and security while protecting data in use, Confidential VM and Confidential GKE Nodes with NVIDIA H100 GPUs on the accelerator-optimized A3 machine series are now generally available. These offerings deliver Google Cloud’s first Confidential GPUs, focus on ease of use to meet the demand for secure computing, and extend security to data-intensive AI and ML workloads by enabling Intel TDX on the CPU and NVIDIA Confidential Computing on the GPU. You can now secure your data during inference and training across models without sacrificing performance.
Intel’s attestation verifier service, Intel Tiber Trust Authority, now offers a free tier (with optional paid support), making secure attestation more accessible for all. Google Cloud Confidential VMs and Confidential Space are both integrated with Intel Tiber Trust Authority as a third-party attestation service.
“Thanks to the joint efforts of Super Protocol, Google Cloud, and NVIDIA, the world now gains a new layer of possibility — unlocking Confidential AI without cloud borders. With A3 Confidential VMs built on NVIDIA H100 GPUs now integrated into Super’s decentralized infrastructure and marketplace, companies can securely run, monetize, and collaborate on sensitive AI and data — across any environment. This enables seamless collaboration between Google Cloud customers and partners in other clouds — with no need for shared trust, manual agreements, or compromise. For the broader market, A3 instances at scale accelerate global access, while Super ensures confidentiality, verifiability, and self-sovereignty — fully automated and requiring no expertise in confidential computing. We are excited to open this next chapter of Confidential AI, built to work wherever you and your partners are,” said Nukri Basharuli, founder and CEO, Super Protocol.
“We’re proud to have partnered with Google Cloud to validate their Confidential Computing-enabled GPU solution — a major step forward in securing sensitive data for AI and machine learning workloads, without compromising on performance or scalability. Confidential Computing allows organizations to process sensitive workloads in the cloud while protecting sensitive data and models from both the cloud provider and the organization’s insiders and internal threats. However, for gen AI and agentic AI use cases, protecting the CPU alone isn’t enough — both CPU and GPU must also run in confidential mode with mutual trust. With Google Cloud’s new offering, Anjuna can now launch Confidential Containers that leverage Intel TDX and NVIDIA H100 GPUs in confidential mode. This ensures that data, configurations, secrets, and code remain protected end-to-end from any untrusted entity, bringing state-of-the-art security for sensitive data,” said Steve Van Lare, CTO, Anjuna Security.
“With data processing worldwide growing up to three times faster than ever before and doubling every six months, the future of cloud computing must be built on trust. In collaboration with Google, Modelyo leverages Confidential VMs on the A3 machine series with NVIDIA H100 GPUs, transforming Confidential Computing into a seamless, intuitive, and fully integrated cloud experience. This enables us to deliver end-to-end managed solutions across interconnected environments, empowering organizations to innovate confidently knowing their data remains effortlessly protected at every stage,” said Benny Meir, CEO, Modelyo.
How to get started with Confidential Computing
To add that extra layer of protection and privacy to your sensitive workloads, check out our documentation for Confidential VMs and Confidential GKE Nodes today.
AI is transforming data into a strategic asset, driving demand for flexible, integrated, and real-time data architectures. But yesterday’s data tools can’t handle AI’s demand for massive volumes of real-time and multi-modal data. Data lakes, for instance, offer flexibility for raw data but lack schema enforcement and consistency. Meanwhile, traditional data marts, warehouses, and lake architectures often result in silos and require costly ETL to bridge analytical, unstructured, and operational data.
The shift to open lakehouses relies on open table formats like Apache Iceberg, which has emerged as the de facto open-source table format for data lakes. Today, alongside our partners Confluent, Databricks, dbt, Fivetran, Informatica and Snowflake, we’re excited to reiterate our commitment to this open standard. Whether you’re integrating best-of-breed services from diverse providers or navigating a complex data landscape because of a merger and acquisition, adopting an open table format like Iceberg can help you dismantle your traditional data silos.
United in our Iceberg support
At its core, Iceberg provides a metadata layer that enables efficient query planning and data management. This crucial layer, encompassing table schema, partitioning, and data file locations, powers advanced features like time travel and data pruning, which allows data teams to swiftly pinpoint relevant data, streamline performance, and accelerate insights.
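As a simple illustration of what that metadata layer enables, the sketch below uses Spark SQL’s standard Iceberg time-travel syntax. The catalog, namespace, table name, and snapshot ID are hypothetical, and the session is assumed to already have an Iceberg catalog named `demo` configured.

```python
# Illustrative sketch of Iceberg time travel from PySpark. Table and catalog
# names are made up; the VERSION AS OF / TIMESTAMP AS OF syntax is standard
# Iceberg support in Spark SQL.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("iceberg-time-travel").getOrCreate()

# Read the table as it existed at a specific snapshot...
spark.sql(
    "SELECT * FROM demo.sales.orders VERSION AS OF 4348097427783919838"
).show()

# ...or as of a point in time, without copying or restoring any data.
spark.sql(
    "SELECT * FROM demo.sales.orders TIMESTAMP AS OF '2025-01-01 00:00:00'"
).show()

# Iceberg's metadata tables also expose the snapshot history directly.
spark.sql(
    "SELECT snapshot_id, committed_at FROM demo.sales.orders.snapshots"
).show()
```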
The data management industry is coalescing around the open Apache Iceberg standard. At Google Cloud, we recently delivered innovations that leverage Google Cloud Storage (GCS) to provide an enterprise-grade experience for managing and interoperating with Iceberg data, including BigLake tables for Apache Iceberg and BigLake Metastore with a new REST Catalog API. Databricks recently announced Iceberg support in its Unity Catalog, allowing users to read and write managed Iceberg tables across a variety of catalogs. Similarly, Snowflake supports interoperable storage with Apache Iceberg tables, allowing organizations to access Iceberg data within Snowflake and minimizing the latency associated with ingesting or copying data.
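For example, an engine such as Apache Spark can be pointed at an Iceberg REST catalog with a handful of standard catalog properties. In the sketch below, the catalog name, endpoint URI, warehouse bucket, and runtime package version are placeholders rather than the exact BigLake values; consult the BigLake Metastore documentation for the settings your project needs.

```python
# Hedged sketch: configuring Spark against an Iceberg REST catalog such as
# the BigLake Metastore REST Catalog API mentioned above. Values are
# placeholders for illustration.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("iceberg-rest-catalog")
    # The Iceberg Spark runtime must be on the classpath; version is an example.
    .config(
        "spark.jars.packages",
        "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.0",
    )
    # Standard Iceberg catalog properties; only the values are assumptions.
    .config("spark.sql.catalog.lake", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.lake.type", "rest")
    .config(
        "spark.sql.catalog.lake.uri",
        "https://example-rest-catalog.googleapis.com/iceberg/v1",  # placeholder
    )
    .config("spark.sql.catalog.lake.warehouse", "gs://my-lakehouse-bucket")
    .getOrCreate()
)

# Any Iceberg-compatible engine pointed at the same catalog operates on the
# same single copy of the data.
spark.sql(
    "CREATE TABLE IF NOT EXISTS lake.analytics.events "
    "(id BIGINT, ts TIMESTAMP) USING iceberg"
)
spark.sql("SELECT COUNT(*) FROM lake.analytics.events").show()
```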
“This open, standard interface allows any Iceberg-compatible engine — including BigQuery, Apache Spark, Databricks, and Snowflake — to operate on the same, single copy of Iceberg data. This powerful architecture even bridges the gap between analytical and operational workloads. By supporting Iceberg and other open table formats in Unity Catalog, we’re unifying data and governance across the enterprise to truly democratize data and AI. No matter what table format our customers choose, we ensure it’s accessible, optimized, and governed for Business and Technical users.” – Ryan Blue, Original creator of Apache Iceberg, Member of Technical Staff, Databricks
“Customers shouldn’t have to choose between open formats and best-in-class performance or business continuity. Snowflake’s native support for open source standards unifies data while preserving flexibility and choice, paving the way to build and securely scale high-performing lakehouses without silos or operational overhead.” – Rithesh Makkena, Senior Regional Vice President of Partner Solutions Engineering, Snowflake
At Google Cloud, we’re committed to an open Data Cloud that lets data teams build modern, data-driven applications wherever their workloads are, while using open source, open standards, and open formats like Apache Iceberg.
We also partner closely on Apache Iceberg initiatives with an extensive ecosystem that includes Confluent, dbt Labs, Fivetran, and Informatica.
“Apache Iceberg has emerged as a critical enabler for the open data ecosystem, providing the flexibility, interoperability, and consistency that modern, real-time data architectures demand. At Confluent, we’re dedicated to helping customers leverage this power. Our Tableflow innovation, by representing Apache Kafka topics as open Iceberg tables, exemplifies how this open format eliminates complex ETL and ensures data is always fresh, accurate, and instantly actionable for critical real-time analytics and AI.” – Shaun Clowes, Chief Product Officer, Confluent
“dbt was born out of an open source project to help people transform data. Open data ecosystems are at the core of what we do. Supporting Iceberg in dbt ensures that our customers will have standards and choices for how they use their transformed data in their AI and data workflows.” – Ryan Segar, Chief Product Officer, dbt Labs
“Open table formats like Apache Iceberg make it possible to reduce data copies by decoupling your data from the compute engines used to access it. Fivetran’s Managed Data Lake service ensures data is delivered to cloud storage as transactionally consistent tables in a way that preserves the structure from the source. Fivetran’s Managed Data Lake seamlessly integrates with Google Cloud Storage and BigLake metastore, providing a single governance layer within customers’ Google projects and making Iceberg tables just as easy to query as native BigQuery tables.” – Dan Lynn, Vice President of Product Management, Databases, Fivetran
“Our collaboration with Google on the Iceberg format is ushering in a new era of open, interoperable data architecture. Together, we’re enabling organizations to unify their data effortlessly, accelerate insights and innovate without limits by eliminating silos and unlocking the full power of the modern data ecosystem.” – Rik Tamm-Daniels, GVP, Technology Alliances, Informatica
The power of shared data
By adopting Iceberg, customers can share data across different query engines and platforms, leveraging shared datasets for a multitude of workloads and improving interoperability. Organizations can now share data from Snowflake to BigQuery, unlocking powerful BigQuery ML capabilities such as text generation or machine translation, and simplifying ML model development and deployment. Likewise, data teams can share data with BigQuery from Databricks to achieve cost efficiencies, leverage built-in ML, or implement agentic workflows.
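As a sketch of what that sharing looks like in practice, once an Iceberg table produced elsewhere (for example in Snowflake or Databricks) is registered with BigQuery, it can be queried like any other table. The project, dataset, and table names below are hypothetical.

```python
# Hypothetical example: querying a shared Iceberg table from BigQuery with the
# google-cloud-bigquery client. Names are placeholders for illustration.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # assumption: your project ID

sql = """
    SELECT customer_id, SUM(amount) AS total_spend
    FROM `my-project.shared_lakehouse.orders_iceberg`
    GROUP BY customer_id
    ORDER BY total_spend DESC
    LIMIT 10
"""

for row in client.query(sql).result():
    print(row.customer_id, row.total_spend)
```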
Customers like Global Payments have embraced Iceberg for more flexibility across their diverse data tools: with BigQuery and Snowflake, they serve millions of merchants, analyzing transaction data to unlock deep customer insights.
Likewise, Unilever has transformed its data management approach with Iceberg, which allows it to manage large datasets more efficiently, particularly in a data lakehouse architecture. Using a combination of Google Cloud and Databricks, Unilever stores and analyzes large amounts of complex data, allowing them and their suppliers to take action wherever and whenever needed.
Whether you create your Iceberg tables in BigQuery, Databricks, or Snowflake, you can leverage the resulting data from any platform and have your tables stay continuously up-to-date. This interoperability will help you operate with greater efficiency and security, drastically reducing the time you spend moving or duplicating datasets, and eliminating the need for complex pipelines to utilize your preferred tools, platforms, and processing systems.
Get started today with BigQuery and BigLake for your AI-ready lakehouse. You can learn how to build an open data lakehouse with BigQuery and Snowflake by watching a tutorial, then diving into the Quickstart Guide. Learn how to connect and build an open data lakehouse with BigQuery and Databricks.