A year ago today, Google Cloud filed a formal complaint with the European Commission about Microsoft’s anti-competitive cloud licensing practices — specifically those that impose financial penalties on businesses that use Windows Server software on Azure’s biggest competitors.
Despite regulatory scrutiny, it’s clear that Microsoft intends to keep its restrictive licensing policies in place for most cloud customers. In fact, it’s getting worse.
As part of a recent earnings call, Microsoft disclosed that its efforts to force software customers to use Azure are “not anywhere close to the finish line,” and represented one of three pillars “driving [its] growth.” As we approach the end of September, Microsoft is imposing another wave of licensing changes to force more customers to Azure by preventing managed service providers from hosting certain workloads on Azure’s competitors.
Regulators have taken notice. As part of a comprehensive investigation, the U.K.’s Competition and Markets Authority (CMA) recently found that restrictive licensing harms cloud customers, competition, economic growth, and innovation. At the same time, a growing number of regulators around the world are also scrutinizing Microsoft’s anti-competitive conduct — proving that fair competition is an issue that transcends politics and borders.
While some progress has been made, restrictive licensing continues to be a global problem, locking in cloud customers, harming economic growth, and stifling innovation.
Economic, security, and innovation harms
Restrictive cloud licensing has caused an enormous amount of harm to the global economy over the last year. This includes direct penalties that Microsoft forces businesses to pay, and downstream harms to economic growth, cybersecurity, and innovation. Ending restrictive licensing could help supercharge economies around the world.
Microsoft still imposes a 400% price markup on customers who choose to move legacy workloads to competitors’ clouds. This penalty forces customers onto Azure by making it more expensive to use a competitor. A mere 5% increase in cloud pricing due to lack of competition costs U.K. cloud customers £500 million annually, according to the CMA. A separate study in the EU found restrictive licensing amounted to a billion-Euro tax on businesses.
With AI technologies disrupting the business market in dramatic ways, ending Microsoft’s anti-competitive licensing is more important than ever as customers move to the cloud to access AI at scale. Customers, not Microsoft, should decide what cloud — and therefore what AI tools — work best for their business.
The ongoing risk of inaction
Perhaps most telling of all, the CMA found that since some of the most restrictive licensing terms went into place over the last few years, Microsoft Azure has gained customers at two or even three times the rate of competitors. Less choice and weaker competition are exactly the type of “existential challenge” to Europe’s competitiveness that the Draghi report warned of.
Ending restrictive licensing could help governments “unlock up to €1.2 trillion in additional EU GDP by 2030” and “generate up to €450 billion per year in fiscal savings and productivity gains,” according to a recent study by the European Centre for International Political Economy. Now is the time for regulators and policymakers globally to act to drive forward digital transformation and innovation.
In the year since our complaint to the European Commission, our message is as clear as ever: Restrictive cloud licensing practices harm businesses and undermine European competitiveness. To drive the next century of technology innovation and growth, regulators must act now to end these anti-competitive licensing practices.
Autopilot is an operational mode for Google Kubernetes Engine (GKE) that provides a fully managed environment and takes care of operational details, like provisioning compute capacity for your workloads. Autopilot allows you to spend more time on developing your own applications and less time on managing node-level details. This year, we upgraded Autopilot’s autoscaling stack to a fully dynamic container-optimized compute platform that rapidly scales horizontally and vertically to support your workloads. Simply attach a horizontal pod autoscaler (HPA) or vertical pod autoscaler (VPA) to your environment, and experience a fully dynamic platform that can scale rapidly to serve your users.
More and more customers, including Hotspring and Contextual AI, understand that Autopilot can dramatically simplify Kubernetes cluster operations and enhance resource efficiency for their critical workloads. In fact, in 2024, 30% of active GKE clusters were created in Autopilot mode. The new container-optimized compute platform has also proved popular with customers, who report dramatic improvements in provisioning time. The faster GKE provisions capacity, the more responsive your workloads become, improving your customers’ experience and optimizing costs.
Today, we are pleased to announce that the best of Autopilot is now available in all qualified GKE clusters, not just dedicated Autopilot ones. Now, you can utilize Autopilot’s container-optimized compute platform and ease of operation from existing GKE clusters. It’s generally available, starting with clusters enrolled in the Rapid release channel and running GKE version 1.33.1-gke.1107000 or later. Most clusters will qualify and be able to access these new features as they roll out to the other release channels, except clusters enrolled in the Extended channel and those that use the older routes-based networking. To access these new features, enroll in the Rapid channel and upgrade your cluster version, or wait to be auto-upgraded.
Autopilot features are offered in Standard clusters via compute classes, which are a modern way to group and specify compute requirements for workloads in GKE. GKE now has two built-in compute classes, autopilot and autopilot-spot, that are pre-installed on all qualified clusters running on GKE 1.33.1-gke.1107000 or later and enrolled in the Rapid release channel. Running your workload on Autopilot’s container-optimized compute platform is as easy as specifying the autopilot (or autopilot-spot) compute class in your workload specification.
Better still, you can make the Autopilot container-optimized compute platform the default for a namespace, a great way to save both time and money. You get efficient bin-packing, where the workload is charged for resource requests (and can even still burst!), rapid scaling, and you don’t have to plan your node shapes and sizes.
You set this default at the namespace level, so workloads in that namespace run on the container-optimized compute platform unless they specify otherwise.
Pod sizes for the container-optimized compute platform start at 50 milli-CPU (that’s just 5% of 1 CPU core!), and can scale to 28 vCPU. With the container-optimized compute platform you only pay for the resources your Pod requests, so you don’t have to worry about system overhead or empty nodes. Pods larger than 28 vCPU, or those with specific hardware requirements, can also run in Autopilot mode on specialized compute with node-based pricing via customized compute classes.
Run AI workloads on GPUs and TPUs with Autopilot
It’s easy to pair Autopilot’s container-optimized compute platform with specific hardware such as GPUs, TPUs and high-performance CPUs to run your AI workloads. You can run those workloads in the same cluster, side by side with Pods on the container-optimized compute platform. By choosing Autopilot mode for these AI workloads, you benefit from Autopilot’s managed node properties, where we take a more active role in management. Furthermore, you also get our enterprise-grade privileged admission controls that require workloads to run in user-space, for better supportability, reliability and an improved security posture.
You can also define your own customized compute class that runs in Autopilot mode with specific hardware, for example a G2 machine type with NVIDIA L4 GPUs and two priority rules.
We’re also making compute classes work better with a new provisioning mode that automatically provisions resources for compute classes, without changing how other workloads are scheduled on existing node pools. This means you can now adopt the new deployment paradigm of compute class (including the new Autopilot-enabled compute classes) at your own pace, without affecting existing workloads and deployment strategies.
Until now, to use compute class in Standard clusters with automatic node provisioning, you needed to enable node auto-provisioning for the entire cluster. Node auto-provisioning has been part of GKE for many years, but it was previously an all-or-nothing decision — you couldn’t easily combine a manual node pool with a compute class provisioned by node auto-provisioning without potentially changing how workloads outside of the compute class were scheduled. Now you can, with our new automatically provisioned compute classes. All Autopilot compute classes use this system, so it’s easy to run workloads in Autopilot mode side-by-side with your existing deployments (e.g., on manual node pools). You can also enable this feature on any compute class starting with clusters in the Rapid channel running GKE version 1.33.3-gke.1136000 or later.
With the Autopilot mode for compute classes in Standard clusters, and the new automatic provisioning mode for all compute classes, you can now introduce compute class as an option to more clusters without impacting how any of your existing workloads are scheduled. Customers we’ve spoken to like this, as they can adopt these new patterns gradually for new workloads and by migrating existing ones, without needing to plan a disruptive switch-over.
Autopilot for all
At Google Cloud, we believe in the power of GKE’s Autopilot mode to simplify operations for your GKE clusters and make them more efficient. Now, those benefits are available to all GKE customers! To learn more about GKE Autopilot and how to enable it for your clusters, check out these resources.
The role of the data scientist is rapidly transforming. For the past decade, their mission has centered on analyzing the past to run predictive models that informed business decisions. Today, that is no longer enough. The market now demands that data scientists build the future by designing and deploying intelligent, autonomous agents that can reason, act, and learn on behalf of the enterprise.
This transition moves the data scientist from an analyst to an agentic architect. But the tools of the past — fragmented notebooks, siloed data systems, and complex paths to production — create friction that breaks the creative flow.
At Big Data London, we are announcing the next wave of data innovations built on an AI-native stack, designed to address these challenges. These capabilities help data scientists move beyond analysis to action by enabling them to:
Stop wasting time context-switching. We’re delivering a single, intelligent notebook environment where you can instantly use SQL, Python, and Spark together, letting you build and iterate in one place instead of fighting your tools.
Build agents that understand the real world. We’re giving you native, SQL-based access to the messy, real-time data — like live event streams and unstructured data — that your agents need to make smart, context-aware decisions.
Go from prototype to production in minutes, not weeks. We’re providing a complete ‘Build-Deploy-Connect’ toolkit to move your logic from a single notebook into a secure, production-grade fleet of autonomous agents.
Unifying the environment for data science
The greatest challenge of data science productivity is friction. Data scientists live in a state of constant, forced context-switching: writing SQL in one client, exporting data, loading it into a Python notebook, configuring a separate Spark cluster for heavy lifting, and then switching to a BI tool just to visualize results. Every switch breaks the creative “flow state” where real discovery happens. Our priority is to eliminate this friction by creating the single, intelligent environment an architect needs to engineer, build, and deploy — not just run predictive models.
Today, we are launching fundamental enhancements to Colab Enterprise notebooks in BigQuery and Vertex AI. We’ve added native SQL cells (preview), so you can now iterate on SQL queries and Python code in the same place. This lets you use SQL for data exploration and immediately pipe the results into a BigQuery DataFrame to build models in Python. Furthermore, rich interactive visualization cells (preview) automatically generate editable charts from your data to quickly assess the analysis. This integration breaks the barrier between SQL, Python, and visualization, transforming the notebook into an integrated development environment for data science tasks.
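For illustration, here is a minimal sketch of that SQL-to-Python handoff using BigQuery DataFrames; the project and table names are placeholders, not part of the announcement.

```python
# Sketch only: pull the result of a SQL exploration step into a BigQuery DataFrame
# and keep working in Python. The project and table names are hypothetical.
import bigframes.pandas as bpd

bpd.options.bigquery.project = "my-project"

df = bpd.read_gbq(
    """
    SELECT user_id, SUM(amount) AS total_spend
    FROM `my-project.sales.transactions`
    GROUP BY user_id
    """
)

# Computation stays in BigQuery; only the small preview below is brought back locally.
top_spenders = df.sort_values("total_spend", ascending=False).head(10)
print(top_spenders.to_pandas())
```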
But an integrated environment is only half the solution; it must also be intelligent. This is the power of our Data Science Agent, which acts as an “interactive partner” inside Colab. Recent enhancements to this agent mean it can now incorporate sophisticated tool usage (preview) within its detailed plans, including the use of BigQuery ML for training and inferencing, BigQuery DataFrames for analysis using Python, or large-scale Spark transformations. This means your analysis gets more advanced, your demanding workloads are more cost-effective to run, and your models get into production quicker.
In addition, we are making our Lightning Engine generally available. The Lightning Engine accelerates Spark performance more than 4x compared to open-source Spark. And Lightning Engine is ML- and AI-ready by default, seamlessly integrating with BigQuery Notebooks, Vertex AI, and VS Code. This means you can use the same accelerated Spark runtime across your entire workflow in any tool of your choice — from initial exploration in a notebook to distributed training on Vertex AI. We’re also announcing advanced support for Spark 4.0 (preview), bringing its latest innovations directly to you.
Building agents that understand the real world
Agentic architects build systems that will sense and respond to the world in real time. This requires access to data that has historically been siloed in separate, specialized systems such as live event streams and unstructured data. To address this challenge we are making real-time streams and unstructured data more accessible for data science teams.
First, to process real-time data using SQL we are announcing stateful processing for BigQuery continuous queries (preview). In the past, it was difficult to ask questions about patterns over time using just SQL on live data. This new capability changes that. It gives your SQL queries a “memory,” allowing you to ask complex, state-aware questions. For example, instead of just seeing a single transaction, you can ask, “Has this credit card’s average transaction value over the last 5 minutes suddenly spiked by 300%?” An agent can now detect this suspicious velocity pattern — which a human analyst reviewing individual alerts would miss — and proactively trigger a temporary block on the card before a major fraudulent charge goes through. This unlocks powerful new use cases, from real-time fraud detection to adaptive security agents that learn and identify new attack patterns as they happen.
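The dedicated continuous-query syntax isn’t shown here, but the state-aware question in the example can be sketched as an ordinary windowed aggregation; the dataset, table, and column names below are hypothetical.

```python
# Sketch only: flags transactions far above a card's average over the preceding
# 5 minutes. Dataset and column names are hypothetical, and a real continuous
# query would run this logic against the live stream rather than a one-off job.
from google.cloud import bigquery

client = bigquery.Client()

sql = """
SELECT
  card_id,
  amount,
  AVG(amount) OVER (
    PARTITION BY card_id
    ORDER BY UNIX_SECONDS(event_time)
    RANGE BETWEEN 300 PRECEDING AND CURRENT ROW
  ) AS avg_amount_5m
FROM `my-project.payments.transactions`
"""

for row in client.query(sql).result():
    # Flag anything more than 4x the card's recent average (a 300% spike).
    if row.avg_amount_5m and row.amount > 4 * row.avg_amount_5m:
        print(f"Possible spike on card {row.card_id}: {row.amount}")
```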
Second, we are removing the friction to build AI applications using a vector database, by helping data teams with autonomous embedding generation in BigQuery (preview) over multimodal data. Building on our BigQuery Vector Search capabilities, you no longer have to build, manage, or maintain a separate, complex data pipeline just to create and update your vector embeddings. BigQuery now takes care of this automatically as data arrives and as users search for new terms in natural language. This capability enables agents to connect user intent to enterprise data, and it’s already powering systems like the in-store product finder at Morrisons, which handles 50,000 customer searches on a busy day. Customers can use the product finder on their phones as they walk around the supermarket. By typing in the name of a product, they can immediately find which aisle a product is on and in which part of that aisle. The system uses semantic search to identify the specific product SKU, querying real-time store layout and product catalog data.
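As a rough illustration of the semantic-search idea behind such an experience (not Morrisons’ actual implementation, and without BigQuery’s managed embedding generation), a query can be matched against product embeddings like this; the catalog, project, and model name are illustrative.

```python
# Conceptual sketch: embed a shopper's query and compare it to pre-computed
# product embeddings. In the managed setup described above, BigQuery generates
# and refreshes the embeddings itself; the catalog here is made up.
import numpy as np
import vertexai
from vertexai.language_models import TextEmbeddingModel

vertexai.init(project="my-project", location="us-central1")  # placeholder project

model = TextEmbeddingModel.from_pretrained("text-embedding-004")

products = ["oat milk 1L", "free range eggs", "sourdough loaf"]
product_vecs = np.array([e.values for e in model.get_embeddings(products)])
query_vec = np.array(model.get_embeddings(["plant based milk"])[0].values)

# Cosine similarity between the query and every product; the highest score wins.
scores = product_vecs @ query_vec / (
    np.linalg.norm(product_vecs, axis=1) * np.linalg.norm(query_vec)
)
print(products[int(scores.argmax())])
```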
Trusted, production-ready multi-agent development
When an analyst delivers a report, their job is done. When an architect deploys an autonomous application or agent, their job has just begun. This shift from notebook-as-prototype to agent-as-product introduces a critical new set of challenges: How do you move your notebook logic into a scalable, secure, and production-ready fleet of agents?
To solve this, we are providing a complete “Build-Deploy-Connect” toolkit for the agent architect. First, the Agent Development Kit (ADK) provides the framework to build, test, and orchestrate your logic into a fleet of specialized, production-grade agents. This is how you move from a single-file prototype to a robust, multi-agent system. And this agentic fleet doesn’t just find problems — it acts on them. ADK allows agents to ‘close the loop’ by taking intelligent, autonomous actions, from triggering alerts to creating and populating detailed case files directly in operational systems like ServiceNow or Salesforce.
A huge challenge until now was securely connecting these agents to your enterprise data, forcing developers to build and maintain their own custom integrations. To solve this, we launched first-party BigQuery tools directly integrated within ADK or via MCP. These are Google-maintained, secure tools that allow your agent to intelligently discover datasets, get table info, and execute SQL queries, freeing your team to focus on agent logic, not foundational plumbing. In addition, your agentic fleet can now easily connect to any data platform in Google Cloud using our MCP Toolbox. Available across BigQuery, AlloyDB, Cloud SQL, and Spanner, MCP Toolbox provides a secure, universal ‘plug’ for your agent fleet, connecting them to both the data sources and the tools they need to function.
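To make this concrete, here is a minimal, hypothetical ADK sketch with a hand-rolled BigQuery query tool; the first-party BigQuery tools and MCP Toolbox described above are the Google-maintained replacement for exactly this kind of custom plumbing, and the agent name, model, and instruction are illustrative.

```python
# Minimal sketch of an ADK agent wired to a hand-rolled data tool. Names and
# model choice are illustrative; the first-party BigQuery tools / MCP Toolbox
# provide managed versions of this integration.
from google.adk.agents import Agent
from google.cloud import bigquery


def run_bigquery_sql(sql: str) -> list[dict]:
    """Runs a SQL query in BigQuery and returns the rows as dictionaries."""
    client = bigquery.Client()
    return [dict(row) for row in client.query(sql).result()]


analytics_agent = Agent(
    name="analytics_agent",
    model="gemini-2.0-flash",
    instruction=(
        "Answer questions about enterprise data. When you need data, write a "
        "SQL query and call run_bigquery_sql."
    ),
    tools=[run_bigquery_sql],
)
```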
This “Build-Deploy-Connect” toolkit also extends to the architect’s own workflow. While ADK helps agents connect to data, the architect (the human developer) needs to manage this system using a new primary interface: the command line (CLI). To eliminate the friction of switching to a UI for data tasks, we are integrating data tasks directly into the terminal with our new Gemini CLI extensions for Data Cloud (preview). Through the agentic Gemini CLI, developers can now use natural language to find datasets, analyze data, or generate forecasts — for example, you can simply state gemini bq “analyze error rates for ‘checkout-service'” — and even pipe results to local tools like Matplotlib, all without leaving your terminal.
Architecting the future
These innovations transform the impact data scientists can have within the organization. Using an AI-native stack, we are now unifying the development environment in new ways, expanding data boundaries, and enabling trusted, production-ready development.
You can now automate tasks and use agents to become an agentic architect helping your organization to sense, reason, and act with intelligence. Ready to experience this transformation? Check out our new Data Science eBook with eight practical use cases and notebooks to get you started building today.
In June, Google introduced Gemini CLI, an open-source AI agent that brings the power of Gemini directly into your terminal. And today, we’re excited to announce open-source Gemini CLI extensions for Google Data Cloud services.
Building applications and analyzing trends with services like Cloud SQL, AlloyDB and BigQuery has never been easier — all from your local development environment! Whether you’re just getting started or a seasoned developer, these extensions make common data interactions such as app development, deployment, operations, and data analytics more productive and easier. So, let’s jump right in!
Using a Data Cloud Gemini CLI extension
Before you get started, make sure you have enabled the APIs and configured the IAM permissions required to access specific services.
To retrieve the newest functionality, install the latest release of the Gemini CLI (v0.6.0), then install the extension for the service you want to use, for example alloydb, cloud-sql-postgresql, or bigquery-data-analytics.
Before starting the Gemini CLI, you’ll need to configure the extension to connect with your Google Cloud project by adding the required environment variables. The table below provides more information on the configuration required.
Extension Name | Description | Configuration
alloydb | Create resources and interact with AlloyDB for PostgreSQL databases and data. | Set the environment variables for your Google Cloud project (see above).
Now, you can start the Gemini CLI with the command gemini. You can view the installed extensions with the /extensions command, and list the MCP servers and tools included in an extension with the /mcp list command.
Using the Gemini CLI for Cloud SQL for PostgreSQL extension
The Cloud SQL for PostgreSQL extension lets you perform a number of actions. Some of the main ones are included below:
Create instance: Creates a new Cloud SQL instance for PostgreSQL (as well as MySQL or SQL Server)
List instances: Lists all Cloud SQL instances in a given project
Get instance: Retrieves information about a specific Cloud SQL instance
Create user: Creates a new user account within a specified Cloud SQL instance, supporting both standard and Cloud IAM users
Curious about how to put it in action? Like any good project, start with a solid written plan of what you are trying to do. Then, you can provide that project plan to the CLI as a series of prompts, and the agent will start provisioning the database and other resources.
After configuring the extension to connect to the new database, the agent can generate the required tables based on the approved plan. For easy testing, you can prompt the agent to add test data.
Now the agent can use the context it has to generate an API to make the data accessible.
As you can see, these extensions make it incredibly easy to start building with Google Cloud databases!
Using the BigQuery Analytics extensions
For your analytical needs, we are thrilled to give you a first look at the Gemini CLI extension for BigQuery Data Analytics. We are also excited to give access to the Conversational Analytics API through the BigQuery Conversational Analytics extension. This is the first step in our journey to bring the full power of BigQuery directly into your local coding environment, creating an integrated and unified workflow.
With this extension, you can:
Explore data: Use natural language to search for your tables.
Analyze: Ask business questions on the data and generate intelligent insights.
Dive deeper: Use conversational analytics APIs to dive deeper into the insights.
And extend: Use other tools or extensions to extend into advanced workflows like charting, reporting, code management, etc.
This initial release provides a comprehensive suite of tools to Gemini CLI:
Metadata tools: Discover and understand the BigQuery data landscape.
Query execution tool: Run any BigQuery query and get the results back, summarized to your console.
AI-powered forecasting: Leverage BigQuery’s built-in AI.FORECAST function for powerful time-series predictions directly from the command line.
Deeper data insights: The “ask_data_insights” tool provides access to a server-side BigQuery agent for richer data insights.
And more …
[Note: To use the conversational analytics extension, you need to enable additional APIs. Refer to the documentation for more information.]
Here is an example journey with analytics extensions:
Explore and analyze your data. For example:
```
> find tables related to PyPi downloads

✦ I found the following tables related to PyPi downloads:

  * file_downloads: projects/bigquery-public-data/datasets/pypi/tables/file_downloads
  * distribution_metadata: projects/bigquery-public-data/datasets/pypi/tables/distribution_metadata
```

```
> Using bigquery-public-data.pypi.file_downloads show me top 10 downloaded pypi packages this month

✦ Here are the top 10 most downloaded PyPI packages this month:

  1. boto3: 685,007,866 downloads
  2. botocore: 531,034,851 downloads
  3. urllib3: 512,611,825 downloads
  4. requests: 464,595,806 downloads
  5. typing-extensions: 459,505,780 downloads
  6. certifi: 451,929,759 downloads
  7. charset-normalizer: 428,716,731 downloads
  8. idna: 409,262,986 downloads
  9. grpcio-status: 402,535,938 downloads
  10. aiobotocore: 399,650,559 downloads
```
Run deeper insights
Use “ask_data_insights” to trigger an agent on BigQuery (via the Conversational Analytics API) to answer your questions. The server-side agent is smart enough to gather additional context about your data and offer deeper insights into your questions.
You can go further and generate charts and reports by mixing BigQuery data with your local tools. Here’s a prompt to try:
“using bigquery-public-data.pypi.file_downloads can you forecast downloads for the last four months of 2025 for package urllib3? Please plot a chart that includes actual downloads for the first 8 months, followed by the forecast for the last four months”
Get started today!
Ready to level up with Gemini CLI extensions for our Data Cloud services? Read more in the extensions documentation. Check out our templates and start building your own extensions to share with the community!
Public sector agencies are under increasing pressure to operate with greater speed and agility, yet are often hampered by decades of legacy data. Critical information, essential for meeting tight deadlines and fulfilling mandates, frequently lies buried within vast collections of unstructured documents. This challenge of transforming institutional knowledge into actionable insight is a common hurdle on the path to modernization.
The Indiana Department of Transportation (INDOT) recently faced this exact scenario. To comply with Governor Mike Braun’s Executive Order 25-13, all state agencies were given 30 days to complete a government efficiency report, mapping all statutory responsibilities to their core purpose. For INDOT, the critical information needed to complete this report was buried in a mix of editable and static documents – decades of policies, procedures, and manuals scattered across internal sites. A manual review was projected to take hundreds of hours, making the deadline nearly impossible. This tight deadline necessitated an innovative approach to data processing and report generation.
Recognizing a complex challenge as an opportunity for transformation, INDOT’s leadership envisioned an AI-powered solution. The agency chose to build its pilot program on its existing Google Cloud environment, which allowed it to deploy Gemini’s capabilities immediately. By taking this strategic approach, the team was able to turn a difficult compliance requirement into a powerful demonstration of government efficiency.
From manual analysis to an AI-powered pilot in one week
Operating in an agile week-long sprint, INDOT’s team built an innovative workflow centered on Retrieval-Augmented Generation (RAG). This technique enhances generative AI models by grounding them in specific, private data, allowing them to provide accurate, context-aware answers.
The technical workflow began with data ingestion and pre-processing. The team quickly developed Python scripts to perform “Extract, Transform, Load” (ETL) on the fly, scraping internal websites for statutes and parsing text from numerous internal files. This crucial step cleaned and structured the data for the next stage: indexing. Using Vertex AI Search, they created a robust, searchable vector index of the curated documents, which formed the definitive knowledge base for the generative model.
With the data indexed, the RAG engine in Vertex AI could efficiently retrieve the most relevant document snippets in response to a query. This contextual information was then passed to Gemini via Vertex AI. This two-step process was critical, as it ensured the model’s responses were based solely on INDOT’s official documents, not on public internet data.
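A simplified sketch of that two-step flow, with a placeholder standing in for the Vertex AI Search retrieval and an illustrative project, location, and model name (not INDOT’s actual configuration), looks roughly like this:

```python
# Illustrative sketch of the retrieve-then-generate flow. The retrieval step is a
# placeholder for Vertex AI Search, and all names below are examples only.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")


def retrieve_snippets(question: str) -> list[str]:
    """Placeholder: the real system queries the Vertex AI Search index built from
    the agency's policies, procedures, and statutes."""
    return ["<relevant policy excerpt>", "<relevant statute excerpt>"]


def answer_from_documents(question: str) -> str:
    context = "\n\n".join(retrieve_snippets(question))
    prompt = (
        "Answer using ONLY the excerpts below from official agency documents.\n\n"
        f"Excerpts:\n{context}\n\nQuestion: {question}"
    )
    model = GenerativeModel("gemini-1.5-pro")
    return model.generate_content(prompt).text
```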
Setting a new standard for government efficiency
Within an intensive, week-long effort, the team delivered a functioning pilot that generated draft reports across nine INDOT divisions with an impressive 98% fidelity – a measure of how accurately the new reports reflected the information in the original source documents. This innovative approach saved an estimated 360 hours of manual effort, freeing agency staff from tedious data collection to focus on the high-value work of refining and validating the reports. The solution enabled INDOT to become the largest Indiana state agency to submit its government efficiency report on time.
“The government efficiency report was a novel experience for many on our executive team, demonstrating firsthand the transformative potential of large language models like Gemini. This project didn’t just help us meet a critical deadline; it paved the way for broader executive support of AI initiatives that will ultimately enhance our ability to serve Indiana’s transportation needs.”
Alison Grand
Deputy Commissioner and Chief Legal Counsel, Indiana Department of Transportation
The AI-generated report framework was so effective that it became the official template for 60 other state agencies, powerfully demonstrating a responsible use of AI and building significant trust in INDOT as a leader in statewide policy. By building a scalable, secure RAG system on Google Cloud, INDOT not only met its tight deadline but also created a reusable model for future innovation, accelerating its mission to better serve the people of Indiana.
Join us at Google Public Sector Summit
To see Google’s latest AI innovations in action, and learn more about how Google Cloud technology is empowering state and local government agencies, register to attend the Google Public Sector Summit taking place on October 29 in Washington, D.C.
Editor’s note: Today’s post is by Syed Mohammad Mujeeb, CIO, and Arsalan Mazhar, Head of Infrastructure, for JS Bank, a prominent and rapidly growing midsize commercial bank in Pakistan with a strong national presence of over 293 branches. JS Bank, always at the forefront of technology, deployed a Google stack to modernize operations while maintaining security & compliance.
Snapshot:
JS Bank’s IT department, strained across 293 branches, was hindered by endpoint instability, a complex security stack, and a lack of device standardization. This reactive environment limited their capacity for innovation.
Through a strategic migration to a unified Google ecosystem—including ChromeOS, Google Workspace, and Google Cloud—the bank transformed its operations. The deployment of 1,500 Chromebooks resulted in a more reliable, secure, and manageable IT infrastructure. This shift cut device management time by 40% and halved daily support tickets, empowering the IT team to pivot from routine maintenance to strategic initiatives like digitization and AI integration.
Reduced IT burden: device management time cut by 40%
Daily support tickets were halved, freeing up IT time for strategic, value-added projects
Nearly 90% endpoint standardization, creating a manageable and efficient IT architecture
A simplified, powerful security posture with the built-in protection of ChromeOS and Google Workspace
At JS Bank, we pride ourselves on being technology pioneers, always bringing new technology into banking. Our slogan, “Barhna Hai Aagey,” means we are always moving onward and upward. But a few years ago, our internal IT infrastructure was holding us back. We researched and evaluated different solutions, and found the combination of ChromeOS and Google Workspace a perfect fit for today’s technology landscape, which is rife with cyber threats. When we shifted to a unified Google stack, we paved the way for a future driven by AI, innovation, and operational excellence.
Before our transformation, our legacy solution was functional, but it was a constant struggle. Our IT team was spread thin across our 293 branches, dealing with a cumbersome setup that required numerous security tools, including antivirus and anti-malware, all layered on top of each other. Endpoints crashed frequently, and with a mixture of older devices and some devices running Ubuntu, we lacked the standardization needed for true efficiency and security. It was a reactive environment, and our team was spending too much time on basic fixes rather than driving innovation.
We decided to make a strategic change to align with our bank’s core mission of digitization, and that meant finding a partner with an end-to-end solution. We chose Google because we saw the value in their integrated ecosystem and anticipated the future convergence of public and private clouds. We deployed 1,500 Chromeboxes across branches and fully transitioned to Google Workspace.
Today, we have achieved nearly 90% standardization across our endpoints with Chromebooks and Chromeboxes, all deeply integrated with Google Workspace. This shift has led to significant improvements in security, IT management, and employee productivity. The built-in security features of the Google ecosystem provide peace of mind, especially during periods of heightened cybersecurity threats, as we feel that Google will inherently protect us from cyberattacks. This has simplified security protocols in branches, eliminating the need for multiple antivirus and anti-malware tools, giving our security team incredible peace of mind. Moreover, the lightweight nature of the Google solutions ensures applications are available from anywhere, anytime, and deployments in branches are simplified.
To strengthen security across all corporate devices, we made Chrome our required browser. This provides foundational protections like Safe Browsing to block malicious sites, browser reporting, and password reuse alerts. For 1,500 users, we adopted Chrome Enterprise Premium. This provides features like Zero-Trust enterprise security, centralized management, data loss prevention (DLP) to protect against accidental data loss, secure access to applications with context-aware access restrictions, and scanning of high-risk files.
With Google, our IT architecture is now manageable. The team’s focus has fundamentally shifted from putting out fires to supporting our customers and building value. We’ve seen a change in our own employees, too; the teams who once managed our legacy systems are now eager to work within the Google ecosystem. From an IT perspective, the results are remarkable: the team required to manage the ChromeOS environment has shrunk to 40%. Daily support tickets have been halved, freeing IT staff from hardware troubleshooting to focus on more strategic application support, enhancing their job satisfaction and career development. Our IT staff now enjoy less taxing weekends due to reduced work hours and a lighter operational burden.
Our “One Platform” vision comes to life
We are simplifying our IT architecture using Google’s ecosystem to achieve our “One Platform” vision. As a Google shop, we’ve deployed Chromebooks enterprise-wide and unified user access with a “One Window” application and single sign-on. Our “One Data” platform uses an Elasticsearch data lake on Google Cloud, now being connected to Google’s LLMs. This integrated platform provides our complete AI toolkit—from Gemini and NotebookLM to upcoming Document and Vision AI. By exploring Vertex AI, we are on track to become the region’s most technologically advanced bank by 2026.
Our journey involved significant internal change, but by trusting the process and our partners, we have built a foundation that is not only simpler and more secure but is also ready for the next wave of innovation. We are truly living our mission of moving onward and upward.
As a Python library for accelerator-oriented array computation and program transformation, JAX is widely recognized for its power in training large-scale AI models. But its core design as a system for composable function transformations unlocks its potential in a much broader scientific landscape. Following our recent post on solving high-order partial differential equations, or PDEs, we’re excited to highlight another frontier where JAX is making a significant impact: AI-driven protein engineering.
I recently spoke with April Schleck and Nick Boyd, two co-founders of Escalante, a startup using AI to train models that predict the impact of drugs on cellular protein expression levels. Their story is a powerful illustration of how JAX’s fundamental design choices — especially its functional and composable nature — are enabling researchers to tackle multi-faceted scientific challenges in ways that are difficult to achieve with other frameworks.
A new approach to protein design
April and Nick explained that Escalante’s long-term vision is to train machine learning (ML) models that can design drugs from the ground up. Unlike fields like natural language processing, which benefit from vast amounts of public data, biology currently lacks the specific datasets needed to train models that truly understand cellular systems. Thus, their immediate focus is to solve this data problem by using current AI tools to build new kinds of lab assays that can generate these massive, relevant biological datasets.
This short-term mission puts them squarely in the field of protein engineering, which they described as a complex, multi-objective optimization problem. When designing a new protein, they aren’t just optimizing one property: the protein needs to bind to a specific target, while also being soluble, thermostable, and expressible in bacteria. Each of these properties is predicted by a different ML model (see figure below), ranging from complex architectures like AlphaFold 2 (implemented in JAX) to simpler, custom-trained models. Their core challenge is to combine all these different objectives into a single optimization loop.
This is where, as April put it, “JAX became a game-changer for us.” She noted that while combining many AI models might be theoretically possible in other frameworks, JAX’s functional nature makes it incredibly natural to integrate a dozen different ones into a single loss function (see figure below).
Easily combine multiple objectives represented by different loss terms and models
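A highly simplified sketch of this pattern, with toy stand-ins for the real models rather than Escalante’s actual code, might look like the following; the function names are placeholders for the pre-trained models described above.

```python
# Toy stand-ins for the pre-trained models (AlphaFold/ESM/Boltz-1/ProteinMPNN).
# Each takes a relaxed sequence representation and returns a differentiable score.
import jax
import jax.numpy as jnp

def af_loss(seq):
    return jnp.sum(seq ** 2)

def esm_pseudo_log_likelihood(seq):
    return -jnp.sum(jnp.abs(seq))

def boltz1_fold(seq):
    return jnp.tanh(seq)  # pretend "structure"

def proteinmpnn_log_likelihood(structure, seq):
    return -jnp.sum((structure - seq) ** 2)

def total_loss(seq):
    # Some terms are combined linearly...
    loss = af_loss(seq) - 0.5 * esm_pseudo_log_likelihood(seq)
    # ...and some are composed serially: fold the sequence, then score the fold.
    structure = boltz1_fold(seq)
    loss -= proteinmpnn_log_likelihood(structure, seq)
    return loss

# The whole graph of models is differentiable and compiles as one function.
loss_and_grad = jax.jit(jax.value_and_grad(total_loss))
```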
Nick explained that there are at least two different ways models are combined in an objective like this — some loss terms are combined linearly (e.g., the AF loss plus the ESM pseudo log likelihood loss), and some terms compose models serially (e.g., in the first Boltz-1 term, they first fold the sequence with Boltz-1 and then compute the sequence likelihood after inverse folding with another model, ProteinMPNN).
To make this work, they embraced the JAX ecosystem, even translating models from PyTorch themselves — a prime example being their JAX translation of the Boltz-2 structure prediction model.
This approach gives what April called an “expressive language for protein design,” where models can be composed, added, and transformed to define a final objective. April said that the most incredible part is that this entire, complex graph of models “can be wrapped in a single jax.jit call that gives great performance” — something they found very difficult to do in other frameworks.
Instead of a typical training run that optimizes a model’s weights, their workflow inverts the process to optimize the input itself, using a collection of fixed, pre-trained neural networks as a complex, multi-objective loss function. The approach is mechanically analogous to Google’s DeepDream. Just as DeepDream takes a fixed, pre-trained image classifier and uses gradient ascent to iteratively modify an input image’s pixels to maximize a chosen layer’s activation, Escalante’s method starts with a random protein sequence. This sequence is fed through a committee of “expert” models — each one a pre-trained scorer for a different desirable property, like binding affinity or stability. The outputs from all the models are combined into a single, differentiable objective functional. They then calculate a gradient of this final score with respect to the input sequence via backpropagation. An optimizer then uses this gradient to update the sequence, nudging it in a direction that better satisfies the collective requirements of all the models. This cycle repeats, evolving the random initial input into a novel, optimized protein sequence that the entire ensemble of models “believes” is ideal.
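Mechanically, the loop can be sketched with Optax as below; the toy objective and the 128x20 shape are arbitrary stand-ins for the real committee of models, not Escalante’s setup.

```python
# Sketch of the input-optimization loop: model weights stay fixed and the
# gradient updates the sequence representation itself.
import jax
import jax.numpy as jnp
import optax

def total_loss(seq):
    return jnp.sum(seq ** 2)  # stand-in for the combined multi-model objective

key = jax.random.PRNGKey(0)
seq = jax.random.normal(key, (128, 20))  # 128 positions x 20 amino acid logits

optimizer = optax.adam(learning_rate=1e-2)
opt_state = optimizer.init(seq)

@jax.jit
def step(seq, opt_state):
    loss, grads = jax.value_and_grad(total_loss)(seq)
    updates, opt_state = optimizer.update(grads, opt_state)
    seq = optax.apply_updates(seq, updates)
    return seq, opt_state, loss

# Each iteration nudges the sequence toward a design the models score as better.
for _ in range(200):
    seq, opt_state, loss = step(seq, opt_state)
```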
Nick said that the choice of JAX was critical for this process. Its ability to compile and automatically differentiate complex code makes it ideal for optimizing the sophisticated loss functions at the heart of Mosaic, Escalante’s library of tools for their protein design work. Furthermore, the framework’s native integration with TPU hardware via the XLA compiler allowed them to easily scale these workloads.
Escalante is sampling many potential protein designs for solving a problem (by optimizing the loss function). Each sampling job might generate 1K – 50K potential designs, which are then ranked and filtered. By the end of the process, they test only about 10 designs in the wet lab. This has led them to adopt a unique infrastructure pattern. Using Google Kubernetes Engine (GKE), they instantly spin up 2,000 to 4,000 spot TPUs, run their optimization jobs for about half an hour, and then shut them all down.
Nick also shared the compelling economics driving this choice. Given current spot pricing, adopting Cloud TPU v6e (Trillium) over an H100 GPU translated to a gain of 3.65x in performance per dollar for their large-scale jobs. He stressed that this cost-effectiveness is critical for their long-term goal of designing protein binders against the entire human proteome, a task that requires immense computational scale.
To build their system, they rely on key libraries within the JAX ecosystem like Equinox and Optax. Nick prefers Equinox because it feels like “vanilla JAX,” calling its concept of representing a model as a simple PyTree “beautiful and easy to reason about.” Optax, meanwhile, gives them the flexibility to easily swap in different optimization algorithms for their design loops.
They emphasized that this entire stack — JAX’s functional core, its powerful ecosystem libraries, and the scalable TPU hardware — is what makes their research possible.
We are excited to see community contributions like Escalante’s Mosaic library, which contains the tools for their protein design work and is now available on GitHub. It’s a fantastic addition to the landscape of JAX-native scientific tools.
Stories like this highlight a growing trend: JAX is much more than a framework for deep learning. Its powerful system of program transformations, like grad and jit, makes it a foundational library for the paradigm of differentiable programming, empowering a new generation of scientific discovery. The JAX team at Google is committed to supporting and growing this vibrant ecosystem, and that starts with hearing directly from you.
Share your story: Are you using JAX to tackle a challenging problem?
Help guide our roadmap: Are there new features or capabilities that would unlock your next breakthrough?
Your feature requests are essential for guiding the evolution of JAX. Please reach out to the team to share your work or discuss what you need from JAX via GitHub.
Our sincere thanks to April and Nick for sharing their insightful journey with us. We’re excited to see how they and other researchers continue to leverage JAX to solve the world’s most complex scientific problems.
At Deutsche Bank Research, the core mission of our analysts is delivering original, independent economic and financial analysis. However, creating research reports and notes relies heavily on a foundation of painstaking manual work. Or at least that was the case until generative AI came along.
Historically, analysts would sift through and gather data from financial statements, regulatory filings, and industry reports. Then, the true challenge begins — synthesizing this vast amount of information to uncover insights and findings. To do this, they have to build financial models, identify patterns and trends, and draw connections between diverse sources, past research, and the broader global context.
As analysts need to work as quickly as possible to bring valuable insights to market, this time-consuming process can limit the depth of analysis and the range of topics they can cover.
Our goal was to enhance the research analyst experience and reduce the reliance on manual processes and outsourcing. We created DB Lumina — an AI-powered research agent that helps automate data analysis, streamline workflows, and deliver more accurate and timely insights – all while maintaining the stringent data privacy requirements for the highly regulated financial sector.
“The adoption of the DB Lumina digital assistant by hundreds of research analysts is the culmination of more than 12 months of intense collaboration between dbResearch, our internal development team, and many others. This is just the start of our journey, and we are looking forward to building on this foundation as we continue to push the boundaries of how we responsibly use AI in research production to unlock exciting new innovations across our expansive coverage areas.” – Pam Finelli, Global COO for Investment Research at Deutsche Bank
DB Lumina has three key features that transform the research experience for analysts and enhance productivity through advanced technologies.
1. Gen AI-powered chat
DB Lumina’s core conversational interface enables analysts to interact with Google’s state-of-the-art AI foundation models, including the multimodal Gemini models. They can ask questions, brainstorm ideas, refine writing, and even generate content in real time. Additionally, the chat capability supports uploading and querying documents conversationally, leveraging prior chat history to revisit and continue previous sessions. DB Lumina can help with tasks like summarization, proofreading, translation, and content drafting with precision and speed. In addition, we implemented guardrailing techniques to ensure the generation of compliant and reliable outputs.
2. Prompt templates
Prompt Templates offer pre-configured instructions tailored for document processing with consistent, high-quality outcomes. These templates help analysts summarize large documents, extract key data points, and create reusable workflows for repetitive tasks. They can be customized for specific roles or business needs, and standardized across teams. Analysts can also save and share templates, ensuring more streamlined operations and enhanced collaboration. This functionality is made possible by the long context window of Google’s Gemini models combined with advanced prompting techniques, which also provide citations for verification.
3. Knowledge
DB Lumina integrates a Retrieval-Augmented Generation (RAG) architecture that grounds responses in enterprise knowledge sources, such as internal research, external unstructured data (such as SEC filings), and other document repositories. The agent enhances transparency and accuracy by providing inline citations and source viewers for fact-checking. It also implements controlled access to confidential data with audit logging and explainability features, ensuring secure and trustworthy operations. Using advanced RAG architecture, supported by Google Cloud technologies, enables us to bring generative capabilities to enterprise knowledge resources to give analysts access to the latest, most relevant information when creating research reports and notes.
DB Lumina architecture
DB Lumina was designed to enhance Deutsche Bank Research’s productivity by enabling document ingestion, content summarization, Q&A, and editing.
The architecture is built on Google Cloud and leverages a range of its services, described below.
All of DB Lumina’s AI capabilities are implemented with guardrails to ensure safe and compliant interactions. We also handle logging and monitoring with Google Cloud’s Observability suite, with prompt interactions stored in Cloud Storage and queried through BigQuery. To manage authentication, we use Identity as a Service integrated with Azure AD, and centralize authorization through dbEntitlements.
RAG and document ingestion
When DB Lumina processes and indexes documents, it splits them into chunks and creates embeddings using APIs like Gemini Embeddings API. It then stores these embeddings in a vector database like Vertex AI Vector Search or the pgvector extension on Cloud SQL. Raw text chunks are stored separately, for example, in Datastore or Cloud Storage.
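As an illustration of that ingestion path (not DB Lumina’s actual code), chunking and embedding might look like the sketch below; the chunk size, model name, source file, and project details are arbitrary choices.

```python
# Sketch only: split a document into overlapping chunks, embed each chunk, and
# collect (text, vector) pairs for the vector store. All names are illustrative.
import vertexai
from vertexai.language_models import TextEmbeddingModel

vertexai.init(project="my-project", location="europe-west3")

def chunk(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    return [text[i:i + size] for i in range(0, len(text), size - overlap)]

model = TextEmbeddingModel.from_pretrained("text-embedding-004")

document = open("research_note.txt").read()
chunks = chunk(document)

# Batching and rate limits are omitted; each embedding is a list of floats that
# would be written to Vertex AI Vector Search or a pgvector table on Cloud SQL,
# with the raw chunk text stored alongside for retrieval.
embeddings = [e.values for e in model.get_embeddings(chunks)]
rows = list(zip(chunks, embeddings))
```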
[Diagram: overview of the agent, showing the typical RAG and ingestion patterns]
When an analyst submits a query, the system routes it through a query engine. A Python application leverages an LLM API (Gemini 2.0 and 2.5) and retrieves relevant document snippets based on the query, providing context that the model then uses to generate a relevant response. We experimented with different retrievers, including one using the pgvector extension on Cloud SQL for PostgreSQL and one based on Vertex AI Search.
User interface
Using sliders in DB Lumina’s interface, users can easily adjust various parameters for summarization, including verbosity, data density, factuality, structure, reader perspective, flow, and individuality. The interface also includes functionality for providing feedback on summaries.
An evaluation framework for gen AI
Evaluating gen AI applications and agents like DB Lumina requires a custom framework due to the complexity and variability of model outputs. Traditional metrics and generic benchmarks often fail to capture the needs for gen AI features, the nuanced expectations of domain-specific users, and the operational constraints of enterprise environments. This necessitates a new set of gen AI metrics to accurately measure performance.
The DB Lumina evaluation framework employs a rich and extensible set of both industry-standard and custom-developed metrics, which are mapped to defined categories and documented in a central metric dictionary to ensure consistency across teams and features. Standard metrics like accuracy, completeness, and latency are foundational, but they are augmented with custom metrics, such as citation precision and recall, false rejection rates, and verbosity control — each tailored to the specific demands and regulatory requirements of financial research and document-grounded generation. Popular frameworks like Ragas also provide a solid foundation for assessing how well our RAG system grounds its responses in retrieved documents and avoids hallucinations.
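For instance, a custom metric pair like citation precision and recall can be reduced to a simple set comparison; the sketch below is schematic, not the framework’s actual implementation.

```python
# Schematic sketch: compare the sources a response cited against the sources it
# should have cited. The real framework works on richer structures than ID sets.
def citation_precision_recall(cited: set[str], relevant: set[str]) -> tuple[float, float]:
    true_citations = cited & relevant
    precision = len(true_citations) / len(cited) if cited else 0.0
    recall = len(true_citations) / len(relevant) if relevant else 0.0
    return precision, recall

# Example: the model cited doc1 and doc3, but doc2 and doc3 were the right sources.
print(citation_precision_recall({"doc1", "doc3"}, {"doc2", "doc3"}))  # (0.5, 0.5)
```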
In addition, test datasets are carefully curated to reflect a wide range of real-world scenarios, edge cases, and potential biases across DB Lumina’s core features like chat, document Q&A, templates, and RAG-based knowledge retrieval. These datasets are version-controlled and regularly updated to maintain relevance as the tool evolves. Their purpose is to provide a stable benchmark for evaluating model behavior under controlled conditions, enabling consistent comparisons across optimization cycles.
Evaluation is both quantitative and qualitative, combining automated scoring with human review for aspects like tone, structure, and content fidelity. Importantly, the framework ensures each feature is assessed for correctness, usability, efficiency, and compliance while enabling the rapid feedback and robust risk management needed to support iterative optimization and ongoing performance monitoring. We compare current metric outputs against historical baselines, leveraging stable test sets, Git hash tracking, and automated metric pipelines to support proactive interventions to ensure that performance deviations are caught early and addressed before they impact users or compliance standards.
This layered approach ensures that DB Lumina is not only accurate and efficient but also aligned with Deutsche Bank’s internal standards, achieving a balanced and rigorous evaluation strategy that supports both innovation and accountability.
Bringing new benefits to the business
We developed an initial pilot for DB Lumina with Google Cloud Consulting, creating a simple prototype early in the use case development that used only embeddings without prompts. Though it was surpassed by later versions, this pilot informed the subsequent development of DB Lumina’s RAG architecture.
The project then transitioned through our development and application testing environments to production, going live in September 2024. Currently, DB Lumina is already in the hands of around 5,000 users across Deutsche Bank Research, specifically in divisions like Investment Bank Origination & Advisory and Fixed Income & Currencies. We plan to roll it out to more than 10,000 users across corporate banking and other functions by the end of the year.
DB Lumina is expected to deliver significant business benefits for Deutsche Bank:
Time savings: Analysts reported significant time savings, saving 30 to 45 minutes on preparing earnings note templates and up to two hours when writing research reports and roadshow updates.
Increased analysis depth: One analyst increased the analysis in an earnings report by 50%, adding additional sections by region and activity, as well as a summary section for forecast changes. This was achieved through summarization of earnings releases and investor transcripts and subsequent analysis through conversational prompts.
New analysis opportunities: DB Lumina has created new opportunities for teams to analyze new topics. For example, the U.S. and European Economics teams use DB Lumina to score central bank communications to assess hawkishness and dovishness over time. Another analyst was able to analyze and compare budget speeches from eight different ministries, tallying up keywords related to capacity constraints and growth orientation to identify shifts in priorities.
Increased accuracy: Analysts have also started using DB Lumina as part of their editing process. One supervisory analyst noted that since the rollout, there has been a noted improvement in the editorial and grammatical accuracy across analyst notes, especially from non-native English speakers.
Building the future of gen AI and RAG in finance
We’ve seen the power of RAG transform how financial institutions interact with their data. DB Lumina has proved the value of combining retrieval, gen AI, and conversational AI, but this is just the start of our journey. We believe the future lies in embracing and refining the “agentic” capabilities that are inherent in our architecture. We envision building and orchestrating a system where various components act as agents — all working together to provide intelligent and informed responses to complex financial inquiries.
To support our vision moving forward, we plan to deepen agent specialization within our RAG framework, building agents designed to handle specific types of queries or tasks across compliance, investment strategies, and risk assessment. We also want to incorporate the ReAct (Reasoning and Acting) paradigm into our agents’ decision-making process to enable them to not only retrieve information but also actively reason, plan actions, and refine their searches to provide more accurate and nuanced answers.
In addition, we’ll be actively exploring and implementing more of the tools and services available within Vertex AI to further enhance our AI capabilities. This includes exploring other models for specific tasks or to achieve different performance characteristics, optimizing our vector search infrastructure, and utilizing AI pipelines for greater efficiency and scalability across our RAG system.
The ultimate goal is to empower DB Lumina to handle increasingly complex and multi-faceted queries through improved context understanding, ensuring it can accurately interpret context like previous interactions and underlying financial concepts. This includes moving beyond simple question answering to providing analysis and recommendations based on retrieved information. To enhance DB Lumina’s ability to provide real-time information and address queries requiring up-to-date external data, we are planning to integrate a feature for grounding responses with internet-based information.
By focusing on these areas, we aim to transform DB Lumina from a helpful information retriever into a powerful AI agent capable of tackling even the most challenging financial inquiries. This will unlock new opportunities for improved customer service, enhanced decision-making, and greater operational efficiency for financial institutions. The future of RAG and gen AI in finance is bright, and we’re excited to be at the forefront of this transformative technology.
Today, we are excited to announce the 2025 DORA Report: State of AI-assisted Software Development. The report draws on insights from over 100 hours of qualitative data and survey responses from nearly 5,000 technology professionals around the world.
The report reveals a key insight: AI doesn’t fix a team; it amplifies what’s already there. Strong teams use AI to become even better and more efficient. Struggling teams will find that AI only highlights and intensifies their existing problems. The greatest return comes not from the AI tools themselves, but from a strategic focus on the quality of internal platforms, the clarity of workflows, and the alignment of teams.
AI, the great amplifier
As we established in the 2024 report, as well as in the special report published this year, “Impact of Generative AI in Software Development”, organizations are continuing to adopt AI heavily and receive substantial benefits across important outcomes. There is also evidence that we are learning to better integrate these tools into our workflows. Unlike last year, we observe a positive relationship between AI adoption and both software delivery throughput and product performance. It appears that people, teams, and tools are learning where, when, and how AI is most useful. However, AI adoption does continue to have a negative relationship with software delivery stability.
This confirms our central theory – AI accelerates software development, but that acceleration can expose weaknesses downstream. Without robust control systems, like strong automated testing, mature version control practices, and fast feedback loops, an increase in change volume leads to instability. Teams working in loosely coupled architectures with fast feedback loops see gains, while those constrained by tightly coupled systems and slow processes see little or no benefit.
Key findings from the 2025 report
Beyond this central theme, this year’s research highlighted the following about modern software development:
AI adoption is near-universal: 90% of survey respondents report using AI at work. More than 80% believe it has increased their productivity. However, skepticism remains as 30% report little or no trust in the code generated by AI, a slightly lower percentage than last year but a key trend to note.
User-centricity is a prerequisite for AI success: AI becomes most useful when it’s pointed at a clear problem, and a user-centric focus provides that essential direction. Our data shows this focus amplifies AI’s positive influence on team performance.
Platform engineering is the foundation: Our data shows that 90% of organizations have adopted at least one platform and there is a direct correlation between a high quality internal platform and an organization’s ability to unlock the value of AI, making it an essential foundation for success.
The seven team archetypes
Simple software delivery metrics alone aren’t sufficient. They tell you what is happening but not why it’s happening. To connect performance data to experience, we conducted a cluster analysis that reveals seven common team profiles or archetypes, each with a unique interplay of performance, stability, and well-being. This model provides leaders with a way to diagnose team health and apply the right interventions.
The ‘Foundational challenges’ group are trapped in survival mode and face significant gaps in their processes and environment, leading to low performance, high system instability, and high levels of burnout and friction. The ‘Harmonious high achievers’, by contrast, excel across multiple areas, showing positive metrics for team well-being, product outcomes, and software delivery.
Read more details of each archetype in the “Understanding your software delivery performance: A look at seven team profiles” chapter of the report.
Unlocking the value of AI with the ‘DORA AI Capabilities Model’
This year, we went beyond identifying AI’s impact to investigating the conditions in which AI-assisted technology professionals realize the best outcomes. The value of AI is unlocked not by the tools themselves, but by the surrounding technical practices and cultural environment.
Our research identified seven capabilities that are shown to magnify the positive impact of AI in organizations.
Where leaders should get started
One of the key insights derived from the research this year is that the value of AI will be unlocked by reimagining the system of work it inhabits. Technology leaders should treat AI adoption as an organizational transformation.
Here’s where we suggest you begin:
Clarify and socialize your AI policies
Connect AI to your internal context
Prioritize foundational practices
Fortify your safety nets
Invest in your internal platform
Focus on your end-users
The DORA research program is committed to serving as a compass for teams and organizations as we navigate this important and transformative period with AI. We hope the new team profiles and the DORA AI Capabilities Model provide a clear roadmap for you to move beyond simply adopting AI to unlocking its value by investing in teams and people. We look forward to learning how you put these insights into practice.
Artificial intelligence is rapidly transforming software development. But simply adopting AI tools isn’t a guarantee of success. Across the industry, tech leaders and developers are asking the same critical questions: How do we move from just using AI to truly succeeding with it? How do we ensure our investment in AI delivers better, faster, and more reliable software?
The DORA research team has developed the inaugural DORA AI Capabilities Model to provide data-backed guidance for organizations grappling with these questions. This is not just another report on AI adoption trends; it is a guide to the specific technical and cultural practices that amplify the benefits of AI.
The DORA AI Capabilities Model: 7 levers of success
We developed the DORA AI Capabilities Model through a three-phase process. First, we identified and prioritized a wide range of candidate capabilities based on 78 in-depth interviews, existing literature, and perspectives from leading subject-matter experts. Second, we developed and validated survey questions to ensure they were clear, reliable, and measured each capability accurately. Lastly, we evaluated the impact of a subset of these candidates using the rigorous methodology of designing and analyzing our annual survey, which reached almost 5,000 respondents. The analysis identified seven capabilities that substantially amplify or unlock the benefits of AI:
Clear and communicated AI stance: Your organization’s position on AI-assisted tools must be clear and well-communicated. This includes clarity on expectations for AI use, support for experimentation, and which tools are permitted. Our research indicates that a clear AI stance amplifies AI’s positive impact on individual effectiveness and organizational performance, and can reduce friction for employees. Importantly, this capability does not measure the specific content of AI use policies, meaning organizations can achieve this capability regardless of their unique stance—as long as that stance is clear and communicated.
Healthy data ecosystems: The quality of your internal data is critical to AI success. A healthy data ecosystem, characterized by high-quality, easily accessible, and unified internal data, substantially amplifies the positive influence of AI adoption on organizational performance.
AI-accessible internal data: Connecting AI tools to internal data sources boosts their impact on individual effectiveness and code quality. Providing AI with company-specific context allows it to move beyond a general-purpose assistant into a highly specialized and valuable tool for your developers.
Strong version control practices: With the increased volume and velocity of code generation from AI, strong version control practices are more crucial than ever. Our research shows a powerful connection between mature version control habits and AI adoption. Specifically, frequent commits amplify AI’s positive influence on individual effectiveness, while the frequent use of rollback features boosts the performance of AI-assisted teams.
Working in small batches: Working in small batches, a long-standing DORA principle, is especially powerful in an AI-assisted environment. This practice amplifies the positive influence of AI on product performance and reduces friction for development teams.
User-centric focus: A deep focus on the end-user’s experience is paramount for teams utilizing AI. Our findings show that a user-centric focus amplifies the positive influence of AI on team performance. Importantly, we also found that in the absence of a user-centric focus, AI adoption can have a negative impact on team performance. When users are at the center of strategy, AI can help propel teams in the right direction. But, when users aren’t the focus, AI-assisted development teams may just be moving quickly in the wrong direction.
Quality internal platforms: Quality internal platforms provide the shared capabilities needed to scale the benefits of AI across an organization. In organizations with quality internal platforms, AI’s positive influence on organizational performance is amplified.
Putting the DORA AI Capabilities Model into practice
To successfully leverage AI in software development, it’s not enough to simply adopt new tools. Organizations must foster the right technical and cultural environment for AI-assisted developers to thrive. Based on our seven inaugural DORA AI Capabilities, we recommend that organizations seeking to maximize the benefits of their AI adoption do the following:
Clarify and socialize your AI policies: Ambiguity about what is acceptable stifles adoption and creates risk. Establish and clearly communicate your policy on permitted AI tools and usage to build developer trust and provide the psychological safety needed for effective experimentation.
Treat your data as a strategic asset: The benefits of AI on organizational performance are significantly amplified by a healthy data ecosystem. Invest in the quality, accessibility, and unification of your internal data sources.
Connect AI to your internal context: Move beyond generic AI assistance by investing the engineering effort to give your AI tools secure access to internal documentation, codebases, and other data sources. This provides the necessary company-specific context for maximal effectiveness.
Double-down on known best practices, like working in manageable increments: Enforce the discipline of working in small batches to improve product performance and reduce friction for AI-assisted teams.
Prioritize user-centricity: AI-assisted development tools can help developers produce, debug, and review code more quickly. But, if the core product strategy doesn’t center the needs of the end-user, then more code won’t mean more value to the organization. Explicitly centering user needs is a North Star for orienting AI-assisted teams toward the realization of a shared goal.
Embrace and fortify your safety nets: As AI increases the velocity of changes, your version control system becomes a critical safety net. Encourage teams to become highly proficient in using rollback and revert features.
Invest in your internal platform: A quality internal platform provides the necessary guardrails and shared capabilities that allow the benefits of AI to scale effectively and securely across your organization.
DORA’s research has long held that even the best tools and teams can’t succeed without the right organizational conditions. The findings of our inaugural DORA AI Capabilities Model are a reminder of this fact and suggest that successful AI-assisted development isn’t just a purchasing decision; it’s a decision to cultivate the conditions where AI-assisted developers thrive. Investing in these seven capabilities is an important step toward creating an environment where AI-assisted software development succeeds, leading to enhanced outcomes for your developers, your products, and your entire organization.
To explore the DORA AI Capabilities Model in more detail and to access our full 2025 DORA State of AI-Assisted Software Development, please visit the DORA website.
Getting ahead — and staying ahead — of the demand for AI skills isn’t just key for those looking for a new role. Research shows that proving your skills through credentials drives promotions, salary increases, leadership opportunities and more. And 8 in 10 Google Cloud learners feel our training helps them stay ahead in the age of AI.1 This is why we are so focused on providing new AI training content, ensuring you have the tools to keep up in this ever-evolving space.
That’s why I’m thrilled to announce a new suite of Google Cloud AI training courses. These courses are designed with intermediate and advanced technical learners in mind for roles such as Cloud Infrastructure Engineers, Cloud Architects, AI Engineers and MLOps Engineers, AI Developers and Data Scientists. Whether you’re looking to build and manage powerful AI infrastructure, master the art of fine-tuning generative AI models, leverage serverless AI inference, or secure your AI deployments, we’ve got you covered.
For cloud infrastructure engineers, cloud architects, AI engineers and MLOps engineers:
AI infrastructure mini courses are your guide to designing, deploying, and managing the high-performance infrastructure that powers modern AI. You’ll gain a deep understanding of Google’s TPU and GPU platforms, and learn to use Google Compute Engine (GCE) and Google Kubernetes Engine (GKE) as a robust foundation for any AI workload you can imagine.
For machine learning engineers, data scientists and AI developers:
Build AI Agents with Databases on Google Cloud teaches you how to securely connect AI agents to your existing enterprise databases. You’ll learn to craft agents that perform intelligent querying and semantic search, design and implement advanced multi-step workflows, and deploy and operationalize these powerful AI applications. This course is essential for building robust and reliable AI agents that can leverage your most critical data.
Supervised fine-tuning for Gemini educates you on how to take Google’s powerful models and make them your own by customizing them for your specific tasks, enhancing their quality and efficiency so they deliver precisely what you and your users need.
Cloud Run for AI Inference teaches you how to deploy those innovations with the speed and scale of serverless AI. You’ll learn how to handle demanding AI workloads, including lightweight LLMs, and leverage GPU acceleration, ensuring your creations reach your audience efficiently and reliably.
For security engineers and security analysts:
Model Armor: Securing AI Deployments equips you with the knowledge to protect your generative AI applications from critical risks like data leakage and prompt injection. It’s the essential step to ensuring your innovations can be leveraged with confidence.
For individual developers, business analysts, and other non-technical users:
Develop AI-Powered Prototypes in Google AI Studio shows you how to use Google AI Studio, our developer playground for the Gemini API, to quickly sketch and test your ideas. Through hands-on labs and tutorials, you’ll learn how to prototype apps with little upfront setup and create custom models without needing extensive coding expertise. It’s the perfect way to turn a concept into a working model, ensuring your final structure is built on a tested and innovative design.
Start learning
Building a career in AI is about creating a future where you feel empowered and prepared, no matter how the landscape changes. We believe these courses provide the tools and the confidence to do just that.
Google Kubernetes Engine (GKE) is a powerful platform for orchestrating scalable AI and high-performance computing (HPC) workloads. But as clusters grow and jobs become more data-intensive, storage I/O can become a bottleneck. Your powerful GPUs and TPUs can end up sitting idle while waiting for data, driving up costs and slowing down innovation.
Google Cloud Managed Lustre is designed to solve this problem. Many on-premises HPC environments already use parallel file systems, and Managed Lustre makes it easier to bring those workloads to the cloud. With its managed Container Storage Interface (CSI) driver, Managed Lustre and GKE operations are fully integrated.
Optimizing your move to a high-performance parallel file system can help you get the most out of your investment from day one.
Before deploying, it’s helpful to know when to use Managed Lustre versus other options like Google Cloud Storage. For most AI and ML workloads, Managed Lustre is the recommended solution. It excels in training and checkpointing scenarios that require very low latency (less than a millisecond) and high throughput for small files, which keeps your expensive accelerators fully utilized. For data archiving or workloads with large files (over 50 MB) that can tolerate higher latency, Cloud Storage FUSE with Anywhere Cache can be another choice.
Based on our work with early customers and the learnings from our teams, here are five best practices to ensure you get the most out of Managed Lustre on GKE.
1. Design for data locality
For performance-sensitive applications, you want your compute resources and storage to be as close as possible, ideally within the same zone in a given region. When provisioning volumes dynamically, the volumeBindingMode parameter in your StorageClass is your most important tool. We strongly recommend setting it to WaitForFirstConsumer. GKE provides a built-in StorageClass for Managed Lustre that uses WaitForFirstConsumer binding mode by default.
Why it’s a best practice: Using WaitForFirstConsumer instructs GKE to delay the provisioning of the Lustre instance until a pod that needs it is scheduled. The scheduler then uses the pod’s topology constraints (i.e., the zone it’s scheduled in) to create the Lustre instance in that exact same zone. This guarantees co-location of your storage and compute, minimizing network latency.
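For illustration, here is a minimal sketch of what such a StorageClass might look like, written in Python for readability and dumped as a manifest. The provisioner name and the throughput parameter are assumptions based on typical CSI driver conventions; check the Managed Lustre CSI driver documentation for the exact values in your environment.

```python
# Sketch of a StorageClass that delays Lustre provisioning until a pod is scheduled.
# The provisioner name and parameters below are assumptions; consult the
# Managed Lustre CSI driver docs for the exact values.
import yaml

storage_class = {
    "apiVersion": "storage.k8s.io/v1",
    "kind": "StorageClass",
    "metadata": {"name": "lustre-wait-for-consumer"},   # hypothetical name
    "provisioner": "lustre.csi.storage.gke.io",         # assumed CSI driver name
    "volumeBindingMode": "WaitForFirstConsumer",         # key setting: bind in the pod's zone
    "parameters": {"perUnitStorageThroughput": "500"},   # assumed tier parameter (MB/s per TiB)
}

print(yaml.safe_dump(storage_class, sort_keys=False))
```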
2. Right-size your performance with tiers
Not all high-performance workloads are the same. Managed Lustre offers multiple performance tiers (read and write throughput in MB/s per TiB of storage) so you can align cost directly with your performance requirements.
1000 & 500 MB/s/TiB: Ideal for throughput-critical workloads like foundation model training or large-scale physics simulations where I/O bandwidth is the primary bottleneck.
250 MB/s/TiB: A balanced, cost-effective tier that’s great for many general HPC workloads, AI inference serving, and data-heavy analytics pipelines.
125 MB/s/TiB: Best for large-capacity use cases where having a massive, POSIX-compliant file system is more important than achieving peak throughput. This tier is also useful for moving on-premises containerized applications to cloud storage without modification.
Why it’s a best practice: Defaulting to the highest tier isn’t always the most cost-effective strategy. By analyzing your workload’s I/O profile, you can significantly optimize your total cost of ownership; the quick calculation below shows how tier choice translates into aggregate bandwidth.
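As a back-of-the-envelope illustration, the snippet below compares the tiers for a hypothetical 36 TiB instance; the capacity figure is made up for the example, since aggregate bandwidth scales with provisioned capacity.

```python
# Back-of-the-envelope aggregate throughput per tier for a hypothetical instance size.
capacity_tib = 36                          # assumed instance capacity
tiers_mbps_per_tib = [1000, 500, 250, 125]

for tier in tiers_mbps_per_tib:
    aggregate_mbps = tier * capacity_tib
    print(f"{tier} MB/s/TiB tier -> ~{aggregate_mbps:,} MB/s (~{aggregate_mbps / 1000:.1f} GB/s) aggregate")
```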
3. Master your networking foundation
A parallel file system is a network-attached resource, so getting the networking right up front will save you days of troubleshooting. Before provisioning, ensure your VPC is correctly configured by following the three key steps detailed in our documentation:
Enable Service Networking.
Create an IP range for VPC peering.
Create a firewall rule to allow traffic from that range on the Lustre network port (TCP 988 or 6988).
Why it’s a best practice: This is a one-time setup per VPC that establishes the secure peering connection that allows your GKE nodes to communicate with the Managed Lustre service.
4. Use dynamic provisioning for simplicity, static for long-lived shared data
The Managed Lustre CSI driver supports two modes for connecting storage to your GKE workloads.
Dynamic provisioning: Use when your storage is tightly coupled to the lifecycle of a specific workload or application. By defining a StorageClass and PersistentVolumeClaim (PVC), GKE will automatically manage the Lustre instance lifecycle for you. This is the simplest, most automated approach.
Static provisioning: Use when you have a long-lived Lustre instance that needs to be shared across multiple GKE clusters and jobs. You create the Lustre instance once, then create a PersistentVolume (PV) and PVC in your cluster to mount it. This decouples the storage lifecycle from any single workload.
Why it’s a best practice: Thinking about your data’s lifecycle helps you choose the right pattern. Use dynamic provisioning as your default because of its simplicity (a minimal example follows below), and opt for static provisioning when you need to treat your file system as a persistent, shared resource across your organization.
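Here is a minimal sketch of a dynamically provisioned claim that references the StorageClass from the earlier example; the claim name and requested capacity are placeholders.

```python
# Sketch of a PVC for dynamic provisioning; GKE creates (and later deletes)
# the Lustre instance along with this claim. Names and size are illustrative.
import yaml

pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "training-data"},               # hypothetical name
    "spec": {
        "accessModes": ["ReadWriteMany"],                 # Lustre is a shared file system
        "storageClassName": "lustre-wait-for-consumer",   # StorageClass from the earlier sketch
        "resources": {"requests": {"storage": "36Ti"}},   # assumed capacity
    },
}

print(yaml.safe_dump(pvc, sort_keys=False))
```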
5. Architect for parallelism with Kubernetes Jobs
Many AI and HPC tasks, like data preprocessing or batch inference, are suited for parallel execution. Instead of running a single, large pod, use the Kubernetes Job resource to divide the work across many smaller pods.
Consider this pattern:
Create a single PersistentVolumeClaim for your Managed Lustre instance, making it available to your cluster.
Define a Kubernetes job with parallelism set to a high number (e.g., 100).
Each pod created by the Job mounts the same Lustre PVC.
Design your application so that each pod works on a different subset of the data (e.g., processing a different range of files or data chunks).
Why it’s a best practice: This pattern turns your GKE cluster into a powerful, distributed data processing engine, as illustrated in the sketch below. The GKE Job controller acts as the parallel task orchestrator, while Managed Lustre serves as the high-speed data backbone, allowing you to achieve massive aggregate throughput.
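A minimal sketch of the pattern follows. It uses an Indexed Job so each pod gets a stable shard number via the JOB_COMPLETION_INDEX environment variable; the job name, image, command, and counts are placeholders, and the PVC name matches the earlier sketch.

```python
# Sketch of an Indexed Job whose pods share one Lustre PVC and each process a
# different data shard. Names, image, command, and counts are illustrative.
import yaml

job = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "metadata": {"name": "batch-preprocess"},            # hypothetical name
    "spec": {
        "completions": 100,
        "parallelism": 100,                               # run 100 pods at once
        "completionMode": "Indexed",                      # gives each pod a stable index
        "template": {
            "spec": {
                "restartPolicy": "Never",
                "containers": [{
                    "name": "worker",
                    "image": "us-docker.pkg.dev/my-project/preprocess:latest",  # assumed image
                    # Each worker reads only its own shard from the shared file system.
                    "command": ["sh", "-c", "python preprocess.py --shard $JOB_COMPLETION_INDEX"],
                    "volumeMounts": [{"name": "lustre", "mountPath": "/data"}],
                }],
                "volumes": [{
                    "name": "lustre",
                    "persistentVolumeClaim": {"claimName": "training-data"},  # shared PVC from step 1
                }],
            }
        },
    },
}

print(yaml.safe_dump(job, sort_keys=False))
```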
Get started today
By combining the orchestration power of GKE with the performance of Managed Lustre, you can build a truly scalable and efficient platform for AI and HPC. Following these best practices will help you create a solution that is not only powerful, but also efficient, cost-effective, and easy to manage.
As cloud infrastructure evolves, so should how you safeguard that technology. As part of our efforts to help you maintain a strong security posture, we’ve introduced powerful capabilities that can address some of the thorniest challenges faced by IT teams who work with Google Compute Engine (GCE) virtual machines and Google Kubernetes Engine (GKE) containers.
Infrastructure administrators face critical security challenges such as publicly accessible storage, software flaws, excessive permissions, and malware. That’s why we’ve introduced new, integrated security dashboards in GCE and GKE consoles, powered by Security Command Center (SCC). Available now, these dashboards can provide critical security insights and proactively highlight potential vulnerabilities, misconfiguration risks, and active threats relevant to your compute engine instances and Kubernetes clusters.
Embedding crucial security insights directly in GCE and GKE environments can empower you to address relevant security issues faster, and play a key role in maintaining a more secure environment over time.
Gain better visibility, directly where you work
The GCE Security Risk Overview page now shows top security findings, vulnerability findings over time, and common vulnerabilities and exposures (CVEs) on your virtual machines. These security insights, powered by Google Threat Intelligence, provide dynamic analysis based on the latest threats uncovered by Mandiant expert analysts. With these insights, you can make better decisions, such as which virtual machine to patch first, how to better manage public access, and which CVEs to prioritize for your engineering team.
The top security findings can help you prioritize the biggest risks in your environment, such as misconfigurations that lead to overly accessible resources, critical software vulnerabilities, and moderate risks that may combine into a critical one.
Vulnerability findings over time can help you assess how well your software engineering team is addressing known software vulnerabilities. CVE details are presented in two widgets: a heatmap showing the distribution of exploitability and potential impact of the vulnerabilities in your environment, and a list of the top five CVEs found on your virtual machines.
New GCE Security Risk Dashboard highlights top security insights.
The updated GKE console is similar, designed to help teams make better remediation decisions and catch threats before they escalate. A dedicated GKE security page displays streamlined findings on misconfigurations, top threats, and vulnerabilities:
The Workloads configuration widget highlights potential misconfigurations, such as over-permissive containers and pod and namespace risks.
Top threats highlight Kubernetes and container threats, such as cryptomining, privilege escalation, and malicious code execution.
Top software vulnerabilities highlight top CVEs and prioritize them based on their prevalence in your environment and the severity impact.
New GKE Security Posture Dashboard highlights key security insights.
Fully activate dashboards by upgrading to Security Command Center Premium
The GCE and GKE security dashboards, powered by Security Command Center, include the security findings widget (in the GCE dashboard) and the workload configurations widget (in the GKE dashboard).
To access the vulnerabilities and threats widgets, we recommend upgrading to Security Command Center Premium directly from the dashboards, available as a 30-day free trial. You can review the GCE documentation and GKE documentation to learn more about the security dashboards, and the service tier documentation to learn more about Security Command Center Premium and our different service tiers.
In the latest episode of the Agent Factory podcast, Amit Miraj and I took a deep dive into the Gemini CLI. We were joined by the creator of the Gemini CLI, Taylor Mullen, who shared the origin story, design philosophy, and future roadmap.
This post guides you through the key ideas from our conversation. Use it to quickly recap topics or dive deeper into specific segments with links and timestamps.
What is the Gemini CLI?
The Gemini CLI is a powerful, conversational AI agent that lives directly in your command line. It’s designed to be a versatile assistant that can help you with your everyday workflows. Unlike a simple chatbot, the Gemini CLI is agentic. This means it can reason, choose tools, and execute multi-step plans to accomplish a goal, all while keeping you informed. It’s open-source, extensible, and as we learned from its creator, Taylor Mullen, it’s built with a deep understanding of the developer workflow.
The Factory Floor
The Factory Floor is our segment for getting hands-on. This week, we put the Gemini CLI to the test with two real-world demos designed to tackle everyday challenges.
I kicked off the demos by tackling a problem I think every developer has faced: getting up to speed with a new codebase. This included using the Gemini CLI to complete the following tasks:
For the next demo, Amit tackled a problem close to his heart: keeping up with the flood of new AI research papers. He showed how he built a personal research assistant using the Gemini CLI to complete the following tasks:
Process a directory of research papers and generate an interactive webpage explainer for each one
Iterate on a simple prompt, creating a detailed, multi-part prompt to generate a better output
LangChain 1.0 Alpha: The popular library is refocusing around a new unified agent abstraction built on LangGraph, bringing production-grade features like state management and human-in-the-loop to the forefront.
Embedding Gemma: Google’s new family of open, lightweight embedding models that allow developers to build on-device, privacy-centric applications.
Gemma 3 270M: A tiny 270 million parameter model from Google, perfect for creating small, efficient sub-agents for simple tasks.
Gemini CLI in Zed code editor: The Gemini CLI is now integrated directly into the Zed code editor, allowing developers to explain code and generate snippets without switching contexts.
500 AI Agents Projects: A GitHub repository with a categorized list of open-source agent projects.
Transformers & LLMs cheatsheet: A resource from a team at Stanford that provides a great starting point or refresher on the fundamentals of LLMs.
Taylor Mullen on the Gemini CLI
The highlight of the episode for me was our in-depth conversation with Taylor Mullen. He gave us a fascinating look behind the curtain at the philosophy and future of the Gemini CLI. Here are some of the key questions we covered:
Taylor explained that the project started about a year and a half ago as an experiment with multi-agent systems. While the CLI version was the most compelling, the technology at the time made it too slow and expensive. He said it was “one of those things… that was almost a little bit too early.” Later, seeing the developer community embrace other AI-powered CLIs proved the demand was there. This inspired him to revisit the idea, leading to a week-long sprint where he built the first prototype.
For Taylor, the number one reason for making the Gemini CLI open source was trust and security. He emphasized, “We want people to see exactly how it operates… so they can have trust.” He also spoke passionately about the open-source community, calling it the “number one thing that’s on my mind.” He sees the community as an essential partner that helps keep the project grounded, secure, and building the right things for users.
When I asked Taylor how his team manages to ship an incredible 100 to 150 features, bug fixes, and enhancements every single week, his answer was simple: they use the Gemini CLI to build itself.
Taylor shared a story about the CLI’s first self-built feature: its own Markdown renderer. He explained that while using AI to 10x productivity is becoming easier, the real challenge is achieving 100x. For his team, this means using the agent to parallelize workflows and optimize human time. It’s not about the AI getting everything right on the first try, but about creating a tight feedback loop for human-AI collaboration at scale.
Gemini CLI under the hood: “Do what a person would do”
The guiding principle, Taylor said, is to “do what a person would do and don’t take shortcuts.” He revealed that, surprisingly, the Gemini CLI doesn’t use embeddings for code search. Instead, it performs an agentic search, using tools like grep, reading files, and finding references. This mimics the exact process a human developer would use to understand a codebase. The goal is to ground the AI in the most relevant, real-time context possible to produce the best results.
We also discussed the agent’s ability to “self-heal.” When the CLI hits a wall, it doesn’t just fail; it proposes a new plan. Taylor gave an example where the agent, after being asked for a shareable link, created a GitHub repo and used GitHub Pages to deploy the content.
The team is doubling down on extensibility. The vision is to create a rich ecosystem where anyone can build, share, and install extensions. These are not just new tools, but curated bundles of commands, instructions, and MCP servers tailored for specific workflows. He’s excited to see what the community will build and how users will customize the Gemini CLI for their unique needs.
Your turn to build
The best way to understand the power of the Gemini CLI is to try it yourself.
AI is transforming how people work and how businesses operate. But with these powerful tools comes a critical question: how do we empower our teams with AI, while ensuring corporate data remains protected?
A key answer lies in the browser, an app most employees use every day, for most of their day. Today, we announced several new AI advancements coming to Chrome, which redefine how browsers can help people with daily tasks, and work is no exception. Powerful AI capabilities right in the browser will help business users be more productive than ever, and we’re giving IT and security teams the enterprise-grade controls they need to keep company data safe.
Gemini in Chrome, with enterprise protections
Our work days can be full of distractions — endless context switching between projects, and repetitive tasks that slow people down. That’s why we’re bringing a new level of assistance directly into the browser, where many of these workflows already take place.
Gemini in Chrome1 is an AI browsing assistant that helps people at work. It can cut through the complexity of finding and making use of information across tabs and help people get work done faster. Employees can now easily summarize long and complex reports or documents, grab key insights from a video, or even brainstorm ideas for a new project with help from Gemini in Chrome. Gemini in Chrome can understand the context of a user’s tabs, and soon it will even help recall recent tabs they had open.
Gemini in Chrome will be able to recall your past tabs for you
We’re bringing these capabilities to Google Workspace business and education customers with enterprise-grade data protections, ensuring IT teams stay in control of their company’s data.
Gemini in Chrome doesn’t just help you find the information you need for your workday; you can also take action through integrations with the Google apps people use every day, like Google Calendar, Docs, and Drive. So employees can schedule a meeting right in their current workflows.
Gemini in Chrome is now integrated with your favorite Google apps
Gemini in Chrome is rolling out to Mac and Windows users in the U.S., and we’re also bringing Gemini in Chrome to mobile in the U.S. Users can also activate Gemini when using Chrome on Android, and other apps, by holding the power button. And soon, Gemini in Chrome will be built directly into the iOS app.
IT teams can configure Gemini in Chrome through policies in Chrome Enterprise Core, and enterprise data protections automatically extend to customers with qualifying editions of Google Workspace.
AI Mode from Google Search in Chrome
In addition to Gemini in Chrome, the Chrome omnibox—the address bar people use to navigate the web—is also getting an upgrade. With AI Mode, people can ask complex, multi-part questions specific to their needs in the same place where they already search. You’ll get an AI-generated response, and can keep exploring with follow-up questions and helpful web links. IT teams can manage this feature through the generative AI policies in Chrome Enterprise Core.
Proactive AI protection
We know that a browser’s greatest value is its ability to keep users safe. As the security threats from AI-generated scams and phishing attacks become more sophisticated, our defenses must evolve just as quickly. That’s why security is one of the core pillars of Chrome’s AI strategy.
Safe Browsing’s Enhanced Protection mode is now even more secure with the help of AI. We’re using it to proactively block increasingly convincing threats such as tech support scams, and will soon be expanding to fake antivirus and brand-impersonation websites. We’ve also added AI to help detect and block scammy and spammy site notifications, which has already led to billions fewer notifications being sent to Chrome users on Android every day.
AI with enterprise controls
Organizations want to empower their workforce with AI for greater productivity, but never at the expense of security. Chrome Enterprise gives IT teams the tools they need to manage these new capabilities effectively: our comprehensive policies allow IT and security teams to decide exactly which AI features in Chrome are enabled for which users, and how that data is treated.
Chrome Enterprise Premium gives organizations even more safeguards. For example, they can use URL filtering to block unapproved AI tools and point employees back to corporate-supported AI services. Within AI tools, security teams can apply data masking or other upload and copy/paste restrictions for sensitive data. These advanced capabilities further prevent sensitive information from being accidentally or maliciously shared via AI tools or other websites.
With Chrome Enterprise, AI in the browser offers businesses the best of both worlds: a highly productive, AI-enhanced user experience and the enterprise-grade security enterprises depend on to protect their data. To learn more about these new features, view our recent Behind the Browser AI Edition video.
1 Check responses for accuracy. Available on select devices and in select countries, languages, and to users 18+
Enterprises need to move from experimenting with AI agents to achieving real productivity, but many struggle to scale their agents from prototypes to secure, production-ready systems.
The question is no longer if agents deliver value, but how to deploy them with enterprise confidence. And there’s immense potential for those who solve the scaling challenge. Our 2025 ROI of AI Report reveals that 88% of agentic AI early adopters are already seeing a positive return on investment (ROI) on generative AI.
Vertex AI Agent Builder is the unified platform that helps you close this gap. It’s where you can build the smartest agents, and deploy and scale them with enterprise-grade confidence.
Today, we’ll walk you through agent development on Vertex AI Agent Builder, and highlight a couple of key updates to fuel your next wave of agent-driven productivity and growth.
The five pillars of enterprise agent development on Vertex AI Agent Builder
Moving an agent from prototype to production requires a cohesive suite of tools. Vertex AI Agent Builder simplifies this complexity by providing an integrated workflow across five essential pillars, supporting your agent through every step of its lifecycle.
1. Agent frameworks
Your agent development journey begins here. You configure and orchestrate your agents using your preferred open source framework. The Agent Development Kit (ADK) – what we use internally at Google – is one of the many options available, and it has already seen over 4.7 million downloads since April.
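To give a flavor of what this first pillar looks like in practice, here is a minimal ADK-style agent definition. It is a sketch, not production code: the tool function and instruction text are placeholders, and the model name is only an example.

```python
# Minimal Agent Development Kit (ADK) sketch: one agent with one Python tool.
# The tool, instruction, and model name are placeholders for illustration.
from google.adk.agents import Agent

def lookup_order_status(order_id: str) -> dict:
    """Hypothetical tool: in a real agent this would call an internal API."""
    return {"order_id": order_id, "status": "shipped"}

root_agent = Agent(
    name="support_agent",
    model="gemini-2.5-flash",  # example model; any supported model works
    instruction="Answer order questions. Use the lookup tool for order status.",
    tools=[lookup_order_status],
)
```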
2. Model choice
Models are the intelligent core of your agent. Our platform is provider-agnostic, supporting every leading model – including the Gemini 2.5 model family – alongside hundreds of third-party and open source models from Vertex AI Model Garden. With Provisioned Throughput, you can secure dedicated capacity for consistent, low-latency performance at scale.
3. Tools for taking actions
Once built, your agent needs tools to take action and interact with the real world. Grounding is a critical step that connects your AI to verifiable, real-time data – dramatically reducing hallucinations and building user trust. On Vertex AI, you can connect your agent to trusted, real-time data sources you already rely on. For example, Grounding with Google Maps is now available for everyone in production. Your agents gain accuracy and the ability to reduce hallucinations by accessing the freshness of Google Maps, which includes factual information on 250 million places for location-aware recommendations and actions.
4. Scalability and performance
Deploy and manage at scale using Vertex AI Agent Engine. We built this suite of modular, managed services to instantly move your prototypes into production. The platform provides everything needed for operation and scaling, including a fully managed runtime, integrated Sessions and Memory Bank to personalize context across user interactions, and integrated evaluation and observability services.
Since launch, hundreds of thousands of agents have been deployed to Vertex AI Agent Engine. Here are some recent updates we’re most excited about:
Secure code execution: We now provide a managed, sandboxed environment to run agent-generated code. This is vital for mitigating risks while unlocking advanced capabilities for tasks like financial calculations or data science modeling.
Agent-to-Agent collaboration: Build sophisticated, reliable multi-agent systems with native support for the Agent-to-Agent (A2A) protocol when you deploy to the Agent Engine runtime. This allows your agents to securely discover, collaborate, and delegate tasks to other agents, breaking down operational silos.
Real-time interactive agents: Unlock a new class of interactive experiences with Bidirectional Streaming. This provides a persistent, two-way communication channel ideal for real-time conversational AI, live customer support, and interactive applications that process audio or video inputs.
Simplified path to production: We have streamlined the journey from a local ADK prototype to a live service, with a one-line deployment in the ADK CLI to Agent Engine.
5. Built-in trust and security
Security and compliance are built into every layer of the Vertex AI architecture, keeping you in control. This includes preventing data exfiltration with VPC Service Controls (VPC-SC) and using your own encryption keys with Customer-Managed Encryption Keys (CMEK). We also meet strict compliance requirements such as HIPAA and data residency (DRZ). Your agents can handle sensitive workloads in highly regulated industries with full confidence.
Get started today
It’s time to move your AI strategy from experimentation to exponential growth. Bridge the production gap and deploy your first enterprise agent with Vertex AI Agent Builder, the secure, scalable, and intelligent advantage you need to succeed.
We are happy to drop the third installment of our Network Performance Decoded whitepaper series, where we dive into network performance and benchmarking best practices that often come up as you troubleshoot, deploy, scale, or architect your cloud-based workloads. We started this series last year to provide helpful tips that not only help you make the most of your network but also avoid costly mistakes that can drastically impact your application performance. Check out our last two installments — tuning TCP and UDP bulk flow performance, and network performance limiters.
In this installment, we provide an overview of three recent whitepapers — one on TCP retransmissions, another on the impact of headers and MTUs on data transfer performance, and finally, using netperf to measure packets per second performance.
1. Make it snappy: Tuning TCP retransmission behaviour
The A Brief Look at Tuning TCP Retransmission Behaviour whitepaper is all about how to make your online applications feel snappier by tweaking two Linux TCP settings, net.ipv4.tcp_thin_linear_timeouts and net.ipv4.tcp_rto_min_us (or rto_min). Think of it as fine-tuning your application’s response times and how quickly your application recovers when there’s a hiccup in the network; a short example of applying these settings follows the list below.
For all the gory details, you’ll need to read the paper, but here’s the lowdown on what you’ll learn:
Faster recovery is possible: By playing with these settings, especially making rto_min smaller, you can drastically cut down on how long your TCP connections just sit there doing nothing after a brief network interruption. This means your apps respond faster, and users have a smoother experience.
Newer kernels are your friend: If you’re running a newer Linux kernel (like 6.11 or later), you can go even lower with rto_min (down to 5 milliseconds!). This is because these newer kernels have smarter ways of handling things, leading to even quicker recovery.
Protective ReRoute takes resiliency to the next level: For those on Google Cloud, tuning net.ipv4.tcp_rto_min_us can actually help Google Cloud’s Protective ReRoute (PRR) mechanism kick in sooner, making your applications more resilient to network issues.
Not just for occasional outages: Even for random, isolated packet loss, these tweaks can make a difference. If you have a target for how quickly your app should respond, you can use these settings to ensure TCP retransmits data well before that deadline.
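If you want to experiment, a sketch like the one below (run as root on a test VM) shows one way to apply the two tunables by writing the /proc/sys entries directly. The 5 ms value assumes a kernel new enough to expose net.ipv4.tcp_rto_min_us (6.11 or later), and the right values for your workload are exactly what the whitepaper helps you decide.

```python
# Sketch: apply the two TCP tunables discussed above by writing /proc/sys entries.
# Run as root on a test machine; values here are examples, not recommendations.
from pathlib import Path

TUNABLES = {
    "net/ipv4/tcp_thin_linear_timeouts": "1",  # linear (not exponential) backoff for thin flows
    "net/ipv4/tcp_rto_min_us": "5000",         # 5 ms minimum RTO; requires Linux 6.11+
}

for name, value in TUNABLES.items():
    path = Path("/proc/sys") / name
    if path.exists():
        path.write_text(value)
        print(f"set {name} = {value}")
    else:
        print(f"{name} not available on this kernel")
```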
2. Beyond network link-rate
Consider more than just “link-rate” when thinking about network performance! In our Headers and Data and Bitrates whitepaper, we discuss how the true speed of data transfer is shaped by:
Headers: Think of these as necessary packaging that reduces the actual data sent per packet.
Maximum Transmission Units (MTUs): These dictate maximum packet size. Larger MTUs mean more data per packet, making your data transfers more efficient.
In cloud environments, a VM’s outbound data limit (egress cap) isn’t always the same as the physical network’s speed. While sometimes close, extra cloud-specific headers can still impact your final throughput. Optimize your MTU settings to get the most out of your cloud network. In a nutshell, it’s not just about the advertised speed, but how effectively your data travels!
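To make the header overhead concrete, here is a small back-of-the-envelope calculation comparing payload efficiency at two MTUs for plain TCP over IPv4. It assumes 20-byte IP and 20-byte TCP headers and ignores TCP options, Ethernet framing, and any cloud encapsulation overhead, which is exactly the kind of extra overhead the whitepaper digs into.

```python
# Back-of-the-envelope payload efficiency for TCP over IPv4 at two MTUs.
# Assumes 20-byte IPv4 + 20-byte TCP headers; ignores TCP options, framing,
# and encapsulation overhead.
HEADER_BYTES = 20 + 20

for mtu in (1460, 8896):  # 1460 is GCE's default MTU; 8896 is its jumbo-frame maximum
    payload = mtu - HEADER_BYTES
    efficiency = payload / mtu
    print(f"MTU {mtu}: {payload} payload bytes per packet "
          f"({efficiency:.1%} of bytes on the wire carry data)")
```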
3. How many transactions can you handle?
In Measuring Aggregate Packets Per Second with netperf, you’ll learn how to use netperf to figure out how many transactions (and thus packets) per second your network can handle, which is super useful for systems that aren’t just pushing huge files around. Go beyond just measuring bulk transfers and learn a way to measure the packets per second rates which can gate the performance of your request/response applications.
Here’s what you’ll learn:
Beating skew error: Ever noticed weird results when running a bunch of netperf tests at once? That’s “skew error,” and this whitepaper describes using “demo mode” to fix it, giving you way more accurate overall performance numbers.
Sizing up your test: Get practical tips on how many “load generators” (the machines sending the traffic) and how many concurrent streams you need to get reliable results. Basically, you want enough power to truly challenge your system.
Why UDP burst mode is your friend: It explains why using “burst-mode UDP/RR” is the secret sauce for measuring packets per second. TCP, as smart as it is, can sometimes hide the true packet rate because it tries to be too efficient.
Full-spectrum testing and analysis: The whitepaper walks you through different test types you can run with the runemomniaggdemo.sh script, giving you an effective means to measure how many network transactions per second the instance under test can achieve. This might help you infer aspects of the rest of your network that influence this benchmark. Plus, it shows you how to crunch the numbers and even get some sweet graphs to visualize your findings.
Stay tuned
With these resources, our goal is to foster an open, collaborative community for network benchmarking and troubleshooting. While our examples may be drawn from Google Cloud, the underlying principles are universally applicable, no matter where your workloads operate. You can access all our whitepapers — past, present, and future — on our webpage. Be sure to check back for more!
Though its name may suggest otherwise, Seattle Children’s is the largest pediatric healthcare system in the world.
While its main campus is in its namesake city, Seattle Children’s also encompasses 47 satellite hospitals across Alaska, Montana, Idaho, and Washington, and patients come from as far away as Hawaii for treatment. For more than 100 years, Seattle Children’s has helped kids across the Western U.S. get healthy and stay healthy, regardless of the ability to pay.
With so much ground to cover and diverse patient populations to treat, Seattle Children’s has always looked to new technologies to bring improved, consistent care to its patients and providers. Generative AI is now the latest advance in their medical toolkit.
It started roughly two decades ago, when Seattle Children’s created its pediatric clinical pathways, a set of standardized protocols designed to help clinicians make quicker and more reliable decisions to address dozens of medical conditions. Such pathways were becoming commonplace across medicine, and Seattle Children’s had developed some of the first for children’s unique medical needs.
Innovative as these were, they still required clinicians to thumb through indexes and long binders of information to find what they needed for a given ailment. And in healthcare, it’s often the case that every second counts.
Seattle Children’s was already working with Google Cloud on a number of projects, and as we began to explore the potential for generative AI to make the work of our clinicians easier, the clinical pathways seemed like an obvious place to start. Using Vertex AI and Gemini, we were able to quickly develop our Pathway Assistant, which drew on the clinical pathways documentation and supercharged it with not just searchability but conversationality.
Instead of flipping pages, we’d flipped the script on how quickly and reliably clinicians could find the lifesaving information they needed.
The pathways to improved healthcare run through Gemini
“Clinical pathways” are end-to-end treatment protocols for a specific condition or illness. Seattle Children’s pediatric clinical pathways are widely respected and used by hospitals around the globe, providing information on everything from diagnostic criteria to testing protocols to medication recommendations.
Previously, these clinical pathways were documented exclusively in PDFs — hundreds of thousands of pages of them. Performing a traditional search of their contents for the answers clinicians needed delayed their ability to provide treatment in an environment where minutes or even seconds can be critical.
Google Cloud engineers worked with Seattle Children’s informatics physicians, who straddle the worlds of healthcare and technology, to create Pathway Assistant: a new multimodal AI chatbot that responds to spoken or written natural-language queries using the information in those PDFs.
After processing a question, Pathway Assistant searches each PDF’s metadata, which contains semi-structured data in JSON format that’s been extracted from the PDFs by Gemini and curated by clinicians. It then selects the most relevant PDFs, parses the information — including any complex flowcharts, diagrams, and illustrations embedded in them — and answers the clinician’s question in just a few seconds.
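The general shape of that flow, greatly simplified, is sketched below using the Vertex AI SDK. This is not Seattle Children’s code: the project, model name, file URIs, and keyword matching are placeholders, and the production system’s metadata search is far richer than this naive filter.

```python
# Greatly simplified sketch of a "filter metadata, then ask Gemini" flow over
# pathway PDFs stored in Cloud Storage. Not the production Pathway Assistant.
import vertexai
from vertexai.generative_models import GenerativeModel, Part

vertexai.init(project="my-project", location="us-central1")  # placeholder project
model = GenerativeModel("gemini-2.0-flash")                   # example model name

# Curated metadata index: in production this is extracted by Gemini and reviewed by clinicians.
metadata_index = [
    {"gcs_uri": "gs://pathways/asthma.pdf", "keywords": ["asthma", "wheezing"]},
    {"gcs_uri": "gs://pathways/sepsis.pdf", "keywords": ["sepsis", "fever"]},
]

def select_pathways(question: str) -> list[str]:
    """Naive stand-in for metadata search: keep PDFs whose keywords appear in the question."""
    q = question.lower()
    return [m["gcs_uri"] for m in metadata_index if any(k in q for k in m["keywords"])]

def answer(question: str) -> str:
    pdf_parts = [Part.from_uri(uri, mime_type="application/pdf") for uri in select_pathways(question)]
    prompt = f"Using only these pathway documents, answer the clinician's question: {question}"
    return model.generate_content(pdf_parts + [prompt]).text

print(answer("What is the first-line treatment on the asthma pathway?"))
```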
Interactive information-finding for accurate decision-making
Pathway Assistant becomes more accurate with use. Healthcare providers can “discuss” clinical pathways with the chatbot, which, instead of answering a question, poses questions of its own if it needs clarification, going back and forth until it’s confident it can answer accurately. The chatbot always displays the specific sections of each PDF that were the source for formulating its answers, helping clinicians confirm the veracity of responses.
The interface also includes a way for users to provide feedback about the accuracy and appropriateness of the chatbot’s analysis and answers. The feedback is then logged in a BigQuery table for future forensic analysis — both by clinicians, who can query the information using natural language, and by the built-in Gemini models, which process the feedback and summarize what clinicians found confusing or how to improve the accuracy of future answers.
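A minimal version of that feedback-logging step might look like the sketch below, using the BigQuery client library; the project, dataset, table, and field names are assumptions for illustration.

```python
# Sketch: log chatbot feedback rows to BigQuery for later analysis.
# Table ID and field names are assumptions, not the production schema.
from datetime import datetime, timezone
from google.cloud import bigquery

client = bigquery.Client()
TABLE_ID = "my-project.assistant_feedback.responses"  # placeholder table

def log_feedback(question: str, answer: str, rating: str, comment: str) -> None:
    rows = [{
        "ts": datetime.now(timezone.utc).isoformat(),
        "question": question,
        "answer": answer,
        "rating": rating,    # e.g. "helpful" / "inaccurate"
        "comment": comment,
    }]
    errors = client.insert_rows_json(TABLE_ID, rows)  # streaming insert
    if errors:
        raise RuntimeError(f"BigQuery insert failed: {errors}")
```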
This reflexive capability enables Pathway Assistant to update the PDFs based on clinicians’ feedback if the inaccuracy stemmed from the PDF’s content. Clinicians are also finding that the metadata is becoming more accurate and requiring less curation. Pathway Assistant even corrects typos in the documentation automatically. And as new clinical pathways are developed, PDFs containing the latest information are added to the PDF library.
This growing collection is housed securely in Google Cloud Storage, and the bigger it gets, the more useful it becomes — which wasn’t always the case. Whereas an expanding paper-based collection contained more information, it was also more material to wade through, which is especially challenging in emergency medical situations. Pathway Assistant almost entirely relieves this burden, synthesizing and delivering the most complete information at any time in a matter of seconds.
Ultimately, Pathway Assistant is not a decision-making tool but rather an information-finding tool. Research into critical, evidence-based guidelines that used to take hours now takes minutes.
This speed and effectiveness helps clinicians make the right decisions more quickly at the point of care, drastically reducing research time and improving patient safety and outcomes. As a result, clinicians can spend more time with more patients, not with more PDFs.
Ask any physician and they’ll tell you that’s what the best medical technology enables them to do — focus on the patient, not paperwork.
In today’s world where instant responses and seamless experiences are the norm, industries like mortgage servicing face tough challenges. When navigating a maze of regulations, piles of financial documents, and the high stakes of homeownership, consumers quickly find that even simple questions can turn into complicated issues. And the same can be true for the customer reps trying to help them navigate all that complexity.
Like many enterprises, Mr. Cooper is exploring how agentic AI and advanced AI agents can help both our customers and employees meet their needs with confidence. In our work to develop just such an agent with Google Cloud, one of our more curious discoveries has been that, like a good team, the best AI agents may just be made up of groups of agents with distinct skill sets and abilities, and that we get the best results when they work in concert.
At Mr. Cooper, our mission is to “Keep the dream of homeownership alive.” We’re here to simplify the journey, provide clarity, and ensure our customers feel confident every step of the way. That confidence is key when they’re making one of the most consequential purchases, and decisions, of their lives.
With those dual goals of simplicity and certainty in mind, we partnered with Google Cloud to develop an agentic AI framework designed to complement and support our team. We call it the Coaching Intelligent Education & Resource Agent, or CIERA. We asked ourselves how to implement a chatbot that could effectively collaborate with our human agents to streamline both sides of the customer service experience.
And just as we prioritize hiring great groups of customer reps and mortgage agents, we’ve discovered how important it is to put together the right group of agents to effectively meet the needs of all our users. CIERA is designed to do exactly that, handling routine and time-consuming tasks to enhance efficiency, while empowering our people to focus on delivering what they do best — empathy, judgment and meaningful human connection.
CIERA represents an exciting step forward in blending human expertise with AI capabilities, creating a collaborative approach that elevates both the customer experience and our team’s impact. And just as important as this work is for Mr. Cooper, CIERA also demonstrates how our multi-agent approach can serve as a model for companies across industries. Read on to learn how we did it, and how you can, too.
The challenge: Beyond the reach of traditional automation
Mortgage servicing is uniquely complex: a customer might have a single question that requires an agent to cross-reference multiple documents.
This presents several challenges for traditional automation:
Siloed Knowledge: Scattered information makes it hard to see the full picture, but AI surfaces key data, helping agents make faster, smarter decisions for customers.
Lack of Understanding: Traditional systems rely on rigid keywords and decision trees, often missing the true intent behind customer inquiries. Our AI framework uncovers context and intent, equipping agents with the insights they need to respond with empathy and accuracy (see the sketch after this list).
Inflexible Processes: When conversations take unexpected turns, legacy automation often fails, creating dead ends for customers and the team. AI provides real-time adaptive guidance, helping agents navigate these twists seamlessly.
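To make that contrast concrete, here is a minimal sketch of what LLM-based intent detection can look like, using the google-genai Python SDK. The intent labels, prompt, and model name are illustrative assumptions for this post, not Mr. Cooper’s production logic.

```python
# Minimal sketch: LLM-based intent detection instead of rigid keyword matching.
# The intent labels, prompt, and model name are illustrative assumptions.
from google import genai

INTENTS = ["escrow_increase", "payoff_request", "hardship_assistance", "general_question"]

client = genai.Client()  # reads API key or project configuration from the environment


def classify_intent(customer_message: str) -> str:
    """Ask the model to map a free-form message onto one known intent."""
    prompt = (
        "Classify the customer's message into exactly one of these intents: "
        f"{', '.join(INTENTS)}.\n"
        f"Message: {customer_message}\n"
        "Answer with only the intent label."
    )
    response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=prompt,
    )
    label = (response.text or "").strip()
    return label if label in INTENTS else "general_question"


# A keyword router would likely miss this phrasing; a model can still infer the intent.
print(classify_intent("My monthly payment just jumped and I don't understand the new amount."))
```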
To truly elevate the customer experience, we needed a solution capable of reasoning, orchestrating, and understanding context — one that enhances and amplifies our capabilities to deliver exceptional service.
The vision: Introducing CIERA, a collaborative AI agent workforce
Our vision was to create an agentic framework that supports our call center agents by leveraging Google Cloud’s Vertex AI platform. CIERA’s AI agents handle repetitive and complex tasks, allowing our team to focus on what technology can’t.
Guided by the principle that AI enhances human performance, these digital collaborators are designed to deliver accurate, comprehensive, and human-centered solutions.
Building the agent workforce: Our architectural blueprint
Our modular architecture assigns distinct roles to each AI agent, creating a scalable, efficient, and manageable system that collaborates seamlessly with people to make work smoother and more rewarding.
Meet the key players of our digital team and the solutions they deliver for team members and customers (a short code sketch of how such a team might be wired together follows the list):
Sage, the Head Agent: Sage monitors how all other AI agents perform. By learning from patterns across workflows, Sage helps ensure that each AI agent works in harmony with human teams. Key abilities include intelligent agent monitoring, trend recognition, and orchestration fine-tuning.
Ava, the Orchestrator: Ava serves as the team’s coordinator, managing complex customer inquiries by breaking them into manageable tasks and assigning them to the appropriate AI assistants. While Ava doesn’t interact directly with customers, it ensures processes run smoothly, empowering human agents to remain central to delivering solutions.
Lex, the Task Specialist: Lex specializes in complex tasks, helping human agents during customer calls by quickly surfacing answers to questions about loan applications or escrow analyses. Working behind the scenes, Lex provides insights that allow people to focus on connecting with customers and making informed decisions.
Sky, the Data Specialist: Sky helps human teams navigate internal knowledge bases and FAQs. For questions about policies, procedures, or definitions, Sky provides accurate and timely information, freeing people to spend more time on meaningful interactions, rather than searching for data.
Remy, the Memory Agent: Remy assists by remembering past actions and outcomes, which helps personalize workflows and inform future decisions. Remy’s memory supports ongoing learning and training, making it easier for human agents to access shared knowledge and continuously improve their skills.
Iris, the Evaluation Agent: By evaluating confidence scores, detecting hallucinations, and grounding responses with Model Armor, Iris ensures consistency and authenticity, helping human agents provide trustworthy customer support.
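Since the acknowledgments below credit Google’s Agent Development Kit (ADK), here is a minimal, hypothetical sketch of how a team like this could be declared with ADK. The agent names mirror the roles above, but the model, instructions, and lookup tool are assumptions made for illustration, not CIERA’s actual implementation.

```python
# Minimal sketch of a CIERA-like agent team declared with Google's Agent Development
# Kit (ADK). The model name, instructions, and tool below are illustrative assumptions.
from google.adk.agents import LlmAgent


def lookup_escrow_analysis(loan_id: str) -> dict:
    """Hypothetical tool: fetch the latest escrow analysis for a loan."""
    return {"loan_id": loan_id, "property_tax_change": 200, "new_total_payment": 1850}


lex = LlmAgent(
    name="lex",
    model="gemini-2.0-flash",
    description="Task specialist for loan applications and escrow analyses.",
    instruction="Answer loan and escrow questions using the available tools.",
    tools=[lookup_escrow_analysis],
)

sky = LlmAgent(
    name="sky",
    model="gemini-2.0-flash",
    description="Data specialist for internal policies, procedures, and FAQs.",
    instruction="Answer policy and procedure questions from the knowledge base.",
)

# Ava orchestrates: in ADK, a parent agent can delegate to whichever sub-agent's
# description best matches the task at hand.
ava = LlmAgent(
    name="ava",
    model="gemini-2.0-flash",
    description="Orchestrator that breaks customer inquiries into tasks.",
    instruction="Split each inquiry into tasks and delegate to the right specialist.",
    sub_agents=[lex, sky],
)
```

Running the team end to end would also require an ADK runner and a session service, which we omit here for brevity.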
A sample analysis performed by CIERA.
How it works in practice: A real-life scenario
Imagine a customer initiates a call asking, “I received a notice my escrow payment is increasing. Can you explain why and tell me what my new total payment will be?”
Instead of relying solely on automated responses, CIERA ensures every step is grounded in close partnership between AI agents and human team members:
Orchestration: Ava receives the query, understands the two distinct parts (the “why” and the “what”), and creates a plan. Ava consults with a human agent, confirms the correct context and then delegates tasks to the Lex agents.
Parallel Processing: With human oversight, Ava assigns the “why” task to Lex, pointing it to the customer’s most recent escrow analysis document. Simultaneously, it tasks another Lex agent to calculate the new total payment based on data from our systems (see the sketch after this walkthrough).
Synthesis: The Lex agent reads the document and reports back to the human agent: “The increase is due to a $200 annual rise in property taxes.” The other agent confirms the new total payment. The human similarly reviews the payment calculation before moving ahead.
Resolution: Ava gathers all AI-generated insights, but the human agent validates and personalizes the final response as needed to ensure clarity, empathy, and accuracy before delivering it to the customer.
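As a rough sketch of the parallel step, ADK’s workflow agents can fan the two sub-tasks out concurrently and then hand a combined draft to a human agent for review. Everything below, including the agent names, instructions, and the use of session-state keys, is an illustrative assumption rather than CIERA’s production workflow.

```python
# Minimal sketch: run the "why" and "what" sub-tasks in parallel, then synthesize a
# draft for human review, using ADK workflow agents. Names and prompts are illustrative.
from google.adk.agents import LlmAgent, ParallelAgent, SequentialAgent

explain_why = LlmAgent(
    name="lex_explain_increase",
    model="gemini-2.0-flash",
    instruction="Explain why the escrow payment increased, citing the latest escrow analysis.",
    output_key="why_answer",  # stores the result in shared session state
)

compute_new_payment = LlmAgent(
    name="lex_new_payment",
    model="gemini-2.0-flash",
    instruction="Calculate the customer's new total monthly payment.",
    output_key="payment_answer",
)

draft_response = LlmAgent(
    name="ava_synthesis",
    model="gemini-2.0-flash",
    instruction=(
        "Combine {why_answer} and {payment_answer} into a draft reply "
        "for the human agent to review before it reaches the customer."
    ),
)

# Fan out the research tasks, then synthesize a draft that a person validates.
escrow_flow = SequentialAgent(
    name="escrow_inquiry_flow",
    sub_agents=[
        ParallelAgent(name="parallel_research", sub_agents=[explain_why, compute_new_payment]),
        draft_response,
    ],
)
```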
This human-in-the-loop approach resolves complex, multifaceted questions with the efficiency of advanced AI and the nuanced understanding and trust that only people can provide. The partnership ensures every answer is not just quick, but also trustworthy and tailored to the customer’s needs.
Ensuring quality and trust: The “agentic pulse” and human oversight
In a regulated industry like ours, trust and accuracy are non-negotiable. Deploying advanced AI requires an equally advanced framework for evaluation and governance. To achieve this, we developed two key concepts:
The “Agentic Pulse” Dashboard: Our central command center for monitoring the health and performance of our agent workforce. Powered by model-based evaluation services within the Vertex AI platform, it goes beyond simple metrics. We track:
Faithfulness: Is the agent’s answer grounded in the source documents?
Relevance: Does the answer directly address the customer’s question?
Safety: Does the agent avoid generating harmful or inappropriate content?
Business Metrics: How do we correlate these quality scores with classic KPIs like average handle time (AHT) and customer satisfaction (CSAT)?
The “Sandbox” for HITL: Our “Sandbox” environment provides space for our business and technical teams to safely review, test, and refine agent processes. Additionally, if the “Agentic Pulse” flags an interaction for review, a human expert can analyze the agent’s reasoning and provide feedback, ensuring a continuous cycle of improvement and learning (a minimal sketch of this scoring-and-review loop follows below).
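As a rough illustration of how such a loop might be wired up with the Vertex AI Gen AI evaluation SDK, the sketch below scores a batch of interactions with model-based metrics and flags low scorers for sandbox review. The project ID, threshold, dataset columns, and flagging logic are assumptions made for illustration, not our production dashboard.

```python
# Minimal sketch of an "Agentic Pulse"-style check: score recent interactions with
# model-based metrics, then queue low scorers for human review in the sandbox.
import pandas as pd
import vertexai
from vertexai.evaluation import EvalTask, MetricPromptTemplateExamples

vertexai.init(project="my-project", location="us-central1")  # hypothetical project

interactions = pd.DataFrame({
    "prompt": ["Why did my escrow payment increase?"],
    "response": ["Your property taxes rose by $200 this year, which raised your escrow."],
})

eval_task = EvalTask(
    dataset=interactions,
    metrics=[
        MetricPromptTemplateExamples.Pointwise.GROUNDEDNESS,
        MetricPromptTemplateExamples.Pointwise.SAFETY,
    ],
)
result = eval_task.evaluate()
print(result.summary_metrics)  # aggregate scores for the dashboard view

# Per-interaction rows live in result.metrics_table; we assume per-metric score
# columns end in "/score" and flag anything below a review threshold for a human.
REVIEW_THRESHOLD = 3.0  # illustrative rubric cutoff
score_cols = [c for c in result.metrics_table.columns if c.endswith("/score")]
flagged = result.metrics_table[(result.metrics_table[score_cols] < REVIEW_THRESHOLD).any(axis=1)]
print(f"{len(flagged)} interaction(s) queued for sandbox review")
```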
This robust governance framework gives us the confidence to deploy these powerful tools responsibly.
Example of a theoretical loan analysis assisted by CIERA.
Projected impact: From complex processes to clear wins
While CIERA is still on its journey toward full production, our projections, based on extensive testing and modeling, point to transformative gains across the board:
For our customers: We project a reduction in wait times and a higher rate of first-contact resolution, so customers get answers more quickly, with the added benefit of round-the-clock support for many complex scenarios.
For our human agents: By automating tedious research, CIERA will free up our human agents to focus on the sensitive and complex customer relationships that require a human touch, while giving them better tools and resources for more engaging work.
For our business: We anticipate a major reduction in average handling times for a large segment of inquiries and faster, more accurate resolutions that are a direct driver of customer happiness and loyalty.
Beyond mortgages: A blueprint for any complex industry
The architectural patterns developed with CIERA are not limited to mortgage servicing. This agentic approach, in which an orchestrator manages a team of specialized AI agents, is a powerful blueprint for any industry grappling with information and task complexity, including healthcare, logistics, and manufacturing.
A typical workflow with CIERA.
The future is agentic and collaborative
Our journey with CIERA is just beginning, but it has already solidified our belief that the future of customer service is agent-driven. By combining Mr. Cooper’s deep industry expertise with Google Cloud’s world-class AI infrastructure, we are not just building bots; we are cultivating a digital workforce.
This collaboration is about more than just lowering costs or improving efficiency — it’s about building trust, delivering clarity, and creating a customer experience truly worthy of the dream of homeownership.
The team would like to thank Googlers Sumit Agrawal and Crispin Velez and the GSD AI Incubation team for their support and technical leadership on agents and agent frameworks as well as their deep expertise in ADK, MCP, and large language model evaluations.
Organizations today face immense pressure to secure their digital assets against increasingly sophisticated threats — without overwhelming their teams or budgets.
Using managed security service providers (MSSPs) to implement and optimize new technology and to handle security operations is a strategic delegation that can make internal security operations staff more efficient and effective.
At Google Cloud, we understand the value that MSSPs can bring, so we’ve built a robust ecosystem of MSSP partners, specifically empowered to help you modernize security operations and achieve better security outcomes, faster.
MSSPs can help ease the pressure from three key challenges that can prevent organizations from staying ahead of threats and achieving the cybersecurity outcomes they need to build operational resilience, power innovation, and create growth.
Prolonged time to value and disruptive deployment: CIOs need security investments to demonstrate value quickly. Yet deploying new security solutions or migrating from existing ones can be costly and time-consuming, leading to protracted implementation cycles that delay anticipated benefits and increase an organization’s risk exposure throughout the transition period.
Limited resources, talent, and expertise: CISOs often find their teams stretched thin, struggling with the sheer volume of security alerts and manual tasks while often lacking specialized knowledge in emerging threat areas or modern security solutions. The demand for cybersecurity professionals continues to grow faster than the supply of qualified workers. The 2024 ISC2 Cybersecurity Workforce Study estimated a global workforce gap of 4.8 million professionals.
High costs: CEOs often see the cost of building and maintaining internal resources to protect the business, and wonder why they’re not getting the expected return on their investment in terms of successful security outcomes. A purely in-house cybersecurity strategy demands substantial upfront capital investment, ongoing operational costs, and a significant commitment to recruiting and retaining highly specialized talent in a competitive market.
An expert, experienced member of the team armed with modern technology
Addressing these challenges effectively requires strategic investments and a clear understanding of where to allocate resources to achieve optimal security outcomes. Approved Google Security Operations MSSP partners are uniquely positioned to help you overcome these hurdles, combining their deep expertise with the power of Google Cloud Security products.
Accelerate time to value and deployment: Google Security Operations MSSP partners can help you to accelerate your security operations modernization journey with tailored migrations and efficient technology deployments. With thousands of security operations transformations and deployments under their belt, MSSP partners can get your company’s rules, detections, alerts, and telemetry sources in production quickly.
Augment resources, talent, and expertise: By using best-in-class tooling like intelligence-driven and AI-enabled Google Cloud Security products, partners can filter out noise and escalate only issues requiring business context, helping to reduce the manual work your team faces.
Drive cost efficiency and better outcomes: Delegating some or all of your organization’s security efforts to external resources such as MSSPs or managed detection and response (MDR) services offers immediate access to specialized expertise, advanced technologies, 24/7 monitoring, and scalable solutions without the overhead of an in-house team.
Why partner with a Google Cloud SecOps MSSP?
Choosing a certified Google Cloud MSSP partner means gaining access to differentiated, end-to-end security solutions powered by Google Cloud Security products, including Google Security Operations, Google Threat Intelligence, and Mandiant Solutions. Our tools provide technical advantages like comprehensive data ingestion from multiple clouds and context-aware detections to prioritize threats.
These capabilities help our partners deliver managed security services tailored to the security requirements of your business. Partners can help you offload security operations strain, increase risk awareness, and significantly reduce response times. They can protect your workloads regardless of location (on-premises or multicloud) and integrate with your existing security investments.
Hear from some of our partners and customers on the value they are seeing:
“In partnership with Google Cloud, we ensure comprehensive protection for your workloads, regardless of their location. We leverage telemetry from your existing security solutions to provide seamless and robust defense. This integrated approach maximizes your security investment and minimizes risk,” said Laurent Besset, deputy CEO, Cyberdefense Ops, I-TRACING.
As an example of the results possible on Google Cloud, Jayesh Barai, VP of Sales at Netenrich, shared a recent customer success story: “We’ve seen customers achieve transformative results by tackling legacy security operations head-on. With Netenrich’s AI-driven Adaptive MDR solution, one client’s security efficacy became remarkable: they reduced their mean time to detect and respond from hours to just 15 minutes and cut monthly security incidents needing manual intervention from nearly 2,000 to fewer than 10. This operational efficiency also drove major cost savings, including a 50% reduction in annual security expenses and an 80% reduction in SOC staffing requirements, dramatically streamlining their ability to integrate new acquisitions in days instead of months.”
“Google Cloud’s global reach and unwavering commitment to security and innovation make it an ideal partner to help us safeguard our clients’ most valuable assets. Google Cloud’s experience and expertise in the field, coupled with its transparent and open approach, instill a level of trust that is essential in today’s interconnected world,” said Scott Goree, senior vice president, Partners, Alliances, and Ecosystems, Optiv.
Find the right partner
You can learn more about how a Google Cloud MSSP partner can help your organization modernize security operations by visiting our updated MSSP Page.