2025 07 22

GCP – 25+ top gen AI how-to guides for enterprise

The best way to learn AI is by building. From finding quick ways to deploy open models to building complex, multi-agentic systems, it’s easy to feel overwhelmed by the sheer volume of resources out there.

To that end, we’ve compiled a living, curated collection of our 25+ favorite how-to guides for Google Cloud. This collection is split into four areas:

Faster model deployment: Create efficient CI/CD pipelines, deploy large models like Llama 3 on high-performance infrastructure, and use open models in Vertex AI Studio.
Building generative AI apps & multi-agentic systems: Build document summarizers, multi-turn chat apps, and advanced research agents with LangGraph.
Fine-tuning, evaluation, and Retrieval-Augmented Generation (RAG): Refine models with supervised fine-tuning, RAG, and Reinforcement Learning from Human Feedback (RLHF).
Integrations: Connect your AI to the world by building multilingual mobile chatbots or integrating with Google Cloud Databases.

Bookmark this page and check back often for our latest finds.

aside_block: <ListValue: [StructValue([(‘title’, ‘$300 in free credit to try Google Cloud AI and ML’), (‘body’, <wagtail.rich_text.RichText object at 0x3e1edc393ca0>), (‘btn_text’, ‘Start building for free’), (‘href’, ‘http://console.cloud.google.com/freetrial?redirectPath=/vertex-ai/’), (‘image’, None)])]>

Faster model deployment

1. Build a CI/CD pipeline for your ML workflow. Automate the process of building, testing, and deploying a Vertex AI Pipeline by connecting a GitHub repo to Cloud Build triggers. Github repository.

2. Deploy large models like Llama 3 on high-performance A3 VMs. This guide provides the Terraform scripts to provision an AI Hypercomputer cluster (A3 VMs with GPUs) and deploy large open models using JAX for maximum performance. GitHub provisioning documentation.

3. Access DeepSeek models and Llama 4 models on AI Hypercomputer. This TPU recipe outlines the steps to deploy the Llama-4-Scout-17B-16E Model with JetStream MaxText Engine with Trillium TPU. You can deploy Llama4 Scout and Maverick models or DeepSeekV3/R1 models today using inference recipes from the AI Hypercomputer Github repository.

4. Use open models in Vertex AI Studio. Model selection isn’t limited to Gemini anymore–you can select Claude models, too. Here’s the documentation to use open models in Vertex AI Studio.

5. Build and deploy a remote MCP server to Google Cloud Run in under 10 minutes. Drawing directly from the official Cloud Run documentation for hosting MCP servers, this guide shows you the straightforward process of setting up your very own remote MCP server. Blog.

Building gen AI apps & multi-agentic systems

6. Create a document (text) summarizer with Gemini Pro. This Python notebook shows you how to use the Vertex AI SDK to interact with the Gemini Pro model for a practical task: generating a concise summary of a long document. Github recipe.

7. Build multi-turn chat applications with Gemini. This notebook demonstrates how to use the Gemini API to build a stateful, multi-turn chat service that can remember conversation history. Official documentation.

8. Build a multimodal research agent with LangGraph. An advanced recipe for building a true AI agent that can work in a loop. It uses LangGraph to create a workflow where the agent can search the web, analyze images from the results using Gemini, and synthesize a final answer. Sample code. Blog.

9. Get AI to write good SQL queries (text-to-SQL). Learn state-of-the-art approaches to context building and table retrieval, how to do effective evaluation of text-to-SQL quality with LLM-as-a-judge techniques, the best approaches to LLM prompting and post-processing, and how we approach techniques that allow the system to offer virtually certified correct answers. Guide.

10. Convert standalone ADK/MCP agent into an A2A-compatible component and build an orchestrator to manage such agents. Project source code. Official A2A Python SDK. Official A2A Sample Projects

11. Build a simple multi-agent system using ADK – in this case, a trip planning system. Explore project source code.

12. Build an interactive data anonymizer agent using Google’s ADK. The agent interactively analyzes a table’s schema and data to identify sensitive columns, then proposes and generates a ready-to-run SQL script to create an anonymized and sampled copy. Explore project sample code.

13. Build a strong brand logo with Imagen 3 and Gemini. Learn how you can build your brand style with a logo using Imagen 3, Gemini, and the Python Library Pillow. Sample code.

Fine-tuning, evaluation, and RAG

14. The ultimate best practices guide for Supervised Fine Tuning with Gemini. This guide takes you deeper into how developers can streamline their SFT process, including: selecting the optimal model version, crafting a high quality dataset, and best practices to evaluate the models, including tools to diagnose and overcome problems. Full guide. Gen AI repo.

15. The ultimate guide for getting started with Vertex AI RAG. Bookmark the top concepts for understanding Vertex AI RAG Engine. These concepts are listed in the order of the retrieval-augmented generation (RAG) process. Getting started notebook.

16. Design a production-ready RAG system. A comprehensive architecture guide for understanding the end-to-end role of Vertex AI and Vector Search in a generative AI app. It includes system diagrams, design considerations, and best practices. Official architecture guide.

17. Advanced RAG Techniques: Vertex RAG Engine retrieval quality evaluation and hyperparameters tuning. Learn how to evaluate and perform hyperparameter tuning for retrieval with RAG Engine. Github repo.

18. Fine-tune models using reinforcement learning (RLHF). This tutorial demonstrates how to use reinforcement learning from human feedback (RLHF) on Vertex AI to tune a large-language model (LLM). This workflow uses feedback gathered from humans to improve a model’s accuracy. Colab.

19. Fine-tune video inputs on Vertex AI. If your work involves content moderation, video captioning, and detailed event localization, this guide is for you. Sample notebook.

20. Rapidly compare text prompts and models during development. Use this “Rapid Evaluation” SDK to quickly compare the outputs of different text-based prompts or models side-by-side. Colab.

21. Get feature attributions with Explainable AI. For classification and regression models, know why a model made a certain prediction using Vertex Explainable AI. Documentation.

22. Optimize your RAG retrieval. Step-by-step ways to minimize hallucinations and build trust in AI applications, from root cause analysis to creating a testing framework. Blog.

Integrations

23. Build a multilingual chatbot for mobile. A complete end-to-end guide for building a multilingual chatbot on Android. It combines Gemma, the Gemini API, and MCP to create a powerful, global-ready application. Github repo. Blog.

24. Develop ADK agents that connect to external MCP servers. Use this example of an ADK agent leveraging MCP to access Wikipedia articles, which is a common use case to retrieve external specialised data. We will also introduce Streamable HTTP, the next-generation transport protocol designed to succeed SSE for MCP communications. Guide.

25. Encode text embeddings using the Vertex AI embeddings for text service and the StackOverflow dataset. Vector Search is a fully managed offering, further reducing operational overhead. It’s built upon Approximate Nearest Neighbor (ANN) technology developed by Google Research. Notebook.

26. Integrate MCP with Google Cloud Databases. Learn how to integrate any MCP-compatible AI assistant (including Claude Code, Cursor, Windsurf, Cline, and many more) with Google Cloud Databases. The blog walks you through how to write application code that queries your database, design a schema for a new application, refactor code when the data model changes, generate data for integration testing, etc. Blog.

Stay tuned

And that’s a wrap — for now. Did we miss a game-changing GitHub repo or a codelab that saved you hours of work? Share your favorite resources with us on X.

GCP – 25+ top gen AI how-to guides for enterprise

Faster model deployment

Building gen AI apps & multi-agentic systems

Fine-tuning, evaluation, and RAG

Integrations

Stay tuned

Related Posts

AWS – Amazon EC2 G6 instances now available in the AWS GovCloud (US-East) Region

AWS – Amazon EC2 Single GPU P5 instances are now generally available

AWS – Amazon SageMaker AI now supports P6e-GB200 UltraServers