GCP – Day 2 at Next ’24 recap: building AI agents
Hello from Las Vegas, where day two of Google Cloud Next ’24 just wrapped up. What happened in Vegas today? We got hands-on with Gemini and AI agents.
At the annual developer keynote, Google Cloud Chief Evangelist Richard Seroter told the audience how Gemini in Google Cloud can not only meet you where you are today, “but frankly take you further than anyone else can.”
In a wide-ranging presentation full of demos, Richard and Senior Developer Advocate Chloe Condon dove deep with fellow Googlers and partners into the Google Cloud AI technologies and integrations that help with the core tasks Google Cloud customers do every day: build, run, and operate amazing applications.
Let’s take a deeper look.
Build
Google Cloud’s generative AI experience for developers starts with Gemini Code Assist. Google Cloud VP and GM Brad Calder showed the audience how support for Gemini 1.5 in Code Assist enables a 1M token context window — the largest in the industry.
Then, Google Cloud Developer Advocate Jason Davenport showed how Gemini Cloud Assist makes it easier to design, operate, troubleshoot, and optimize your application by using context from your specific cloud environment and resources, be they error logs, load balancer configurations, firewall rules — you name it.
Finally, with Gemini embedded across Google Cloud applications like BigQuery and Looker, support in Google Cloud databases for vector search and embeddings, plus integrations into developer tools like Cloud Workstations and web user-interface libraries like React, developers got a taste of what AI brings to the table. Now you can take multi-modal inputs (i.e., both text and images) and use them to create recommendations, predictions, and syntheses, all in a fraction of the time it took before. Google Cloud Product Manager Femi Akinde and Chloe showed us how to go from a great idea to an immersive, inspirational AI app in just a few minutes.
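Under the hood, recommendation features like these rest on a simple idea: represent each catalog item and each query as an embedding vector, then rank items by similarity. A minimal sketch, assuming toy hand-written vectors in place of real model embeddings:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def recommend(query_vec: list[float], catalog: dict, top_k: int = 2) -> list[str]:
    """Rank catalog items by similarity to the query embedding."""
    scored = sorted(catalog.items(),
                    key=lambda kv: cosine_similarity(query_vec, kv[1]),
                    reverse=True)
    return [name for name, _ in scored[:top_k]]

# Toy 3-dimensional "embeddings" standing in for real model output.
catalog = {
    "hiking boots": [0.9, 0.1, 0.0],
    "trail map": [0.8, 0.2, 0.1],
    "espresso machine": [0.0, 0.1, 0.9],
}
print(recommend([0.85, 0.15, 0.05], catalog))  # hiking-related items rank first
```

A production system would swap the hand-written vectors for embeddings from a model and the brute-force sort for an index such as vector search in a database.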
New things that make this possible:
App Hub – Announced today, and with a deep integration into Gemini Cloud Assist, App Hub provides an accurate, up-to-date representation of deployed applications and their resource dependencies, regardless of the specific Google Cloud products they use.
BigQuery continuous queries – In preview, BigQuery can now provide continuous SQL processing over data streams, enabling real-time pipelines with AI operators or reverse ETL.
Natural language support in AlloyDB – With support for Google’s state-of-the-art ScaNN algorithm, AlloyDB users get the enhanced vector performance that powers some of Google’s most popular services.
Gemini Code Assist in Apigee API management – Use Gemini to help you build enterprise-grade APIs and integrations using natural language prompts.
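The continuous-query idea above can be pictured outside of BigQuery as a standing transform applied to each event the moment it arrives, with matching rows pushed onward (the reverse-ETL half). A minimal Python stand-in, with hypothetical event fields and no real BigQuery API:

```python
from collections.abc import Iterable, Iterator

def continuous_query(events: Iterable[dict], threshold: float) -> Iterator[dict]:
    """Standing transform over a stream: filter and enrich each event
    as it arrives, analogous to a continuous SQL SELECT ... WHERE."""
    for event in events:
        if event["amount"] >= threshold:      # WHERE amount >= threshold
            yield {**event, "flagged": True}  # SELECT *, TRUE AS flagged

# Hypothetical event stream; in BigQuery this would be a streaming table.
stream = [
    {"id": 1, "amount": 12.0},
    {"id": 2, "amount": 250.0},
    {"id": 3, "amount": 90.0},
]
for row in continuous_query(stream, threshold=100.0):
    print(row)  # each emitted row could be written to an external system
```

The point of the managed feature is that this loop never ends and never needs a scheduler: the query stays resident and processes rows as the table receives them.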
Run
Building a generative AI app is one thing, but how do you make it production-grade? “That’s the question of the moment,” Google Cloud Developer Advocate Kaslin Fields told the audience.
Thankfully, Google Cloud platforms like Cloud Run make it ridiculously fast to stand up and scale an application, while platforms like Google Kubernetes Engine (GKE) provide a robust feature set to power the most demanding or unique AI applications.
New things that make this possible:
Cloud Run application canvas – Generate, modify and deploy AI applications in Cloud Run, with integrations to Vertex AI so you can consume generative APIs from Cloud Run services in just a few clicks.
Gen AI Quick Start Solutions for GKE – Run AI on GKE with a Retrieval Augmented Generation (RAG) pattern, or integrated with Ray.
Support for Gemma on GKE – GKE offers many paths for running Gemma, Google’s open model based on Gemini. Better yet, the performance is excellent.
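The Retrieval Augmented Generation pattern named in the quick start above boils down to two steps: retrieve the documents most relevant to a question, then pack them into the prompt sent to the model. A minimal sketch using a toy word-overlap retriever; the corpus and scoring here are illustrative, not the quick start's actual components:

```python
def retrieve(question: str, corpus: list[str], top_k: int = 2) -> list[str]:
    """Score each document by word overlap with the question (toy retriever;
    a real RAG stack would use embeddings and a vector index)."""
    q_words = set(question.lower().split())
    scored = sorted(corpus,
                    key=lambda doc: len(q_words & set(doc.lower().split())),
                    reverse=True)
    return scored[:top_k]

def build_prompt(question: str, corpus: list[str]) -> str:
    """Assemble a grounded prompt: retrieved context first, question last."""
    context = "\n".join(retrieve(question, corpus))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

corpus = [
    "GKE clusters can autoscale GPU node pools.",
    "Cloud Run scales container instances to zero.",
    "Gemma is an open model from Google.",
]
print(build_prompt("How does GKE autoscale GPU node pools?", corpus))
```

The assembled prompt then goes to whatever model is serving on the cluster; grounding the model in retrieved context is what keeps answers tied to your own data.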
Operate
“AI apps can produce emergent behaviors, resulting in novel issues,” said Steve McGhee, Google Cloud Reliability Advocate during the developer keynote.
Indeed, “our systems used to fail in fairly predictable ways,” said another presenter, Charity Majors, cofounder and CTO at Honeycomb.io. But now, “our systems are dynamic and chaotic, our architectures are far-flung and diverse and constantly changing.”
But what generative AI taketh away — the predictability of the same old, same old — it also giveth back in the form of new tools to help you understand and deal with change.
New things that make this possible:
Vertex AI MLOps capabilities – In preview, Vertex AI Prompt Management lets customers experiment with, migrate, and track prompts and parameters, so they can compare prompt iterations and assess how small changes impact outputs. Meanwhile, Vertex AI Rapid Evaluation helps users evaluate model performance when iterating on the best prompt design.
Shadow API detection – In preview in Advanced API Security, shadow API detection helps you find APIs that don’t have proper oversight or governance and so could be the source of damaging security incidents.
Confidential Accelerators for AI workloads – Confidential VMs on the A3 machine series with NVIDIA H100 Tensor Core GPUs extend hardware-based data and model protection from the CPU to the GPUs that handle sensitive AI and machine learning data.
GKE container and model preloading – In preview, GKE can now accelerate workload cold-start to improve GPU utilization, save money, and keep AI inference latency low.
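The compare-prompt-iterations workflow that Prompt Management and Rapid Evaluation support can be sketched in miniature: run each prompt variant through the model, score the outputs against reference answers, and keep the winner. Here the model is a stub and the metric is simple token overlap; neither is Vertex AI's actual API:

```python
def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 score, a stand-in for a real evaluation metric."""
    pred, ref = prediction.lower().split(), reference.lower().split()
    common = len(set(pred) & set(ref))
    if common == 0:
        return 0.0
    precision, recall = common / len(pred), common / len(ref)
    return 2 * precision * recall / (precision + recall)

def evaluate_prompts(prompts: dict, cases: list, model) -> str:
    """Score each prompt template across the test cases; return the best."""
    scores = {}
    for name, template in prompts.items():
        outputs = [model(template.format(question=q)) for q, _ in cases]
        scores[name] = sum(token_f1(out, ref)
                           for out, (_, ref) in zip(outputs, cases)) / len(cases)
    return max(scores, key=scores.get)

# Stub model: echoes the prompt's first line. A real run would call an
# LLM endpoint and the scores would reflect actual model behavior.
def stub_model(prompt: str) -> str:
    return prompt.splitlines()[0]

prompts = {
    "terse": "{question}",
    "wordy": "Please think step by step.\n{question}",
}
cases = [("What is Cloud Run?", "What is Cloud Run?")]
print(evaluate_prompts(prompts, cases, stub_model))  # prints "terse"
```

Tracking these per-variant scores over time is exactly the bookkeeping the managed features take off your hands, so small prompt changes can be compared side by side instead of eyeballed.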
Then, it was off to another day of spotlights, breakout sessions (295 on Day 2 alone), and trainings before the party tonight, where attendees will be entertained by Kings of Leon and Anderson .Paak. Tomorrow, Day 3, is also jam-packed, with sessions running all day, including reruns of many of the sessions that you may have missed earlier in the week — be sure to add them to your agenda. And don’t forget to bring your badge to the party tonight!