Datadog expands its AI observability capabilities with new integrations across the Google Cloud stack
Datadog and Google Cloud have long provided customers with powerful capabilities that enable performant, scalable, and differentiated applications in the cloud; in the past two years alone, Datadog’s revenue on Google Cloud Marketplace has more than doubled. As these customers bring Google Cloud’s AI capabilities into their technology stacks, they require observability tools that allow them to better troubleshoot errors, optimize usage, and improve product performance.
Today, Datadog is announcing expanded AI monitoring capabilities with Vertex AI Agent Engine monitoring in its new AI Agents Console. This new feature joins a large and growing set of Google Cloud AI monitoring capabilities that allow joint customers to better innovate and optimize product performance across the AI stack.
Full-stack AI observability
With this extensive set of AI observability capabilities, Datadog customers with workloads on Google Cloud have enhanced visibility into all the layers of an AI application.
- Application layer: As businesses adopt autonomous agents to power key workflows, visibility and governance become critical. Datadog’s new AI Agents Console now supports monitoring of agents deployed via Google’s Vertex AI Agent Engine, providing customers with a unified view of the actions, permissions, and business impact of third-party agents — including those orchestrated by Agent Engine.
- Model layer: Datadog LLM Observability allows users to monitor, troubleshoot, improve, and secure their large language model (LLM) applications. Earlier this year, Datadog introduced auto-instrumentation for Gemini models and LLMs in Vertex AI, which allows teams to start monitoring quickly, minimizing setup work and jumping right into troubleshooting efforts.
- Infrastructure layer: In February, Datadog announced a new integration with Cloud TPU, allowing customers to monitor utilization, resource usage, and performance at the container, node, and worker levels. This helps customers rightsize TPU infrastructure and balance training performance with cost.
- Data layer: Many Google Cloud customers use BigQuery for data insights. Datadog’s expanded BigQuery monitoring capabilities, launched at Google Cloud Next, help teams optimize costs by showing BigQuery usage per user and project and by identifying top spenders and slow queries. They also flag failed jobs for immediate action and identify data quality issues.
Optimize monitoring costs
Datadog has regularly invested in reducing the cost of its Google Cloud integrations. Datadog customers can now use Google Cloud’s Active Metrics APIs, so that Datadog only calls Google Cloud APIs when there is new data. This significantly reduces API calls and associated costs without sacrificing visibility. It joins Datadog’s support for Google Cloud’s Private Service Connect, which allows Datadog users running on Google Cloud to reduce data transfer costs.
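To illustrate the general idea of calling an API only when there is new data, here is a minimal Python sketch. It is not Datadog's implementation and does not use any real Google Cloud or Datadog API: the `PollPlanner` class, its methods, and the metric names are all hypothetical, and the "latest write" timestamps are assumed to come from some inexpensive activity lookup.

```python
from dataclasses import dataclass, field


@dataclass
class PollPlanner:
    """Hypothetical helper: remembers the newest data point collected per metric."""

    last_seen: dict[str, int] = field(default_factory=dict)

    def metrics_to_poll(self, latest_write: dict[str, int]) -> list[str]:
        """Return only the metrics whose newest write is later than what was collected."""
        return sorted(
            name
            for name, ts in latest_write.items()
            if ts > self.last_seen.get(name, -1)
        )

    def mark_collected(self, latest_write: dict[str, int], names: list[str]) -> None:
        """Record that the given metrics were collected up to their latest write."""
        for name in names:
            self.last_seen[name] = latest_write[name]


planner = PollPlanner()

# First cycle: nothing has been collected yet, so every metric is polled.
snapshot_1 = {"cpu_utilization": 100, "disk_read_bytes": 100}
first = planner.metrics_to_poll(snapshot_1)
planner.mark_collected(snapshot_1, first)

# Second cycle: only cpu_utilization received new data, so only it is polled.
snapshot_2 = {"cpu_utilization": 160, "disk_read_bytes": 100}
second = planner.metrics_to_poll(snapshot_2)
planner.mark_collected(snapshot_2, second)

print(first)
print(second)
```

In this sketch the second cycle issues one query instead of two, which is the kind of saving the paragraph above describes: idle metrics cost nothing to skip, and no new data is missed.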
Get started today
Datadog’s unified observability and security platform offers a powerful advantage for organizations that want to use Google Cloud’s cutting-edge AI services. By monitoring the full Google Cloud stack across a breadth of telemetry types, Datadog gives Google Cloud customers the tools and insights they need to build more performant, cost-efficient, and scalable applications.
Ready to try it for yourself? Purchase Datadog directly from the Google Cloud Marketplace and start monitoring your environment in minutes. And if you’re in the New York area, you can see some of these new capabilities in action by visiting the Google Cloud booth at DASH, Datadog’s annual conference, on June 10-11.