GCP – Gemini momentum continues with launch of 2.5 Flash-Lite and general availability of 2.5 Flash and Pro on Vertex AI
The momentum of the Gemini 2.5 era continues to build. Following our recent announcements, we’re empowering enterprise builders and developers with even greater access to the intelligence, and flexibility of our most capable models yet, directly within Vertex AI, our unified platform for enterprise-scale AI development.
The significant updates announced today are designed to help your organization build sophisticated, customized, and efficient AI solutions, more confidently. These include:
-
Gemini 2.5 Flash and 2.5 Pro now generally available: Our most intelligent models for speed and advanced reasoning are production-ready providing organizations with the stability, reliability and scalability needed to confidently deploy the most advanced AI capabilities into mission-critical applications.
-
New Gemini 2.5 Flash-Lite in public preview: Experience the cost-efficient Gemini 2.5 model yet with optimized performance for high-volume tasks.
-
New Supervised Fine-Tuning (SFT) for Gemini 2.5 Flash is generally available: Tailor our high-speed model to your unique enterprise data and needs.
-
New updated Live API with native audio in public preview: Streamline the development of complex, real-time audio AI systems.
Build with confidence using production-ready Gemini 2.5
Gemini 2.5 Flash: Optimized for speed, efficiency, and scale
Gemini 2.5 Flash, is now generally available in Vertex AI, the Gemini API, and Google AI Studio, engineered for high-throughput enterprise tasks such as large-scale summarization, responsive chat applications, and efficient data extraction. These advancements provide a comprehensive toolkit to elevate your enterprise applications and unlock new levels of productivity and innovation. Build with confidence on this production-ready foundation.
“SmartBear uses AI to power Test Hub, its solution for building and executing regression tests for web, desktop, and mobile. With Gemini 2.5 Flash on Vertex AI, we can accelerate tasks like translating extensive manual test scripts into robust automated tests with remarkable speed and cost-effectiveness. The ROI is multifaceted: we’re empowering our customers to realize the benefits of automation execution, while simultaneously producing intent-based, resilient-to-change test plans. This drastically increases testing velocity and enables faster feature delivery—helping our customers move with greater speed and confidence, powered by a more efficient and scalable AI foundation.”
– Fitz Nowlan, PhD, VP of AI, SmartBear
“At Connective Health, our mission is empowering healthcare providers and driving better patient outcomes. Gemini 2.5 Flash on Vertex AI is instrumental in helping us extract vital medical records from complex free-text records. Customer trust is paramount, so our AI initiatives are always developed in close collaboration with healthcare providers, ensuring its use is accurate and impactful. The rapid advancements in Gemini’s capabilities allow us to continually enhance how we deliver these critical insights, and we’re excited to explore further applications to improve the lives of more patients and providers.”
– Joe Athman, CTO, Connective Health
“At Suggestic, we’re advancing the future of personalized nutrition by making nutritional data instantly actionable through our next-generation, image-based inference API. By leveraging Gemini 2.5 Flash as our core model, we’ve consistently achieved exceptional accuracy and processing efficiency, significantly outperforming alternative models on the Nutrition5k dataset. Gemini 2.5 Flash delivered a remarkable 25% improvement across critical benchmarks, including processing speed, enabling us to implement advanced image modification tools that enhance inference accuracy without sacrificing response times. Its native support for structured output and unparalleled capability in handling complex, tool-augmented tasks ensures seamless, real-time experiences, making Gemini 2.5 Flash the optimal choice for robust, production-grade solutions.”
– Shai Rozen, Co-founder, Suggestic
Gemini 2.5 Pro: Unlock state-of-the-art intelligence
Our most capable model, Gemini 2.5 Pro, is also now generally available in Vertex AI, the Gemini API, and Google AI Studio. Designed for your most demanding enterprise AI challenges like making sense of massive datasets for scientific discovery or accelerating migration of critical legacy code, it excels at highly complex reasoning, advanced code generation, and deep multimodal understanding.
“At Snap, we believe today’s devices and user interfaces can constrain the full potential of AI. So, we’re bringing AI into the world through Spectacles, our standalone, see-through, immersive AR glasses, and Gemini on Google Cloud. Through the powerful combination of our Depth Module API and Gemini 2.5 Pro, it’s already possible to translate 2D coordinates of an image into 3D space, enabling information and annotations to be anchored on the real world – even as you move around. We’re excited to unlock a whole new paradigm for spatial intelligence on Spectacles.”
– Terek Judi, Staff Product Manager, Snap Inc.
“At Multimodal, we’re reimagining how business and IT teams in finance and insurance co-create intelligent agentic workflows. By integrating Gemini 2.5 Pro into our AgentFlow platform, we’ve transformed how customers experience Zero Shot AI—enabling them to instantly see how AI agents operate on their own documents, workflows, and use cases, without needing lengthy pilots or custom demos. Gemini 2.5’s large context window and structured reasoning unlock a level of depth and adaptability that’s been impossible before, allowing our agents to understand, reason through, and act across highly specific domain workflows. This fundamentally changes the go-to-market experience: business teams can now visualize and validate impact on day one. For industries where trust, compliance, and precision are paramount, that’s a game-changer.”
– Andrew McKishnie, VP of Engineering, Multimodal
Enhanced customization and efficiency for your needs
Gemini 2.5 Flash-Lite in public preview: Gain cost-efficiency with low latency
Get an early look at Gemini 2.5 Flash-Lite, the most cost-effective Gemini 2.5 model yet, optimized for performance in high-volume workloads. Delivering higher performance than the previous Flash-Lite model, 2.5 Flash-Lite is 1.5 times faster than 2.0 Flash, at a lower cost, on Vertex AI. It’s ideal for tasks like classification, translation, intelligent routing, and other cost-sensitive, high-scale operations.
Supervised Fine-Tuning (SFT) for Gemini 2.5 Flash: Customized AI for your business
Achieve unparalleled customization with the GA release of Supervised Fine-Tuning (SFT) for Gemini 2.5 Flash on Vertex AI. Adapt Gemini to your enterprise’s specific datasets, industry-specific terminology, and unique brand voice, leading to higher accuracy on specialized tasks.
Live API with native audio in public preview: Build real-time interactive services
Streamline the development of sophisticated, real-time AI systems with the Live API, now in public preview with native audio-to-audio capabilities. This enables more natural and responsive voice-driven applications and complex AI agent interactions.
“Newo.ai enables small and medium businesses to deploy fully functional AI receptionists that handle all incoming communication channels—voice and text—in just 3 minutes with one click. We’ve worked through thousands of customer scenarios to enable AI Employee creation using only a Google Maps listing or website. While this appears simple, we deliver sophisticated conversation flows requiring advanced reasoning, low latency, multilingual capabilities, and empathetic responses—features powered by the Live API and Gemini 2.5 Flash on Vertex AI. This combination allows us to deliver production-ready AI employees that generate up to 30x ROI for our clients.”
– David Yang, Co-founder, Newo.ai
Driving your enterprise AI initiatives forward, these comprehensive Vertex AI updates enable you to continue to scale confidently with robust, production-grade models. You can now tailor powerful AI precisely to your unique operational needs and data, optimize for cost-efficiency in high-throughput scenarios, and build next-generation, interconnected AI solutions that push the boundaries of innovation.
“At Citizen Health, we develop AI advocates that empower rare‑disease patients/caretakers to understand and navigate their healthcare journeys. Our data pipelines stream longitudinal EHR data – decades of clinician notes, imaging reports, and genomic panels – directly into Gemini 2.5 Pro’s million‑token context windows, enabling patients and caretakers to receive concise, context‑rich answers in near real-time. We orchestrate Gemini 2.5 Flash and Gemini 2.5 Pro models within a LangGraph‑powered multi‑agent framework, ensuring the most relevant evidence reaches patients and caretakers without hallucinations. Gemini’s long‑context comprehension coupled with rapid inference converts exhaustive document review into a seamless conversation, allowing families to spend less time deciphering records and more time making informed care decisions.”
– Daniel Wang, CTO, Citizen Health
Pricing and availability
The Gemini 2.5 family of models offers a range of options to meet diverse enterprise needs. With Gemini 2.5 Flash moving to general availability, its pricing has been updated to reflect its improved quality and comprehensive capabilities. We are also introducing preview pricing for Gemini 2.5 Flash-Lite, our most cost efficient Gemini 2.5 model yet. For complete details on pricing for Gemini 2.5 Flash, Gemini 2.5 Pro, and the Gemini 2.5 Flash-Lite preview, please visit our pricing page.
Start moving to production today with Gemini 2.5 Flash and Gemini 2.5 Pro, now generally available on Vertex AI.
Read More for the details.