AWS – Announcing latency-optimized inference for Amazon Nova Pro foundation model in Amazon Bedrock
The Amazon Nova Pro foundation model now supports latency-optimized inference in preview on Amazon Bedrock. Latency-optimized inference shortens response times for latency-sensitive applications, improving the end-user experience and giving developers more flexibility to tune performance for their use case. Accessing this capability requires no additional setup or model fine-tuning, so existing applications can benefit from faster responses immediately.
Latency-optimized inference for Amazon Nova Pro is available via cross-region inference in the US West (Oregon), US East (N. Virginia), and US East (Ohio) regions. Learn more about Amazon Nova foundation models at the AWS News Blog, the Amazon Nova product page, or the Amazon Nova user guide. Learn more about latency-optimized inference on Bedrock in the documentation. You can get started with Amazon Nova foundation models in Amazon Bedrock from the Amazon Bedrock console.
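As a rough sketch of what opting in looks like from code: the Bedrock Converse API accepts a performance configuration that requests the latency-optimized tier, with no other changes to the request. The model ID below is the cross-region inference profile ID assumed for Nova Pro; verify it against your account's available inference profiles before use.

```python
# Sketch: requesting latency-optimized inference for Amazon Nova Pro
# via the Bedrock Converse API. The model ID is an assumed cross-region
# inference profile identifier; confirm it in your Bedrock console.

MODEL_ID = "us.amazon.nova-pro-v1:0"  # assumed inference profile ID

def build_converse_request(prompt: str) -> dict:
    """Build Converse API keyword arguments that opt in to the
    latency-optimized inference tier."""
    return {
        "modelId": MODEL_ID,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": 256},
        # The only change vs. a standard request: ask for the
        # latency-optimized tier (no fine-tuning or extra setup).
        "performanceConfig": {"latency": "optimized"},
    }

# Example invocation (requires boto3 and AWS credentials with Bedrock access):
#   import boto3
#   client = boto3.client("bedrock-runtime", region_name="us-west-2")
#   resp = client.converse(**build_converse_request("Hello, Nova Pro"))
#   print(resp["output"]["message"]["content"][0]["text"])
```

Because the opt-in is a single request field, switching between standard and latency-optimized inference is a per-call decision rather than a deployment change.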