AWS – Amazon Bedrock now supports Batch inference for Anthropic Claude Sonnet 4 and OpenAI GPT-OSS models
Anthropic’s Claude Sonnet 4 and OpenAI’s GPT-OSS 120B and 20B models are now available for batch inference in Amazon Bedrock. With batch inference, you can run multiple inference requests asynchronously and process large datasets at 50% of on-demand inference pricing. Amazon Bedrock offers select foundation models (FMs) from leading AI providers such as Anthropic, OpenAI, Meta, and Amazon for batch inference, making it easier and more cost-effective to process high-volume workloads.
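As a minimal sketch, a batch job is submitted by pointing the CreateModelInvocationJob API at a JSONL file of requests in Amazon S3. The bucket names, IAM role ARN, and model identifier below are placeholders; verify the exact model ID for your Region in the Bedrock console.

```python
# Sketch: submit a Bedrock batch inference job with boto3.
# Bucket names, role ARN, and model ID are placeholders.
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

response = bedrock.create_model_invocation_job(
    jobName="claude-sonnet-4-batch-demo",
    roleArn="arn:aws:iam::123456789012:role/BedrockBatchRole",  # hypothetical role
    modelId="anthropic.claude-sonnet-4-20250514-v1:0",          # verify exact ID per Region
    inputDataConfig={
        "s3InputDataConfig": {
            # JSONL file: one {"recordId": ..., "modelInput": {...}} object per line
            "s3Uri": "s3://my-bucket/batch-input/records.jsonl"
        }
    },
    outputDataConfig={
        "s3OutputDataConfig": {"s3Uri": "s3://my-bucket/batch-output/"}
    },
)

# The job runs asynchronously; poll its status and fetch results
# from the output S3 location once it reaches "Completed".
job_arn = response["jobArn"]
status = bedrock.get_model_invocation_job(jobIdentifier=job_arn)["status"]
print(job_arn, status)
```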
With batch inference on Claude Sonnet 4 and the OpenAI GPT-OSS models, you can process large datasets at scale and at lower cost for scenarios such as document and customer feedback analysis, bulk content generation (e.g., marketing copy, product descriptions), large-scale prompt or output evaluations, automated summarization of knowledge bases and archives, mass categorization of support tickets or emails, and extraction of structured data from unstructured text. We’ve optimized our batch offering to deliver higher overall throughput on these newer models compared to previous ones. In addition, you can now track batch workload progress at the AWS account level with Amazon CloudWatch metrics. For all models, these metrics include total pending records, records processed per minute, and tokens processed per minute; for Claude models, they also include tokens pending processing.
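As a rough sketch of reading those progress metrics with boto3: the namespace, metric name, and dimension below are assumptions based on this announcement, not confirmed names; check the CloudWatch metrics documentation for Amazon Bedrock for the exact identifiers.

```python
# Sketch: poll account-level batch inference metrics from CloudWatch.
# Namespace, metric name, and dimension are ASSUMED; verify against the docs.
from datetime import datetime, timedelta, timezone
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

now = datetime.now(timezone.utc)
resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/Bedrock/Batch",                  # assumed namespace
    MetricName="NumberOfRecordsPendingProcessing",  # assumed metric name
    Dimensions=[
        {"Name": "ModelId",                         # assumed dimension
         "Value": "anthropic.claude-sonnet-4-20250514-v1:0"}
    ],
    StartTime=now - timedelta(hours=1),
    EndTime=now,
    Period=300,  # 5-minute datapoints
    Statistics=["Average"],
)

for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Average"])
```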
To learn more about batch inference in Amazon Bedrock, visit the batch inference documentation. See the Supported Regions and models for batch inference page for details on supported models, and follow the Amazon Bedrock API reference to get started with batch inference.