AWS – SageMaker Real-time Inference now supports response streaming
You can now continuously stream inference responses back to the client when using SageMaker real-time inference. This helps you build interactive experiences for generative AI applications such as chatbots, virtual assistants, and music generators.
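As a rough illustration of what consuming a streamed response can look like from the client side, here is a minimal Python sketch built around the boto3 `sagemaker-runtime` call `invoke_endpoint_with_response_stream`. The endpoint name `my-streaming-endpoint`, the JSON payload shape, and the helper names `stream_inference` and `main` are assumptions made for this example and are not part of the announcement. The sketch also assumes the model container behind the endpoint actually emits its output incrementally.

```python
import json
from typing import Any, Dict, Iterator


def stream_inference(client: Any, endpoint_name: str, payload: Dict[str, Any]) -> Iterator[bytes]:
    """Yield raw response chunks from a SageMaker endpoint as they arrive.

    `client` is expected to be a boto3 "sagemaker-runtime" client.
    """
    response = client.invoke_endpoint_with_response_stream(
        EndpointName=endpoint_name,
        Body=json.dumps(payload).encode("utf-8"),
        ContentType="application/json",
    )
    # response["Body"] is an event stream; data chunks arrive as PayloadPart events.
    for event in response["Body"]:
        part = event.get("PayloadPart")
        if part is None:
            # In this sketch, any non-payload event is treated as a stream error.
            raise RuntimeError(f"Stream returned a non-payload event: {event}")
        yield part["Bytes"]


def main() -> None:
    # boto3 is imported here so the helper above can be used without it installed.
    import boto3

    client = boto3.client("sagemaker-runtime")
    # Hypothetical endpoint name and payload; the real schema depends on your model.
    for chunk in stream_inference(client, "my-streaming-endpoint", {"inputs": "Hello"}):
        print(chunk.decode("utf-8", errors="replace"), end="", flush=True)
```

Calling `main()` against a deployed endpoint prints each chunk as soon as it is received instead of waiting for the full response. Note that a chunk boundary does not necessarily align with a token or a JSON object, so a real client may need to buffer before parsing.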