GitHub topics: large-model-inference
aws-samples/amazon-sagemaker-llama2-response-streaming-recipes
Amazon SageMaker Llama 2 Inference via Response Streaming
Language: Jupyter Notebook - Size: 565 KB - Last synced at: 7 days ago - Pushed at: 11 months ago - Stars: 13 - Forks: 4

windson/inferentia-deployments
Deploy large models on AWS Inferentia (Inf2) instances.
Language: Jupyter Notebook - Size: 28.3 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0
