Topic: "smooth-quantization"
aahouzi/llama2-chatbot-cpu
A LLaMA2-7B chatbot with conversational memory that runs on CPU, optimized using smooth quantization (SmoothQuant), 4-bit quantization, or Intel® Extension for PyTorch with bfloat16.
Language: Python - Size: 30.3 MB - Last synced at: 9 days ago - Pushed at: about 1 year ago - Stars: 13 - Forks: 0
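The repository's description lists smooth quantization (SmoothQuant) among its CPU optimizations. The core idea of SmoothQuant is to migrate activation outliers into the weights via a per-input-channel scale, so activations become easier to quantize while the layer output stays unchanged. Below is a minimal, self-contained PyTorch sketch of that scaling transform, not the repository's actual code; the helper name `smooth_scales`, the toy shapes, and the calibration statistic are illustrative assumptions.

```python
import torch

def smooth_scales(act_max_abs, weight, alpha=0.5, eps=1e-8):
    """Per-input-channel SmoothQuant scales (illustrative helper).

    act_max_abs : [in_features] calibration max |activation| per channel
    weight      : [out_features, in_features] linear weight (HF layout)
    alpha       : migration strength; 0.5 is the commonly used default
    """
    w_max_abs = weight.abs().amax(dim=0)  # per input channel
    return ((act_max_abs.clamp(min=eps) ** alpha)
            / (w_max_abs.clamp(min=eps) ** (1 - alpha))).clamp(min=eps)

# Toy example: channel 0 carries activation outliers.
torch.manual_seed(0)
x = torch.randn(4, 8) * torch.tensor([10.0, 1, 1, 1, 1, 1, 1, 1])
linear = torch.nn.Linear(8, 16, bias=False)

s = smooth_scales(x.abs().amax(dim=0), linear.weight, alpha=0.5)

x_smoothed = x / s              # activations flattened, easier to quantize
w_smoothed = linear.weight * s  # scale folded into weights per input channel

# The smoothed pair is numerically equivalent to the original layer.
y_ref = x @ linear.weight.t()
y_smooth = x_smoothed @ w_smoothed.t()
print(torch.allclose(y_ref, y_smooth, atol=1e-5))  # True
```

In practice the scales would be computed from calibration data and folded into the model once before post-training quantization; the repository presumably relies on Intel's tooling (Intel® Extension for PyTorch / Intel® Neural Compressor) rather than a hand-rolled transform like this one.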
