An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: flash-attention-3

xlite-dev/Awesome-LLM-Inference

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Language: Python - Size: 115 MB - Last synced at: 7 days ago - Pushed at: 15 days ago - Stars: 4,260 - Forks: 294

gietema/attention

Toy Flash Attention implementation in torch

Language: Python - Size: 21.5 KB - Last synced at: 8 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0
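The core idea behind toy Flash-Attention implementations like the one above is tiling with an online softmax: instead of materializing the full N×N score matrix, K/V are processed in blocks while a running row-max, running normalizer, and output accumulator are rescaled on the fly. A minimal NumPy sketch of that technique (an illustration, not the repo's actual torch code; all names here are my own):

```python
import numpy as np

def attention_reference(Q, K, V):
    """Standard softmax attention: materializes the full N x N score matrix."""
    S = Q @ K.T / np.sqrt(Q.shape[-1])
    P = np.exp(S - S.max(axis=-1, keepdims=True))
    P /= P.sum(axis=-1, keepdims=True)
    return P @ V

def flash_attention(Q, K, V, block=4):
    """Tiled attention with an online softmax (the Flash-Attention trick).

    Processes K/V in blocks of `block` rows, keeping only a running
    row-max m, a running softmax denominator l, and an output
    accumulator O, so the N x N score matrix is never stored.
    """
    N, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    O = np.zeros((N, d))
    m = np.full(N, -np.inf)   # running row max of the scores seen so far
    l = np.zeros(N)           # running softmax denominator
    for j in range(0, K.shape[0], block):
        Kj, Vj = K[j:j + block], V[j:j + block]
        S = (Q @ Kj.T) * scale                  # N x block partial scores
        m_new = np.maximum(m, S.max(axis=-1))
        alpha = np.exp(m - m_new)               # rescale factor for old stats
        P = np.exp(S - m_new[:, None])          # block-local exponentials
        l = l * alpha + P.sum(axis=-1)
        O = O * alpha[:, None] + P @ Vj
        m = m_new
    return O / l[:, None]
```

The rescaling by `alpha` is what makes the blockwise pass exact rather than approximate: whenever a new block raises the running max, previously accumulated exponentials are multiplied down by `exp(m_old - m_new)`, so the final `O / l` equals the full softmax-weighted sum.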