Topic: "flash-attention-3"
xlite-dev/Awesome-LLM-Inference
📚 A curated list of Awesome LLM/VLM Inference Papers with code: WINT8/4, FlashAttention, PagedAttention, MLA, Parallelism, etc.
Language: Python - Size: 115 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 3,943 - Forks: 274

gietema/attention
Toy FlashAttention implementation in PyTorch
Language: Python - Size: 21.5 KB - Last synced at: 5 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0
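
For context on what a toy implementation like this covers, below is a minimal sketch (not taken from gietema/attention) of the core FlashAttention idea: attention computed over key/value tiles with an online softmax, so the full score matrix is never materialized. The function name `tiled_attention` and the block size are illustrative assumptions.

```python
# Minimal sketch of tiled attention with an online softmax, in plain PyTorch.
# Illustrative only; not the code of any repository listed above.
import torch

def tiled_attention(q, k, v, block_size=64):
    """Compute softmax(q @ k^T / sqrt(d)) @ v one key/value block at a time,
    keeping a running max and normalizer instead of the full score matrix."""
    seq_len, d = q.shape
    scale = d ** -0.5
    out = torch.zeros_like(q)
    row_max = torch.full((seq_len, 1), float("-inf"))
    row_sum = torch.zeros(seq_len, 1)

    for start in range(0, seq_len, block_size):
        k_blk = k[start:start + block_size]           # (B, d)
        v_blk = v[start:start + block_size]           # (B, d)
        scores = (q @ k_blk.T) * scale                # (seq_len, B)

        new_max = torch.maximum(row_max, scores.max(dim=-1, keepdim=True).values)
        correction = torch.exp(row_max - new_max)     # rescale earlier partial results
        p = torch.exp(scores - new_max)               # (seq_len, B)

        row_sum = row_sum * correction + p.sum(dim=-1, keepdim=True)
        out = out * correction + p @ v_blk
        row_max = new_max

    return out / row_sum

# Quick check against the naive full-matrix attention.
q, k, v = (torch.randn(256, 32) for _ in range(3))
ref = torch.softmax(q @ k.T / 32 ** 0.5, dim=-1) @ v
assert torch.allclose(tiled_attention(q, k, v), ref, atol=1e-5)
```

The real FlashAttention kernels add tiling over queries, causal masking, and fused CUDA execution; the sketch only shows the online-softmax accumulation that makes block-wise processing exact.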
