An open API service providing repository metadata for many open source software ecosystems.

Topic: "flash-attention-3"

xlite-dev/Awesome-LLM-Inference

📚 A curated list of awesome LLM/VLM inference papers with code: WINT8/4, FlashAttention, PagedAttention, MLA, parallelism, etc.

Language: Python - Size: 115 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 3,943 - Forks: 274

gietema/attention

A toy Flash Attention implementation in PyTorch

Language: Python - Size: 21.5 KB - Last synced at: 5 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0
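For context on what a "toy" Flash Attention implementation typically covers, here is a minimal sketch of the tiled online-softmax idea in plain PyTorch. The function name `tiled_attention` and all variable names are hypothetical illustrations of the technique, not code from the gietema/attention repository: attention is computed one key/value block at a time, carrying a running row maximum and normalizer so the full attention matrix is never materialized.

```python
import math
import torch

def tiled_attention(q, k, v, block_size=64):
    """Illustrative sketch: compute softmax(q @ k.T / sqrt(d)) @ v block by
    block over the keys, using the online-softmax rescaling trick."""
    n, d = q.shape
    scale = 1.0 / math.sqrt(d)
    o = torch.zeros_like(q)                     # running (unnormalized) output
    m = torch.full((n, 1), float("-inf"))       # running row max of scores
    l = torch.zeros(n, 1)                       # running softmax denominator
    for start in range(0, k.shape[0], block_size):
        kb = k[start:start + block_size]
        vb = v[start:start + block_size]
        s = (q @ kb.T) * scale                  # scores for this key block
        m_new = torch.maximum(m, s.max(dim=-1, keepdim=True).values)
        p = torch.exp(s - m_new)                # block softmax numerator
        alpha = torch.exp(m - m_new)            # rescale earlier partial sums
        l = alpha * l + p.sum(dim=-1, keepdim=True)
        o = alpha * o + p @ vb
        m = m_new
    return o / l

# Sanity check against the naive formulation.
q, k, v = (torch.randn(128, 32) for _ in range(3))
ref = torch.softmax(q @ k.T / math.sqrt(32), dim=-1) @ v
assert torch.allclose(tiled_attention(q, k, v), ref, atol=1e-5)
```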