An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: flashinfer

sgl-project/whl

Kernel Library Wheel for SGLang

Language: HTML - Size: 51.8 KB - Last synced at: 7 days ago - Pushed at: 11 days ago - Stars: 11 - Forks: 2

Bruce-Lee-LY/decoding_attention

Decoding Attention is specially optimized for MHA, MQA, GQA, and MLA, using CUDA cores for the decoding stage of LLM inference.

Language: C++ - Size: 867 KB - Last synced at: 25 days ago - Pushed at: 3 months ago - Stars: 40 - Forks: 4