An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: cutlass

leimao/CUTLASS-Examples

CUTLASS and CuTe Examples

Language: Cuda - Size: 429 KB - Last synced at: about 12 hours ago - Pushed at: 4 months ago - Stars: 49 - Forks: 7

coderonion/awesome-cuda-and-hpc

🚀🚀🚀 This repository lists some awesome public CUDA, cuda-python, cuBLAS, cuDNN, CUTLASS, TensorRT, TensorRT-LLM, Triton, TVM, MLIR, PTX and High Performance Computing (HPC) projects.

Size: 55.7 KB - Last synced at: 1 day ago - Pushed at: 12 days ago - Stars: 258 - Forks: 30

sgl-project/whl

Kernel Library Wheel for SGLang

Language: HTML - Size: 26.4 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 9 - Forks: 1

peterlau123/Lolly

Lightweight, production-level open source C++ library

Language: C++ - Size: 102 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

prateekshukla1108/cutlass3

Docs

Language: HTML - Size: 22.5 MB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 0 - Forks: 0

bytedance/flux

A fast communication-overlapping library for tensor/expert parallelism on GPUs.

Language: C++ - Size: 2.6 MB - Last synced at: 29 days ago - Pushed at: about 1 month ago - Stars: 853 - Forks: 55

cjmcv/ai-infra-notes

Reading notes on the open source code of AI infrastructure (SGLang, LLM, CUTLASS, HPC, etc.)

Size: 777 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

Bruce-Lee-LY/cutlass_gemm

Multiple GEMM operators built with CUTLASS to support LLM inference.

Language: C++ - Size: 2.14 MB - Last synced at: 24 days ago - Pushed at: 7 months ago - Stars: 17 - Forks: 2

Bruce-Lee-LY/flash_attention_inference

Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios.

Language: C++ - Size: 1.99 MB - Last synced at: 24 days ago - Pushed at: 2 months ago - Stars: 35 - Forks: 4

DD-DuDa/Cute-Learning

Examples of CUDA implementations using CUTLASS CuTe

Language: Makefile - Size: 21.2 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 132 - Forks: 15

YashasSamaga/ConvolutionBuildingBlocks

GEMM- and Winograd-based convolutions using CUTLASS

Language: Cuda - Size: 218 KB - Last synced at: about 1 month ago - Pushed at: almost 5 years ago - Stars: 26 - Forks: 3

yester31/Cutlass_EX

A study of CUTLASS

Language: Cuda - Size: 66.4 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 19 - Forks: 4

digital-nomad-cheng/tvm_project_course

Language: Python - Size: 3.11 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

Routhleck/blocksparse-pytorch-implement

A block-sparse implementation in PyTorch

Language: C++ - Size: 1.41 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0