GitHub topics: cublaslt
Bruce-Lee-LY/cuda_hook
Hooks CUDA-related dynamic libraries using automated code generation tools.
Language: C - Size: 717 KB - Last synced at: 18 days ago - Pushed at: over 1 year ago - Stars: 165 - Forks: 45

Bruce-Lee-LY/cutlass_gemm
Multiple GEMM operators are constructed with cutlass to support LLM inference.
Language: C++ - Size: 2.81 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 18 - Forks: 2

zhaocc1106/cuxx-programing
cuda, cublas, cublaslt, cusparse...
Language: Cuda - Size: 82 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

nghiapq77/face-recognition-cpp-tensorrt
Face Recognition with RetinaFace and ArcFace.
Language: C++ - Size: 490 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 59 - Forks: 16

vadimkantorov/fastmlp
[WIP] PyTorch bindings for cublasLt with an example of quantized i8f16 MLP
Size: 1000 Bytes - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0
