Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub topics: gemv
DefTruth/CUDA-Learn-Notes
🎉 CUDA notes / hand-written CUDA kernels for large models / C++ notes, updated occasionally: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.
Language: Cuda - Size: 2.95 MB - Last synced: 6 days ago - Pushed: 6 days ago - Stars: 581 - Forks: 58
Bruce-Lee-LY/cuda_hgemv
Several optimization methods for half-precision general matrix-vector multiplication (HGEMV) using CUDA cores.
Language: Cuda - Size: 458 KB - Last synced: 3 months ago - Pushed: 6 months ago - Stars: 14 - Forks: 3
nsomatilda/Matilda
Matilda is a library for repeatedly multiplying a constant matrix by a variable vector.
Language: C++ - Size: 38.1 KB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0
yzhaiustc/Optimizing-SGEMV-on-NVIDIA-GPUs
An implementation of SGEMV with performance comparable to cuBLAS.
Language: Cuda - Size: 43 KB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 4 - Forks: 4
yzhaiustc/Optimizing-DGEMV-on-Intel-CPUs
Highly optimized DGEMV on CPU with both serial and parallel performance better than MKL and OpenBLAS.
Language: C - Size: 23.4 KB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 1 - Forks: 1