Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: gemv

DefTruth/CUDA-Learn-Notes

🎉CUDA notes / hand-written CUDA kernels for large models / C++ notes, updated irregularly: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.

Language: Cuda - Size: 2.95 MB - Last synced: 6 days ago - Pushed: 6 days ago - Stars: 581 - Forks: 58

Bruce-Lee-LY/cuda_hgemv

Several optimization methods for half-precision general matrix-vector multiplication (HGEMV) using CUDA cores.

Language: Cuda - Size: 458 KB - Last synced: 3 months ago - Pushed: 6 months ago - Stars: 14 - Forks: 3

nsomatilda/Matilda

Matilda is a library for repeatedly multiplying a constant matrix by a variable vector.

Language: C++ - Size: 38.1 KB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0

yzhaiustc/Optimizing-SGEMV-on-NVIDIA-GPUs

An implementation of SGEMV with performance comparable to cuBLAS.

Language: Cuda - Size: 43 KB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 4 - Forks: 4

yzhaiustc/Optimizing-DGEMV-on-Intel-CPUs

Highly optimized DGEMV on CPU with both serial and parallel performance better than MKL and OpenBLAS.

Language: C - Size: 23.4 KB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 1 - Forks: 1