An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: tensor-cores

8e8bdba457c18cf692a95fe2ec67000b/VulkanCooperativeMatrixAttention

Vulkan & GLSL implementation of FlashAttention-2

Size: 1.95 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

xlite-dev/ffpa-attn

📚FFPA (Split-D): extends FlashAttention with Split-D for large headdim, with O(1) GPU SRAM complexity; 1.8x to 3x faster than SDPA EA 🎉.

Language: Cuda - Size: 4.21 MB - Last synced at: 16 days ago - Pushed at: about 2 months ago - Stars: 185 - Forks: 8

xlite-dev/HGEMM

⚡️Write HGEMM from scratch using Tensor Cores with the WMMA, MMA, and CuTe APIs to achieve peak⚡️ performance.

Language: Cuda - Size: 2.84 MB - Last synced at: 16 days ago - Pushed at: about 2 months ago - Stars: 79 - Forks: 4

aye-shadow/neural-network-acceleration

Language: Cuda - Size: 23.7 MB - Last synced at: 24 days ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

NeuralAditya/Neural_Network_C

Neural Network C is an advanced neural network implementation in pure C, optimized for high performance on CPUs and NVIDIA GPUs.

Language: C - Size: 64.5 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

tgautam03/tGeMM

General Matrix Multiplication using NVIDIA Tensor Cores

Language: Cuda - Size: 47.9 KB - Last synced at: 2 months ago - Pushed at: 5 months ago - Stars: 13 - Forks: 3

etasnadi/VulkanCooperativeMatrixAttention

Vulkan & GLSL implementation of FlashAttention-2

Language: C++ - Size: 39.1 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

NeilPandya/DeepLearningExamples (fork of NVIDIA/DeepLearningExamples)

My personal fork of NVIDIA's Deep Learning Examples.

Language: Jupyter Notebook - Size: 95.8 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

LDRyan0/Correlator-Bench

A benchmarking framework for correlators of FX telescope arrays

Language: Cuda - Size: 96.7 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0