GitHub / lucidrains / native-sparse-attention-pytorch
Implementation of the sparse attention pattern proposed by the DeepSeek team in their "Native Sparse Attention" paper
Stars: 553
Forks: 23
Open issues: 6
License: MIT
Language: Python
Size: 34.4 MB
Dependencies parsed at: Pending
Created at: 2 months ago
Updated at: about 1 month ago
Pushed at: about 1 month ago
Last synced at: about 1 month ago
Topics: artificial-intelligence, attention, deep-learning, sparse-attention