GitHub topics: efficient-attention
thu-ml/SageAttention
Quantized attention that achieves a 2-5x speedup over FlashAttention and 3-11x over xformers, without losing end-to-end metrics across language, image, and video models.
Language: Cuda - Size: 46.1 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1,690 - Forks: 120
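
For orientation, here is a minimal toy sketch of the general idea behind quantized attention: symmetrically quantize Q and K to INT8 per token, run the score matmul in integer arithmetic, and dequantize before the softmax. This illustrates the concept only; it is not SageAttention's CUDA kernels or its actual smoothing/quantization recipe.

```python
# Toy sketch of quantized attention: per-token symmetric INT8 quantization
# of Q and K before the score matmul. Illustration only, NOT SageAttention.
import torch
import torch.nn.functional as F

def int8_quantize(x):
    # Per-token symmetric quantization: scale so max |value| maps to 127.
    scale = x.abs().amax(dim=-1, keepdim=True).clamp(min=1e-8) / 127.0
    q = torch.round(x / scale).clamp(-127, 127).to(torch.int8)
    return q, scale

def quantized_attention(q, k, v):
    q_i8, q_scale = int8_quantize(q)
    k_i8, k_scale = int8_quantize(k)
    # Integer matmul (emulated in int32), then dequantize with the scales.
    scores = (q_i8.int() @ k_i8.int().transpose(-2, -1)).float()
    scores = scores * q_scale * k_scale.transpose(-2, -1)
    attn = F.softmax(scores / q.shape[-1] ** 0.5, dim=-1)
    return attn @ v  # P @ V kept in floating point in this sketch

q = torch.randn(2, 8, 128, 64)
k = torch.randn(2, 8, 128, 64)
v = torch.randn(2, 8, 128, 64)
out = quantized_attention(q, k, v)
ref = F.scaled_dot_product_attention(q, k, v)
print((out - ref).abs().max())  # quantization error stays small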

Ascend-Research/CascadedGaze
The official PyTorch implementation of CascadedGaze: Efficiency in Global Context Extraction for Image Restoration (TMLR 2024).
Language: Python - Size: 478 KB - Last synced at: 19 days ago - Pushed at: 4 months ago - Stars: 72 - Forks: 3

lucidrains/ring-attention-pytorch
Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in PyTorch
Language: Python - Size: 1.01 MB - Last synced at: 24 days ago - Pushed at: about 1 month ago - Stars: 513 - Forks: 31
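
A single-process sketch of the idea, assuming the standard blockwise online-softmax accumulation: key/value blocks conceptually rotate around a ring of devices while each query block folds them in one at a time. The loop below simulates the communication pattern; it is not lucidrains' actual distributed implementation.

```python
# Single-process sketch of ring attention: KV blocks "rotate" past each
# query block, accumulated with a numerically stable online softmax.
import torch

def ring_attention(q, k, v, num_devices=4):
    scale = q.shape[-1] ** -0.5
    q_blocks = q.chunk(num_devices, dim=-2)
    k_blocks = list(k.chunk(num_devices, dim=-2))
    v_blocks = list(v.chunk(num_devices, dim=-2))
    outs = []
    for qi in q_blocks:
        # Running max, normalizer, and weighted sum for the online softmax.
        m = torch.full(qi.shape[:-1], float('-inf'))
        l = torch.zeros(qi.shape[:-1])
        acc = torch.zeros_like(qi)
        for kj, vj in zip(k_blocks, v_blocks):  # one ring rotation step each
            s = qi @ kj.transpose(-2, -1) * scale
            m_new = torch.maximum(m, s.amax(dim=-1))
            alpha = torch.exp(m - m_new)            # rescale old statistics
            p = torch.exp(s - m_new.unsqueeze(-1))
            l = l * alpha + p.sum(dim=-1)
            acc = acc * alpha.unsqueeze(-1) + p @ vj
            m = m_new
        outs.append(acc / l.unsqueeze(-1))
    return torch.cat(outs, dim=-2)

q = k = v = torch.randn(1, 2, 64, 32)
ref = torch.nn.functional.scaled_dot_product_attention(q, k, v)
print((ring_attention(q, k, v) - ref).abs().max())  # matches full attention
```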

lucidrains/CoLT5-attention
Implementation of the conditionally routed attention in the CoLT5 architecture, in PyTorch
Language: Python - Size: 187 KB - Last synced at: 23 days ago - Pushed at: 9 months ago - Stars: 227 - Forks: 13
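
A hedged sketch of the routing idea, shown for the feedforward case: every token takes a light branch, and a learned router sends only the top-k tokens through a heavy branch. Module and parameter names here are hypothetical, not this repo's API.

```python
# Minimal sketch of conditional routing à la CoLT5: all tokens get a light
# branch; a router picks top-k tokens that also get a heavy branch.
import torch
import torch.nn as nn

class ConditionalFeedForward(nn.Module):  # hypothetical module name
    def __init__(self, dim, k, light_mult=1, heavy_mult=4):
        super().__init__()
        self.k = k
        self.router = nn.Linear(dim, 1, bias=False)
        self.light = nn.Sequential(nn.Linear(dim, dim * light_mult), nn.GELU(),
                                   nn.Linear(dim * light_mult, dim))
        self.heavy = nn.Sequential(nn.Linear(dim, dim * heavy_mult), nn.GELU(),
                                   nn.Linear(dim * heavy_mult, dim))

    def forward(self, x):                        # x: (batch, seq, dim)
        out = self.light(x)                      # light path for all tokens
        scores = self.router(x).squeeze(-1)      # (batch, seq) routing scores
        weights, idx = scores.topk(self.k, dim=-1)
        picked = x.gather(1, idx.unsqueeze(-1).expand(-1, -1, x.shape[-1]))
        # Gate the heavy output by the (sigmoid) router weight so routing
        # stays differentiable, then scatter it back to the chosen positions.
        heavy_out = self.heavy(picked) * weights.sigmoid().unsqueeze(-1)
        return out.scatter_add(1, idx.unsqueeze(-1).expand_as(heavy_out), heavy_out)

x = torch.randn(2, 128, 64)
print(ConditionalFeedForward(64, k=16)(x).shape)  # torch.Size([2, 128, 64])
```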

gmlwns2000/sea-attention
Official Implementation of SEA: Sparse Linear Attention with Estimated Attention Mask (ICLR 2024)
Language: Python - Size: 372 MB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 7 - Forks: 1
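
A generic sketch of the estimate-then-attend recipe this line of work builds on: cheap low-rank scores pick a top-k mask per query, and the exact softmax is restricted to that mask. SEA's learned estimator and sparse kernels differ; this dense-masked version (with a random projection standing in for the estimator) is for illustration only.

```python
# Generic "estimate a sparse mask, then attend" sketch. A real sparse kernel
# would skip the dense score matmul entirely; here we mask it for clarity.
import torch
import torch.nn.functional as F

def masked_sparse_attention(q, k, v, rank=8, topk=16):
    d = q.shape[-1]
    proj = torch.randn(d, rank) / rank ** 0.5           # stand-in estimator
    approx = (q @ proj) @ (k @ proj).transpose(-2, -1)  # low-rank score estimate
    idx = approx.topk(topk, dim=-1).indices             # estimated mask per query
    mask = torch.full_like(approx, float('-inf'))
    mask.scatter_(-1, idx, 0.0)                         # keep only masked-in entries
    scores = q @ k.transpose(-2, -1) / d ** 0.5 + mask
    return F.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(1, 4, 64, 32)
print(masked_sparse_attention(q, k, v).shape)  # torch.Size([1, 4, 64, 32])
```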

robflynnyh/hydra-linear-attention
Implementation of Hydra Attention: Efficient Attention with Many Heads (https://arxiv.org/abs/2209.07484)
Language: Python - Size: 8.79 KB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 13 - Forks: 0
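
Hydra attention reduces to a compact rule when the number of heads equals the feature dimension: with L2-normalized (cosine-kernel) features, all tokens share one global key-value mixture, giving O(N·d) cost. A minimal sketch of that rule:

```python
# Minimal sketch of Hydra attention (heads == feature dim): one global
# mixture vector shared by all tokens, each token gating it elementwise.
import torch

def hydra_attention(q, k, v):
    q = torch.nn.functional.normalize(q, dim=-1)  # phi: L2-normalize tokens
    k = torch.nn.functional.normalize(k, dim=-1)
    kv = (k * v).sum(dim=-2, keepdim=True)        # global mix: sum_t k_t * v_t
    return q * kv                                 # each token gates the mix

x = torch.randn(2, 196, 768)                      # (batch, tokens, dim)
print(hydra_attention(x, x, x).shape)             # torch.Size([2, 196, 768])
```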

HolmesShuan/Compact-Global-Descriptor
PyTorch implementation of "Compact Global Descriptor for Neural Networks" (CGD).
Language: Python - Size: 2.55 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 25 - Forks: 7

pszemraj/samba-pytorch
Minimal implementation of Samba by Microsoft in PyTorch
Language: Python - Size: 34.1 MB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

MAGICS-LAB/NonparametricHopfield
Nonparametric Modern Hopfield Models
Language: Jupyter Notebook - Size: 167 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0
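
As background (not this repo's nonparametric construction), the standard modern Hopfield retrieval update of Ramsauer et al. is the attention-like rule xi <- X^T softmax(beta * X xi), which these models generalize:

```python
# Background sketch: modern Hopfield retrieval (Ramsauer et al., 2020).
# One update step is a softmax attention read over the stored patterns.
import torch

def hopfield_retrieve(patterns, query, beta=8.0, steps=3):
    # patterns: (num_patterns, dim) stored memories; query: (dim,) noisy probe
    xi = query
    for _ in range(steps):
        xi = patterns.t() @ torch.softmax(beta * (patterns @ xi), dim=0)
    return xi

patterns = torch.randn(16, 64)
noisy = patterns[0] + 0.3 * torch.randn(64)
retrieved = hopfield_retrieve(patterns, noisy)
print(torch.cosine_similarity(retrieved, patterns[0], dim=0))  # ≈ 1.0
```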

davidsvy/cosformer-pytorch
Unofficial PyTorch implementation of the paper "cosFormer: Rethinking Softmax in Attention".
Language: Jupyter Notebook - Size: 243 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 40 - Forks: 7
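
An unofficial sketch of the paper's non-causal case, assuming ReLU feature maps and the cos(pi/2 * (i-j)/M) re-weighting: the cosine decomposes into cos/sin parts, so everything stays linear in sequence length.

```python
# Sketch of cosFormer non-causal linear attention: ReLU features plus a
# cos-based re-weighting, split via cos(a-b) = cos a cos b + sin a sin b.
import torch
import torch.nn.functional as F

def cosformer_attention(q, k, v):
    n = q.shape[-2]
    theta = (torch.pi / 2) * torch.arange(n, dtype=q.dtype) / n
    q, k = F.relu(q), F.relu(k)                   # non-negative feature map
    qc, qs = q * theta.cos()[:, None], q * theta.sin()[:, None]
    kc, ks = k * theta.cos()[:, None], k * theta.sin()[:, None]
    # Attend in O(N*d^2): (K^T V) first, then multiply by the query features.
    num = qc @ (kc.transpose(-2, -1) @ v) + qs @ (ks.transpose(-2, -1) @ v)
    den = qc @ kc.sum(dim=-2)[..., None] + qs @ ks.sum(dim=-2)[..., None]
    return num / den.clamp(min=1e-6)

q = k = v = torch.randn(2, 128, 64)               # (batch, tokens, dim)
print(cosformer_attention(q, k, v).shape)         # torch.Size([2, 128, 64])
```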
