An open API service providing repository metadata for many open source software ecosystems.

Topic: "self-attention"

datawhalechina/leedl-tutorial

The Hung-yi Lee Deep Learning Tutorial (recommended by Prof. Hung-yi Lee 👍, nicknamed the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases

Language: Jupyter Notebook - Size: 295 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 15,209 - Forks: 3,037

zhouhaoyi/Informer2020

The GitHub repository for the paper "Informer", accepted at AAAI 2021.

Language: Python - Size: 6.34 MB - Last synced at: about 2 months ago - Pushed at: 7 months ago - Stars: 6,310 - Forks: 1,280

cmhungsteve/Awesome-Transformer-Attention

A comprehensive paper list on Vision Transformers and attention, including papers, code, and related websites.

Size: 5.65 MB - Last synced at: 20 days ago - Pushed at: over 1 year ago - Stars: 4,981 - Forks: 496

PetarV-/GAT

Graph Attention Networks (https://arxiv.org/abs/1710.10903)

Language: Python - Size: 4.6 MB - Last synced at: 7 months ago - Pushed at: over 3 years ago - Stars: 3,364 - Forks: 665
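The two GAT entries above implement the attention coefficients from the paper linked here. As a rough orientation, a minimal NumPy sketch of a single dense attention head (all names and shapes are illustrative, not taken from either repo; real implementations use sparse adjacency and multiple heads):

```python
import numpy as np

def gat_attention(h, W, a, adj):
    """Single GAT head (sketch). h: (N, F) node features, W: (F, F')
    projection, a: (2F',) attention vector, adj: (N, N) 0/1 adjacency."""
    z = h @ W                                   # project features: (N, F')
    N = z.shape[0]
    e = np.zeros((N, N))
    for i in range(N):
        for j in range(N):
            # e[i, j] = LeakyReLU(a^T [z_i || z_j]), negative slope 0.2
            s = a @ np.concatenate([z[i], z[j]])
            e[i, j] = s if s > 0 else 0.2 * s
    e = np.where(adj > 0, e, -1e9)              # mask non-edges before softmax
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha /= alpha.sum(axis=1, keepdims=True)   # normalize over each node's neighbors
    return alpha @ z                            # attention-weighted aggregation
```

The softmax is masked so each node only attends over its graph neighborhood, which is the key difference from the sequence self-attention used in Transformers.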

Diego999/pyGAT

Pytorch implementation of the Graph Attention Network model by Veličković et al. (2017, https://arxiv.org/abs/1710.10903)

Language: Python - Size: 207 KB - Last synced at: 7 months ago - Pushed at: over 2 years ago - Stars: 3,035 - Forks: 699

gordicaleksa/pytorch-GAT

My implementation of the original GAT paper (Veličković et al.). I've additionally included the playground.py file for visualizing the Cora dataset, GAT embeddings, an attention mechanism, and entropy histograms. I've supported both Cora (transductive) and PPI (inductive) examples!

Language: Jupyter Notebook - Size: 25.2 MB - Last synced at: 2 months ago - Pushed at: about 3 years ago - Stars: 2,609 - Forks: 347

github/CodeSearchNet 📦

Datasets, tools, and benchmarks for representation learning of code.

Language: Jupyter Notebook - Size: 28.6 MB - Last synced at: 3 months ago - Pushed at: almost 4 years ago - Stars: 2,378 - Forks: 408

microsoft/DeBERTa

The implementation of DeBERTa

Language: Python - Size: 237 KB - Last synced at: 11 days ago - Pushed at: over 2 years ago - Stars: 2,183 - Forks: 241

NVlabs/MambaVision

[CVPR 2025] Official PyTorch Implementation of MambaVision: A Hybrid Mamba-Transformer Vision Backbone

Language: Python - Size: 2.76 MB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 1,758 - Forks: 99

speedinghzl/CCNet

CCNet: Criss-Cross Attention for Semantic Segmentation (TPAMI 2020 & ICCV 2019).

Language: Python - Size: 3.88 MB - Last synced at: 8 months ago - Pushed at: almost 5 years ago - Stars: 1,461 - Forks: 278

DirtyHarryLYL/Transformer-in-Vision

Recent Transformer-based CV and related works.

Size: 1.84 MB - Last synced at: 10 months ago - Pushed at: over 2 years ago - Stars: 1,332 - Forks: 143

The-AI-Summer/self-attention-cv

Implementation of various self-attention mechanisms focused on computer vision. Ongoing repository.

Language: Python - Size: 291 KB - Last synced at: 4 months ago - Pushed at: over 4 years ago - Stars: 1,210 - Forks: 155

Separius/awesome-fast-attention 📦

A list of efficient attention modules

Language: Python - Size: 156 KB - Last synced at: 2 months ago - Pushed at: over 4 years ago - Stars: 1,016 - Forks: 107

brightmart/bert_language_understanding

Pre-training of Deep Bidirectional Transformers for Language Understanding: pre-train TextCNN

Language: Python - Size: 16 MB - Last synced at: 8 months ago - Pushed at: about 7 years ago - Stars: 966 - Forks: 211

NVlabs/FasterViT

[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with Hierarchical Attention

Language: Python - Size: 1.22 MB - Last synced at: about 2 hours ago - Pushed at: 6 months ago - Stars: 899 - Forks: 69

xxxnell/how-do-vits-work

(ICLR 2022 Spotlight) Official PyTorch implementation of "How Do Vision Transformers Work?"

Language: Python - Size: 18.3 MB - Last synced at: 8 months ago - Pushed at: over 3 years ago - Stars: 815 - Forks: 79

prakashpandey9/Text-Classification-Pytorch

Text classification using deep learning models in Pytorch

Language: Python - Size: 31.3 KB - Last synced at: almost 2 years ago - Pushed at: about 7 years ago - Stars: 801 - Forks: 237

kaituoxu/Speech-Transformer

A PyTorch implementation of Speech Transformer, an end-to-end ASR model with a Transformer network, on Mandarin Chinese.

Language: Python - Size: 678 KB - Last synced at: 6 months ago - Pushed at: almost 3 years ago - Stars: 796 - Forks: 197

daiquocnguyen/Graph-Transformer

Universal Graph Transformer Self-Attention Networks (TheWebConf WWW 2022) (Pytorch and Tensorflow)

Language: Python - Size: 109 MB - Last synced at: 4 months ago - Pushed at: over 3 years ago - Stars: 675 - Forks: 77

jayparks/transformer

A Pytorch Implementation of "Attention is All You Need" and "Weighted Transformer Network for Machine Translation"

Language: Python - Size: 55.7 KB - Last synced at: 7 months ago - Pushed at: over 5 years ago - Stars: 557 - Forks: 122

kaushalshetty/Structured-Self-Attention

A Structured Self-attentive Sentence Embedding

Language: Python - Size: 492 KB - Last synced at: 6 months ago - Pushed at: over 6 years ago - Stars: 493 - Forks: 104

chsiang426/ML-2021-notes

Lecture notes for Prof. Hung-yi Lee's "Machine Learning 2021 Spring" course at National Taiwan University (NTU)

Size: 111 MB - Last synced at: 14 days ago - Pushed at: 15 days ago - Stars: 492 - Forks: 46

NVlabs/FAN

Official PyTorch implementation of Fully Attentional Networks

Language: Python - Size: 8.6 MB - Last synced at: 7 months ago - Pushed at: almost 3 years ago - Stars: 478 - Forks: 28

WenjieDu/SAITS

The official PyTorch implementation of the paper "SAITS: Self-Attention-based Imputation for Time Series". A fast and state-of-the-art (SOTA) deep-learning neural network model for efficient time-series imputation (impute multivariate incomplete time series containing NaN missing data/values with machine learning). https://arxiv.org/abs/2202.08516

Language: Python - Size: 603 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 455 - Forks: 62

leaderj1001/Stand-Alone-Self-Attention

Implementing Stand-Alone Self-Attention in Vision Models using Pytorch

Language: Python - Size: 87.9 MB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 450 - Forks: 86

NVlabs/GCVit

[ICML 2023] Official PyTorch implementation of Global Context Vision Transformers

Language: Python - Size: 858 KB - Last synced at: 7 months ago - Pushed at: about 2 years ago - Stars: 437 - Forks: 51

Tixierae/deep_learning_NLP

Keras, PyTorch, and NumPy Implementations of Deep Learning Architectures for NLP

Language: Jupyter Notebook - Size: 105 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 435 - Forks: 106

jw9730/tokengt

[NeurIPS'22] Tokenized Graph Transformer (TokenGT), in PyTorch

Language: Python - Size: 1.23 MB - Last synced at: 10 months ago - Pushed at: over 2 years ago - Stars: 332 - Forks: 47

WHU-Sigma/HyperSIGMA

The official repo for [TPAMI'25] "HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model"

Language: Python - Size: 80.6 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 315 - Forks: 26

fudan-zvg/SOFT

[NeurIPS 2021 Spotlight] & [IJCV 2024] SOFT: Softmax-free Transformer with Linear Complexity

Language: Python - Size: 5.06 MB - Last synced at: 8 months ago - Pushed at: almost 2 years ago - Stars: 310 - Forks: 25

binli123/dsmil-wsi

DSMIL: Dual-stream multiple instance learning networks for tumor detection in Whole Slide Image

Language: Python - Size: 48.1 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 302 - Forks: 82

aravindsankar28/DySAT

Representation learning on dynamic graphs using self-attention networks

Language: Python - Size: 2.39 MB - Last synced at: 6 months ago - Pushed at: almost 3 years ago - Stars: 296 - Forks: 42

wangxiao5791509/MultiModal_BigModels_Survey

[MIR-2023-Survey] A continuously updated paper list for multi-modal pre-trained big models

Size: 13.2 MB - Last synced at: 9 months ago - Pushed at: 11 months ago - Stars: 286 - Forks: 17

wenwenyu/MASTER-pytorch

Code for the paper "MASTER: Multi-Aspect Non-local Network for Scene Text Recognition" (Pattern Recognition 2021)

Language: Python - Size: 4.33 MB - Last synced at: 7 months ago - Pushed at: about 4 years ago - Stars: 280 - Forks: 51

naver-ai/rope-vit

[ECCV 2024] Official PyTorch implementation of RoPE-ViT "Rotary Position Embedding for Vision Transformer"

Language: Python - Size: 1.06 MB - Last synced at: 11 months ago - Pushed at: about 1 year ago - Stars: 271 - Forks: 7

alohays/awesome-visual-representation-learning-with-transformers

Awesome Transformers (self-attention) in Computer Vision

Size: 73.2 KB - Last synced at: 16 days ago - Pushed at: over 4 years ago - Stars: 268 - Forks: 37

emadeldeen24/AttnSleep

[TNSRE 2021] "An Attention-based Deep Learning Approach for Sleep Stage Classification with Single-Channel EEG"

Language: Python - Size: 473 KB - Last synced at: 29 days ago - Pushed at: over 2 years ago - Stars: 260 - Forks: 69

kushalj001/pytorch-question-answering

Important paper implementations for Question Answering using PyTorch

Language: Jupyter Notebook - Size: 12 MB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 260 - Forks: 49

GuanRunwei/Awesome-Vision-Transformer-Collection

Variants of Vision Transformer and its downstream tasks

Size: 59.6 KB - Last synced at: 20 days ago - Pushed at: over 3 years ago - Stars: 250 - Forks: 29

cbaziotis/neat-vision

Neat (Neural Attention) Vision is a framework-agnostic visualization tool for the attention mechanisms of deep-learning models on Natural Language Processing (NLP) tasks.

Language: Vue - Size: 25.4 MB - Last synced at: 9 months ago - Pushed at: over 7 years ago - Stars: 250 - Forks: 24

L0SG/relational-rnn-pytorch

An implementation of DeepMind's Relational Recurrent Neural Networks (NeurIPS 2018) in PyTorch.

Language: Python - Size: 4.49 MB - Last synced at: 6 months ago - Pushed at: about 7 years ago - Stars: 245 - Forks: 35

babycommando/neuralgraffiti

Live-bending a foundation model’s output at the neural-network level.

Language: Jupyter Notebook - Size: 31.3 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 212 - Forks: 16

VSainteuf/pytorch-psetae

PyTorch implementation of the model presented in "Satellite Image Time Series Classification with Pixel-Set Encoders and Temporal Self-Attention"

Language: Python - Size: 1.98 MB - Last synced at: 8 months ago - Pushed at: almost 4 years ago - Stars: 198 - Forks: 43

flrngel/Self-Attentive-tensorflow 📦

Tensorflow implementation of "A Structured Self-Attentive Sentence Embedding"

Language: Python - Size: 1.4 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 192 - Forks: 39

keonlee9420/Parallel-Tacotron2

PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling

Language: Python - Size: 99.3 MB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 189 - Forks: 45

shiningliang/MRC2018

Code for the 2018 Baidu Machine Reading Comprehension competition

Language: Python - Size: 10 MB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 167 - Forks: 51

VSainteuf/utae-paps

PyTorch implementation of U-TAE and PaPs for satellite image time series panoptic segmentation.

Language: Jupyter Notebook - Size: 3.03 MB - Last synced at: 8 months ago - Pushed at: over 1 year ago - Stars: 161 - Forks: 58

gmftbyGMFTBY/MultiTurnDialogZoo

Multi-turn dialogue baselines written in PyTorch

Language: Python - Size: 23 MB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 161 - Forks: 25

Audio-WestlakeU/FS-EEND

The official Pytorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors". [ICASSP 2024] and "LS-EEND: long-form streaming end-to-end neural diarization with online attractor extraction". [TASLP 2025]

Language: Python - Size: 5.84 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 155 - Forks: 9

lifanchen-simm/transformerCPI

TransformerCPI: Improving compound–protein interaction prediction by sequence-based deep learning with self-attention mechanism and label reversal experiments(BIOINFORMATICS 2020) https://doi.org/10.1093/bioinformatics/btaa524

Language: Python - Size: 47.2 MB - Last synced at: 29 days ago - Pushed at: over 3 years ago - Stars: 151 - Forks: 38

wilile26811249/ViTGAN

A PyTorch implementation of ViTGAN based on paper ViTGAN: Training GANs with Vision Transformers.

Language: Python - Size: 181 KB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 144 - Forks: 27

leaderj1001/LambdaNetworks

Implementing Lambda Networks using Pytorch

Language: Python - Size: 40 KB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 137 - Forks: 22

zabaras/transformer-physx

Transformers for modeling physical systems

Language: Python - Size: 31.7 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 129 - Forks: 32

ubisoft/ubisoft-laforge-daft-exprt

PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

Language: Python - Size: 1.44 MB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 129 - Forks: 23

ChongQingNoSubway/DGR-MIL

Code for paper: DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification [ECCV 2024]

Language: Python - Size: 2.29 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 128 - Forks: 4

esceptico/perceiver-io

Unofficial implementation of Perceiver IO

Language: Python - Size: 16.6 KB - Last synced at: 4 months ago - Pushed at: over 3 years ago - Stars: 126 - Forks: 5

monk1337/Various-Attention-mechanisms

This repository contains various attention mechanisms, such as Bahdanau, soft, additive, and hierarchical attention, in PyTorch, TensorFlow, and Keras.

Language: Python - Size: 643 KB - Last synced at: 9 months ago - Pushed at: over 4 years ago - Stars: 125 - Forks: 25

sayakpaul/robustness-vit

Contains code for the paper "Vision Transformers are Robust Learners" (AAAI 2022).

Language: Jupyter Notebook - Size: 4.22 MB - Last synced at: 2 months ago - Pushed at: about 3 years ago - Stars: 122 - Forks: 19

foamliu/Self-Attention-Keras

Self-attention and text classification

Language: Python - Size: 97.7 KB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 116 - Forks: 37

akanimax/fagan

A variant of the Self Attention GAN named: FAGAN (Full Attention GAN)

Language: Python - Size: 146 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 112 - Forks: 31

anthony-wang/CrabNet

Predict materials properties using only the composition information!

Language: Python - Size: 429 MB - Last synced at: 8 months ago - Pushed at: over 2 years ago - Stars: 100 - Forks: 31

aliasgharkhani/SLiMe

1-shot image segmentation using Stable Diffusion

Language: Python - Size: 31.1 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 95 - Forks: 8

lucidrains/global-self-attention-network

A Pytorch implementation of Global Self-Attention Network, a fully-attention backbone for vision tasks

Language: Python - Size: 95.7 KB - Last synced at: 4 months ago - Pushed at: about 5 years ago - Stars: 95 - Forks: 7

roomylee/self-attentive-emb-tf

Simple Tensorflow Implementation of "A Structured Self-attentive Sentence Embedding" (ICLR 2017)

Language: Python - Size: 11 MB - Last synced at: almost 3 years ago - Pushed at: over 7 years ago - Stars: 94 - Forks: 34

Nandan91/ULSAM

ULSAM: Ultra-Lightweight Subspace Attention Module for Compact Convolutional Neural Networks

Language: Python - Size: 861 KB - Last synced at: about 1 year ago - Pushed at: almost 5 years ago - Stars: 79 - Forks: 15

VSainteuf/lightweight-temporal-attention-pytorch

A PyTorch implementation of the Light Temporal Attention Encoder (L-TAE) for satellite image time series classification.

Language: Python - Size: 935 KB - Last synced at: almost 2 years ago - Pushed at: over 5 years ago - Stars: 78 - Forks: 17

moraieu/query-selector

Long-term series forecasting with Query Selector: an efficient model of sparse attention

Language: Python - Size: 2.02 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 74 - Forks: 19

BUAABIGSCity/PDFormer

[AAAI2023] A PyTorch implementation of PDFormer: Propagation Delay-aware Dynamic Long-range Transformer for Traffic Flow Prediction.

Language: Python - Size: 8.33 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 74 - Forks: 16

daiquocnguyen/R-MeN

Transformer-based Memory Networks for Knowledge Graph Embeddings (ACL 2020) (Pytorch and Tensorflow)

Language: Python - Size: 51.9 MB - Last synced at: almost 3 years ago - Pushed at: almost 4 years ago - Stars: 74 - Forks: 14

zxuu/Self-Attention

A complete implementation of the Transformer, building the Encoder, Decoder, and self-attention in detail. A concrete example demonstrates the full input, training, and prediction pipeline. Useful for learning and understanding self-attention and the Transformer.

Language: Python - Size: 4.79 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 72 - Forks: 12
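Several repos in this list (the entry above, jayparks/transformer, leaderj1001/Stand-Alone-Self-Attention) center on the same core operation: scaled dot-product self-attention. A minimal NumPy sketch, with illustrative names and no claim to match any repo's API:

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence x of shape (T, d).
    Wq, Wk: (d, d_k) query/key projections; Wv: (d, d_v) value projection."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)               # (T, T) pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)  # subtract row max for stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v                            # each position mixes all values
```

Every output position is a convex combination of the value vectors, weighted by query-key similarity; this is the building block the Transformer implementations above elaborate on.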

keonlee9420/VAENAR-TTS

PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis.

Language: Python - Size: 122 MB - Last synced at: 6 months ago - Pushed at: over 4 years ago - Stars: 72 - Forks: 14

shamim-hussain/egt_pytorch

Edge-Augmented Graph Transformer

Language: Python - Size: 79.1 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 67 - Forks: 9

jw9730/hot

[NeurIPS'21] Higher-order Transformers for sets, graphs, and hypergraphs, in PyTorch

Language: Python - Size: 1.95 MB - Last synced at: 10 months ago - Pushed at: about 3 years ago - Stars: 65 - Forks: 10

lukasruff/CVDD-PyTorch

A PyTorch implementation of Context Vector Data Description (CVDD), a method for Anomaly Detection on text.

Language: Python - Size: 35.2 KB - Last synced at: almost 3 years ago - Pushed at: over 3 years ago - Stars: 58 - Forks: 20

sjsu-smart-lab/Self-supervised-Monocular-Trained-Depth-Estimation-using-Self-attention-and-Discrete-Disparity-Volum

Reproduction of the CVPR 2020 paper - Self-supervised monocular trained depth estimation using self-attention and discrete disparity volume

Language: Python - Size: 2.61 MB - Last synced at: over 2 years ago - Pushed at: almost 5 years ago - Stars: 57 - Forks: 7

akshitac8/BiAM

[ICCV 2021] Official Pytorch implementation of Discriminative Region-based Multi-Label Zero-Shot Learning; SOTA results on NUS-WIDE and OpenImages

Language: Python - Size: 28.2 MB - Last synced at: almost 3 years ago - Pushed at: about 4 years ago - Stars: 55 - Forks: 10

tensorops/TransformerX

Flexible Python library providing building blocks (layers) for reproducible Transformers research (Tensorflow ✅, Pytorch 🔜, and Jax 🔜)

Language: Python - Size: 508 KB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 53 - Forks: 8

FutureComputing4AI/Hrrformer

Hrrformer: A Neuro-symbolic Self-attention Model (ICML23)

Language: Python - Size: 126 KB - Last synced at: 9 months ago - Pushed at: over 2 years ago - Stars: 52 - Forks: 6

Das-Boot/scite

Causality Extraction based on Self-Attentive BiLSTM-CRF with Transferred Embeddings

Language: Jupyter Notebook - Size: 1.06 MB - Last synced at: almost 3 years ago - Pushed at: over 3 years ago - Stars: 52 - Forks: 13

miniHuiHui/awesome-high-order-neural-network

Size: 43.9 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 51 - Forks: 5

deshwalmahesh/ML-Models-from-Scratch

Repo for ML models built from scratch, such as self-attention, linear and logistic regression, PCA, LDA, CNN, LSTM, and neural networks, using NumPy only

Language: Jupyter Notebook - Size: 38.4 MB - Last synced at: 8 months ago - Pushed at: 11 months ago - Stars: 49 - Forks: 8

EagleW/Describing_a_Knowledge_Base

Code for Describing a Knowledge Base

Language: Python - Size: 5.91 MB - Last synced at: almost 3 years ago - Pushed at: over 4 years ago - Stars: 48 - Forks: 29

ROBINADC/BiGRU-CRF-with-Attention-for-NER

Named Entity Recognition (NER) with different combinations of BiGRU, Self-Attention and CRF

Language: Python - Size: 13.5 MB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 48 - Forks: 17

RLOpensource/Relational_Deep_Reinforcement_Learning

Language: Python - Size: 7.56 MB - Last synced at: 2 months ago - Pushed at: over 6 years ago - Stars: 48 - Forks: 5

gan3sh500/attention-augmented-conv

Implementation from the paper Attention Augmented Convolutional Networks in Tensorflow (https://arxiv.org/pdf/1904.09925v1.pdf)

Language: Python - Size: 74.2 KB - Last synced at: almost 3 years ago - Pushed at: over 6 years ago - Stars: 46 - Forks: 6

threelittlemonkeys/seq2seq-pytorch

Sequence to Sequence Models in PyTorch

Language: Python - Size: 10.9 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 44 - Forks: 10

Syeda-Farhat/awesome-Transformers-For-Segmentation

Semantic segmentation is an important task in computer vision, and its applications have grown in popularity over the last decade. This repository groups publications that use various forms of segmentation; notably, every paper is built on a Transformer.

Size: 376 KB - Last synced at: 13 days ago - Pushed at: about 2 months ago - Stars: 43 - Forks: 2

shamim-hussain/egt

Edge-Augmented Graph Transformer

Language: Python - Size: 150 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 41 - Forks: 7

datnnt1997/multi-head_self-attention

A Faster Pytorch Implementation of Multi-Head Self-Attention

Language: Jupyter Notebook - Size: 737 KB - Last synced at: almost 3 years ago - Pushed at: over 3 years ago - Stars: 41 - Forks: 9
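The entry above implements multi-head self-attention, which runs several attention heads in parallel on split channel slices and concatenates the results. A minimal NumPy sketch of that reshaping, with illustrative names not taken from the repo:

```python
import numpy as np

def multi_head_attention(x, Wq, Wk, Wv, Wo, n_heads):
    """x: (T, d_model); Wq/Wk/Wv/Wo: (d_model, d_model); d_model % n_heads == 0."""
    T, d_model = x.shape
    d_head = d_model // n_heads

    def split(m):  # (T, d_model) -> (n_heads, T, d_head)
        return m.reshape(T, n_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ Wq), split(x @ Wk), split(x @ Wv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # per-head (T, T) scores
    scores -= scores.max(axis=-1, keepdims=True)         # stabilize the softmax
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)                   # softmax within each head
    heads = w @ v                                        # (n_heads, T, d_head)
    concat = heads.transpose(1, 0, 2).reshape(T, d_model)  # re-join head outputs
    return concat @ Wo                                   # final output projection
```

Because each head attends in a lower-dimensional subspace, the total cost is close to single-head attention while letting heads specialize on different relations.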

emadeldeen24/ECGTransForm

[Biomedical Signal Processing and Control] ECGTransForm: Empowering adaptive ECG arrhythmia classification framework with bidirectional transformer

Language: Python - Size: 1.11 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 40 - Forks: 8

akshitac8/Generative_MLZSL

[TPAMI 2023] Generative Multi-Label Zero-Shot Learning

Language: Python - Size: 1.47 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 40 - Forks: 13

XinyuanLiao/AttnPINN-for-RUL-Estimation

A Framework for Remaining Useful Life Prediction Based on Self-Attention and Physics-Informed Neural Networks

Language: Python - Size: 5.11 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 39 - Forks: 7

ucalyptus/DARecNet-BS

[IEEE GRSL] - DARecNet-BS: Unsupervised Dual Attention Reconstruction Network for Hyperspectral Band Selection

Language: Jupyter Notebook - Size: 4.01 MB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 38 - Forks: 2

franknb/Self-attention-DCGAN

An application of Self-Attention GANs and DCGAN on the MNIST dataset.

Language: Jupyter Notebook - Size: 138 MB - Last synced at: almost 3 years ago - Pushed at: over 5 years ago - Stars: 38 - Forks: 6

naver-ai/egtr

[CVPR 2024 Best paper award candidate] EGTR: Extracting Graph from Transformer for Scene Graph Generation

Language: Python - Size: 1.14 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 35 - Forks: 0

goutamyg/SMAT

[WACV 2024] Separable Self and Mixed Attention Transformers for Efficient Object Tracking

Language: Python - Size: 1.81 MB - Last synced at: 9 months ago - Pushed at: over 1 year ago - Stars: 35 - Forks: 5

vmarinowski/infini-attention

An unofficial pytorch implementation of 'Efficient Infinite Context Transformers with Infini-attention'

Language: Python - Size: 41 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 32 - Forks: 7

haifangong/CMSA-MTPT-4-MedicalVQA

[ICMR'21, Best Poster Paper Award] Medical Visual Question Answering with Multi-task Pre-training and Cross-modal Self-attention

Language: Python - Size: 534 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 32 - Forks: 2

mengmengliu1998/GATraj

[ISPRS 2023]Official PyTorch Implementation of "GATraj: A Graph- and Attention-based Multi-Agent Trajectory Prediction Model"

Language: Python - Size: 23.6 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 31 - Forks: 6

XiaShan1227/Graphormer

Do Transformers Really Perform Badly for Graph Representation? [NeurIPS 2021]

Language: Python - Size: 2.96 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 30 - Forks: 7

Related Topics
pytorch 125 deep-learning 106 transformer 97 attention-mechanism 58 transformers 46 nlp 42 machine-learning 41 attention 40 python 34 tensorflow 30 computer-vision 27 vision-transformer 20 natural-language-processing 19 cnn 18 bert 18 image-classification 14 attention-is-all-you-need 13 transformer-architecture 13 neural-network 12 language-model 12 neural-networks 12 multihead-attention 12 time-series 12 text-classification 11 rnn 11 python3 11 generative-adversarial-network 10 gan 10 object-detection 9 sentiment-analysis 9 semantic-segmentation 9 keras 9 gpt 9 self-attentive-rnn 9 pytorch-implementation 9 artificial-intelligence 8 positional-encoding 8 encoder-decoder 8 lstm 8 llm 8 deep-neural-networks 8 segmentation 7 transfer-learning 7 cross-attention 7 self-supervised-learning 6 forecasting 6 reinforcement-learning 6 transformer-models 6 nlp-machine-learning 6 neural-machine-translation 6 sentence-embeddings 5 numpy 5 recurrent-neural-networks 5 visual-recognition 5 classification 5 sentiment-classification 5 embeddings 5 question-answering 5 bert-model 5 huggingface 5 natural-language-understanding 5 representation-learning 5 multiple-instance-learning 5 unsupervised-learning 5 ai 5 ml 5 translation 5 vae 5 multi-head-attention 5 domain-adaptation 4 feedforward-neural-network 4 text-generation 4 sagan 4 backbone 4 pre-trained-model 4 seq2seq 4 tts 4 speech-synthesis 4 bilstm 4 remote-sensing 4 imagenet 4 gnn 4 foundation-models 4 graph-attention-networks 4 roberta 4 data-science 4 mamba 4 graph 3 image-segmentation 3 from-scratch 3 unet 3 tensorflow2 3 vit 3 image-recognition 3 visualization 3 llms 3 language-modeling 3 attention-mechanisms 3 detr 3 mobilevit 3