GitHub topics: self-attention
gordicaleksa/pytorch-GAT
My implementation of the original GAT paper (Veličković et al.). I've additionally included playground.py for visualizing the Cora dataset, GAT embeddings, the attention mechanism, and entropy histograms. Both Cora (transductive) and PPI (inductive) examples are supported!
Language: Jupyter Notebook - Size: 25.2 MB - Last synced at: about 6 hours ago - Pushed at: over 2 years ago - Stars: 2,540 - Forks: 336

nicolay-r/AREnets
TensorFlow-based framework offering attentive implementations of conventional neural network models (CNN- and RNN-based) for relation extraction classification tasks, as well as an API for custom model implementation
Language: Python - Size: 1.34 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 7 - Forks: 0

MuzzammilShah/GPT-TransformerModel-2
An end-to-end PyTorch implementation of a GPT-2-style language model (124M, as released by OpenAI), inspired by Karpathy's nanoGPT. Covers core components such as tokenization, multi-head self-attention, transformer blocks, and positional embeddings, along with other key ML concepts.
Language: Jupyter Notebook - Size: 3.06 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0
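
As a taste of the components this walkthrough covers, here is a minimal sketch of a pre-norm GPT-style transformer block; the dimensions, names, and use of nn.MultiheadAttention are illustrative assumptions, not this repo's actual code.

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """Illustrative pre-norm GPT-style block: attention + MLP, each with a residual."""
    def __init__(self, d_model=768, n_heads=12):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )

    def forward(self, x):
        h = self.ln1(x)
        # causal mask: each token attends only to itself and earlier tokens
        mask = nn.Transformer.generate_square_subsequent_mask(x.size(1))
        a, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + a                      # residual connection around attention
        x = x + self.mlp(self.ln2(x))  # residual connection around the MLP
        return x

y = Block()(torch.randn(2, 10, 768))  # (2, 10, 768)
```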

NVlabs/GCVit
[ICML 2023] Official PyTorch implementation of Global Context Vision Transformers
Language: Python - Size: 858 KB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 434 - Forks: 50

github/CodeSearchNet 📦
Datasets, tools, and benchmarks for representation learning of code.
Language: Jupyter Notebook - Size: 28.6 MB - Last synced at: 5 days ago - Pushed at: over 3 years ago - Stars: 2,297 - Forks: 398

esceptico/perceiver-io
Unofficial implementation of Perceiver IO
Language: Python - Size: 16.6 KB - Last synced at: 2 days ago - Pushed at: almost 3 years ago - Stars: 121 - Forks: 5

datawhalechina/leedl-tutorial
"Lee Hung-yi's Deep Learning Tutorial" (recommended by Prof. Hung-yi Lee 👍; known as the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
Language: Jupyter Notebook - Size: 294 MB - Last synced at: 5 days ago - Pushed at: 11 days ago - Stars: 15,054 - Forks: 3,019

microsoft/DeBERTa
The implementation of DeBERTa
Language: Python - Size: 237 KB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 2,081 - Forks: 233

deshwalmahesh/ML-Models-from-Scratch
Repo for ML models built from scratch, such as self-attention, linear and logistic regression, PCA, LDA, CNN, LSTM, and neural networks, using NumPy only
Language: Jupyter Notebook - Size: 38.4 MB - Last synced at: about 10 hours ago - Pushed at: 3 months ago - Stars: 49 - Forks: 8

hyuki875/Transformers
The Transformers repository provides a comprehensive implementation of the Transformer architecture, a groundbreaking model that has revolutionized both Natural Language Processing (NLP) and Computer Vision tasks. It was introduced in the seminal paper "Attention Is All You Need" by Vaswani et al.
Size: 1.95 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1 - Forks: 0

codewithdark-git/Transformers
The Transformers repository provides a comprehensive implementation of the Transformer architecture, a groundbreaking model that has revolutionized both Natural Language Processing (NLP) and Computer Vision tasks. It was introduced in the seminal paper "Attention Is All You Need" by Vaswani et al.
Language: Jupyter Notebook - Size: 2.09 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

xxxnell/how-do-vits-work
(ICLR 2022 Spotlight) Official PyTorch implementation of "How Do Vision Transformers Work?"
Language: Python - Size: 18.3 MB - Last synced at: 4 days ago - Pushed at: almost 3 years ago - Stars: 815 - Forks: 79

Separius/awesome-fast-attention 📦
list of efficient attention modules
Language: Python - Size: 156 KB - Last synced at: 7 days ago - Pushed at: over 3 years ago - Stars: 1,003 - Forks: 108

GuanRunwei/Awesome-Vision-Transformer-Collection
Variants of Vision Transformer and its downstream tasks
Size: 59.6 KB - Last synced at: 5 days ago - Pushed at: almost 3 years ago - Stars: 234 - Forks: 28

VSainteuf/pytorch-psetae
PyTorch implementation of the model presented in "Satellite Image Time Series Classification with Pixel-Set Encoders and Temporal Self-Attention"
Language: Python - Size: 1.98 MB - Last synced at: 4 days ago - Pushed at: over 3 years ago - Stars: 198 - Forks: 43

The-AI-Summer/self-attention-cv
Implementation of various self-attention mechanisms focused on computer vision. Ongoing repository.
Language: Python - Size: 291 KB - Last synced at: 4 days ago - Pushed at: over 3 years ago - Stars: 1,205 - Forks: 154

cmhungsteve/Awesome-Transformer-Attention
A comprehensive paper list of Vision Transformer/Attention work, including papers, code, and related websites
Size: 5.65 MB - Last synced at: 11 days ago - Pushed at: 10 months ago - Stars: 4,848 - Forks: 492

emadeldeen24/ECGTransForm
[Biomedical Signal Processing and Control] ECGTransForm: Empowering adaptive ECG arrhythmia classification framework with bidirectional transformer
Language: Python - Size: 1.11 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 40 - Forks: 8

ntat/Class-Conditional-Diffusion
Conditional diffuser from scratch, applied to CelebA-HQ, CIFAR-10, and MNIST.
Size: 4.36 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 0 - Forks: 0

Dhanush-R-git/MH-Analysis
MHRoberta is a Mental Health RoBERTa model: a pretrained RoBERTa transformer fine-tuned on a mental-health dataset using the PEFT method.
Language: Jupyter Notebook - Size: 3.61 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 1 - Forks: 0

miniHuiHui/awesome-high-order-neural-network
Size: 43.9 KB - Last synced at: 7 days ago - Pushed at: 8 months ago - Stars: 47 - Forks: 4

Namkwangwoon/Saliency-Attention-based-DETR
SA-DETR: Saliency Attention-based DETR for Salient Object Detection
Language: Python - Size: 338 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 1 - Forks: 0

VSainteuf/utae-paps
PyTorch implementation of U-TAE and PaPs for satellite image time series panoptic segmentation.
Language: Jupyter Notebook - Size: 3.03 MB - Last synced at: 4 days ago - Pushed at: 9 months ago - Stars: 161 - Forks: 58

WHU-Sigma/HyperSIGMA
The official repo for [TPAMI'25] "HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model"
Language: Python - Size: 80.5 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 230 - Forks: 19

theJingqiZhou/DS-AGC-Colab Fork of Vesan-yws/DS-AGC
A PyTorch re-implementation of the paper "Semi-Supervised Dual-Stream Self-Attentive Adversarial Graph Contrastive Learning for Cross-Subject EEG-based Emotion Recognition" by Ye et al.
Language: Python - Size: 58.6 KB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

mayankmittal29/CalcuTron-Neural-Math-Solver-with-Transformers
CalcuTron is a transformer-based sequence-to-sequence model engineered for symbolic arithmetic reasoning. Leveraging multi-head self-attention, positional encoding, and deep encoder-decoder layers, it learns to perform multi-digit addition and subtraction, generalizing to longer sequences without explicit rules and showcasing emergent algorithmic behavior.
Size: 3.91 KB - Last synced at: 19 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

tiongsikng/gc2sa_net
Self-Attentive Contrastive Learning for Conditioned Periocular and Face Biometrics
Language: Jupyter Notebook - Size: 15.1 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 3 - Forks: 2

srinadh99/AstroFormer
Photometry Guided Cross Attention Transformers for Astronomical Image Processing
Language: Jupyter Notebook - Size: 22.2 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 0 - Forks: 0

asalekin-ubiquitouslab/Modality-wise-Multple-Instance-Learning
The repository contains our implementation for the work to be presented at Ubicomp 2022
Language: Jupyter Notebook - Size: 31.1 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 3 - Forks: 0

WenjieDu/SAITS
The official PyTorch implementation of the paper "SAITS: Self-Attention-based Imputation for Time Series". A fast, state-of-the-art (SOTA) deep-learning model for efficient time-series imputation (imputing multivariate time series containing NaN missing values). https://arxiv.org/abs/2202.08516
Language: Python - Size: 583 KB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 393 - Forks: 55
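
For orientation, here is a generic sketch of the masked-imputation setting that SAITS-style models operate in (not SAITS's actual API): NaNs are zeroed out, a missing mask travels alongside the data, and observed values are kept in the final output.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 10))             # multivariate series: 32 steps, 10 features
x[rng.random(x.shape) < 0.2] = np.nan     # simulate 20% missing values

mask = (~np.isnan(x)).astype(np.float32)  # 1 = observed, 0 = missing
x_in = np.nan_to_num(x, nan=0.0)          # model input: NaNs zeroed out

x_hat = x_in.copy()                       # stand-in for the model's reconstruction
x_imputed = mask * x_in + (1 - mask) * x_hat  # keep observed values, fill missing ones
```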

veb-101/keras-vision
Porting vision models to Keras 3 for easy accessibility. Contains MobileViT v1, MobileViT v2, and FastViT
Language: Jupyter Notebook - Size: 4.45 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 11 - Forks: 2

Syeda-Farhat/awesome-Transformers-For-Segmentation
Semantic segmentation is an important task in computer vision, and its applications have grown in popularity over the last decade. This repository groups publications that use various forms of segmentation; notably, every paper is built on a transformer.
Size: 300 KB - Last synced at: 7 days ago - Pushed at: about 2 months ago - Stars: 32 - Forks: 2

cocoalex00/Mamba2D
Official PyTorch Implementation of Mamba2D: A Natively Multi-Dimensional State-Space Model for Vision Tasks
Language: Python - Size: 4.34 MB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 3 - Forks: 1

NVlabs/FasterViT
[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with Hierarchical Attention
Language: Python - Size: 1.21 MB - Last synced at: 25 days ago - Pushed at: about 2 months ago - Stars: 843 - Forks: 68

pashtari/deconver
Official PyTorch Implementation of "Deconver: A Deconvolutional Network for Medical Image Segmentation"
Language: Python - Size: 20.5 KB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 4 - Forks: 0

Diego999/pyGAT
PyTorch implementation of the Graph Attention Network model by Veličković et al. (2017, https://arxiv.org/abs/1710.10903)
Language: Python - Size: 207 KB - Last synced at: 29 days ago - Pushed at: almost 2 years ago - Stars: 3,018 - Forks: 696

PetarV-/GAT
Graph Attention Networks (https://arxiv.org/abs/1710.10903)
Language: Python - Size: 4.6 MB - Last synced at: 28 days ago - Pushed at: about 3 years ago - Stars: 3,333 - Forks: 660

saadwazir/HistoSeg
HistoSeg is an encoder-decoder DCNN that utilizes novel Quick Attention Modules and a multi-loss function to generate segmentation masks from histopathological images with greater accuracy. This repo contains the code to train and test HistoSeg.
Language: Python - Size: 22.8 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 23 - Forks: 5

zxuu/Self-Attention
A complete implementation of the Transformer: builds the Encoder, Decoder, and self-attention in detail, with a worked example showing the full input, training, and prediction pipeline. Useful for learning and understanding self-attention and the Transformer.
Language: Python - Size: 4.79 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 72 - Forks: 12
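
To make the mechanism this repo teaches concrete, here is a minimal sketch of masked (causal) scaled dot-product self-attention in PyTorch; the function name, shapes, and weight handling are illustrative assumptions, not this repo's actual code.

```python
import math
import torch
import torch.nn.functional as F

def causal_self_attention(x, w_q, w_k, w_v):
    """x: (batch, seq_len, d_model); w_*: (d_model, d_k) projection matrices."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v                # project into Q, K, V
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # (batch, seq, seq)
    seq_len = x.size(1)
    mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))   # block attention to future positions
    return F.softmax(scores, dim=-1) @ v               # weighted sum of values

x = torch.randn(2, 5, 16)
w = [torch.randn(16, 8) for _ in range(3)]
out = causal_self_attention(x, *w)  # (2, 5, 8)
```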

babycommando/neuralgraffiti
Live-bending a foundation model’s output at neural network level.
Language: Jupyter Notebook - Size: 31.3 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 212 - Forks: 16

cbaziotis/neat-vision
Neat (Neural Attention) Vision is a visualization tool for the attention mechanisms of deep-learning models for Natural Language Processing (NLP) tasks. (framework-agnostic)
Language: Vue - Size: 25.4 MB - Last synced at: about 1 month ago - Pushed at: about 7 years ago - Stars: 250 - Forks: 24

zhouhaoyi/Informer2020
The GitHub repository for the paper "Informer", accepted at AAAI 2021.
Language: Python - Size: 6.33 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 5,812 - Forks: 1,199

speedinghzl/CCNet
CCNet: Criss-Cross Attention for Semantic Segmentation (TPAMI 2020 & ICCV 2019).
Language: Python - Size: 3.88 MB - Last synced at: about 1 month ago - Pushed at: about 4 years ago - Stars: 1,454 - Forks: 278

francomano/PSA-GAN
PSA-GAN implementation in PyTorch
Language: Jupyter Notebook - Size: 127 MB - Last synced at: 26 days ago - Pushed at: about 2 years ago - Stars: 19 - Forks: 1

anthony-wang/CrabNet
Predict materials properties using only the composition information!
Language: Python - Size: 429 MB - Last synced at: 5 days ago - Pushed at: about 2 years ago - Stars: 100 - Forks: 31

alohays/awesome-visual-representation-learning-with-transformers
Awesome Transformers (self-attention) in Computer Vision
Size: 73.2 KB - Last synced at: 7 days ago - Pushed at: almost 4 years ago - Stars: 270 - Forks: 38

SCCSMARTCODE/attention-is-all-you-need-from-scratch
A complete implementation of the Transformer architecture from scratch, including self-attention, positional encoding, multi-head attention, and feedforward layers. This repository provides a deep understanding of Transformers and serves as a foundation for advanced NLP and deep learning models.
Language: Jupyter Notebook - Size: 25.4 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0
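
Since positional encoding is one of the components this repo implements, a hedged sketch of the sinusoidal scheme from "Attention Is All You Need" follows; the function name and shapes are assumptions, not the repo's API.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Returns a (seq_len, d_model) matrix (assumes even d_model):
    PE[pos, 2i] = sin(pos / 10000^(2i/d)), PE[pos, 2i+1] = cos(pos / 10000^(2i/d))."""
    positions = np.arange(seq_len)[:, None]              # (seq_len, 1)
    div = 10000 ** (np.arange(0, d_model, 2) / d_model)  # (d_model/2,)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(positions / div)
    pe[:, 1::2] = np.cos(positions / div)
    return pe

pe = sinusoidal_positional_encoding(seq_len=50, d_model=64)
# Added to the token embeddings before the first attention layer.
```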

fudan-zvg/SOFT
[NeurIPS 2021 Spotlight] & [IJCV 2024] SOFT: Softmax-free Transformer with Linear Complexity
Language: Python - Size: 5.06 MB - Last synced at: 4 days ago - Pushed at: about 1 year ago - Stars: 308 - Forks: 25

brightmart/bert_language_understanding
Pre-training of Deep Bidirectional Transformers for Language Understanding: pre-train TextCNN
Language: Python - Size: 16 MB - Last synced at: 29 days ago - Pushed at: over 6 years ago - Stars: 964 - Forks: 211

dcarpintero/ai-engineering
AI Engineering: annotated notebooks to dive into Self-Attention, In-Context Learning, RAG, Knowledge Graphs, Fine-Tuning, Model Optimization, and more.
Language: Jupyter Notebook - Size: 11.6 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 6 - Forks: 0

cosbidev/MATNet
Multi-Level Fusion and Self-Attention Transformer-Based Model for Multivariate Multi-Step Day-Ahead PV Generation Forecasting
Language: Python - Size: 82.3 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 9 - Forks: 4

HiIAmTzeKean/SC4002-NLP
NTU SC4002 NLP Group Project
Language: Jupyter Notebook - Size: 64.7 MB - Last synced at: 4 days ago - Pushed at: about 2 months ago - Stars: 4 - Forks: 2

wanglh300/EAGLE
EAGLE: Contextual Point Cloud Generation via Adaptive Continuous Normalizing Flow with Self-Attention
Language: Makefile - Size: 39.3 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

mahshid1378/Parallel-Tacotron2
PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling
Language: Python - Size: 99.3 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

NVlabs/MambaVision
[CVPR 2025] Official PyTorch Implementation of MambaVision: A Hybrid Mamba-Transformer Vision Backbone
Language: Python - Size: 723 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1,157 - Forks: 57

jw9730/tokengt
[NeurIPS'22] Tokenized Graph Transformer (TokenGT), in PyTorch
Language: Python - Size: 1.23 MB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 332 - Forks: 47

wangxiao5791509/MultiModal_BigModels_Survey
[MIR-2023-Survey] A continuously updated paper list for multi-modal pre-trained big models
Size: 13.2 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 286 - Forks: 17

jeongwhanchoi/GFSA
"Graph Convolutions Enrich the Self-Attention in Transformers!" NeurIPS 2024
Language: Python - Size: 6.58 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 17 - Forks: 1

hieunm44/mlhm-lung-disease-detection
Lung disease detection using a Vision Transformer.
Language: Jupyter Notebook - Size: 5.92 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

SVECTOR-CORPORATION/SSMA
Structured State Matrix Architecture (SSMA) is a high-performance framework designed for efficient sequence modeling, combining structured state space models with adaptive attention mechanisms.
Language: Python - Size: 139 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 1

DirtyHarryLYL/Transformer-in-Vision
Recent Transformer-based CV and related works.
Size: 1.84 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 1,332 - Forks: 143

zhongshsh/ASR
[ECCV 2024] A novel attention-alike structural re-parameterization (ASR)
Size: 2.93 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

jw9730/hot
[NeurIPS'21] Higher-order Transformers for sets, graphs, and hypergraphs, in PyTorch
Language: Python - Size: 1.95 MB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 65 - Forks: 10

mrorigo/pytorch-fftnet
FFTNet implementation in Pytorch
Language: Python - Size: 28.3 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

DaniGarciaPerez/vision_transformer
A repo to explore the implementation of a Vision Transformer from scratch.
Language: Python - Size: 81.1 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

monk1337/Various-Attention-mechanisms
This repository contains various types of attention mechanisms, such as Bahdanau, soft attention, additive attention, and hierarchical attention, in PyTorch, TensorFlow, and Keras
Language: Python - Size: 643 KB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 125 - Forks: 25
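
As an illustration of one variant this collection covers, here is a sketch of additive (Bahdanau-style) attention scoring in PyTorch; the class name and dimensions are assumptions, not the repo's actual interface.

```python
import torch
import torch.nn as nn

class AdditiveAttention(nn.Module):
    def __init__(self, query_dim, key_dim, hidden_dim):
        super().__init__()
        self.w_q = nn.Linear(query_dim, hidden_dim, bias=False)
        self.w_k = nn.Linear(key_dim, hidden_dim, bias=False)
        self.v = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, query, keys):
        """query: (batch, query_dim); keys: (batch, seq_len, key_dim)."""
        # score(q, k_i) = v^T tanh(W_q q + W_k k_i)
        scores = self.v(torch.tanh(self.w_q(query)[:, None, :] + self.w_k(keys)))
        weights = torch.softmax(scores.squeeze(-1), dim=-1)  # (batch, seq_len)
        context = (weights[:, :, None] * keys).sum(dim=1)    # weighted sum of keys
        return context, weights

attn = AdditiveAttention(query_dim=32, key_dim=64, hidden_dim=48)
ctx, w = attn(torch.randn(4, 32), torch.randn(4, 10, 64))
```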

jaketae/vit-breast-cancer
Transfer learning pretrained vision transformers for breast histopathology
Language: Python - Size: 18.6 KB - Last synced at: about 1 month ago - Pushed at: about 3 years ago - Stars: 14 - Forks: 4

Audio-WestlakeU/FS-EEND
The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors" [ICASSP 2024] and "LS-EEND: long-form streaming end-to-end neural diarization with online attractor extraction"
Language: Python - Size: 3.22 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 116 - Forks: 5

sarathir-dev/Multi-Scale-ViT-for-3D-Medical-Imaging
A PyTorch implementation of a Multi-Scale Vision Transformer (ViT) for 3D medical image classification using the OrganMNIST3D dataset. This project explores multi-scale attention mechanisms to enhance classification performance in volumetric medical imaging.
Language: Python - Size: 3.91 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

kevalmorabia97/Object-and-Semantic-Part-Detection-pyTorch
Joint detection of Object and its Semantic parts using Attention-based Feature Fusion on PASCAL Parts 2010 dataset
Language: Python - Size: 8.27 MB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 26 - Forks: 4

AmayaGS/MUSTANG
Multi-stain graph self attention multiple instance learning for histopathology Whole Slide Images - BMVC 2023
Language: Python - Size: 2.59 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 13 - Forks: 2

ostad-ai/Large-Language-Models
This repository includes topics related to Large Language Models (LLMs)
Language: Jupyter Notebook - Size: 11.7 KB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

FutureComputing4AI/Hrrformer
Hrrformer: A Neuro-symbolic Self-attention Model (ICML23)
Language: Python - Size: 126 KB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 52 - Forks: 6

lucidrains/global-self-attention-network
A PyTorch implementation of Global Self-Attention Network, a fully-attentional backbone for vision tasks
Language: Python - Size: 95.7 KB - Last synced at: 19 days ago - Pushed at: over 4 years ago - Stars: 94 - Forks: 7

divyakraman/SS-SFDA-Self-Supervised-Source-Free-Domain-Adaptation-for-Road-Segmentation-in-Hazardous-Environme
Codebase for the paper 'SS SFDA: Self-Supervised Source Free Domain Adaptation for Road Segmentation in Hazardous Environments'
Language: Python - Size: 3.5 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 15 - Forks: 2

Forquosh/GPT
Generative Pretrained Transformer built from scratch using PyTorch.
Language: Jupyter Notebook - Size: 16.2 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

sparks-baird/CrabNet Fork of anthony-wang/CrabNet
Predict materials properties using only the composition information!
Language: HTML - Size: 393 MB - Last synced at: 21 days ago - Pushed at: 8 months ago - Stars: 16 - Forks: 5

goutamyg/SMAT
[WACV 2024] Separable Self and Mixed Attention Transformers for Efficient Object Tracking
Language: Python - Size: 1.81 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 35 - Forks: 5

Vlasenko2006/Text_to_Image-hybrid-transformer
Text_to_Image-hybrid-transformer
Language: Python - Size: 22.5 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

tensorops/TransformerX
Flexible Python library providing building blocks (layers) for reproducible Transformers research (TensorFlow ✅, PyTorch 🔜, and JAX 🔜)
Language: Python - Size: 508 KB - Last synced at: 19 days ago - Pushed at: over 1 year ago - Stars: 53 - Forks: 8

Pranavh-2004/GPT-From-Scratch
Exploring transformers by building a GPT model from scratch using nanoGPT, inspired by Andrej Karpathy’s tutorial.
Language: Jupyter Notebook - Size: 16.6 KB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

SamanKhamesian/Hybrid-Self-Attention-NEAT
This repository is the official implementation of the Hybrid Self-Attention NEAT algorithm. It contains the code to reproduce the results presented in the original paper: https://link.springer.com/article/10.1007/s12530-023-09510-3
Language: Python - Size: 69.3 KB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 14 - Forks: 1

naver-ai/rope-vit
[ECCV 2024] Official PyTorch implementation of RoPE-ViT "Rotary Position Embedding for Vision Transformer"
Language: Python - Size: 1.06 MB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 271 - Forks: 7

evNLP/SelfAttention
Transformer Model Implementation in PyTorch
Language: Jupyter Notebook - Size: 771 KB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

KhushiRajurkar/Vision-Transformer-Image-Classification
A Vision Transformer (ViT) implementation for image classification using CIFAR-10 dataset, leveraging HuggingFace's Trainer API for computational efficiency
Language: Jupyter Notebook - Size: 33.2 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

sayakpaul/robustness-vit
Contains code for the paper "Vision Transformers are Robust Learners" (AAAI 2022).
Language: Jupyter Notebook - Size: 4.22 MB - Last synced at: 28 days ago - Pushed at: over 2 years ago - Stars: 126 - Forks: 18

mahesmeh001/Transformer-with-Self-Attention
Creating a transformer (encoder/decoder) from scratch. Also experimented with ALiBi encodings as an alternative to positional encodings.
Language: Python - Size: 818 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0
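
For reference, a hedged sketch of the ALiBi bias this entry mentions (Press et al., "Train Short, Test Long"): instead of adding positional encodings to the embeddings, a distance-proportional bias is added to the attention scores. The helper below is an illustration, not this repo's code.

```python
import torch

def alibi_bias(num_heads, seq_len):
    """Returns (num_heads, seq_len, seq_len) biases: -slope_h * (i - j) for j <= i."""
    # per-head slopes follow the paper's geometric schedule: 2^(-8h/n), h = 1..n
    slopes = torch.tensor([2 ** (-8 * (h + 1) / num_heads) for h in range(num_heads)])
    positions = torch.arange(seq_len)
    distance = positions[:, None] - positions[None, :]  # i - j
    # future positions (j > i) are left at zero; a causal mask handles them anyway
    return -slopes[:, None, None] * distance.clamp(min=0)

bias = alibi_bias(num_heads=8, seq_len=5)
# Added to q @ k.T / sqrt(d_k) before the softmax, in place of positional encodings.
```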

pointlander/txt
A natural language model based on context mixing
Language: Go - Size: 1.01 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

valterlej/dvcusi
Published in the Journal of Visual Communication and Image Representation
Language: Python - Size: 11.6 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 6 - Forks: 0

bhattbhavesh91/self-attention-python
This repository will guide you through implementing a simple self-attention mechanism using Python's NumPy library
Language: Jupyter Notebook - Size: 163 KB - Last synced at: 24 days ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 3
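
In the spirit of this tutorial, a toy NumPy-only self-attention pass might look like the following; the weight shapes are arbitrary illustrative choices.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # subtract max for stability
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))          # 4 tokens, embedding dim 8
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))

q, k, v = x @ w_q, x @ w_k, x @ w_v
weights = softmax(q @ k.T / np.sqrt(k.shape[-1]))  # (4, 4) attention matrix
output = weights @ v                               # each token mixes all tokens
```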

surafiel-habib/Transformer-Based-Amharic-to-English-Machine-Translation-with-Character-Embedding-and-Combined-Regul
Language: Jupyter Notebook - Size: 9.04 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

ChongQingNoSubway/DGR-MIL
Code for paper: DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification [ECCV 2024]
Language: Python - Size: 2.29 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 128 - Forks: 4

MahmudulHasan11085/Neighbor-Self-Attention
arXiv paper will be added soon
Size: 0 Bytes - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

kaituoxu/Speech-Transformer
A PyTorch implementation of Speech Transformer, an end-to-end ASR system with a Transformer network for Mandarin Chinese.
Language: Python - Size: 678 KB - Last synced at: 6 months ago - Pushed at: about 2 years ago - Stars: 771 - Forks: 196

erfanashams/steve
Speech Self-Attention Exploratory Visual Environment
Language: Python - Size: 1.65 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 2 - Forks: 0

NVlabs/FAN
Official PyTorch implementation of Fully Attentional Networks
Language: Python - Size: 8.6 MB - Last synced at: 6 months ago - Pushed at: about 2 years ago - Stars: 469 - Forks: 28

aravindsankar28/DySAT
Representation learning on dynamic graphs using self-attention networks
Language: Python - Size: 2.39 MB - Last synced at: 6 months ago - Pushed at: about 2 years ago - Stars: 282 - Forks: 44

kaushalshetty/Structured-Self-Attention
A Structured Self-attentive Sentence Embedding
Language: Python - Size: 492 KB - Last synced at: 6 months ago - Pushed at: over 5 years ago - Stars: 495 - Forks: 106

L0SG/relational-rnn-pytorch
An implementation of DeepMind's Relational Recurrent Neural Networks (NeurIPS 2018) in PyTorch.
Language: Python - Size: 4.49 MB - Last synced at: about 1 month ago - Pushed at: over 6 years ago - Stars: 245 - Forks: 35

shamspias/Transformers-and-Large-Language-Models-From-Basics-to-Frontier-Research
Dive into the transformative world of NLP with this guide on Transformers. Journey from the roots of NLP to advanced Transformer variants like BERT and GPT. Discover their architecture, practical applications, ethical considerations, and future prospects. A comprehensive resource for AI enthusiasts and experts alike.
Size: 18.6 KB - Last synced at: 30 days ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 1
