An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: attention

kyegomez/VO-ROPE

An implementation of VO-RoPE, the new rotary position embedding (RoPE) variant from Jianlin Su

Language: Python - Size: 30.3 KB - Last synced at: about 18 hours ago - Pushed at: about 20 hours ago - Stars: 1 - Forks: 0
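
The description points to Jianlin Su's rotary position embedding (RoPE) work. For orientation, here is a minimal sketch of standard RoPE applied to queries and keys; the VO variant this repo implements (rotating the value/output path instead) is not reproduced here.

```python
import torch

def rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Standard rotary position embedding over the last dimension.

    x: (seq_len, dim), dim even. Channel pairs are rotated by angles that
    grow with token position, so Q.K dot products depend on relative offsets.
    Uses the split-half (LLaMA-style) pairing convention.
    """
    seq_len, dim = x.shape
    half = dim // 2
    freqs = base ** (-torch.arange(half, dtype=torch.float32) / half)
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * freqs
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

q, k = torch.randn(8, 64), torch.randn(8, 64)
q_rot, k_rot = rope(q), rope(k)  # rotate Q and K before the attention dot product
```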

leondgarse/keras_cv_attention_models

Keras beit,caformer,CMT,CoAtNet,convnext,davit,dino,efficientdet,edgenext,efficientformer,efficientnet,eva,fasternet,fastervit,fastvit,flexivit,gcvit,ghostnet,gpvit,hornet,hiera,iformer,inceptionnext,lcnet,levit,maxvit,mobilevit,moganet,nat,nfnets,pvt,swin,tinynet,tinyvit,uniformer,volo,vanillanet,yolor,yolov7,yolov8,yolox,gpt2,llama2, alias kecam

Language: Python - Size: 4.33 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 616 - Forks: 95

TimS-ml/Boring-LLM-Implementation

Aims to aggregate 40+ papers into a single transformer implementation.

Language: Python - Size: 1.13 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1 - Forks: 0

aju22/LLaMA2

This repository contains an implementation of the LLaMA 2 (Large Language Model Meta AI) model, a Generative Pretrained Transformer (GPT) variant. The implementation focuses on the model architecture and the inference process. The code is restructured and heavily commented to facilitate easy understanding of the key parts of the architecture.

Language: Python - Size: 10.7 KB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 64 - Forks: 9

argusswift/YOLOv4-pytorch

A PyTorch repository of YOLOv4, attentive YOLOv4, and MobileNet YOLOv4, with PASCAL VOC and COCO support.

Language: Python - Size: 22.1 MB - Last synced at: about 9 hours ago - Pushed at: 12 months ago - Stars: 1,674 - Forks: 330

AshishKumar4/FlaxDiff

A simple, easy-to-understand library for diffusion models using Flax and Jax. Includes detailed notebooks on DDPM, DDIM, and EDM with simplified mathematical explanations. Made as part of my journey for learning and experimenting with generative AI.

Language: Jupyter Notebook - Size: 228 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 25 - Forks: 0

mmahdin/Deep-Learning-for-Computer-Vision

Deep Learning for Computer Vision (University of Michigan - Justin Johnson)

Language: Jupyter Notebook - Size: 35 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

anseki/leader-line 📦

Draw a leader line in your web page.

Language: JavaScript - Size: 1.41 MB - Last synced at: 2 days ago - Pushed at: 10 days ago - Stars: 3,095 - Forks: 440

xmed-lab/AllSpark

CVPR 2024: AllSpark: Reborn Labeled Features from Unlabeled in Transformer for Semi-Supervised Semantic Segmentation

Language: Python - Size: 14.1 MB - Last synced at: about 24 hours ago - Pushed at: about 1 year ago - Stars: 80 - Forks: 11

ywyue/AGILE3D

[ICLR 2024] AGILE3D: Attention Guided Interactive Multi-object 3D Segmentation

Language: Python - Size: 39.1 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 110 - Forks: 10

gordicaleksa/pytorch-GAT

My implementation of the original GAT paper (Veličković et al.). I've additionally included the playground.py file for visualizing the Cora dataset, GAT embeddings, an attention mechanism, and entropy histograms. I've supported both Cora (transductive) and PPI (inductive) examples!

Language: Jupyter Notebook - Size: 25.2 MB - Last synced at: 4 days ago - Pushed at: over 2 years ago - Stars: 2,527 - Forks: 333

chengzeyi/ParaAttention

Context-parallel attention that accelerates DiT model inference with dynamic caching. https://wavespeed.ai/

Language: Python - Size: 13.4 MB - Last synced at: 4 days ago - Pushed at: 18 days ago - Stars: 241 - Forks: 24

labmlai/annotated_deep_learning_paper_implementations

🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠

Language: Python - Size: 147 MB - Last synced at: 5 days ago - Pushed at: 8 months ago - Stars: 59,995 - Forks: 6,067

milistu/bertdistiller

Faster, smaller BERT models in just a few lines.

Language: Python - Size: 237 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

CyberZHG/keras-transformer 📦

Transformer implemented in Keras

Language: Python - Size: 68.4 KB - Last synced at: about 12 hours ago - Pushed at: about 3 years ago - Stars: 371 - Forks: 94

qubvel/residual_attention_network

Keras implementation of Residual Attention Network

Language: Jupyter Notebook - Size: 8.79 KB - Last synced at: 5 days ago - Pushed at: almost 7 years ago - Stars: 109 - Forks: 38

xmu-xiaoma666/External-Attention-pytorch

🍀 PyTorch implementations of various attention mechanisms, MLPs, re-parameterization, and convolution modules, helpful for further understanding papers. ⭐⭐⭐

Language: Python - Size: 5.43 MB - Last synced at: 5 days ago - Pushed at: 5 months ago - Stars: 11,873 - Forks: 1,965
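
One of the mechanisms this repo covers is external attention, from "Beyond Self-Attention: External Attention using Two Linear Layers". A minimal sketch, assuming the paper's double normalization (softmax over tokens, then l1 over memory slots); dimensions are illustrative:

```python
import torch
import torch.nn as nn

class ExternalAttention(nn.Module):
    """External attention: attend against two small learnable memories
    instead of the input itself, giving complexity linear in sequence length."""

    def __init__(self, d_model: int, s: int = 64):
        super().__init__()
        self.mk = nn.Linear(d_model, s, bias=False)   # key memory M_k
        self.mv = nn.Linear(s, d_model, bias=False)   # value memory M_v

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (b, n, d)
        attn = self.mk(x).softmax(dim=1)               # softmax over tokens
        attn = attn / attn.sum(dim=2, keepdim=True)    # l1 norm over memory slots
        return self.mv(attn)

x = torch.randn(2, 100, 512)
print(ExternalAttention(512)(x).shape)  # torch.Size([2, 100, 512])
```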

8e8bdba457c18cf692a95fe2ec67000b/VulkanCooperativeMatrixAttention

Vulkan & GLSL implementation of FlashAttention-2

Size: 1.95 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

snailpt/CTNet

CTNet: A Convolutional Transformer Network for EEG-Based Motor Imagery Classification

Language: Jupyter Notebook - Size: 2.63 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 234 - Forks: 4

DeepAuto-AI/hip-attention

Training-free, post-training, sub-quadratic-complexity efficient attention, implemented with OpenAI Triton.

Language: Python - Size: 45.5 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 128 - Forks: 14

harleyszhang/lite_llama

A lightweight LLaMA-like LLM inference framework built on Triton kernels.

Language: Python - Size: 39.1 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 106 - Forks: 12

CyberZHG/torch-multi-head-attention 📦

Multi-head attention in PyTorch

Language: Python - Size: 9.77 KB - Last synced at: 3 days ago - Pushed at: about 6 years ago - Stars: 152 - Forks: 36
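
For reference, a minimal self-attention variant of multi-head attention in PyTorch: project to h heads, attend per head, concatenate, and project back. Names and dimensions are illustrative, not the repo's API:

```python
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    """Minimal multi-head self-attention."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.h, self.d_head = n_heads, d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (b, n, d)
        b, n, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # reshape each to (b, h, n, d_head)
        q, k, v = (t.view(b, n, self.h, self.d_head).transpose(1, 2) for t in (q, k, v))
        scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5
        ctx = scores.softmax(dim=-1) @ v                  # (b, h, n, d_head)
        return self.out(ctx.transpose(1, 2).reshape(b, n, d))

x = torch.randn(2, 16, 128)
print(MultiHeadAttention(128, 8)(x).shape)  # torch.Size([2, 16, 128])
```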

songyouwei/ABSA-PyTorch

Aspect-Based Sentiment Analysis, PyTorch implementations.

Language: Python - Size: 3.71 MB - Last synced at: 6 days ago - Pushed at: almost 2 years ago - Stars: 2,065 - Forks: 528

km1994/nlp_paper_study

A repository of reading notes on top-conference papers relevant to NLP algorithm engineers.

Language: C++ - Size: 857 MB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 3,979 - Forks: 655

kyegomez/awesome-multi-agent-papers

A compilation of the best multi-agent papers

Language: TeX - Size: 292 KB - Last synced at: 7 days ago - Pushed at: 25 days ago - Stars: 514 - Forks: 33

vonfeng/DeepMove

[WWW 2018] DeepMove: Predicting Human Mobility with Attentional Recurrent Network

Language: Python - Size: 143 MB - Last synced at: about 7 hours ago - Pushed at: 3 months ago - Stars: 151 - Forks: 56

gordicaleksa/pytorch-original-transformer

My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing otherwise seemingly hard concepts. IWSLT pretrained models are currently included.

Language: Jupyter Notebook - Size: 948 KB - Last synced at: 7 days ago - Pushed at: over 4 years ago - Stars: 1,023 - Forks: 176

thu-ml/SageAttention

Quantized attention that achieves speedups of 2.1-3.1x and 2.7-5.1x over FlashAttention2 and xformers, respectively, without degrading end-to-end metrics across various models.

Language: Cuda - Size: 46.1 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 1,294 - Forks: 89
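
SageAttention's core idea is computing the attention matmuls on quantized operands. A toy sketch of that idea with naive per-tensor INT8 quantization of Q and K before QK^T; the paper's finer-grained per-block quantization and smoothing are not reproduced here:

```python
import torch

def int8_quant(x: torch.Tensor):
    """Symmetric per-tensor INT8 quantization; returns int8 tensor + scale.
    (SageAttention itself quantizes at finer granularity.)"""
    scale = x.abs().amax().clamp(min=1e-8) / 127.0
    return (x / scale).round().clamp(-127, 127).to(torch.int8), scale

def quantized_scores(q: torch.Tensor, k: torch.Tensor) -> torch.Tensor:
    """QK^T computed from INT8 inputs, rescaled back to float."""
    qi, sq = int8_quant(q)
    ki, sk = int8_quant(k)
    return (qi.float() @ ki.float().T) * (sq * sk) / q.shape[-1] ** 0.5

q, k = torch.randn(16, 64), torch.randn(16, 64)
exact = (q @ k.T) / 64 ** 0.5
print((exact - quantized_scores(q, k)).abs().max())  # small quantization error
```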

tech-srl/how_attentive_are_gats

Code for the paper "How Attentive are Graph Attention Networks?" (ICLR'2022)

Language: Python - Size: 3.34 MB - Last synced at: 7 days ago - Pushed at: about 3 years ago - Stars: 328 - Forks: 40
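
The paper's observation is that the original GAT scoring function ranks keys identically for every query ("static" attention), and that moving the nonlinearity inside the inner product fixes this (GATv2). A single-edge sketch of the two scoring functions, with hypothetical dimensions:

```python
import torch
import torch.nn.functional as F

d, dp = 16, 8                       # input dim, projected dim
W1 = torch.randn(dp, d)             # GAT's per-node transform
a1 = torch.randn(2 * dp)            # GAT's attention vector
W2 = torch.randn(dp, 2 * d)         # GATv2's transform on the concatenated pair
a2 = torch.randn(dp)                # GATv2's attention vector
h_i, h_j = torch.randn(d), torch.randn(d)

# GAT (Velickovic et al.): LeakyReLU applied after the inner product with a,
# which makes attention "static" -- key ranking is the same for every query.
e_gat = F.leaky_relu(torch.cat([W1 @ h_i, W1 @ h_j]) @ a1)

# GATv2 (Brody et al.): the inner product with a comes after the nonlinearity,
# yielding "dynamic" attention that can rank keys differently per query.
e_gatv2 = F.leaky_relu(W2 @ torch.cat([h_i, h_j])) @ a2
```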

thu-ml/SpargeAttn

SpargeAttention: a training-free sparse attention mechanism that can accelerate inference for any model.

Language: Cuda - Size: 55.4 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 435 - Forks: 27

lucidrains/performer-pytorch

An implementation of Performer, a linear attention-based transformer, in Pytorch

Language: Python - Size: 34.3 MB - Last synced at: 6 days ago - Pushed at: about 3 years ago - Stars: 1,120 - Forks: 145
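
Performer approximates softmax attention with random-feature maps (FAVOR+). A sketch of the generic kernelized linear attention it builds on, using the simple elu(x)+1 feature map (from Katharopoulos et al.) in place of Performer's random features:

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v):
    """Kernelized linear attention: O(n * d^2) instead of O(n^2 * d).
    phi(x) = elu(x) + 1 stands in for Performer's random-feature map."""
    q, k = F.elu(q) + 1, F.elu(k) + 1          # (n, d), non-negative features
    kv = k.T @ v                               # (d, d) summary of keys/values
    z = q @ k.sum(dim=0)                       # (n,) normalizer
    return (q @ kv) / z[:, None]

q, k, v = (torch.randn(128, 64) for _ in range(3))
print(linear_attention(q, k, v).shape)  # torch.Size([128, 64])
```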

HMUNACHI/nanodl

A Jax-based library for designing and training transformer models from scratch.

Language: Python - Size: 44.4 MB - Last synced at: 8 days ago - Pushed at: 8 months ago - Stars: 286 - Forks: 10

xiuqhou/Salience-DETR

[CVPR 2024] Official implementation of the paper "Salience DETR: Enhancing Detection Transformer with Hierarchical Salience Filtering Refinement"

Language: Jupyter Notebook - Size: 6.35 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 184 - Forks: 11

lucidrains/transfusion-pytorch

Pytorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI

Language: Python - Size: 34.6 MB - Last synced at: 8 days ago - Pushed at: about 1 month ago - Stars: 1,044 - Forks: 46

hyunwoongko/transformer

Transformer: PyTorch Implementation of "Attention Is All You Need"

Language: Python - Size: 1.95 MB - Last synced at: 9 days ago - Pushed at: 9 months ago - Stars: 3,593 - Forks: 510
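
The core of "Attention Is All You Need" is scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. A minimal PyTorch sketch (PyTorch has also shipped a fused torch.nn.functional.scaled_dot_product_attention since 2.0):

```python
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V."""
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    return scores.softmax(dim=-1) @ v

q, k, v = (torch.randn(2, 10, 64) for _ in range(3))
causal = torch.tril(torch.ones(10, 10))          # decoder-style causal mask
out = scaled_dot_product_attention(q, k, v, causal)
print(out.shape)  # torch.Size([2, 10, 64])
```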

zer0int/CLIP-XAI-GUI

CLIP GUI - XAI app ~ explainable (and guessable) AI with ViT & ResNet models

Language: Python - Size: 3.46 MB - Last synced at: 2 days ago - Pushed at: 7 months ago - Stars: 20 - Forks: 1

LabShuHangGU/PFT-SR

[CVPR 2025] Progressive Focused Transformer for Single Image Super-Resolution

Language: Python - Size: 30.4 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 19 - Forks: 1

cloud0912/learn_LLM

LLM Learning & Algorithm Notes. This repository records my technical reflections and algorithm problem write-ups from studying large language models (LLMs).

Language: Python - Size: 115 KB - Last synced at: 9 days ago - Pushed at: 10 days ago - Stars: 1 - Forks: 0

ai4co/rl4co

A PyTorch library for all things Reinforcement Learning (RL) for Combinatorial Optimization (CO)

Language: Python - Size: 155 MB - Last synced at: 9 days ago - Pushed at: 10 days ago - Stars: 550 - Forks: 96

zxuu/Self-Attention

A complete implementation of the Transformer, building the Encoder, Decoder, and self-attention in detail. Demonstrated on a concrete example with the full input, training, and prediction pipeline; useful for learning and understanding self-attention and the Transformer.

Language: Python - Size: 4.79 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 72 - Forks: 12

sayakpaul/probing-vits

Probing the representations of Vision Transformers.

Language: Jupyter Notebook - Size: 33.3 MB - Last synced at: 9 days ago - Pushed at: over 2 years ago - Stars: 324 - Forks: 20

BobMcDear/attention-in-vision

PyTorch implementation of popular attention mechanisms in vision

Language: Python - Size: 38.1 KB - Last synced at: 7 days ago - Pushed at: about 2 years ago - Stars: 17 - Forks: 2

bentrevett/pytorch-seq2seq

Tutorials on implementing a few sequence-to-sequence (seq2seq) models with PyTorch and TorchText.

Language: Jupyter Notebook - Size: 6.02 MB - Last synced at: 10 days ago - Pushed at: about 1 year ago - Stars: 5,521 - Forks: 1,359

jadore801120/attention-is-all-you-need-pytorch

A PyTorch implementation of the Transformer model in "Attention is All You Need".

Language: Python - Size: 162 KB - Last synced at: 11 days ago - Pushed at: about 1 year ago - Stars: 9,127 - Forks: 2,016

alenzenx/TrackNetV3

TrackNetV3: beyond TrackNetV2, and the first TrackNet using attention.

Language: Python - Size: 78.7 MB - Last synced at: 6 days ago - Pushed at: 8 months ago - Stars: 91 - Forks: 20

google-research/scenic

Scenic: A Jax Library for Computer Vision Research and Beyond

Language: Python - Size: 63.7 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 3,495 - Forks: 453

antoineMoPa/rust-text-experiments

A tiny LLM in Rust / candle.

Language: Rust - Size: 1.3 MB - Last synced at: 3 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

The-AI-Summer/self-attention-cv

Implementation of various self-attention mechanisms focused on computer vision. Ongoing repository.

Language: Python - Size: 291 KB - Last synced at: 8 days ago - Pushed at: over 3 years ago - Stars: 1,199 - Forks: 154

kyegomez/MobileVLM

Implementation of the LDP module block in PyTorch and Zeta from the paper: "MobileVLM: A Fast, Strong and Open Vision Language Assistant for Mobile Devices"

Language: Python - Size: 2.17 MB - Last synced at: about 20 hours ago - Pushed at: about 1 year ago - Stars: 16 - Forks: 0

kyegomez/TinyGPTV

A simple implementation of TinyGPT-V built from super simple Zeta Lego blocks.

Language: Python - Size: 2.17 MB - Last synced at: about 20 hours ago - Pushed at: 5 months ago - Stars: 16 - Forks: 0

FlagOpen/FlagAttention

A collection of memory efficient attention operators implemented in the Triton language.

Language: Python - Size: 975 KB - Last synced at: 11 days ago - Pushed at: 11 months ago - Stars: 262 - Forks: 18

coderonion/awesome-snn

🔥🔥🔥A collection of some awesome public SNN(Spiking Neural Network) projects.

Size: 23.4 KB - Last synced at: 1 day ago - Pushed at: 7 months ago - Stars: 181 - Forks: 19

cbaziotis/neat-vision

Neat (Neural Attention) Vision is a visualization tool for the attention mechanisms of deep-learning models on Natural Language Processing (NLP) tasks (framework-agnostic).

Language: Vue - Size: 25.4 MB - Last synced at: 11 days ago - Pushed at: almost 7 years ago - Stars: 250 - Forks: 24

ChristophReich1996/Cell-DETR

Official and maintained implementation of the paper "Attention-Based Transformers for Instance Segmentation of Cells in Microstructures" [BIBM 2020].

Language: Python - Size: 32.9 MB - Last synced at: 8 days ago - Pushed at: about 3 years ago - Stars: 101 - Forks: 23

ddbourgin/numpy-ml

Machine learning, in numpy

Language: Python - Size: 10 MB - Last synced at: 12 days ago - Pushed at: over 1 year ago - Stars: 16,044 - Forks: 3,796

graykode/nlp-tutorial

Natural Language Processing Tutorial for Deep Learning Researchers

Language: Jupyter Notebook - Size: 353 KB - Last synced at: 12 days ago - Pushed at: about 1 year ago - Stars: 14,534 - Forks: 3,968

MorvanZhou/NLP-Tutorials

Simple implementations of NLP models. Tutorials are written in Chinese on my website https://mofanpy.com

Language: Python - Size: 888 KB - Last synced at: 11 days ago - Pushed at: almost 2 years ago - Stars: 938 - Forks: 316

naidezhujimo/Triton-FlashAttention

This repository contains multiple implementations of FlashAttention optimized with Triton kernels, showcasing progressive performance improvements through hardware-aware optimizations. The implementations range from basic block-wise processing to advanced techniques like FP8 quantization and prefetching.

Language: Python - Size: 521 KB - Last synced at: 11 days ago - Pushed at: 13 days ago - Stars: 1 - Forks: 0
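
The recurrence such kernels optimize is the online softmax: process K/V in blocks while rescaling a running row max and denominator, so the full n×n score matrix is never materialized. A plain-PyTorch sketch of that recurrence (written for clarity, not speed; not the repo's Triton code):

```python
import torch

def blockwise_attention(q, k, v, block: int = 64):
    """Numerically stable tiled attention via online softmax, the core
    FlashAttention recurrence, in plain PyTorch."""
    n, d = q.shape
    scale = d ** -0.5
    out = torch.zeros_like(q)
    m = torch.full((n,), float("-inf"))      # running row max
    l = torch.zeros(n)                       # running softmax denominator
    for s in range(0, k.shape[0], block):
        scores = (q @ k[s:s + block].T) * scale          # (n, block)
        m_new = torch.maximum(m, scores.max(dim=-1).values)
        p = (scores - m_new[:, None]).exp()
        alpha = (m - m_new).exp()                        # rescale old stats
        l = alpha * l + p.sum(dim=-1)
        out = alpha[:, None] * out + p @ v[s:s + block]
        m = m_new
    return out / l[:, None]

q, k, v = (torch.randn(256, 64) for _ in range(3))
ref = ((q @ k.T) / 64 ** 0.5).softmax(-1) @ v
print(torch.allclose(blockwise_attention(q, k, v), ref, atol=1e-5))  # True
```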

praveena2j/Joint-Cross-Attention-for-Audio-Visual-Fusion

IEEE T-BIOM : "Audio-Visual Fusion for Emotion Recognition in the Valence-Arousal Space Using Joint Cross-Attention"

Language: Python - Size: 290 KB - Last synced at: 8 days ago - Pushed at: 5 months ago - Stars: 38 - Forks: 11

fcbg-platforms/meg-flow

Tracking the neural dynamics underlying variations in flow/attentional states at the timescale of seconds.

Language: Python - Size: 1.68 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 1 - Forks: 0

WenjieDu/SAITS

The official PyTorch implementation of the paper "SAITS: Self-Attention-based Imputation for Time Series". A fast and state-of-the-art (SOTA) deep-learning neural network model for efficient time-series imputation (impute multivariate incomplete time series containing NaN missing data/values with machine learning). https://arxiv.org/abs/2202.08516

Language: Python - Size: 583 KB - Last synced at: 9 days ago - Pushed at: about 1 month ago - Stars: 390 - Forks: 55

miniHuiHui/awesome-high-order-neural-network

Size: 43.9 KB - Last synced at: 11 days ago - Pushed at: 7 months ago - Stars: 46 - Forks: 4

rkansal47/MPGAN

The message-passing GAN (https://arxiv.org/abs/2106.11535) and generative adversarial particle transformer (https://arxiv.org/abs/2211.10295) architectures for generating particle clouds.

Language: Python - Size: 158 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 13 - Forks: 11

Agora-Lab-AI/HydraNet

HydraNet is a state-of-the-art transformer architecture that combines Multi-Query Attention (MQA), Mixture of Experts (MoE), and continuous learning capabilities.

Language: Shell - Size: 2.16 MB - Last synced at: 7 days ago - Pushed at: 8 days ago - Stars: 5 - Forks: 0

kyegomez/MHMoE

Community Implementation of the paper: "Multi-Head Mixture-of-Experts" In PyTorch

Language: Python - Size: 2.16 MB - Last synced at: about 20 hours ago - Pushed at: 14 days ago - Stars: 24 - Forks: 4

AI-Hypercomputer/jetstream-pytorch

PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference.

Language: Python - Size: 1.41 MB - Last synced at: 1 day ago - Pushed at: 24 days ago - Stars: 59 - Forks: 17

OpenSparseLLMs/LLaMA-MoE-v2

🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training

Language: Python - Size: 2.21 MB - Last synced at: 13 days ago - Pushed at: 5 months ago - Stars: 78 - Forks: 11

iCog-Labs-Dev/metta-attention

Economic Attention Network

Language: Python - Size: 648 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 2 - Forks: 14

haofanwang/natural-language-joint-query-search

Search photos on Unsplash based on OpenAI's CLIP model; supports search with joint image+text queries and attention visualization.

Language: Jupyter Notebook - Size: 12.9 MB - Last synced at: 13 days ago - Pushed at: over 3 years ago - Stars: 218 - Forks: 20

maciusyoury15/IFT6135_HW2

Implementations of GPT, a decoder, LSTM, LoRA, layer/batch normalization, LLMs, FFNNs, attention mechanisms, and the Transformer.

Language: Python - Size: 4.41 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

lkfink/lkfink.github.io

Lauren's personal website

Language: TeX - Size: 233 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 1

sovit-123/vision_transformers

Vision Transformers for image classification, image segmentation, and object detection.

Language: Python - Size: 44.1 MB - Last synced at: 13 days ago - Pushed at: 6 months ago - Stars: 49 - Forks: 9

xlite-dev/ffpa-attn-mma

📚 FFPA (Split-D): Yet another Faster Flash Prefill Attention with O(1) SRAM complexity for large headdim (D > 256), ~2x↑ 🎉 vs SDPA EA.

Language: Cuda - Size: 4.21 MB - Last synced at: 15 days ago - Pushed at: 27 days ago - Stars: 161 - Forks: 7

kyegomez/ScreenAI

Implementation of the ScreenAI model from the paper: "A Vision-Language Model for UI and Infographics Understanding"

Language: Python - Size: 2.18 MB - Last synced at: 13 days ago - Pushed at: 16 days ago - Stars: 333 - Forks: 30

shreyansh26/Attention-Mask-Patterns

Using FlexAttention to compute attention with different masking patterns

Language: Python - Size: 4.59 MB - Last synced at: 10 days ago - Pushed at: 7 months ago - Stars: 43 - Forks: 0
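
With FlexAttention, a masking pattern is just a predicate over (batch, head, query index, kv index) that gets compiled into a block-sparse mask. A minimal sketch, assuming PyTorch 2.5+ and a CUDA device; the sliding-window pattern below is illustrative, not necessarily one from this repo:

```python
import torch
from torch.nn.attention.flex_attention import flex_attention, create_block_mask

def sliding_window_causal(b, h, q_idx, kv_idx):
    # Causal attention restricted to a window of 256 past tokens.
    return (q_idx >= kv_idx) & (q_idx - kv_idx <= 256)

B, H, S, D = 1, 8, 1024, 64
# B=None/H=None broadcasts the mask over batch and heads; requires CUDA.
block_mask = create_block_mask(sliding_window_causal, B=None, H=None,
                               Q_LEN=S, KV_LEN=S)
q, k, v = (torch.randn(B, H, S, D, device="cuda") for _ in range(3))
out = flex_attention(q, k, v, block_mask=block_mask)
print(out.shape)  # torch.Size([1, 8, 1024, 64])
```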

aamirshehzad33/Resturant-website

Welcome to a captivating 10-page demo website that blends innovation with aesthetic design! Dive into a dynamic online experience built with a modern tech stack and designed to inspire developers and creators alike.

Language: TypeScript - Size: 5.86 MB - Last synced at: 11 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

ndrplz/dreyeve

[TPAMI 2018] Predicting the Driver’s Focus of Attention: the DR(eye)VE Project. A deep neural network learnt to reproduce the human driver focus of attention (FoA) in a variety of real-world driving scenarios.

Language: C - Size: 7.05 MB - Last synced at: 15 days ago - Pushed at: over 5 years ago - Stars: 105 - Forks: 33

Separius/awesome-fast-attention 📦

A list of efficient attention modules.

Language: Python - Size: 156 KB - Last synced at: 9 days ago - Pushed at: over 3 years ago - Stars: 999 - Forks: 108

VachanVY/gpt.jax 📦

Generative Pretrained Transformer (GPT) in JAX. A step-by-step guide to training LLMs on large datasets from scratch.

Language: Python - Size: 111 MB - Last synced at: 10 days ago - Pushed at: 7 months ago - Stars: 4 - Forks: 1

StefanHeng/ECG-Representation-Learning

Self-supervised pre-training for ECG representation with inspiration from transformers & computer vision

Language: Python - Size: 37.7 MB - Last synced at: 8 days ago - Pushed at: about 3 years ago - Stars: 24 - Forks: 6

scaomath/galerkin-transformer

[NeurIPS 2021] Galerkin Transformer: linear attention without softmax for partial differential equations

Language: Python - Size: 7.91 MB - Last synced at: 1 day ago - Pushed at: 10 months ago - Stars: 240 - Forks: 28
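
The Galerkin-type attention replaces softmax with layer normalization on keys and values and exploits associativity for cost linear in sequence length: z = Q · (LN(K)ᵀ LN(V)) / n. A simplified sketch under that reading of the paper:

```python
import torch
import torch.nn as nn

def galerkin_attention(q, k, v, ln_k: nn.LayerNorm, ln_v: nn.LayerNorm):
    """Softmax-free Galerkin-type attention: Q (K~^T V~) / n, where ~ denotes
    layer normalization. Computing K~^T V~ first makes the cost O(n d^2)."""
    n = q.shape[-2]
    return q @ (ln_k(k).transpose(-2, -1) @ ln_v(v)) / n

d = 64
q, k, v = (torch.randn(128, d) for _ in range(3))
out = galerkin_attention(q, k, v, nn.LayerNorm(d), nn.LayerNorm(d))
print(out.shape)  # torch.Size([128, 64])
```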

hinesboy/transformer-simple

A simple English-to-Chinese translation model implemented with the Transformer architecture.

Language: Python - Size: 565 KB - Last synced at: 14 days ago - Pushed at: over 5 years ago - Stars: 89 - Forks: 21

SkyworkAI/MoH

MoH: Multi-Head Attention as Mixture-of-Head Attention

Language: Python - Size: 5.26 MB - Last synced at: 16 days ago - Pushed at: 6 months ago - Stars: 233 - Forks: 9

rentainhe/visualization

A collection of visualization functions.

Language: Python - Size: 2.82 MB - Last synced at: 16 days ago - Pushed at: over 3 years ago - Stars: 416 - Forks: 40

lucidrains/genie2-pytorch

Implementation of a framework for Genie2 in Pytorch

Language: Python - Size: 1.33 MB - Last synced at: 8 days ago - Pushed at: 3 months ago - Stars: 145 - Forks: 12

pprp/SimpleCVReproduction

Replications of simple CV projects, including attention, classification, detection, keypoint detection, etc.

Language: Jupyter Notebook - Size: 253 MB - Last synced at: 12 days ago - Pushed at: over 2 years ago - Stars: 1,233 - Forks: 313

aprbw/traffic_prediction

Traffic prediction is the task of predicting future traffic measurements (e.g. volume, speed, etc.) in a road network (graph), using historical data (timeseries).

Language: TeX - Size: 685 KB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 288 - Forks: 46

Bhazantri/EvoLingua

EvoLingua: A Scalable Mixture-of-Experts Language Model Framework

Language: Python - Size: 36.1 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 0

kyegomez/LongNet

Implementation of the plug-and-play attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens".

Language: Python - Size: 40.3 MB - Last synced at: 12 days ago - Pushed at: over 1 year ago - Stars: 702 - Forks: 64

kaelzhang/DA-RNN-in-Tensorflow-2-and-PyTorch

A Tensorflow 2 (Keras) implementation of DA-RNN (A Dual-Stage Attention-Based Recurrent Neural Network for Time Series Prediction, arXiv:1704.02971)

Language: Jupyter Notebook - Size: 7.97 MB - Last synced at: 4 days ago - Pushed at: 12 months ago - Stars: 32 - Forks: 11

jackaduma/SecBERT

A pretrained BERT model for cybersecurity text, trained to capture cybersecurity knowledge.

Language: Python - Size: 490 KB - Last synced at: 16 days ago - Pushed at: almost 2 years ago - Stars: 177 - Forks: 33

kevinzakka/spatial-transformer-network

A Tensorflow implementation of Spatial Transformer Networks.

Language: Python - Size: 4.04 MB - Last synced at: 12 days ago - Pushed at: almost 7 years ago - Stars: 998 - Forks: 267

ADicksonLab/AGDIFF

Implementation of AGDIFF: Attention-Enhanced Diffusion for Molecular Geometry Prediction

Language: Python - Size: 20 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 11 - Forks: 0

gauenk/spix_paper

Superpixel Attention Paper

Language: Python - Size: 2.28 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 7 - Forks: 0

dorarad/gansformer

Generative Adversarial Transformers

Language: Python - Size: 725 KB - Last synced at: 11 days ago - Pushed at: almost 3 years ago - Stars: 1,339 - Forks: 149

icon-lab/ResViT

Official Implementation of ResViT: Residual Vision Transformers for Multi-modal Medical Image Synthesis

Language: Python - Size: 2.01 MB - Last synced at: 14 days ago - Pushed at: almost 2 years ago - Stars: 146 - Forks: 27

pranavphoenix/WaveMix

2D discrete Wavelet Transform for Image Classification and Segmentation

Language: Python - Size: 22 MB - Last synced at: 13 days ago - Pushed at: 3 months ago - Stars: 84 - Forks: 14

soujanyaporia/multimodal-sentiment-analysis

Attention-based multimodal fusion for sentiment analysis

Language: Python - Size: 87.3 MB - Last synced at: 15 days ago - Pushed at: about 1 year ago - Stars: 346 - Forks: 74

jmward01/lmplay

A playground to make it easy to try crazy things

Language: Python - Size: 8.67 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 33 - Forks: 1

thohemp/nitec

NITEC: Versatile Hand-Annotated Eye Contact Dataset for Ego-Vision Interaction (WACV24)

Language: Python - Size: 418 KB - Last synced at: 19 days ago - Pushed at: 9 months ago - Stars: 16 - Forks: 2

Related Keywords
attention (1,073), pytorch (270), deep-learning (249), transformer (209), attention-mechanism (168), nlp (140), tensorflow (139), lstm (123), machine-learning (94), seq2seq (85), transformers (84), rnn (71), python (67), computer-vision (59), cnn (58), keras (57), attention-is-all-you-need (56), natural-language-processing (46), artificial-intelligence (45), bert (44), self-attention (40), ai (37), encoder-decoder (31), neural-machine-translation (31), attention-model (30), neural-network (29), vision-transformer (28), machine-translation (27), recurrent-neural-networks (26), neural-networks (26), ml (26), sentiment-analysis (25), classification (25), convolutional-neural-networks (25), pytorch-implementation (25), text-classification (24), image-classification (24), gpt (23), gru (22), image-captioning (22), llm (21), python3 (20), nmt (20), translation (20), multimodal (18), embeddings (18), sequence-to-sequence (18), decoder (17), chatbot (16), graph-neural-networks (16), visualization (16), tensorflow2 (16), deep-neural-networks (15), segmentation (15), gan (15), language-model (15), gpt4 (15), word2vec (15), object-detection (14), nlp-machine-learning (14), reinforcement-learning (14), deeplearning (13), semantic-segmentation (13), transfer-learning (13), vit (13), bilstm (13), eeg (13), multi-modal (13), attention-mechanisms (12), unet (12), time-series (12), llama (11), jax (11), encoder (11), tutorial (11), dataset (11), multihead-attention (10), psychology (10), ocr (10), graph (10), multimodal-deep-learning (10), graph-attention-networks (10), sentiment-classification (10), word-embeddings (10), multi-head-attention (9), xai (9), resnet (9), vqa (9), gnn (9), paper (8), attention-lstm (8), attention-visualization (8), torchtext (8), transformer-encoder (8), explainable-ai (8), machine-learning-algorithms (8), saliency-detection (8), cbam (8), generative-adversarial-network (8), relation-extraction (8)