Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub topics: attention-mechanisms
lucidrains/alphafold3-pytorch
Implementation of Alphafold 3 in Pytorch
Language: Python - Size: 1 MB - Last synced: 17 days ago - Pushed: 17 days ago - Stars: 458 - Forks: 21
jshuadvd/LongRoPE
Implementation of the paper "LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens"
Language: Python - Size: 439 KB - Last synced: about 17 hours ago - Pushed: about 18 hours ago - Stars: 82 - Forks: 8
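LongRoPE builds on rotary position embeddings (RoPE), which rotate each channel pair of the query/key vectors by a position-dependent angle so that attention scores depend only on relative offsets. As a minimal illustrative sketch (numpy, split-half pairing; the `rope` function and shapes here are my own, not code from this repo):

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply rotary position embeddings to x of shape (seq_len, dim).

    Channel pairs (i, i + dim//2) are rotated by an angle that grows
    linearly with position, so q·k depends only on the relative offset.
    """
    seq_len, dim = x.shape
    half = dim // 2
    # One frequency per channel pair, geometrically spaced as in RoFormer.
    freqs = base ** (-np.arange(half) / half)           # (half,)
    angles = np.arange(seq_len)[:, None] * freqs[None]  # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)
```

The relative-position property means that the dot product between a rotated query at position m and a rotated key at position n is a function of m - n only; LongRoPE's contribution is rescaling the per-pair frequencies non-uniformly to stretch this scheme to multi-million-token contexts.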
kyegomez/ShallowFF
Zeta implementation of "Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers"
Language: Python - Size: 36.2 MB - Last synced: 2 days ago - Pushed: 5 days ago - Stars: 6 - Forks: 0
lucidrains/musiclm-pytorch
Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in Pytorch
Language: Python - Size: 196 KB - Last synced: 4 days ago - Pushed: 9 months ago - Stars: 3,064 - Forks: 248
changzy00/pytorch-attention
Pytorch implementation of popular attention mechanisms, Vision Transformers, MLP-like models, and CNNs.
Language: Python - Size: 3.5 MB - Last synced: 2 days ago - Pushed: 5 months ago - Stars: 313 - Forks: 24
lucidrains/BS-RoFormer
Implementation of Band Split Roformer, SOTA Attention network for music source separation out of ByteDance AI Labs
Language: Python - Size: 225 KB - Last synced: 6 days ago - Pushed: 6 days ago - Stars: 299 - Forks: 12
lucidrains/mixture-of-attention
Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts
Language: Python - Size: 34.1 MB - Last synced: 7 days ago - Pushed: 11 months ago - Stars: 98 - Forks: 3
lucidrains/toolformer-pytorch
Implementation of Toolformer, Language Models That Can Use Tools, by MetaAI
Language: Python - Size: 161 KB - Last synced: 6 days ago - Pushed: 6 months ago - Stars: 1,907 - Forks: 120
lucidrains/MEGABYTE-pytorch
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in Pytorch
Language: Python - Size: 34.5 MB - Last synced: 5 days ago - Pushed: about 1 month ago - Stars: 593 - Forks: 50
lucidrains/complex-valued-transformer
Implementation of the transformer proposed in "Building Blocks for a Complex-Valued Transformer Architecture"
Language: Python - Size: 34.4 MB - Last synced: 3 days ago - Pushed: 8 months ago - Stars: 52 - Forks: 3
lucidrains/phenaki-pytorch
Implementation of Phenaki Video, which uses Mask GIT to produce text guided videos of up to 2 minutes in length, in Pytorch
Language: Python - Size: 263 KB - Last synced: 6 days ago - Pushed: 3 months ago - Stars: 726 - Forks: 79
kyegomez/KosmosG
My implementation of the model KosmosG from "KOSMOS-G: Generating Images in Context with Multimodal Large Language Models"
Language: Python - Size: 2.79 MB - Last synced: 10 days ago - Pushed: 22 days ago - Stars: 12 - Forks: 0
cmhungsteve/Awesome-Transformer-Attention
An ultimately comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites
Size: 4.96 MB - Last synced: 17 days ago - Pushed: 18 days ago - Stars: 4,305 - Forks: 474
lucidrains/local-attention
An implementation of local windowed attention for language modeling
Language: Python - Size: 34.1 MB - Last synced: 6 days ago - Pushed: about 2 months ago - Stars: 341 - Forks: 35
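Local windowed attention restricts each token to attending over a fixed-size window of recent positions, trading global receptive field for linear-in-window cost. A minimal causal sketch in numpy (function name and masking scheme are my own simplification, not this repo's API):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def local_causal_attention(q, k, v, window=4):
    """Causal attention where position i attends only to [i-window+1, i].

    q, k, v: (seq_len, dim). Returns (seq_len, dim).
    """
    n, d = q.shape
    scores = q @ k.T / np.sqrt(d)
    i, j = np.indices((n, n))
    # Mask out future tokens and anything beyond the local window.
    mask = (j > i) | (i - j >= window)
    scores = np.where(mask, -np.inf, scores)
    return softmax(scores) @ v
```

With `window=1` each token attends only to itself, so the output equals `v`; practical implementations compute this block-wise instead of materializing the full n×n score matrix.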
lucidrains/equiformer-pytorch
Implementation of the Equiformer, SE3/E3 equivariant attention network that reaches new SOTA, and adopted for use by EquiFold for protein folding
Language: Python - Size: 17.4 MB - Last synced: 5 days ago - Pushed: 5 months ago - Stars: 228 - Forks: 22
lucidrains/CALM-pytorch
Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google Deepmind
Language: Python - Size: 938 KB - Last synced: 6 days ago - Pushed: 4 months ago - Stars: 136 - Forks: 9
lucidrains/flash-cosine-sim-attention
Implementation of fused cosine similarity attention in the same style as Flash Attention
Language: Cuda - Size: 34.4 MB - Last synced: about 1 month ago - Pushed: over 1 year ago - Stars: 192 - Forks: 9
lucidrains/recurrent-memory-transformer-pytorch
Implementation of Recurrent Memory Transformer, Neurips 2022 paper, in Pytorch
Language: Python - Size: 34.3 MB - Last synced: 6 days ago - Pushed: 4 months ago - Stars: 384 - Forks: 15
lucidrains/muse-maskgit-pytorch
Implementation of Muse: Text-to-Image Generation via Masked Generative Transformers, in Pytorch
Language: Python - Size: 285 KB - Last synced: 6 days ago - Pushed: 3 months ago - Stars: 821 - Forks: 78
lucidrains/simple-hierarchical-transformer
Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT
Language: Python - Size: 34.1 MB - Last synced: 7 days ago - Pushed: 6 months ago - Stars: 198 - Forks: 10
lucidrains/make-a-video-pytorch
Implementation of Make-A-Video, new SOTA text to video generator from Meta AI, in Pytorch
Language: Python - Size: 227 KB - Last synced: 17 days ago - Pushed: about 1 month ago - Stars: 1,852 - Forks: 177
pprp/awesome-attention-mechanism-in-cv
Awesome List of Attention Modules and Plug&Play Modules in Computer Vision
Language: Python - Size: 3.25 MB - Last synced: 17 days ago - Pushed: about 1 year ago - Stars: 968 - Forks: 160
lucidrains/taylor-series-linear-attention
Explorations into the recently proposed Taylor Series Linear Attention
Language: Python - Size: 776 KB - Last synced: 6 days ago - Pushed: 5 months ago - Stars: 80 - Forks: 2
lucidrains/flash-attention-jax
Implementation of Flash Attention in Jax
Language: Python - Size: 181 KB - Last synced: 19 days ago - Pushed: 3 months ago - Stars: 179 - Forks: 23
lucidrains/magvit2-pytorch
Implementation of MagViT2 Tokenizer in Pytorch
Language: Python - Size: 1.79 MB - Last synced: 23 days ago - Pushed: 23 days ago - Stars: 427 - Forks: 27
lucidrains/meshgpt-pytorch
Implementation of MeshGPT, SOTA Mesh generation using Attention, in Pytorch
Language: Python - Size: 1.05 MB - Last synced: about 1 month ago - Pushed: about 2 months ago - Stars: 533 - Forks: 46
lucidrains/q-transformer
Implementation of Q-Transformer, Scalable Offline Reinforcement Learning via Autoregressive Q-Functions, out of Google Deepmind
Language: Python - Size: 1.44 MB - Last synced: 26 days ago - Pushed: about 1 month ago - Stars: 289 - Forks: 15
landskape-ai/triplet-attention
Official PyTorch Implementation for "Rotate to Attend: Convolutional Triplet Attention Module." [WACV 2021]
Language: Jupyter Notebook - Size: 9.71 MB - Last synced: 24 days ago - Pushed: over 2 years ago - Stars: 385 - Forks: 46
kyegomez/MobileVLM
Implementation of the LDP module block in PyTorch and Zeta from the paper: "MobileVLM: A Fast, Strong and Open Vision Language Assistant for Mobile Devices"
Language: Python - Size: 2.17 MB - Last synced: 27 days ago - Pushed: 3 months ago - Stars: 12 - Forks: 0
kyegomez/MambaTransformer
Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling
Language: Python - Size: 2.25 MB - Last synced: 27 days ago - Pushed: 3 months ago - Stars: 124 - Forks: 11
kyegomez/MambaFormer
Implementation of MambaFormer in Pytorch ++ Zeta from the paper: "Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks"
Language: Python - Size: 2.17 MB - Last synced: 17 days ago - Pushed: about 1 month ago - Stars: 14 - Forks: 1
lucidrains/mmdit
Implementation of a single layer of the MMDiT, proposed in Stable Diffusion 3, in Pytorch
Language: Python - Size: 167 KB - Last synced: 28 days ago - Pushed: 29 days ago - Stars: 127 - Forks: 2
kyegomez/MGQA
An open-source implementation of grouped multi-query attention from the paper "GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints"
Language: Python - Size: 248 KB - Last synced: 29 days ago - Pushed: 6 months ago - Stars: 7 - Forks: 0
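In grouped-query attention, several query heads share a single key/value head, shrinking the KV cache relative to full multi-head attention without collapsing to one shared head as in MQA. A minimal sketch in numpy (shapes and function name are my own assumptions, not this repo's interface):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def grouped_query_attention(q, k, v, n_kv_heads):
    """q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d).

    Each group of n_q_heads // n_kv_heads query heads shares one K/V head.
    """
    n_q_heads, seq, d = q.shape
    group = n_q_heads // n_kv_heads
    # Broadcast each shared K/V head to every query head in its group.
    k = np.repeat(k, group, axis=0)
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    return softmax(scores) @ v  # (n_q_heads, seq, d)
```

Setting `n_kv_heads = n_q_heads` recovers standard multi-head attention, and `n_kv_heads = 1` recovers multi-query attention, which is exactly the interpolation the GQA paper studies.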
lucidrains/block-recurrent-transformer-pytorch
Implementation of Block Recurrent Transformer - Pytorch
Language: Python - Size: 34.2 MB - Last synced: 6 days ago - Pushed: 11 months ago - Stars: 205 - Forks: 18
JulesBelveze/time-series-autoencoder
PyTorch Dual-Attention LSTM-Autoencoder For Multivariate Time Series
Language: Python - Size: 360 KB - Last synced: 26 days ago - Pushed: 8 months ago - Stars: 585 - Forks: 63
lucidrains/iTransformer
Unofficial implementation of iTransformer - SOTA Time Series Forecasting using Attention networks, out of Tsinghua / Ant group
Language: Python - Size: 204 KB - Last synced: 29 days ago - Pushed: 29 days ago - Stars: 340 - Forks: 23
kyegomez/LongNet
Implementation of plug-and-play attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens"
Language: Python - Size: 40.3 MB - Last synced: 29 days ago - Pushed: 5 months ago - Stars: 651 - Forks: 62
kyegomez/FlashMHA
A simple PyTorch implementation of flash multi-head attention
Language: Jupyter Notebook - Size: 85 KB - Last synced: 29 days ago - Pushed: 4 months ago - Stars: 12 - Forks: 1
lucidrains/CoLT5-attention
Implementation of the conditionally routed attention in the CoLT5 architecture, in Pytorch
Language: Python - Size: 181 KB - Last synced: 26 days ago - Pushed: 4 months ago - Stars: 216 - Forks: 12
kyegomez/CELESTIAL-1
Omni-Modality Processing, Understanding, and Generation
Language: Python - Size: 2.49 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 6 - Forks: 0
lucidrains/PaLM-rlhf-pytorch
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
Language: Python - Size: 34.3 MB - Last synced: about 1 month ago - Pushed: 5 months ago - Stars: 7,595 - Forks: 658
lucidrains/coordinate-descent-attention
Implementation of an Attention layer where each head can attend to more than just one token, using coordinate descent to pick topk
Language: Python - Size: 34.1 MB - Last synced: about 1 month ago - Pushed: 11 months ago - Stars: 44 - Forks: 1
lucidrains/audiolm-pytorch
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch
Language: Python - Size: 507 KB - Last synced: about 1 month ago - Pushed: 4 months ago - Stars: 2,252 - Forks: 241
lucidrains/medical-chatgpt
Implementation of ChatGPT, but tailored towards primary care medicine; the reward is collecting patient histories thoroughly and efficiently and arriving at a reasonable differential diagnosis
Language: Python - Size: 27.3 KB - Last synced: about 1 month ago - Pushed: 8 months ago - Stars: 310 - Forks: 32
lucidrains/autoregressive-linear-attention-cuda
CUDA implementation of autoregressive linear attention, with all the latest research findings
Language: Python - Size: 5.86 KB - Last synced: about 1 month ago - Pushed: about 1 year ago - Stars: 45 - Forks: 3
lucidrains/robotic-transformer-pytorch
Implementation of RT1 (Robotic Transformer) in Pytorch
Language: Python - Size: 159 KB - Last synced: about 1 month ago - Pushed: 6 months ago - Stars: 340 - Forks: 31
lucidrains/diffusion-policy
Implementation of Diffusion Policy, Toyota Research's supposed breakthrough in leveraging DDPMs for learning policies for real-world Robotics
Language: Python - Size: 1.02 MB - Last synced: about 1 month ago - Pushed: 5 months ago - Stars: 64 - Forks: 1
lucidrains/kalman-filtering-attention
Implementation of the Kalman Filtering Attention proposed in "Kalman Filtering Attention for User Behavior Modeling in CTR Prediction"
Size: 4.88 KB - Last synced: about 1 month ago - Pushed: 8 months ago - Stars: 54 - Forks: 3
ssghost/var-attn Fork of harvardnlp/var-attn
Language: Python - Size: 93.2 MB - Last synced: about 2 months ago - Pushed: almost 6 years ago - Stars: 0 - Forks: 0
lucidrains/pause-transformer
Yet another random morning idea, quickly tried (architecture shared if it works): allowing the transformer to pause for any amount of time on any token
Language: Python - Size: 659 KB - Last synced: about 1 month ago - Pushed: 8 months ago - Stars: 42 - Forks: 0
kyegomez/Jamba
PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model"
Language: Python - Size: 2.17 MB - Last synced: about 2 months ago - Pushed: 2 months ago - Stars: 49 - Forks: 0
lucidrains/recurrent-interface-network-pytorch
Implementation of Recurrent Interface Network (RIN), for highly efficient generation of images and video without cascading networks, in Pytorch
Language: Python - Size: 731 KB - Last synced: about 1 month ago - Pushed: 4 months ago - Stars: 187 - Forks: 14
kyegomez/SparseAttention
Pytorch Implementation of the sparse attention from the paper: "Generating Long Sequences with Sparse Transformers"
Language: Python - Size: 2.16 MB - Last synced: about 2 months ago - Pushed: 3 months ago - Stars: 25 - Forks: 1
lucidrains/transframer-pytorch
Implementation of Transframer, Deepmind's U-net + Transformer architecture for up to 30 seconds video generation, in Pytorch
Language: Python - Size: 159 KB - Last synced: about 1 month ago - Pushed: almost 2 years ago - Stars: 65 - Forks: 5
lucidrains/zorro-pytorch
Implementation of Zorro, Masked Multimodal Transformer, in Pytorch
Language: Python - Size: 197 KB - Last synced: 19 days ago - Pushed: 8 months ago - Stars: 92 - Forks: 6
lucidrains/Mega-pytorch
Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena
Language: Python - Size: 34.2 MB - Last synced: 25 days ago - Pushed: 10 months ago - Stars: 201 - Forks: 11
kyegomez/PaLM2-VAdapter
Implementation of "PaLM2-VAdapter:" from the multi-modal model paper: "PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter"
Language: Python - Size: 2.17 MB - Last synced: about 2 months ago - Pushed: 3 months ago - Stars: 14 - Forks: 0
pouyasattari/Automatic-Generative-Code-with-Neural-Machine-Translation-for-data-security-purpose
Transformers, including T5 and MarianMT, enable effective understanding and generation of complex programming code, and can therefore help in the data security field. Let's see how!
Language: Jupyter Notebook - Size: 130 KB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 1 - Forks: 0
kyegomez/Hedgehog
Implementation of the model "Hedgehog" from the paper: "The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry"
Language: Python - Size: 2.16 MB - Last synced: about 2 months ago - Pushed: 3 months ago - Stars: 5 - Forks: 0
lucidrains/agent-attention-pytorch
Implementation of Agent Attention in Pytorch
Language: Python - Size: 514 KB - Last synced: about 1 month ago - Pushed: 6 months ago - Stars: 73 - Forks: 1
lucidrains/flash-genomics-model
My own attempt at a long context genomics model, leveraging recent advances in long context attention modeling (Flash Attention + other hierarchical methods)
Language: Python - Size: 12.7 KB - Last synced: about 1 month ago - Pushed: 11 months ago - Stars: 52 - Forks: 5
lucidrains/equiformer-diffusion
Implementation of Denoising Diffusion for protein design, but using the new Equiformer (successor to SE3 Transformers) with some additional improvements
Size: 2.93 KB - Last synced: about 1 month ago - Pushed: over 1 year ago - Stars: 55 - Forks: 3
cbaziotis/neat-vision
Neat (Neural Attention) Vision is a framework-agnostic visualization tool for the attention mechanisms of deep-learning models for Natural Language Processing (NLP) tasks.
Language: Vue - Size: 25.4 MB - Last synced: 3 months ago - Pushed: about 6 years ago - Stars: 250 - Forks: 26
arnavdantuluri/long-context-transformers
A repository for training transformers to access longer context in causal language models; most of these methods are still in testing. Try them out if you'd like, but please share your results so we don't duplicate work :)
Language: Python - Size: 188 KB - Last synced: 3 months ago - Pushed: 11 months ago - Stars: 5 - Forks: 2
selfcontrol7/Korean_Voice_Phishing_Detection
All codes implemented on Korean voice phishing detection papers
Language: Jupyter Notebook - Size: 146 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 5 - Forks: 3
dynastes-team/dynastes
A collection of layers, ops, utilities and more for TensorFlow 2.0 high-level API Keras
Language: Python - Size: 658 KB - Last synced: 25 days ago - Pushed: about 4 years ago - Stars: 9 - Forks: 0
KevinAtsou/formerslab
A simple set of Transformer building blocks that can be used to build language models
Language: Python - Size: 27.3 KB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 0 - Forks: 0
JoanaR/multi-mode-CNN-pytorch
A PyTorch implementation of the Multi-Mode CNN to reconstruct Chlorophyll-a time series in the global ocean from oceanic and atmospheric physical drivers
Language: Jupyter Notebook - Size: 8.97 MB - Last synced: 4 months ago - Pushed: about 1 year ago - Stars: 6 - Forks: 0
programmer290399/pyqna
A simple python package for question answering.
Language: Python - Size: 3.95 MB - Last synced: 2 months ago - Pushed: about 1 year ago - Stars: 8 - Forks: 5
DIAGNijmegen/prostateMR_3D-CAD-csPCa
Hierarchical probabilistic 3D U-Net, with attention mechanisms (Attention U-Net, SEResNet) and a nested decoder structure with deep supervision (UNet++). Built in TensorFlow 2.5. Configured for voxel-level clinically significant prostate cancer detection in multi-channel 3D bpMRI scans.
Language: Python - Size: 21.2 MB - Last synced: 8 months ago - Pushed: over 2 years ago - Stars: 36 - Forks: 6
GiantPandaCV/yolov3-point
Learning YOLOv3 code from scratch
Language: Jupyter Notebook - Size: 94.6 MB - Last synced: 7 months ago - Pushed: about 2 years ago - Stars: 203 - Forks: 53
vene/sparse-structured-attention
Sparse and structured neural attention mechanisms
Language: Python - Size: 102 KB - Last synced: 7 months ago - Pushed: almost 4 years ago - Stars: 215 - Forks: 36
super-m-a-n/covid19-vaccine-tweets-sentiment-analysis
Deep learning methods for sentiment analysis classification of covid-19 vaccination tweets
Language: Jupyter Notebook - Size: 3.89 MB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0
umapornp/PAC-MAN
The PyTorch implementation for the IEEE Access paper: "PAC-MAN: Multi-Relation Network in Social Community for Personalized Hashtag Recommendation".
Language: Python - Size: 318 KB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 1 - Forks: 0
umapornp/ARERec
The implementation for the IEEE Access paper: "ARERec: Attentive Local Interaction Model for Sequential Recommendation".
Language: Python - Size: 7 MB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0
monk1337/Various-Attention-mechanisms
This repository contains various attention mechanisms, such as Bahdanau, soft, additive, and hierarchical attention, implemented in PyTorch, TensorFlow, and Keras
Language: Python - Size: 643 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 98 - Forks: 20
soran-ghaderi/make-a-video
"Make-A-Video", new SOTA text to video by Meta-FAIR - Tensorflow
Language: Python - Size: 705 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 10 - Forks: 2
elbuco1/AttentionMechanismsTrajectoryPrediction
In this repository, one can find the code for my master's thesis project. The main goal of the project was to study and improve attention mechanisms for trajectory prediction of moving agents.
Language: Python - Size: 13.8 MB - Last synced: over 1 year ago - Pushed: over 4 years ago - Stars: 41 - Forks: 11
veqtor/veqtor_keras
A collection of my custom TensorFlow-Keras 2.0+ layers, utils, and such
Language: Python - Size: 14.6 KB - Last synced: over 1 year ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0
johnsmithm/multi-heads-attention-image-classification
Multi-head attention for image classification
Language: Python - Size: 2.93 KB - Last synced: over 1 year ago - Pushed: about 6 years ago - Stars: 72 - Forks: 34
acadTags/Automated-Social-Annotation
Joint Multi-label Attention Network (JMAN)
Language: Python - Size: 59.1 MB - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 11 - Forks: 3
subho406/Sequence-to-Sequence-and-Attention-from-scratch-using-Tensorflow
Sequence-to-sequence and attention from scratch using TensorFlow
Language: Jupyter Notebook - Size: 33.2 KB - Last synced: over 1 year ago - Pushed: over 6 years ago - Stars: 29 - Forks: 16
bulatkh/image_captioning
Master Project on Image Captioning using Supervised Deep Learning Methods
Language: Jupyter Notebook - Size: 17.4 MB - Last synced: over 1 year ago - Pushed: over 1 year ago - Stars: 3 - Forks: 2
ssea-lab/DL4ETI
Computer-aided diagnosis in histopathological images of the Endometrium
Language: Python - Size: 38.7 MB - Last synced: over 1 year ago - Pushed: about 4 years ago - Stars: 16 - Forks: 4
KB9/BraccioVisualAttention
An active vision system which builds a 3D environment map autonomously using visual attention mechanisms.
Language: Python - Size: 102 MB - Last synced: over 1 year ago - Pushed: about 6 years ago - Stars: 0 - Forks: 0