GitHub topics: multihead-attention
Separius/awesome-fast-attention
list of efficient attention modules
Language: Python - Size: 156 KB - Last synced at: 2 days ago - Pushed at: over 3 years ago - Stars: 1,004 - Forks: 108

yshirai999/Machine-Learning-Lectures
Lectures on reinforcement learning and self-attention
Language: Jupyter Notebook - Size: 178 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

Syeda-Farhat/awesome-Transformers-For-Segmentation
Semantic segmentation is an important task in computer vision, and its applications have grown in popularity over the last decade. This repository groups publications that use various forms of segmentation; in particular, every paper is built on a transformer.
Size: 300 KB - Last synced at: 10 days ago - Pushed at: about 2 months ago - Stars: 32 - Forks: 2

JaewonSon37/Neural_Networks_and_Deep_Learning1
Language: Jupyter Notebook - Size: 7.6 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

Bhazantri/EvoLingua
EvoLingua: A Scalable Mixture-of-Experts Language Model Framework
Language: Python - Size: 36.1 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0
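
Since this entry is defined by its mixture-of-experts design, a minimal sketch of a top-1 gated MoE feed-forward layer may help illustrate the core idea. The class name, sizes, and routing below are illustrative assumptions, not EvoLingua's actual API.

```python
# A minimal top-1 gated mixture-of-experts feed-forward layer (Switch-style).
# All names and dimensions are illustrative, not taken from EvoLingua.
import torch
import torch.nn as nn

class TopOneMoE(nn.Module):
    def __init__(self, d_model=64, d_hidden=256, n_experts=4):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts)  # router: token -> expert scores
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)])

    def forward(self, x):                           # x: (tokens, d_model)
        scores = self.gate(x).softmax(dim=-1)       # routing probabilities
        top_p, top_idx = scores.max(dim=-1)         # top-1 expert per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = top_idx == i                     # tokens routed to expert i
            if mask.any():                          # scale by gate probability
                out[mask] = top_p[mask, None] * expert(x[mask])
        return out

x = torch.randn(10, 64)
print(TopOneMoE()(x).shape)  # torch.Size([10, 64])
```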

yflyzhang/AnnotatedTransformer
Language: Jupyter Notebook - Size: 29.9 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

tensorops/TransformerX
Flexible Python library providing building blocks (layers) for reproducible Transformers research (TensorFlow, PyTorch, and JAX)
Language: Python - Size: 508 KB - Last synced at: 22 days ago - Pushed at: over 1 year ago - Stars: 53 - Forks: 8

tlatkowski/multihead-siamese-nets
Implementation of Siamese neural networks built on a multi-head attention mechanism for the text semantic similarity task.
Language: Jupyter Notebook - Size: 1.43 MB - Last synced at: 7 days ago - Pushed at: about 2 years ago - Stars: 182 - Forks: 43
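
The siamese setup described above can be sketched in a few lines: one shared attention-based encoder embeds both texts, and similarity is scored on the pooled outputs. The encoder configuration and mean pooling below are assumptions for illustration, not the repository's model.

```python
# A minimal siamese-similarity sketch: a shared Transformer encoder embeds
# both inputs; cosine similarity scores the pair. Sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
    num_layers=2)

def embed(tokens):                      # tokens: (batch, seq, 64), pre-embedded
    return encoder(tokens).mean(dim=1)  # shared weights + mean pooling

a, b = torch.randn(3, 12, 64), torch.randn(3, 12, 64)
similarity = F.cosine_similarity(embed(a), embed(b))  # one score per pair
print(similarity.shape)  # torch.Size([3])
```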

iafarhan/causal-synthesizer-multihead-attention
Synthesizer self-attention is a recent alternative to causal self-attention with potential efficiency benefits from removing the query-key dot product.
Language: Python - Size: 12.7 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 12 - Forks: 0
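
For readers unfamiliar with the technique, here is a minimal sketch of a causal dense Synthesizer layer in the spirit of Tay et al. (2020), where the attention matrix is predicted from each token alone rather than from a query-key dot product. All names and sizes are illustrative, not this repository's code.

```python
# Causal Dense Synthesizer attention: a small MLP synthesizes attention
# logits per token, replacing Q.K^T. Shapes and names are illustrative.
import torch
import torch.nn as nn

class CausalDenseSynthesizer(nn.Module):
    def __init__(self, d_model=64, max_len=128):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(d_model, d_model), nn.ReLU(),
                                  nn.Linear(d_model, max_len))  # token -> position scores
        self.value = nn.Linear(d_model, d_model)

    def forward(self, x):                 # x: (batch, seq, d_model), seq <= max_len
        b, t, _ = x.shape
        scores = self.proj(x)[:, :, :t]   # synthesized logits, no dot product
        causal = torch.tril(torch.ones(t, t, dtype=torch.bool, device=x.device))
        scores = scores.masked_fill(~causal, float("-inf"))  # hide the future
        attn = scores.softmax(dim=-1)     # (batch, seq, seq)
        return attn @ self.value(x)

x = torch.randn(2, 16, 64)
print(CausalDenseSynthesizer()(x).shape)  # torch.Size([2, 16, 64])
```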

changwookjun/Transformer
A chatbot built with TensorFlow (the model is a Transformer); in Korean.
Language: Python - Size: 526 KB - Last synced at: about 1 month ago - Pushed at: over 6 years ago - Stars: 29 - Forks: 13

aniketDash7/multihead_attention_implementation
Implementation of the multi-head attention mechanism using NumPy and PyTorch
Language: Jupyter Notebook - Size: 13.7 KB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0
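
As a reference point for entries like this one, from-scratch multi-head scaled dot-product attention fits in a few lines of NumPy; the random matrices below stand in for learned projections and are not taken from the repository.

```python
# Multi-head scaled dot-product attention in plain NumPy. The weight
# matrices are random stand-ins for learned projections.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multihead_attention(x, n_heads=4):
    seq, d_model = x.shape
    d_head = d_model // n_heads
    rng = np.random.default_rng(0)
    Wq, Wk, Wv, Wo = [rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
                      for _ in range(4)]

    def split(m):  # split the model dimension into heads: (heads, seq, d_head)
        return m.reshape(seq, n_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ Wq), split(x @ Wk), split(x @ Wv)
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d_head))  # (heads, seq, seq)
    heads = attn @ v                                            # (heads, seq, d_head)
    return heads.transpose(1, 0, 2).reshape(seq, d_model) @ Wo  # concat + project

x = np.random.default_rng(1).standard_normal((10, 32))
print(multihead_attention(x).shape)  # (10, 32)
```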

dcarpintero/transformer101
Annotated vanilla implementation in PyTorch of the Transformer model introduced in 'Attention Is All You Need'.
Language: Jupyter Notebook - Size: 215 KB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

Pranavhc/Shakespearean-Text-Generator
A decoder-only Transformer model for text generation.
Language: Jupyter Notebook - Size: 114 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

Resh-97/MixSeq-Connecting-Macroscopic-Time-Series-Forecasting-with-Microscopic-Time-Series-Data
Testing the reproducibility of the paper MixSeq. Under the assumption that macroscopic time series follow a mixture distribution, the authors hypothesise that lower variance in the constituent latent mixture components could improve estimation of the macroscopic time series.
Language: Jupyter Notebook - Size: 93.8 MB - Last synced at: about 2 months ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

JivanAcharya/Shakespeare-GPT
Implementing a GPT (Generative Pre-trained Transformer) model from scratch on Shakespeare's works.
Language: Jupyter Notebook - Size: 37.9 MB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

Group-1-ET/English-Telugu-Translator
Deployed locally
Language: Python - Size: 26.3 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

jaydeepthik/Nano-GPT
A simple GPT with multi-head attention over character-level tokens, inspired by Andrej Karpathy's video lectures: https://github.com/karpathy/ng-video-lecture
Language: Jupyter Notebook - Size: 429 KB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0
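
The heart of such a nano GPT is causal multi-head self-attention, sketched below in PyTorch in the spirit of Karpathy's lecture code; the fused QKV projection and hyperparameters are illustrative choices, not this repository's exact settings.

```python
# Causal multi-head self-attention, nanoGPT-style. Hyperparameters are
# illustrative, not this repository's settings.
import torch
import torch.nn as nn

class CausalSelfAttention(nn.Module):
    def __init__(self, d_model=64, n_heads=4, block_size=128):
        super().__init__()
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)  # fused Q, K, V projection
        self.out = nn.Linear(d_model, d_model)
        mask = torch.tril(torch.ones(block_size, block_size, dtype=torch.bool))
        self.register_buffer("mask", mask)          # causal mask, cached

    def forward(self, x):                            # x: (batch, seq, d_model)
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k, v = (m.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
                   for m in (q, k, v))               # (batch, heads, seq, d_head)
        att = (q @ k.transpose(-2, -1)) / self.d_head ** 0.5
        att = att.masked_fill(~self.mask[:t, :t], float("-inf")).softmax(dim=-1)
        y = (att @ v).transpose(1, 2).reshape(b, t, d)  # re-merge the heads
        return self.out(y)

x = torch.randn(2, 16, 64)
print(CausalSelfAttention()(x).shape)  # torch.Size([2, 16, 64])
```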

achiverram28/FedLSF-DCOSS
Official implementation of the paper "FedLSF: Federated Local Graph Learning via Specformers"
Language: Python - Size: 3.02 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

puskal-khadka/Transformer
Transformer model based on the research paper: "Attention Is All You Need"
Language: Python - Size: 16.2 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

hrithickcodes/transformer-tf
This repository contains the code for the paper "Attention Is All You Need", i.e. the Transformer.
Language: Jupyter Notebook - Size: 30.4 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 6 - Forks: 1

jk96491/Advanced_Models
Provides various well-known neural network models (DCGAN, VAE, ResNet, etc.).
Language: Python - Size: 1.98 MB - Last synced at: 6 months ago - Pushed at: almost 4 years ago - Stars: 50 - Forks: 13

OscarHChung/GPT-Model
GPT model that can take a text file from anywhere on the internet and imitate the linguistic style of the text
Language: Python - Size: 528 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

MirunaPislar/multi-head-attention-labeller
Joint text classification on multiple levels with multiple labels, using a multi-head attention mechanism to wire two prediction tasks together.
Language: Python - Size: 4.13 MB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 15 - Forks: 2

rimo10/ViT
Vision Transformer in pytorch
Language: Jupyter Notebook - Size: 172 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

aman-17/3dprinting-extrusion-detection
3D Printing Extrusion Detection using Multi-Head Attention Model
Language: Python - Size: 9.04 MB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

abhilash1910/GraphAttentionNetworks
This package is a TensorFlow 2/Keras implementation of Graph Attention Network embeddings and also provides a trainable layer for multi-head graph attention.
Language: Python - Size: 142 KB - Last synced at: 8 days ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 0
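
The repository targets TensorFlow 2/Keras; as an illustration of the underlying idea, here is a minimal multi-head graph attention layer in the style of Veličković et al. (2018), re-sketched in PyTorch. Shapes and names are assumptions, not the package's API.

```python
# Multi-head graph attention (GAT-style): per-edge logits from source and
# destination features, softmax over neighbours, heads concatenated.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadGAT(nn.Module):
    def __init__(self, d_in=16, d_out=8, n_heads=4):
        super().__init__()
        self.W = nn.Linear(d_in, n_heads * d_out, bias=False)
        self.a = nn.Parameter(torch.randn(n_heads, 2 * d_out) * 0.1)
        self.n_heads, self.d_out = n_heads, d_out

    def forward(self, x, adj):            # x: (nodes, d_in), adj: (nodes, nodes) bool
        n = x.size(0)
        h = self.W(x).view(n, self.n_heads, self.d_out)      # per-head features
        src = (h * self.a[:, :self.d_out]).sum(-1)           # (n, heads)
        dst = (h * self.a[:, self.d_out:]).sum(-1)           # (n, heads)
        e = F.leaky_relu(src.unsqueeze(1) + dst.unsqueeze(0), 0.2)  # (n, n, heads)
        e = e.masked_fill(~adj.unsqueeze(-1), float("-inf")) # keep real edges only
        alpha = e.softmax(dim=1)                             # normalize over neighbours
        out = torch.einsum("ijh,jhd->ihd", alpha, h)         # aggregate neighbours
        return out.reshape(n, self.n_heads * self.d_out)     # concatenate the heads

adj = torch.eye(5, dtype=torch.bool) | (torch.rand(5, 5) > 0.5)  # self-loops + edges
print(MultiHeadGAT()(torch.randn(5, 16), adj).shape)  # torch.Size([5, 32])
```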

meme2515/transformer_pytorch
PyTorch implementation of the Transformer architecture from the paper "Attention Is All You Need". Includes an implementation of the attention mechanism.
Language: Python - Size: 318 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

bkhanal-11/transformers
An implementation of the Transformer from scratch, as presented in the paper "Attention Is All You Need".
Language: Python - Size: 291 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 0

datnnt1997/multi-head_self-attention
A faster PyTorch implementation of multi-head self-attention
Language: Jupyter Notebook - Size: 737 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 41 - Forks: 9
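
For comparison with from-scratch implementations like the one above, PyTorch also ships a built-in module; the snippet below uses the standard torch.nn.MultiheadAttention API and is unrelated to this repository's code.

```python
# Self-attention via PyTorch's built-in multi-head attention module.
import torch
import torch.nn as nn

mha = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
x = torch.randn(2, 16, 64)      # (batch, seq, embed)
out, weights = mha(x, x, x)     # self-attention: Q = K = V = x
print(out.shape, weights.shape) # (2, 16, 64) and (2, 16, 16)
```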

shawnhan108/AutoTruckX
An experimental project for autonomous-vehicle driving perception, with steering-angle prediction and semantic segmentation using a combination of UNet, attention, and transformers.
Language: Python - Size: 399 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 5 - Forks: 4

akurniawan/pytorch-transformer
Implementation of "Attention is All You Need" paper
Language: Python - Size: 1.08 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 33 - Forks: 12

whsqkaak/attentions_pytorch
A repository of attention-mechanism implementations in PyTorch.
Language: Python - Size: 10.7 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

mpalaourg/PGL-SUM Fork of e-apostolidis/PGL-SUM
A PyTorch Implementation of PGL-SUM from "Combining Global and Local Attention with Positional Encoding for Video Summarization", Proc. IEEE ISM 2021
Language: Python - Size: 89.6 MB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 1

Mascerade/scale-transformer-encoder
A Transformer Encoder where the embedding size can be down-sized.
Language: Python - Size: 104 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 1

vasisthasinghal/Machine_Translation
Machine translation models (with and without attention) for translating sentences from Tamil to Hindi. Transformer models are also applied to the same task and their performance is compared.
Language: Jupyter Notebook - Size: 78.8 MB - Last synced at: 6 months ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

sarthak7509/ConversationalAi
An implementation of the well-known multi-head attention model for conversational AI. The model is trained on both the Cornell movie dialogue dataset and Microsoft's WikiQA dataset.
Language: Jupyter Notebook - Size: 64.4 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0
