An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: multihead-attention

Separius/awesome-fast-attention πŸ“¦

List of efficient attention modules

Language: Python - Size: 156 KB - Last synced at: 2 days ago - Pushed at: over 3 years ago - Stars: 1,004 - Forks: 108

yshirai999/Machine-Learning-Lectures

Lectures on reinforcement learning and self-attention

Language: Jupyter Notebook - Size: 178 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

Syeda-Farhat/awesome-Transformers-For-Segmentation

Semantic segmentation is an important task in computer vision, and its applications have grown in popularity over the last decade. We grouped the publications that use various forms of segmentation in this repository. In particular, every paper is built on a transformer.

Size: 300 KB - Last synced at: 10 days ago - Pushed at: about 2 months ago - Stars: 32 - Forks: 2

JaewonSon37/Neural_Networks_and_Deep_Learning1

Language: Jupyter Notebook - Size: 7.6 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

Bhazantri/EvoLingua

EvoLingua: A Scalable Mixture-of-Experts Language Model Framework

Language: Python - Size: 36.1 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

yflyzhang/AnnotatedTransformer

Language: Jupyter Notebook - Size: 29.9 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

tensorops/TransformerX

Flexible Python library providing building blocks (layers) for reproducible Transformers research (TensorFlow βœ…, PyTorch πŸ”œ, and JAX πŸ”œ)

Language: Python - Size: 508 KB - Last synced at: 22 days ago - Pushed at: over 1 year ago - Stars: 53 - Forks: 8

tlatkowski/multihead-siamese-nets

Implementation of Siamese neural networks built upon a multihead attention mechanism for the text semantic similarity task.

Language: Jupyter Notebook - Size: 1.43 MB - Last synced at: 7 days ago - Pushed at: about 2 years ago - Stars: 182 - Forks: 43
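
The Siamese idea, a single shared encoder applied to both sentences, can be illustrated with a minimal PyTorch sketch (illustrative names and pooling choice, not this repository's code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseAttentionEncoder(nn.Module):
    """Shared encoder: self-attention over token embeddings, mean-pooled.
    Both sentences pass through the *same* weights (the Siamese part)."""
    def __init__(self, vocab_size: int, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def encode(self, ids):                         # ids: (batch, seq_len)
        x = self.embed(ids)
        out, _ = self.attn(x, x, x)                # multihead self-attention
        return out.mean(dim=1)                     # (batch, d_model)

    def forward(self, ids_a, ids_b):
        # similarity score between the two pooled sentence vectors
        return F.cosine_similarity(self.encode(ids_a), self.encode(ids_b))
```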

iafarhan/causal-synthesizer-multihead-attention

Synthesizer self-attention is a recent alternative to causal self-attention with potential efficiency benefits, since it removes the query-key dot product.

Language: Python - Size: 12.7 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 12 - Forks: 0
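
For orientation, dense Synthesizer attention (Tay et al., 2020) predicts attention scores from each token alone instead of comparing query-key pairs. A minimal causal sketch under assumed shapes, not this repository's code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseSynthesizerAttention(nn.Module):
    """Attention weights are synthesized per position by an MLP;
    no query-key dot product is computed."""
    def __init__(self, d_model: int, max_len: int):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(d_model, d_model),
            nn.ReLU(),
            nn.Linear(d_model, max_len),           # one score per source position
        )
        self.value = nn.Linear(d_model, d_model)

    def forward(self, x):                          # x: (batch, seq_len, d_model)
        seq_len = x.size(1)
        scores = self.score(x)[..., :seq_len]      # (batch, seq_len, seq_len)
        # causal mask: a position may only attend to itself and earlier positions
        future = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool,
                                       device=x.device), diagonal=1)
        scores = scores.masked_fill(future, float("-inf"))
        return F.softmax(scores, dim=-1) @ self.value(x)
```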

changwookjun/Transformer

Chatbot built with TensorFlow (the model is a Transformer); documentation in Korean.

Language: Python - Size: 526 KB - Last synced at: about 1 month ago - Pushed at: over 6 years ago - Stars: 29 - Forks: 13

aniketDash7/multihead_attention_implementation

Implementation of the multihead attention mechanism using NumPy and PyTorch.

Language: Jupyter Notebook - Size: 13.7 KB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0
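
For reference, a minimal NumPy sketch of multihead scaled dot-product attention, roughly what such an implementation involves (illustrative, not this repository's code):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multihead_attention(x, Wq, Wk, Wv, Wo, n_heads):
    """x: (seq_len, d_model); each W: (d_model, d_model)."""
    seq_len, d_model = x.shape
    d_head = d_model // n_heads

    def project(W):                     # -> (n_heads, seq_len, d_head)
        return (x @ W).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)

    q, k, v = project(Wq), project(Wk), project(Wv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)     # (n_heads, seq, seq)
    out = softmax(scores) @ v                               # (n_heads, seq, d_head)
    out = out.transpose(1, 0, 2).reshape(seq_len, d_model)  # concatenate heads
    return out @ Wo

# toy usage with random weights
rng = np.random.default_rng(0)
x = rng.normal(size=(10, 64))
Ws = [rng.normal(size=(64, 64)) * 0.1 for _ in range(4)]
y = multihead_attention(x, *Ws, n_heads=8)                  # (10, 64)
```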

dcarpintero/transformer101

Annotated vanilla implementation in PyTorch of the Transformer model introduced in 'Attention Is All You Need'.

Language: Jupyter Notebook - Size: 215 KB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

Pranavhc/Shakespearean-Text-Generator

A decoder-only Transformer model for text generation.

Language: Jupyter Notebook - Size: 114 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

Resh-97/MixSeq-Connecting-Macroscopic-Time-Series-Forecasting-with-Microscopic-Time-Series-Data

Testing the reproducibility of the paper MixSeq. Under the assumption that macroscopic time series follow a mixture distribution, the authors hypothesise that lower variance of the constituting latent mixture components could improve the estimation of macroscopic time series.

Language: Jupyter Notebook - Size: 93.8 MB - Last synced at: about 2 months ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

JivanAcharya/Shakespeare-GPT

Implementing a GPT (Generative Pre-trained Transformer) model from scratch on Shakespeare's works.

Language: Jupyter Notebook - Size: 37.9 MB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

Group-1-ET/English-Telugu-Translator

Deployed locally

Language: Python - Size: 26.3 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

jaydeepthik/Nano-GPT

Simple GPT with multihead attention for character-level tokens, inspired by Andrej Karpathy's video lectures: https://github.com/karpathy/ng-video-lecture

Language: Jupyter Notebook - Size: 429 KB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0
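
The two ingredients named here, character-level tokens and causal masking, fit in a few lines of PyTorch (a sketch with stand-in scores, not this repository's code):

```python
import torch

# character-level tokenization: every distinct character becomes a token id
text = "to be or not to be"
stoi = {ch: i for i, ch in enumerate(sorted(set(text)))}
ids = torch.tensor([stoi[ch] for ch in text])       # (18,)

# causal mask: position i may only attend to positions j <= i
T = 8
mask = torch.tril(torch.ones(T, T, dtype=torch.bool))
scores = torch.randn(T, T)                          # stand-in attention scores
scores = scores.masked_fill(~mask, float("-inf"))
weights = scores.softmax(dim=-1)                    # each row sums to 1 over the past
```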

achiverram28/FedLSF-DCOSS

Official implementation of the paper "FedLSF: Federated Local Graph Learning via Specformers"

Language: Python - Size: 3.02 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

puskal-khadka/Transformer

Transformer model based on the research paper: "Attention Is All You Need"

Language: Python - Size: 16.2 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

hrithickcodes/transformer-tf

This repository contains the code for the paper "Attention Is All You Need", i.e. the Transformer.

Language: Jupyter Notebook - Size: 30.4 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 6 - Forks: 1

jk96491/Advanced_Models

Provides several well-known neural network models (DCGAN, VAE, ResNet, etc.).

Language: Python - Size: 1.98 MB - Last synced at: 6 months ago - Pushed at: almost 4 years ago - Stars: 50 - Forks: 13

OscarHChung/GPT-Model

GPT model that can take a text file from anywhere on the internet and imitate the linguistic style of the text

Language: Python - Size: 528 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

MirunaPislar/multi-head-attention-labeller

Joint text classification on multiple levels with multiple labels, using a multi-head attention mechanism to wire two prediction tasks together.

Language: Python - Size: 4.13 MB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 15 - Forks: 2

rimo10/ViT

Vision Transformer in PyTorch

Language: Jupyter Notebook - Size: 172 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

aman-17/3dprinting-extrusion-detection

3D Printing Extrusion Detection using Multi-Head Attention Model

Language: Python - Size: 9.04 MB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

abhilash1910/GraphAttentionNetworks

This package is a TensorFlow 2/Keras implementation of Graph Attention Network embeddings and also provides a trainable layer for multihead graph attention.

Language: Python - Size: 142 KB - Last synced at: 8 days ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 0
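
A minimal NumPy sketch of one multihead graph-attention layer over a dense adjacency matrix, after Velickovic et al. (2018); this illustrates the mechanism and is not this package's API:

```python
import numpy as np

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0, x, alpha * x)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multihead_graph_attention(h, adj, Ws, As):
    """h: (n_nodes, f_in); adj: (n_nodes, n_nodes), self-loops included;
    per head, W: (f_in, f_out) and a: (2 * f_out,). Heads are concatenated."""
    heads = []
    for W, a in zip(Ws, As):
        z = h @ W                                    # (n_nodes, f_out)
        f_out = z.shape[1]
        # e_ij = LeakyReLU(a . [z_i || z_j]), computed by broadcasting
        e = leaky_relu((z @ a[:f_out])[:, None] + (z @ a[f_out:])[None, :])
        e = np.where(adj > 0, e, -np.inf)            # attend to neighbours only
        heads.append(softmax(e) @ z)
    return np.concatenate(heads, axis=-1)            # (n_nodes, heads * f_out)

# toy usage: 4 nodes, 2 heads
rng = np.random.default_rng(0)
adj = np.eye(4) + np.array([[0, 1, 0, 0], [1, 0, 1, 0],
                            [0, 1, 0, 1], [0, 0, 1, 0]])
h = rng.normal(size=(4, 8))
out = multihead_graph_attention(h, adj,
                                [rng.normal(size=(8, 4)) for _ in range(2)],
                                [rng.normal(size=(8,)) for _ in range(2)])
```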

meme2515/transformer_pytorch

PyTorch implementation of the Transformer architecture from the paper "Attention Is All You Need". Includes an implementation of the attention mechanism.

Language: Python - Size: 318 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

bkhanal-11/transformers

An implementation of the Transformer from scratch, as presented in the paper "Attention Is All You Need".

Language: Python - Size: 291 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 0

datnnt1997/multi-head_self-attention

A Faster PyTorch Implementation of Multi-Head Self-Attention

Language: Jupyter Notebook - Size: 737 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 41 - Forks: 9

shawnhan108/AutoTruckX

An experimental project for autonomous-vehicle driving perception, with steering-angle prediction and semantic segmentation, using a combination of UNet, attention, and transformers.

Language: Python - Size: 399 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 5 - Forks: 4

akurniawan/pytorch-transformer

Implementation of the "Attention Is All You Need" paper

Language: Python - Size: 1.08 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 33 - Forks: 12

whsqkaak/attentions_pytorch

A repository of attention-mechanism implementations in PyTorch.

Language: Python - Size: 10.7 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

mpalaourg/PGL-SUM Fork of e-apostolidis/PGL-SUM

A PyTorch Implementation of PGL-SUM from "Combining Global and Local Attention with Positional Encoding for Video Summarization", Proc. IEEE ISM 2021

Language: Python - Size: 89.6 MB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 1

Mascerade/scale-transformer-encoder

A Transformer Encoder where the embedding size can be down-sized.

Language: Python - Size: 104 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 1

vasisthasinghal/Machine_Translation

Machine translation models (with and without attention) that convert sentences in Tamil to Hindi. Transformer models are also used for the same task, and their performance is compared.

Language: Jupyter Notebook - Size: 78.8 MB - Last synced at: 6 months ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

sarthak7509/ConversationalAi

An implementation of the well-known multi-head attention mechanism for a conversational AI paper. The model is trained on both the Cornell Movie-Dialogs dataset and the WikiQA dataset provided by Microsoft.

Language: Jupyter Notebook - Size: 64.4 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

Related Keywords
multihead-attention (36), transformer (14), self-attention (12), attention (10), pytorch (10), attention-mechanism (7), attention-is-all-you-need (6), transformers (5), gpt (5), deep-learning (4), nlp (4), tensorflow (3), python (3), positional-encoding (3), encoder-decoder (3), natural-language-processing (2), dot-product-attention (2), numpy (2), machine-learning (2), deep-neural-networks (2), python3 (2), computer-vision (2), machine-translation (2), resnet-50 (2), bert (2), tokenizer (2), pytorch-implementation (2), semantic-segmentation (2), multi-task-learning (1), joint-models (1), joint-learning (1), semi-supervised-learning (1), sentence-classification (1), sentiment-analysis (1), sequence-labelling (1), seq2seq (1), spectral-gnns (1), zero-shot-learning (1), awesome (1), transformermodel (1), hedge-detection (1), neural-machine-translation (1), error-detection (1), conll-2003 (1), bilstm-attention (1), bilstm (1), bigram-model (1), vae (1), sagan (1), transformer-architecture (1), gpt-2 (1), gan (1), dcgan (1), cgan (1), transformer-tensorflow2 (1), transfer-learning (1), udacity-self-driving-car (1), unet (1), unet-image-segmentation (1), scaled-dot-product-attention (1), ism21 (1), supervised-learning (1), video-summarization (1), ai (1), artificial-intelligence (1), encoder (1), learning (1), machine (1), ml (1), sequence (1), deeplearning (1), tokenization (1), neural-networks (1), vision-transformer (1), 3d-printing (1), graph-attention-networks (1), keras-tensorflow (1), leaky-relu (1), tf2 (1), multi-head (1), multi-head-attention (1), multi-head-self-attention (1), multihead-self-attention (1), pytorch-self-attention (1), transformer-attention (1), autonomous-driving (1), autonomous-vehicles (1), cnn-lstm (1), conv3d (1), setr (1), steering-angle-prediction (1), rmsprop (1), xor-logical-operator (1), gpu-computing (1), llm (1), mixture-of-experts (1), transfomers (1), vit (1), xformers (1), deep-architectures (1)