Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: attention-is-all-you-need

hkproj/pytorch-transformer

Attention is all you need implementation

Language: Jupyter Notebook - Size: 480 KB - Last synced: 3 days ago - Pushed: 25 days ago - Stars: 390 - Forks: 174
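
Most repositories under this topic implement the same core operation from the paper: scaled dot-product attention, softmax(QKᵀ/√d)·V. A minimal NumPy sketch of that operation (illustrative only; not taken from any listed repo):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = q.shape[-1]
    scores = q @ k.swapaxes(-2, -1) / np.sqrt(d_k)  # (..., L_q, L_k)
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # each row sums to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))   # 4 query positions, d_k = 8
k = rng.normal(size=(6, 8))   # 6 key positions
v = rng.normal(size=(6, 8))
out, w = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (4, 8)
```

The full Transformer wraps this in multiple heads plus learned input/output projections, which the repositories above add in various ways.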

hkproj/python-longnet

Tools and experiments with the LongNet model

Language: Jupyter Notebook - Size: 501 KB - Last synced: 4 days ago - Pushed: 10 months ago - Stars: 9 - Forks: 2

AlvinKimata/Transformers

Scripts for building transformer models for different data modalities.

Language: Jupyter Notebook - Size: 6.06 MB - Last synced: 4 days ago - Pushed: 7 months ago - Stars: 1 - Forks: 0

kyegomez/MobileVLM

Implementation of the LDP module block in PyTorch and Zeta from the paper: "MobileVLM: A Fast, Strong and Open Vision Language Assistant for Mobile Devices"

Language: Python - Size: 2.17 MB - Last synced: 5 days ago - Pushed: 2 months ago - Stars: 12 - Forks: 0

kyegomez/MambaTransformer

Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling

Language: Python - Size: 2.25 MB - Last synced: 5 days ago - Pushed: 2 months ago - Stars: 124 - Forks: 11

kyegomez/VideoVIT

Open-source implementation of a vision transformer that can understand videos, using MaxViT as a foundation.

Language: Python - Size: 2.18 MB - Last synced: 6 days ago - Pushed: 2 months ago - Stars: 7 - Forks: 0

kyegomez/M2PT

Implementation of M2PT in PyTorch from the paper: "Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities"

Language: Python - Size: 2.66 MB - Last synced: 7 days ago - Pushed: 2 months ago - Stars: 11 - Forks: 1

kyegomez/MMCA

The open source community's implementation of the all-new Multi-Modal Causal Attention from "DeepSpeed-VisualChat: Multi-Round Multi-Image Interleave Chat via Multi-Modal Causal Attention"

Language: Python - Size: 230 KB - Last synced: 7 days ago - Pushed: 2 months ago - Stars: 9 - Forks: 0

kyegomez/Kosmos2.5

My implementation of Kosmos2.5 from the paper: "KOSMOS-2.5: A Multimodal Literate Model"

Language: Python - Size: 254 KB - Last synced: 7 days ago - Pushed: 2 months ago - Stars: 58 - Forks: 6

Separius/awesome-fast-attention 📦

list of efficient attention modules

Language: Python - Size: 156 KB - Last synced: 4 days ago - Pushed: over 2 years ago - Stars: 977 - Forks: 108

kyegomez/TinyGPTV

Simple implementation of TinyGPT-V in super-simple Zeta Lego blocks

Language: Python - Size: 2.17 MB - Last synced: 7 days ago - Pushed: 2 months ago - Stars: 15 - Forks: 0

kyegomez/MultiModalCrossAttn

The open source implementation of the cross attention mechanism from the paper: "Jointly Training Large Autoregressive Multimodal Models"

Language: Python - Size: 223 KB - Last synced: 7 days ago - Pushed: 2 months ago - Stars: 10 - Forks: 0

kyegomez/CT

Implementation of the attention and transformer from "Building Blocks for a Complex-Valued Transformer Architecture"

Language: Python - Size: 2.16 MB - Last synced: 7 days ago - Pushed: 2 months ago - Stars: 5 - Forks: 0

kyegomez/MGQA

The open source implementation of multi-grouped query attention from the paper "GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints"

Language: Python - Size: 248 KB - Last synced: 7 days ago - Pushed: 5 months ago - Stars: 7 - Forks: 0
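
For context on the entry above: grouped-query attention (GQA) shares a small number of K/V heads across groups of query heads, shrinking the KV cache relative to full multi-head attention. A hedged NumPy sketch of the idea (shapes and the function name are illustrative, not the repo's API):

```python
import numpy as np

def grouped_query_attention(q, k, v, n_q_heads, n_kv_heads):
    """GQA sketch: query heads are split into groups, and each group
    shares one K/V head. q: (n_q_heads, L, d); k, v: (n_kv_heads, L, d)."""
    assert n_q_heads % n_kv_heads == 0
    group = n_q_heads // n_kv_heads
    d = q.shape[-1]
    outs = []
    for h in range(n_q_heads):
        kv = h // group  # which shared K/V head this query head uses
        scores = q[h] @ k[kv].T / np.sqrt(d)
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        outs.append(w @ v[kv])
    return np.stack(outs)  # (n_q_heads, L, d)

rng = np.random.default_rng(1)
L, d = 5, 4
q = rng.normal(size=(8, L, d))  # 8 query heads
k = rng.normal(size=(2, L, d))  # only 2 K/V heads: 4 query heads per group
v = rng.normal(size=(2, L, d))
out = grouped_query_attention(q, k, v, n_q_heads=8, n_kv_heads=2)
```

Setting n_kv_heads = n_q_heads recovers standard multi-head attention; n_kv_heads = 1 recovers multi-query attention.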

kyegomez/MMCA-MGQA

Experiments around using Multi-Modal Causal Attention with Multi-Grouped Query Attention

Language: Python - Size: 210 KB - Last synced: 7 days ago - Pushed: 2 months ago - Stars: 5 - Forks: 0

kyegomez/LongNet

Plug-and-play implementation of the attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens"

Language: Python - Size: 40.3 MB - Last synced: 7 days ago - Pushed: 4 months ago - Stars: 651 - Forks: 62
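
The dilated attention that LongNet introduces sparsifies the attention score matrix so cost grows sub-quadratically with sequence length. A heavily simplified NumPy sketch of the dilation idea, using a single dilation rate (the paper mixes several segment sizes and rates, which this sketch omits):

```python
import numpy as np

def dilated_attention(q, k, v, dilation):
    """Sketch: each query attends only to keys whose position shares its
    residue modulo `dilation`, thinning the score matrix LongNet-style."""
    L, d = q.shape
    idx = np.arange(L)
    mask = (idx[:, None] % dilation) == (idx[None, :] % dilation)
    scores = q @ k.T / np.sqrt(d)
    scores = np.where(mask, scores, -np.inf)  # blocked pairs get zero weight
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

rng = np.random.default_rng(2)
q = rng.normal(size=(8, 4))
k = rng.normal(size=(8, 4))
v = rng.normal(size=(8, 4))
out = dilated_attention(q, k, v, dilation=2)
```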

hkproj/pytorch-llama-notes

Notes about LLaMA 2 model

Language: Python - Size: 2.64 MB - Last synced: 4 days ago - Pushed: 9 months ago - Stars: 32 - Forks: 3

kyegomez/HeptapodLM

An implementation of a Transformer model that generates tokens non-linearly, all at once, like the heptapods from Arrival

Language: Python - Size: 36.2 MB - Last synced: 9 days ago - Pushed: 2 months ago - Stars: 7 - Forks: 0

hkproj/transformer-from-scratch-notes

Notes about "Attention is all you need" video (https://www.youtube.com/watch?v=bCz4OMemCcA)

Size: 1.32 MB - Last synced: 4 days ago - Pushed: 12 months ago - Stars: 154 - Forks: 39

kyegomez/AutoRT

Implementation of "AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents"

Language: Python - Size: 2.5 MB - Last synced: 5 days ago - Pushed: 2 months ago - Stars: 28 - Forks: 2

kyegomez/SelfExtend

Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning", in PyTorch and Zeta

Language: Python - Size: 2.17 MB - Last synced: 10 days ago - Pushed: 2 months ago - Stars: 11 - Forks: 0

kyegomez/GATS

Implementation of GATS from the paper "GATS: Gather-Attend-Scatter", in PyTorch and Zeta

Language: Python - Size: 2.17 MB - Last synced: 11 days ago - Pushed: 2 months ago - Stars: 8 - Forks: 0

awslabs/sockeye

Sequence-to-sequence framework with a focus on Neural Machine Translation based on PyTorch

Language: Python - Size: 10 MB - Last synced: 9 days ago - Pushed: about 2 months ago - Stars: 1,207 - Forks: 326

sooftware/kospeech

Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.

Language: Python - Size: 920 MB - Last synced: 10 days ago - Pushed: 12 months ago - Stars: 577 - Forks: 189

SayamAlt/English-to-Spanish-Language-Translation-using-Seq2Seq-and-Attention

A Seq2Seq-with-attention model that performs English-to-Spanish translation, achieving almost 97% accuracy.

Language: Jupyter Notebook - Size: 1.18 MB - Last synced: 11 days ago - Pushed: 12 days ago - Stars: 0 - Forks: 0

kyegomez/CELESTIAL-1

Omni-Modality Processing, Understanding, and Generation

Language: Python - Size: 2.49 MB - Last synced: 14 days ago - Pushed: 14 days ago - Stars: 6 - Forks: 0

Shikhar-S/EvolvedTransformer

PyTorch implementations of the Transformer and EvolvedTransformer architectures. WIP

Language: Jupyter Notebook - Size: 22.2 MB - Last synced: 14 days ago - Pushed: 14 days ago - Stars: 8 - Forks: 4

kyegomez/LiqudNet

Implementation of Liquid Nets in PyTorch

Language: Python - Size: 2.18 MB - Last synced: 7 days ago - Pushed: 2 months ago - Stars: 10 - Forks: 2

jshuadvd/LongRoPE

Implementation of the paper "LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens"

Language: Python - Size: 34.5 MB - Last synced: 20 days ago - Pushed: 20 days ago - Stars: 50 - Forks: 8

FreedomIntelligence/TextClassificationBenchmark

A Benchmark of Text Classification in PyTorch

Language: Python - Size: 1.76 MB - Last synced: 20 days ago - Pushed: 28 days ago - Stars: 592 - Forks: 137

willGuimont/learnable_fourier_positional_encoding

Learnable Fourier Features for Multi-Dimensional Spatial Positional Encoding

Language: Python - Size: 3.91 KB - Last synced: 20 days ago - Pushed: 20 days ago - Stars: 34 - Forks: 9
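
The idea behind learnable Fourier positional encodings is to project multi-dimensional positions through a learned matrix, then take sin/cos features of the projection. A rough NumPy sketch with a random projection standing in for the learned one (the paper additionally passes the features through an MLP, omitted here):

```python
import numpy as np

def fourier_positional_encoding(pos, W):
    """Fourier-feature encoding of positions: [sin(pos W), cos(pos W)].
    W is learned in the paper; here it is a fixed random stand-in."""
    proj = pos @ W                                        # (N, n_freq)
    return np.concatenate([np.sin(proj), np.cos(proj)], axis=-1)

rng = np.random.default_rng(3)
# 2-D grid positions, e.g. pixel coordinates of a 4x4 image
pos = np.stack(np.meshgrid(np.arange(4), np.arange(4)), axis=-1).reshape(-1, 2)
W = rng.normal(size=(2, 16))  # 2-D positions -> 16 frequencies
enc = fourier_positional_encoding(pos.astype(float), W)
print(enc.shape)  # (16, 32)
```

Unlike the fixed 1-D sinusoidal encoding, this construction handles positions of any dimensionality.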

kaituoxu/Speech-Transformer

A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.

Language: Python - Size: 678 KB - Last synced: 20 days ago - Pushed: about 1 year ago - Stars: 762 - Forks: 193

guillaume-chevalier/Linear-Attention-Recurrent-Neural-Network

A recurrent attention module consisting of an LSTM cell that can query its own past cell states by means of windowed multi-head attention. The formulas are derived from the BN-LSTM and the Transformer network. The LARNN cell with attention can be used inside a loop on the cell state, just like any other RNN.

Language: Jupyter Notebook - Size: 13.6 MB - Last synced: 17 days ago - Pushed: over 5 years ago - Stars: 143 - Forks: 32

alessioborgi/AdaViT

Project Name: AdaViT | PyTorch Lightning, Python

Language: Python - Size: 1.52 GB - Last synced: 26 days ago - Pushed: 26 days ago - Stars: 0 - Forks: 0

sled-group/InfEdit

[CVPR 2024] Official implementation of "Inversion-Free Image Editing with Natural Language"

Language: Python - Size: 261 MB - Last synced: 23 days ago - Pushed: 23 days ago - Stars: 190 - Forks: 5

linto-ai/whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

Language: Python - Size: 4.83 MB - Last synced: 24 days ago - Pushed: 25 days ago - Stars: 1,496 - Forks: 129

kyegomez/KosmosG

My implementation of the model KosmosG from "KOSMOS-G: Generating Images in Context with Multimodal Large Language Models"

Language: Python - Size: 2.79 MB - Last synced: 20 days ago - Pushed: 2 months ago - Stars: 12 - Forks: 0

kyegomez/MambaFormer

Implementation of MambaFormer in PyTorch and Zeta, from the paper "Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks"

Language: Python - Size: 2.17 MB - Last synced: 25 days ago - Pushed: 25 days ago - Stars: 7 - Forks: 1

zimmerrol/attention-is-all-you-need-keras

Implementation of the Transformer architecture described by Vaswani et al. in "Attention Is All You Need"

Language: Python - Size: 3.98 MB - Last synced: 27 days ago - Pushed: about 5 years ago - Stars: 29 - Forks: 11

FloweryK/transformer

pytorch implementation of the transformer from "Attention Is All You Need"

Language: Python - Size: 8.11 MB - Last synced: 27 days ago - Pushed: 9 months ago - Stars: 0 - Forks: 0

jadore801120/attention-is-all-you-need-pytorch

A PyTorch implementation of the Transformer model in "Attention is All You Need".

Language: Python - Size: 162 KB - Last synced: 28 days ago - Pushed: about 1 month ago - Stars: 8,433 - Forks: 1,919
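
Alongside attention itself, "Attention Is All You Need" introduced the fixed sinusoidal positional encoding that implementations like the one above reproduce: PE[pos, 2i] = sin(pos / 10000^(2i/d)) and PE[pos, 2i+1] = cos(pos / 10000^(2i/d)). A compact NumPy version:

```python
import numpy as np

def sinusoidal_positional_encoding(L, d_model):
    """Fixed sinusoidal encoding from "Attention Is All You Need"."""
    pos = np.arange(L)[:, None]                 # (L, 1) positions
    i = np.arange(0, d_model, 2)[None, :]       # even dimension indices
    angles = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((L, d_model))
    pe[:, 0::2] = np.sin(angles)                # even dims get sin
    pe[:, 1::2] = np.cos(angles)                # odd dims get cos
    return pe

pe = sinusoidal_positional_encoding(50, 16)
```

Because each dimension is a sinusoid of a different wavelength, relative offsets become linear functions of the encoding, which is why the authors chose it over learned embeddings.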

kyegomez/HLT

Implementation of the transformer from the paper: "Real-World Humanoid Locomotion with Reinforcement Learning"

Language: Python - Size: 2.17 MB - Last synced: 14 days ago - Pushed: 2 months ago - Stars: 16 - Forks: 3

kyegomez/RT-X

Pytorch implementation of the models RT-1-X and RT-2-X from the paper: "Open X-Embodiment: Robotic Learning Datasets and RT-X Models"

Language: Python - Size: 1.29 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 103 - Forks: 11

sooftware/speech-transformer

Transformer implementation specialized for speech recognition tasks, using PyTorch.

Language: Python - Size: 72.3 KB - Last synced: 16 days ago - Pushed: over 2 years ago - Stars: 61 - Forks: 8

feifeibear/long-context-attention

Distributed Attention for Long Context LLM Model Training and Inference

Language: Python - Size: 2.12 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 65 - Forks: 2

kyegomez/CM3Leon

An open-source implementation of "Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning", an all-new multi-modal AI that uses just a decoder to generate both text and images

Language: Python - Size: 754 KB - Last synced: about 1 month ago - Pushed: 5 months ago - Stars: 318 - Forks: 15

kyegomez/Jamba

PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model"

Language: Python - Size: 2.17 MB - Last synced: about 1 month ago - Pushed: about 2 months ago - Stars: 49 - Forks: 0

lvapeab/nmt-keras

Neural Machine Translation with Keras

Language: Python - Size: 5.63 MB - Last synced: 20 days ago - Pushed: almost 3 years ago - Stars: 533 - Forks: 129

kyegomez/ScreenAI

Implementation of the ScreenAI model from the paper: "A Vision-Language Model for UI and Infographics Understanding"

Language: Python - Size: 2.18 MB - Last synced: about 1 month ago - Pushed: 2 months ago - Stars: 144 - Forks: 15

kyegomez/PALM-E

Implementation of "PaLM-E: An Embodied Multimodal Language Model"

Language: Python - Size: 9.15 MB - Last synced: about 1 month ago - Pushed: 4 months ago - Stars: 197 - Forks: 34

kyegomez/SparseAttention

Pytorch Implementation of the sparse attention from the paper: "Generating Long Sequences with Sparse Transformers"

Language: Python - Size: 2.16 MB - Last synced: about 1 month ago - Pushed: 2 months ago - Stars: 25 - Forks: 1
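
The strided pattern from "Generating Long Sequences with Sparse Transformers" lets each position attend to a local causal window plus every stride-th earlier position, cutting cost from O(L²) toward O(L·√L). A NumPy sketch of that mask (illustrative; the paper also defines a "fixed" pattern not shown here):

```python
import numpy as np

def strided_sparse_mask(L, stride):
    """Boolean (L, L) mask: True where position i may attend to position j.
    Combines a causal local window of width `stride` with strided hops."""
    i = np.arange(L)[:, None]
    j = np.arange(L)[None, :]
    local = (i - j >= 0) & (i - j < stride)          # recent neighbors
    strided = (j <= i) & ((i - j) % stride == 0)     # every stride-th earlier pos
    return local | strided

mask = strided_sparse_mask(8, stride=3)
```

The mask would be applied by setting disallowed score entries to negative infinity before the softmax, exactly as in dense causal attention.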

Shuijing725/CrowdNav_Prediction_AttnGraph

[ICRA 2023] Intention Aware Robot Crowd Navigation with Attention-Based Interaction Graph

Language: Python - Size: 45.9 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 116 - Forks: 22

kyegomez/swarmalators

Pytorch Implementation of the Swarmalators algorithm from "Exotic swarming dynamics of high-dimensional swarmalators"

Language: Python - Size: 2.16 MB - Last synced: about 1 month ago - Pushed: 2 months ago - Stars: 6 - Forks: 0

ukairia777/tensorflow-transformer

A Transformer chatbot implemented in TensorFlow 2 (TensorFlow implementation of 'Attention Is All You Need')

Language: Jupyter Notebook - Size: 113 KB - Last synced: 20 days ago - Pushed: about 2 years ago - Stars: 28 - Forks: 18

kyegomez/AthenaOS

AthenaOS is a next generation AI-native operating system managed by Swarms of AI Agents

Language: Rust - Size: 20.5 KB - Last synced: about 1 month ago - Pushed: 10 months ago - Stars: 11 - Forks: 1

hkproj/bert-from-scratch

BERT explained from scratch

Size: 874 KB - Last synced: 4 days ago - Pushed: 7 months ago - Stars: 6 - Forks: 2

himanshuvnm/Foundation-Model-Large-Language-Model-FM-LLM

This repository collects exercises covering the core concepts underlying modern generative AI, focusing in particular on three coding tasks for large language models. Further details are given in the README.md file.

Language: Jupyter Notebook - Size: 431 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0

jeongwonkwak/OpenNMT-Project

OpenNMT based Korean-to-English Neural Machine Translation

Language: Python - Size: 12.7 MB - Last synced: about 2 months ago - Pushed: over 3 years ago - Stars: 13 - Forks: 14

vietai/dab

Data Augmentation by Backtranslation (DAB) ヽ( •_-)ᕗ

Language: Jupyter Notebook - Size: 13.2 MB - Last synced: 27 days ago - Pushed: almost 2 years ago - Stars: 64 - Forks: 7

kyegomez/AudioFlamingo

Implementation of the model "AudioFlamingo" from the paper: "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities"

Language: Python - Size: 2.17 MB - Last synced: about 1 month ago - Pushed: 2 months ago - Stars: 19 - Forks: 1

LaurentVeyssier/Abstractive-Text-Summarization-model-in-Keras

Abstractive Text Summarization using Transformer model

Language: Jupyter Notebook - Size: 12.3 MB - Last synced: about 2 months ago - Pushed: almost 4 years ago - Stars: 6 - Forks: 5

kyegomez/PaLM2-VAdapter

Implementation of the multi-modal model paper "PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter"

Language: Python - Size: 2.17 MB - Last synced: about 1 month ago - Pushed: 2 months ago - Stars: 14 - Forks: 0

Kyubyong/transformer

A TensorFlow Implementation of the Transformer: Attention Is All You Need

Language: Python - Size: 5.32 MB - Last synced: 2 months ago - Pushed: 12 months ago - Stars: 4,128 - Forks: 1,282

notlober/transformer-pytorch

original transformer in pytorch

Language: Python - Size: 4.88 KB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 0 - Forks: 0

kyegomez/Hedgehog

Implementation of the model "Hedgehog" from the paper: "The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry"

Language: Python - Size: 2.16 MB - Last synced: about 1 month ago - Pushed: 2 months ago - Stars: 5 - Forks: 0

gopikrsmscs/transformer

Unofficial Implementation of Transformer In PyTorch

Language: Python - Size: 6.84 KB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 0 - Forks: 0

puskal-khadka/Transformer

Transformer model based on the research paper: "Attention Is All You Need"

Language: Python - Size: 16.2 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

tgautam03/Transformers

A Gentle Introduction to Transformers Neural Network

Language: Jupyter Notebook - Size: 6.59 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 6 - Forks: 1

lsdefine/attention-is-all-you-need-keras

A Keras+TensorFlow Implementation of the Transformer: Attention Is All You Need

Language: Python - Size: 1.33 MB - Last synced: 3 months ago - Pushed: over 2 years ago - Stars: 700 - Forks: 194

insdout/BertAttentionViz

BERT Attention Visualization is a web application powered by Streamlit, offering intuitive visualization of attention weights generated by BERT-based models.

Language: Python - Size: 524 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

leviswind/pytorch-transformer

PyTorch implementation of "Attention Is All You Need"

Language: Python - Size: 248 KB - Last synced: 28 days ago - Pushed: almost 3 years ago - Stars: 236 - Forks: 57

brightmart/bert_language_understanding

Pre-training of Deep Bidirectional Transformers for Language Understanding: pre-train TextCNN

Language: Python - Size: 16 MB - Last synced: 3 months ago - Pushed: over 5 years ago - Stars: 959 - Forks: 212

kyegomez/Qwen-VL

My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities"; the official model code has not been released yet.

Language: Python - Size: 244 KB - Last synced: about 1 month ago - Pushed: 4 months ago - Stars: 11 - Forks: 1

kyegomez/ShallowFF

Zeta implementation of "Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers"

Language: Python - Size: 36.2 MB - Last synced: 11 days ago - Pushed: 11 days ago - Stars: 6 - Forks: 0

kyegomez/LongVit

A simple PyTorch implementation of LongViT, using my previous implementation of LongNet as a foundation.

Language: Shell - Size: 2.15 MB - Last synced: about 1 month ago - Pushed: 2 months ago - Stars: 5 - Forks: 0

VachanVY/Attention-Is-All-You-Need

A beginner's guide to Transformers and Language Models in Tensorflow/Keras

Language: Python - Size: 422 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

menon92/BangalASR

Transformer based Bangla Speech Recognition

Language: Jupyter Notebook - Size: 104 MB - Last synced: 3 months ago - Pushed: about 1 year ago - Stars: 48 - Forks: 15

hrithickcodes/transformer-tf

This repository contains the code for the paper "Attention Is All You Need", i.e. the Transformer.

Language: Jupyter Notebook - Size: 30.4 MB - Last synced: about 1 month ago - Pushed: over 1 year ago - Stars: 6 - Forks: 1

ChieBKAI/attention-paper-implementation-from-scratch

Attention is all you need implementation

Language: Python - Size: 43.9 KB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 0 - Forks: 0

GiorgiaAuroraAdorni/gansformer-reproducibility-challenge

Replication of the novel Generative Adversarial Transformer.

Language: Python - Size: 93.6 MB - Last synced: 4 months ago - Pushed: 6 months ago - Stars: 2 - Forks: 3

Agora-X/DailyPaperClub

The repository for the exclusive Daily Paper Club hosted at Agora every 10pm NYC time at this discord: https://discord.gg/Gnzh6dnzyz

Size: 14.6 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 8 - Forks: 0

abc1203/transformer-model

An implementation of the transformer deep learning model, based on the research paper "Attention Is All You Need"

Language: Python - Size: 678 MB - Last synced: 5 days ago - Pushed: 6 days ago - Stars: 0 - Forks: 1

dcarpintero/transformer101

Annotated vanilla implementation in PyTorch of the Transformer model introduced in 'Attention Is All You Need'

Language: Jupyter Notebook - Size: 215 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 1 - Forks: 0

oovm/Attention-is-All-U-Need

Attention exercises tailored for Chinese beginners; China should have its own Ramanujan!

Language: Mathematica - Size: 283 KB - Last synced: 4 months ago - Pushed: 5 months ago - Stars: 3 - Forks: 0

IvanBongiorni/maximal

A TensorFlow-compatible Python library that provides models and layers to implement custom Transformer neural networks. Built on TensorFlow 2.

Language: Python - Size: 396 KB - Last synced: 17 days ago - Pushed: 7 months ago - Stars: 9 - Forks: 1

nevinbaiju/transformer_cpp_ITCS-5182

Optimization of attention layers for efficient inference on the CPU and GPU. Covers optimizations for AVX and CUDA, as well as efficient memory-processing techniques.

Language: C++ - Size: 96.7 KB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 0 - Forks: 0

abideenml/TransformerImplementationfromScratch

My implementation of the "Attention is all you Need" 📝 Transformer model Ⓜ️ (Vaswani et al.) in Tensorflow. (https://arxiv.org/abs/1706.03762)

Language: Jupyter Notebook - Size: 4.35 MB - Last synced: 4 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

sgrvinod/a-PyTorch-Tutorial-to-Transformers

Attention Is All You Need | a PyTorch Tutorial to Transformers

Language: Python - Size: 27.5 MB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 114 - Forks: 24

TatevKaren/BabyGPT-Build_GPT_From_Scratch

BabyGPT: build your own GPT large language model from scratch. A step-by-step guide to pre-training generative Transformer models in PyTorch and Python.

Language: Python - Size: 1.34 MB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 6 - Forks: 2

taxborn/betsi

A light implementation of the 2017 Google paper 'Attention is all you need'.

Language: Jupyter Notebook - Size: 2.35 MB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 0 - Forks: 0

gordicaleksa/pytorch-original-transformer

My implementation of the original Transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing otherwise seemingly hard concepts. IWSLT pretrained models are currently included.

Language: Jupyter Notebook - Size: 948 KB - Last synced: 6 months ago - Pushed: over 3 years ago - Stars: 829 - Forks: 145

jayparks/transformer

A Pytorch Implementation of "Attention is All You Need" and "Weighted Transformer Network for Machine Translation"

Language: Python - Size: 55.7 KB - Last synced: 6 months ago - Pushed: over 3 years ago - Stars: 505 - Forks: 119

tnq177/transformers_without_tears

Transformers without Tears: Improving the Normalization of Self-Attention

Language: Python - Size: 9.74 MB - Last synced: 6 months ago - Pushed: 9 months ago - Stars: 124 - Forks: 17

LaurenceLungo/GPT-from-Scratch

PyTorch implementation of GPT from scratch

Language: Jupyter Notebook - Size: 11.2 MB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

SharathHebbar/Transformers

Transformers Intuition

Language: Jupyter Notebook - Size: 34.3 MB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 0 - Forks: 0

sparks-baird/CrabNet Fork of anthony-wang/CrabNet

Predict materials properties using only the composition information!

Language: HTML - Size: 393 MB - Last synced: 22 days ago - Pushed: 11 months ago - Stars: 11 - Forks: 3

hritik-saini/NLP_with_Attention

My implementations, in TensorFlow (rather than the Trax used in the course), of the notebooks from Course 4 (NLP with Attention) of the Natural Language Processing Specialization by deeplearning.ai on Coursera.

Language: Jupyter Notebook - Size: 39.1 KB - Last synced: 4 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

SkBlaz/attviz

Dissecting Transformers via attention visualization

Language: JavaScript - Size: 6.94 MB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 4 - Forks: 0

sooftware/transformer

A PyTorch Implementation of "Attention Is All You Need"

Language: Python - Size: 6.27 MB - Last synced: 16 days ago - Pushed: over 2 years ago - Stars: 37 - Forks: 9

tonibofarull/vanilla-transformer

Unofficial Implementation of Transformer from paper "Attention is All You Need"

Language: Python - Size: 8.24 MB - Last synced: 8 months ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0
