GitHub topics: grouped-query-attention

Repositories

reshalfahsi/image-captioning-mobilenet-llama3

Image Captioning With MobileNet-LLaMA 3

Language: Jupyter Notebook - Size: 3.56 MB - Last synced at: 6 days ago - Pushed at: about 1 year ago - Stars: 5 - Forks: 0

estnafinema0/russian-jokes-generator

Transformer Models for Humorous Text Generation. Fine-tuned on a Russian jokes dataset with ALiBi, RoPE, GQA, and SwiGLU, plus a custom byte-level BPE tokenizer.

Language: Jupyter Notebook - Size: 294 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

andrewhsugithub/min-llama

My LLaMA 3 implementation

Language: Python - Size: 12.7 KB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

MyDarapy/SmolLM-experiments-with-grouped-query-attention

(Unofficial) Building Hugging Face SmolLM, a blazingly fast and small language model, with a PyTorch implementation of grouped-query attention (GQA)

Language: Python - Size: 439 KB - Last synced at: 4 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

LukasDrews97/DumbleLLM

Decoder-only LLM trained on the Harry Potter books.

Language: Python - Size: 235 KB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

lucadellalib/llama3

A single-file implementation of LLaMA 3, with support for jitting, KV caching and prompting

Language: Python - Size: 26.4 KB - Last synced at: 4 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

prajeshshrestha/Llama-2.0-architecture-and-inference-from-scratch-with-PyTorch

Language: Python - Size: 24.4 KB - Last synced at: 4 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

knotgrass/attention

Several types of attention modules written in PyTorch

Language: Python - Size: 142 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 27 - Forks: 9

Related Keywords

grouped-query-attention (8), pytorch (5), transformer (4), llm (3), kv-cache (3), llama3 (3), nlp (3), transformers (3), rotary-position-embedding (3), swiglu (2), rotary-positional-embedding (2), attention (2), large-language-models (1), large-language-model (1), flash-attention (1), byte-pair-encoding (1), smol-lm (1), smol (1), python (1), llama2 (1), pytorch-implementation (1), attention-mechanism (1), multi-head-attention (1), multi-query-attention (1), scale-dot-product-attention (1), softmax-layer (1), ml-efficiency (1), huggingface-smol-lm (1), huggingface (1), rope (1), transformer-models (1), bpe-tokenizer (1), alibi (1), rms-norm (1), pytorch-lightning (1), mobilenetv3 (1), image-text (1), image-captioning (1), flickr8k-dataset (1), cnn (1)
