GitHub topics: grouped-query-attention
reshalfahsi/image-captioning-mobilenet-llama3
Image Captioning With MobileNet-LLaMA 3
Language: Jupyter Notebook - Size: 3.56 MB - Last synced at: 6 days ago - Pushed at: about 1 year ago - Stars: 5 - Forks: 0

estnafinema0/russian-jokes-generator
Transformer models for humorous text generation. Fine-tuned on a Russian jokes dataset with ALiBi, RoPE, GQA, and SwiGLU, plus a custom byte-level BPE tokenizer.
Language: Jupyter Notebook - Size: 294 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

andrewhsugithub/min-llama
My LLaMA 3 implementation
Language: Python - Size: 12.7 KB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

MyDarapy/SmolLM-experiments-with-grouped-query-attention
(Unofficial) Building Hugging Face SmolLM, a blazingly fast and small language model, with a PyTorch implementation of grouped-query attention (GQA)
Language: Python - Size: 439 KB - Last synced at: 4 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

LukasDrews97/DumbleLLM
Decoder-only LLM trained on the Harry Potter books.
Language: Python - Size: 235 KB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

lucadellalib/llama3
A single-file implementation of LLaMA 3, with support for jitting, KV caching and prompting
Language: Python - Size: 26.4 KB - Last synced at: 4 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

prajeshshrestha/Llama-2.0-architecture-and-inference-from-scratch-with-PyTorch
Language: Python - Size: 24.4 KB - Last synced at: 4 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

knotgrass/attention
Several types of attention modules written in PyTorch
Language: Python - Size: 142 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 27 - Forks: 9
