An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: llm-training

mmoha15/lille

🚀 Build and explore the Lille 130M language model, a compact yet powerful tool for deep learning, featuring an open-source framework and efficient training methods.

Language: Python - Size: 1.68 MB - Last synced at: about 1 hour ago - Pushed at: about 4 hours ago - Stars: 1 - Forks: 0

NVIDIA-NeMo/Automodel

DTensor-native pretraining and fine-tuning for LLMs/VLMs with day-0 Hugging Face support, GPU-accelerated, and memory efficient.

Language: Python - Size: 4.59 MB - Last synced at: about 3 hours ago - Pushed at: about 4 hours ago - Stars: 67 - Forks: 9

Datalore-ai/datalore-localgen-cli

Synthetic dataset generation workflow using local file resources for fine-tuning LLMs.

Language: Python - Size: 2.77 MB - Last synced at: about 12 hours ago - Pushed at: about 14 hours ago - Stars: 73 - Forks: 8

open-sciencelab/GraphGen

GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation

Language: Python - Size: 13.9 MB - Last synced at: about 20 hours ago - Pushed at: 9 days ago - Stars: 335 - Forks: 28

FLotfiGit/Gen-AI-Exploration

Files and experiments from my exploration of generative AI and LLM fine-tuning techniques.

Language: Jupyter Notebook - Size: 183 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1 - Forks: 0

erogol/BlaGPT

Experimental playground for benchmarking language model (LM) architectures, layers, and tricks on smaller datasets. Designed for flexible experimentation and exploration.

Language: Python - Size: 780 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 77 - Forks: 8

yanring/Megatron-MoE-ModelZoo

Best practices for training DeepSeek, Mixtral, Qwen and other MoE models using Megatron Core.

Language: Python - Size: 1.61 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 78 - Forks: 12

jman4162/Sizing-AI-Training-by-Cost-per-Memory-Bandwidth

A practical model (with math + Python) to tell if you’re compute-, memory-, or network-bound—and what to buy next

Language: Jupyter Notebook - Size: 18.6 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

skypilot-org/skypilot

Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 17+ clouds, or on-prem).

Language: Python - Size: 165 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 8,599 - Forks: 745

chillyilly/odysseus-llm-toolkit

The Odysseus LLM toolkit is a collection of LLM Security Layer Testing utilities that I developed out of necessity because default AI is dangerous.

Language: Python - Size: 535 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

linkedin/Liger-Kernel

Efficient Triton Kernels for LLM Training

Language: Python - Size: 16.9 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 5,601 - Forks: 394

tingaicompass/AI-Compass

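"AI-Compass" gives the community a heading for navigating the sea of AI technology. Whether you are a beginner or an advanced developer, you can find a path here into each major area of AI. It aims to help developers systematically understand AI's core concepts, mainstream techniques, and frontier trends, and to master the full process from theory to deployment through practice.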

Size: 20.1 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 208 - Forks: 20

aklinker1/vitepress-knowledge

Free, self-hosted LLM chatbot trained on your VitePress website.

Language: TypeScript - Size: 251 KB - Last synced at: about 9 hours ago - Pushed at: 6 months ago - Stars: 49 - Forks: 3

utkuozdemir/nvidia_gpu_exporter

NVIDIA GPU exporter for Prometheus using the nvidia-smi binary

Language: Go - Size: 1.28 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1,248 - Forks: 132

deepsai8/moe_llama

Configurable moe-llama model training and inference built on PyTorch

Language: Python - Size: 1.94 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

aws-samples/awsome-distributed-training

Collection of best practices, reference architectures, model training examples and utilities to train large models on AWS.

Language: Shell - Size: 160 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 334 - Forks: 139

qubasehq/LLMBuilder

LLMBuilder is a production-ready framework for training and fine-tuning Large Language Models (LLMs) — not a model itself. Designed for developers, researchers, and AI engineers, LLMBuilder provides a full pipeline to go from raw text data to deployable, optimized LLMs, all running locally on CPUs or GPUs.

Language: Python - Size: 18.7 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

Raumberg/myllm

Multi-node distributed LLM training framework

Language: Python - Size: 1.66 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 17 - Forks: 1

tuitige/fijian-rag-app

Public-benefit GenAI platform for the Fijian language — combining Claude + RAG + OpenSearch to build verified datasets, preserve culture, and unlock AI access in the Pacific. LLM Fine-Tuning, RAG, Generative AI and learning

Language: TypeScript - Size: 40.2 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 2 - Forks: 1

Nikityyy/lille

A powerful 130-million-parameter model trained from scratch as part of a truly open-source stack, including a custom tokenizer, dataset, and optimizer.

Language: Python - Size: 405 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 25 - Forks: 0

yinizhilian/ICLR2025-Papers-with-Code

Collection of ICLR papers and open-source projects from past years, covering ICLR 2021, ICLR 2022, ICLR 2023, ICLR 2024, and ICLR 2025.

Size: 1.47 MB - Last synced at: 4 days ago - Pushed at: 6 months ago - Stars: 412 - Forks: 19

h2oai/h2o-llmstudio

H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://docs.h2o.ai/h2o-llmstudio/

Language: Python - Size: 54.5 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 4,607 - Forks: 488

feifeibear/long-context-attention

USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference

Language: Python - Size: 4.61 MB - Last synced at: 2 days ago - Pushed at: about 2 months ago - Stars: 556 - Forks: 64

InternLM/xtuner

An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)

Language: Python - Size: 2.14 MB - Last synced at: 4 days ago - Pushed at: 24 days ago - Stars: 4,726 - Forks: 354

R-D-BioTech-Alaska/Qelm

Qelm - Quantum Enhanced Language Model (Qubit)

Language: Python - Size: 20.5 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 16 - Forks: 7

little51/llm-dev

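Companion resources for the book "Large Model Projects in Practice: Multi-Domain Intelligent Application Development" (《大模型项目实战:多领域智能应用开发》)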

Language: JavaScript - Size: 2.39 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 168 - Forks: 32

OpenSQZ/MegatronApp

Toolchain built around the Megatron-LM for Distributed Training

Language: Python - Size: 90.7 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 62 - Forks: 4

mallorbc/Finetune_LLMs

Repo for fine-tuning Causal LLMs

Language: Python - Size: 8.22 MB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 457 - Forks: 85

andrew264/modelex

Doing devious stuff with AI

Language: Python - Size: 429 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 2 - Forks: 0

langfengQ/verl-agent

verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for the paper "Group-in-Group Policy Optimization for LLM Agent Training"

Language: Python - Size: 37.7 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 814 - Forks: 56

ludwig-ai/ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models

Language: Python - Size: 31.8 MB - Last synced at: 5 days ago - Pushed at: 11 days ago - Stars: 11,577 - Forks: 1,217

gruai/koifish

A C++ framework for efficient training and fine-tuning of LLMs

Language: C++ - Size: 16.6 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 9 - Forks: 0

StigLidu/DualDistill

The official implementation of the paper "Agentic-R1: Distilled Dual-Strategy Reasoning"

Language: Python - Size: 1.91 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 94 - Forks: 4

bd4sur/Nano

Electronic Parrot / Toy Language Model

Language: C - Size: 30.5 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 195 - Forks: 11

databricks/dbrx

Code examples and resources for DBRX, a large language model developed by Databricks

Language: Python - Size: 63.5 KB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 2,571 - Forks: 244

ziming/laravel-scrapingbee

A PHP Laravel Library for Scrapingbee Web Scraping API. AI querying supported

Language: PHP - Size: 66.4 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 30 - Forks: 5

rohan-paul/LLM-FineTuning-Large-Language-Models

LLM (Large Language Model) FineTuning

Language: Jupyter Notebook - Size: 11.3 MB - Last synced at: 4 days ago - Pushed at: 5 months ago - Stars: 560 - Forks: 134

Cailailai/gptchinese

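Roundup of ChatGPT-4 Chinese-language mirror sites usable in China (2025/08/30) [Mirror site collection]. For convenience, I have compiled some ChatGPT mirror sites that are available in China. Each site has its strengths and weaknesses; choose according to your needs.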

Size: 71.3 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

vbosoxo/chatgpt-gpts

[Collection of ChatGPT mirror sites in China] Roundup of ChatGPT-4 Chinese-language mirror sites usable in China (2025/08/30)

Size: 85.9 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1 - Forks: 0

vbosoxo/chatgpt-website-zhongwen

[Collection of ChatGPT mirror sites] Roundup of ChatGPT-4 Chinese-language mirror sites usable in China (2025/08/30)

Language: HTML - Size: 268 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

fluoos/crawl2ai

A powerful tool for generating and managing fine-tuning datasets for large models.

Language: Python - Size: 936 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 17 - Forks: 2

intelligent-machine-learning/dlrover

DLRover: An Automatic Distributed Deep Learning System

Language: Python - Size: 191 MB - Last synced at: 5 days ago - Pushed at: 10 days ago - Stars: 1,533 - Forks: 196

firojalam/LlamaLens

This repository contains the resources, code, and documentation for LlamaLens, a specialized multilingual large language model (LLM) designed to analyze news and social media content effectively. LlamaLens supports multiple languages, including Arabic, English, and Hindi, and is tailored for diverse tasks such as sentiment analysis, misinformation.

Language: Python - Size: 102 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 8 - Forks: 1

robinglory/AI-Driven-Robotic-Head-for-Language-Learning-Assistance

This is my mini thesis for my Bachelor's degree in Mechatronics Engineering!

Language: Python - Size: 223 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 1 - Forks: 0

volcengine/veScale

A PyTorch Native LLM Training Framework

Language: Python - Size: 2.5 MB - Last synced at: 4 days ago - Pushed at: about 2 months ago - Stars: 861 - Forks: 51

newking9088/gpt_llama_rag_fine_tuning_classification

A repository for implementing and evaluating state-of-the-art LLM techniques including fine-tuning, Retrieval-Augmented Generation (RAG), and model evaluation.

Language: Jupyter Notebook - Size: 22.7 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

ai4sd/number-token-loss

PyPI package for number token loss. Docs at:

Language: Python - Size: 801 KB - Last synced at: 9 days ago - Pushed at: 10 days ago - Stars: 6 - Forks: 1

thetwopct/folder2txt

Convert local folder contents into a single text file with ease - perfect for analysis, documentation, or AI/LLM training purposes.

Language: JavaScript - Size: 52.7 KB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 3 - Forks: 1

Cre4T3Tiv3/Cre4T3Tiv3

Size: 696 KB - Last synced at: 8 days ago - Pushed at: 14 days ago - Stars: 32 - Forks: 3

taabishhh/LLM_Training

This project implements a distributed pipeline for NLP model training using Apache Spark and DeepLearning4J (DL4J). The methodology utilizes a sliding window approach for data preparation, positional embeddings for token encoding, and Word2Vec model training with parallel processing. The model and training process is designed for scalability and op

Language: Scala - Size: 17 MB - Last synced at: 6 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

scouzi1966/AFMTrainer

A comprehensive GUI wrapper application for Apple's Foundation Models Adapter Training Toolkit, providing an intuitive interface for training LoRA adapters for Apple's on-device foundation models.

Language: Python - Size: 43.9 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

alecpippas/AI-Systems-Portfolio

A portfolio showcasing modern AI pipelines (ETL, training, fine-tuning, etc.) and a text-to-video RAG system prototype. Included are several computer vision pipelines for object detection, object segmentation, and anomaly detection. The RAG system takes textual user prompts and returns relevant video clips from a corpus of class lecture recordings

Language: Jupyter Notebook - Size: 179 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

adithya-s-k/CompanionLLM

CompanionLLM - A framework to finetune LLMs to be your own sentient conversational companion

Language: Jupyter Notebook - Size: 40.1 MB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 42 - Forks: 5

versoindustries/HighNoonLLM

HighNoon LLM uses Hierarchical Spatial Neural Memory (HSMN) to process language like humans, organizing text into a tree for efficiency. It cuts computing needs by 78x, excelling in summarization, coding, and Q&A, while running locally for privacy.

Language: Python - Size: 28.3 MB - Last synced at: 8 days ago - Pushed at: 12 days ago - Stars: 7 - Forks: 2

amazon-science/Cyber-Zero

Cyber-Zero: Training Cybersecurity Agents Without Runtime

Language: Python - Size: 963 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 9 - Forks: 2

hadirsa/spring-data-llm-adapter

A Spring Boot library that extracts metadata from JPA entities (@Entity, @Table, @ManyToMany, etc.) and exposes it in an LLM-friendly JSON-like structure. This allows large language models to understand your domain model and generate queries or insights automatically.

Language: Kotlin - Size: 131 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 1 - Forks: 0

PiyushPratap10/creating_llms

Creating LLMs from scratch.

Language: Jupyter Notebook - Size: 3.43 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

0xnu/tipus-micro-llm

Character-level and token-based language models implemented in pure PyTorch.

Language: Jupyter Notebook - Size: 65.7 MB - Last synced at: 7 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

Cailailai/cailailai.github.io

ChatGPT Chinese edition | Free recommendations of ChatGPT mirror sites in China (supports GPT-4) [2025-08-23]

Language: HTML - Size: 235 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

Cailailai/chatgpt-chinese-new

How do you use ChatGPT in China? The easiest-to-follow introduction and usage tutorial for the ChatGPT Chinese edition [updated August 23, 2025]

Size: 72.3 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 1

Cailailai/New-site-Chatgpt

ChatGPT Chinese edition: efficient usage guide and mirror site recommendations (supports GPT-4, GPT-4o, GPT-o1, GPT-o3, Deepseek, Grok3; no VPN required)

Size: 74.2 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

Cailailai/chatgpt-4o

Updated August 23, 2025 | Chinese-language mirror sites for the Chat GPT-4o model usable in China

Language: HTML - Size: 239 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

vbosoxo/chatgpt-site-2025

[Updated August 23, 2025] ChatGPT official Chinese edition: guide to accessing ChatGPT mirrors from China (supports GPT-4, GPT-4o, GPT-o1; no VPN required)

Size: 96.7 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

vbosoxo/vbosoxo.github.io

[2025-08-23] ChatGPT Chinese edition | Free recommendations of ChatGPT mirror sites in China (supports GPT-4, GPT-4o, o1-preview, GPT-o3, deepseek, grok3)

Language: HTML - Size: 278 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

vbosoxo/Chatgpt-website-chinese

ChatGPT Chinese edition: guide to access from China (supports GPT-4, GPT-4o, GPT-o1; no VPN required), updated August 23, 2025

Size: 93.8 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

vbosoxo/chatgpt-new

ChatGPT Chinese edition: guide to access from China (supports GPT-4, 4o, and o1; no VPN required) [updated August 23, 2025]

Language: HTML - Size: 291 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

zhuhanqing/APOLLO

APOLLO: SGD-like Memory, AdamW-level Performance; MLSys'25 Outstanding Paper Honorable Mention

Language: Python - Size: 34.2 MB - Last synced at: 8 days ago - Pushed at: 4 months ago - Stars: 253 - Forks: 11

Simplifine-gamedev/Simplifine

🚀 Easy, open-source LLM finetuning with one-line commands, seamless cloud integration, and popular optimization frameworks. ✨

Language: Python - Size: 844 KB - Last synced at: 13 days ago - Pushed at: about 1 year ago - Stars: 93 - Forks: 4

DLCLambo/action

🌐 Trigger site deployments and run commands easily with GitHub Actions for Netlify, streamlining your workflow and enhancing site management.

Language: Shell - Size: 17.6 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

ShammiAnand/rtt

Repo/URL to text for easy interaction with LLMs, all built in

Language: Go - Size: 15.1 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 3 - Forks: 0

Cheng-Chun-Yen/the-theory-of-sleep-instinct

Original sealed theory on sleep mechanism – dual-language semantic corpus for LLM alignment tests. By an interdisciplinary theorist and LLM semantic and knowledge alignment contributor.

Language: HTML - Size: 33 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

ASPLOS26-AsymCheck/AsymCheck

AsymCheck: Asymmetric Checkpointing for Efficient Large Language Model Training

Language: Python - Size: 16.7 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

InternLM/InternEvo

InternEvo is an open-source lightweight training framework that aims to support model pre-training without the need for extensive dependencies.

Language: Python - Size: 6.79 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 404 - Forks: 70

plandes/lmtask

Inferencing and Training Large Language Model Tasks

Language: Python - Size: 338 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 1 - Forks: 0

mtuann/llm-updated-papers

Papers related to Large Language Models in all top venues

Size: 926 KB - Last synced at: 16 days ago - Pushed at: 17 days ago - Stars: 10 - Forks: 2

google/litmus

Litmus is a comprehensive LLM testing and evaluation tool designed for GenAI Application Development. It provides a robust platform with a user-friendly UI for streamlining the process of building and assessing the performance of your LLM-powered applications.

Language: Vue - Size: 303 MB - Last synced at: 3 days ago - Pushed at: about 2 months ago - Stars: 37 - Forks: 4

lolvr69/LLMs-from-scratch

Chinese version of LLMs-from-scratch: implement a ChatGPT-like large language model (LLM) from scratch in PyTorch

Size: 1.95 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

CodeWizardWalter/AI-Studio

AI-Studio 🐙 Streamlit toolkit for devs & creators with summarization, README & blog writer, code explainer, commit message and image-prompt generators.

Language: Python - Size: 11.7 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

tum-ai/number-token-loss

A regression-like loss to improve numerical reasoning in language models - ICML 2025

Language: Jupyter Notebook - Size: 129 MB - Last synced at: 7 days ago - Pushed at: 19 days ago - Stars: 24 - Forks: 5

microsoft/LLF-Bench

A benchmark for evaluating learning agents based on just language feedback

Language: Python - Size: 14.6 MB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 87 - Forks: 18

rahul-shrivastav/NeuralPy

NeuralPy is a web-based application powered by a fine-tuned TinyLLaMA model that generates Python code from natural language queries and vice versa.

Language: Jupyter Notebook - Size: 40.1 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

thatoldfarm/vf

Virtual Forest Framework: the 'virtual-forest' repo contents set up a framework for an interactive game/environment for an AI (Artificial Intelligence) in a not-so-virtual world called the "Virtual Forest."

Size: 2.71 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 0 - Forks: 0

yuki-2025/MediNotes

MediNotes: SOAP Note Generation through Ambient Listening, Large Language Model Fine-Tuning, and RAG

Language: Python - Size: 157 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 47 - Forks: 0

yuki-2025/Dyna_Swarm

AgentNet, the first-ever open‐source flexible graph-based multi-agent system that lets researchers and practitioners define, train, and deploy dynamic collaboration graphs without reimplementing core RL or LLM

Language: Python - Size: 6.23 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 79 - Forks: 4

dmeldrum6/LLM-Dataset-Builder

LLM-Powered Dataset Creation Tool

Language: HTML - Size: 66.4 KB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 1 - Forks: 0

dino65-dev/Transformers

Transformers from scratch: implemented GQA, RoPE, and RMS-Norm, and trained on that code

Language: Jupyter Notebook - Size: 276 KB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 0 - Forks: 0

gitleaks/gitleaks

Find secrets with Gitleaks 🔑

Language: Go - Size: 5.87 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 22,929 - Forks: 1,750

armbues/SiLLM

SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework.

Language: Python - Size: 618 KB - Last synced at: 6 days ago - Pushed at: 3 months ago - Stars: 278 - Forks: 26

AlibabaPAI/torchacc

PyTorch distributed training acceleration framework

Language: Python - Size: 33 MB - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 51 - Forks: 9

suhasi-gadge/Open_Avenues_Build_Fellowship_Projects

🧠💼 Projects developed as a student consultant during the Open Avenues Build Fellowship—featuring NLP-driven medical data extraction and a B2B aggregator model for an e-commerce platform—focused on real-world problem-solving using modern tech stacks.

Language: Jupyter Notebook - Size: 4.96 MB - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 0 - Forks: 0

ai-naymul/gguf-webui

Web UI for quantizing models and running inference

Language: Python - Size: 31.3 KB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 1 - Forks: 0

zachdwight/LLM_Training_Prep_3Step

This repository provides a 3-step pipeline to streamline the creation of high-quality training data for LLMs. It automates PDF content extraction, uses a local LLM to draft initial Q&A pairs, and includes both a human-in-the-loop curation step and a final automated quality check to produce a clean, ready-to-use dataset.

Language: Python - Size: 88.9 KB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 0 - Forks: 0

cikay/kurdish-kurmanci-llm-train

Kurdish Kurmanji LLM training script

Language: Python - Size: 94.7 KB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 0 - Forks: 0

sugarcane-ai/sugarcane-ai

npm-like package ecosystem for prompts 🤖

Language: TypeScript - Size: 11.5 MB - Last synced at: 7 days ago - Pushed at: 7 months ago - Stars: 51 - Forks: 14

qlaxd/Large-Language-Diffusion-with-mAsking

Implementing Diffusion Models for Language Generation

Language: Python - Size: 429 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 0 - Forks: 0

dazeb/markdown-downloader

An MCP server that downloads any webpage as Markdown in an instant. Download docs straight to your IDE for AI context. Powered by Jina.ai

Language: JavaScript - Size: 30.3 KB - Last synced at: 6 days ago - Pushed at: about 2 months ago - Stars: 36 - Forks: 14

golololologol/LLM-Distillery

A pipeline for LLM knowledge distillation

Language: Python - Size: 562 KB - Last synced at: 7 days ago - Pushed at: 5 months ago - Stars: 108 - Forks: 13

Souptik96/efficient-domain-tuning

Research paper on efficient fine-tuning of small open-source models for domain-specific tasks.

Size: 12.7 KB - Last synced at: 3 days ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

aayes89/PyLLM

Train your own LLM from scratch

Language: Python - Size: 11.7 KB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 0 - Forks: 0

Kitsunp/Small-lenguaje-Model-Hybrid-Norm-Furier-Formers

A compact language model implementing HybridNorm and Fourier-based attention. Combines CoLA (low-rank projections), FANformer, and hybrid normalization to create an efficient decoder-only transformer. Leverages periodicity modeling and gated residuals to enhance performance while maintaining a small parameter footprint.

Language: Python - Size: 4.64 MB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 4 - Forks: 0