Topic: "reasoning-language-models"
mims-harvard/TxAgent
TxAgent: An AI Agent for Therapeutic Reasoning Across a Universe of Tools
Language: Python - Size: 55.9 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 420 - Forks: 63

dvlab-research/Seg-Zero
Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"
Language: Python - Size: 4.4 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 298 - Forks: 7

krystalan/DRT
Deep Reasoning Translation via Reinforcement Learning (arXiv preprint 2025); DRT: Deep Reasoning Translation via Long Chain-of-Thought (arXiv preprint 2024)
Size: 2.6 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 218 - Forks: 9

a-m-team/a-m-models
a-m-team's exploration in large language modeling
Size: 7.1 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 51 - Forks: 0

The-FinAI/Fino1
This repo develops reasoning models for the financial domain, aiming to enhance models' capabilities in handling financial reasoning tasks.
Language: Jupyter Notebook - Size: 137 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 41 - Forks: 7

spcl/x1
Official Implementation of "Reasoning Language Models: A Blueprint"
Language: Python - Size: 563 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 37 - Forks: 6

Wild-Cooperation-Hub/Awesome-MLLM-Reasoning-Benchmarks
A Comprehensive Survey on Evaluating Reasoning Capabilities in Multimodal Large Language Models.
Size: 89.8 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 26 - Forks: 2

zihao-ai/BoT
🔥🔥🔥Breaking long thought processes of o1-like LLMs, such as DeepSeek-R1, QwQ
Language: Python - Size: 13.9 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 19 - Forks: 0

linhaowei1/kumo
☁️ KUMO: Generative Evaluation of Complex Reasoning in Large Language Models
Language: Jupyter Notebook - Size: 630 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 17 - Forks: 0

mdda/getting-to-aha-with-tpus
Reasoning-from-Zero using gemma.JAX.nnx on TPUs
Language: Python - Size: 292 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 9 - Forks: 0

Ruiyang-061X/Awesome-MLLM-Reasoning
📖Curated list about the reasoning ability of MLLMs, including OpenAI o1, OpenAI o3-mini, and Slow-Thinking.
Size: 7.81 KB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 6 - Forks: 0

mims-harvard/ToolUniverse
ToolUniverse is a collection of biomedical tools designed for AI agents
Language: Python - Size: 2.93 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 5 - Forks: 0

DolbyUUU/Sudoku4LLM
Sudoku4LLM is a Sudoku dataset generator for training and evaluating reasoning in Large Language Models (LLMs). It offers customizable puzzles, difficulty levels, and 11 serialization formats to support structured data reasoning and Chain of Thought (CoT) experiments.
Language: Python - Size: 29.3 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 4 - Forks: 0

Trustworthy-ML-Lab/ThinkEdit
An effective weight-editing method for mitigating overly short reasoning in LLMs, and a mechanistic study uncovering how reasoning length is encoded in the model’s representation space.
Language: Python - Size: 6.9 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 4 - Forks: 1

NLPForUA/ZNO
Structured test tasks and model tuning scripts for multiple subjects from ZNO - the Ukrainian External Independent Evaluation (ЗНО)
Language: Python - Size: 2.19 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 3 - Forks: 0

DolbyUUU/Logic-RL-Lite
Lightweight replication study of DeepSeek-R1-Zero. Interesting findings include "No Aha Moment", "Longer CoT ≠ Accuracy", and "Language Mixing in Instruct Models".
Language: Python - Size: 14.8 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 3 - Forks: 0

aryan-jadon/Synthetic-Data-Generation-and-Evaluation-using-Reasoning-Models
This repository contains the implementation of our research on optimizing Retrieval-Augmented Generation (RAG) systems for technical domains. Our work addresses the unique challenges of precise information extraction from complex, domain-specific documents by introducing token-aware evaluation metrics and a synthetic data generation pipeline.
Language: Jupyter Notebook - Size: 13.9 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 2 - Forks: 0

XIXUM/XIXUM-modeler
AI Model Generator
Language: Java - Size: 22.6 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0

DolbyUUU/DeepEnlighten
Pure RL without SFT to post-train base models for social reasoning capabilities. Lightweight replication of DeepSeek-R1-Zero with Social IQa dataset.
Language: Python - Size: 21.6 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

zyuanlim/Awesome-Open-Reasoning
A curated list of awesome open-source and open-weight language models or methods focused on reasoning capabilities.
Size: 2.93 KB - Last synced at: 7 days ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

tubiccelavi/Poker-COACH
AI VR Machine Learning Natural Language Poker Coach
Language: JavaScript - Size: 42 KB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 0 - Forks: 0

xhinini/LLM-Reasoning-Review
A curated collection of research papers on reasoning capabilities of Large Language Models (LLMs). This repository organizes and categorizes works that evaluate, benchmark, and analyze reasoning in LLMs, including methods, techniques, datasets, and survey papers.
Size: 26.4 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

Hyun-Ryu/clover
Official code for "Divide and Translate: Compositional First-Order Logic Translation and Verification for Complex Logical Reasoning", ICLR 2025.
Language: Python - Size: 404 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0
