GitHub topics: llm-reasoning
inclusionAI/AReaL
Distributed RL System for LLM Reasoning
Language: Python - Size: 5.27 MB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 1,145 - Forks: 52

yinizhilian/ICLR2025-Papers-with-Code
A collection of ICLR papers and open-source projects from past years, covering ICLR 2021, ICLR 2022, ICLR 2023, ICLR 2024, and ICLR 2025.
Size: 1.47 MB - Last synced at: 4 days ago - Pushed at: about 1 month ago - Stars: 240 - Forks: 11

AmanPriyanshu/My-Personal-Model-Picks
A curated showcase of small, on-premise models with great utility, enabling privacy-centered deployment and easy adaptation to bio-inspired use cases.
Language: HTML - Size: 65.4 KB - Last synced at: 3 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

IAAR-Shanghai/Awesome-Attention-Heads
An awesome repository and a comprehensive survey on the interpretability of LLM attention heads.
Language: TeX - Size: 6.07 MB - Last synced at: 8 days ago - Pushed at: about 2 months ago - Stars: 338 - Forks: 10

inclusionAI/Ling
Ling is an MoE LLM provided and open-sourced by InclusionAI.
Language: Python - Size: 3.26 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 134 - Forks: 13

bruno686/Awesome-RL-based-LLM-Reasoning
Awesome RL-based LLM Reasoning
Size: 48.8 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 415 - Forks: 21

CodeEval-Pro/CodeEval-Pro
Official repo for "HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation Task"
Language: Python - Size: 4.1 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 26 - Forks: 2

AStroCvijo/react_reproduction
Reproduction of ICLR 2023 paper "ReAct: Synergizing Reasoning and Acting in Language Models".
Language: Python - Size: 5.83 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 1 - Forks: 0
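
ReAct interleaves free-form reasoning ("Thought") with tool calls ("Action") and feeds each tool result ("Observation") back into the prompt. A minimal sketch of that loop, assuming a generic `call_llm` wrapper and toy tools; none of the names below are taken from this repository:

```python
# Minimal sketch of the ReAct loop (Thought -> Action -> Observation).
# `call_llm` and the tool registry are hypothetical placeholders.
import re

def call_llm(prompt: str) -> str:
    """Placeholder for any chat/completion API; returns the model's next step."""
    raise NotImplementedError

TOOLS = {
    "search": lambda q: f"(search results for {q!r})",   # stand-in tool
    "lookup": lambda q: f"(lookup result for {q!r})",
}

def react(question: str, max_steps: int = 8) -> str:
    trace = f"Question: {question}\n"
    for _ in range(max_steps):
        step = call_llm(trace + "Thought:")               # model emits Thought + Action
        trace += f"Thought:{step}\n"
        match = re.search(r"Action:\s*(\w+)\[(.*?)\]", step)
        if not match:                                     # no action -> treat as final answer
            return step.strip()
        tool, arg = match.group(1), match.group(2)
        observation = TOOLS.get(tool, lambda q: "unknown tool")(arg)
        trace += f"Observation: {observation}\n"          # feed observation back into context
    return trace
```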

ethicalabs-ai/ouroboros
Self-Improving LLMs Through Iterative Refinement
Language: Python - Size: 429 KB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 3 - Forks: 0

Cristian-Curaba/CryptoFormalEval
We introduce a benchmark for testing how well LLMs can find vulnerabilities in cryptographic protocols. By combining LLMs with symbolic reasoning tools like Tamarin, we aim to improve the efficiency and thoroughness of protocol analysis, paving the way for future AI-powered cybersecurity defenses.
Language: Haskell - Size: 7.43 MB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 5 - Forks: 1

YangLing0818/buffer-of-thought-llm
[NeurIPS 2024 Spotlight] Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
Language: Python - Size: 1.07 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 612 - Forks: 57
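
Buffer of Thoughts reasons with reusable high-level "thought templates" retrieved from a meta-buffer and instantiated for the problem at hand. A rough illustrative sketch of that retrieve-and-instantiate idea, with a toy similarity measure and made-up templates rather than the repository's actual code:

```python
# Illustrative-only sketch of thought-template retrieval; templates and scoring are invented.
from difflib import SequenceMatcher

META_BUFFER = {
    "arithmetic word problem": "Extract the quantities, write an equation, solve it step by step.",
    "code debugging": "Reproduce the error, localize the faulty line, propose and verify a fix.",
}

def retrieve_template(task_description: str) -> str:
    """Pick the stored thought template whose key best matches the task description."""
    score = lambda key: SequenceMatcher(None, key, task_description.lower()).ratio()
    best_key = max(META_BUFFER, key=score)
    return META_BUFFER[best_key]

def thought_augmented_prompt(problem: str) -> str:
    template = retrieve_template(problem)
    # The retrieved high-level template is instantiated with the concrete problem
    # before being sent to the LLM for reasoning.
    return f"Template: {template}\nProblem: {problem}\nSolve by following the template."
```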

YangLing0818/SuperCorrect-llm
[ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction
Language: Python - Size: 3.69 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 62 - Forks: 4

tsinghua-fib-lab/SmartAgent
The official repository of "SmartAgent: Chain-of-User-Thought for Embodied Personalized Agent in Cyber World".
Size: 4.68 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 24 - Forks: 1

nl4opt/ORQA
[AAAI 2025] ORQA is a new QA benchmark designed to assess the reasoning capabilities of LLMs in the specialized technical domain of Operations Research. The benchmark evaluates whether LLMs can emulate the knowledge and reasoning skills of OR experts when presented with complex optimization modeling tasks.
Size: 2.48 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 37 - Forks: 0

pittisl/PhyT2V
Official code repository for the CVPR 2025 paper "PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation".
Language: Python - Size: 116 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 11 - Forks: 1

tegridydev/mixture-of-persona-research
A “Mixture of Perspectives” Framework for Ethical AI
Size: 12.7 KB - Last synced at: 19 days ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

hriaz17/SayLessRAG
Code for the paper: "Say Less, Mean More: Leveraging Pragmatics in Retrieval-Augmented Generation"
Size: 4.88 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

trangdf/Creating-sample-means-for-measurement-standards-of-intelligence
Creating sample means for measurement standards of intelligence
Size: 1000 Bytes - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

ogrnv/Creating-sample-means-for-measurement-standards-of-intelligence
Creating sample means for measurement standards of intelligence
Language: C - Size: 153 KB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

bowen-upenn/llm_token_bias
[EMNLP 2024] This is the official implementation of the paper "A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners" in PyTorch.
Language: Python - Size: 57.4 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 9 - Forks: 1

UKPLab/emnlp2024-code-prompting
Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs. EMNLP 2024
Language: Python - Size: 46.6 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 15 - Forks: 5
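
The idea behind code prompting is to re-express a natural-language conditional-reasoning problem as code (keeping the original text in comments) before querying the LLM. A hedged sketch under that reading; the exact prompt format in the paper may differ, and `ask_llm` is a placeholder:

```python
# Hedged sketch of the general "code prompting" idea, not the paper's exact format.
def to_code_prompt(conditions: list[str], question: str) -> str:
    lines = ["# Conditions expressed as pseudo-code"]
    for i, cond in enumerate(conditions):
        lines.append(f"condition_{i} = True  # {cond}")
    lines.append(f"# Question: {question}")
    lines.append("answer = ...  # to be filled in by reasoning over the conditions above")
    return "\n".join(lines)

def ask_llm(prompt: str) -> str:
    """Placeholder for an LLM call."""
    raise NotImplementedError

prompt = to_code_prompt(
    ["You qualify for benefit A if you are over 65.", "Applicants under 65 need form B."],
    "Does a 70-year-old applicant need form B?",
)
# answer = ask_llm(prompt)
```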
