GitHub topics: long-context
ByteDance-Seed/ShadowKV
[ICML 2025 Spotlight] ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
Language: Python - Size: 20.5 MB - Last synced at: about 23 hours ago - Pushed at: 16 days ago - Stars: 179 - Forks: 12

Infini-AI-Lab/TriForce
[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
Language: Python - Size: 71.7 MB - Last synced at: 2 days ago - Pushed at: 9 months ago - Stars: 250 - Forks: 17

THUDM/LongCite
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA
Language: Python - Size: 15.2 MB - Last synced at: 2 days ago - Pushed at: 5 months ago - Stars: 491 - Forks: 32

metame-ai/awesome-llm-plaza
Awesome LLM Plaza: daily tracking of all sorts of awesome LLM topics, e.g. LLMs for coding, robotics, reasoning, multimodal, etc.
Size: 1.06 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 196 - Forks: 13

dvlab-research/LongLoRA
Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)
Language: Python - Size: 11.1 MB - Last synced at: 3 days ago - Pushed at: 9 months ago - Stars: 2,658 - Forks: 288

THUDM/LongWriter
[ICLR 2025] LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
Language: Python - Size: 178 KB - Last synced at: 4 days ago - Pushed at: 7 months ago - Stars: 1,650 - Forks: 162

lucidrains/infini-transformer-pytorch
Implementation of Infini-Transformer in PyTorch
Language: Python - Size: 34.2 MB - Last synced at: about 21 hours ago - Pushed at: 4 months ago - Stars: 111 - Forks: 4

lucidrains/MEGABYTE-pytorch
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in PyTorch
Language: Python - Size: 34.5 MB - Last synced at: 4 days ago - Pushed at: 5 months ago - Stars: 643 - Forks: 55

lucidrains/recurrent-memory-transformer-pytorch
Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in PyTorch
Language: Python - Size: 34.3 MB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 407 - Forks: 17

lucidrains/ring-attention-pytorch
Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in PyTorch
Language: Python - Size: 1.01 MB - Last synced at: 3 days ago - Pushed at: 7 months ago - Stars: 513 - Forks: 31

InternLM/InternLM
Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).
Language: Python - Size: 7.12 MB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 6,898 - Forks: 483

abdelfattah-lab/xKV
xKV: Cross-Layer SVD for KV-Cache Compression
Language: Python - Size: 30.9 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 24 - Forks: 1

Linking-ai/SCOPE
SCOPE: Optimizing KV Cache Compression in Long-context Generation
Language: Jupyter Notebook - Size: 6.21 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 23 - Forks: 2

NVIDIA/kvpress
LLM KV cache compression made easy
Language: Python - Size: 5.55 MB - Last synced at: 3 days ago - Pushed at: 12 days ago - Stars: 476 - Forks: 36

bowen-upenn/PersonaMem
Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale
Language: Python - Size: 523 MB - Last synced at: 4 days ago - Pushed at: 9 days ago - Stars: 14 - Forks: 1

haoliuhl/ringattention
Large Context Attention
Language: Python - Size: 161 KB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 710 - Forks: 53

VITA-MLLM/Long-VITA
✨✨Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy
Language: Python - Size: 3.85 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 275 - Forks: 28

X-PLUG/WritingBench
WritingBench: A Comprehensive Benchmark for Generative Writing
Language: Python - Size: 18 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 73 - Forks: 8

adobe-research/NoLiMa
Official repository for "NoLiMa: Long-Context Evaluation Beyond Literal Matching"
Language: Python - Size: 53.7 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 67 - Forks: 4

nightdessert/Retrieval_Head
Open-source code for the paper "Retrieval Head Mechanistically Explains Long-Context Factuality"
Language: Python - Size: 1020 KB - Last synced at: 10 days ago - Pushed at: 10 months ago - Stars: 189 - Forks: 18

bigai-nlco/LooGLE
ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models
Language: Python - Size: 7.15 MB - Last synced at: 8 days ago - Pushed at: 7 months ago - Stars: 182 - Forks: 6

QingFei1/LongRAG
[EMNLP 2024] LongRAG: A Dual-perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering
Language: Python - Size: 2.1 MB - Last synced at: 11 days ago - Pushed at: 4 months ago - Stars: 102 - Forks: 14

CognitiveAISystems/RATE
Official implementation of Recurrent Action Transformer with Memory, an offline RL agent with memory mechanisms. https://sites.google.com/view/rate-model/
Language: Shell - Size: 58.6 MB - Last synced at: 29 days ago - Pushed at: 30 days ago - Stars: 8 - Forks: 2

OpenBMB/InfiniteBench
Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
Language: Python - Size: 6.2 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 319 - Forks: 26

VITA-Group/Ms-PoE
"Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding" Zhenyu Zhang, Runjin Chen, Shiwei Liu, Zhewei Yao, Olatunji Ruwase, Beidi Chen, Xiaoxia Wu, Zhangyang Wang.
Language: Python - Size: 7.53 MB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 29 - Forks: 3

Glaciohound/LM-Infinite
Implementation of NAACL 2024 Outstanding Paper "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"
Language: Python - Size: 2.09 MB - Last synced at: 2 days ago - Pushed at: 2 months ago - Stars: 142 - Forks: 13

OpenGVLab/MM-NIAH
[NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of existing MLLMs to comprehend long multimodal documents.
Language: Python - Size: 2.83 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 115 - Forks: 6

THUDM/LongBench
LongBench v2 and LongBench (ACL 2024)
Language: Python - Size: 8.47 MB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 836 - Forks: 80

lucidrains/perceiver-ar-pytorch
Implementation of Perceiver AR, DeepMind's new long-context attention network based on Perceiver architecture, in PyTorch
Language: Python - Size: 34.2 MB - Last synced at: 9 days ago - Pushed at: about 2 years ago - Stars: 87 - Forks: 4

THUDM/LongAlign
[EMNLP 2024] LongAlign: A Recipe for Long Context Alignment of LLMs
Language: Python - Size: 6.08 MB - Last synced at: 1 day ago - Pushed at: 5 months ago - Stars: 249 - Forks: 19

OpenMOSS/Thus-Spake-Long-Context-LLM
A survey of long-context LLMs from four perspectives: architecture, infrastructure, training, and evaluation
Size: 50.8 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 45 - Forks: 1

thunlp/InfLLM
The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory"
Language: Python - Size: 273 KB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 348 - Forks: 34

kingabzpro/Gemini-2.5-Pro-Coding-App
Load the project from the zip file and ask Gemini 2.5 Pro to improve it.
Language: Jupyter Notebook - Size: 16.6 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

nick7nlp/Counting-Stars
Counting-Stars (★)
Language: Jupyter Notebook - Size: 120 MB - Last synced at: 11 days ago - Pushed at: 9 months ago - Stars: 82 - Forks: 2

yangjianxin1/LongQLoRA
LongQLoRA: Extend Context Length of LLMs Efficiently
Language: Python - Size: 1.92 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 164 - Forks: 15

bigai-nlco/VideoLLaMB
Official Repository of VideoLLaMB: Long Video Understanding with Recurrent Memory Bridges
Language: Python - Size: 59.4 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 63 - Forks: 2

open-compass/Ada-LEval
The official implementation of "Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks"
Language: Python - Size: 1.72 MB - Last synced at: 24 days ago - Pushed at: about 1 year ago - Stars: 53 - Forks: 3

ZetangForward/MyRLHF (fork of OpenRLHF/OpenRLHF)
Copy from OpenRLHF
Language: Jupyter Notebook - Size: 305 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

dmis-lab/ETHIC
[NAACL 2025] ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage
Language: Python - Size: 367 KB - Last synced at: 13 days ago - Pushed at: 4 months ago - Stars: 14 - Forks: 0

4AI/RAN
RAN: Recurrent Attention Networks for Long-text Modeling | Findings of ACL23
Language: Python - Size: 556 KB - Last synced at: 25 days ago - Pushed at: almost 2 years ago - Stars: 22 - Forks: 3

framsouza/slack-gemini-summarizer
A solution to fetch and analyze Slack channel conversations, leveraging the Gemini 1.5 Pro API for summarization.
Language: Python - Size: 3.91 KB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

lucidrains/flash-genomics-model
My own attempt at a long context genomics model, leveraging recent advances in long context attention modeling (Flash Attention + other hierarchical methods)
Language: Python - Size: 12.7 KB - Last synced at: 18 days ago - Pushed at: almost 2 years ago - Stars: 52 - Forks: 5

rgtjf/Untie-the-Knots
Untie-the-Knots: An Efficient Data Augmentation Strategy for Long-Context Pre-Training in Language Models
Size: 12.7 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

davendw49/Awesome-Long-Context-Language-Modeling
Papers of Long Context Language Model
Size: 14.6 KB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 10 - Forks: 1

dvlab-research/Q-LLM
This is the official repo of "QuickLLaMA: Query-aware Inference Acceleration for Large Language Models"
Language: Python - Size: 6.84 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 29 - Forks: 0

jeffreysijuntan/lloco
The official repo for "LLoCo: Learning Long Contexts Offline"
Language: Python - Size: 155 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 93 - Forks: 9

sayhitosandy/Mamba_SSM
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Size: 5.52 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

melvinebenezer/Liah-Lie_in_a_haystack
needle in a haystack for LLMs
Language: Python - Size: 2.42 MB - Last synced at: 7 days ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

lucaslingle/e-lra
Streamlined variant of Long-Range Arena with pinned dependencies, automated data downloads, and deterministic shuffling.
Language: Python - Size: 186 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

nopperl/corporate_emission_reports
Language: TeX - Size: 937 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

asigalov61/Heptabit-Music-Transformer
[DEPRECATED] Very fast, large music transformer with 8k sequence length, efficient heptabit MIDI notes encoding, true full MIDI instruments range, chord counters and outro tokens
Language: Python - Size: 1.3 MB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 15 - Forks: 0
