An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: long-context

ByteDance-Seed/ShadowKV

[ICML 2025 Spotlight] ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference

Language: Python - Size: 20.5 MB - Last synced at: about 23 hours ago - Pushed at: 16 days ago - Stars: 179 - Forks: 12

Infini-AI-Lab/TriForce

[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

Language: Python - Size: 71.7 MB - Last synced at: 2 days ago - Pushed at: 9 months ago - Stars: 250 - Forks: 17

THUDM/LongCite

LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA

Language: Python - Size: 15.2 MB - Last synced at: 2 days ago - Pushed at: 5 months ago - Stars: 491 - Forks: 32

metame-ai/awesome-llm-plaza

Awesome LLM Plaza: daily tracking of all sorts of awesome LLM topics, e.g. LLMs for coding, robotics, reasoning, multimodal, etc.

Size: 1.06 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 196 - Forks: 13

dvlab-research/LongLoRA

Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)

Language: Python - Size: 11.1 MB - Last synced at: 3 days ago - Pushed at: 9 months ago - Stars: 2,658 - Forks: 288

THUDM/LongWriter

[ICLR 2025] LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

Language: Python - Size: 178 KB - Last synced at: 4 days ago - Pushed at: 7 months ago - Stars: 1,650 - Forks: 162

lucidrains/infini-transformer-pytorch

Implementation of Infini-Transformer in PyTorch

Language: Python - Size: 34.2 MB - Last synced at: about 21 hours ago - Pushed at: 4 months ago - Stars: 111 - Forks: 4

lucidrains/MEGABYTE-pytorch

Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in PyTorch

Language: Python - Size: 34.5 MB - Last synced at: 4 days ago - Pushed at: 5 months ago - Stars: 643 - Forks: 55

lucidrains/recurrent-memory-transformer-pytorch

Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in PyTorch

Language: Python - Size: 34.3 MB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 407 - Forks: 17

lucidrains/ring-attention-pytorch

Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in PyTorch

Language: Python - Size: 1.01 MB - Last synced at: 3 days ago - Pushed at: 7 months ago - Stars: 513 - Forks: 31

InternLM/InternLM

Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).

Language: Python - Size: 7.12 MB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 6,898 - Forks: 483

abdelfattah-lab/xKV

xKV: Cross-Layer SVD for KV-Cache Compression

Language: Python - Size: 30.9 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 24 - Forks: 1

Linking-ai/SCOPE

SCOPE: Optimizing KV Cache Compression in Long-context Generation

Language: Jupyter Notebook - Size: 6.21 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 23 - Forks: 2

NVIDIA/kvpress

LLM KV cache compression made easy

Language: Python - Size: 5.55 MB - Last synced at: 3 days ago - Pushed at: 12 days ago - Stars: 476 - Forks: 36

bowen-upenn/PersonaMem

Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale

Language: Python - Size: 523 MB - Last synced at: 4 days ago - Pushed at: 9 days ago - Stars: 14 - Forks: 1

haoliuhl/ringattention

Large Context Attention

Language: Python - Size: 161 KB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 710 - Forks: 53

VITA-MLLM/Long-VITA

✨✨Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy

Language: Python - Size: 3.85 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 275 - Forks: 28

X-PLUG/WritingBench

WritingBench: A Comprehensive Benchmark for Generative Writing

Language: Python - Size: 18 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 73 - Forks: 8

adobe-research/NoLiMa

Official repository for "NoLiMa: Long-Context Evaluation Beyond Literal Matching"

Language: Python - Size: 53.7 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 67 - Forks: 4

nightdessert/Retrieval_Head

Open-source code for the paper "Retrieval Head Mechanistically Explains Long-Context Factuality"

Language: Python - Size: 1020 KB - Last synced at: 10 days ago - Pushed at: 10 months ago - Stars: 189 - Forks: 18

bigai-nlco/LooGLE

ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models

Language: Python - Size: 7.15 MB - Last synced at: 8 days ago - Pushed at: 7 months ago - Stars: 182 - Forks: 6

QingFei1/LongRAG

[EMNLP 2024] LongRAG: A Dual-perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering

Language: Python - Size: 2.1 MB - Last synced at: 11 days ago - Pushed at: 4 months ago - Stars: 102 - Forks: 14

CognitiveAISystems/RATE

Official implementation of Recurrent Action Transformer with Memory, an offline RL agent with memory mechanisms. https://sites.google.com/view/rate-model/

Language: Shell - Size: 58.6 MB - Last synced at: 29 days ago - Pushed at: 30 days ago - Stars: 8 - Forks: 2

OpenBMB/InfiniteBench

Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718

Language: Python - Size: 6.2 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 319 - Forks: 26

VITA-Group/Ms-PoE

"Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding" Zhenyu Zhang, Runjin Chen, Shiwei Liu, Zhewei Yao, Olatunji Ruwase, Beidi Chen, Xiaoxia Wu, Zhangyang Wang.

Language: Python - Size: 7.53 MB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 29 - Forks: 3

Glaciohound/LM-Infinite

Implementation of NAACL 2024 Outstanding Paper "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"

Language: Python - Size: 2.09 MB - Last synced at: 2 days ago - Pushed at: 2 months ago - Stars: 142 - Forks: 13

OpenGVLab/MM-NIAH

[NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of existing MLLMs to comprehend long multimodal documents.

Language: Python - Size: 2.83 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 115 - Forks: 6

THUDM/LongBench

LongBench v2 and LongBench (ACL 2024)

Language: Python - Size: 8.47 MB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 836 - Forks: 80

lucidrains/perceiver-ar-pytorch

Implementation of Perceiver AR, DeepMind's new long-context attention network based on the Perceiver architecture, in PyTorch

Language: Python - Size: 34.2 MB - Last synced at: 9 days ago - Pushed at: about 2 years ago - Stars: 87 - Forks: 4

THUDM/LongAlign

[EMNLP 2024] LongAlign: A Recipe for Long Context Alignment of LLMs

Language: Python - Size: 6.08 MB - Last synced at: 1 day ago - Pushed at: 5 months ago - Stars: 249 - Forks: 19

OpenMOSS/Thus-Spake-Long-Context-LLM

A survey of long-context LLMs from four perspectives: architecture, infrastructure, training, and evaluation

Size: 50.8 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 45 - Forks: 1

thunlp/InfLLM

The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory"

Language: Python - Size: 273 KB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 348 - Forks: 34

kingabzpro/Gemini-2.5-Pro-Coding-App

Load the project from the zip file and ask Gemini 2.5 Pro to improve it.

Language: Jupyter Notebook - Size: 16.6 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

nick7nlp/Counting-Stars

Counting-Stars (★)

Language: Jupyter Notebook - Size: 120 MB - Last synced at: 11 days ago - Pushed at: 9 months ago - Stars: 82 - Forks: 2

yangjianxin1/LongQLoRA

LongQLoRA: Extend Context Length of LLMs Efficiently

Language: Python - Size: 1.92 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 164 - Forks: 15

bigai-nlco/VideoLLaMB

Official Repository of VideoLLaMB: Long Video Understanding with Recurrent Memory Bridges

Language: Python - Size: 59.4 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 63 - Forks: 2

open-compass/Ada-LEval

The official implementation of "Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks"

Language: Python - Size: 1.72 MB - Last synced at: 24 days ago - Pushed at: about 1 year ago - Stars: 53 - Forks: 3

ZetangForward/MyRLHF (fork of OpenRLHF/OpenRLHF)

Copy of OpenRLHF

Language: Jupyter Notebook - Size: 305 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

dmis-lab/ETHIC

[NAACL 2025] ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage

Language: Python - Size: 367 KB - Last synced at: 13 days ago - Pushed at: 4 months ago - Stars: 14 - Forks: 0

4AI/RAN

RAN: Recurrent Attention Networks for Long-text Modeling | Findings of ACL 2023

Language: Python - Size: 556 KB - Last synced at: 25 days ago - Pushed at: almost 2 years ago - Stars: 22 - Forks: 3

framsouza/slack-gemini-summarizer

A solution to fetch and analyze Slack channel conversations, leveraging the Gemini 1.5 Pro API for summarization.

Language: Python - Size: 3.91 KB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

lucidrains/flash-genomics-model

My own attempt at a long context genomics model, leveraging recent advances in long context attention modeling (Flash Attention + other hierarchical methods)

Language: Python - Size: 12.7 KB - Last synced at: 18 days ago - Pushed at: almost 2 years ago - Stars: 52 - Forks: 5

rgtjf/Untie-the-Knots

Untie-the-Knots: An Efficient Data Augmentation Strategy for Long-Context Pre-Training in Language Models

Size: 12.7 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

davendw49/Awesome-Long-Context-Language-Modeling

Papers on long-context language modeling

Size: 14.6 KB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 10 - Forks: 1

dvlab-research/Q-LLM

This is the official repo of "QuickLLaMA: Query-aware Inference Acceleration for Large Language Models"

Language: Python - Size: 6.84 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 29 - Forks: 0

jeffreysijuntan/lloco

The official repo for "LLoCo: Learning Long Contexts Offline"

Language: Python - Size: 155 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 93 - Forks: 9

sayhitosandy/Mamba_SSM

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Size: 5.52 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

melvinebenezer/Liah-Lie_in_a_haystack

Needle in a haystack for LLMs

Language: Python - Size: 2.42 MB - Last synced at: 7 days ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

lucaslingle/e-lra

Streamlined variant of Long-Range Arena with pinned dependencies, automated data downloads, and deterministic shuffling.

Language: Python - Size: 186 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

nopperl/corporate_emission_reports

Language: TeX - Size: 937 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

asigalov61/Heptabit-Music-Transformer

[DEPRECATED] Very fast, large music transformer with 8k sequence length, efficient heptabit MIDI note encoding, true full MIDI instrument range, chord counters and outro tokens

Language: Python - Size: 1.3 MB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 15 - Forks: 0

Related Keywords
long-context (51), llm (21), large-language-models (10), benchmark (8), transformers (7), deep-learning (6), artificial-intelligence (5), kv-cache-compression (4), evaluation (4), llm-inference (4), lora (3), attention-mechanisms (3), attention-mechanism (3), large-language-model (3), memory (3), nlp (3), long-text (2), transformer (2), fine-tuning-llm (2), vision-language-model (2), ai (2), chatbot (2), pytorch (2), low-rank (2), fine-tuning (2), rlhf (2), language-model (2), inference (2), longtext (2), evaluation-metrics (1), longlora (1), qlora (1), video-language-pretraining (1), video-language-understanding (1), gpt4 (1), acl (1), acl2023 (1), sota-model (1), gradio (1), gemini-api (1), training-free (1), training (1), survey (1), paper-list (1), infrastructure (1), architecture (1), alignment (1), artficial-intelligence (1), multimodal-large-language-models (1), model-diagnostics (1), positional-encoding (1), music-transformer (1), music-ai (1), midi (1), heptagram (1), heptagon (1), heptabit (1), information-extraction (1), data-extraction (1), long-range-arena (1), needle-in-haystack (1), llms-benchmarking (1), ssm (1), mamba (1), generative-ai (1), finetune (1), context-compression (1), inference-acceleration (1), fast-inference (1), awesome-list (1), untie-the-knots (1), genomics (1), slack (1), genai (1), gemini-pro (1), recurrent-networks (1), recurrent-attention-networks (1), long-document-modeling (1), long-context-transformers (1), long-context-attention (1), lost-in-the-middle (1), gpt (1), flash-attention (1), chinese (1), efficient-attention (1), distributed-attention (1), recurrence (1), learned-tokenization (1), llm-application (1), awesome-robotics-llm (1), awesome-rlhf (1), awesome-multimod-llm (1), awesome-llm-security (1), awesome-llm-reasoning (1), awesome-llm-prompt (1), awesome-llm-plaza (1), awesome-llm-agents (1), awesome-code-llm (1), awesome (1), citation-generation (1)