Projects in Awesome Lists tagged with long-context
A curated list of projects in awesome lists tagged with long-context.
https://github.com/internlm/internlm
Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).
chatbot chinese fine-tuning-llm flash-attention gpt large-language-model llm long-context pretrained-models rlhf
Last synced: 14 May 2025
https://github.com/InternLM/InternLM
Official release of InternLM2 7B and 20B base and chat models, with 200K context support.
chatbot chinese fine-tuning-llm flash-attention gpt large-language-model llm long-context pretrained-models rlhf
Last synced: 16 Mar 2025
https://github.com/dvlab-research/longlora
Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)
fine-tuning-llm large-language-models llm long-context lora
Last synced: 15 May 2025
https://github.com/dvlab-research/LongLoRA
Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)
fine-tuning-llm large-language-models llm long-context lora
Last synced: 16 Mar 2025
https://github.com/thudm/longwriter
[ICLR 2025] LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
fine-tuning llm long-context long-text
Last synced: 14 May 2025
https://github.com/THUDM/LongWriter
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
fine-tuning llm long-context long-text
Last synced: 08 Aug 2025
https://github.com/thudm/longbench
LongBench v2 and LongBench (ACL 2024)
benchmark llm long-context longtext
Last synced: 15 May 2025
https://github.com/THUDM/LongBench
LongBench v2 and LongBench (ACL 2024)
benchmark llm long-context longtext
Last synced: 16 Oct 2025
https://github.com/haoliuhl/ringattention
Large Context Attention
large-language-models long-context memory-efficient transformers
Last synced: 12 Jan 2026
https://github.com/lucidrains/MEGABYTE-pytorch
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in PyTorch
artificial-intelligence attention-mechanisms deep-learning learned-tokenization long-context transformers
Last synced: 09 May 2025
https://github.com/lucidrains/megabyte-pytorch
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in PyTorch
artificial-intelligence attention-mechanisms deep-learning learned-tokenization long-context transformers
Last synced: 14 May 2025
https://github.com/lucidrains/ring-attention-pytorch
Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in PyTorch
attention-mechanism distributed-attention efficient-attention long-context
Last synced: 15 May 2025
https://github.com/thudm/longcite
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA
benchmark citation-generation fine-tuning llm long-context
Last synced: 08 Apr 2025
https://github.com/THUDM/LongCite
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA
benchmark citation-generation fine-tuning llm long-context
Last synced: 16 Oct 2025
https://github.com/nvidia/kvpress
LLM KV cache compression made easy
inference kv-cache kv-cache-compression large-language-models llm long-context python pytorch transformers
Last synced: 09 Apr 2026
https://github.com/lucidrains/recurrent-memory-transformer-pytorch
Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in PyTorch
artificial-intelligence attention-mechanisms deep-learning long-context memory recurrence transformers
Last synced: 15 May 2025
https://github.com/thunlp/infllm
The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory"
large-language-models llm long-context training-free
Last synced: 06 Apr 2025
https://github.com/thunlp/InfLLM
The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory"
large-language-models llm long-context training-free
Last synced: 05 Apr 2025
https://github.com/openbmb/infinitebench
Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
benchmark large-language-models long-context
Last synced: 05 Jul 2025
https://github.com/VITA-MLLM/Long-VITA
✨✨Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy
long-context mllm vision-language-model
Last synced: 31 Mar 2025
https://github.com/thudm/longalign
[EMNLP 2024] LongAlign: A Recipe for Long Context Alignment of LLMs
alignment llm long-context longtext
Last synced: 12 Apr 2025
https://github.com/OpenBMB/InfiniteBench
Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
benchmark large-language-models long-context
Last synced: 17 Apr 2025
https://github.com/THUDM/LongAlign
[EMNLP 2024] LongAlign: A Recipe for Long Context Alignment of LLMs
alignment llm long-context longtext
Last synced: 16 Oct 2025
https://github.com/Infini-AI-Lab/TriForce
[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
acceleration efficiency inference llm llm-inference long-context speculative-decoding
Last synced: 16 May 2025
https://bigai-nlco.github.io/LooGLE/
ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models
acl2024 large-language-models llm long-context
Last synced: 29 Mar 2025
https://github.com/yangjianxin1/longqlora
LongQLoRA: Extend Context Length of LLMs Efficiently
llm long-context longlora lora qlora
Last synced: 12 Sep 2025
https://github.com/yangjianxin1/LongQLoRA
LongQLoRA: Extend Context Length of LLMs Efficiently
llm long-context longlora lora qlora
Last synced: 16 Oct 2025
https://github.com/bigai-nlco/LooGLE
ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models
acl2024 large-language-models llm long-context
Last synced: 09 May 2025
https://github.com/bytedance/shadowkv
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
cpu-offload high-throughput llm-inference long-context low-rank research sparse-attention
Last synced: 04 Apr 2025
https://github.com/nightdessert/Retrieval_Head
Open-source code for the paper "Retrieval Head Mechanistically Explains Long-Context Factuality"
large-language-models long-context
Last synced: 08 May 2025
https://github.com/x-plug/writingbench
WritingBench: A Comprehensive Benchmark for Generative Writing
ai benchmark evaluation-framework huggingface llm long-context long-text nlp text-generation writing
Last synced: 01 Sep 2025
https://github.com/Glaciohound/LM-Infinite
Implementation of paper "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"
language-model long-context model-diagnostics
Last synced: 16 May 2025
https://github.com/OpenGVLab/MM-NIAH
[NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of existing MLLMs to comprehend long multimodal documents.
benchmark long-context multimodal-large-language-models vision-language-model
Last synced: 17 Apr 2025
https://github.com/QingFei1/LongRAG
[EMNLP 2024] LongRAG: A Dual-perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering
Last synced: 07 May 2025
https://github.com/lucidrains/perceiver-ar-pytorch
Implementation of Perceiver AR, DeepMind's new long-context attention network based on the Perceiver architecture, in PyTorch
artficial-intelligence attention-mechanism deep-learning long-context transformer
Last synced: 15 Jul 2025
https://github.com/nick7nlp/Counting-Stars
Counting-Stars (★)
evaluation-metrics large-language-model long-context
Last synced: 07 May 2025
https://github.com/open-compass/ada-leval
The official implementation of "Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks"
Last synced: 14 Aug 2025
https://github.com/lucidrains/flash-genomics-model
My own attempt at a long context genomics model, leveraging recent advances in long context attention modeling (Flash Attention + other hierarchical methods)
artificial-intelligence attention-mechanisms deep-learning genomics long-context transformers
Last synced: 30 Apr 2025
https://github.com/dvlab-research/q-llm
This is the official repo of "QuickLLaMA: Query-aware Inference Acceleration for Large Language Models"
fast-inference inference-acceleration kv-cache-compression large-language-models long-context
Last synced: 03 Jul 2025
https://github.com/vita-group/ms-poe
"Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding" Zhenyu Zhang, Runjin Chen, Shiwei Liu, Zhewei Yao, Olatunji Ruwase, Beidi Chen, Xiaoxia Wu, Zhangyang Wang.
large-language-models long-context lost-in-the-middle positional-encoding
Last synced: 19 Apr 2025
https://github.com/vectifyai/condb
ConDB: The KV-Cache Native Context Database
agents ai context-database kv-cache llm long-context rag reasoning retrieval tree-search
Last synced: 04 Jun 2026
https://github.com/VITA-Group/Ms-PoE
"Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding" Zhenyu Zhang, Runjin Chen, Shiwei Liu, Zhewei Yao, Olatunji Ruwase, Beidi Chen, Xiaoxia Wu, Zhangyang Wang.
large-language-models long-context lost-in-the-middle positional-encoding
Last synced: 16 May 2025
https://github.com/4ai/ran
RAN: Recurrent Attention Networks for Long-text Modeling | Findings of ACL23
acl acl2023 long-context long-context-attention long-context-transformers long-document-modeling recurrent-attention-networks recurrent-networks
Last synced: 23 Apr 2025
https://github.com/openmoss/longllada
LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs
diffusion diffusion-language-models large-language-models length-extrapolation long-context
Last synced: 23 Jul 2025
https://github.com/asigalov61/Heptabit-Music-Transformer
[DEPRECATED] Very fast, large music transformer with 8k sequence length, efficient heptabit MIDI note encoding, true full MIDI instrument range, chord counters and outro tokens
artificial-intelligence heptabit heptagon heptagram long-context midi music-ai music-transformer sota-model
Last synced: 14 Jul 2025
https://github.com/dmis-lab/ethic
[NAACL 2025] ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage
benchmark evaluation long-context
Last synced: 12 Oct 2025
https://github.com/openmoss/reattention
[ICLR 2025] ReAttention, a training-free approach to break the maximum context length in length extrapolation
large-language-model length-extrapolation long-context triton
Last synced: 09 Oct 2025
https://github.com/jagmarques/nexusquant
Training-free KV cache compression for LLMs. 10-33x compression via E8 lattice quantization + attention-aware token eviction. One line of code.
attention compression e8-lattice inference kv-cache llama llm long-context memory-efficient mistral pytorch quantization token-eviction transformers vector-quantization
Last synced: 01 May 2026
https://github.com/reddec/dreaming-bard
LLM assistant to create long books/stories/documents
llm long-context novel-writing
Last synced: 10 Aug 2025
https://github.com/rgtjf/untie-the-knots
Untie-the-Knots: An Efficient Data Augmentation Strategy for Long-Context Pre-Training in Language Models
language-model long-context untie-the-knots
Last synced: 29 Jan 2026
https://github.com/tjamescouch/gro
Provider-agnostic LLM CLI wrapper (claude/openai/gemini)
agent-framework agent-runtime ai-agents ai-infrastructure ai-runtime autonomous-agents context-management llm llm-agents llm-runtime long-context mcp model-context-protocol multi-agent
Last synced: 01 Mar 2026
https://github.com/melvinebenezer/liah-lie_in_a_haystack
needle in a haystack for LLMs
llm llm-inference llms-benchmarking long-context needle-in-haystack
Last synced: 09 Jul 2025
https://github.com/manishklach/intent-attention-kernel
Intent-aware attention research prototype that treats long-context inference as structured semantic blocks instead of a flat token stream, proving CPU-first correctness and analytical KV/FLOP savings before GPU kernel implementation.
agentic-ai ai-infrastructure attention block-attention cost-model cuda gpu-kernels inference kernel-research kv-cache llm-inference long-context python pytorch research semantic-attention sparse-attention systems transformers triton
Last synced: 28 May 2026
https://github.com/zircote/rlm-rs-plugin
Claude Code plugin for processing documents 100x larger than context limits using the Recursive Language Model pattern. Rust-powered chunking, hybrid semantic + BM25 search, and sub-LLM orchestration.
ai-agents bm25 chunking claude-code claude-code-plugin document-processing hybrid-search llm long-context recursive-language-model rlm rust semantic-search sqlite
Last synced: 08 Apr 2026
https://github.com/neosun100/kimi-linear-vllm-docker-serve
Dockerized vLLM serving for Kimi-Linear-48B-A3B (AWQ-4bit), from 128K to 1M context.
awq docker kimi-linear llm-serving long-context vllm
Last synced: 31 Jan 2026
https://github.com/graph-com/haystackcraft
Haystack Engineering: Context Engineering for Heterogeneous and Agentic Long-Context Evaluation
agent benchmark context-engineering deep-research llm long-context rag retrieval
Last synced: 15 Apr 2026
https://github.com/stanford-oval/sliders
Repository for paper: Contexts are Never Long Enough: Structured Reasoning for Scalable Question Answering over Long Document Sets
agents document-ai long-context
Last synced: 03 Jun 2026
https://github.com/denial-web/hard-needle
Semantically hard multi-needle long-context data generator. Stop testing LLMs with random-password needles.
benchmark llm llm-evaluation long-context needle-in-a-haystack python rag synthetic-data
Last synced: 08 May 2026
https://github.com/harvey-fin/absence-bench
Code implementation for paper AbsenceBench: Language Models Can't Tell What's Missing
benchmark long-context natural-language-processing
Last synced: 29 Nov 2025
https://github.com/framsouza/slack-gemini-summarizer
A solution to fetch and analyze Slack channel conversations, leveraging the Gemini 1.5 Pro API for summarization.
ai gemini-pro genai long-context slack
Last synced: 18 Apr 2026
https://github.com/leagames0221-sys/longctx-bench-honest
Honest measurement of 1M-token long-context benchmarks (RULER + LongBench v2 + NIAH) on a local Qwen2.5-7B-1M vs. the GitHub Models cloud. No credit card needed; drift-checked and reproducible.
benchmark bitsandbytes consumer-laptop github-models llm long-context niah portfolio qwen qwen2-5 transformers vllm
Last synced: 02 Jun 2026