https://github.com/saibo-creator/awesome-llm-constrained-decoding
A curated list of papers related to constrained decoding of LLM, along with their relevant code and resources.
https://github.com/saibo-creator/awesome-llm-constrained-decoding
List: awesome-llm-constrained-decoding
Last synced: 4 months ago
JSON representation
A curated list of papers related to constrained decoding of LLM, along with their relevant code and resources.
- Host: GitHub
- URL: https://github.com/saibo-creator/awesome-llm-constrained-decoding
- Owner: Saibo-creator
- License: mit
- Created: 2024-07-13T17:39:22.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-08-18T08:11:51.000Z (5 months ago)
- Last Synced: 2025-08-30T06:25:29.109Z (4 months ago)
- Size: 143 KB
- Stars: 252
- Watchers: 13
- Forks: 10
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project
- ultimate-awesome - awesome-llm-constrained-decoding - A curated list of papers related to constrained decoding of LLM, along with their relevant code and resources. (Other Lists / TeX Lists)
README
# Awesome-LLM-Constrained-Decoding
> Towards reliable, controllable and more efficient generation with Large Language Models (LLMs)
A curated list of papers related to constrained decoding of LLM, along with their relevant code and resources.
## Table of Contents
- [Awesome-LLM-Constrained-Decoding](#awesome-llm-constrained-decoding)
- [Table of Contents](#table-of-contents)
- [Libraries](#libraries)
- [Papers](#papers)
- [Benchmark \& Datasets \& Evaluation](#benchmark--datasets--evaluation)
- [Survey](#survey)
- [Blog Posts](#blog-posts)
- [Related Awesome Lists](#related-awesome-lists)
- [Disclaimer](#disclaimer)
- [Contributing](#contributing)
## Libraries
| Library | Feature | Stars |
| --------------------------------------------------------------------------- | -------------------------------------------------------------------------------------- | ------------------------------------------------------------------------ |
| [guidance-ai/guidance](https://github.com/guidance-ai/guidance) | CFG, Regex, JSON Schema, Token Forcing, compatible with Transformers, LLAMA-CPP |  |
| [outlines-dev/outlines](https://github.com/outlines-dev/outlines) | CFG, Unicode support, Hugging Face ecosystem, VLLM support |  |
| [eth-sri/lmql](https://github.com/eth-sri/lmql) | Regex support, various constraints, more powerful control flow |  |
| [jxnl/instructor](https://github.com/jxnl/instructor) | Try-Reject-Repeat approach to ensure constraints are met |  |
| [microsoft/aici](https://github.com/microsoft/aici) | A general framework of LLM controller with native support for CFG, Regex, JSON Schema |  |
| [noamgat/lm-format-enforcer](https://github.com/noamgat/lm-format-enforcer) | Regex, JSON Schema, Beam Search etc. |  |
| [mlc-ai/xgrammar](https://github.com/mlc-ai/xgrammar) | CFG, careful system optimizations |  |
| [epfl-dlab/transformers-CFG](https://github.com/epfl-dlab/transformers-CFG) | CFG (EBNF Interface), Compatible with Transformers, Easy to extend for research |  |
| [uiuc-focal-lab/syncode](https://github.com/uiuc-focal-lab/syncode) | CFG generation that supports builtin grammars like JSON, Python, Go, and more |  |
| [Dan-wanna-M/formatron](https://github.com/Dan-wanna-M/formatron) | Regex, JSON Schema, CFG, etc |  |
| [genlm/genlm-control](https://github.com/genlm/genlm-control) | Arbitrary programmable syntactic and semantic constraints, Constrained decoding as posterior inference, Sequential Monte Carlo |  |
| [structuredllm/itergen](https://github.com/structuredllm/itergen) | CFG generation and backtracking to handle semantic constraints |  |
| [eth-sri/type-constrained-code-generation](https://github.com/eth-sri/type-constrained-code-generation) | TypeScript, including type-safety. |  |
| [snowkylin/circuit-transformer](https://github.com/snowkylin/circuit-transformer) | Generating logical circuits that are strictly equivalent to given Boolean functions. |  |
| [epfl-dlab/jsonschemabench](https://github.com/epfl-dlab/jsonschemabench) | A benchmarking framework for evaluating constrained decoding engines on JSON Schema. Supports Guidance, Outlines, XGrammar, OpenAI and more. |  |
| [eth-sri/constrained-diffusion](https://github.com/eth-sri/constrained-diffusion) | Diffusion LLMs, Multi-Region Infilling (CFG) |  |
Disclaimer:
- The libraries listed above are **not** exhaustive and are subject to change.
- The features mentioned are **100% not** exhaustive and I strongly recommend checking the respective repositories for more details.
- The libraries are listed by the Github stars
- If you are the author of a library and would like to add or update the information, please open an issue or submit a pull request.
## Papers
Papers with are newly added papers (not necessarily newly published papers).
**Constrained Decoding for Autoregressive Models**
| Date | Paper | Publication |
| :-----: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------------: |
| 2025-06 | [Earley-Driven Dynamic Pruning for Efficient Structured Decoding](https://arxiv.org/abs/2506.01151) | ICML |
| 2025-05 | [Fast Controlled Generation from Language Models with Adaptive Weighted Rejection Sampling](https://arxiv.org/abs/2504.05410) | Preprint |
| 2025-05 | [Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo](https://arxiv.org/abs/2504.13139) | ICLR |
| 2025-04 | [Efficient and Asymptotically Unbiased Constrained Decoding for Large Language Models](https://arxiv.org/abs/2504.09135) | AISTATS |
| 2025-02 | [CRANE: Reasoning with constrained LLM generation](https://arxiv.org/abs/2502.09061) | ICML |
| 2025-05 | [Type-Constrained Code Generation with Language Models](https://pldi25.sigplan.org/details/pldi-2025-papers/25/Type-Constrained-Code-Generation-with-Language-Models) | PLDI |
| 2025-02 | [Lost in Space: Optimizing Tokens for Grammar-Constrained Decoding](https://arxiv.org/abs/2502.14969) | Preprint |
| 2025-02 | [Think Inside the JSON: Reinforcement Strategy for Strict LLM Schema Adherence](https://arxiv.org/pdf/2502.14905) | Preprint |
| 2025-02 | [Flexible and Efficient Grammar-Constrained Decoding](https://arxiv.org/abs/2502.05111) | Preprint |
| 2025-01 | [Circuit Transformer: A Transformer That Preserves Logical Equivalence](https://openreview.net/forum?id=kpnW12Lm9p) | ICLR |
| 2025-01 | [Generating Structured Outputs from Language Models: Benchmark and Studies](https://arxiv.org/abs/2501.10868) | Preprint |
| 2024-11 | [XGRAMMAR: FLEXIBLE AND EFFICIENT STRUCTURED GENERATION ENGINE FOR LARGE LANGUAGE MODELS](https://arxiv.org/pdf/2411.15100) | Preprint |
| 2024-10 | [IterGen: Iterative Semantic-aware Structured LLM Generation with Backtracking](https://arxiv.org/abs/2410.07295) | ICLR |
| 2024-08 | [Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models](https://arxiv.org/html/2408.02442v1) | Preprint |
| 2024-08 | [FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking](https://arxiv.org/abs/2407.13945) | Preprint |
| 2024-07 | [Automata-based constraints for language model decoding](https://arxiv.org/abs/2407.08103) | CoLM |
| 2024-06 | [Sketch-Guided Constrained Decoding for Boosting Blackbox Large Language Models without Logit Access](https://arxiv.org/abs/2401.09967) | ACL |
| 2024-05 | [Grammar-Aligned Decoding](https://arxiv.org/abs/2405.21047) | Preprint |
| 2024-03 | [SynCode: LLM Generation with Grammar Augmentation](https://arxiv.org/abs/2403.01632) | Preprint |
| 2024-03 | [Guiding LLMs The Right Way: Fast, Non-Invasive Constrained Generation](https://icml.cc/virtual/2024/poster/33037) | ICML |
| 2024-02 | [Constrained Decoding for Cross-lingual Label Projection](https://arxiv.org/abs/2402.03131) | ICLR |
| 2024-02 | [Constrained Decoding for Code Language Models via Efficient Left and Right Quotienting of Context-Sensitive Grammars](https://arxiv.org/abs/2402.17988) | Preprint |
| 2024-02 | [Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents](https://arxiv.org/abs/2402.00798) | Preprint |
| 2023-12 | [SGLang: Efficient Execution of Structured Language Model Programs](https://arxiv.org/abs/2312.07104) | Preprint |
| 2023-12 | [Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context](https://papers.nips.cc/paper_files/paper/2023/hash/662b1774ba8845fc1fa3d1fc0177ceeb-Abstract-Conference.html) | NeurIPS |
| 2023-11 | [Prompt Sketching for Large Language Models](https://arxiv.org/abs/2311.04954) | Preprint |
| 2023-11 | [Sequential Monte Carlo Steering of Large Language Models using Probabilistic Programs](https://arxiv.org/abs/2306.03081) | PADL |
| 2023-10 | [Don't Fine-Tune, Decode: Syntax Error-Free Tool Use via Constrained Decoding](https://arxiv.org/abs/2310.07075) | Preprint |
| 2023-10 | [Amortizing intractable inference in large language models](https://arxiv.org/abs/2310.04363) | ICLR |
| 2023-10 | [Terminology-Aware Translation with Constrained Decoding and Large Language Model Prompting](https://aclanthology.org/2023.wmt-1.80/) | EMNLP |
| 2023-10 | [KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level Hallucination Detection](https://aclanthology.org/2023.emnlp-main.867/) | EMNLP |
| 2023-10 | [Lazy-k Decoding: Constrained Decoding for Information Extraction](https://aclanthology.org/2023.emnlp-main.416/) | EMNLP |
| 2023-07 | [Efficient Guided Generation for Large Language Models](https://arxiv.org/abs/2307.09702) | Preprint |
| 2023-06 | [Grammar Prompting for Domain-Specific Language Generation with Large Language Models](https://proceedings.neurips.cc/paper_files/paper/2023/hash/cd40d0d65bfebb894ccc9ea822b47fa8-Abstract-Conference.html) | NeurIPS |
| 2023-06 | [Grammar-Constrained Decoding for Structured NLP Tasks without Finetuning](https://aclanthology.org/2023.emnlp-main.674/) | EMNLP |
| 2023-06 | [Prompting Is Programming: A Query Language for Large Language Models](https://dl.acm.org/doi/10.1145/3591300) | PLDI |
| 2023-05 | [Measuring and Mitigating Constraint Violations of In-Context Learning for Utterance-to-API Semantic Parsing](https://aclanthology.org/2023.findings-emnlp.478/) | EMNLP Findings |
| 2023-04 | [Tractable Control for Autoregressive Language Generation](https://arxiv.org/abs/2304.07438) | ICML |
| 2022-11 | [Validating Large Language Models with ReLM](https://proceedings.mlsys.org/paper_files/paper/2023/file/93c7d9da61ccb2a60ac047e92787c3ef-Paper-mlsys2023.pdf) | MLSys |
| 2022-11 | [CodePAD: Sequence-based Code Generation with Pushdown Automaton](https://arxiv.org/abs/2211.00818) | ISSTA |
| 2022-05 | [Controllable Text Generation with Neurally-Decomposed Oracle](https://arxiv.org/abs/2205.14219) | NeurIPS |
| 2022-05 | [Gradient-Based Constrained Sampling from Language Models](https://aclanthology.org/2022.emnlp-main.144/) | EMNLP |
| 2022-02 |[COLD Decoding: Energy-based Constrained Text Generation with Langevin Dynamics](https://arxiv.org/abs/2202.11705) | NeurIPS |
| 2022-01 | [Synchromesh: Reliable code generation from pre-trained language models](https://arxiv.org/abs/2201.11227) | ICLR |
| 2021-12 | [PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models](https://aclanthology.org/2021.emnlp-main.779/) | EMNLP |
| 2021-12 | [Constrained Language Models Yield Few-Shot Semantic Parsers](https://aclanthology.org/2021.emnlp-main.608/) | EMNLP |
| 2021-12 | [Controlled Text Generation as Continuous Optimization with Multiple Constraints](https://papers.neurips.cc/paper/2021/file/79ec2a4246feb2126ecf43c4a4418002-Paper.pdf) | NeurIPS |
| 2021-06 | [NEUROLOGIC DECODING:(Un)supervised Neural Text Generation with Predicate Logic Constraints](https://aclanthology.org/2021.naacl-main.339.pdf) | NAACL |
| 2019-05 | [A General-Purpose Algorithm for Constrained Sequential Inference](https://aclanthology.org/K19-1045/) | CoNLL |
| 2019-05 | [Improved Lexically Constrained Decoding for Translation and Monolingual Rewriting](https://aclanthology.org/N19-1090/) | NAACL |
| 2018-09 | [CGMH: Constrained Sentence Generation by Metropolis-Hastings Sampling](https://arxiv.org/abs/1811.10996) | AAAI |
| 2018-05 | [Fast Lexically Constrained Decoding with Dynamic Beam Allocation for Neural Machine Translation](https://aclanthology.org/N18-1119/) | NAACL |
| 2018-04 | [Incorporating Discriminator in Sentence Generation: a Gibbs Sampling Method](https://ojs.aaai.org/index.php/AAAI/article/view/11990) | AAAI |
| 2017-12 | [Guided Open Vocabulary Image Captioning with Constrained Beam Search](https://aclanthology.org/D17-1098/) | EMNLP |
| 2017-06 | [Lexically Constrained Decoding for Sequence Generation Using Grid Beam Search](https://aclanthology.org/P17-1141/) | ACL |
**Constrained Decoding for Diffusion Models**
| Date | Paper | Publication |
| :-----: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------------: |
| 2025-08 | [Constrained Decoding of Diffusion LLMs with Context-Free Grammars](https://arxiv.org/abs/2508.10111) | Preprint |
| 2025-05 | [DINGO: Constrained Inference for Diffusion LLMs](https://arxiv.org/abs/2505.23061) | Preprint |
| 2025-03 | [Constrained Language Generation with Discrete Diffusion Models](https://arxiv.org/abs/2503.09790) | Preprint |
## Benchmark & Datasets & Evaluation
| Date | Paper | Publication |
| :-----: | :-------------------------------------------------------------------------------------------------------------------------------------------------- | :--------------------------------------: |
| 2025-08 | [JSON-mode Eval Cleaned/Extended, SMILES Eval and HumanEval MRI C++](https://huggingface.co/collections/eth-sri/constrained-diffusion-tasks-689dc25e7ae9a5b6da001e96) | HF Hub |
| 2025-01 | [JsonSchemaBench: Generating Structured Outputs from Language Models: Benchmark and Studies](https://arxiv.org/abs/2501.10868v1) | Preprint |
| 2024-05 | [COLLIE: Systematic Construction of Constrained Text Generation Tasks](https://arxiv.org/abs/2307.08689) | ICLR |
| 2024-02 | [JSON-mode Eval dataset](https://huggingface.co/datasets/NousResearch/json-mode-eval?row=0) | HF hub |
| 2023-12 | [BenchCLAMP: A Benchmark for Evaluating Language Models on Syntactic and Semantic Parsing](https://arxiv.org/abs/2206.10668) | NeurIPS Track on Datasets and Benchmarks |
| 2023-10 | [Evaluating Large Language Models on Controlled Generation Tasks](https://arxiv.org/abs/2310.14542) | Preprint |
| 2023-09 | [Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data?](https://arxiv.org/abs/2309.08963) | Preprint |
| 2021-10 | [NLV corpus](https://nlvcorpus.github.io/) | CHI |
| 2020-12 | [CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning](https://aclanthology.org/2020.findings-emnlp.165.pdf) | EMNLP Findings |
| 2018-09 | [Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-SQL task ](https://arxiv.org/abs/1809.08887) | EMNLP |
## Survey
| Date | Paper | Publication |
| :-----: | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---------: |
| 2024-04 | ["We Need Structured Output": Towards User-centered Constraints on Large Language Model Output](https://arxiv.org/abs/2404.07362) | Preprint|
## Blog Posts
- [The good, the bad, and the ugly of Gemini’s structured outputs](https://dylancastillo.co/posts/gemini-structured-outputs.html) by Dylan Castillo
- [Leveraging Constrained Sampling for Fill-in-the-Middle Code Completion](https://blog.nielstron.de/2024/08/14/leveraging-constrained-sampling-for-fill-in-the-middle-code-completion/) by nielstron
- [Proper Well-Formedness for Finite LLM Sampling](https://blog.nielstron.de/2024/08/13/proper-well-formedness-for-finite-llm-sampling/) by nielstron
- [LLM Decoding with Regex Constraints](https://vivien000.github.io/blog/journal/llm-decoding-with-regex-constraints.html) by Vivien
- [Constrained Decoding is Posterior Inference](https://saibo-creator.github.io/post/2024_07_19_constrained_decoding_is_posterior_inference/) by Saibo-creator
- [Making Structured Generation Faster Than Unstructured](https://blog.dottxt.co/selective-multiplication.html)
- [Coding For Structured Generation with LLMs](https://blog.dottxt.co/coding-for-structured-generation.html)
- [Beating GPT-4 with Open Source](https://blog.dottxt.co/oss-v-gpt4.html)
- [Prompt Efficiency - Using Structured Generation to get 8-shot performance from 1-shot.](https://blog.dottxt.co/prompt-efficiency.html)
- [How fast can grammar-structured generation be?](https://blog.dottxt.co/how-fast-cfg.html)
- [Structured Generation Improves LLM performance: GSM8K Benchmark](https://blog.dottxt.co/performance-gsm8k.html)
- [Coalescence: making LLM inference 5x faster](https://blog.dottxt.co/coalescence.html)
- [Constrained Decoding with Arbitrary Constraints is NP-hard](https://saibo-creator.github.io/post/2024_08_25_constrained_decoding_is_np_complete/)
- [LLMs are bad at returning code in JSON](https://aider.chat/2024/08/14/code-in-json.html)
- [Say what I mean](https://github.com/appier-research/structure-gen/blob/main/updates.md)
- [LLGuidance: Making Structured Outputs Go Brrr](https://guidance-ai.github.io/llguidance/llg-go-brrr)
Many of the blogs are written by [Outlines](https://github.com/outlines-dev) team, many thanks to them for their great work! ❤️
## Related Awesome Lists
- [awesome-llm-json](https://github.com/imaurer/awesome-llm-json)
## Disclaimer
This list is not exhaustive and will be updated regularly. If you have any suggestions or want to add a paper, please feel free to open an issue or submit a pull request. We hope to include all relevant papers in this list.
## Contributing
Contributions are welcome! Feel free to submit a pull request or open an issue. Please make sure to read the [Contributing Guidelines](CONTRIBUTING.md) before contributing.