awesome-llm-constrained-decoding
A curated list of papers related to constrained decoding of LLMs, along with their relevant code and resources.
https://github.com/saibo-creator/awesome-llm-constrained-decoding
Libraries
- guidance-ai/guidance
- eth-sri/lmql
- jxnl/instructor - Reject-Repeat approach to ensure constraints are met
- microsoft/aici
- noamgat/lm-format-enforcer
- mlc-ai/xgrammar
- epfl-dlab/transformers-CFG
- genlm/genlm-control
- Dan-wanna-M/formatron
- structuredllm/itergen
- eth-sri/type-constrained-code-generation
- snowkylin/circuit-transformer
- epfl-dlab/jsonschemabench
- eth-sri/constrained-diffusion - Region Infilling (CFG)
- grammarllm
Papers
- Earley-Driven Dynamic Pruning for Efficient Structured Decoding
- Fast Controlled Generation from Language Models with Adaptive Weighted Rejection Sampling
- Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo
- Efficient and Asymptotically Unbiased Constrained Decoding for Large Language Models
- CRANE: Reasoning with constrained LLM generation
- Type-Constrained Code Generation with Language Models
- Lost in Space: Optimizing Tokens for Grammar-Constrained Decoding
- Think Inside the JSON: Reinforcement Strategy for Strict LLM Schema Adherence
- Flexible and Efficient Grammar-Constrained Decoding
- Circuit Transformer: A Transformer That Preserves Logical Equivalence
- Generating Structured Outputs from Language Models: Benchmark and Studies
- XGrammar: Flexible and Efficient Structured Generation Engine for Large Language Models
- IterGen: Iterative Semantic-aware Structured LLM Generation with Backtracking
- Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models
- Constrained Decoding for Code Language Models via Efficient Left and Right Quotienting of Context-Sensitive Grammars
- Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents
- SGLang: Efficient Execution of Structured Language Model Programs
- Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context
- Prompt Sketching for Large Language Models
- Sequential Monte Carlo Steering of Large Language Models using Probabilistic Programs
- Don't Fine-Tune, Decode: Syntax Error-Free Tool Use via Constrained Decoding
- Amortizing intractable inference in large language models
- Terminology-Aware Translation with Constrained Decoding and Large Language Model Prompting
- KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level Hallucination Detection
- FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking
- Automata-based constraints for language model decoding
- Sketch-Guided Constrained Decoding for Boosting Blackbox Large Language Models without Logit Access
- Grammar-Aligned Decoding
- SynCode: LLM Generation with Grammar Augmentation
- Guiding LLMs The Right Way: Fast, Non-Invasive Constrained Generation
- Constrained Decoding for Cross-lingual Label Projection
- Lazy-k Decoding: Constrained Decoding for Information Extraction
- Efficient Guided Generation for Large Language Models
- Grammar Prompting for Domain-Specific Language Generation with Large Language Models
- Grammar-Constrained Decoding for Structured NLP Tasks without Finetuning
- Prompting Is Programming: A Query Language for Large Language Models
- Measuring and Mitigating Constraint Violations of In-Context Learning for Utterance-to-API Semantic Parsing
- Tractable Control for Autoregressive Language Generation
- Validating Large Language Models with ReLM
- CodePAD: Sequence-based Code Generation with Pushdown Automaton
- Controllable Text Generation with Neurally-Decomposed Oracle
- Gradient-Based Constrained Sampling from Language Models
- COLD Decoding: Energy-based Constrained Text Generation with Langevin Dynamics
- Synchromesh: Reliable code generation from pre-trained language models
- PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models
- Constrained Language Models Yield Few-Shot Semantic Parsers
- Controlled Text Generation as Continuous Optimization with Multiple Constraints
- NeuroLogic Decoding: (Un)supervised Neural Text Generation with Predicate Logic Constraints
- A General-Purpose Algorithm for Constrained Sequential Inference
- Improved Lexically Constrained Decoding for Translation and Monolingual Rewriting
- CGMH: Constrained Sentence Generation by Metropolis-Hastings Sampling
- Fast Lexically Constrained Decoding with Dynamic Beam Allocation for Neural Machine Translation
- Incorporating Discriminator in Sentence Generation: a Gibbs Sampling Method
- Guided Open Vocabulary Image Captioning with Constrained Beam Search
- Lexically Constrained Decoding for Sequence Generation Using Grid Beam Search
- Constrained Decoding of Diffusion LLMs with Context-Free Grammars
- DINGO: Constrained Inference for Diffusion LLMs
- Constrained Language Generation with Discrete Diffusion Models
- Grammar-Constrained Decoding Makes Large Language Models Better Logical Parsers
- GRAMMAR-LLM: Grammar-Constrained Natural Language Generation
- Pre3: Enabling Deterministic Pushdown Automata for Faster Structured LLM Generation
- Syntactic Control of Language Models by Posterior Inference
Benchmark & Datasets & Evaluation
- JSON-mode Eval Cleaned/Extended, SMILES Eval and HumanEval MRI C++
- JsonSchemaBench: Generating Structured Outputs from Language Models: Benchmark and Studies
- COLLIE: Systematic Construction of Constrained Text Generation Tasks
- JSON-mode Eval dataset
- BenchCLAMP: A Benchmark for Evaluating Language Models on Syntactic and Semantic Parsing
- Evaluating Large Language Models on Controlled Generation Tasks
- Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data?
- NLV corpus
- CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning
- Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-SQL task
Survey
Blog Posts
- The good, the bad, and the ugly of Gemini’s structured outputs
- Leveraging Constrained Sampling for Fill-in-the-Middle Code Completion
- Proper Well-Formedness for Finite LLM Sampling
- LLM Decoding with Regex Constraints
- Constrained Decoding is Posterior Inference
- Making Structured Generation Faster Than Unstructured
- Coding For Structured Generation with LLMs
- Beating GPT-4 with Open Source
- Prompt Efficiency - Using Structured Generation to get 8-shot performance from 1-shot.
- Structured Generation Improves LLM performance: GSM8K Benchmark
- Coalescence: making LLM inference 5x faster
- Constrained Decoding with Arbitrary Constraints is NP-hard
- LLMs are bad at returning code in JSON
- Say what I mean
- LLGuidance: Making Structured Outputs Go Brrr
- Outlines
Related Awesome Lists