awesome-llm-constrained-decoding
  
  
    A curated list of papers related to constrained decoding of LLMs, along with their relevant code and resources.
    https://github.com/saibo-creator/awesome-llm-constrained-decoding
  
    
- 
            Libraries
- guidance-ai/guidance - CPP
- eth-sri/lmql
- jxnl/instructor - Reject-Repeat approach to ensure constraints are met
- microsoft/aici
- noamgat/lm-format-enforcer
- mlc-ai/xgrammar
- epfl-dlab/transformers-CFG
- genlm/genlm-control
- Dan-wanna-M/formatron
- structuredllm/itergen
- eth-sri/type-constrained-code-generation - safety.
- snowkylin/circuit-transformer
- epfl-dlab/jsonschemabench
- eth-sri/constrained-diffusion - Region Infilling (CFG)
- grammarllm
- outlines-dev/outlines
 
- 
            Papers
- Earley-Driven Dynamic Pruning for Efficient Structured Decoding
- Fast Controlled Generation from Language Models with Adaptive Weighted Rejection Sampling
- Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo
- Efficient and Asymptotically Unbiased Constrained Decoding for Large Language Models
- CRANE: Reasoning with constrained LLM generation
- Type-Constrained Code Generation with Language Models
- Lost in Space: Optimizing Tokens for Grammar-Constrained Decoding
- Think Inside the JSON: Reinforcement Strategy for Strict LLM Schema Adherence
- Flexible and Efficient Grammar-Constrained Decoding
- Circuit Transformer: A Transformer That Preserves Logical Equivalence
- Generating Structured Outputs from Language Models: Benchmark and Studies
- XGrammar: Flexible and Efficient Structured Generation Engine for Large Language Models
- IterGen: Iterative Semantic-aware Structured LLM Generation with Backtracking
- Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models
- Constrained Decoding for Code Language Models via Efficient Left and Right Quotienting of Context-Sensitive Grammars
- Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents
- SGLang: Efficient Execution of Structured Language Model Programs
- Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context
- Prompt Sketching for Large Language Models
- Sequential Monte Carlo Steering of Large Language Models using Probabilistic Programs
- Don't Fine-Tune, Decode: Syntax Error-Free Tool Use via Constrained Decoding
- Amortizing intractable inference in large language models
- Terminology-Aware Translation with Constrained Decoding and Large Language Model Prompting
- KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level Hallucination Detection
- FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking
- Automata-based constraints for language model decoding
- Sketch-Guided Constrained Decoding for Boosting Blackbox Large Language Models without Logit Access
- Grammar-Aligned Decoding
- SynCode: LLM Generation with Grammar Augmentation
- Guiding LLMs The Right Way: Fast, Non-Invasive Constrained Generation
- Constrained Decoding for Cross-lingual Label Projection
- Lazy-k Decoding: Constrained Decoding for Information Extraction
- Efficient Guided Generation for Large Language Models
- Grammar Prompting for Domain-Specific Language Generation with Large Language Models
- Grammar-Constrained Decoding for Structured NLP Tasks without Finetuning
- Prompting Is Programming: A Query Language for Large Language Models
- Measuring and Mitigating Constraint Violations of In-Context Learning for Utterance-to-API Semantic Parsing
- Tractable Control for Autoregressive Language Generation
- Validating Large Language Models with ReLM
- CodePAD: Sequence-based Code Generation with Pushdown Automaton
- Controllable Text Generation with Neurally-Decomposed Oracle
- Gradient-Based Constrained Sampling from Language Models
- COLD Decoding: Energy-based Constrained Text Generation with Langevin Dynamics
- Synchromesh: Reliable code generation from pre-trained language models
- PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models
- Constrained Language Models Yield Few-Shot Semantic Parsers
- Controlled Text Generation as Continuous Optimization with Multiple Constraints
- NeuroLogic Decoding: (Un)supervised Neural Text Generation with Predicate Logic Constraints
- A General-Purpose Algorithm for Constrained Sequential Inference
- Improved Lexically Constrained Decoding for Translation and Monolingual Rewriting
- CGMH: Constrained Sentence Generation by Metropolis-Hastings Sampling
- Fast Lexically Constrained Decoding with Dynamic Beam Allocation for Neural Machine Translation
- Incorporating Discriminator in Sentence Generation: a Gibbs Sampling Method
- Guided Open Vocabulary Image Captioning with Constrained Beam Search
- Lexically Constrained Decoding for Sequence Generation Using Grid Beam Search
- Constrained Decoding of Diffusion LLMs with Context-Free Grammars
- DINGO: Constrained Inference for Diffusion LLMs
- Constrained Language Generation with Discrete Diffusion Models
- Grammar-Constrained Decoding Makes Large Language Models Better Logical Parsers
- GRAMMAR-LLM: Grammar-Constrained Natural Language Generation
- Pre3: Enabling Deterministic Pushdown Automata for Faster Structured LLM Generation
- Syntactic Control of Language Models by Posterior Inference
 
- 
            Benchmark & Datasets & Evaluation
- JSON-mode Eval Cleaned/Extended, SMILES Eval and HumanEval MRI C++
- JsonSchemaBench: Generating Structured Outputs from Language Models: Benchmark and Studies
- COLLIE: Systematic Construction of Constrained Text Generation Tasks
- JSON-mode Eval dataset
- BenchCLAMP: A Benchmark for Evaluating Language Models on Syntactic and Semantic Parsing
- Evaluating Large Language Models on Controlled Generation Tasks
- Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data?
- NLV corpus
- CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning
- Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-SQL task
 
- 
            Survey
- 
            Blog Posts
- The good, the bad, and the ugly of Gemini’s structured outputs
- Leveraging Constrained Sampling for Fill-in-the-Middle Code Completion
- Proper Well-Formedness for Finite LLM Sampling
- LLM Decoding with Regex Constraints
- Constrained Decoding is Posterior Inference
- Making Structured Generation Faster Than Unstructured
- Coding For Structured Generation with LLMs
- Beating GPT-4 with Open Source
- Prompt Efficiency - Using Structured Generation to get 8-shot performance from 1-shot.
- Structured Generation Improves LLM performance: GSM8K Benchmark
- Coalescence: making LLM inference 5x faster
- Constrained Decoding with Arbitrary Constraints is NP-hard
- LLMs are bad at returning code in JSON
- Say what I mean
- LLGuidance: Making Structured Outputs Go Brrr
- Outlines
 
- 
            Related Awesome Lists