{"id":34049646,"url":"https://github.com/genlm/llamppl","last_synced_at":"2026-04-09T00:02:23.701Z","repository":{"id":194233730,"uuid":"681813613","full_name":"genlm/llamppl","owner":"genlm","description":"Probabilistic programming with large language models","archived":false,"fork":false,"pushed_at":"2026-04-08T18:14:42.000Z","size":1175,"stargazers_count":165,"open_issues_count":12,"forks_count":27,"subscribers_count":5,"default_branch":"main","last_synced_at":"2026-04-08T20:16:10.949Z","etag":null,"topics":["huggingface-transformers","language-model","ppl","probabilistic-programming","python3"],"latest_commit_sha":null,"homepage":"https://genlm.org/llamppl/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/genlm.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2023-08-22T20:13:00.000Z","updated_at":"2026-03-22T17:48:43.000Z","dependencies_parsed_at":"2023-11-20T21:26:04.434Z","dependency_job_id":"37113593-fa4b-493e-8c59-a49b8cc00c24","html_url":"https://github.com/genlm/llamppl","commit_stats":null,"previous_names":["probcomp/hfppl","genlm/hfppl","genlm/llamppl"],"tags_count":6,"template":false,"template_full_name":null,"purl":"pkg:github/genlm/llamppl","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/genlm%2Fllamppl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/genlm%2Fllamppl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposit
ories/genlm%2Fllamppl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/genlm%2Fllamppl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/genlm","download_url":"https://codeload.github.com/genlm/llamppl/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/genlm%2Fllamppl/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31579058,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-08T14:31:17.711Z","status":"ssl_error","status_checked_at":"2026-04-08T14:31:17.202Z","response_time":54,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["huggingface-transformers","language-model","ppl","probabilistic-programming","python3"],"created_at":"2025-12-14T00:53:25.384Z","updated_at":"2026-04-09T00:02:23.694Z","avatar_url":"https://github.com/genlm.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 
LLaMPPL\n\n[![docs](https://github.com/genlm/llamppl/actions/workflows/docs.yml/badge.svg)](https://genlm.github.io/llamppl)\n[![Tests](https://github.com/genlm/llamppl/actions/workflows/tests.yml/badge.svg)](https://github.com/genlm/llamppl/actions/workflows/tests.yml)\n[![codecov](https://codecov.io/gh/genlm/llamppl/graph/badge.svg?token=pgVQBiqCuM)](https://codecov.io/gh/genlm/llamppl)\n\n\nLLaMPPL is a research prototype for language model probabilistic programming: specifying language generation tasks by writing probabilistic programs that combine calls to LLMs, symbolic program logic, and probabilistic conditioning. To solve these tasks, LLaMPPL uses a specialized sequential Monte Carlo inference algorithm. This technique, SMC steering, is described in [our recent workshop abstract](https://arxiv.org/abs/2306.03081).\n\nThis library was formerly known as `hfppl`.\n\n## Installation\n\nIf you just want to try out LLaMPPL, check out our [demo notebook on Colab](https://colab.research.google.com/drive/1uJEC-U8dcwsTWccCDGVexpgXexzZ642n?usp=sharing), which performs a simple constrained generation task using GPT-2. (Larger models may require more RAM or GPU resources than Colab's free version provides.)\n\nTo get started on your own machine, you can install this library from PyPI:\n\n```\npip install llamppl\n```\n\nFor faster inference on Apple Silicon devices, you can install with the MLX backend:\n\n```bash\npip install llamppl[mlx]\n```\n\n### Local installation\n\nFor local development, clone this repository and run `pip install -e \".[dev,examples]\"` to install `llamppl` and its development dependencies.\n\n```\ngit clone https://github.com/genlm/llamppl\ncd llamppl\npip install -e \".[dev,examples]\"\n```\n\nThen, try running an example. 
Note that this will cause the weights of a HuggingFace model to be downloaded.\n\n```\npython examples/hard_constraints.py\n```\n\nIf everything is working, you should see the model generate political news using words that are at most five letters long (e.g., \"Dr. Jill Biden may still be a year away from the White House but she is set to make her first trip to the U.N. today.\").\n\n## Modeling with LLaMPPL\n\nA LLaMPPL program is a subclass of the `llamppl.Model` class.\n\n```python\nfrom llamppl import Model, LMContext, CachedCausalLM\n\n# A LLaMPPL model subclasses the Model class\nclass MyModel(Model):\n\n    # The __init__ method is used to process arguments\n    # and initialize instance variables.\n    def __init__(self, lm, prompt, forbidden_letter):\n        super().__init__()\n\n        # A stateful context object for the LLM, initialized with the prompt\n        self.context = LMContext(lm, prompt)\n        self.eos_token = lm.tokenizer.eos_token_id\n\n        # The forbidden letter\n        self.forbidden_tokens = set(i for (i, v) in enumerate(lm.vocab)\n                                      if forbidden_letter in v)\n\n    # The step method is used to perform a single 'step' of generation.\n    # This might be a single token, a single phrase, or any other division.\n    # Here, we generate one token at a time.\n    async def step(self):\n        # Condition on the next token *not* being a forbidden token.\n        await self.observe(self.context.mask_dist(self.forbidden_tokens), False)\n\n        # Sample the next token from the LLM -- automatically extends `self.context`.\n        token = await self.sample(self.context.next_token())\n\n        # Check for EOS or end of sentence\n        if token.token_id == self.eos_token or str(token) in ['.', '!', '?']:\n            # Finish generation\n            self.finish()\n\n    # To improve performance, a hint that `self.forbidden_tokens` is immutable\n    def immutable_properties(self):\n        return 
set(['forbidden_tokens'])\n```\n\nThe Model class provides a number of useful methods for specifying a LLaMPPL program:\n\n* `self.sample(dist[, proposal])` samples from the given distribution. Providing a proposal does not modify the task description, but can improve inference. For example, a proposal could pre-emptively avoid the forbidden letter.\n* `self.condition(cond)` conditions on the given Boolean expression.\n* `self.finish()` indicates that generation is complete.\n* `self.observe(dist, obs)` performs a form of 'soft conditioning' on the given distribution. It is equivalent to (but more efficient than) sampling a value `v` from `dist` and then immediately running `condition(v == obs)`.\n\nTo run inference, we use the `smc_steer` or `smc_standard` functions:\n\n```python\nimport asyncio\nfrom llamppl import CachedCausalLM, smc_steer\n\n# Initialize the language model\nlm = CachedCausalLM.from_pretrained(\"meta-llama/Llama-2-7b-hf\")\n\n# Create a model instance\nmodel = MyModel(lm, \"The weather today is expected to be\", \"e\")\n\n# Run inference\nparticles = asyncio.run(smc_steer(model, 5, 3)) # number of particles N, and beam factor K\n```\n\nSample output:\n\n```\nsunny.\nsunny and cool.\n34° (81°F) in Chicago with winds at 5mph.\n34° (81°F) in Chicago with winds at 2-9 mph.\nhot and humid with a possibility of rain, which is not uncommon for this part of Mississippi.\n```\n\nFurther documentation can be found at https://genlm.github.io/llamppl.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgenlm%2Fllamppl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgenlm%2Fllamppl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgenlm%2Fllamppl/lists"}