{"id":13644561,"url":"https://github.com/srush/minichain","last_synced_at":"2025-05-15T11:05:58.459Z","repository":{"id":66268192,"uuid":"600114716","full_name":"srush/MiniChain","owner":"srush","description":"A tiny library for coding with large language models.","archived":false,"fork":false,"pushed_at":"2024-07-10T14:59:50.000Z","size":55904,"stargazers_count":1231,"open_issues_count":11,"forks_count":75,"subscribers_count":14,"default_branch":"main","last_synced_at":"2025-05-12T20:24:58.871Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://srush-minichain.hf.space/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/srush.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-02-10T16:07:11.000Z","updated_at":"2025-05-10T05:22:35.000Z","dependencies_parsed_at":null,"dependency_job_id":"e440a1fb-8c8b-42fe-bf67-177330c24f9c","html_url":"https://github.com/srush/MiniChain","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srush%2FMiniChain","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srush%2FMiniChain/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srush%2FMiniChain/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srush%2FMiniChain/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/srush","download_url":"https://codeload.github.com/srush/MiniChain/tar.
gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254328385,"owners_count":22052632,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-02T01:02:07.595Z","updated_at":"2025-05-15T11:05:58.435Z","avatar_url":"https://github.com/srush.png","language":"Python","funding_links":[],"categories":["NLP"],"sub_categories":[],"readme":"\u003cimg src=\"https://user-images.githubusercontent.com/35882/227030644-f70e55e8-68a3-48d3-afa3-54c4de8fc210.png\" width=\"100%\"\u003e\n\nA tiny library for coding with **large** language models. 
Check out the [MiniChain Zoo](https://srush-minichain.hf.space/) to get a sense of how it works.\n\n## Coding\n\n* Code ([math_demo.py](https://github.com/srush/MiniChain/blob/main/examples/math_demo.py)): Annotate Python functions that call language models.\n\n```python\n@prompt(OpenAI(), template_file=\"math.pmpt.tpl\")\ndef math_prompt(model, question):\n    \"Prompt to call GPT with a Jinja template\"\n    return model(dict(question=question))\n\n@prompt(Python(), template=\"import math\\n{{code}}\")\ndef python(model, code):\n    \"Prompt to call Python interpreter\"\n    code = \"\\n\".join(code.strip().split(\"\\n\")[1:-1])\n    return model(dict(code=code))\n\ndef math_demo(question):\n    \"Chain them together\"\n    return python(math_prompt(question))\n```\n\n* Chains ([Space](https://srush-minichain.hf.space/)): MiniChain builds a graph (think like PyTorch) of all the calls you make for debugging and error handling.\n\u003cimg src=\"https://user-images.githubusercontent.com/35882/226965531-78df7927-988d-45a7-9faa-077359876730.png\" width=\"50%\"\u003e\n\n\n```python\nshow(math_demo,\n     examples=[\"What is the sum of the powers of 3 (3^i) that are smaller than 100?\",\n               \"What is the sum of the 10 first positive integers?\"],\n     subprompts=[math_prompt, python],\n     out_type=\"markdown\").queue().launch()\n```\n\n\n* Template ([math.pmpt.tpl](https://github.com/srush/MiniChain/blob/main/examples/math.pmpt.tpl)): Prompts are separated from code.\n\n```\n...\nQuestion:\nA robe takes 2 bolts of blue fiber and half that much white fiber. 
How many bolts in total does it take?\nCode:\n2 + 2/2\n\nQuestion:\n{{question}}\nCode:\n```\n\n* Installation\n\n```bash\npip install minichain\nexport OPENAI_API_KEY=\"sk-***\"\n```\n\n## Examples\n\nThis library allows us to implement several popular approaches in a few lines of code.\n\n* [Retrieval-Augmented QA](https://srush.github.io/MiniChain/examples/qa/)\n* [Chat with memory](https://srush.github.io/MiniChain/examples/chatgpt/)\n* [Information Extraction](https://srush.github.io/MiniChain/examples/ner/)\n* [Interleaved Code (PAL)](https://srush.github.io/MiniChain/examples/pal/) - [(Gao et al 2022)](https://arxiv.org/pdf/2211.10435.pdf)\n* [Search Augmentation (Self-Ask)](https://srush.github.io/MiniChain/examples/selfask/) - [(Press et al 2022)](https://ofir.io/self-ask.pdf)\n* [Chain-of-Thought](https://srush.github.io/MiniChain/examples/bash/) - [(Wei et al 2022)](https://arxiv.org/abs/2201.11903)\n\nIt currently supports the following backends.\n\n* OpenAI (Completions / Embeddings)\n* Hugging Face 🤗\n* Google Search\n* Python\n* Manifest-ML (AI21, Cohere, Together)\n* Bash\n\n## Why Mini-Chain?\n\nThere are several very popular libraries for prompt chaining,\nnotably: [LangChain](https://langchain.readthedocs.io/en/latest/),\n[Promptify](https://github.com/promptslab/Promptify), and\n[GPTIndex](https://gpt-index.readthedocs.io/en/latest/reference/prompts.html).\nThese libraries are useful, but they are extremely large and\ncomplex. MiniChain aims to implement the core prompt chaining\nfunctionality in a tiny, digestible library.\n\n\n## Tutorial\n\nMiniChain is based on annotating functions as prompts.\n\n![image](https://user-images.githubusercontent.com/35882/221280012-d58c186d-4da2-4cb6-96af-4c4d9069943f.png)\n\n\n```python\n@prompt(OpenAI())\ndef color_prompt(model, input):\n    return model(f\"Answer 'Yes' if this is a color, {input}. 
Answer:\")\n```\n\nPrompt functions act like Python functions, except they are lazy: to access the result you need to call `run()`.\n\n```python\nif color_prompt(\"blue\").run() == \"Yes\":\n    print(\"It's a color\")\n```\nAlternatively, you can chain prompts together. Prompts are lazy, so if you want to manipulate their results you need to add `@transform()` to your function. For example:\n\n```python\n@transform()\ndef said_yes(input):\n    return input == \"Yes\"\n```\n\n![image](https://user-images.githubusercontent.com/35882/221281771-3770be96-02ce-4866-a6f8-c458c9a11c6f.png)\n\n```python\n@prompt(OpenAI())\ndef adjective_prompt(model, input):\n    return model(f\"Give an adjective to describe {input}. Answer:\")\n```\n\n\n```python\nadjective = adjective_prompt(\"rainbow\")\nif said_yes(color_prompt(adjective)).run():\n    print(\"It's a color\")\n```\n\n\nWe also include an argument `template_file`, which assumes the model uses a template written in the\n[Jinja](https://jinja.palletsprojects.com/en/3.1.x/templates/) language.\nThis allows us to separate prompt text from the Python code.\n\n```python\n@prompt(OpenAI(), template_file=\"math.pmpt.tpl\")\ndef math_prompt(model, question):\n    return model(dict(question=question))\n```\n\n### Visualization\n\nMiniChain has a built-in prompt visualization system using `Gradio`.\nIf you construct a function that calls a prompt chain, you can visualize it\nby calling `show` and `launch`. This can be done directly in a notebook as well.\n\n```python\nshow(math_demo,\n     examples=[\"What is the sum of the powers of 3 (3^i) that are smaller than 100?\",\n              \"What is the sum of the 10 first positive integers?\"],\n     subprompts=[math_prompt, python],\n     out_type=\"markdown\").queue().launch()\n```\n\n\n### Memory\n\nMiniChain does not build in an explicit stateful memory class. 
We recommend implementing it as a queue.\n\n![image](https://user-images.githubusercontent.com/35882/221622653-7b13783e-0439-4d59-8f57-b98b82ab83c0.png)\n\nHere is a class you might find useful to keep track of responses.\n\n```python\nfrom dataclasses import dataclass\nfrom typing import List, Tuple\n\n@dataclass\nclass State:\n    memory: List[Tuple[str, str]]\n    human_input: str = \"\"\n\n    def push(self, response: str) -\u003e \"State\":\n        # Drop the oldest entry once the queue is full\n        memory = self.memory if len(self.memory) \u003c MEMORY_LIMIT else self.memory[1:]\n        return State(memory + [(self.human_input, response)])\n```\n\nSee the full Chat example.\nIt keeps track of the last two responses that it has seen.\n\n### Tools and agents\n\nMiniChain does not provide `agents` or `tools`. If you want that functionality, you can use the `tool_num` argument of `model`, which lets you select among multiple backends. It's easy to add new backends of your own (see the GradioExample).\n\n```python\n@prompt([Python(), Bash()])\ndef math_prompt(model, input, lang):\n    return model(input, tool_num=0 if lang == \"python\" else 1)\n```\n\n### Documents and Embeddings\n\nMiniChain does not manage documents and embeddings. 
We recommend using\nthe [Hugging Face Datasets](https://huggingface.co/docs/datasets/index) library with\nbuilt-in FAISS indexing.\n\n![image](https://user-images.githubusercontent.com/35882/221387303-e3dd8456-a0f0-4a70-a1bb-657fe2240862.png)\n\n\nHere is the implementation.\n\n```python\nimport datasets\nimport numpy as np\n\n# Load and index a dataset\nolympics = datasets.load_from_disk(\"olympics.data\")\nolympics.add_faiss_index(\"embeddings\")\n\n@prompt(OpenAIEmbed())\ndef get_neighbors(model, inp, k):\n    embedding = model(inp)\n    res = olympics.get_nearest_examples(\"embeddings\", np.array(embedding), k)\n    return res.examples[\"content\"]\n```\n\nThis creates a K-nearest neighbors (KNN) prompt that looks up the\n3 closest documents based on embeddings of the question asked.\nSee the full [Retrieval-Augmented QA](https://srush.github.io/MiniChain/examples/qa/)\nexample.\n\n\nWe recommend creating these embeddings offline using the batch map functionality of the\ndatasets library.\n\n```python\ndef embed(x):\n    emb = openai.Embedding.create(input=x[\"content\"], engine=EMBEDDING_MODEL)\n    return {\"embeddings\": [np.array(emb['data'][i]['embedding'])\n                           for i in range(len(emb[\"data\"]))]}\nx = dataset.map(embed, batch_size=BATCH_SIZE, batched=True)\nx.save_to_disk(\"olympics.data\")\n```\n\nThere are other ways to do this, such as [sqlite-vss](https://github.com/asg017/sqlite-vss)\nor [Weaviate](https://weaviate.io/).\n\n\n### Typed Prompts\n\nMiniChain can automatically generate a prompt header for you that aims to ensure the\noutput follows a given typed specification. 
For example, if you run the following code,\nMiniChain will produce a prompt that returns a list of `Player` objects.\n\n```python\nfrom dataclasses import dataclass\nfrom enum import Enum\nfrom typing import List\n\nclass StatType(Enum):\n    POINTS = 1\n    REBOUNDS = 2\n    ASSISTS = 3\n\n@dataclass\nclass Stat:\n    value: int\n    stat: StatType\n\n@dataclass\nclass Player:\n    player: str\n    stats: List[Stat]\n\n\n@prompt(OpenAI(), template_file=\"stats.pmpt.tpl\", parser=\"json\")\ndef stats(model, passage):\n    out = model(dict(passage=passage, typ=type_to_prompt(Player)))\n    return [Player(**j) for j in out]\n```\n\nSpecifically, it will provide your template with a string `typ` that you can use. The string has the following form (shown here for a type with `color`, `object`, and `explanation` fields):\n\n\n```\nYou are a highly intelligent and accurate information extraction system. You take passage as input and your task is to find parts of the passage to answer questions.\n\nYou need to output a list of JSON encoded values\n\nYou need to classify in to the following types for key: \"color\":\n\nRED\nGREEN\nBLUE\n\n\nOnly select from the above list, or \"Other\".\n\n\nYou need to classify in to the following types for key: \"object\":\n\nString\n\n\n\nYou need to classify in to the following types for key: \"explanation\":\n\nString\n\n[{ \"color\" : \"color\" ,  \"object\" : \"object\" ,  \"explanation\" : \"explanation\"}, ...]\n\nMake sure every output is exactly seen in the document. Find as many as you can.\n```\n\nThis will then be converted to an object automatically for you.\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsrush%2Fminichain","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsrush%2Fminichain","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsrush%2Fminichain/lists"}