{"id":20207005,"url":"https://github.com/firattamur/llmdantic","last_synced_at":"2025-04-10T12:33:17.915Z","repository":{"id":228248241,"uuid":"773505610","full_name":"firattamur/llmdantic","owner":"firattamur","description":"Structured Output Is All You Need!","archived":false,"fork":false,"pushed_at":"2024-03-19T21:47:44.000Z","size":1892,"stargazers_count":54,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-24T11:13:17.247Z","etag":null,"topics":["langchain","langchain-python","llm","llms","pydantic","pydantic-v2"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/firattamur.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-03-17T20:51:25.000Z","updated_at":"2025-03-10T22:38:00.000Z","dependencies_parsed_at":null,"dependency_job_id":"984af268-1a4f-435b-9b62-0f68b4a6f1e7","html_url":"https://github.com/firattamur/llmdantic","commit_stats":null,"previous_names":["firattamur/llmdantic"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/firattamur%2Fllmdantic","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/firattamur%2Fllmdantic/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/firattamur%2Fllmdantic/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/firattamur%2Fllmdantic/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
owners/firattamur","download_url":"https://codeload.github.com/firattamur/llmdantic/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248217131,"owners_count":21066633,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["langchain","langchain-python","llm","llms","pydantic","pydantic-v2"],"created_at":"2024-11-14T05:27:05.984Z","updated_at":"2025-04-10T12:33:17.901Z","avatar_url":"https://github.com/firattamur.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n \u003cpicture\u003e\n   \u003cimg src=\"https://raw.githubusercontent.com/firattamur/llmdantic/main/.github/assets/llmdantic.png\" alt=\"image\" width=\"300\"\u003e\n \u003c/picture\u003e  \n\u003c/p\u003e\n\n\u003ch3 style=\"font-size: 5em\" align=\"center\"\u003e\n   Structured Output Is All You Need!  \n\u003c/h3\u003e\n\n\u003cbr\u003e\n\nLLMdantic is a powerful and efficient Python library that simplifies the integration of Large Language Models (LLMs) into your projects. 
Built on top of the incredible [LangChain](https://github.com/hwchase17/langchain) package and leveraging the power of [Pydantic](https://github.com/pydantic/pydantic) models, LLMdantic provides a seamless and structured approach to working with LLMs.\n\n## Features 🚀\n\n- 🌐 Wide range of LLM support through LangChain integrations\n- 🛡️ Ensures data integrity with Pydantic models for input and output validation\n- 🧩 Modular and extensible design for easy customization\n- 💰 Cost tracking and optimization for OpenAI models\n- 🚀 Efficient batch processing for handling multiple data points\n- 🔄 Robust retry mechanism for a smooth, uninterrupted experience\n\n## Getting Started 🌟\n\n### Requirements\n\nBefore using LLMdantic, make sure you have set the required API keys for the LLMs you plan to use. For example, if you're using OpenAI's models, set the `OPENAI_API_KEY` environment variable:\n\n```bash\nexport OPENAI_API_KEY=\"your-api-key\"\n```\n\nIf you're using other LLMs, follow the instructions provided by the respective providers in LangChain's documentation.\n\n### Installation\n\n```bash\npip install llmdantic\n```\n\n### Usage\n\n#### 1. Define input and output schemas using Pydantic:\n\n- Use Pydantic to define input and output models with custom validation rules.\n\n\u003e [!IMPORTANT]\n\u003e\n\u003e Add docstrings to validation rules to provide prompts for the LLM. This will help the LLM understand the validation rules and provide better results.\n\n\n```python\nfrom pydantic import BaseModel, field_validator\n\nclass SummarizeInput(BaseModel):\n    text: str\n\nclass SummarizeOutput(BaseModel):\n    summary: str\n\n    @field_validator(\"summary\")\n    def summary_must_not_be_empty(cls, v) -\u003e str:\n        \"\"\"Summary cannot be empty\"\"\"  # Add docstring that explains the validation rule. 
This will be used as a prompt for the LLM.\n        if not v.strip():\n            raise ValueError(\"Summary cannot be empty\")\n        return v\n\n    @field_validator(\"summary\")\n    def summary_must_be_short(cls, v) -\u003e str:\n        \"\"\"Summary must be less than 100 words\"\"\"  # Add docstring that explains the validation rule. This will be used as a prompt for the LLM.\n        if len(v.split()) \u003e= 100:\n            raise ValueError(\"Summary must be less than 100 words\")\n        return v\n```\n\n#### 2. Create an LLMdantic client:\n\n- Provide input and output models, objective, and configuration.\n\n\u003e [!TIP]\n\u003e\n\u003e The `objective` is a prompt that will be used to generate the actual prompt sent to the LLM. It should be a high-level description of the task you want the LLM to perform.\n\u003e\n\u003e The `inp_schema` and `out_schema` are the input and output models you defined in the previous step.\n\u003e \n\u003e The `retries` parameter is the number of times LLMdantic will retry the request in case of failure.\n\n```python\nfrom llmdantic import LLMdantic, LLMdanticConfig\nfrom langchain_openai import ChatOpenAI\n\nllm = ChatOpenAI()\n\nconfig: LLMdanticConfig = LLMdanticConfig(\n    objective=\"Summarize the text\",\n    inp_schema=SummarizeInput,\n    out_schema=SummarizeOutput,\n    retries=3,\n)\n\nllmdantic = LLMdantic(llm=llm, config=config)\n```\n\nHere's the prompt template generated based on the input and output models:\n\n```text\nObjective: Summarize the text\n\nInput 'SummarizeInput': \n{input}\n\nOutput 'SummarizeOutput''s fields MUST FOLLOW the RULES:\nSummarizeOutput.summary:\n• SUMMARY CANNOT BE EMPTY\n• SUMMARY MUST BE LESS THAN 100 WORDS\n\n{format_instructions}\n```\n\n#### 3. 
Generate output using LLMdantic:\n\n\u003e [!TIP]\n\u003e\n\u003e The `invoke` method is used for single requests, while the `batch` method is used for batch processing.\n\u003e\n\u003e The `invoke` method returns an instance of `LLMdanticResult`, which contains the generated text, the parsed output, and other useful information such as cost and usage statistics (for example, the number of input and output tokens). Check out the [LLMdanticResult](#LLMdanticResult) model for more details.\n\u003e\n\n```python\nfrom typing import Optional\n\nfrom llmdantic import LLMdanticResult\n\ndata = SummarizeInput(text=\"A long article about natural language processing...\")\nresult: LLMdanticResult = llmdantic.invoke(data)\n\n# `result.output` is Optional, so check it before use\noutput: Optional[SummarizeOutput] = result.output\n\nif output:\n    print(output.summary)\n```\n\nHere's the actual prompt sent to the LLM based on the input data:\n\n```text\nObjective: Summarize the text\n\nInput 'SummarizeInput': \n{'text': 'A long article about natural language processing...'}\n\nOutput 'SummarizeOutput''s fields MUST FOLLOW the RULES:\nSummarizeOutput.summary:\n• SUMMARY CANNOT BE EMPTY\n• SUMMARY MUST BE LESS THAN 100 WORDS\n\nThe output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {\"properties\": {\"foo\": {\"title\": \"Foo\", \"description\": \"a list of strings\", \"type\": \"array\", \"items\": {\"type\": \"string\"}}}, \"required\": [\"foo\"]}\nthe object {\"foo\": [\"bar\", \"baz\"]} is a well-formatted instance of the schema. 
The object {\"properties\": {\"foo\": [\"bar\", \"baz\"]}} is not well-formatted.\n\nHere is the output schema:\n{\"properties\": {\"summary\": {\"title\": \"Summary\", \"type\": \"string\"}}, \"required\": [\"summary\"]}\n```\n\n- For batch processing, pass a list of input data.\n\n\u003e [!IMPORTANT]\n\u003e\n\u003e The `batch` method returns a list of `LLMdanticResult` instances, each containing the generated text, the parsed output, and other useful information such as cost and usage statistics (for example, the number of input and output tokens). Check out the [LLMdanticResult](#LLMdanticResult) model for more details.\n\u003e\n\u003e The `concurrency` parameter sets the number of concurrent requests. Check the usage limits of your LLM provider before setting this value.\n\u003e\n\n```python\nfrom typing import List\n\nfrom llmdantic import LLMdanticResult\n\ndata: List[SummarizeInput] = [\n    SummarizeInput(text=\"A long article about natural language processing...\"),\n    SummarizeInput(text=\"A long article about computer vision...\"),\n]\nresults: List[LLMdanticResult] = llmdantic.batch(data, concurrency=2)\n\nfor result in results:\n    if result.output:\n        print(result.output.summary)\n```\n\n#### 4. Monitor usage and costs:\n\n\u003e [!IMPORTANT]\n\u003e\n\u003e The cost tracking feature is currently available for OpenAI models only.\n\u003e\n\u003e The `usage` attribute returns an instance of `LLMdanticUsage`, which contains the number of input and output tokens, successful requests, cost, and successful outputs. Check out the [LLMdanticUsage](#LLMdanticUsage) model for more details.\n\u003e\n\u003e Note that usage is tracked for the entire lifetime of the `LLMdantic` instance. 
\n\n- Use the `cost` attribute of `LLMdanticResult` to track the cost of a single request (currently available for OpenAI models).\n\n- Use the `usage` attribute of the `LLMdantic` instance to track overall usage stats.\n\n```python\nfrom llmdantic import LLMdanticResult\n\ndata: SummarizeInput = SummarizeInput(text=\"A long article about natural language processing...\")\nresult: LLMdanticResult = llmdantic.invoke(data)\n\nif result.output:\n    print(result.output.summary)\n\n# Track the cost of the request (OpenAI models only)\nprint(f\"Cost: {result.cost}\")\n\n# Track the overall usage stats of this LLMdantic instance\nprint(f\"Overall Usage: {llmdantic.usage}\")\n```\n\n```bash\nCost: 0.0003665\nOverall Usage: LLMdanticUsage(\n  inp_tokens=219,\n  out_tokens=19,\n  total_tokens=238,\n  successful_requests=1,\n  cost=0.000367,\n  successful_outputs=1\n)\n```\n\n## Advanced Usage 🛠\n\n`LLMdantic` is built on top of the langchain package, which provides a modular and extensible framework for working with LLMs. You can easily switch between different LLMs and customize your experience.\n\n### Switching LLMs\n\n\u003e [!IMPORTANT]\n\u003e\n\u003e Make sure to set the required API keys for the new LLM you plan to use.\n\u003e\n\u003e The `llm` parameter of the `LLMdantic` class should be an instance of `BaseLanguageModel` from the langchain package.\n\u003e \n\n\u003e [!TIP]\n\u003e\n\u003e You can use the `langchain_community` package to access a wide range of LLMs from different providers.\n\u003e\n\u003e You may need to provide `model_name`, `api_key`, and other parameters based on the LLM you want to use. 
Check out the documentation of the respective LLM provider for more details.\n\u003e \n\n\n```python\nfrom llmdantic import LLMdantic, LLMdanticConfig\nfrom langchain_community.llms import Ollama\nfrom langchain_core.language_models import BaseLanguageModel\n\n# Ollama talks to a locally running Ollama server\nllm: BaseLanguageModel = Ollama()\n\nconfig: LLMdanticConfig = LLMdanticConfig(\n    objective=\"Summarize the text\",\n    inp_schema=SummarizeInput,\n    out_schema=SummarizeOutput,\n    retries=3,\n)\n\nllmdantic = LLMdantic(\n    llm=llm,\n    config=config\n)\n```\n\n## Contributing 🤝\n\nContributions are welcome! Whether you're fixing bugs, adding new features, or improving documentation, your help makes\n**LLMdantic** better for everyone. Feel free to open an issue or submit a pull request.\n\n## License 📄\n\n**LLMdantic** is released under the [MIT License](LICENSE). Feel free to use it, contribute, and spread the word!\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffirattamur%2Fllmdantic","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffirattamur%2Fllmdantic","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffirattamur%2Fllmdantic/lists"}