{"id":35023738,"url":"https://github.com/avrtt/pochemuchka","last_synced_at":"2026-05-19T13:09:18.724Z","repository":{"id":286676126,"uuid":"957496735","full_name":"avrtt/pochemuchka","owner":"avrtt","description":"Automatic prompt engineering, testing \u0026 load balancing for your AI models in production","archived":false,"fork":false,"pushed_at":"2025-04-07T19:49:36.000Z","size":101,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-07T20:29:08.745Z","etag":null,"topics":["ai-engineering","cicd","llms","load-balancing","mlops","model-integration","modelops","prompt-engineering","prompt-tuning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/avrtt.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-03-30T14:19:48.000Z","updated_at":"2025-04-07T20:13:09.000Z","dependencies_parsed_at":"2025-04-07T20:40:16.019Z","dependency_job_id":null,"html_url":"https://github.com/avrtt/pochemuchka","commit_stats":null,"previous_names":["avrtt/pochemuchka"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/avrtt/pochemuchka","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/avrtt%2Fpochemuchka","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/avrtt%2Fpochemuchka/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/avrtt%2Fpochemuchka/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
repositories/avrtt%2Fpochemuchka/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/avrtt","download_url":"https://codeload.github.com/avrtt/pochemuchka/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/avrtt%2Fpochemuchka/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28073834,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-12-27T02:00:05.897Z","response_time":58,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-engineering","cicd","llms","load-balancing","mlops","model-integration","modelops","prompt-engineering","prompt-tuning"],"created_at":"2025-12-27T06:08:04.531Z","updated_at":"2025-12-27T06:08:04.810Z","avatar_url":"https://github.com/avrtt.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n\u003c!--\n\u003cp\u003e\n    \u003cspan style=\"text-align: left\"\u003e\n      ver. 
1.1.12\u0026nbsp; •\u0026nbsp; \u003cu\u003eDocumentation\u003c/u\u003e \u003cb\u003e(WIP)\u003c/b\u003e: \u003cb\u003e\u003ca href=\"https://github.com/avrtt/pochemuchka/blob/main/documentation/en.md\"\u003e🇺🇸 EN\u003c/a\u003e\u003c/b\u003e | \u003cb\u003e\u003ca href=\"https://github.com/avrtt/pochemuchka/blob/main/documentation/ru.md\"\u003e🇷🇺 RU\u003c/a\u003e\u003c/b\u003e  \n    \u003c/span\u003e\n    \u003cspan style=\"float: right\"\u003e\n        \u003cb\u003e\u003ca href=\"https://github.com/avrtt/pochemuchka/blob/main/documentation/commands.md\"\u003eCommands\u003c/a\u003e\u003c/b\u003e\u0026nbsp; •\u0026nbsp; \u003cb\u003e\u003ca href=\"https://github.com/avrtt/pochemuchka/blob/main/documentation/styles.md\"\u003eStyles\u003c/a\u003e\u003c/b\u003e\u0026nbsp; •\u0026nbsp; \u003cb\u003e\u003ca href=\"https://github.com/avrtt/pochemuchka/blob/main/documentation/conventions.md\"\u003eConventions\u003c/a\u003e\u003c/b\u003e\n    \u003c/span\u003e\n\u003c/p\u003e\n--\u003e\n\n\u003cp style=\"text-align: center\"\u003e\n  ver. 
1.1.12\u0026nbsp; •\u0026nbsp; \u003cu\u003eDocumentation\u003c/u\u003e \u003cb\u003e(WIP)\u003c/b\u003e: \u003cb\u003e\u003ca href=\"https://github.com/avrtt/pochemuchka/blob/main/documentation/en.md\"\u003e🇺🇸 EN\u003c/a\u003e\u003c/b\u003e | \u003cb\u003e\u003ca href=\"https://github.com/avrtt/pochemuchka/blob/main/documentation/ru.md\"\u003e🇷🇺 RU\u003c/a\u003e\u003c/b\u003e  \n\u003c/p\u003e\n\u003cp style=\"text-align: center\"\u003e\n    \u003cb\u003e\u003ca href=\"https://github.com/avrtt/pochemuchka/blob/main/documentation/commands.md\"\u003eCommands\u003c/a\u003e\u003c/b\u003e\u0026nbsp; •\u0026nbsp; \u003cb\u003e\u003ca href=\"https://github.com/avrtt/pochemuchka/blob/main/documentation/style.md\"\u003eStyle\u003c/a\u003e\u003c/b\u003e\u0026nbsp; •\u0026nbsp; \u003cb\u003e\u003ca href=\"https://github.com/avrtt/pochemuchka/blob/main/documentation/conventions.md\"\u003eConventions\u003c/a\u003e\u003c/b\u003e\n\u003c/p\u003e\n\n\u003cbr/\u003e\n\nThis is an all-in-one library built as part of my other SaaS project. It provides various techniques for managing, optimizing and testing prompts for LLMs in both production and research environments. With the client's permission, this demo illustrates a system designed to dynamically integrate data, monitor performance metrics such as latency and cost, and efficiently balance loads among various AI models.\n\nThe system can help to simplify the development and testing of prompt-based interactions with LLMs. By combining real-time monitoring, dynamic caching and integration across multiple models, it offers tools for understanding the capabilities of AI-driven solutions. 
You can refine your prompt design or automatically adapt learning systems to evolving contexts.\n\n\u003e [!TIP] \n\u003e Check out some simple usage examples in **[examples/getting_started.ipynb](https://github.com/avrtt/pochemuchka/blob/main/examples/getting_started.ipynb)**\n\nSome features:\n- **Dynamic prompt crafting**  \n  Adapt and update prompts on the fly by integrating live data, avoiding issues such as budget overflows.\n- **Multi-model compatibility**  \n  Easily switch between various LLM providers, distributing workload according to configurable weights.\n- **Real-time performance insights**  \n  Gain immediate visibility into metrics such as latency, token usage and overall cost.\n- **CI/CD testing**  \n  Automatically generate and execute tests during prompt calls by comparing responses with an ideal output provided by a human expert.\n- **Efficient prompt caching**   \n  Leverage a caching system with a short TTL (Time-To-Live) of five minutes to keep prompt content current while minimizing redundant data fetches.\n- **Asynchronous interaction logging**  \n  Log detailed interaction data in the background so that your application's performance remains unaffected.\n- **User feedback integration**  \n  Continuously improve prompt quality by incorporating explicit feedback and ideal answers for previous responses.\n\n## Architecture\n\nThe demo implements a smart caching mechanism that gives each prompt a limited lifespan (TTL). This includes automatic refresh (every prompt call checks for an updated version from the server, ensuring that the cached version is always fresh), local backup (if the central service is unavailable, the system falls back to a locally stored version of the prompt) and version synchronization (to keep versions consistent across local and remote environments).\n\nThe system supports two distinct methods for creating tests to ensure the quality of prompt outputs: inline and explicit. 
The inline method passes test data with an ideal response in the LLM call itself, which automatically triggers test creation. The explicit method calls a test creation method directly for a given prompt, comparing the LLM's response against a predefined ideal answer.\n\nInteractions are logged asynchronously, so logging happens in the background without impacting response times. The system automatically captures details such as response latency, token count and associated cost, and stores complete snapshots of prompts, context and responses for analysis.\n\nFeedback is integral to continuous improvement. You can attach ideal answers to previous responses, prompting the system to generate new tests and refine prompt formulations.\n\n## License\nMIT","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Favrtt%2Fpochemuchka","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Favrtt%2Fpochemuchka","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Favrtt%2Fpochemuchka/lists"}