{"id":13644968,"url":"https://github.com/stoerr/aigenpipeline","last_synced_at":"2026-01-26T22:40:09.346Z","repository":{"id":223370390,"uuid":"758867763","full_name":"stoerr/AIGenPipeline","owner":"stoerr","description":"AI based code generation pipeline: command line tool and framework for systematic code generation using AI in a build process","archived":false,"fork":false,"pushed_at":"2025-04-01T07:44:49.000Z","size":5211,"stargazers_count":4,"open_issues_count":1,"forks_count":1,"subscribers_count":2,"default_branch":"develop","last_synced_at":"2025-04-19T09:06:19.365Z","etag":null,"topics":["ai","chatgpt","code-generation","code-generator","llm","openai","software-development"],"latest_commit_sha":null,"homepage":"https://aigenpipeline.stoerr.net/","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/stoerr.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-02-17T10:12:15.000Z","updated_at":"2025-03-01T16:29:06.000Z","dependencies_parsed_at":"2024-05-15T13:16:13.607Z","dependency_job_id":"37d7c33b-6993-4f82-a84e-1abef5d6bbb9","html_url":"https://github.com/stoerr/AIGenPipeline","commit_stats":null,"previous_names":["stoerr/aigenpipeline"],"tags_count":8,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stoerr%2FAIGenPipeline","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stoerr%2FAIGenPipeline/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stoerr%2FAIGenPipeline/r
eleases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stoerr%2FAIGenPipeline/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/stoerr","download_url":"https://codeload.github.com/stoerr/AIGenPipeline/tar.gz/refs/heads/develop","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250041034,"owners_count":21365205,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","chatgpt","code-generation","code-generator","llm","openai","software-development"],"created_at":"2024-08-02T01:02:22.118Z","updated_at":"2026-01-26T22:40:09.220Z","avatar_url":"https://github.com/stoerr.png","language":"Java","funding_links":[],"categories":["NLP"],"sub_categories":[],"readme":"# AI based code generation pipeline\n\n\u003cstrong\u003eA command line tool and framework for systematic code generation using AI\u003c/strong\u003e\n\n\u003e In silence, code weaves,\u003cbr/\u003e\n\u003e Through prompts, AI breathes life anew,\u003cbr/\u003e\n\u003e Scripts bloom, knowledge leaves.\n\u003e\n\u003e Git guards every step,\u003cbr/\u003e\n\u003e In the dance of creation,\u003cbr/\u003e\n\u003e Change blooms, watched and kept.\u003cbr/\u003e\n\u003e -- ChatGPT\n\n## Basic idea\n\nThis is a command line tool to generate files using an AI - either ChatGPT or a model with a similar chat completion\ninterface. It can be used to generate code, documentation, or other text files. 
Each run of the command line tool can\ntake several prompt files with instructions on what to generate as arguments, but can also take other source files to be\nprocessed as further input. The output is written to a text file.\n\nIt can be used to solve complex tasks with several steps by chaining several runs of the tool -\ne.g. create an OpenAPI document from a specification, then generate an interface from\nthat, a test and an implementation class. Of course, manual inspection and editing of the generated files will usually\nbe necessary.\n\nI suggest inspecting the intermediate and final results and committing them to a version control system like Git.\nThat ensures manual checks when they are regenerated, and minimizes regeneration.\n\n## Some example usages\n\n1. **Basic Generation**: Generating an OpenAPI document interface, tests, and implementation class could be achieved by\n   chaining runs of `AIGenPipeline`, each with the appropriate prompt and input files. For example:\n\n   ```shell\n   aigenpipeline -p openapi_prompt.txt -o generated_openapi.yaml\n   aigenpipeline -p interface_prompt.txt -i generated_openapi.yaml -o generated_interface.java\n   aigenpipeline -p test_prompt.txt -i generated_interface.java -o generated_test.java\n   ```\n\n   By the way: if you want to change an existing output file, you can give it as an additional input file to the AI. \n   It will be instructed to check and update the file according to the new input files, minimizing the changes.\n\n2. **Explanation / Query**: After generating an output, if there are questions or the need for clarification on how to\n   improve it:\n\n   ```shell\n   aigenpipeline -p interface_prompt.txt -i generated_openapi.yaml -o generated_interface.java --explain \\\n        \"Why wasn't a @GET annotation used in method foo? How would I have to change the prompt to make sure it's \n   used?\"\n   ```\n\n3. 
**Force Regeneration**: Normally a versioning mechanism (see below) ensures the result is not regenerated unless the\n   input changes. The `-f` flag disables this checking:\n\n   ```shell\n   aigenpipeline -f -o specs/openapi.yaml api_interface_prompt.txt src/main/java/foo/MyInterface.java\n   ```\n\n   Alternatively, the output files could just be removed before running the tool, or the version comments in suitable\n   input files or prompt file(s) could be changed.\n\n4. **Generate parts of a file**: If you want to combine manually written and ai generated parts in one file, you can use\n   the `-wp \u003cmarker\u003e` option to replace a part of the output file. The marker should occur in exactly two lines of the\n   already existing output file - the lines between them are replaced by the AI generated text. The first line must \n   also contain the version comment (see below).\n\n   ```shell\n   aigenpipeline -wp generatedtable -o outputfile.md -p maketablefromjson.txt input.json\n   ```\n  \n   could e.g. replace the part between the lines `\u003c!-- start generatedtable AIGenVersion(1.0) --\u003e` and `\u003c!-- end \n   generatedtable --\u003e` in:\n\n   ```\n   This is the hand generated part\n   \u003c!-- start generatedtable AIGenVersion(1.0) --\u003e\n   | Column1 | Column2 |\n   |---------|---------|\n   | value1  | value2  |\n   \u003c!-- end generatedtable --\u003e\n   Here can be more handwritten stuff that is untouched.\n   ```\n\n## Caching and versioning\n\nSince generation takes time, costs a little and the results often have to be manually checked, the tool takes precaution\nnot to regenerate the output files unless the input changes.\n\nIn the simplest case it can just do nothing when the output file is already there. If input files are changed,\nthe output file would have to be removed to enforce regeneration.\n\nAnother work mode is checking whether the output file is newer than the input files (e.g. 
Makefile style usage).\nBut this does not work sensibly if the output files are checked into a version control\nsystem like Git, and would also regenerate files in case of minor changes in the input files.\n\nThe main suggested work mode (which is also the default) is to provide the input and prompt files with version comments\nthat declare the version of the input file, and only generate the output file if it is not present, or regenerate the\noutput file if the versions have changed. A version comment in the output file will be generated, which declares a\nversion for the output file (a hash of its content) and the versions of the used input files. The version comment\nin input and prompt files can be manually changed to force regeneration of the output file.\n\n### Structure of version comments\n\nA version comment can e.g. look like this:\n\n    /* AIGenVersion(ourversion, inputfile1@version1, inputfile2@version2, ...) */\n\nTo declare a version for a manually created prompt / source file you can just put in a version comment like\n`AIGenVersion(1.0)` and change the version number each time you want to force regeneration.\n\nThe comment syntax (in this case /* */) is ignored - we just look for the AIGenVersion via regular expression.\nA version comment will be written at the start or end of the output file; that and the comment syntax used are\ndetermined by the file extension.\n\n## Using different large language models\n\nWhile the tool defaults to using the OpenAI chat completion service, it is possible to use other services / LLMs as \nwell. I tried with [Anthropic Claude](https://www.anthropic.com/claude) \n[text generation](https://docs.anthropic.com/claude/docs/text-generation) and some local models run with the nice \n[LM Studio](https://lmstudio.ai/). 
See [using other models](https://aigenpipeline.stoerr.net/otherModels.md) for \nsome examples.\n\n## Configuration files\n\nThe tool can read configuration files with common configurations (e.g. which AI backend to use). These should simply\ncontain command line options; the tool splits them at whitespace just like in bash. Also, there can be an environment \nvariable AIGENPIPELINE_CONFIG that can contain options.\n\nConfiguration files can be given explicitly (option `-cf` / `--configfile`) or the tool can scan for files named\n`.aigenpipeline` upwards from the output file directory. The search for `.aigenpipeline` files can be switched off\nwith the `-cn` / `--confignoscan` option. If that option is given in one of these configuration files, that aborts\nthe scanning further upwards in the directory tree.\n\nThe order in which these configurations are processed is: environment variable, `.aigenpipeline` files from top to bottom,\ncommand line arguments. Thus, later ones override earlier ones. Explicitly given configuration files are\nprocessed at the point where the argument occurs when processing the command line arguments. The option `-cp` / \n`--configprint` gives an overview of the used files / sources of configuration.\n\n## Other features\n\nIf you are not satisfied with the result, the tool can also be used to ask the AI for clarification: ask a question\nabout the result or have it make suggestions on how to improve the prompt. This mode will not write the output file, but\nrecreate the conversation that led to the output file and ask the AI for a clarification or suggestion in the form of a\nchat continuation.\n\nIt's also possible to ask the tool questions about itself - use the `-ha` / `--helpai` option with a question, and it \ntries to answer that from this documentation. 
There is also an \n[OpenAI GPT](https://chatgpt.com/g/g-zheGoARkR-ai-based-code-generation-pipeline-helper)\nthat can be asked.\n\n## Command Usage\n\n```\nUsage:\naigenpipeline [options] [\u003cinput_files\u003e...]\n\nOptions:\n\n  General options:\n    -h, --help               Show this help message and exit.\n    -ha, --helpai \u003cquestion\u003e Answer a question about the tool from the text on its documentation site and exit.\n                             Use as last argument - rest of command line counts as question.\n    --version                Show the version of the AIGenPipeline tool and exit.\n    -c, --check              Only check if the output needs to be regenerated based on input versions without actually \n                             generating it. The exit code is 0 if the output is up to date, 1 if it needs to be \n                             regenerated.\n    -n, --dry-run            Enable dry-run mode, where the tool will only print to stderr what it would do without \n                             actually calling the AI or writing any files.\n    -v, --verbose            Enable verbose output to stderr, providing more details about the process.\n\n  Input / outputs:\n    -o, --output \u003cfile\u003e      Specify the output file where the generated content will be written.\n    -ifp, --infileprompt \u003cmarker\u003e \u003cfile\u003e  The output and the prompt are in the same file, the marker is used in separating the parts.\n    -upd, --update           Gives the current content of the output as hint to the AI that it should be updated / improved.\n    --hint \u003cfile\u003e            Gives this file as additional clue to the AI (special filename - is stdin), e.g. to tell\n                             it to focus on something for an --update. 
This is not used for version checking.\n    -p, --prompt \u003cfile\u003e      Reads a prompt from the given file.\n    -s, --sysmsg \u003cfile\u003e      Optional: Reads a system message from the given file instead of using the default.\n    -k \u003ckey\u003e=\u003cvalue\u003e         Sets a key-value pair replacing ${key} in prompt files with the value. \n    -os, --outputscan \u003cpattern\u003e  Searches for files matching the ant-like pattern and scans them for AIGenPromptStart markers.\n                             The infile prompts in these files are processed (see -ifp).\n    -dd, --dependencydiagram Print a dependency diagram (Mermaid graph) of the scanned files and exit.\n\n  AI Generation control:\n    -f, --force              Force regeneration of output files, ignoring any version checks - same as -ga.\n    -ga, --gen-always        Generate the output file always, ignoring version checks.\n    -gn, --gen-notexists     Generate the output file only if it does not exist.\n    -go, --gen-older         Generate the output file if it does not exist or is older than any of the input files.\n    -gv, --gen-versioncheck  Generate the output file if the version of the input files has changed. (Default.)\n    -wv, --write-version     Write the output file with a version comment. (Default.)\n    -wvf, --write-versionfile Write the version comment to a separate file named like the output file with .version appended.\n    -wo, --write-noversion   Write the output file without a version comment. 
Not compatible with default -gv .\n    -wp, --write-part \u003cmarker\u003e Replace the lines between the first occurrence of the marker and the second occurrence.\n                             If a version marker is written, it has to be in the first of those lines and is changed there.\n                             It is an error if the marker does not occur exactly twice; the output file has to exist.\n    -e, --explain \u003cquestion\u003e Asks the AI a question about the generated result. This needs _exactly_the_same_command_line_\n                             that was given to generate the output file, and the additional --explain \u003cquestion\u003e option.\n                             It recreates the conversation that lead to the output file and asks the AI for a \n                             clarification. The output file is not written, but read to recreate the conversation.\n                             Use as last argument - rest of command line counts as question.\n\n  Configuration files:\n    -cf, --configfile \u003cfile\u003e Read configuration from the given file. These contain options like on the command line.\n    -cn, --confignoscan      Do not scan for `.aigenpipeline` config files.\n    -cne, --configignoreenv  Ignore the environment variable `AIGENPIPELINE_CONFIG`.\n    -cp, --configprint       Print the collected configurations and exit.\n\n  AI backend settings:\n    -u, --url \u003curl\u003e          The URL of the AI server. Default is https://api.openai.com/v1/chat/completions .\n                             In the case of OpenAI the API key is expected to be in the environment variable \n                             OPENAI_API_KEY, or given with the -a option.\n    -a, --api-key \u003ckey\u003e      The API key for the AI server. 
If not given, it's expected to be in the environment variable \n                             OPENAI_API_KEY, or you could use a -u option to specify a different server that doesnt need\n                             an API key. Used in \"Authorization: Bearer \u003ckey\u003e\" header.\n    -org, --organization \u003cid\u003e The optional organization id in case of the OpenAI server.\n    -m, --model \u003cmodel\u003e      The model to use for the AI. Default is gpt-4o .\n    -t \u003cmaxtokens\u003e           The maximum number of tokens to generate.\n\nArguments:\n  [\u003cinput_files\u003e...]       Input files to be processed into the output file. \n\nExamples:\n  Generate documentation from a prompt file:\n    aigenpipeline -p prompts/documentation_prompt.txt -o generated_documentation.md src/foo/bar.java src/foo/baz.java\n\n  Force regenerate an interface from an OpenAPI document, ignoring version checks:\n    aigenpipeline -f -o specs/openapi.yaml -p prompts/api_interface_prompt.txt src/main/java/foo/MyInterface.java\n\n  Ask how to improve a prompt after viewing the initial generation of specs/openapi.yaml:\n    aigenpipeline -o PreviousOutput.java -p prompts/promptGenertaion.txt specs/openapi.yaml --explain \"Why did you not use annotations?\"\n\n  Scan for files with infile prompts and (re-)generate the AI generated parts of those files:\n    aigenpipeline -os \"src/site/**/*.md\"\n\nInfile prompts:\n  The idea is that files can contain both the prompt that has been used to generate their AI generated part(s) in a\n  comment, and also instructions like the used input files or other settings. In such a file you would have e.g. 
\n  \n  \u003c!-- AIGenPromptStart(somemarker)\n  (Here would come the prompt for the AI)\n  AIGenCommand(somemarker)\n  data.txt\n  AIGenPromptEnd(somemarker) --\u003e\n  (Here is the generated content placed after calling aigenpipeline.)\n  \u003c!-- AIGenEnd(somemarker) --\u003e\n  \n  That also means that it's not necessary to write a script that processes each of those files, but the tool can scan\n  for AIGenPromptStart markers, as in the example `aigenpipeline -os \"src/site/**/*.md\"` above.\n\nConfiguration files:\n  These contain options like on the command line. The environment variable `AIGENPIPELINE_CONFIG` can contain options.\n  If -cn is not given, the tool scans for files named .aigenpipeline upwards from the output file directory.\n  The order these configurations are processed is: environment variable, .aigenpipeline files from top to bottom,\n  command line arguments. Thus the later override the earlier one, as these get more specific to the current call.\n  Lines starting with a # are ignored in configuration files (comments).\n\nNote:\n  It's recommended to manually review and edit generated files. Use version control to manage and track changes over time. \n  More detailed instructions and explanations can be found at https://aigenpipeline.stoerr.net/ .\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstoerr%2Faigenpipeline","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstoerr%2Faigenpipeline","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstoerr%2Faigenpipeline/lists"}