{"id":28207631,"url":"https://github.com/samestrin/llm-file-processor","last_synced_at":"2026-03-02T03:33:01.597Z","repository":{"id":293558557,"uuid":"984424565","full_name":"samestrin/llm-file-processor","owner":"samestrin","description":"Automate, standardize, and enrich your files at scale with LLM-powered transformations","archived":false,"fork":false,"pushed_at":"2025-05-16T20:37:02.000Z","size":81,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-10-04T17:53:14.610Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/samestrin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-05-15T23:05:10.000Z","updated_at":"2025-05-16T20:37:06.000Z","dependencies_parsed_at":null,"dependency_job_id":"40fc68f6-6b0e-4e05-a729-9bb61eb3f3ec","html_url":"https://github.com/samestrin/llm-file-processor","commit_stats":null,"previous_names":["samestrin/llm-file-processor"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/samestrin/llm-file-processor","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/samestrin%2Fllm-file-processor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/samestrin%2Fllm-file-processor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/samestrin%2Fllm-file
-processor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/samestrin%2Fllm-file-processor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/samestrin","download_url":"https://codeload.github.com/samestrin/llm-file-processor/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/samestrin%2Fllm-file-processor/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29991794,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-02T01:47:34.672Z","status":"online","status_checked_at":"2026-03-02T02:00:07.342Z","response_time":60,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-05-17T12:16:22.959Z","updated_at":"2026-03-02T03:33:01.590Z","avatar_url":"https://github.com/samestrin.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# LLM File Processor\n\n[![Star on GitHub](https://img.shields.io/github/stars/samestrin/llm-file-processor?style=social)](https://github.com/samestrin/llm-file-processor/stargazers) [![Fork on GitHub](https://img.shields.io/github/forks/samestrin/llm-file-processor?style=social)](https://github.com/samestrin/llm-file-processor/network/members) [![Watch on 
GitHub](https://img.shields.io/github/watchers/samestrin/llm-file-processor?style=social)](https://github.com/samestrin/llm-file-processor/watchers)\n\n![Version 1.0.0](https://img.shields.io/badge/Version-1.0.0-blue) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![Built with Node.js](https://img.shields.io/badge/Built%20with-Node.js-green)](https://nodejs.org/)\n\n\u003e **Automate, standardize, and enrich your files at scale with LLM-powered transformations**\n\nA flexible Node.js CLI that applies custom LLM prompts to files or entire directories—turn unstructured documentation, code, or data into consistent, structured, and actionable outputs with minimal effort.\n\n## Key Features\n\n* **Rule-Driven Workflows**: Define a single prompt file containing transformation rules, and let the CLI enforce them across every input file.\n* **LLM-Agnostic**: Swap models or providers via environment variables; works with any OpenAI-compatible API endpoint.\n* **Batch \u0026 Parallel Processing**: Process individual files or entire directories in configurable batch sizes, with optional delays for rate-limiting.\n* **Dry-Run Mode**: Preview combined prompts without making API calls, perfect for testing and validation.\n* **JSON-First Output**: Receive clean, machine-readable JSON responses for seamless integration into pipelines.\n* **Prompt Validation**: Built-in LLM-based prompt sanity checks to ensure your rules translate into valid transformations.\n\n## Use Cases\n\n1. **Uniform Documentation**\n   Standardize a scattered collection of markdown files—add TOCs, enforce heading hierarchies, flag missing sections, and generate summary sections automatically.\n\n2. **Web Content Summarization**\n   Crawl or aggregate dozens (or hundreds) of web pages, then compress and transform them into structured in-context learning data for your next prompt-engineering or fine-tuning project.\n\n3. 
**Automated Code Review \u0026 Linting**\n   Feed diffs or code snippets through custom prompts to enforce style guides, detect anti-patterns, and suggest refactors at scale.\n\n4. **Test Case Generation**\n   Generate unit or integration tests by providing source files and rules for expected behaviors—ideal for accelerating test coverage in legacy codebases.\n\n5. **Changelog \u0026 Release Notes**\n   Scan commit messages or diff logs, then automatically produce human-friendly change summaries and release notes in your preferred format.\n\n6. **Data Extraction \u0026 Metadata Tagging**\n   Transform CSVs, logs, or JSON files by extracting key fields, tagging records, or reformatting data for downstream analytics.\n\n7. **Migration of Legacy Formats**\n   Batch-convert legacy documentation, configuration files, or proprietary formats into modern standards (e.g., Markdown → Markdown with frontmatter, YAML → JSON).\n\n8. **Localization \u0026 Internationalization**\n   Automate translation or adaptation of text files by applying LLM-based translation prompts, with markers for review or missing strings.\n\n9. **CI/CD Integration**\n   Incorporate the CLI into Git hooks or CI pipelines to enforce content and code health checks on every commit or pull request.\n\n10. 
**Training Data Preparation**\n    Generate clean, structured training examples by defining in-context learning rules—ideal for building your own LLM benchmarks or fine-tuning datasets.\n\n## Installation\n\nYou can install the LLM File Processor globally via npm:\n\n```bash\nnpm install -g llm-file-processor\n```\n\nAlternatively, you can use `npx` to run it without installing globally:\n\n```bash\nnpx llm-file-processor [options]\n```\n\nIf you prefer to clone the repository and run it locally:\n\n```bash\n# Clone repository\ngit clone https://github.com/samestrin/llm-file-processor.git\ncd llm-file-processor\n\n# Install dependencies\nnpm install\n\n# Make CLI executable (if running directly)\nchmod +x llm-file-processor.js\n\n# (Optional) Link globally for local development\nnpm link\n```\n\n## Configuration\n\nCreate a `.env` file in the project root:\n\n```dotenv\nOPENAI_API_KEY=your_api_key_here\nOPENAI_MODEL=your-model-identifier # e.g. gpt-4.1\n```\n\n\u003e **Tip:** Use any OpenAI-compatible endpoint by setting the `OPENAI_API_URL` environment variable.\n\n## Usage\n\n```bash\n# Process a single file\nllm-file-processor --prompt-file path/to/prompt.txt --file path/to/doc.md\n\n# Process an entire directory\nllm-file-processor --prompt-file path/to/prompt.txt --directory path/to/project/docs\n\n# Preview prompts without API calls\nllm-file-processor -p prompt.txt -f file.md --dry-run\n\n# Generate test files with modified filenames\nllm-file-processor -p test-generation.txt -f userAuthentication.js --insert-before-ext \".test\"\n\n# Process log files and output as JSON\nllm-file-processor -p extract-data.txt -d logs/ --output-ext json\n\n# Process multiple files and merge results into a single output\nllm-file-processor -p extract-data.txt -d logs/ -m json\n\n# Process files and merge with custom extension\nllm-file-processor -p summarize.txt -d articles/ -m md --output-ext summary.md\n\n# Batch process with custom settings\nllm-file-processor -p rules.txt 
-d src -b 5 --delay 1000\n```\n\n### CLI Options\n\n| Option                        | Description                                                                           |\n| ----------------------------- | ------------------------------------------------------------------------------------- |\n| `-p, --prompt-file \u003cfile\u003e`    | Path to the prompt file (required)                                                    |\n| `-f, --file \u003cfile\u003e`           | Path to a single file to process                                                      |\n| `-d, --directory \u003cdir\u003e`       | Path to a directory of files to process                                               |\n| `-o, --output \u003cdir\u003e`          | Specify a custom output directory (default: `./processed-\u003ctimestamp\u003e`)                |\n| `--insert-before-ext \u003ctext\u003e`  | Insert text before file extension (e.g., \".test\" for \"file.test.js\" from \"file.js\")   |\n| `--output-ext \u003cextension\u003e`    | Change or add file extension (e.g., \"json\" to save as \"file.log.json\")                |\n| `-m, --merge \u003cfilename\u003e`      | Merge all processed files into a single output file \"\u003cfilename\u003e\"                      |\n| `--dry-run`                   | Combine prompts and files without sending to LLM                                      |\n| `-b, --batch-size \u003cnumber\u003e`   | Number of files per batch (default: 1)                                                |\n| `--delay \u003cms\u003e`                | Milliseconds to wait between API batches (default: 500)                               |\n| `-h, --help`                  | Display help information                                                              |\n| `-v, --version`               | Display version information                                                           |\n\n## Writing Effective Prompts\n\nCraft transformation rules in your prompt file to guide the LLM. 
Example:\n\n```\n1. Generate a table of contents.\n2. Normalize all headings to Markdown `##`, `###`, etc.\n3. Flag sections missing a required `Summary` header.\n4. Append a `## Key Takeaways` section at the end.\n```\n\n## Contribute\n\nContributions to this project are welcome. Please fork the repository and submit a pull request with your changes or improvements.\n\n## License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsamestrin%2Fllm-file-processor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsamestrin%2Fllm-file-processor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsamestrin%2Fllm-file-processor/lists"}