{"id":24590547,"url":"https://github.com/regenrek/codefetch","last_synced_at":"2025-05-16T06:08:19.152Z","repository":{"id":272081276,"uuid":"915470838","full_name":"regenrek/codefetch","owner":"regenrek","description":"Turn code into Markdown for LLMs with one simple terminal command","archived":false,"fork":false,"pushed_at":"2025-01-22T22:45:05.000Z","size":997,"stargazers_count":264,"open_issues_count":1,"forks_count":17,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-11T03:11:30.081Z","etag":null,"topics":["cli","llm","markdown"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/regenrek.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-11T23:25:43.000Z","updated_at":"2025-05-08T23:26:11.000Z","dependencies_parsed_at":"2025-01-12T00:24:06.183Z","dependency_job_id":"e42952b9-b682-4f5a-96f5-15383ea38b5c","html_url":"https://github.com/regenrek/codefetch","commit_stats":null,"previous_names":["regenrek/boltfetch"],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/regenrek%2Fcodefetch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/regenrek%2Fcodefetch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/regenrek%2Fcodefetch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/regenrek%2Fcodefetch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/r
egenrek","download_url":"https://codeload.github.com/regenrek/codefetch/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254478193,"owners_count":22077676,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cli","llm","markdown"],"created_at":"2025-01-24T09:13:49.613Z","updated_at":"2025-05-16T06:08:14.143Z","avatar_url":"https://github.com/regenrek.png","language":"TypeScript","funding_links":[],"categories":["TypeScript"],"sub_categories":[],"readme":"# codefetch\n\n![Codefetch Cover](/public/cover.jpeg)\n\n[![npm (tag)](https://img.shields.io/npm/v/codefetch)](https://www.npmjs.com/package/codefetch)\n\n\u003eTurn code into Markdown for LLMs with one simple terminal command\n\n\nFetches all code files in the current directory, ignoring what's in `.gitignore` and `.codefetchignore`, then outputs them into a single Markdown file with line numbers.\n\nClick here for a [Demo \u0026 Videos](https://x.com/kregenrek/status/1878487131099898269)\n\n## Usage\nBasic usage with output file and tree\n```bash\nnpx codefetch\n# Your codebase will be saved to `codefetch/codebase.md`\n```\n\nInclude a default prompt:\n```bash\nnpx codefetch -p improve\n```\n\nInclude a tree with depth\n```bash\nnpx codefetch -t 3\n```\n\nFilter by file extensions:\n```bash\nnpx codefetch -e .ts,.js -o typescript-files.md --token-encoder cl100k\n```\n\nInclude or exclude specific files and directories:\n```bash\n# Exclude test and public directories\nnpx codefetch --exclude-dir test,public\n\n# Include only TypeScript files\nnpx codefetch 
--include-files \"*.ts\" -o typescript-only.md\n\n# Include src directory, exclude test files\nnpx codefetch --include-dir src --exclude-files \"*.test.ts\" -o src-no-tests.md\n```\n\nDry run (only output to console)\n```bash\nnpx codefetch -d\n```\n\nIf no output file is specified (`-o` or `--output`), the output is written to `codefetch/codebase.md`\n\n## Options\n\n| Option | Description |\n|--------|-------------|\n| `-o, --output \u003cfile\u003e` | Specify output filename (defaults to codebase.md) |\n| `--dir \u003cpath\u003e` | Specify the directory to scan (defaults to current directory) |\n| `--max-tokens \u003cnumber\u003e` | Limit output tokens (default: 500,000) |\n| `-e, --extension \u003cext,...\u003e` | Filter by file extensions (e.g., .ts,.js) |\n| `--token-limiter \u003ctype\u003e` | Token limiting strategy when using --max-tokens (sequential, truncated) |\n| `--include-files \u003cpattern,...\u003e` | Include specific files (supports patterns like *.ts) |\n| `--exclude-files \u003cpattern,...\u003e` | Exclude specific files (supports patterns like *.test.ts) |\n| `--include-dir \u003cdir,...\u003e` | Include specific directories |\n| `--exclude-dir \u003cdir,...\u003e` | Exclude specific directories |\n| `-v, --verbose [level]` | Show processing information (0=none, 1=basic, 2=debug) |\n| `-t, --project-tree [depth]` | Generate visual project tree (optional depth, default: 2) |\n| `--token-encoder \u003ctype\u003e` | Token encoding method (simple, p50k, o200k, cl100k) |\n| `--disable-line-numbers` | Disable line numbers in output |\n| `-d, --dry-run` | Output markdown to stdout instead of file |\n\nAll options that accept multiple values use comma-separated lists. 
File patterns support simple wildcards:\n- `*` matches any number of characters\n- `?` matches a single character\n\n### Project Tree\n\nYou can generate a visual tree representation of your project structure:\n\n```bash\n# Generate tree with default depth (2 levels)\nnpx codefetch --project-tree\n\n# Generate tree with custom depth\nnpx codefetch -t 3\n\n# Generate tree and save code to file\nnpx codefetch -t 2 -o output.md\n```\n\nExample output:\n```\nProject Tree:\n└── my-project\n    ├── src\n    │   ├── index.ts\n    │   ├── types.ts\n    │   └── utils\n    ├── tests\n    │   └── index.test.ts\n    └── package.json\n```\n\n### Using Prompts\n\nYou can add predefined or custom prompts to your output:\n\n```bash\n# Use default prompt (looks for codefetch/prompts/default.md)\nnpx codefetch -p\nnpx codefetch --prompt\n\n# Use built-in prompts\nnpx codefetch -p fix # fixes codebase\nnpx codefetch -p improve # improves codebase\nnpx codefetch -p codegen # generates code\nnpx codefetch -p testgen # generates tests\n\n# Use custom prompts\nnpx codefetch --prompt custom-prompt.md\nnpx codefetch -p my-architect.txt\n```\n\n#### Custom Prompts\n\nCreate custom prompts in `codefetch/prompts/` directory:\n\n1. Create a markdown file (e.g., `codefetch/prompts/my-prompt.md`)\n2. Use it with `--prompt my-prompt.md`\n\nYou can also set a default prompt in your `codefetch.config.mjs`:\n\n```js\nexport default {\n  defaultPromptFile: \"dev\", // Use built-in prompt\n}\n\nexport default {\n  defaultPromptFile: \"custom-prompt.md\", // Use custom prompt file\n}\n```\n\nThe prompt resolution order is:\n1. CLI argument (`-p` or `--prompt`)\n2. Config file prompt setting\n3. 
No prompt if neither is specified\n\nWhen using just `-p` or `--prompt` without a value, codefetch will look for `codefetch/prompts/default.md`.\n\n## Token Limiting Strategies\n\nWhen using `--max-tokens`, you can control how tokens are distributed across files using the `--token-limiter` option:\n\n```bash\n# Sequential mode - process files in order until reaching token limit\nnpx codefetch --max-tokens 500 --token-limiter sequential\n\n# Truncated mode (default) - distribute tokens evenly across all files\nnpx codefetch --max-tokens 500 --token-limiter truncated\n```\n\n![tokenlimiter](/public/tokenlimiter.png)\n\n- `sequential`: Processes files in order until the total token limit is reached. Useful when you want complete content from the first files.\n- `truncated`: Distributes tokens evenly across all files, showing partial content from each file. This is the default mode and is useful for getting an overview of the entire codebase.\n\n## Ignoring Files\n\ncodefetch supports two ways to ignore files:\n\n1. `.gitignore` - Respects your project's existing `.gitignore` patterns\n2. `.codefetchignore` - Additional patterns specific to codefetch\n\nThe `.codefetchignore` file works exactly like `.gitignore` and is useful when you want to ignore files that aren't in your `.gitignore`. \n\n### Default Ignore Patterns\n\nCodefetch uses a set of default ignore patterns to exclude common files and directories that typically don't need to be included in code reviews or LLM analysis. 
\n\nYou can view the complete list of default patterns in [default-ignore.ts](src/default-ignore.ts).\n\n## Token Counting\n\nCodefetch supports different token counting methods to match various AI models:\n\n- `simple`: Basic word-based estimation (least accurate, but fastest)\n- `p50k`: GPT-3 style tokenization\n- `o200k`: GPT-4o style tokenization\n- `cl100k`: GPT-4 style tokenization\n\nSelect the appropriate encoder based on your target model:\n\n```bash\n# For GPT-4o\nnpx codefetch --token-encoder o200k\n```\n\n## Output Directory\n\nBy default (unless using `--dry-run`), codefetch will:\n1. Create a `codefetch/` directory in your project\n2. Store all output files in this directory\n\nThis ensures that:\n- Your fetched code is organized in one place\n- The output directory itself is not fetched, so earlier output is never fetched again\n\nAdd `codefetch/` to your `.gitignore` file to avoid committing the fetched codebase. \n\n## Use with AI Tools\n\nYou can use this command to turn your code into Markdown for [bolt.new](https://bolt.new), [cursor.com](https://cursor.com), ... and ask the AI chat for guidance about your codebase. \n\n\n## Or install globally:\n```bash\nnpm install -g codefetch\ncodefetch -o output.md\n```\n\n## Integrate codefetch into your project\n\nInitialize your project with codefetch:\n\n```bash\nnpx codefetch init\n```\n\nThis will:\n1. Create a `.codefetchignore` file for excluding files\n2. Generate a `codefetch.config.mjs` with your preferences\n3. 
Set up the project structure\n\n\n### `codefetch.config.mjs` Config File\n\nCreate a `codefetch.config.mjs` file in your project root:\n\n```js\nexport default {\n  // Output settings\n  outputPath: \"codefetch\", // Directory for output files\n  outputFile: \"codebase.md\", // Output filename\n  maxTokens: 999_000, // Token limit\n  disableLineNumbers: false, // Toggle line numbers in output\n  \n  // Processing options\n  verbose: 1, // Logging level (0=none, 1=basic, 2=debug)\n  projectTree: 2, // Project tree depth\n  defaultIgnore: true, // Use default ignore patterns\n  gitignore: true, // Respect .gitignore\n  dryRun: false, // Output to console instead of file\n  \n  // Token handling\n  tokenEncoder: \"simple\", // Token counting method (simple, p50k, o200k, cl100k)\n  tokenLimiter: \"truncated\", // Token limiting strategy\n  \n  // File filtering\n  extensions: [\".ts\", \".js\"], // File extensions to include\n  includeFiles: [\"src/**/*.ts\"], // Files to include (glob patterns)\n  excludeFiles: [\"**/*.test.ts\"], // Files to exclude\n  includeDirs: [\"src\", \"lib\"], // Directories to include\n  excludeDirs: [\"test\", \"dist\"], // Directories to exclude\n  \n  // AI/LLM settings\n  trackedModels: [\n    \"chatgpt-4o-latest\",\n    \"claude-3-5-sonnet-20241022\",\n    \"o1\",\n    \"deepseek-v3\",\n    \"gemini-exp-1206\",\n  ],\n  \n  // Prompt handling\n  prompt: \"dev\", // Built-in prompt or custom prompt file\n  defaultChat: \"https://chat.com\", // Default chat URL\n  templateVars: {}, // Variables for template substitution\n}\n```\n\nAll configuration options are optional and will fall back to defaults if not specified. 
You can override any config option using CLI arguments.\n\n## Links\n\n- X/Twitter: [@kregenrek](https://x.com/kregenrek)\n- Bluesky: [@kevinkern.dev](https://bsky.app/profile/kevinkern.dev)\n\n## Courses\n- Learn Cursor AI: [Ultimate Cursor Course](https://www.instructa.ai/en/cursor-ai)\n- Learn to build software with AI: [AI Builder Hub](https://www.instructa.ai/en/ai-builder-hub)\n\n## See my other projects:\n\n* [codefetch](https://github.com/regenrek/codefetch) - Turn code into Markdown for LLMs with one simple terminal command\n* [aidex](https://github.com/regenrek/aidex) - A CLI tool that provides detailed information about AI language models, helping developers choose the right model for their needs.\n* [codetie](https://github.com/codetie-ai/codetie) - Xcode CLI\n\n## Credits\n\nThis project was inspired by:\n\n* [codetie](https://github.com/codetie-ai/codetie) CLI made by [@kevinkern](https://github.com/regenrek) \u0026 [@timk](https://github.com/KerneggerTim)\n* [sitefetch](https://github.com/egoist/sitefetch) CLI made by [@egoist](https://github.com/egoist). While sitefetch is great for fetching documentation and websites, codefetch focuses on fetching local codebases for AI analysis.\n* [unjs](https://github.com/unjs) - for bringing us the best JavaScript tooling system\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fregenrek%2Fcodefetch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fregenrek%2Fcodefetch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fregenrek%2Fcodefetch/lists"}