{"id":25949714,"url":"https://github.com/sandy-sp/gittxt","last_synced_at":"2025-03-04T12:28:49.812Z","repository":{"id":279334393,"uuid":"938461593","full_name":"sandy-sp/gittxt","owner":"sandy-sp","description":"Gittxt is a lightweight CLI tool that extracts text from Git repositories and formats it into AI-friendly outputs (.txt, .json, .md). Whether you’re using ChatGPT, Grok, or Ollama, or any LLM, Gittxt helps process repositories for insights, training, and documentation.","archived":false,"fork":false,"pushed_at":"2025-03-04T09:20:43.000Z","size":170,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-04T09:24:01.457Z","etag":null,"topics":["ai","cli-tool","git","json","llm","markdown","nlp","repository","text","text-extraction"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sandy-sp.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-25T01:54:39.000Z","updated_at":"2025-03-01T08:33:57.000Z","dependencies_parsed_at":"2025-02-25T03:29:11.471Z","dependency_job_id":"1500d2d5-0190-4967-ade3-a58fb30fa0ce","html_url":"https://github.com/sandy-sp/gittxt","commit_stats":null,"previous_names":["sandy-sp/gittxt"],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sandy-sp%2Fgittxt","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sandy-sp%2Fgittxt/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sandy-sp%2Fgittxt/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sandy-sp%2Fgittxt/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sandy-sp","download_url":"https://codeload.github.com/sandy-sp/gittxt/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241821125,"owners_count":20025661,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","cli-tool","git","json","llm","markdown","nlp","repository","text","text-extraction"],"created_at":"2025-03-04T12:28:49.188Z","updated_at":"2025-03-04T12:28:49.803Z","avatar_url":"https://github.com/sandy-sp.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🚀 Gittxt: Get Text of Your Repo for AI, LLMs \u0026 Docs!\n\n**Gittxt** is a **lightweight CLI tool** that extracts text from **Git repositories** and formats it into AI-friendly outputs (`.txt`, `.json`, `.md`). Whether you’re using ChatGPT, Grok, Ollama or any LLM, Gittxt helps you process repositories for insights, training, and documentation.\n\n---\n\n## ✨ Why Use Gittxt?\n- **Extract Readable Text:** Easily pull text from code, docs, and other repository files.\n- **AI-Friendly Outputs:** Generate outputs in TXT, JSON, and Markdown for different use cases.\n- **Efficient Processing:** Faster scanning with incremental caching.\n- **Flexible Filtering:** Use advanced flags like `--docs-only` and `--auto-filter` to control what’s extracted.\n- **Multi-Repository Support:** Scan one or more repositories in a single command.\n\n---\n\n## 🆕 Release v1.3.1\n\n### New Features \u0026 Enhancements\n- **Interactive Installation:**  \n  Use the new `gittxt install` subcommand to set up your configuration (output directory, logging preferences, etc.) interactively.\n\n- **Multi-Repository Scanning:**  \n  Scan multiple repositories at once, whether they are local or remote.\n\n- **Advanced Filtering Options:**  \n  - `--docs-only`: Extract only documentation files (e.g., README, docs/ folder, etc.).\n  - `--auto-filter`: Automatically skip common unwanted or binary files.\n\n- **Multi-Format Output:**  \n  Specify multiple output formats simultaneously (e.g., `--output-format txt,json,md`).\n\n- **Enhanced Summary Reports:**  \n  Outputs include summary statistics and an estimated token count for further AI processing.\n\n- **Improved Logging \u0026 Caching:**  \n  Faster, more accurate scanning with incremental caching and a rotating log file system.\n\n---\n\n## 📥 Installation\n\n### Via PIP\n```bash\npip install gittxt==1.3.1\n```\n\n### First-Time Setup (Interactive)\nAfter installing, run:\n```bash\ngittxt install\n```\nThis command will prompt you to configure:\n- Your default output directory (automatically set based on your OS, e.g., `~/Gittxt/` on Linux/Mac)\n- Logging level and file logging preferences\n\n---\n\n## 📌 How to Use Gittxt\n\n### 1. Scanning Repositories\nUse the `scan` subcommand to extract text and generate outputs.\n\n#### Scan a Local Repository\n```bash\ngittxt scan .\n```\nExtracts all readable text into the default output directories.\n\n#### Scan a Remote GitHub Repository\n```bash\ngittxt scan https://github.com/sandy-sp/sandy-sp\n```\nAutomatically clones the repository, scans it, and extracts text.\n\n#### Scan Multiple Repositories with Advanced Options\n```bash\ngittxt scan /path/to/repo1 https://github.com/user/repo2 --output-format txt,json --docs-only --auto-filter --summary\n```\n\n---\n\n## 🔧 CLI Options\n\n| Option                   | Description                                                               |\n|--------------------------|---------------------------------------------------------------------------|\n| `--include`              | Include only files matching these patterns.                              |\n| `--exclude`              | Exclude files matching these patterns.                                   |\n| `--size-limit`           | Exclude files larger than the specified size (in bytes).                 |\n| `--branch`               | Specify a Git branch (for remote repositories).                          |\n| `--output-dir`           | Override the default output directory.                                   |\n| `--output-format`        | Comma-separated list of output formats (e.g., `txt,json,md`).               |\n| `--max-lines`            | Limit the number of lines per file.                                      |\n| `--summary`              | Display a summary report after scanning.                                 |\n| `--debug`                | Enable debug mode for detailed logging.                                  |\n| `--docs-only`            | Only extract documentation files (e.g., README, docs folder).              |\n| `--auto-filter`          | Automatically skip common unwanted or binary files.                      |\n\n---\n\n## 📄 Output Formats\n\n- **TXT:** Simple text extraction for AI chat and quick analysis.\n- **JSON:** Structured output ideal for LLM training and data preprocessing.\n- **Markdown (MD):** Neatly formatted documentation for GitHub or project READMEs.\n\nWhen specifying multiple formats (e.g., `--output-format txt,json`), Gittxt generates separate files in their respective output directories.\n\n---\n\n## 🗂 Directory Structure\n\nBy default, outputs are stored in your configured output directory, which is organized as follows:\n```\n\u003coutput_dir\u003e/\n  ├── text/    # Plain text outputs (.txt)\n  ├── json/    # JSON outputs (.json)\n  ├── md/      # Markdown outputs (.md)\n  └── cache/   # Caching for incremental scans\n```\n\n---\n\n## ⚙️ Configuration\n\nGittxt uses a configuration file (`gittxt-config.json`) to store user preferences. You can update this configuration via the interactive install command:\n```bash\ngittxt install\n```\nOr edit the file manually. Key settings include:\n- **Output Directory:** Auto-determined based on your OS (e.g., `~/Gittxt/`).\n- **Logging Options:** Logging level and file logging preferences.\n- **Filtering Options:** Include/exclude patterns, file size limits, etc.\n\n---\n\n## 📌 Contribute \u0026 Develop\n\n1. **Run Tests:**\n   ```bash\n   pytest tests/\n   ```\n2. **Format Code:**\n   ```bash\n   black src/\n   ```\n3. **Submit a PR:**\n   - Fork the repo.\n   - Create a new branch (e.g., `feature/my-change`).\n   - Push your changes.\n   - Submit a PR.\n\nFor more details, see the [Contributing Guide](CONTRIBUTING.md).\n\n---\n\n## 💡 Future Roadmap\n\nOur future plans include enhancements to the user interface and further AI-based features. We’re working on a lightweight web-based UI and additional improvements that streamline repository analysis and documentation extraction.\n\n---\n\n## 📜 License\n\nGittxt is licensed under the **MIT License**.\n\n---\n\n## **Made by [Sandeep Paidipati](https://github.com/sandy-sp)**\n🚀 **Gittxt: Get Text of Your Repo for AI, LLMs \u0026 Docs!**\n\n---","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsandy-sp%2Fgittxt","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsandy-sp%2Fgittxt","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsandy-sp%2Fgittxt/lists"}