{"id":29564378,"url":"https://github.com/meijieru/github_summary","last_synced_at":"2026-05-17T21:11:30.857Z","repository":{"id":304426488,"uuid":"1018761241","full_name":"meijieru/github_summary","owner":"meijieru","description":"Fetches and summarizes GitHub repository activity, with AI-powered insights \u0026 RSS feed generation","archived":false,"fork":false,"pushed_at":"2026-05-10T04:45:30.000Z","size":215,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-05-10T06:49:16.806Z","etag":null,"topics":["github","llm","rss"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/meijieru.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2025-07-13T01:38:15.000Z","updated_at":"2026-05-10T04:45:27.000Z","dependencies_parsed_at":"2025-07-13T03:43:51.440Z","dependency_job_id":"365e7382-a0db-4cb7-aa51-a1136d8a9810","html_url":"https://github.com/meijieru/github_summary","commit_stats":null,"previous_names":["meijieru/github_summary"],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/meijieru/github_summary","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/meijieru%2Fgithub_summary","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/meijieru%2Fgithub_summary/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/rep
ositories/meijieru%2Fgithub_summary/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/meijieru%2Fgithub_summary/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/meijieru","download_url":"https://codeload.github.com/meijieru/github_summary/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/meijieru%2Fgithub_summary/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33155544,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-17T09:28:26.183Z","status":"ssl_error","status_checked_at":"2026-05-17T09:27:52.702Z","response_time":107,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["github","llm","rss"],"created_at":"2025-07-18T19:37:38.143Z","updated_at":"2026-05-17T21:11:30.850Z","avatar_url":"https://github.com/meijieru.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# GitHub Repository Summarizer\n\nA modern, async-first CLI tool and RSS server that fetches GitHub repository activity and generates AI-powered summaries using OpenAI and gidgethub.\n\n## 🚀 Quick Start\n\n1. **Install dependencies:**\n\n   ```bash\n   # Using uv (recommended)\n   pip install uv \u0026\u0026 uv sync\n\n   # Or using pip\n   pip install -e .\n   ```\n\n2. 
**Configure:**\n\n   ```bash\n   cp examples/basic.toml config.toml\n   # Edit config.toml with your GitHub token and LLM API key\n   ```\n\n3. **Run:**\n\n   ```bash\n   # Generate summaries\n   ghsum run\n   # Start RSS web server\n   ghsum serve\n   # Run scheduler daemon\n   ghsum schedule\n   ```\n\n## 📋 Commands\n\n### Generate Repository Summaries\n\n```bash\n# Process all repositories\nghsum run\n\n# Process specific repository\nghsum run --repo owner/repo-name\n\n# Save outputs\nghsum run --save-json --save-markdown\n\n# Skip LLM summary generation (data collection only)\nghsum run --skip-summary\n```\n\n### RSS Web Server\n\n```bash\n# Start server on default port (8000)\nghsum serve\n\n# Custom host and port\nghsum serve --host 0.0.0.0 --port 8080\n\n# Development mode with auto-reload\nghsum serve --reload\n```\n\n### Scheduler Daemon\n\n```bash\n# Run scheduled jobs (blocking)\nghsum schedule\n```\n\n### Utilities\n\n```bash\n# Validate configuration\nghsum utils validate-config\n```\n\n## ⚙️ Configuration\n\n### Basic Setup\n\n```toml\n# Optional. Defaults to:\n#   $XDG_STATE_HOME/github-summary\n# or:\n#   ~/.local/state/github-summary\n# run_dir = \"~/.local/state/github-summary\"\noutput_dir = \"output\"\ncache_dir = \"cache\"\nlog_dir = \"log\"\n\n[github]\ntoken = \"YOUR_GITHUB_TOKEN\"\n\n[[repositories]]\nname = \"owner/repo-name\"\n\n[llm]\napi_key = \"YOUR_LLM_API_KEY\"\nbase_url = \"https://api.openai.com/v1\"  # Optional: custom OpenAI-compatible endpoint\nmodel_name = \"gpt-4o-mini\"\nlanguage = \"English\"\n\n[performance]\nmax_concurrent_repos = 4  # Maximum concurrent repository processing\nmax_concurrent_llm = 3    # Maximum concurrent LLM requests\n```\n\nBy default, runtime files live under the XDG state directory: `$XDG_STATE_HOME/github-summary`, or `~/.local/state/github-summary` when `XDG_STATE_HOME` is not set. Relative `output_dir`, `cache_dir`, and `log_dir` values are resolved inside `run_dir`; absolute paths are used as-is. 
They can also be overridden from the CLI with `--output-dir`, `--cache-dir`, and `--log-dir`.\n\n### RSS Configuration\n\n```toml\n[rss]\ntitle = \"My GitHub Activity\"\nlink = \"https://my-domain.com/rss.xml\"\ndescription = \"Recent activity in my GitHub repositories\"\nfilename = \"rss.xml\"\n```\n\n### Scheduling\n\n```toml\n[schedule]\ncron = \"0 9,17 * * *\"       # Daily at 9 AM and 5 PM\ntimezone = \"America/New_York\"\n```\n\n**Common Cron Patterns:**\n\n### Per-Repository Scheduling\n\n```toml\n[repositories.schedule]\ncron = \"0 12 * * 1\"         # Every Monday at noon\ntimezone = \"UTC\"\n```\n\n### Release Tracking\n\nEnable or disable release tracking for a repository:\n\n```toml\n[[repositories]]\nname = \"owner/repo-name\"\ninclude_releases = true\n\n# Only fetch releases for this repository\n[[repositories]]\nname = \"owner/another-repo\"\nrelease_only = true\n```\n\n### Filtering Releases\n\n```toml\n[filters.releases]\nexclude_release_names_regex = \"-alpha|-beta\"\n```\n\n### Repository Grouping\n\nRepositories with the same schedule are automatically grouped for efficient processing:\n\n```toml\n# These will be processed together in a single job\n[[repositories]]\nname = \"owner/repo1\"\n[repositories.schedule]\ncron = \"0 9 * * *\"\n\n[[repositories]]\nname = \"owner/repo2\"\n[repositories.schedule]\ncron = \"0 9 * * *\"\n```\n\n## 🌐 Web Interface\n\nThe RSS server provides:\n\n- **RSS Feed**: Auto-generated from repository summaries\n- **Static Files**: Access generated markdown and JSON reports\n- **Health Check**: `/healthz` endpoint for monitoring\n- **Background Jobs**: Automatic scheduled processing\n\n```bash\n# Start server\nghsum serve --port 8080\n\n# Access feed\ncurl http://localhost:8080/rss.xml\n\n# Health check\ncurl http://localhost:8080/healthz\n```\n\n## 🔧 Advanced Usage\n\n### Environment 
Variables\n\n```bash\n# Override configuration\nexport GITHUB_TOKEN=\"your_token\"\nexport OPENAI_API_KEY=\"your_key\"\nexport GHSUM_CONCURRENT_REPOS=\"6\"\nexport GHSUM_CONFIG_PATH=\"/path/to/config.toml\"\n\n# Run with overrides\nghsum run --max-concurrent 8\n```\n\n### Programming API\n\n```python\nfrom github_summary.app import GitHubSummaryApp\n\n# Create application instance\napp = GitHubSummaryApp(\"config/config.toml\", skip_summary=False)\n\n# Process all repositories\nawait app.run()\n\n# Process specific repositories\nawait app.run(\n    repo_names=[\"owner/repo1\", \"owner/repo2\"],\n    save_json=True,\n    save_markdown=True,\n    max_concurrent_repos=4\n)\n```\n\n### Docker Deployment\n\nBuild the image:\n\n```bash\ndocker build -t github-summary .\n```\n\nRun the RSS server:\n\n```bash\ndocker run --rm \\\n  -p 8000:8000 \\\n  -v \"$PWD/config/config.toml:/config/config.toml:ro\" \\\n  -v github-summary-state:/data \\\n  github-summary\n```\n\nThe container sets `XDG_STATE_HOME=/data`, so runtime files are stored under `/data/github-summary` by default. 
Mount `/data` or set `run_dir` in `config.toml` if you want a different persistent location.\n\nRun the scheduler instead of the web server:\n\n```bash\ndocker run --rm \\\n  -v \"$PWD/config/config.toml:/config/config.toml:ro\" \\\n  -v github-summary-state:/data \\\n  github-summary schedule\n```\n\nRun a one-off report:\n\n```bash\ndocker run --rm \\\n  -v \"$PWD/config/config.toml:/config/config.toml:ro\" \\\n  -v github-summary-state:/data \\\n  github-summary run --skip-summary\n```\n\n## 🧪 Testing\n\n```bash\n# Run all tests\npytest\n\n# Run specific test categories\npytest -m unit\npytest -m integration\n\n# Test new architecture\npython test_new_architecture.py\n\n# Validate configuration\nghsum utils validate-config\n```\n\n## 📊 Monitoring\n\n### Logs\n\n```bash\n# Application logs\ntail -f log/github_summary.log\n\n# Structured logging with timestamps\n[2025-01-13 10:00:00] INFO Processing repository: owner/repo\n[2025-01-13 10:00:01] INFO Summary generated (1024 characters)\n```\n\n### Health Checks\n\n```bash\n# Web server health\ncurl http://localhost:8000/healthz\n\n# Scheduler status (check logs)\nghsum schedule 2\u003e\u00261 | grep \"Scheduler\"\n```\n\n### Performance Tuning\n\n```toml\n[performance]\nmax_concurrent_repos = 6     # Adjust based on GitHub API limits\nmax_concurrent_llm = 4       # Adjust based on LLM provider limits\n\nfallback_lookback_days = 14  # Increase for more historical data\n```\n\n## ❓ Troubleshooting\n\n### Common Issues\n\n1. **GitHub API Rate Limits**\n\n   ```bash\n   # Check your limits\n   curl -H \"Authorization: token YOUR_TOKEN\" https://api.github.com/rate_limit\n   # Reduce concurrency\n   ghsum run --max-concurrent 2\n   ```\n\n2. **LLM API Errors**\n\n   ```bash\n   # Test without LLM\n   ghsum run --skip-summary\n   # Check configuration\n   ghsum utils validate-config\n   ```\n\n3. 
**Configuration Errors**\n\n   ```bash\n   # Validate syntax\n   ghsum utils validate-config\n\n   # Test with minimal config\n   echo '[github]\\ntoken=\"test\"\\n[[repositories]]\\nname=\"owner/repo\"' \u003etest.toml\n   ghsum run --config test.toml --skip-summary\n   ```\n\n### Debug Mode\n\n```bash\n# Enable debug logging\nexport GHSUM_LOG_LEVEL=DEBUG\nghsum run\n\n# Or edit config.toml\nlog_level = \"DEBUG\"\n```\n\n## 🤝 Contributing\n\n1. **Setup Development Environment**\n\n   ```bash\n   git clone https://github.com/your-username/github-summary.git\n   cd github-summary || exit\n   uv sync --dev\n   ```\n\n2. **Run Tests**\n\n   ```bash\n   pytest\n   python test_new_architecture.py\n   ```\n\n3. **Code Style**\n   ```bash\n   ruff check .\n   ruff format .\n   ```\n\n## 📜 License\n\nMIT License - see [LICENSE](LICENSE) for details.\n\n## 🙏 Acknowledgments\n\n- [gidgethub](https://github.com/brettcannon/gidgethub) for async GitHub API client\n- [FastAPI](https://fastapi.tiangolo.com/) for web framework\n- [Typer](https://typer.tiangolo.com/) for CLI interface\n- [APScheduler](https://apscheduler.readthedocs.io/) for scheduling\n  github-summary web --reload --config custom.toml\n\n# List repository labels\n\ngithub-summary utils list-labels owner/repo\n\n## ✨ Features\n\n- **Async-first architecture**: Built with OpenAI and gidgethub for high performance\n- **Multi-source data**: Commits, PRs, issues, discussions, and releases via GitHub GraphQL API\n- **AI summaries**: OpenAI and compatible LLM integration with configurable concurrency\n- **Flexible scheduling**: AsyncIOScheduler with cron-based scheduling and timezone support\n- **RSS feeds**: Generate RSS feeds for summaries with markdown support\n- **Advanced filtering**: Regex patterns, author filters, label filters, date ranges\n- **Web service**: FastAPI-based service for RSS feeds and scheduled reports\n- **Robust error handling**: Automatic retries with exponential backoff, rate limiting, and 
comprehensive logging\n- **Per-repository tracking**: Individual last-run tracking and scheduling per repository\n- **Modern async I/O**: Built on httpx for efficient HTTP operations\n\n## 📁 Project Structure\n\n```\n├── config/ # Configuration files\n├── github_summary/ # Main source code\n├── tests/ # Test suite\n├── docs/ # Detailed documentation\n└── examples/ # Configuration examples\n```\n\n## 📚 Documentation\n\n- **[API Documentation](docs/api_reference.md)** - GraphQL queries and data models\n- **[Configuration Examples](examples/README.md)** - Example configurations for different use cases\n\n## 🤝 Contributing\n\n1. Fork the repository\n2. Create a feature branch\n3. Make your changes\n4. Add tests\n5. Run linting and tests:\n\n   ```bash\n   ruff check .          # Check for linting issues\n   ruff check . --fix    # Auto-fix linting issues\n   pyrefly check         # Type checking\n   pytest                # Run tests\n   pytest -m unit        # Run only unit tests (fast)\n   pytest -m integration # Run integration tests\n   ```\n\n6. Submit a pull request\n\n### Test Categories\n\n- **Unit tests** (`@pytest.mark.unit`): Fast tests without external dependencies\n- **Integration tests** (`@pytest.mark.integration`): Tests with mocked external services\n\nFor development guidelines and detailed testing information, see [Testing Guide](docs/testing_guide.md).\n\n## 📄 License\n\nMIT License - see [LICENSE](LICENSE) for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmeijieru%2Fgithub_summary","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmeijieru%2Fgithub_summary","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmeijieru%2Fgithub_summary/lists"}