{"id":29874344,"url":"https://github.com/bobmatnyc/gitflow-analytics","last_synced_at":"2026-04-27T05:04:35.780Z","repository":{"id":307184773,"uuid":"1028675282","full_name":"bobmatnyc/gitflow-analytics","owner":"bobmatnyc","description":"Analyze Git repositories for developer and project production insights","archived":false,"fork":false,"pushed_at":"2026-04-23T01:39:02.000Z","size":4286,"stargazers_count":6,"open_issues_count":2,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-23T02:16:36.291Z","etag":null,"topics":["analytics","commits","dora","git","loc","mtbf","productivity"],"latest_commit_sha":null,"homepage":"https://github.com/bobmatnyc/gitflow-analytics","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bobmatnyc.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"docs/SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-07-29T22:27:57.000Z","updated_at":"2026-04-23T01:39:06.000Z","dependencies_parsed_at":"2025-07-29T23:24:01.547Z","dependency_job_id":"4fc66ca3-7bc8-40b0-b87f-416c85176148","html_url":"https://github.com/bobmatnyc/gitflow-analytics","commit_stats":null,"previous_names":["bobmatnyc/gitflow-analytics"],"tags_count":128,"template":false,"template_full_name":null,"purl":"pkg:github/bobmatnyc/gitflow-analytics","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bobmatnyc%2Fgitflow-analytics","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/r
epositories/bobmatnyc%2Fgitflow-analytics/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bobmatnyc%2Fgitflow-analytics/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bobmatnyc%2Fgitflow-analytics/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bobmatnyc","download_url":"https://codeload.github.com/bobmatnyc/gitflow-analytics/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bobmatnyc%2Fgitflow-analytics/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32191873,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-23T15:28:30.493Z","status":"ssl_error","status_checked_at":"2026-04-23T15:28:29.972Z","response_time":53,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["analytics","commits","dora","git","loc","mtbf","productivity"],"created_at":"2025-07-31T00:12:17.115Z","updated_at":"2026-04-23T18:01:27.877Z","avatar_url":"https://github.com/bobmatnyc.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# GitFlow Analytics\n\n[![PyPI version](https://badge.fury.io/py/gitflow-analytics.svg)](https://badge.fury.io/py/gitflow-analytics)\n[![Python 
Support](https://img.shields.io/pypi/pyversions/gitflow-analytics.svg)](https://pypi.org/project/gitflow-analytics/)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![Documentation](https://img.shields.io/badge/docs-latest-brightgreen.svg)](https://github.com/bobmatnyc/gitflow-analytics/tree/main/docs)\n[![Tests](https://github.com/bobmatnyc/gitflow-analytics/workflows/Tests/badge.svg)](https://github.com/bobmatnyc/gitflow-analytics/actions)\n\nA comprehensive Python package for analyzing Git repositories to generate developer productivity insights without requiring external project management tools. Extract actionable metrics directly from Git history with ML-enhanced commit categorization, automated developer identity resolution, and professional reporting.\n\n## 🚀 Key Features\n\n- **🔍 Zero Dependencies**: Analyze productivity without requiring JIRA, Linear, or other PM tools\n- **🧠 ML-Powered Intelligence**: Advanced commit categorization with 85-95% accuracy\n- **👥 Smart Identity Resolution**: Automatically consolidate developer identities across email addresses\n- **🏢 Enterprise Ready**: Organization-wide repository discovery with intelligent caching\n- **📊 Professional Reports**: Rich markdown narratives and CSV exports for executive dashboards\n- **🎫 Ticketing \u0026 Collaboration Tracking**: Measure developer contributions across GitHub Issues, Confluence, and JIRA — blended into a unified `ticketing_score` per developer\n\n## 🎯 Quick Start\n\nGet up and running in 5 minutes:\n\n```bash\n# 1. Install GitFlow Analytics\npip install gitflow-analytics\n\n# 2. Install ML dependencies (optional but recommended)\npython -m spacy download en_core_web_sm\n\n# 3. Create a simple configuration\necho 'version: \"1.0\"\ngithub:\n  token: \"${GITHUB_TOKEN}\"\n  organization: \"your-org\"' \u003e config.yaml\n\n# 4. Set your GitHub token\necho 'GITHUB_TOKEN=ghp_your_token_here' \u003e .env\n\n# 5. 
Run analysis\ngitflow-analytics -c config.yaml --weeks 8\n```\n\n**What you get:**\n- 📈 Weekly metrics CSV with developer productivity trends\n- 👥 Developer profiles with project distribution and work styles\n- 🔍 Untracked work analysis with ML-powered categorization\n- 📋 Executive summary with actionable insights\n- 📊 Rich markdown report ready for stakeholders\n\n### Sample Output Preview\n\n```markdown\n## Executive Summary\n- **Total Commits**: 156 across 3 projects\n- **Active Developers**: 5 team members\n- **Ticket Coverage**: 73.2% (industry benchmark: 60-80%)\n- **Top Contributor**: Sarah Chen (32 commits, FRONTEND focus)\n\n## Key Insights\n🎯 **High Productivity**: Team averaged 31 commits/week\n📊 **Balanced Workload**: No single developer \u003e40% of total work\n✅ **Good Process**: 73% ticket coverage shows strong tracking\n```\n\n## ✨ Latest Features (v3.13.17+)\n\n- **🔀 PR Status Tracking**: Full pull request lifecycle captured — open, closed, merged states with `pr_state`, `closed_at`, and `is_merged` columns; rejection metrics in narrative reports and CSV output\n- **🤖 Bedrock-Powered Alias Generation**: `gfa aliases` now supports AWS Bedrock (auto-detected from config); configurable `strip_suffixes` in YAML; provider priority: Bedrock \u003e OpenRouter \u003e heuristic-only\n- **📊 PR Data in Reports**: `gfa report` loads cached PR data from the database; DORA metrics, velocity, and narrative reports now reflect real PR lifecycle data\n- **⚡ PR Enrichment for Cached Repos**: `gfa collect` enriches already-cached repos with PR data via `github_repo` config — no need to re-collect commits\n- **🔄 Pipeline Architecture**: Independent `collect`, `classify`, and `report` stages — collect once, report many times\n- **📦 Week-Based Incremental Caching**: Data stored in Monday-aligned weekly increments; only missing weeks are fetched\n- **⚡ `-f/--force` Flag**: Force re-fetch of cached weeks when you need fresh data\n- **🧠 Memory Optimized**: O(1) 
commit-to-branch mapping eliminates memory leaks at scale (tested on 146+ repos)\n- **📊 DORA Metrics**: Deployment frequency, lead time, change failure rate, and MTTR\n- **💰 Cost Tracking**: Monitor LLM API usage with detailed token and cost reporting\n- **📈 Weekly Trends**: Track classification pattern changes over time with proper Monday-aligned weeks\n- **🏷️ Non-Interactive Alias Management**: `gfa add-alias` command for scripting alias mappings — single or batch via YAML/JSON file, with `--dry-run` support\n\n## 🔥 Core Capabilities\n\n**📊 Analysis \u0026 Insights**\n- Multi-repository analysis with intelligent project grouping\n- ML-enhanced commit categorization (85-95% accuracy)\n- Developer productivity metrics and work pattern analysis\n- Story point extraction from commits and PRs\n- Ticket tracking across JIRA, GitHub, ClickUp, and Linear\n- Ticketing \u0026 collaboration tracking: `ticketing_score` blended from GitHub Issues, Confluence, and JIRA activity\n\n**🏢 Enterprise Features**\n- Organization-wide repository discovery from GitHub\n- Automated developer identity resolution and consolidation\n- Database-backed caching for sub-second report generation\n- Data anonymization for secure external sharing\n- Batch processing optimized for large repositories\n\n**📈 Professional Reporting**\n- Rich markdown narratives with executive summaries\n- Weekly CSV exports with trend analysis\n- Customizable output formats and filtering\n- Performance benchmarking and team comparisons\n\n## 📚 Documentation\n\nComprehensive guides for every use case:\n\n| **Getting Started** | **Advanced Usage** | **Integration** |\n|-------------------|------------------|---------------|\n| [Installation](docs/getting-started/installation.md) | [Complete Configuration](docs/guides/configuration.md) | [CLI Reference](docs/reference/cli-commands.md) |\n| [5-Minute Tutorial](docs/getting-started/quickstart.md) | [ML Categorization](docs/guides/ml-categorization.md) | [JSON Export 
Schema](docs/reference/json-export-schema.md) |\n| [First Analysis](docs/getting-started/first-analysis.md) | [Enterprise Setup](docs/examples/enterprise-setup.md) | [CI Integration](docs/examples/ci-integration.md) |\n\n**🎯 Quick Links:**\n- 📖 [**Documentation Hub**](docs/README.md) - Complete guide index\n- 🚀 [**Quick Start**](docs/getting-started/quickstart.md) - Get running in 5 minutes\n- ⚙️ [**Configuration**](docs/guides/configuration.md) - Full reference\n- 🤝 [**Contributing**](docs/developer/contributing.md) - Join the project\n\n## ⚡ Installation Options\n\n### Standard Installation (pip)\n```bash\npip install gitflow-analytics\n```\n\n### Homebrew (macOS)\n```bash\nbrew tap bobmatnyc/tools\nbrew install gitflow-analytics\n```\n\n### With ML Enhancement (Recommended)\n```bash\npip install gitflow-analytics\npython -m spacy download en_core_web_sm\n```\n\n### Run from Source (uv)\n```bash\ngit clone https://github.com/bobmatnyc/gitflow-analytics.git\ncd gitflow-analytics\nuv run gfa analyze -c config.yaml --weeks 4\n```\n\n### Development Installation\n```bash\ngit clone https://github.com/bobmatnyc/gitflow-analytics.git\ncd gitflow-analytics\npip install -e \".[dev]\"\npython -m spacy download en_core_web_sm\n```\n\n## 🔧 Configuration\n\n### Option 1: Organization Analysis (Recommended)\n```yaml\n# config.yaml\nversion: \"1.0\"\ngithub:\n  token: \"${GITHUB_TOKEN}\"\n  organization: \"your-org\"  # Auto-discovers all repositories\n  fetch_pr_reviews: true    # Enable pull request review data collection\n\nanalysis:\n  ml_categorization:\n    enabled: true\n    min_confidence: 0.7\n```\n\n### Option 2: Specific Repositories\n```yaml\n# config.yaml\nversion: \"1.0\"\ngithub:\n  token: \"${GITHUB_TOKEN}\"\n\nrepositories:\n  - name: \"my-app\"\n    path: \"~/code/my-app\"\n    github_repo: \"myorg/my-app\"  # Required for PR data enrichment\n    project_key: \"APP\"\n```\n\n### Environment Setup\n```bash\n# .env (same directory as 
config.yaml)\nGITHUB_TOKEN=ghp_your_token_here\n```\n\n### Run Analysis\n```bash\n# All-in-one analysis (collect → classify → report)\ngfa analyze -c config.yaml --weeks 8\n\n# With custom output directory\ngfa analyze -c config.yaml --weeks 8 --output ./reports\n```\n\n### Pipeline Commands (Recommended for Large Codebases)\n\nFor organizations with many repositories (50+), use the three-stage pipeline for faster iteration:\n\n```bash\n# Stage 1: Collect raw commits into weekly cache\n# Only fetches missing weeks — cached weeks are skipped automatically\n# Also enriches already-cached repos with PR data if github_repo is configured\ngfa collect -c config.yaml --weeks 4\n\n# Stage 2: Classify collected commits\ngfa classify -c config.yaml\n\n# Stage 3: Generate reports (instant — no git operations)\n# Loads cached PR data from the database automatically\ngfa report -c config.yaml --weeks 4 --generate-csv\n\n# Force re-fetch all weeks (bypass cache)\ngfa collect -c config.yaml --weeks 4 -f\n```\n\n**Why use the pipeline?**\n- **Collect once, report many times** — regenerate reports without re-fetching repositories\n- **Re-classify without re-fetching** — tweak classification, rerun just stage 2\n- **Incremental by default** — historical weeks never change, so they're cached permanently\n- **Massive speedup** — subsequent runs of 146+ repos complete in seconds instead of minutes\n- **PR data included** — `gfa report` automatically incorporates cached PR lifecycle data (open/merged/closed) into DORA metrics and narratives\n\n\u003e 💡 **Need more configuration options?** See the [Complete Configuration Guide](docs/guides/configuration.md) for advanced features, integrations, and customization.\n\n## 🎯 Excluding Merge Commits from Metrics\n\nGitFlow Analytics can exclude merge commits from filtered line count calculations, following DORA metrics best practices.\n\n### Why Exclude Merge Commits?\n\nMerge commits represent repository management, not original development 
work:\n- **Average merge commit**: 236.6 filtered lines vs 30.8 for regular commits (7.7x higher)\n- Merge commits can **skew productivity metrics** and velocity calculations\n- **DORA metrics best practice**: Focus on original development work, not repository management\n\n### Configuration\n\nAdd this setting to your analysis configuration:\n\n```yaml\nanalysis:\n  # Exclude merge commits from filtered line counts (DORA metrics best practice)\n  exclude_merge_commits: true  # Default: false\n```\n\n### Impact Example\n\nReal metrics from EWTN dataset analysis:\n\n| Metric | With Merge Commits | Without Merge Commits | Change |\n|--------|-------------------|----------------------|--------|\n| **Total Filtered Lines** | 138,730 | 54,808 | -60% |\n| **Merge Commits** | 355 commits | 355 commits | (excluded from line counts) |\n| **Regular Commits** | 1,426 commits | 1,426 commits | (unchanged) |\n\n### What Gets Excluded?\n\nWhen `exclude_merge_commits: true`:\n\n✅ **Filtered Stats**: Merge commits (2+ parents) have `filtered_insertions = 0` and `filtered_deletions = 0`\n✅ **Raw Stats**: Always preserved for all commits (accurate commit counts)\n✅ **Reports**: Line count metrics reflect only original development work\n\n❌ **Not affected**: Commit counts, developer activity tracking, ticket references\n\n### When to Use\n\n**✅ Enable when:**\n- You want DORA-compliant metrics for productivity tracking\n- Your workflow uses merge commits for pull requests\n- You need accurate developer velocity without repository overhead\n- You're comparing metrics across teams with different merge strategies\n\n**❌ Disable when:**\n- You want to track all repository activity including management overhead\n- Merge commits represent significant manual conflict resolution in your workflow\n- You're analyzing repositories without merge-heavy workflows\n- You need to measure total repository churn including merges\n\n### Example Configuration\n\n```yaml\n# Full configuration 
example\nanalysis:\n  weeks_back: 8\n  include_weekends: true\n\n  # DORA-compliant metrics: exclude merge commits\n  exclude_merge_commits: true\n\n  # Analyze ALL branches to capture feature branch work\n  branch_patterns:\n    - \"*\"  # Include all branches (feature, develop, hotfix, etc.)\n```\n\n\u003e 💡 **Pro Tip**: Combine `exclude_merge_commits: true` with `branch_patterns: [\"*\"]` to analyze all development work without merge overhead.\n\n## 📊 Generated Reports\n\nGitFlow Analytics generates comprehensive reports for different audiences:\n\n### 📈 CSV Data Files\n- **weekly_metrics.csv** - Developer productivity trends by week\n- **weekly_velocity.csv** - Lines-per-story-point velocity analysis\n- **developers.csv** - Complete team profiles and statistics  \n- **summary.csv** - Project-wide statistics and benchmarks\n- **untracked_commits.csv** - ML-categorized analysis of untracked work (commits without ticket references)\n\n### 📋 Executive Reports\n- **narrative_summary.md** - Rich markdown report with:\n  - Executive summary with key metrics\n  - Team composition and work distribution  \n  - Project activity breakdown\n  - Development patterns and recommendations\n  - Weekly trend analysis\n\n### Sample Executive Summary\n```markdown\n## Executive Summary\n- **Total Commits**: 324 commits across 4 projects\n- **Active Developers**: 8 team members  \n- **Ticket Coverage**: 78.4% (above industry benchmark)\n- **Top Areas**: Frontend (45%), API (32%), Infrastructure (23%)\n\n## Key Insights  \n✅ **Strong Process Adherence**: 78% ticket coverage\n🎯 **Balanced Team**: No developer \u003e35% of total work\n📈 **Growth Trend**: +15% productivity vs last quarter\n```\n\n## 🛠️ Common Use Cases\n\n**👥 Team Lead Dashboard**\n- Track individual developer productivity and growth\n- Identify workload distribution and potential burnout\n- Monitor code quality trends and technical debt\n\n**📈 Engineering Management**  \n- Generate executive reports on team velocity\n- Analyze process adherence and ticket 
coverage\n- Benchmark performance across projects and quarters\n\n**🔍 Process Optimization**\n- Identify untracked work patterns that should be formalized\n- Optimize developer focus and reduce context switching  \n- Improve estimation accuracy with historical data\n\n**🏢 Enterprise Analytics**\n- Organization-wide repository analysis across dozens of projects\n- Automated identity resolution for large, distributed teams\n- Cost-effective analysis without expensive PM tool dependencies\n\n## Command Line Interface\n\n### Main Commands\n\n```bash\n# Analyze repositories (default command)\ngitflow-analytics -c config.yaml --weeks 12 --output ./reports\n\n# Explicit analyze command (backward compatibility)\ngitflow-analytics analyze -c config.yaml --weeks 12 --output ./reports\n\n# Show cache statistics\ngitflow-analytics cache-stats -c config.yaml\n\n# List known developers\ngitflow-analytics list-developers -c config.yaml\n\n# Analyze developer identities\ngitflow-analytics identities -c config.yaml\n\n# Merge developer identities\ngitflow-analytics merge-identity -c config.yaml dev1_id dev2_id\n\n# Discover story point fields in your PM platform\ngitflow-analytics discover-storypoint-fields -c config.yaml\n```\n\n### Options\n\n- `--weeks, -w`: Number of weeks to analyze (default: 12)\n- `--output, -o`: Output directory for reports (default: ./reports)\n- `--anonymize`: Anonymize developer information\n- `--no-cache`: Disable caching for fresh analysis\n- `--clear-cache`: Clear cache before analysis\n- `--validate-only`: Validate configuration without running\n- `--skip-identity-analysis`: Skip automatic identity analysis\n- `--apply-identity-suggestions`: Apply identity suggestions without prompting\n\n## Complete Configuration Example\n\nHere's a complete example showing `.env` file and corresponding YAML configuration:\n\n### `.env` file\n```bash\n# GitHub Configuration\nGITHUB_TOKEN=ghp_xxxxxxxxxxxxxxxxxxxx\nGITHUB_ORG=your-organization\n\n# PM Platform 
Configuration\nJIRA_ACCESS_USER=developer@company.com\nJIRA_ACCESS_TOKEN=ATATT3xxxxxxxxxxx\nLINEAR_API_KEY=lin_api_xxxxxxxxxxxx\nCLICKUP_API_TOKEN=pk_xxxxxxxxxxxx\n\n# Note: GitHub Issues uses GITHUB_TOKEN automatically\n```\n\n### `config.yaml` file\n```yaml\nversion: \"1.0\"\n\n# GitHub configuration with organization discovery\ngithub:\n  token: \"${GITHUB_TOKEN}\"\n  organization: \"${GITHUB_ORG}\"       # Required for PR data collection\n  fetch_pr_reviews: true              # Enable PR review data (default: false)\n\n# Multi-platform PM integration\npm:\n  jira:\n    access_user: \"${JIRA_ACCESS_USER}\"\n    access_token: \"${JIRA_ACCESS_TOKEN}\"\n    base_url: \"https://company.atlassian.net\"\n\n  linear:\n    api_key: \"${LINEAR_API_KEY}\"\n    team_ids: [\"team_123abc\"]  # Optional: filter by specific teams\n\n  clickup:\n    api_token: \"${CLICKUP_API_TOKEN}\"\n    workspace_url: \"https://app.clickup.com/12345/v/\"\n\n# JIRA story point integration (optional)\njira_integration:\n  enabled: true\n  fetch_story_points: true\n  story_point_fields:\n    - \"Story point estimate\"     # Your field name\n    - \"customfield_10016\"        # Fallback field ID\n\n# Analysis configuration\nanalysis:\n  # Track tickets from all configured platforms\n  ticket_platforms:\n    - jira\n    - linear\n    - clickup\n    - github  # GitHub Issues (uses GITHUB_TOKEN)\n  \n  # Exclude bot commits and boilerplate files\n  exclude:\n    authors:\n      - \"dependabot[bot]\"\n      - \"renovate[bot]\"\n    paths:\n      - \"**/node_modules/**\"\n      - \"**/*.min.js\"\n      - \"**/package-lock.json\"\n  \n  # Developer identity consolidation\n  identity:\n    similarity_threshold: 0.85\n    # Suffixes stripped from display names during alias generation (Bedrock/OpenRouter)\n    strip_suffixes:\n      - \"-company\"\n      - \".contractor\"\n    manual_mappings:\n      - name: \"John Doe\"\n        primary_email: \"john.doe@company.com\"\n        aliases:\n          - 
\"jdoe@oldcompany.com\"\n          - \"john@personal.com\"\n\n# Output configuration\noutput:\n  directory: \"./reports\"\n  formats:\n    - csv\n    - markdown\n```\n\n## Output Reports\n\nThe tool generates comprehensive CSV reports and markdown summaries:\n\n### CSV Reports\n\n1. **Weekly Metrics** (`weekly_metrics_YYYYMMDD.csv`)\n   - Week-by-week developer productivity\n   - Story points, commits, lines changed\n   - Ticket coverage percentages\n   - Per-project breakdown\n\n2. **Weekly Velocity** (`weekly_velocity_YYYYMMDD.csv`)\n   - Lines of code per story point analysis\n   - Efficiency trends and velocity patterns\n   - PR-based vs commit-based story points breakdown\n   - Team velocity benchmarking and week-over-week trends\n   - PR rejection rate and merge/close metrics (when PR data is available)\n\n3. **Summary Statistics** (`summary_YYYYMMDD.csv`)\n   - Overall project statistics\n   - Platform-specific ticket counts\n   - Top contributors\n\n4. **Developer Report** (`developers_YYYYMMDD.csv`)\n   - Complete developer profiles\n   - Total contributions\n   - Identity aliases\n\n5. **Developer Activity Summary** (`developer_activity_summary_YYYYMMDD.csv`)\n   - Per-developer activity scores including `ticketing_score`\n   - Blended `raw_activity_score` incorporating commits, PRs, code impact, complexity, and ticketing\n   - Produced by `gfa analyze`\n\n6. **Ticketing Activity Summary** (`ticketing_activity_summary.json`)\n   - Combined per-developer `ticketing_score` across all configured platforms (GitHub Issues, Confluence, JIRA)\n\n7. **GitHub Issues Summary** (`github_issues_summary.json`)\n   - Issue breakdown, resolution times, and top contributors\n\n8. **Confluence Activity Summary** (`confluence_activity_summary.json`)\n   - Page edits by space and author\n\n5. 
**Untracked Commits Report** (`untracked_commits_YYYYMMDD.csv`)\n   - Detailed analysis of commits without ticket references\n   - Commit categorization (bug_fix, feature, refactor, documentation, maintenance, test, style, build)\n   - Enhanced metadata: commit hash, author, timestamp, project, message, file/line changes\n   - Configurable file change threshold for filtering significant commits\n\n### Enhanced Untracked Commit Analysis\n\nThe untracked commits report provides deep insights into work that bypasses ticket tracking:\n\n**CSV Columns:**\n- `commit_hash` / `short_hash`: Full and abbreviated commit identifiers\n- `author` / `author_email` / `canonical_id`: Developer identification (with anonymization support)\n- `date`: Commit timestamp\n- `project`: Project key for multi-repository analysis\n- `message`: Commit message (truncated for readability)\n- `category`: Automated categorization of work type\n- `files_changed` / `lines_added` / `lines_removed` / `lines_changed`: Change metrics\n- `is_merge`: Boolean flag for merge commits\n\n**Automatic Categorization:**\n- **Feature**: New functionality development (`add`, `new`, `implement`, `create`)\n- **Bug Fix**: Error corrections (`fix`, `bug`, `error`, `resolve`, `hotfix`)\n- **Refactor**: Code restructuring (`refactor`, `optimize`, `improve`, `cleanup`)\n- **Documentation**: Documentation updates (`doc`, `readme`, `comment`, `guide`)\n- **Maintenance**: Routine upkeep (`update`, `upgrade`, `dependency`, `config`)\n- **Test**: Testing-related changes (`test`, `spec`, `mock`, `fixture`)\n- **Style**: Formatting changes (`format`, `lint`, `prettier`, `whitespace`)\n- **Build**: Build system changes (`build`, `compile`, `ci`, `docker`)\n\n### Markdown Reports\n\n5. 
**Narrative Summary** (`narrative_summary_YYYYMMDD.md`)\n   - **Executive Summary**: High-level metrics and team overview\n   - **Team Composition**: Developer profiles with project percentages and work patterns\n   - **Project Activity**: Detailed breakdown by project with contributor percentages and **commit classifications**\n   - **Development Patterns**: Key insights from productivity and collaboration analysis\n   - **Pull Request Analysis**: PR metrics including size, lifetime, review activity, merge rate, and **rejection metrics** (closed-without-merge)\n   - **Weekly Trends** (v1.1.0+): Week-over-week changes in classification patterns\n\n6. **Database-Backed Qualitative Report** (`database_qualitative_report_YYYYMMDD.md`) (v1.1.0+)\n   - Generated directly from SQLite storage for fast retrieval\n   - Includes weekly trend analysis per developer/project\n   - Shows classification changes over time (e.g., \"Features: +15%, Bug Fixes: -5%\")\n   - **Issue Tracking**: Platform usage and coverage analysis with simplified display\n   - **Enhanced Untracked Work Analysis**: Comprehensive categorization with dual percentage metrics\n   - **PM Platform Integration**: Story point tracking and correlation insights (when available)\n   - **Recommendations**: Actionable insights based on analysis patterns\n\n### Enhanced Narrative Report Sections\n\nThe narrative report provides comprehensive insights through multiple detailed sections:\n\n#### Team Composition Section\n- **Developer Profiles**: Individual developer statistics with commit counts\n- **Project Distribution**: Shows ALL projects each developer works on with precise percentages\n- **Work Style Classification**: Categorizes developers as \"Focused\", \"Multi-project\", or \"Highly Focused\"\n- **Activity Patterns**: Identifies time patterns like \"Standard Hours\" or \"Extended Hours\"\n\n**Example developer profile:**\n```markdown\n**John Developer**\n- Commits: 15\n- Projects: FRONTEND (85.0%), 
SERVICE_TS (15.0%)\n- Work Style: Focused\n- Active Pattern: Standard Hours\n```\n\n#### Project Activity Section\n- **Activity by Project**: Commits and percentage of total activity per project\n- **Contributor Breakdown**: Shows each developer's contribution percentage within each project\n- **Lines Changed**: Quantifies the scale of changes per project\n\n#### Issue Tracking with Simplified Display\n- **Platform Usage**: Clean display of ticket platform distribution (JIRA, GitHub, etc.)\n- **Coverage Analysis**: Percentage of commits that reference tickets\n- **Enhanced Untracked Work Analysis**: Detailed categorization and recommendations\n\n### Interpreting Dual Percentage Metrics\n\nThe enhanced untracked work analysis provides two key percentage metrics for better context:\n\n1. **Percentage of Total Untracked Work**: Shows how much each developer contributes to the overall untracked work pool\n2. **Percentage of Developer's Individual Work**: Shows what proportion of a specific developer's commits are untracked\n\n**Example interpretation:**\n```\n- John Doe: 25 commits (40% of untracked, 15% of their work) - maintenance, style\n```\n\nThis means:\n- John contributed 25 untracked commits\n- These represent 40% of all untracked commits in the analysis period  \n- Only 15% of John's total work was untracked (85% was properly tracked)\n- Most untracked work was maintenance and style changes (acceptable categories)\n\n**Process Insights:**\n- High \"% of untracked\" + low \"% of their work\" = Developer doing most of the acceptable maintenance work\n- Low \"% of untracked\" + high \"% of their work\" = Developer needs process guidance\n- High percentages in feature/bug_fix categories = Process improvement opportunity\n\n### Example Report Outputs\n\n#### Untracked Commits CSV 
Sample\n```csv\ncommit_hash,short_hash,author,author_email,canonical_id,date,project,message,category,files_changed,lines_added,lines_removed,lines_changed,is_merge\na1b2c3d4e5f6...,a1b2c3d,John Doe,john@company.com,ID0001,2024-01-15 14:30:22,FRONTEND,Update dependency versions for security patches,maintenance,2,45,12,57,false\nf6e5d4c3b2a1...,f6e5d4c,Jane Smith,jane@company.com,ID0002,2024-01-15 09:15:10,BACKEND,Fix typo in error message,bug_fix,1,1,1,2,false\n9876543210ab...,9876543,Bob Wilson,bob@company.com,ID0003,2024-01-14 16:45:33,FRONTEND,Add JSDoc comments to utility functions,documentation,3,28,0,28,false\n```\n\n#### Complete Narrative Report Sample\n```markdown\n# GitFlow Analytics Report\n\n**Generated**: 2025-08-04 14:27:47\n**Analysis Period**: Last 4 weeks\n\n## Executive Summary\n\n- **Total Commits**: 35\n- **Active Developers**: 3\n- **Lines Changed**: 910\n- **Ticket Coverage**: 71.4%\n- **Active Projects**: FRONTEND, SERVICE_TS, SERVICES\n- **Top Contributor**: John Developer with 15 commits\n\n## Team Composition\n\n### Developer Profiles\n\n**John Developer**\n- Commits: 15\n- Projects: FRONTEND (85.0%), SERVICE_TS (15.0%)\n- Work Style: Focused\n- Active Pattern: Standard Hours\n\n**Jane Smith**\n- Commits: 12\n- Projects: SERVICE_TS (70.0%), FRONTEND (30.0%)\n- Work Style: Multi-project\n- Active Pattern: Extended Hours\n\n## Project Activity\n\n### Activity by Project\n\n**FRONTEND**\n- Commits: 14 (50.0% of total)\n- Lines Changed: 450\n- Contributors: John Developer (71.4%), Jane Smith (28.6%)\n\n**SERVICE_TS**\n- Commits: 8 (28.6% of total)\n- Lines Changed: 280\n- Contributors: Jane Smith (100.0%)\n\n## Issue Tracking\n\n### Platform Usage\n\n- **Jira**: 15 tickets (60.0%)\n- **Github**: 8 tickets (32.0%)\n- **Clickup**: 2 tickets (8.0%)\n\n### Untracked Work Analysis\n\n**Summary**: 10 commits (28.6% of total) lack ticket references.\n\n#### Work Categories\n\n- **Maintenance**: 4 commits (40.0%), avg 23 lines *(acceptable 
untracked)*\n- **Bug Fix**: 3 commits (30.0%), avg 15 lines *(should be tracked)*\n- **Documentation**: 2 commits (20.0%), avg 12 lines *(acceptable untracked)*\n\n#### Top Contributors (Untracked Work)\n\n- **John Developer**: 1 commits (50.0% of untracked, 6.7% of their work) - *refactor*\n- **Jane Smith**: 1 commits (50.0% of untracked, 8.3% of their work) - *style*\n\n#### Recommendations for Untracked Work\n\n🎯 **Excellent tracking**: Less than 20% of commits are untracked - the team shows strong process adherence.\n\n## Recommendations\n\n✅ The team shows healthy development patterns. Continue current practices while monitoring for changes.\n```\n\n### Configuration for Enhanced Narrative Reports\n\nThe narrative reports automatically include all available sections based on your configuration and data availability:\n\n**Always Generated:**\n- Executive Summary, Team Composition, Project Activity, Development Patterns, Issue Tracking, Recommendations\n\n**Conditionally Generated:**\n- **Pull Request Analysis**: Requires GitHub integration with PR data\n- **PM Platform Integration**: Requires JIRA or other PM platform configuration\n- **Qualitative Analysis**: Requires ChatGPT integration setup\n\n**Customizing Report Content:**\n```yaml\n# config.yaml\noutput:\n  formats:\n    - csv\n    - markdown  # Enables narrative report generation\n  \n# Optional: Enhance narrative reports with additional data\njira:\n  access_user: \"${JIRA_ACCESS_USER}\"\n  access_token: \"${JIRA_ACCESS_TOKEN}\"\n  base_url: \"https://company.atlassian.net\"\n\n# Optional: Add qualitative insights\nanalysis:\n  chatgpt:\n    enabled: true\n    api_key: \"${OPENAI_API_KEY}\"\n```\n\n## Story Point Patterns\n\nConfigure custom regex patterns to match your team's story point format:\n\n```yaml\nstory_point_patterns:\n  - \"SP: (\\\\d+)\"           # SP: 5\n  - \"\\\\[([0-9]+) pts\\\\]\"   # [3 pts]\n  - \"estimate: (\\\\d+)\"     # estimate: 8\n```\n\n## Ticket Platform 
Support\n\nAutomatically detects and tracks tickets from multiple PM platforms:\n- **JIRA**: `PROJ-123`\n- **GitHub Issues**: `#123`, `GH-123`\n- **ClickUp**: `CU-abc123`\n- **Linear**: `ENG-123`\n\n### Multi-Platform PM Integration\n\nGitFlow Analytics supports multiple project management platforms simultaneously. You can configure one or more platforms based on your team's workflow:\n\n```yaml\n# Configure which platforms to track\nanalysis:\n  ticket_platforms:\n    - jira\n    - linear\n    - clickup\n    - github  # GitHub Issues\n\n# Platform-specific configuration\npm:\n  jira:\n    access_user: \"${JIRA_ACCESS_USER}\"\n    access_token: \"${JIRA_ACCESS_TOKEN}\"\n    base_url: \"https://your-company.atlassian.net\"\n\n  linear:\n    api_key: \"${LINEAR_API_KEY}\"\n    team_ids:  # Optional: filter by team\n      - \"team_123abc\"\n\n  clickup:\n    api_token: \"${CLICKUP_API_TOKEN}\"\n    workspace_url: \"https://app.clickup.com/12345/v/\"\n\n# GitHub Issues uses existing GitHub token automatically\ngithub:\n  token: \"${GITHUB_TOKEN}\"\n```\n\n### Platform Setup Guides\n\n#### JIRA Setup\n1. **Get API Token**: Go to [Atlassian API Tokens](https://id.atlassian.com/manage-profile/security/api-tokens)\n2. **Required Permissions**: Read access to projects and issues\n3. **Configuration**:\n   ```yaml\n   pm:\n     jira:\n       access_user: \"${JIRA_ACCESS_USER}\"  # Your Atlassian email\n       access_token: \"${JIRA_ACCESS_TOKEN}\"\n       base_url: \"https://your-company.atlassian.net\"\n   ```\n\n#### Linear Setup\n1. **Get API Key**: Go to [Linear Settings → API](https://linear.app/settings/api)\n2. **Required Permissions**: Read access to issues\n3. **Configuration**:\n   ```yaml\n   pm:\n     linear:\n       api_key: \"${LINEAR_API_KEY}\"\n       team_ids: [\"team_123abc\"]  # Optional: specify team IDs\n   ```\n\n#### ClickUp Setup\n1. **Get API Token**: Go to [ClickUp Settings → Apps](https://app.clickup.com/settings/apps)\n2. 
**Get Workspace URL**: Copy from browser when viewing your workspace\n3. **Configuration**:\n   ```yaml\n   pm:\n     clickup:\n       api_token: \"${CLICKUP_API_TOKEN}\"\n       workspace_url: \"https://app.clickup.com/12345/v/\"\n   ```\n\n#### GitHub Issues Setup\nGitHub Issues is automatically enabled when GitHub integration is configured. No additional setup required:\n```yaml\ngithub:\n  token: \"${GITHUB_TOKEN}\"  # Same token for repo access and issues\n```\n\n### JIRA Story Point Integration\n\nGitFlow Analytics can fetch story points directly from JIRA tickets:\n\n```yaml\njira_integration:\n  enabled: true\n  fetch_story_points: true\n  story_point_fields:\n    - \"Story point estimate\"  # Your custom field name\n    - \"customfield_10016\"     # Or use field ID\n```\n\nTo discover your JIRA story point fields:\n```bash\ngitflow-analytics discover-storypoint-fields -c config.yaml\n```\n\n### Environment Variables for Credentials\n\nStore credentials securely in a `.env` file:\n\n```bash\n# .env file (keep this secure and don't commit to git!)\nGITHUB_TOKEN=ghp_your_token_here\n\n# PM Platform Credentials\nJIRA_ACCESS_USER=your.email@company.com\nJIRA_ACCESS_TOKEN=ATATT3xxxxxxxxxxx\nLINEAR_API_KEY=lin_api_xxxxxxxxxxxx\nCLICKUP_API_TOKEN=pk_xxxxxxxxxxxx\n```\n\n## Caching\n\nThe tool uses SQLite for intelligent caching:\n- Commit analysis results\n- Developer identity mappings\n- Pull request data\n\nCache is automatically managed with configurable TTL.\n\n## Developer Identity Resolution\n\nGitFlow Analytics intelligently consolidates developer identities across different email addresses and name variations:\n\n### Automatic Identity Analysis (New!)\n\nIdentity analysis now runs **automatically by default** when no manual mappings exist. The system will:\n\n1. **Analyze all developer identities** in your commits\n2. **Show suggested consolidations** with a clear preview\n3. **Prompt for approval** with a simple Y/n\n4. 
**Update your configuration** automatically\n5. **Continue analysis** with consolidated identities\n\nExample of the interactive prompt:\n```\n🔍 Analyzing developer identities...\n\n⚠️  Found 3 potential identity clusters:\n\n📋 Suggested identity mappings:\n   john.doe@company.com\n     → 123456+johndoe@users.noreply.github.com\n     → jdoe@personal.email.com\n\n🤖 Found 2 bot accounts to exclude:\n   - dependabot[bot]\n   - renovate[bot]\n\n────────────────────────────────────────────────────────────\nApply these identity mappings to your configuration? [Y/n]: \n```\n\nThis prompt appears at most once every 7 days. \n\nTo skip automatic identity analysis:\n```bash\n# Simplified syntax (default)\ngitflow-analytics -c config.yaml --skip-identity-analysis\n\n# Explicit analyze command\ngitflow-analytics analyze -c config.yaml --skip-identity-analysis\n```\n\nTo manually run identity analysis:\n```bash\ngitflow-analytics identities -c config.yaml\n```\n\n### Smart Identity Matching\n\nThe system automatically detects:\n- **GitHub noreply emails** (e.g., `150280367+username@users.noreply.github.com`)\n- **Name variations** (e.g., \"John Doe\" vs \"John D\" vs \"jdoe\")\n- **Common email patterns** across domains\n- **Bot accounts** for automatic exclusion\n\n### Manual Configuration\n\nYou can also manually configure identity mappings in your YAML:\n\n```yaml\nanalysis:\n  identity:\n    manual_mappings:\n      - name: \"John Doe\"  # Optional: preferred display name for reports\n        primary_email: john.doe@company.com\n        aliases:\n          - jdoe@personal.email.com\n          - 123456+johndoe@users.noreply.github.com\n      - name: \"Sarah Smith\"\n        primary_email: sarah.smith@company.com\n        aliases:\n          - s.smith@oldcompany.com\n```\n\n### Display Name Control\n\nThe optional `name` field in manual mappings allows you to control how developer names appear in reports. 
This is particularly useful for:\n\n- **Standardizing display names** across different email formats\n- **Resolving duplicates** when the same person appears with slight name variations\n- **Using preferred names** instead of technical email formats\n\n**Example use cases:**\n```yaml\nanalysis:\n  identity:\n    manual_mappings:\n      # Consolidate Austin Zach identities\n      - name: \"Austin Zach\"\n        primary_email: \"john.smith@company.com\"\n        aliases:\n          - \"150280367+jsmith@users.noreply.github.com\"\n          - \"jsmith-company@users.noreply.github.com\"\n      \n      # Standardize name variations\n      - name: \"John Doe\"  # Consistent display across all reports\n        primary_email: \"john.doe@company.com\"\n        aliases:\n          - \"johndoe@company.com\"\n          - \"j.doe@company.com\"\n```\n\nWithout the `name` field, the system uses the canonical email's associated name, which might not be ideal for reporting.\n\n### Disabling Automatic Analysis\n\nTo disable the automatic identity prompt:\n```yaml\nanalysis:\n  identity:\n    auto_analysis: false\n```\n\n## PR Status Tracking\n\nGitFlow Analytics captures the full pull request lifecycle from GitHub, including open, merged, and rejected (closed-without-merge) states.\n\n### Schema\n\nThe `PullRequestCache` table (v4 schema) includes:\n\n| Column | Description |\n|--------|-------------|\n| `pr_state` | Current state: `open`, `closed`, or `merged` |\n| `closed_at` | Timestamp when the PR was closed or merged |\n| `is_merged` | Boolean — `true` if the PR was merged into the target branch |\n\nA PR with `pr_state = \"closed\"` and `is_merged = false` represents a rejected/abandoned PR.\n\n### Configuration\n\n```yaml\ngithub:\n  token: \"${GITHUB_TOKEN}\"\n  organization: \"your-org\"      # Required for org-wide PR collection\n  fetch_pr_reviews: true        # Enable review data (approvals, change requests)\n```\n\nRepositories listed explicitly must include 
`github_repo` to enable PR enrichment:\n\n```yaml\nrepositories:\n  - name: \"my-app\"\n    path: \"~/code/my-app\"\n    github_repo: \"myorg/my-app\"  # Enables PR data for this repo\n    project_key: \"APP\"\n```\n\n### Incremental Stale-PR Refresh\n\nOpen PRs can change state at any time. Each `gfa collect` run refreshes up to 50 stale open PRs to keep the cache current without hammering the GitHub API.\n\n### Rejection Metrics\n\nWhen PR data is available, the narrative summary and CSV output include:\n- **Merge rate**: Percentage of closed PRs that were merged\n- **Rejection rate**: Percentage of closed PRs that were closed without merging\n- **Rejection by author**: Which developers have the highest abandoned-PR rate\n\n## Bedrock-Powered Alias Generation\n\nThe `gfa aliases` command uses an LLM to cluster developer identities and suggest canonical names. AWS Bedrock is now supported as a provider alongside OpenRouter.\n\n### Provider Priority\n\n1. **AWS Bedrock**: Used when `bedrock` provider is configured in your YAML\n2. **OpenRouter**: Used when `openrouter` provider is configured\n3. **Heuristic-only**: Fallback when no LLM provider is available\n\n### Configuration\n\n```yaml\nanalysis:\n  qualitative:\n    provider: bedrock          # or \"openrouter\"\n    bedrock:\n      model_id: \"anthropic.claude-3-haiku-20240307-v1:0\"\n      region: \"us-east-1\"\n\n  identity:\n    # Strip these suffixes from display names when generating aliases\n    strip_suffixes:\n      - \"-contractor\"\n      - \".ext\"\n```\n\nThe `strip_suffixes` list removes common suffixes from email local parts before clustering (e.g., `john-contractor` and `john` are treated as the same person).\n\n## ML-Enhanced Commit Categorization\n\nGitFlow Analytics includes sophisticated machine learning capabilities for categorizing commits with high accuracy and confidence scoring.\n\n### How It Works\n\nThe ML categorization system uses a **hybrid approach** combining:\n\n1. 
**Semantic Analysis**: Uses spaCy NLP models to understand commit message meaning\n2. **File Pattern Recognition**: Analyzes changed files for additional context signals  \n3. **Rule-based Fallback**: Falls back to traditional regex patterns when ML confidence is low\n4. **Confidence Scoring**: Provides confidence metrics for all categorizations\n\n### Categories Detected\n\nThe system automatically categorizes commits into:\n\n- **Feature**: New functionality development (`add`, `implement`, `create`)\n- **Bug Fix**: Error corrections (`fix`, `resolve`, `correct`)\n- **Refactor**: Code restructuring (`refactor`, `optimize`, `improve`) \n- **Documentation**: Documentation updates (`docs`, `readme`, `comment`)\n- **Maintenance**: Routine upkeep (`update`, `upgrade`, `dependency`)\n- **Test**: Testing-related changes (`test`, `spec`, `coverage`)\n- **Style**: Formatting changes (`format`, `lint`, `prettier`)\n- **Build**: Build system changes (`build`, `ci`, `docker`)\n- **Security**: Security-related fixes (`security`, `vulnerability`)\n- **Hotfix**: Urgent production fixes (`hotfix`, `critical`, `emergency`)\n- **Config**: Configuration changes (`config`, `settings`, `environment`)\n\n### Configuration\n\n```yaml\nanalysis:\n  ml_categorization:\n    # Enable/disable ML categorization (default: true)\n    enabled: true\n    \n    # Minimum confidence for ML predictions (0.0-1.0, default: 0.6)\n    min_confidence: 0.6\n    \n    # Semantic vs file pattern weighting (default: 0.7 vs 0.3)\n    semantic_weight: 0.7\n    file_pattern_weight: 0.3\n    \n    # Confidence threshold for ML vs rule-based (default: 0.5)\n    hybrid_threshold: 0.5\n    \n    # Caching for performance\n    enable_caching: true\n    cache_duration_days: 30\n    \n    # Processing settings\n    batch_size: 100\n```\n\n### Installation Requirements\n\nFor ML categorization, install the spaCy English model:\n\n```bash\npython -m spacy download en_core_web_sm\n```\n\n**Alternative models** (if the 
default is unavailable):\n```bash\n# Medium model (more accurate, larger)\npython -m spacy download en_core_web_md\n\n# Large model (most accurate, largest)\npython -m spacy download en_core_web_lg\n```\n\n### Performance Expectations\n\n- **Accuracy**: 85-95% accuracy on typical commit messages\n- **Speed**: ~50-100 commits/second with caching enabled\n- **Fallback**: Gracefully disables qualitative analysis if spaCy model unavailable (provides helpful error messages)\n- **Memory**: ~200MB additional memory usage for spaCy models\n\n### Enhanced Reports\n\nWith ML categorization enabled, reports include:\n\n- **Confidence scores** for each categorization\n- **Method indicators** (ML, rules, or cached)\n- **Alternative predictions** for uncertain cases\n- **ML performance statistics** in analysis summaries\n\n### Example Enhanced Output\n\n```csv\ncommit_hash,category,ml_confidence,ml_method,message\na1b2c3d,feature,0.89,ml,\"Add user authentication system\"  \nf6e5d4c,bug_fix,0.92,ml,\"Fix memory leak in cache cleanup\"\n9876543,maintenance,0.74,rules,\"Update dependency versions\"\n```\n\n## Troubleshooting\n\n### YAML Configuration Errors\n\nGitFlow Analytics provides helpful error messages when YAML configuration issues are encountered. 
Here are common errors and their solutions:\n\n#### Tab Characters Not Allowed\n```\n❌ YAML configuration error at line 3, column 1:\n🚫 Tab characters are not allowed in YAML files!\n```\n**Fix**: Replace all tabs with spaces (use 2 or 4 spaces for indentation)\n- Most editors can show whitespace characters and convert tabs to spaces\n- In VS Code: View → Render Whitespace, then Edit → Convert Indentation to Spaces\n\n#### Missing Colons\n```\n❌ YAML configuration error at line 5, column 10:\n🚫 Missing colon (:) after a key name!\n```\n**Fix**: Add a colon and space after each key name\n```yaml\n# Correct:\nrepositories:\n  - name: my-repo\n    \n# Incorrect:\nrepositories\n  - name my-repo\n```\n\n#### Unclosed Quotes\n```\n❌ YAML configuration error at line 8, column 15:\n🚫 Unclosed quoted string!\n```\n**Fix**: Ensure all quotes are properly closed\n```yaml\n# Correct:\ntoken: \"my-token-value\"\n\n# Incorrect:\ntoken: \"my-token-value\n```\n\n#### Invalid Indentation\n```\n❌ YAML configuration error:\n🚫 Indentation error or invalid structure!\n```\n**Fix**: Use consistent indentation (either 2 or 4 spaces)\n```yaml\n# Correct:\nanalysis:\n  exclude:\n    paths:\n      - \"vendor/**\"\n      \n# Incorrect:\nanalysis:\n  exclude:\n     paths:  # 3 spaces - inconsistent!\n      - \"vendor/**\"\n```\n\n### Tips for Valid YAML\n\n1. **Use a YAML validator**: Check your configuration with online YAML validators before using\n2. **Enable whitespace display**: Make tabs and spaces visible in your editor\n3. **Use quotes for special characters**: Wrap values containing `:`, `#`, `@`, etc. in quotes\n4. **Consistent indentation**: Pick 2 or 4 spaces and stick to it throughout the file\n5. 
**Check the sample config**: Reference `config-sample.yaml` for proper structure\n\n### Configuration Validation\n\nBeyond YAML syntax, GitFlow Analytics validates:\n- Required fields (`repositories` must have `name` and `path`)\n- Environment variable resolution\n- File path existence\n- Valid configuration structure\n\nIf you encounter persistent issues, run with `--debug` for detailed error information:\n```bash\n# Simplified syntax (default)\ngitflow-analytics -c config.yaml --debug\n\n# Explicit analyze command\ngitflow-analytics analyze -c config.yaml --debug\n```\n\n## Contributing\n\nContributions are welcome! Please feel free to submit a Pull Request.\n\n### Development Setup\n\n```bash\n# Clone the repository\ngit clone https://github.com/bobmatnyc/gitflow-analytics.git\ncd gitflow-analytics\n\n# Install development dependencies\nmake install-dev\n\n# Run tests\nmake test\n\n# Format code\nmake format\n\n# Run all quality checks\nmake quality-gate\n```\n\n### Release Workflow\n\nThis project uses a Makefile-based release workflow for simplicity and transparency. See [RELEASE.md](RELEASE.md) for detailed documentation.\n\n**Quick Reference:**\n```bash\nmake release-patch   # Bug fixes (3.13.1 → 3.13.2)\nmake release-minor   # New features (3.13.1 → 3.14.0)\nmake release-major   # Breaking changes (3.13.1 → 4.0.0)\n```\n\nFor more details, see:\n- [RELEASE.md](RELEASE.md) - Comprehensive release guide\n- [RELEASE_QUICKREF.md](RELEASE_QUICKREF.md) - Quick reference card\n- `make help` - All available commands\n\n## License\n\nThis project is licensed under the MIT License - see the LICENSE file for details.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbobmatnyc%2Fgitflow-analytics","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbobmatnyc%2Fgitflow-analytics","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbobmatnyc%2Fgitflow-analytics/lists"}