{"id":50884935,"url":"https://github.com/kdayno/workforce-flux","last_synced_at":"2026-06-15T16:04:19.804Z","repository":{"id":361952112,"uuid":"1256590564","full_name":"kdayno/workforce-flux","owner":"kdayno","description":"End-to-end people analytics on an HR dataset: DuckDB + dbt + Evidence. Translates workforce trends into actionable findings on retention, attrition, and pay.","archived":false,"fork":false,"pushed_at":"2026-06-02T00:25:45.000Z","size":30,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-02T01:19:24.125Z","etag":null,"topics":["analytics-engineering","data-engineering","dbt","duckdb","hr-analytics","portfolio"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kdayno.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-06-01T23:26:36.000Z","updated_at":"2026-06-02T00:25:49.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/kdayno/workforce-flux","commit_stats":null,"previous_names":["kdayno/workforce-flux"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/kdayno/workforce-flux","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kdayno%2Fworkforce-flux","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kdayno%2Fworkforce-flux/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kdayno%2Fworkforce-flux/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kdayno%2Fworkforce-flux/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kdayno","download_url":"https://codeload.github.com/kdayno/workforce-flux/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kdayno%2Fworkforce-flux/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34369850,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-15T02:00:07.085Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["analytics-engineering","data-engineering","dbt","duckdb","hr-analytics","portfolio"],"created_at":"2026-06-15T16:04:15.219Z","updated_at":"2026-06-15T16:04:19.791Z","avatar_url":"https://github.com/kdayno.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"![Workforce Flux Header Image](./docs/workforce-flux-header.png) \n\n- An end-to-end People Analytics project: a raw HR dataset (sourced from [Kaggle](#data-source)) transformed into decision-useful insight on headcount, attrition, and pay.\n- File-based analytics stack (DuckDB + dbt + Evidence), deployed as a static Vercel site.\n\n\u003e 🔗 Live demo: [workforceflux.kdayno.com](https://workforceflux.kdayno.com)\n\n## Objectives\n\n1. **Surface decision-useful HR insight.** Quantify workforce dynamics\n   (headcount growth, annualised turnover, attrition drivers, and the\n   effectiveness of recruitment channels), and translate the findings into\n   recommended actions.\n2. **Apply analytics-engineering best practice.** Transform raw, inconsistent\n   source data into clean, tested, and documented analytical models through a\n   layered ELT pipeline and dimensional modelling.\n3. **Demonstrate analytical rigour.** Define HR metrics correctly (for example,\n   annualised versus cumulative turnover), segment responsibly given the sample\n   size, and state every assumption and limitation transparently.\n\n## Key findings\n\n| # | Finding | Headline number |\n|---|---|---|\n| 1 | Hiring freeze, not attrition crisis | Annual turnover 3.6–9.9% (low by [BLS 2019 labour-turnover data](https://www.bls.gov/opub/mlr/2020/article/job-openings-hires-and-quits-set-record-highs-in-2019.htm)) |\n| 2 | Voluntary attrition concentrated in Production department | 86% of voluntary exits from 67% of headcount |\n| 3 | Production has a structural pay-competitiveness gap | Stayers earn 12.7% more than leavers at 5–10 yrs tenure |\n| 4 | Pay equity is healthy; raw gap is composition | 2.1% raw gap → ~0% within position |\n\n\u003e **Full analysis.** Per-finding tables, methodology, assumptions, and caveats:\n\u003e [`docs/full-analysis.md`](docs/full-analysis.md). The subject company is anonymised in\n\u003e the source dataset; this README refers to it as **Company X**.\n\n## Recommendations\n\nThe single highest-leverage intervention indicated by the analysis is a\n**market-rate salary review for Production roles at 3+ years of tenure**.\nThis would directly address the 11 explicit \"more money\" voluntary exits and\nlikely absorb a portion of the 17 \"Another position\" exits.\n\nTwo supporting recommendations (engagement-survey replacement and a merit-pay\npremium for Production) are detailed in [`docs/full-analysis.md#recommendations`](docs/full-analysis.md#recommendations).\n\n## Tech stack\n\n| Layer | Tool | Role |\n|-------|------|------|\n| Storage | [DuckDB](https://duckdb.org) | Embedded analytical database |\n| Transformation | [dbt](https://www.getdbt.com) (`dbt-duckdb`) | Tested, layered SQL models |\n| Visualisation | [Evidence](https://evidence.dev) | BI-as-code reports |\n| Hosting | [Vercel](https://vercel.com) | Static hosting + auto-deploy on push |\n\n## Data source\n\n[Human Resources Data Set](https://www.kaggle.com/datasets/rhuebner/human-resources-data-set)\nby Dr. Rich Huebner \u0026 Dr. Carla Patalano (Kaggle). A single CSV,\n`HRDataset_v14.csv` (**~311 employees, 36 columns**), one row per employee.\n\nThe raw file is **not committed** (see `.gitignore`). Download it from Kaggle\n(a free account is required) and place it at:\n\n```\ndata/raw/HRDataset_v14.csv\n```\n\n## Project structure\n\n```\nworkforce-flux/\n├── LICENSE\n├── README.md\n├── requirements.txt\n├── .gitignore\n├── data/\n│   └── raw/                  # HRDataset_v14.csv goes here (not committed)\n├── docs/\n│   └── full-analysis.md           # Full per-finding analysis (tables, caveats)\n├── eda/                      # Exploratory data analysis (SQL, run against hr.duckdb)\n│   ├── 01_decline_diagnosis.sql\n│   ├── 02_retention.sql\n│   ├── 03_exit_reasons.sql\n│   └── 04_compensation_equity.sql\n├── hr_dbt/                   # dbt project\n│   ├── dbt_project.yml\n│   ├── profiles.yml\n│   ├── packages.yml\n│   └── models/\n│       ├── staging/\n│       │   ├── stg_employees.sql\n│       │   └── _staging.yml\n│       ├── intermediate/\n│       │   ├── int_employees_enriched.sql\n│       │   ├── int_date_spine.sql\n│       │   └── int_headcount_monthly.sql\n│       └── marts/\n│           ├── dim_employee.sql\n│           ├── mart_headcount_monthly.sql\n│           ├── mart_attrition.sql\n│           ├── mart_recruitment_effectiveness.sql\n│           └── _marts.yml\n├── reports/                  # Evidence reports (live at workforceflux.kdayno.com)\n└── hr.duckdb                 # built by dbt; committed for Vercel\n```\n\n## Setup\n\n```bash\n# 1. Python environment\n# NOTE: dbt does not yet support Python 3.14 (its mashumaro dependency fails\n# to import). Use Python 3.13 or earlier.\npython3.13 -m venv .venv\nsource .venv/bin/activate\npip install --upgrade pip\npip install -r requirements.txt\n\n# 2. Download HRDataset_v14.csv from Kaggle into data/raw/\n\n# 3. Build the pipeline (run dbt from inside hr_dbt/)\ncd hr_dbt\ndbt deps                              # installs dbt_utils\ndbt build --profiles-dir .            # runs models + tests\n\n# 4. Run the Evidence reports locally\ncd ../reports\nnpm install\nnpm run dev                           # opens http://localhost:3000\n```\n\n## Pipeline / data model\n\n```\nHRDataset_v14.csv\n└─ stg_employees ................. clean + type-cast, 1 row per employee\n   └─ int_employees_enriched ..... + derived fields (age, tenure, bands…)\n      ├─ dim_employee ............ employee dimension\n      ├─ mart_attrition .......... department-level separation summary\n      ├─ mart_recruitment_effectiveness\n      └─ int_headcount_monthly ... employee-month grain (uses int_date_spine)\n         └─ mart_headcount_monthly  monthly time series + turnover rate\n```\n\nLayer materialisation: staging \u0026 intermediate are **views**, marts are **tables**.\n\n## Next steps\n\n- **Recruitment-source effectiveness as a fifth finding.** The\n  `mart_recruitment_effectiveness` model exists but no narrative is built on\n  it. Which channels (Indeed, LinkedIn, referral, diversity job fair) produce\n  stayers vs leavers, and at what cost-per-retained-hire?\n- **Tenure survival curve for voluntary exits.** A hazard curve by\n  month-of-tenure, Production vs non-Production, would localise *when* in the\n  lifecycle attrition happens and sharpen the salary-review recommendation to\n  a specific tenure-month trigger.\n- **Quantify the salary-review intervention.** The primary recommendation\n  (\"market-rate review for Production at 3+ years\") is qualitative. A\n  cost-benefit estimate (retained employees and avoided replacement cost vs\n  the raise bill) would turn it into a business case.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkdayno%2Fworkforce-flux","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkdayno%2Fworkforce-flux","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkdayno%2Fworkforce-flux/lists"}