{"id":30076494,"url":"https://github.com/imjuliengaupin/bourne","last_synced_at":"2026-04-18T02:33:05.344Z","repository":{"id":303111415,"uuid":"1014432105","full_name":"imjuliengaupin/bourne","owner":"imjuliengaupin","description":"A Python-based framework for modular, AI-ready data pipeline orchestration featuring plug-and-play agents, real-time terminal dashboards, and Pydantic-powered schema validation.","archived":false,"fork":false,"pushed_at":"2026-03-28T06:47:24.000Z","size":1425,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"PROD","last_synced_at":"2026-03-28T11:32:28.588Z","etag":null,"topics":["agentic-ai","agentic-workflow","agents","ai","etl-framework","github-actions","orchestration-framework","pydantic-v2","python3"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/imjuliengaupin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-07-05T18:00:27.000Z","updated_at":"2026-03-28T06:35:27.000Z","dependencies_parsed_at":"2025-07-05T19:48:53.510Z","dependency_job_id":"6b7f4505-a8a8-4dab-a0d5-356939c34cf3","html_url":"https://github.com/imjuliengaupin/bourne","commit_stats":null,"previous_names":["imjuliengaupin/bourne"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/imjuliengaupin/bourne","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/imjuliengaupin%2Fbourne","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/imjuliengaupin%2Fbourne/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/imjuliengaupin%2Fbourne/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/imjuliengaupin%2Fbourne/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/imjuliengaupin","download_url":"https://codeload.github.com/imjuliengaupin/bourne/tar.gz/refs/heads/PROD","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/imjuliengaupin%2Fbourne/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31953784,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-18T00:39:45.007Z","status":"online","status_checked_at":"2026-04-18T02:00:07.018Z","response_time":103,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agentic-ai","agentic-workflow","agents","ai","etl-framework","github-actions","orchestration-framework","pydantic-v2","python3"],"created_at":"2025-08-08T15:08:36.462Z","updated_at":"2026-04-18T02:33:05.338Z","avatar_url":"https://github.com/imjuliengaupin.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ca name=\"readme-top\"\u003e\u003c/a\u003e\n\n\u003cdiv align=\"center\"\u003e\n  \u003ca\u003e\n    \u003cimg src=\"./demo/images/logo.png\" width=\"80\" height=\"80\"\u003e\n  \u003c/a\u003e\n\n\u003ch3 align=\"center\"\u003eBourne\u003c/h3\u003e\n\n\u003cp\u003eEnterprise-grade data pipeline orchestration for ETL workflows\u003c/p\u003e\n\n[![Build Status](https://img.shields.io/github/actions/workflow/status/imjuliengaupin/bourne/devops.yml?branch=PROD\u0026style=for-the-badge\u0026logo=github\u0026label=CI/CD)](https://github.com/imjuliengaupin/bourne/actions/workflows/devops.yml)\n[![Coverage](https://img.shields.io/coveralls/github/imjuliengaupin/bourne/PROD?style=for-the-badge\u0026logo=coveralls\u0026label=COVERAGE)](https://coveralls.io/github/imjuliengaupin/bourne?branch=PROD)\n[![Artifacts](https://img.shields.io/badge/Actions-Artifacts-6c757d?style=for-the-badge\u0026logo=github)](https://github.com/imjuliengaupin/bourne/actions/workflows/devops.yml?query=branch%3APROD)\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"#why-bourne\"\u003eWhy Bourne?\u003c/a\u003e •\n  \u003ca href=\"#features\"\u003eFeatures\u003c/a\u003e •\n  \u003ca href=\"#use-cases\"\u003eUse Cases\u003c/a\u003e •\n  \u003ca href=\"#quick-start\"\u003eQuick Start\u003c/a\u003e •\n  \u003ca href=\"#how-it-works\"\u003eHow It Works\u003c/a\u003e •\n  \u003ca href=\"#example-workflow-conceptual\"\u003eExample Workflow\u003c/a\u003e •\n  \u003ca href=\"#demo\"\u003eDemo\u003c/a\u003e •\n  \u003ca href=\"#quality-reliability\"\u003eQuality \u0026 Reliability\u003c/a\u003e •\n  \u003ca href=\"#license\"\u003eLicense\u003c/a\u003e\n\u003c/p\u003e\n\n\u003c/div\u003e\n\n\u003cbr /\u003e\n\n## \u003ca name=\"why-bourne\"\u003eWhy Bourne?\u003c/a\u003e\n\nModern data pipelines require flexibility, reliability, and visibility. Bourne delivers:\n\n- **Modular agent architecture** - Build complex workflows from simple, reusable components\n- **Real-time observability** - Live dashboard showing workflow progress, data transformations, and errors\n- **Configuration-driven execution** - Define pipelines in JSON, no code changes needed\n- **Enterprise-grade validation** - Multi-tier schema validation with automatic fallback strategies\n- **Production-ready** - Comprehensive testing, retry logic, dependency resolution, and stall detection\n\n\u003cbr /\u003e\n\n## :gear: \u003ca name=\"features\"\u003eFeatures\u003c/a\u003e\n\n- [x] **Multi-agent orchestration**: Coordinate complex data workflows with automatic dependency resolution\n- [x] **Extensible framework**: Add custom agents and transformation logic with minimal boilerplate\n- [x] **Live terminal dashboard**: Real-time workflow progress visualization with color-coded statuses\n- [x] **Field-level data preview**: Compare before/after transformations with highlighted changes\n- [x] **JSON-first configuration**: Build entire pipelines with external JSON files—no code required\n- [x] **Flexible data formats**: Native support for JSON and NDJSON ingestion and output\n- [x] **Intelligent schema validation**: Pydantic v2-backed, multi-tier cascade that keeps validation succeeding across data variations\n- [x] **Resilient execution**: Automatic retries, stall detection, and dependency handling\n- [x] **Comprehensive test suite**: Full coverage with scenario-based end-to-end testing\n\n\u003cp align=\"right\"\u003e\n    (\u003ca href=\"#readme-top\"\u003eback to top\u003c/a\u003e)\n\u003c/p\u003e\n\n## :bulb: \u003ca name=\"use-cases\"\u003eUse Cases\u003c/a\u003e\n\n**Data Integration \u0026 Normalization**\n\n- Ingest data from multiple JSON sources\n- Normalize key formats (camelCase → snake_case)\n- Apply consistent schema validation\n- Export unified, validated output\n\n**ETL Pipeline Management**\n\n- Complex multi-step workflows with task dependencies\n- Schema transformation and validation at each stage\n- Real-time monitoring of pipeline execution\n- Automatic error recovery and retry logic\n\n**Data Quality Assurance**\n\n- Validate incoming data against strict schemas\n- Identify and log transformation changes\n- Track data lineage with automatic metadata\n- Fail-fast strict mode for critical pipelines\n\n**Custom Data Processing**\n\n- Build domain-specific agents for specialized transformations\n- Compose agents into production workflows\n- Configuration-driven scaling across different data sources\n\n\u003cp align=\"right\"\u003e\n    (\u003ca href=\"#readme-top\"\u003eback to top\u003c/a\u003e)\n\u003c/p\u003e\n\n## :rocket: \u003ca name=\"quick-start\"\u003eQuick Start\u003c/a\u003e\n\n### Prerequisites\n\n- Python 3.13\n- Git\n\n### Installation\n\n```bash\n# Clone and install\ngit clone https://github.com/imjuliengaupin/bourne.git\ncd bourne\n\n# Create virtual environment\npython -m venv .venv\nsource .venv/bin/activate\n\n# Install core dependencies only\npip install -e .\n\n# Run a sample workflow\npython main.py \\\n  --workflow json/workflows/default/default.json \\\n  --connector json/connectors/default/single-record.json \\\n  --debug\n```\n\n**Command-line Arguments**\n\n- `--workflow` (required): Point to any JSON workflow configuration\n- `--connector` (required): Point to any JSON data connector configuration\n- `--debug` (optional): Enable live terminal dashboard with progress visualization and real-time logging\n\n\u003cp align=\"right\"\u003e\n    (\u003ca href=\"#readme-top\"\u003eback to top\u003c/a\u003e)\n\u003c/p\u003e\n\n## 🏗️ \u003ca name=\"how-it-works\"\u003eHow It Works\u003c/a\u003e\n\n### Architecture at a Glance\n\n[![Open in Eraser](https://img.shields.io/badge/Open%20in-Eraser-blue?logo=eraser\u0026style=for-the-badge)](https://app.eraser.io/workspace/LDgZLTRhjaVsKpyiZU0B)\n\n\u003cbr /\u003e\n\nBourne orchestrates four core agents in configurable sequences:\n\n| 🤖 Agent                    | 🎯 Purpose                                                   |\n| --------------------------- | ------------------------------------------------------------ |\n| **DataIngestionAgent**      | Reads JSON/NDJSON from files or APIs                         |\n| **DataValidationAgent**     | Validates against expected schemas with intelligent fallback |\n| **DataTransformationAgent** | Applies configurable key transformations and normalizations  |\n| **DataStorageAgent**        | Persists processed data to files or external systems         |\n\nEach agent is independent, reusable, and can be composed into complex workflows with automatic dependency resolution.\n\n\u003cbr /\u003e\n\n### Configuration-Driven Pipelines\n\nDefine your entire workflow in JSON—no Python code needed. Typical flow:\n\n- **Ingest**: Pull JSON/NDJSON from a source connector\n- **Validate**: Enforce schemas with automatic fallback (primary → fallback → manual)\n- **Transform**: Apply key normalization modes and add lineage metadata\n- **Save**: Persist processed data to the configured destination\n\nWorkflows automatically handle:\n\n- Task dependency ordering\n- Automatic retries on transient failures\n- Stall detection and recovery\n- Real-time progress tracking\n\n\u003cbr /\u003e\n\n### Transformations \u0026 Validation\n\nApply powerful transformations without writing code:\n\n| 🛠️ Transformation  | 🎯 Use Case                                | 📊 Example                      |\n| ------------------ | ------------------------------------------ | ------------------------------- |\n| `lowercase_keys`   | Convert all keys to lowercase              | `FirstName` → `firstname`       |\n| `uppercase_keys`   | Convert all keys to UPPERCASE              | `FirstName` → `FIRSTNAME`       |\n| `snake_case_keys`  | Convert to Python/database friendly format | `FirstName` → `first_name`      |\n| `camel_case_keys`  | Convert all keys to camelCase format       | `FirstName` → `firstName`       |\n| `pascal_case_keys` | Convert all keys to PascalCase format      | `first_name` → `FirstName`      |\n| `normalize_types`  | Coerce all values to string type           | `{\"id\": 123}` → `{\"id\": \"123\"}` |\n\nTransformation metadata keys auto-match the selected transform mode; toggle inclusion with `include_transformation_metadata` (default: off).\n\nSchema validation uses a cascading approach:\n\n1. **Primary** - Full typed Pydantic validation\n2. **Fallback** - Enhanced validation with type flexibility\n3. **Manual** - String-normalized fallback for ambiguous data\n\nThis ensures your pipeline succeeds even with unexpected data variations.\n\n\u003cp align=\"right\"\u003e\n    (\u003ca href=\"#readme-top\"\u003eback to top\u003c/a\u003e)\n\u003c/p\u003e\n\n## 🧪 \u003ca name=\"example-workflow-conceptual\"\u003eExample Workflow (Conceptual)\u003c/a\u003e\n\n- **Source**: JSON/NDJSON ingested through a connector\n- **Validate**: Schema-checked with graceful fallback to keep data flowing\n- **Transform**: Key casing/normalization applied; transformation metadata added\n- **Store**: Written to the configured output target\n- **Observe**: Live dashboard shows task progress, retries, and data diffs\n\nNote: Transformation automatically adds lineage metadata fields whose key casing matches the selected `transform_mode`.\n\n\u003cp align=\"right\"\u003e\n  (\u003ca href=\"#readme-top\"\u003eback to top\u003c/a\u003e)\n\u003c/p\u003e\n\n## 🎬 \u003ca name=\"demo\"\u003eDemo\u003c/a\u003e\n\n### Live Workflow Dashboard\n\nReal-time visualization of agent execution, task dependencies, and data flow:\n\n\u003c!-- TODO: Update with new gif --\u003e\n\u003cimg src=\"./demo/images/demo.gif\" width=\"700\"\u003e\n\n\u003cbr /\u003e\n\n### Data Transformation Preview\n\nSee exactly what changed before and after transformations are applied. \u003cspan style=\"color:#6de896\"\u003e\u003cb\u003eHighlighted\u003c/b\u003e\u003c/span\u003e fields show which keys and values were modified:\n\n\u003c!-- TODO: Update with new image --\u003e\n\u003cimg src=\"./demo/images/preview.png\" width=\"700\"\u003e\n\nSample transformation exhibited: `lowercase_keys` with `include_transformation_metadata` fields enabled.\n\n\u003cp align=\"right\"\u003e\n    (\u003ca href=\"#readme-top\"\u003eback to top\u003c/a\u003e)\n\u003c/p\u003e\n\n## 💪🏼 \u003ca name=\"quality-reliability\"\u003eQuality \u0026 Reliability\u003c/a\u003e\n\n- **Comprehensive testing**: 21 end-to-end scenario tests + 12 unit tests validate all agents, connectors, and transformation modes\n- **Continuous verification**: Automated CI/CD pipeline (GitHub Actions) with static type checking, linting, and coverage reporting (Coveralls)\n- **Production-ready resilience**: Built-in retries, automatic dependency ordering, stall detection, and graceful validation fallback\n- **Observable \u0026 maintainable**: Full-stack observability through live dashboard; clean, modular architecture for long-term maintenance\n\n\u003cp align=\"right\"\u003e\n  (\u003ca href=\"#readme-top\"\u003eback to top\u003c/a\u003e)\n\u003c/p\u003e\n\n## :pencil: \u003ca name=\"license\"\u003eLicense\u003c/a\u003e\n\nAll rights reserved.\n\nThis source code is proprietary. Unauthorized copying, modification, distribution, or use is prohibited without explicit permission from the author.\n\n\u003cp align=\"right\"\u003e\n    (\u003ca href=\"#readme-top\"\u003eback to top\u003c/a\u003e)\n\u003c/p\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fimjuliengaupin%2Fbourne","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fimjuliengaupin%2Fbourne","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fimjuliengaupin%2Fbourne/lists"}