https://github.com/mrpaulandrew/procfwk

A cross-tenant, metadata-driven processing framework for Azure Data Factory and Azure Synapse Analytics, achieved by coupling orchestration pipelines with a SQL database and a set of Azure Functions.

adf adfprocfwk azure azure-functions azure-sql-database data-engineering data-factory framework metadata pipelines processing procfwk

# Read Me - Orchestrate.[procfwk](http://procfwk.com/)

For complete documentation on this solution see [procfwk.com](http://procfwk.com/).

## ProcFwk Has Become CF.Cumulus.Control

See blog: [mrpaulandrew.com](https://mrpaulandrew.com/2024/01/07/procfwk-is-getting-an-upgrade-to-cf-cumulus/)

See new product page: [cloudformations.org/cumulus](https://www.cloudformations.org/cumulus?utm_source=pa&utm_medium=github&utm_campaign=cumulus&utm_content=l2)

[ ![](https://mrpaulandrew.github.io/procfwk/procfwk-to-cumulus.png) ](https://mrpaulandrew.github.io/procfwk/procfwk-to-cumulus.png)

ProcFwk will receive no further development beyond December 2023.

## Framework Capabilities

* Granular metadata control.
* Metadata integrity checking.
* Global properties.
* Complete pipeline dependency chains.
* Concurrent batch executions (hourly/daily/monthly).
* Execution restartability.
* Parallel pipeline execution.
* Full execution and error logs.
* Operational dashboards.
* Low cost orchestration.
* Disconnection between framework and Worker pipelines.
* Cross Tenant/Subscription/Data Factory control flows.
* Pipeline parameter support.
* Simple troubleshooting.
* Easy deployment.
* Email alerting.
* Automated testing.
* Azure Key Vault integration.
* Checks for whether a pipeline is already running.
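
Several of the capabilities above (staged execution, parallel Workers within a stage, and dependency chains that block downstream pipelines after a failure) can be illustrated with a small sketch. This is not the procfwk implementation, which lives in Data Factory pipelines, a SQL database and Azure Functions. The `Worker` class, the `run_framework` function, the `run_pipeline` callback and the status strings are all hypothetical names chosen for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from itertools import groupby


@dataclass
class Worker:
    """One row of hypothetical pipeline metadata (not the procfwk schema)."""
    name: str
    stage: int
    enabled: bool = True
    depends_on: list[str] = field(default_factory=list)


def run_framework(workers, run_pipeline, max_parallel=4):
    """Run stages in order and the Workers inside each stage in parallel.

    run_pipeline(name) -> bool is supplied by the caller and stands in for
    whatever actually triggers a pipeline. Returns {worker name: status}.
    """
    status = {}
    # Disabled Workers are skipped entirely; the rest are ordered by stage.
    active = sorted((w for w in workers if w.enabled), key=lambda w: w.stage)
    for _, group in groupby(active, key=lambda w: w.stage):
        runnable = []
        for w in group:
            # A Worker is blocked when any upstream dependency failed
            # or was itself blocked, so failures propagate down the chain.
            if any(status.get(d) in ("Failed", "Blocked") for d in w.depends_on):
                status[w.name] = "Blocked"
            else:
                runnable.append(w)
        # Workers in the same stage run concurrently, up to max_parallel.
        with ThreadPoolExecutor(max_workers=max_parallel) as pool:
            outcomes = pool.map(lambda w: run_pipeline(w.name), runnable)
            for w, ok in zip(runnable, outcomes):
                status[w.name] = "Success" if ok else "Failed"
    return status


# Illustrative metadata: two extracts, two transforms, one load.
workers = [
    Worker("Extract A", stage=1),
    Worker("Extract B", stage=1),
    Worker("Transform A", stage=2, depends_on=["Extract A"]),
    Worker("Transform B", stage=2, depends_on=["Extract B"]),
    Worker("Load", stage=3, depends_on=["Transform A", "Transform B"]),
]

# Simulate a failure in "Extract B"; every other pipeline succeeds.
result = run_framework(workers, run_pipeline=lambda name: name != "Extract B")
```

In this simulated run, `Transform B` and `Load` end up `Blocked` because the failure of `Extract B` propagates down the dependency chain, while `Transform A` still runs. Keeping the execution plan in data rather than in the orchestrator is what lets one set of framework pipelines drive any number of Worker pipelines.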

## Complete Data Factory Activity Chain

[ ![](https://mrpaulandrew.github.io/procfwk/activitychain-full.png) ](https://mrpaulandrew.github.io/procfwk/activitychain-full.png)

## Issues

If you've found a bug or have a new feature request, please log the details using the repository issues.

Go to... [Issues](https://github.com/mrpaulandrew/procfwk/issues)

## Projects

Go to... [External Requests](https://github.com/mrpaulandrew/procfwk/projects/2)

Go to... [Internal Backlog](https://github.com/mrpaulandrew/procfwk/projects/1)

## Release Details

| Version | Overview | Version Details & Release Notes |
|:----:|--------------|--------|
| 2.0 | Azure Synapse Analytics fully supported as an interchangeable orchestrator of pipelines within the procfwk. | GitHub Pages: [Orchestrators](https://mrpaulandrew.github.io/procfwk/orchestrators), [Orchestrator Types](https://mrpaulandrew.github.io/procfwk/orchestratortypes)<br>Release Summary Video: [YouTube - procfwk Playlist](https://www.youtube.com/c/mrpaulandrew)<br>GitHub Issues: [procfwk #95](https://github.com/mrpaulandrew/procfwk/issues/95) |
| 2.0-beta | Azure Synapse Analytics **Beta** support added.<br>Development of Azure Functions App completed using the Synapse namespace: _Azure.Analytics.Synapse.Artifacts_ with version **1.0.0-beta.1** of the NuGet package. | GitHub Issues: [procfwk #21](https://github.com/mrpaulandrew/procfwk/issues/21) |
| 1.9.2 | Batch Executions added, plus:<br>• Exception Pipeline<br>• Running Pipeline Check<br>• Pipeline Parameter Last Values<br>• Worker Pipeline Validation | GitHub Pages: [Batch Executions](https://mrpaulandrew.github.io/procfwk/executionbatches)<br>Release Demo Summary Video: [YouTube - procfwk Playlist](https://www.youtube.com/c/mrpaulandrew)<br>GitHub Issues: [procfwk #78](https://github.com/mrpaulandrew/procfwk/issues/78), [procfwk #77](https://github.com/mrpaulandrew/procfwk/issues/77), [procfwk #71](https://github.com/mrpaulandrew/procfwk/issues/71), [procfwk #73](https://github.com/mrpaulandrew/procfwk/issues/73), [procfwk #80](https://github.com/mrpaulandrew/procfwk/issues/80), [procfwk #72](https://github.com/mrpaulandrew/procfwk/issues/72) |
| 1.9.1 | Activity Policy Update, plus:<br>• Secure Activity Inputs/Outputs.<br>• Execution Wrapper Hardening.<br>• New Activity Icons and Framework Factory Cosmetics. | GitHub Issues: [procfwk #65](https://github.com/mrpaulandrew/procfwk/issues/65), [procfwk #66](https://github.com/mrpaulandrew/procfwk/issues/66), [procfwk #67](https://github.com/mrpaulandrew/procfwk/issues/67), [procfwk #69](https://github.com/mrpaulandrew/procfwk/issues/69) |
| 1.9.0 | Cross Tenant & Subscription Support added, plus:<br>• New integration tests created.<br>• Infant pipeline refactoring.<br>• tSQLt project added. | GitHub Issues: [procfwk #34](https://github.com/mrpaulandrew/procfwk/issues/34), [procfwk #35](https://github.com/mrpaulandrew/procfwk/issues/35), [procfwk #46](https://github.com/mrpaulandrew/procfwk/issues/46), [procfwk #55](https://github.com/mrpaulandrew/procfwk/issues/55), [procfwk #56](https://github.com/mrpaulandrew/procfwk/issues/56), [procfwk #59](https://github.com/mrpaulandrew/procfwk/issues/59) |
| 1.8.6 | Pipeline Expressions Refactored to Use Variables, plus:<br>• New integration tests created.<br>• Complete activity chain redrawn in Visio. | GitHub Issues: [procfwk #51](https://github.com/mrpaulandrew/procfwk/issues/51), [procfwk #52](https://github.com/mrpaulandrew/procfwk/issues/52) |
| 1.8.5 | Execution Precursor added, plus:<br>• PowerShell helper to add initial Worker metadata. | [procfwk v1.8.5 - Execution Precursor](https://mrpaulandrew.com/2020/08/17/adf-procfwk-v1-8-5-execution-precursor/) |
| 1.8.4 |Database Schema Reorganise and Restructuring |[procfwk v1.8.4 - Database Schema Reorganise and Restructuring](https://mrpaulandrew.com/2020/07/23/adf-procfwk-v1-8-4-database-schema-reorganise-and-restructuring/) |
| 1.8.3 | Bug Fixes from the Community, including:<br>• Email alerts sent to blank email addresses due to wrong flow in Child pipeline.<br>• Worker pipelines cancelled during an execution fail when the framework is restarted due to missing Parent pipeline clean up condition. | GitHub Issues: [procfwk #38](https://github.com/mrpaulandrew/procfwk/issues/38), [procfwk #37](https://github.com/mrpaulandrew/procfwk/issues/37) |
| 1.8.2 |Optionally Store SPN Details in Azure Key Vault |[procfwk v1.8.2 - Optionally Store SPN Details in Azure Key Vault](https://mrpaulandrew.com/2020/07/22/adf-procfwk-v1-8-2-optionally-store-spn-details-in-azure-key-vault/) |
| 1.8.1 | Automated Framework Pipeline Testing added, including tests for:<br>• A simple grandparent run.<br>• All types of failure dependency handling.<br>• Metadata checks when pipelines and stages are disabled.<br>• No pipeline parameters provided. | Blog Series:<br>1. [Set up automated testing for Azure Data Factory](https://richardswinbank.net/adf/set_up_automated_testing_for_azure_data_factory)<br>2. [Automate integration tests in Azure Data Factory](https://richardswinbank.net/adf/automate_integration_tests_in_azure_data_factory)<br>3. [Isolated functional tests for Azure Data Factory](https://richardswinbank.net/adf/isolated_functional_tests_for_azure_data_factory)<br>4. [Testing Azure Data Factory in your CI/CD pipeline](https://richardswinbank.net/adf/testing_azure_data_factory_in_your_cicd_pipeline)<br>5. [Unit testing Azure Data Factory pipelines](https://richardswinbank.net/adf/unit_testing_azure_data_factory_pipelines)<br>6. [Calculating Azure Data Factory test coverage](https://richardswinbank.net/adf/calculating_azure_data_factory_test_coverage) |
| 1.8.0 | Complete Pipeline Dependency Chains For Failure Handling added, plus:<br>• Clean up of a previous execution run if Workers appear as running.<br>• New metadata integrity checks.<br>• Internal get property value function added. | [procfwk v1.8 - Complete Pipeline Dependency Chains For Failure Handling](https://mrpaulandrew.com/2020/07/01/adf-procfwk-v1-8-complete-pipeline-dependency-chains-for-failure-handling/) |
| 1.7.3 |Data Factory Deployment Updated To Use azure.datafactory.tools PowerShell Module |[SQLPlayer/azure.datafactory.tools](https://github.com/SQLPlayer/azure.datafactory.tools) |
| 1.7.2 | Pipeline Parameter NULL Handling added, plus:<br>• Worker pipelines with a status of 'Running' protected from a new execution start/restart. | [procfwk v1.7.2 - NULL Pipeline Parameters Handled](https://mrpaulandrew.com/2020/06/22/adf-procfwk-v1-7-2-null-pipeline-parameters-handled/) |
| 1.7.1 | Alerting Check Bug Fix added, plus:<br>• Pipeline parameter value size limit removed. | [procfwk v1.7.1 - Alerting Bug Fix And Pipeline Parameter Size Limit Removed](https://mrpaulandrew.com/2020/06/12/adf-procfwk-v1-7-1-alerting-bug-fix-and-pipeline-parameter-size-limit-removed/) |
| 1.7.0 | Pipeline Email Alerting added, plus:<br>• Send email Function implemented and hardened.<br>• Handy Notebook updates.<br>• Activity failure paths improved.<br>• MIT license and code of conduct added.<br>• Error table bug fix: error code attribute changed from INT to VARCHAR. | [procfwk v1.7 - Pipeline Email Alerting](https://mrpaulandrew.com/2020/06/08/adf-procfwk-v1-7-pipeline-email-alerting/) |
| 1.6.0 | Error Details for Failed Activities Captured, plus:<br>• Pipeline parameters used at runtime captured in execution logs.<br>• Emailing Function added, not yet implemented.<br>• Unknown Worker outcomes optionally block downstream stages.<br>• Solution housekeeping. | [procfwk v1.6 - Error Details for Failed Activities Captured](https://mrpaulandrew.com/2020/05/19/adf-procfwk-v1-6-error-details-for-failed-activities-captured/) |
| 1.5.0 | Power BI Dashboard for Framework Executions, plus:<br>• Worker Parallelism View.<br>• Pipeline Run ID now logged.<br>• Logging Attributes Bug Fix. | [procfwk v1.5 - Power BI Dashboard for Framework Executions](https://mrpaulandrew.com/2020/05/01/adf-procfwk-v1-5-power-bi-dashboard-for-framework-executions/) |
| 1.4.0 | Enhancements for Long Running Pipelines, plus:<br>• Pipeline check status function added.<br>• Function Data Factory client moved to internal class.<br>• SQL GETDATE() changed to GETUTCDATE().<br>• Glossary created, [here](https://github.com/mrpaulandrew/procfwk/blob/master/Glossary.md).<br>• Updated database views. | [procfwk v1.4 - Enhancements for Long Running Pipelines](https://mrpaulandrew.com/2020/04/20/adf-procfwk-v1-4-enhancements-for-long-running-pipelines/) |
| 1.3.0 | Metadata Integrity Checks, plus:<br>• Logical pipeline predecessors.<br>• Data Factory PowerShell deployment script.<br>• Helper Notebook.<br>• Database object renames and solution tidy up. | [procfwk v1.3 - Metadata Integrity Checks](https://mrpaulandrew.com/2020/04/07/adf-procfwk-v1-3-metadata-integrity-checks/) |
| 1.2.0 | Execution Restartability, plus:<br>• Data Factory annotations and descriptions.<br>• Database covering indexes.<br>• Pipeline log status changed from 'Started' to 'Preparing'.<br>• Pipeline log start date/time now set in child pipeline. | [procfwk v1.2 - Execution Restartability](https://mrpaulandrew.com/2020/03/24/adf-procfwk-v1-2-execution-restartability/) |
| 1.1.0 | Service Principal Handling via Metadata, plus:<br>• Data Factory table.<br>• Properties table and view.<br>• Function body bug fix.<br>• New sample data. | [procfwk v1.1 - Service Principal Handling via Metadata](https://mrpaulandrew.com/2020/03/17/adf-procfwk-v1-1-service-principal-handling-via-metadata/) |
| 1.0.0 | Simple framework designed and base components built.<br>• Part 1 - Design, concepts, service coupling, caveats, problems.<br>• Part 2 - Database build and metadata.<br>• Part 3 - Data Factory build.<br>• Part 4 - Execution, conclusions, enhancements. | Blog Series:<br>[Creating a Simple Staged Metadata Driven Processing Framework for Azure Data Factory Pipelines](https://mrpaulandrew.com/2020/02/25/creating-a-simple-staged-metadata-driven-processing-framework-for-azure-data-factory-pipelines-part-1-of-4/) |