suneelsunkara / procfwk

This project forked from mrpaulandrew/procfwk

A cross tenant metadata driven processing framework for Azure Data Factory and Azure Synapse Analytics achieved by coupling orchestration pipelines with a SQL database and a set of Azure Functions.

Home Page: https://mrpaulandrew.com/category/azure/data-factory/adf-procfwk/

License: Other

PowerShell 3.97% C# 47.15% TSQL 41.28% PLpgSQL 1.27% Jupyter Notebook 6.25% Scala 0.09%

procfwk's Introduction

Read Me - Orchestrate.procfwk

Documentation

For complete documentation on this solution see procfwk.com.

Framework Capabilities

  • Granular metadata control.
  • Metadata integrity checking.
  • Global properties.
  • Complete pipeline dependency chains.
  • Concurrent batch executions (hourly/daily/monthly).
  • Execution restart-ability.
  • Parallel pipeline execution.
  • Full execution and error logs.
  • Operational dashboards.
  • Low cost orchestration.
  • Disconnection between framework and Worker pipelines.
  • Cross Tenant/Subscription/Data Factory control flows.
  • Pipeline parameter support.
  • Simple troubleshooting.
  • Easy deployment.
  • Email alerting.
  • Automated testing.
  • Azure Key Vault integration.
  • Checks for whether a pipeline is already running.
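
Several of the capabilities above (dependency chains, parallel execution, restart-ability, metadata integrity checking) can be illustrated with a small in-memory sketch. This is not the framework's code: procfwk keeps its metadata in a SQL database and runs Worker pipelines through Data Factory or Synapse and Azure Functions. The `Worker` class, the `execute` function, and the status strings `Success`, `Failed` and `Blocked` are hypothetical names chosen for the illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Worker:
    """Hypothetical stand-in for one row of Worker pipeline metadata."""
    name: str
    stage: int
    run: Callable[[], None]  # raises an exception to signal failure
    depends_on: List[str] = field(default_factory=list)
    status: str = "Not Started"


def execute(workers: List[Worker], max_parallel: int = 4) -> Dict[str, str]:
    """Run stages in order and the Workers inside a stage in parallel."""
    by_name = {w.name: w for w in workers}

    # Integrity check: every dependency must exist and sit in an earlier stage.
    for w in workers:
        for dep in w.depends_on:
            if dep not in by_name:
                raise ValueError(f"{w.name} depends on unknown worker {dep}")
            if by_name[dep].stage >= w.stage:
                raise ValueError(f"{w.name} must run in a later stage than {dep}")

    for stage in sorted({w.stage for w in workers}):
        runnable = []
        for w in workers:
            if w.stage != stage:
                continue
            if w.status == "Success":
                continue  # restart: work that already succeeded is not repeated
            if any(by_name[d].status != "Success" for d in w.depends_on):
                w.status = "Blocked"  # an upstream failure blocks this Worker
            else:
                runnable.append(w)

        # Submit the whole stage, then wait for every Worker to finish.
        with ThreadPoolExecutor(max_workers=max_parallel) as pool:
            futures = {pool.submit(w.run): w for w in runnable}
            for future, w in futures.items():
                try:
                    future.result()
                    w.status = "Success"
                except Exception:
                    w.status = "Failed"

    return {w.name: w.status for w in workers}
```

Calling `execute` a second time with the same `Worker` objects behaves like a restart in this sketch: Workers that already succeeded are skipped, failed Workers are retried, and blocked Workers are re-evaluated against their dependencies.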

Complete Data Factory Activity Chain

Issues

If you've found a bug or have a new feature request, please log the details using the repository issues.

Go to... Issues

Projects

Go to... External Requests

Go to... Internal Backlog

Release Details

Version Overview Version Details & Release Notes
2.0 Azure Synapse Analytics fully supported as an interchangeable orchestrator of pipelines within the procfwk.
GitHub Pages:
Orchestrators
Orchestrator Types

Release Summary Video:
YouTube - procfwk Playlist

GitHub Issues:
procfwk #95
2.0-beta Azure Synapse Analytics Beta support added.

Development of Azure Functions App completed using the Synapse namespace: Azure.Analytics.Synapse.Artifacts with version 1.0.0-beta.1 of the NuGet package.
GitHub Issues:
procfwk #21
1.9.2 Batch Executions added, plus:
  • Exception Pipeline
  • Running Pipeline Check
  • Pipeline Parameter Last Values
  • Worker Pipeline Validation
GitHub Pages: Batch Executions

Release Demo Summary Video: YouTube - procfwk Playlist

GitHub Issues:
procfwk #78
procfwk #77
procfwk #71
procfwk #73
procfwk #80
procfwk #72
1.9.1 Activity Policy Update, plus:
  • Secure Activity Inputs/Outputs.
  • Execution Wrapper Hardening.
  • New Activity Icons and Framework Factory Cosmetics.
GitHub Issues:
procfwk #65
procfwk #66
procfwk #67
procfwk #69
1.9.0 Cross Tenant & Subscription Support added, plus:
  • New integration tests created.
  • Infant pipeline refactoring.
  • tSQLt project added.
GitHub Issues:
procfwk #34
procfwk #35
procfwk #46
procfwk #55
procfwk #56
procfwk #59
1.8.6 Pipeline Expressions Refactored to Use Variables, plus:
  • New integration tests created.
  • Complete activity chain redrawn in Visio.
GitHub Issues:
procfwk #51
procfwk #52
1.8.5 Execution Precursor added, plus:
  • PowerShell helper to add initial Worker metadata.
procfwk v1.8.5 - Execution Precursor
1.8.4 Database Schema Reorganise and Restructuring
procfwk v1.8.4 - Database Schema Reorganise and Restructuring
1.8.3 Bug Fixes from the Community, including:
  • Email alerts sent to blank email addresses due to wrong flow in Child pipeline.
  • Worker pipelines cancelled during an execution fail when the framework is restarted due to missing Parent pipeline clean up condition.
GitHub Issues:
procfwk #38
procfwk #37
1.8.2 Optionally Store SPN Details in Azure Key Vault
procfwk v1.8.2 - Optionally Store SPN Details in Azure Key Vault
1.8.1 Automated Framework Pipeline Testing added, including tests for:
  • A simple grandparent run.
  • All types of failure dependency handling.
  • Metadata checks when pipelines and stages are disabled.
  • No pipeline parameters provided.
Blog Series:
  1. Set up automated testing for Azure Data Factory
  2. Automate integration tests in Azure Data Factory
  3. Isolated functional tests for Azure Data Factory
  4. Testing Azure Data Factory in your CI/CD pipeline
  5. Unit testing Azure Data Factory pipelines
  6. Calculating Azure Data Factory test coverage
1.8.0 Complete Pipeline Dependency Chains For Failure Handling added, plus:
  • Clean up of a previous execution run if Workers appear as running.
  • New metadata integrity checks.
  • Internal get property value function added.
procfwk v1.8 - Complete Pipeline Dependency Chains For Failure Handling
1.7.3 Data Factory Deployment Updated To Use azure.datafactory.tools PowerShell Module
SQLPlayer/azure.datafactory.tools
1.7.2 Pipeline Parameter NULL Handling added, plus:
  • Worker pipelines with a status of 'Running' protected from a new execution start/restart.
procfwk v1.7.2 - NULL Pipeline Parameters Handled
1.7.1 Alerting Check Bug Fix added, plus:
  • Pipeline parameter value size limit removed.
procfwk v1.7.1 - Alerting Bug Fix And Pipeline Parameter Size Limit Removed
1.7.0 Pipeline Email Alerting added, plus:
  • Send email Function implemented and hardened.
  • Handy Notebook updates.
  • Activity failure paths improved.
  • MIT license and code of conduct added.
  • Error table bug fix: error code attribute changed from INT to VARCHAR.
procfwk v1.7 - Pipeline Email Alerting
1.6.0 Error Details for Failed Activities Captured, plus:
  • Pipeline parameters used at runtime captured in execution logs.
  • Emailing Function added, not yet implemented.
  • Unknown Worker outcomes optionally block downstream stages.
  • Solution housekeeping.
procfwk v1.6 - Error Details for Failed Activities Captured
1.5.0 Power BI Dashboard for Framework Executions, plus:
  • Worker Parallelism View.
  • Pipeline Run ID now logged.
  • Logging Attributes Bug Fix.
procfwk v1.5 - Power BI Dashboard for Framework Executions
1.4.0 Enhancements for Long Running Pipelines, plus:
  • Pipeline check status function added.
  • Function Data Factory client moved to internal class.
  • SQL GETDATE() changed to GETUTCDATE().
  • Glossary created.
  • Updated database views.
procfwk v1.4 - Enhancements for Long Running Pipelines
1.3.0 Metadata Integrity Checks, plus:
  • Logical pipeline predecessors.
  • Data Factory PowerShell deployment script.
  • Helper Notebook.
  • Database object renames and solution tidy-up.
procfwk v1.3 - Metadata Integrity Checks
1.2.0 Execution Restartability, plus:
  • Data Factory annotations and descriptions.
  • Database covering indexes.
  • Pipeline log status changed from 'Started' to 'Preparing'.
  • Pipeline log start date/time now set in child pipeline.
procfwk v1.2 - Execution Restartability
1.1.0 Service Principal Handling via Metadata, plus:
  • Data Factory table.
  • Properties table and view.
  • Function body bug fix.
  • New sample data.
procfwk v1.1 - Service Principal Handling via Metadata
1.0.0 Simple framework designed and base components built.
  • Part 1 - Design, concepts, service coupling, caveats, problems.
  • Part 2 - Database build and metadata.
  • Part 3 - Data Factory build.
  • Part 4 - Execution, conclusions, enhancements.
Blog Series:
Creating a Simple Staged Metadata Driven Processing Framework for Azure Data Factory Pipelines

procfwk's People

Contributors

mrpaulandrew, nowinskik, pedrofiadeiropeak, njlangley

Watchers

James Cloos
