
azure_platform_dataengineer's Introduction

RealTime & BatchTime Azure Platform

This is a practical example of a data engineering project. It covers the following topics:

1. Infrastructure as Code (IaC) with Terraform:

Benefits:

  • Automate infrastructure management
  • Review infrastructure changes before they are applied

Objective:

  • Deploy a resource group, a virtual machine, a storage account, and a data warehouse.

Advanced Criteria:

  • State management: The state is managed and stored properly, possibly in a remote backend.
  • Modularity: Scripts are modularized using modules, promoting reusability.
  • Destruction: Safe destruction of resources without leaving orphaned resources in the cloud environment.
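
The plan-then-apply workflow behind these criteria can be sketched from Python. This is a minimal illustration rather than the project's own tooling: it only assembles standard Terraform CLI invocations (`init`, `plan -out`, `apply`), and the plan file name `tfplan` and the helper names are assumptions made for the example.

```python
import subprocess
from typing import List, Sequence

PLAN_FILE = "tfplan"  # hypothetical plan file name


def terraform_commands(destroy: bool = False) -> List[List[str]]:
    """Return the Terraform CLI calls for a reviewed plan-then-apply run."""
    plan = ["terraform", "plan", "-input=false", f"-out={PLAN_FILE}"]
    if destroy:
        # A destroy plan previews exactly which resources will be removed.
        plan.insert(2, "-destroy")
    return [
        ["terraform", "init", "-input=false"],
        plan,
        # Applying the saved plan file executes only what was previewed.
        ["terraform", "apply", "-input=false", PLAN_FILE],
    ]


def run(commands: Sequence[Sequence[str]], workdir: str) -> None:
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, cwd=workdir, check=True)
```

Calling `run(terraform_commands(), "path/to/module")` would deploy, and `run(terraform_commands(destroy=True), "path/to/module")` would tear the same resources down from a saved destroy plan. The directory path here is a placeholder.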

2. Real-Time:

Architecture with Stream Analytics:

  • Generate data (Python) and send it to Azure Event Hubs
  • Read the stream data with Azure Stream Analytics
  • Store the data in Azure Data Lake Storage Gen2
  • Machine learning: deploy a trained model as an endpoint with Azure Machine Learning Studio
  • Add database features to Azure SQL Server
  • Visualize the real-time data in a Power BI dashboard
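
The first step of this pipeline, generating data and sending it to the event hub, might look roughly like the sketch below. It is not the repository's generator script: the field names (`device_id`, `temperature`, `timestamp`) and the helper names are hypothetical, and the send step assumes the `azure-eventhub` package is installed and that a connection string and hub name are available.

```python
import json
import random
import time
from typing import Dict, Iterable, Iterator, List


def generate_readings(count: int, seed: int = 0) -> Iterator[Dict[str, object]]:
    """Yield synthetic sensor readings (field names are hypothetical)."""
    rng = random.Random(seed)
    for i in range(count):
        yield {
            "device_id": f"sensor-{i % 3}",
            "temperature": round(rng.uniform(20.0, 30.0), 2),
            "timestamp": time.time(),
        }


def to_payloads(readings: Iterable[Dict[str, object]]) -> List[str]:
    """Serialize each reading to a JSON string, one event per reading."""
    return [json.dumps(reading) for reading in readings]


def send_to_event_hub(payloads: List[str], conn_str: str, hub_name: str) -> None:
    """Send all payloads as one batch (requires the azure-eventhub package)."""
    # Imported lazily so the generator above works without the Azure SDK.
    from azure.eventhub import EventData, EventHubProducerClient

    producer = EventHubProducerClient.from_connection_string(
        conn_str, eventhub_name=hub_name
    )
    with producer:
        batch = producer.create_batch()
        for payload in payloads:
            # add() raises ValueError once the batch reaches its size limit.
            batch.add(EventData(payload))
        producer.send_batch(batch)
```

Keeping generation separate from sending makes it possible to inspect or test the payloads locally before any Azure resources exist.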

Architecture with Databricks:

  • Generate data (Python) and send it to Azure Event Hubs.
  • Databricks: use Spark to read the stream data from the event hub, save it in Parquet format to Azure Data Lake Storage Gen2, and send it to the Power BI dashboard through the push API.
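
The push step to Power BI can be illustrated without Spark. The sketch below is an assumption-laden example, not the project's notebook code: it assumes a Power BI streaming dataset whose push URL accepts a JSON array of row objects, and the function names and the row fields are hypothetical.

```python
import json
import urllib.request
from typing import Dict, List


def build_push_request(
    push_url: str, rows: List[Dict[str, object]]
) -> urllib.request.Request:
    """Build a POST request carrying the rows as a JSON array."""
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(
        push_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def push_rows(push_url: str, rows: List[Dict[str, object]]) -> int:
    """Send the rows and return the HTTP status code."""
    with urllib.request.urlopen(build_push_request(push_url, rows)) as resp:
        return resp.status
```

In a streaming job, a function like `push_rows` would typically be called once per micro-batch with the rows collected from that batch, so the dashboard updates as data arrives.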

3. Batch-Time:

  • Web app (HTML, CSS, JS, Flask): upload a CSV file and show a report
  • Store the file in Azure Data Lake Storage Gen2
  • Azure Function Apps: trigger a Databricks job when a new file arrives in Blob Storage
  • Databricks: ingest data from Blob Storage, run ETL and preprocessing, and apply the machine learning model (Spark)
  • Delta Lake: raw data (Bronze), feature selection and missing-value handling (Silver), results (Gold)
  • Machine learning: XGBoost and ANN
  • Add database features to Azure SQL Server
  • Visualize the data in a Power BI report
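
The trigger step, where a new file in Blob Storage starts a Databricks job, could be sketched as a call to the Databricks Jobs API `run-now` endpoint. This is only an illustration of what the Azure Function might do: the helper names and the notebook parameter `input_path` are hypothetical, and the workspace host, access token, and job ID are placeholders that would come from the function's configuration.

```python
import json
import urllib.request


def build_run_now_request(
    host: str, token: str, job_id: int, blob_path: str
) -> urllib.request.Request:
    """Build a Jobs API 2.1 run-now request for the given job."""
    body = {
        "job_id": job_id,
        # "input_path" is a hypothetical notebook parameter name.
        "notebook_params": {"input_path": blob_path},
    }
    return urllib.request.Request(
        f"{host.rstrip('/')}/api/2.1/jobs/run-now",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def trigger_job(host: str, token: str, job_id: int, blob_path: str) -> dict:
    """Start the job and return the parsed JSON response."""
    request = build_run_now_request(host, token, job_id, blob_path)
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

Passing the blob path as a notebook parameter lets the job process only the file that caused the trigger instead of rescanning the whole container.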

Generating data

  1. Open a terminal in the RealTime/EventHub folder
  2. Run pip install -r requirements.txt
  3. Run python generate_realtime_eventhub_operation.py (or, in the same way, python generate_realtime_eventhub_raman.py)

Starting the web app

  1. Open a terminal in the BatchTime/WebappDemoplatform folder
  2. Run pip install -r requirements.txt
  3. Run python main.py

You should create a new Python environment before installing the requirements.

References:

WEB DEMO

http://demoplatformv1.azurewebsites.net/

VIDEO DEMO

https://youtu.be/34y3LF-Zk80
