Giter Club home page Giter Club logo

mybi-dbt-showcase's Introduction

myBI Market + dbt demo project

This repo guides you through building analyics for myBI Market with Modern Data Stack:

  • myBI Connect (Extract - Load tool)
  • Clickhouse (Database)
  • dbt (Transformations) + mybi-dbt-core module
  • Metabase (Business Intelligence)
  • Github Actions (Continuous Integration + Deployment)

Table of Contents

  1. Configure environment
  2. Install and configure mybi_dbt_core package
  3. Configure data sources
  4. Build staging layer models
  5. Model Data Marts
  6. Set up metrics layer
  7. Visualize on a dashboard
  8. Publish dbt project docs
  9. Introduce Continuous Integration

Configure environment

  1. Install prerequisites:

  2. Fork & Clone this repository and open in IDE

  3. Spin up Docker containers

    All the services are configured via Docker containers.

    • Clickhouse
    • Metabase
    • Cube
    # launch containers: clickhouse, metabase, cube
    docker-compose build --no-cache
    docker-compose up -d
  4. Open dev container with dbt installed

    devcontainer open .
    
    # test connections
    dbt --version
    dbt debug

    Spin up Docker containers

Install and configure mybi_dbt_core package

  1. Install module via packages.yml

    dbt clean # clean temp files
    dbt deps # install dependencies (modules)
  2. ✅ Enable only relevant data models (and disable the rest)

    We will use specific data sources:

    Confirm with command: dbt ls --resource-type model -s tag:staging

    Confirm specfic data sources enabled

  3. ✅ Assign variables

    Certain variable values have to be set:

    • source database connection details
    • database and schema name to find mybi source data
    • specific account_id values to be included
  4. ✅ Turn on custom schema management

    I use generate_schema_name.sql macro to set target schema names:

    Renders a schema name given a custom schema name. In production, this macro will render out the overriden schema name for a model. Otherwise, the default schema specified in the active target is used.

    Take a look at custom_schema_management.sql macro to find out more.

Configure data sources

  1. Create Clickhouse database of PostgreSQL Database Engine with source data:

    dbt run-operation init_source_data

    Initialized myBI source database

Build staging layer models

Staging models are alredy configured for you in mybi_dbt_core package:

All you need to do is just build these models in one command:

dbt build -s tag:staging

Build staging models

Model Data Marts

With staging models in place we now can proceed to data modeling.

  1. Intermediate models include:

    dbt build -s tag:intermediate
  2. Data Marts include:

    dbt build --full-refresh -s tag:marts

    Take a look at the project graph (DAG):

    Data marts graph

Set up metrics layer

dbt Metrics

Cube Metrics

Cube Playground

Visualize on a dashboard

Now we are ready to visualize key metrics on a dashboard.

I have configured Clickhouse connection and prepared Metabase dashboard which you can access at http://localhost:3000/dashboard/1-mybi-tutorial:

Explore data from Metabase dashboard

You may explore data from Metabase yourself or even build your own dashboard.

Publish dbt project docs

dbt Docs can be easily served locally on http://localhost:8080:

dbt docs generate
dbt docs serve

Or you may access pre-build version from Github Pages:

Access dbt project docs

Introduce Continuous Integration

Let's say you want to introduce some code changes. How do you ensure data quality?

You protect your main branch and require Pull Requests to have:

Continuous Integration Checks

This way you make sure to deploy high quality and functional code.

Contributing

If you have any questions or comments please create an issue.

mybi-dbt-showcase's People

Contributors

kzzzr avatar katrindenek avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.