dbt-data-pipeline

Setup Steps

Google Sheets Setup

  1. Copy this Google Sheet
  • Set up Supermetrics queries to fill in each channel
  • Don't touch anything that is dark grey
  2. Copy the URL of the Sheet, as you'll need it in the BigQuery setup.
  3. Make a copy of the Data Pipeline Reporting Client List. This will serve as the file where you configure your clients.
  • Client - add the client name
  • DMAM - the account manager. This could also be your SEO or any other role. You can change the heading, but you'll have to change the field names downstream in your SQL queries.
  • PPC - the name of the PPC person on the account
  • MPA - the name of the digital marketing assistant on the account
  • Content - the name of the content person on the account
  • Logo URL - enter the URL of the client's logo
  • Client Summary Sheet URL - add the URL of each client's Sheet
  • Client Dashboard URL - add the URL of the client's Data Studio report
  4. Make a copy of this Google Sheet
  • GDS Date - enter the date for each month, one row per month per client, in yyyy-mm-dd format
  • Client - enter the client name (it must match the name in the client list sheet). Consider adding data validation to this column that references a unique list of names from the client list file you created earlier, to limit the opportunity for data-entry errors.
  • budget - enter how much the client pays you for management each month, in the format 1111.00 (don't use currency symbols). A SQL sketch of how these fields are read downstream follows this list.
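
These formats matter once BigQuery reads the sheet. Here is a minimal SQL sketch of how the columns might be parsed downstream; the project, dataset, table, and column names are assumptions for illustration (the Management_in external table created in the BigQuery Setup section is assumed to point at this sheet), not something taken from this repo's models:

```sql
-- Hypothetical downstream query against the sheet once it is exposed in BigQuery.
select
  -- cast to STRING first so the parse works whether autodetect typed the
  -- column as DATE or STRING; returns NULL if the value isn't yyyy-mm-dd
  safe.parse_date('%Y-%m-%d', cast(GDS_Date as string)) as report_month,
  -- must exactly match the client name in the client list sheet
  Client as client_name,
  -- returns NULL if a currency symbol or thousands separator sneaks in
  safe_cast(budget as numeric) as monthly_budget
from `your-gcp-project.Client_Name.Management_in`
```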

Note: this is a work in progress; more on the Sheets setup to come.

Google Cloud Account

  1. Sign Up for a Google Cloud Free Trial Account
  2. Enable the BigQuery API here

Create your BigQuery Project courtesy of dbt's documentation

  1. Go to the BigQuery Console. If you don't have a Google Cloud Platform account, you will be asked to create one.
  2. Create a new project for this tutorial. If you've just created a BigQuery account, you'll be prompted to create a new project straight away. If you already have an existing project, you can select the project dropdown in the header bar and create a new project from there.

BigQuery Credentials for dbt

  1. Go to the BigQuery credential wizard. Ensure that your new project is selected in the header bar.
  2. Generate credentials with the following options:
  • Which API are you using? BigQuery API
  • Are you planning to use this API with App Engine or Compute Engine? No
  • Service account name: dbt-user
  • Role: BigQuery User
  • Key type: JSON
  • Download the JSON file and save it in an easy-to-remember spot, with a clear filename (e.g. dbt-user-creds.json)

BigQuery Setup

  1. Build a new dataset in BigQuery to match the client name, with multiple words separated by underscores.
  2. Build external tables for each of the marketing channels with the following process (an equivalent SQL statement is sketched after the table list below):
  • Click on Client's Dataset
  • Choose Create a Table
  • Table Settings:
    • Create table from Drive
    • Put Sheet URL in Select Drive URI
    • Set File Format to Google Sheets
    • Confirm Project name and Dataset name are correct
    • Set table name
    • Check autodetect schema and input parameters
    • Set "Header rows to skip" to 1.
    • Click Create Table
  3. Tables include:
  • GA_in
  • GA_Goals_in
  • GMB_in
  • GoogleAds_in
  • Mailchimp_in
  • Management_in
  • Microsoft_in
  • Reporting_Client_List_in
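
If you prefer, the same table settings can be expressed as BigQuery DDL instead of clicking through the console. A minimal sketch, assuming a project named your-gcp-project, a dataset named after the client, and a placeholder Sheet URL (swap in your own values, and repeat for each table above):

```sql
-- External table backed directly by the Google Sheet, mirroring the console
-- steps above: Google Sheets format, one header row skipped, schema
-- autodetected because no column list is given.
CREATE EXTERNAL TABLE `your-gcp-project.Client_Name.GA_in`
OPTIONS (
  format = 'GOOGLE_SHEETS',
  uris = ['https://docs.google.com/spreadsheets/d/<your-sheet-id>'],
  skip_leading_rows = 1
);
```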

BigQuery Gotchas

  • If the created table doesn't have the correct header names from your Google Sheet, no big deal. Just rebuild the table.
  • Remember to check autodetect schema and input parameters
  • Remember to Set "Header rows to skip" to 1.

Github Setup

  1. Clone this repo.
  2. Name the new repo "data-pipeline-<client name>" so it matches the client name.
  3. Edit the dbt_project.yml.
    • Line 6: change the name to the client name.
    • Line 33: change CAM_Solar to the client name, with underscores between words.
    • Line 37: change CAM_Solar to match the dataset name found in BigQuery.
    • Line 38: change CAM Solar to match the real client name.
  4. Use this repo as the source for the new dbt project. (A sketch of how the dataset name is referenced downstream follows this list.)
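
The value on line 37 has to match the BigQuery dataset exactly because downstream models read from that dataset. This repo's models may reference the tables differently; the block below is only a hypothetical illustration, with placeholder project, dataset, and file names:

```sql
-- models/staging/stg_googleads.sql (hypothetical file name)
-- The dataset segment (Client_Name) has to match both the dataset created in
-- the BigQuery Setup section and the name configured in dbt_project.yml.
select *
from `your-gcp-project.Client_Name.GoogleAds_in`
```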

Set Up dbt Account + Project courtesy of dbt

  1. Create a dbt Cloud account here. If your organization already has a dbt Cloud account, ask an admin to add you as a Developer.
  2. Click the hamburger menu, then Account Settings, then Projects.
  3. Click "New Project".
  4. Click "Begin" on the next screen.
  5. Name the project "pipeline", and click "Continue".
  6. Click "BigQuery" on Next Screen.
  7. Click "Upload a Service Account JSON file" and paste in the credentials file from the BigQuery Credentials for dbt instructions above.
  8. scroll down to the "Development Credentials and rename the DATASET to match what was entered in line 37 of the Github Setup Section.
  9. Scroll up and click "Test".
  10. After you see "Connection Test Succeeded", click "Continue".
  11. On the next screen, click the "Github" button.
  12. You'll then be prompted to log into Github, or you'll see your repos listed in a table below.
  13. Click the repo you built in the Github Setup section.
  14. After the success message, click the "Continue" button in the top right of the screen.
  15. On the next screen, click "Start Developing".

Learn all about dbt Projects here

Running dbt

Follow dbt's documentation here

Scheduling dbt Runs

dbt's documentation here
