This repository accompanies the presentation by Joy Payton entitled "Driving Data Insights with Google BigQuery: Trying it on for Size", given in February 2023. The goal of that presentation is to help learners understand how BigQuery can drive data insights in different ways for different stakeholders.
Caper Role | Data Talent |
---|---|
Mastermind | Your C-Suite / Senior Leader, who needs the big picture in a dashboard |
Insider | Your data subject matter expert and data provisioner, who wants to understand ETL |
Hacker | The data analyst who throws together quick and dirty notebooks for exploratory data analysis |
Safecracker | The data scientist who gets at the deep insights through machine learning |
Your insider is the data SME, the person who knows the data well.
(Department heads, DBAs, data lake contributors…)
Concern:
"How do I get the data into BigQuery?"
Info for the Insider
- BigQuery is for tabular data, and you can get data into BigQuery in a number of ways.
  - Object Storage for flat files like .csvs via Google Cloud Storage
  - Ingestion of various types of flat file data directly into Google BigQuery
  - Integration with other public cloud providers like AWS
  - Streaming data options
- Data organization:
  - Projects hold
    - Datasets which hold
      - Tables
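The project, dataset, and table hierarchy above shows up directly in how a table is named in queries. The following is a minimal sketch of that naming, not part of the presentation materials; the `TableRef` class, `parse_table_ref` helper, and the example IDs are all hypothetical, and the sketch assumes a plain project ID without dots or a domain prefix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableRef:
    """Hypothetical helper mirroring the project > dataset > table hierarchy."""

    project: str
    dataset: str
    table: str

    def qualified(self) -> str:
        # Dotted form: project.dataset.table
        return f"{self.project}.{self.dataset}.{self.table}"

    def sql_name(self) -> str:
        # Backticks let the name be used in a query even when the
        # project ID contains hyphens.
        return f"`{self.qualified()}`"


def parse_table_ref(path: str) -> TableRef:
    """Split 'project.dataset.table' into its three levels."""
    parts = path.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected project.dataset.table, got {path!r}")
    return TableRef(*parts)


# Example with made-up names.
ref = TableRef(project="my-project", dataset="heist", table="vault_log")
print(ref.sql_name())
```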
Tasks this crew member will be interested in:
- Creating a Google Cloud Platform Project
- Enabling BigQuery within a project
- Enabling Google Cloud Storage within a project
  - Adding a new bucket
  - Adding files and folders to a bucket
- Adding a dataset to BigQuery
- Adding data to a dataset in BigQuery
  - From files in Google Cloud Storage
  - From local data
  - From third party connections (e.g. AWS, Azure)
  - Other methods (two dozen and counting!)
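For the Insider who prefers code to the console, loading a CSV file from a bucket into a table might look like the sketch below. It assumes the `google-cloud-bigquery` Python package is installed and that credentials are already configured; the bucket, file, and table names are placeholders, and `gcs_uri` is a hypothetical helper rather than part of any library.

```python
def gcs_uri(bucket: str, object_path: str) -> str:
    """Build a gs:// URI from a bucket name and an object path (hypothetical helper)."""
    return f"gs://{bucket}/{object_path.lstrip('/')}"


def load_csv_from_gcs(uri: str, table_id: str) -> int:
    """Load a CSV file from Cloud Storage into a BigQuery table.

    `table_id` is a 'project.dataset.table' string. Returns the number of
    rows in the table once the load job has finished.
    """
    # Imported here so the helper above works without the package installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,  # assumes the file has a header row
        autodetect=True,      # let BigQuery infer the schema
    )
    load_job = client.load_table_from_uri(uri, table_id, job_config=job_config)
    load_job.result()  # wait for the load job to complete
    return client.get_table(table_id).num_rows


# Placeholder names; running the load requires a real project and bucket.
# rows = load_csv_from_gcs(gcs_uri("my-bucket", "raw/data.csv"),
#                          "my-project.my_dataset.my_table")
```

Schema autodetection is convenient for a first try; an explicit schema is usually the safer choice once the table matters.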
Your mastermind is a senior decision maker who cares about the big picture.
(VPs, C-Suite…)
Concern:
"How can I see insights via a dashboard I can understand easily?"
Info for the Mastermind
- Dashboards built with Looker Studio are easy to create and change, and existing staff will pick this up right away
- There's some built-in data privacy double-checking
- Sharing is simple and intuitive
Tasks this crew member will be interested in:
Your hacker is a data analyst who loves working in a Jupyter notebook for quick exploration.
(Data analysts, developers, data scientists…)
Concern:
"How can I access BigQuery data from a notebook?"
Info for the Hacker:
- Google Colab provides multi-language support for notebooks
- Colab is very well integrated with BigQuery
- A boilerplate Colab notebook is just one click away!
Tasks this crew member will be interested in:
- Automatic creation of Colab boilerplate
- Manual method: authenticating to GCP in a Colab notebook
- Manual method: adding BigQuery data in a Colab notebook
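The two manual steps can be sketched as follows. This is an illustration rather than the notebook from the presentation: it assumes it runs inside Colab (where the `google.colab` module is available), and the project ID, table name, and the `preview_query` helper are placeholders.

```python
def authenticate_in_colab() -> None:
    """Prompt for Google sign-in. Only works inside a Colab runtime."""
    from google.colab import auth

    auth.authenticate_user()


def preview_query(table_id: str, limit: int = 10) -> str:
    """Build a small preview query for a 'project.dataset.table' ID (hypothetical helper)."""
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return f"SELECT * FROM `{table_id}` LIMIT {limit}"


def query_to_dataframe(project_id: str, sql: str):
    """Run a query and return the result as a pandas DataFrame."""
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    return client.query(sql).to_dataframe()


# In a Colab cell, with placeholder names swapped for real ones:
# authenticate_in_colab()
# df = query_to_dataframe("my-project",
#                         preview_query("my-project.my_dataset.my_table"))
# df.head()
```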
Your safecracker is a data scientist who wants to get valuable insights that lead to predictive value.
(Data scientists, CDO, data engineers…)
Concern:
"How can I do machine learning?"
Info for the Safecracker:
- BigQuery ML steps:
  - Create a dataset to hold your model(s)
  - Write a model creation query in SQL
  - Inspect your model
  - Use your model for prediction
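The four steps above can be sketched as SQL strings assembled in Python. This is one possible shape, not the queries used in the presentation: the model type, the label column, and every project, dataset, and table name are assumptions chosen for illustration, and the helper functions are hypothetical.

```python
def create_model_sql(model_id: str, training_table: str, label_column: str,
                     model_type: str = "LOGISTIC_REG") -> str:
    """Step 2: a model creation query (the model type here is an assumption)."""
    return (
        f"CREATE OR REPLACE MODEL `{model_id}`\n"
        f"OPTIONS (model_type = '{model_type}', "
        f"input_label_cols = ['{label_column}']) AS\n"
        f"SELECT * FROM `{training_table}`"
    )


def evaluate_model_sql(model_id: str) -> str:
    """Step 3: inspect the model via its evaluation metrics."""
    return f"SELECT * FROM ML.EVALUATE(MODEL `{model_id}`)"


def predict_sql(model_id: str, input_table: str) -> str:
    """Step 4: use the model to predict on new rows."""
    return f"SELECT * FROM ML.PREDICT(MODEL `{model_id}`, TABLE `{input_table}`)"


def run_steps(project_id: str, dataset: str) -> None:
    """Run all four steps; needs google-cloud-bigquery and credentials."""
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    # Step 1: a dataset to hold the model(s).
    client.create_dataset(f"{project_id}.{dataset}", exists_ok=True)
    model_id = f"{project_id}.{dataset}.example_model"       # placeholder
    training = f"{project_id}.{dataset}.training_data"       # placeholder
    new_rows = f"{project_id}.{dataset}.new_data"            # placeholder
    client.query(create_model_sql(model_id, training, "label")).result()
    for row in client.query(evaluate_model_sql(model_id)).result():
        print(dict(row))
    for row in client.query(predict_sql(model_id, new_rows)).result():
        print(dict(row))


print(create_model_sql("my-project.models.example_model",
                       "my-project.data.training_data", "label"))
```

Keeping the SQL in small functions makes it easy to paste the same statements into the BigQuery console instead of running them from Python.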
Tasks this crew member will be interested in: