Databricks Workshops

A description of the Databricks workshops and learning material we have developed at Knowit.

Workshops (Knowit toppturer)

These are 2.5-hour hands-on workshops covering important aspects of Databricks.

At Knowit we call these workshops Toppturer. Each one gives quick but meaningful experience with a technology, tool, or framework.

Workshop: Data engineering on Databricks

Link: https://github.com/knowit/AWS-Databricks-NYC-Taxi-Workshop

For: Developers, analysts, data scientists, data engineers.

Pre-requisites: Some Python knowledge

Topics:

  • Basic understanding of components and tools in Databricks
  • Perform data transformation in Spark SQL and PySpark
  • Use Databricks Repos for git-versioned Data Engineering
  • Deploy a Spark job with Databricks Workflows
  • Write ETL code and data quality checks in Delta Live Tables

Workshop: Using LangChain and open LLMs on Databricks

Link: https://github.com/paalvibe/llm-langchain-course

For: Anybody

Topics:

  • Setup and use of LLMs in Databricks
  • Use of the LangChain framework for:
    • LLM-wrapping
    • LLM-serving
    • Summarizing
  • Context embedding with ChromaDB
  • Reformatting
  • Multi query retrieval
  • Prompt engineering

Workshop: LLM Adaptation on Databricks

Link: https://github.com/paalvibe/llm-tune-course

For: Anybody

Topics:

  • What is an LLM (Large Language Model)?
  • Tuning LLMs on Databricks
  • Different modes of adapting LLMs
  • When and when not to train your own LLM?

Workshop: DataOps on Databricks, using git and versioning of tables, jobs and code

Link: https://github.com/paalvibe/databricks-dataops-course

For: Data Engineers, Full stack data scientists, ML Engineers, Data Platform Engineers

Topics:

  • Opinionated git-based approach to DataOps
  • Structure your environments to allow for dev runs of data pipelines
  • Move data pipelines from dev to prod
  • Using git branches and commits to name and manage data and jobs responsibly
  • GitHub Actions is not covered here, but the processes it would need are used
  • Does not cover data quality or pipeline management

Pre-requisites: Some Python knowledge
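
To make the idea of using git branches to name data concrete, here is a minimal sketch of one possible convention. It is not taken from the course material: the function `env_table_name`, the `dev_` prefix, and the example catalog, schema, and table names are all hypothetical. The assumption is that runs from the production branch write to the plain schema, while runs from any other branch write to a schema derived from the branch name, so dev runs never touch prod tables.

```python
import re


def env_table_name(catalog: str, schema: str, table: str,
                   branch: str, prod_branch: str = "main") -> str:
    """Return a fully qualified table name for the given git branch.

    Hypothetical convention: the production branch maps to the plain
    schema, any other branch maps to a branch-specific dev schema.
    """
    if branch == prod_branch:
        return f"{catalog}.{schema}.{table}"
    # Collapse every run of characters that is not a letter, digit, or
    # underscore into a single underscore, so the branch name can be
    # used as part of an identifier.
    slug = re.sub(r"[^0-9a-zA-Z_]+", "_", branch).strip("_").lower()
    return f"{catalog}.dev_{slug}_{schema}.{table}"


# Example names below are made up for illustration.
prod_name = env_table_name("acme", "taxi", "trips", branch="main")
dev_name = env_table_name("acme", "taxi", "trips", branch="feature/Add-Tips")
print(prod_name)
print(dev_name)
```

With a scheme like this, moving a pipeline from dev to prod amounts to merging the branch, since the same code resolves to different table names depending on where it runs.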

FUTURE Workshop: DataOps on Databricks part 2

For: Data Engineers, Full stack data scientists, ML Engineers, Data Platform Engineers

  • How to enable data contracts and data quality checks in pipelines
  • Difference between Delta Live Tables and regular Databricks notebooks

Pre-requisites: Some Python knowledge

Contributors: paalvibe