Using Python v3.8.10 on Ubuntu Linux v20.04, with Docker v20.10.8, docker-compose v1.29.2 and DBT v0.20.2.
- Pre-setup for Ubuntu/Debian distros:

  ```shell
  sudo apt-get install git libpq-dev python-dev python3-pip
  sudo apt-get remove python-cffi
  sudo pip install -U cffi
  pip install cryptography~=3.4
  ```
- Create a Python virtual environment and activate it:

  ```shell
  python3 -m venv .venv; source .venv/bin/activate
  ```
- Install libraries (upgrading `pip` if necessary):

  ```shell
  pip install -U wheel setuptools; pip install -U dbt
  ```
- Initialize the DBT project and go inside it:

  ```shell
  dbt init DBT-with-Postgres; cd DBT-with-Postgres
  ```
- Initialize Git:

  ```shell
  git init
  ```
- Build the Docker containers specified in the `docker-compose.yml` file:

  ```shell
  docker-compose up --build -d
  ```
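The `docker-compose.yml` itself is not reproduced here. As a rough orientation, a minimal sketch consistent with the connection details used elsewhere in this setup (Postgres exposed on host port 5434, user/password `destdb1`, database `destdb`, plus a PySpark notebook container) might look like the following — the service names and image tags are assumptions, not the actual file:

```yaml
# Hypothetical sketch of docker-compose.yml; the real file may differ.
version: "3.8"
services:
  postgres-dest:            # destination database dbt writes to (assumed name)
    image: postgres:13
    environment:
      POSTGRES_USER: destdb1
      POSTGRES_PASSWORD: destdb1
      POSTGRES_DB: destdb
    ports:
      - "5434:5432"         # host port 5434 matches the profiles.yml connection
  pyspark-notebook:         # container the Postgres JDBC jar is copied into (assumed)
    image: jupyter/pyspark-notebook
    ports:
      - "8888:8888"
```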
- Copy over the Postgres JDBC driver to the `jars` directory in `SPARK_HOME=/usr/local/spark`:

  ```shell
  docker cp drivers/postgresql-42.2.23.jar dbt-with-postgres_pyspark-notebook_1:/usr/local/spark/jars/
  ```
- Create a `profiles.yml` file in the `~/.dbt/` directory and add the following:

  ```yaml
  darmadia: # the profile name, typically a company name, with one profile for each warehouse
    target: dev-postgres-dest
    outputs:
      dev-postgres-dest:
        type: postgres # type of connection
        host: localhost
        user: destdb1
        password: destdb1
        port: 5434
        dbname: destdb
        schema: stage
        threads: 4
        keepalives_idle: 0
        # search_path: [optional, override the default postgres search_path]
        # role: [optional, set the role dbt assumes when executing queries]
        # sslmode: [optional, set the sslmode used to connect to the database]
  ```
- Run the debug command to test the connection to the database:

  ```shell
  dbt debug
  ```
- Run the example models to confirm that the setup is successful:

  ```shell
  dbt run
  ```
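For context on what `dbt run` actually builds: the starter project generated by `dbt init` ships with example models under `models/example/`. The first one looks roughly like the sketch below — this is an approximation of the dbt v0.20 starter template, not a file from this project:

```sql
-- Approximate content of models/example/my_first_dbt_model.sql
-- created by dbt init; the actual template may differ slightly.
{{ config(materialized='table') }}

with source_data as (
    select 1 as id
    union all
    select null as id
)

select *
from source_data
```

`dbt run` compiles this Jinja-templated SQL and materializes the result as a table in the `stage` schema of the `destdb` database configured in `profiles.yml`.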