Overview: In this mini project, we build an ETL pipeline that uses Python, Pandas, and Python dictionary methods to extract and transform the data. The transformed data is written to four CSV files, which are then used to design an ERD and a table schema. Once the schema is created, the CSV data is loaded into a Postgres database.
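As a rough illustration of the extract-and-transform step, the sketch below parses a column of JSON strings into dictionaries, uses dictionary methods to pull out keys and values, and builds a clean DataFrame ready to be written to CSV. The column name contact_info, the field names, and the sample rows are hypothetical and are not taken from the project's actual data.

```python
import json

import pandas as pd

# Hypothetical raw data: each row stores one contact as a JSON string.
raw = pd.DataFrame({
    "contact_info": [
        '{"contact_id": 101, "name": "Ada Lovelace", "email": "ada@example.com"}',
        '{"contact_id": 102, "name": "Alan Turing", "email": "alan@example.com"}',
    ]
})

# Extract: parse each string into a dict, then use dict methods
# (.keys() and .values()) to collect the column names and row values.
records = [json.loads(row) for row in raw["contact_info"]]
columns = list(records[0].keys())
values = [list(record.values()) for record in records]

# Transform: build a DataFrame and split the full name into two columns.
contacts = pd.DataFrame(values, columns=columns)
contacts[["first_name", "last_name"]] = contacts["name"].str.split(" ", n=1, expand=True)
contacts = contacts.drop(columns="name")

# Export: with no path, to_csv returns the CSV text; pass a filename
# such as "contacts.csv" (a hypothetical name) to write a file instead.
csv_text = contacts.to_csv(index=False)
print(csv_text)
```

Building the rows from dictionaries keeps the parsing logic explicit, which can be easier to debug than a one-step conversion when the raw strings are inconsistent.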
Installation: Clone the repository. Install the required Python packages: pandas and openpyxl. Set up a PostgreSQL server and create a new database called crowdfunding_db.
How to execute: Run the Jupyter Notebook to extract and transform the data and create the CSV files. Use the provided crowdfunding_db_schema.sql file to create the tables in the PostgreSQL database. Import each CSV file into its corresponding PostgreSQL table. Query the database to verify that the data has loaded correctly.
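One possible way to support the final verification step is to count the data rows in each CSV file and compare those numbers with the row counts reported by PostgreSQL. The sketch below uses only the Python standard library. It assumes, hypothetically, that each CSV file is named after its target table and that the files sit in the current directory; adjust both to match the repository.

```python
import csv
from pathlib import Path


def csv_row_count(path):
    """Return the number of non-empty data rows in a CSV file, excluding the header."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the header row
        return sum(1 for row in reader if row)


def expected_counts(csv_dir):
    """Map each CSV file's stem (assumed to match its table name) to its row count."""
    return {
        path.stem: csv_row_count(path)
        for path in sorted(Path(csv_dir).glob("*.csv"))
    }


if __name__ == "__main__":
    # "." is an assumed location for the generated CSV files.
    for table, count in expected_counts(".").items():
        # Print a query to run in PostgreSQL, with the expected result as a comment.
        print(f"SELECT COUNT(*) FROM {table};  -- expect {count}")
```

If a table's count in PostgreSQL differs from the printed value, the import for that table was likely incomplete or was run more than once.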
Tech stack used: Python, Pandas, PostgreSQL, Jupyter Notebook, Excel