Welcome to the GitHub Gems project! This project hosts a data analytics pipeline that enables smarter investment decisions by measuring the popularity of open-source repos on GitHub.
The goal of this project is to develop an efficient data pipeline that streamlines analytics, reduces manual effort, and enables deeper insights into the open-source ecosystem on GitHub. By leveraging modern data tools and best practices, such as dbt (data build tool) and Airflow, we aim to create a scalable and reliable solution for data-driven decision-making.
To get started with the GitHub Gems project, follow these steps (click on the links for guides):
ℹ️ Skip some steps if you're already set!
If you already have git, VSCode, and/or Python installed, just skip the corresponding step(s).
- If you don't already use git, install it here.
- If you don't have a coding editor installed, install VSCode. After that, install the Python and Python extension.
- Make sure you have Python 3 installed (or install it here).
- Create a new repo in your GitHub account and name it github-stars-pipeline.
- Clone this repo.
git clone https://github.com/edsioufi/github-stars-pipeline.git
- Point your local clone to your own remote (so that you can modify your copy of the repo, not the template). Make sure you replace {your_github_username} with your own GitHub username.
cd github-stars-pipeline
git remote set-url origin https://github.com/{your_github_username}/github-stars-pipeline.git
- Push to your new GitHub repo.
git push origin master
- Create a Python virtual environment for your repo:
python -m venv venv
source venv/bin/activate
- Install DuckDB (make sure you select the Python option), your first Python dependency.
ℹ️ You might have to install additional dependencies if you're on Windows.
- Install DBeaver to explore DuckDB.
- Create a new git branch:
git checkout -b add_duck_db
- Add your newly installed packages to your requirements file:
pip freeze > requirements.txt
- Commit and push:
git add --all
git commit
git push origin -u add_duck_db
- Create a Pull Request (PR) on GitHub.
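
Before opening the PR, it can help to confirm that the setup steps above took effect. The sketch below is not part of the repo; the helper names (`in_venv`, `has_duckdb_pin`, `origin_is_own_repo`) are hypothetical. It assumes you run it from the repo root, that `pip freeze` recorded DuckDB as a `duckdb==<version>` line, and that your activate script exports `VIRTUAL_ENV`.

```shell
#!/usr/bin/env sh
# Sanity checks for the setup steps above.
# All function names are hypothetical helpers, not part of the repo.

# Succeeds if a virtual environment is active
# (the venv activate script exports VIRTUAL_ENV).
in_venv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

# Succeeds if the given requirements file pins duckdb.
# Assumes the `name==version` format written by `pip freeze`.
has_duckdb_pin() {
    grep -qi '^duckdb==' "${1:-requirements.txt}" 2>/dev/null
}

# Succeeds if `origin` no longer points at the template repo.
origin_is_own_repo() {
    url=$(git remote get-url origin 2>/dev/null) || return 1
    case "$url" in
        *edsioufi/github-stars-pipeline*) return 1 ;;
        *) return 0 ;;
    esac
}

# Report each check without aborting on the first failure.
for check in in_venv has_duckdb_pin origin_is_own_repo; do
    if "$check"; then
        echo "ok:   $check"
    else
        echo "FAIL: $check"
    fi
done
```

A `FAIL` line points back to the matching step: activate the virtual environment, rerun `pip freeze > requirements.txt`, or rerun `git remote set-url` with your own username.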