GitHub Gems: Driving Open-Source Investments With Data

Welcome to the GitHub Gems project! This project hosts a data analytics pipeline that enables smarter investment decisions by measuring the popularity of open-source repos on GitHub.

Project Overview

The goal of this project is to develop an efficient data pipeline that streamlines analytics, reduces manual effort, and enables deeper insights into the open-source ecosystem on GitHub. By leveraging modern data tools and best practices, such as dbt (data build tool) and Airflow, we aim to create a scalable and reliable solution for data-driven decision-making.

Getting Started

To get started with the GitHub Gems project, follow these steps (click on the links for guides):

Set up your IDE

ℹ️ Skip some steps if you're already set!

If you already have git, VSCode, and/or Python installed, just skip the corresponding step(s).

  1. If you don't already use git, install it from the official git downloads page.

  2. If you don't have a coding editor installed, install VSCode. After that, install the Python extension for VSCode.

  3. Make sure you have Python 3 installed (or install it from the official Python downloads page).
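
To confirm the tools from the steps above are available before moving on, a quick check along these lines may help. This is a minimal sketch, not part of the original guide: the helper name `check_tool` is hypothetical, and it assumes the executables are called `git`, `python3`, and `code` (the VSCode command-line launcher), which can differ per platform.

```shell
#!/bin/sh
# Report whether a command-line tool is available on PATH.
# Prints "<name>: found" or "<name>: missing"; returns 1 when missing.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
        return 1
    fi
}

# Assumed executable names; adjust them for your platform
# (for example, Python may be installed as `python` rather than `python3`).
for tool in git python3 code; do
    check_tool "$tool" || true
done
```

Any tool reported as missing corresponds to a step above that still needs to be completed.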

Create your personal repo

  1. Create a new repo in your GitHub account and name it github-stars-pipeline.

  2. Clone this repo.

git clone https://github.com/edsioufi/github-stars-pipeline.git

  3. Point your local clone to your own remote (so that you modify your copy of the repo, not the template). Make sure you replace {your_github_username} with your GitHub username.

cd github-stars-pipeline
git remote set-url origin https://github.com/{your_github_username}/github-stars-pipeline.git

  4. Push to your new GitHub repo.

git push origin master
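
The remote switch in the steps above is easy to get wrong, so it can be worth wrapping it and printing the result. The following is a sketch under stated assumptions: the function names `remote_url` and `point_origin_to_own_repo` are hypothetical helpers, not part of the project, and the second one is assumed to run from inside the cloned github-stars-pipeline directory.

```shell
#!/bin/sh
# Build the HTTPS remote URL for your copy of the repo.
# $1: your GitHub username.
remote_url() {
    printf 'https://github.com/%s/github-stars-pipeline.git\n' "$1"
}

# Re-point origin at your copy, then list the remotes so you can
# confirm that origin no longer refers to the template repo.
# $1: your GitHub username.
point_origin_to_own_repo() {
    git remote set-url origin "$(remote_url "$1")"
    git remote -v
}
```

Calling `point_origin_to_own_repo` with your username as the only argument should list origin with your own account in the URL for both fetch and push.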

Set up your Python environment and DuckDB

  1. Create a Python virtual environment for your repo:

python -m venv venv
source venv/bin/activate

  2. Install DuckDB (make sure you select the Python option), your first Python dependency.

ℹ️ You might have to install additional dependencies if you're on Windows.

  3. Install DBeaver to explore DuckDB.

  4. Create a new git branch:

git checkout -b add_duck_db

  5. Add your newly installed packages to your requirements file:

pip freeze > requirements.txt

  6. Commit and push:

git add --all
git commit
git push origin -u add_duck_db

  7. Create a Pull Request (PR) in GitHub.

  8. Merge your first PR.
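
The environment steps above (create the virtual environment, install DuckDB, freeze the requirements) can be sketched as one script. This is an illustration under assumptions, not the project's own tooling: `find_activate` and `setup_env` are hypothetical names, the DuckDB "Python option" is assumed to mean `pip install duckdb`, and the lookup of the activate script reflects that venv places it under `bin/` on macOS and Linux and under `Scripts/` on Windows.

```shell
#!/bin/sh
# Locate the activate script of a virtual environment.
# $1: path to the virtual environment directory.
# Prints the path and returns 0 if found; returns 1 otherwise.
find_activate() {
    for candidate in "$1/bin/activate" "$1/Scripts/activate"; do
        if [ -f "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Create the environment, install DuckDB, run a trivial query as a
# smoke test, and record the installed packages.
setup_env() {
    python -m venv venv || return 1
    activate="$(find_activate venv)" || return 1
    . "$activate"
    python -m pip install duckdb || return 1
    # Assumption: a successful in-memory query means the install works.
    python -c "import duckdb; print(duckdb.connect().execute('SELECT 42').fetchone())"
    python -m pip freeze > requirements.txt
}
```

Running `setup_env` from the repo root covers the environment steps in one go. The branch, commit, and PR steps are left manual because they depend on your GitHub account.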
