jlumbroso / basic-git-scraper-template

🔬 Starter template for automating web scrapers using GitHub Actions workflows to incrementally commit data to Git 📈 Includes sample script, scheduling, dependency installation, output to CSV/JSON, and ethics guide 🤖 Customizable for diverse sites and use cases!

License: MIT License

Python 79.17% Jupyter Notebook 20.83%
git-scraping github-template template web-scraping

basic-git-scraper-template's Introduction

Hello there! 👋🏻

Faculty at University of Pennsylvania's Department of Computer & Information Sciences. I love to teach, to mentor and advise students, to think "at scale", to build stuff open source, and to expand the circle of people who identify as "programmers."

  • 🔭 I’m currently working on a musical digital humanities project, plus CS education and code grading/teaching projects
  • 🌱 I’m currently learning TypeScript/React/front-end + machine learning
  • 👯 I’m looking to collaborate on open-source projects, especially those that reduce the friction of building
  • 💬 Ask me about scaling, academic peer review, gamification, centralization/decentralization, capitalism, good software engineering practices, veganism 🐮

🎹 Tools for Musical Digital Humanities

  • 🎶 imslp: A Python package to query and retrieve scores from the International Music Score Library Project (IMSLP).

  • 🎼 incipit: A Python package and command line tool to slice a musical score into bars, staves and systems. It was originally designed to extract the first line of each of Domenico Scarlatti's 555 sonatas to create a searchable catalog of incipits.

You can also visit the GitHub organization of the Domenico Scarlatti Foundation.

⚙️ GitHub Templates for your projects

🎲 Probabilistic Algorithms

  • 🌊 Many probabilistic data streaming algorithms, including those I design and study, use families of hash functions. It is hard to find families with good properties (simple, efficient, not too correlated). An affine transform of the CRC32 hash, with factors drawn from a Mersenne Twister, provides a good empirical family. The details are tricky to get right, so I get them right for you!

  • 🙆🏼 Affirmative Sampling (2022), with Conrado Martínez (PDF), is a novel probabilistic sampling algorithm in which the size of the sample grows as a function of the (unknown) number of distinct elements, making it uniquely adaptive to queries that depend on the relative proportion of elements. A reference implementation in Python is available at affirmative-sampling.
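
The hash family idea from the first bullet above (an affine transform of CRC32 with factors drawn from a Mersenne Twister) can be sketched roughly as follows. This is a minimal illustration, not the author's package: the function name `make_hash_family`, the modulus `MODULUS`, and the 32-bit output mask are all assumptions chosen for the example. Python's standard `random.Random` is used because it is backed by the Mersenne Twister (MT19937).

```python
import random
import zlib

# Assumption: a large prime modulus for the affine transform (not from the source).
MODULUS = (1 << 61) - 1
# Assumption: hash values are truncated to 32 bits for illustration.
MASK32 = (1 << 32) - 1


def make_hash_family(count, seed=0):
    """Return `count` hash functions of the form (a * crc32(x) + b) mod p.

    The factors a and b are drawn from random.Random, which uses the
    Mersenne Twister generator. Hypothetical helper, for illustration only.
    """
    rng = random.Random(seed)
    params = [
        (rng.randrange(1, MODULUS), rng.randrange(0, MODULUS))
        for _ in range(count)
    ]

    def make_one(a, b):
        def h(item):
            # zlib.crc32 returns an unsigned 32-bit integer in Python 3.
            base = zlib.crc32(item.encode("utf-8"))
            return ((a * base + b) % MODULUS) & MASK32
        return h

    return [make_one(a, b) for a, b in params]


family = make_hash_family(3, seed=42)
values = [h("hello") for h in family]
```

Seeding the generator makes the family reproducible, which matters when a sketch built with one process has to be compared or merged with one built by another.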

