
webscraping's Introduction

Web Scraping Use Cases Repository

Welcome to this repository of real-world web scraping scenarios, complete with source code and Jupyter notebooks. It is designed to guide you through web scraping techniques across a range of use cases.

Currently, the repository features code and Jupyter notebooks for scraping job listings from two job search websites, Indeed and LinkedIn. Each of these projects uses popular Python libraries, namely BeautifulSoup and Selenium, and builds an understanding of HTML structure, which is essential for any web scraping task.

The examples are simple and easy to modify. Each script is paired with a Jupyter notebook that breaks the process into small steps, making it straightforward to follow and learn.

Here's a snapshot of what you can discover in this repository:

  • Indeed Job Scraper: Explore the Python script and corresponding Jupyter notebook explaining how to scrape job postings from Indeed.

  • LinkedIn Job Scraper: Discover the Python script and its Jupyter notebook, demonstrating the method to scrape job postings from LinkedIn.

  • Web Scraping Basics: Delve into Jupyter notebooks covering fundamental concepts and techniques such as interpreting HTML structure and using BeautifulSoup and Selenium for web scraping.
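As a rough illustration of what "interpreting HTML structure" with BeautifulSoup can look like, here is a minimal sketch. It is not taken from the repository's scripts: the markup, the class names (`job`, `title`, `company`, `link`), and the `parse_jobs` helper are all hypothetical and exist only for this example. Real pages on Indeed or LinkedIn use different markup that changes over time.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for illustration only. Real job sites use
# different (and frequently changing) tags and class names.
SAMPLE_HTML = """
<ul id="jobs">
  <li class="job">
    <h2 class="title">Data Analyst</h2>
    <span class="company">Acme Corp</span>
    <a class="link" href="/jobs/1">Details</a>
  </li>
  <li class="job">
    <h2 class="title">Python Developer</h2>
    <span class="company">Example Ltd</span>
    <a class="link" href="/jobs/2">Details</a>
  </li>
</ul>
"""


def parse_jobs(html):
    """Return one dict per <li class="job"> element found in the HTML."""
    # "html.parser" is Python's built-in parser, so no extra backend is needed.
    soup = BeautifulSoup(html, "html.parser")
    jobs = []
    for item in soup.find_all("li", class_="job"):
        jobs.append({
            "title": item.find("h2", class_="title").get_text(strip=True),
            "company": item.find("span", class_="company").get_text(strip=True),
            "url": item.find("a", class_="link")["href"],
        })
    return jobs


if __name__ == "__main__":
    for job in parse_jobs(SAMPLE_HTML):
        print(job)
```

The sketch parses an inline string so it runs without network access. For pages that render their listings with JavaScript, the HTML passed to a function like this would typically come from a browser driven by Selenium rather than from a plain HTTP request.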

Every piece of code in this repository also comes with a companion Medium article that offers a more detailed explanation and context. Follow the links provided in each of the sections above.

This repository is still growing. New scripts and notebooks will be added to cover more websites and use cases, with the aim of building a comprehensive resource for anyone keen on mastering web scraping.

Please be mindful and respect the terms of service of the websites you scrape. Enjoy your journey into the world of web scraping and happy coding!

webscraping's People

Contributors

rfeers, for-code-sake, brett-petrusek, aandvalenzuela
