eneiromatos / the-home-depot-web-scraper

This web scraper extracts data from The Home Depot website. It can be run locally or on the Apify platform; the latter is the preferred way. It was built with Apify SDK v3 (Crawlee) in TypeScript.

Home Page: https://apify.com/eneiromatos/home-depot-web-scraper

License: GNU Affero General Public License v3.0

Dockerfile 6.64% TypeScript 93.36%
dataextraction scraper typescript webscrapping

the-home-depot-web-scraper's Introduction

Home Depot Web Scraper

Home Depot Web Scraper automates data extraction from Home Depot, the largest home improvement retailer in the United States. Scrape products by search query, category URL, or product URL, and extract information such as:

  • Title, brand, pricing and availability.
  • Product description and variations.
  • Product images and specifications.

Export the accumulated data to HTML, JSON, CSV, Excel, or XML. Don't waste your time looking for other tools: this is the ultimate Home Depot Web Scraper.

Input

These are the inputs used by the actor. You can use any one of them or all of them together:

Category URLs

The category URLs where you want to search for products.

Keywords

The keywords you want to use to search for products.

Product URLs

The URLs of the individual products you want to scrape.
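As a rough illustration of how the three inputs can be combined, here is a minimal TypeScript sketch. The field names (`categoryUrls`, `keywords`, `productUrls`), the `countSources` helper, and the example URL are all hypothetical; the actor's real input schema may use different names and shapes.

```typescript
// Hypothetical input shape: these field names are illustrative assumptions,
// not the actor's documented schema.
interface ScraperInput {
  categoryUrls?: string[];
  keywords?: string[];
  productUrls?: string[];
}

// Counts how many start sources (URLs and keywords) the input provides in total.
function countSources(input: ScraperInput): number {
  return (
    (input.categoryUrls?.length ?? 0) +
    (input.keywords?.length ?? 0) +
    (input.productUrls?.length ?? 0)
  );
}

// Example: one keyword plus one product URL (placeholder value), no category URLs.
const exampleInput: ScraperInput = {
  keywords: ["cordless drill"],
  productUrls: ["https://example.com/placeholder-product-url"],
};
```

Any of the three fields can be omitted, which mirrors the "any of them or all of them" behavior described above.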

Pagination Options

These are the options to control the actor's pagination:

Start Page

The first results page from which to start extracting products. Works with categories and keywords.

Last Page

The last results page at which to stop extracting products. Works with categories and keywords.

Scrape all pages

When set to ON, the scraper crawls all the results pages. Works with categories and keywords.
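The way these three options might interact can be sketched as follows. This is not the actor's actual code: the option names, the `resolvePages` helper, and the rule that "Scrape all pages" overrides both Start Page and Last Page are assumptions made for illustration.

```typescript
// Hypothetical pagination options: names mirror the labels above but are assumptions.
interface PaginationOptions {
  startPage?: number;
  lastPage?: number;
  scrapeAllPages?: boolean;
}

// Returns the 1-based results page numbers to visit, given how many pages exist.
function resolvePages(options: PaginationOptions, totalPages: number): number[] {
  // Assumption: "Scrape all pages" ignores Start Page and Last Page entirely.
  const first = options.scrapeAllPages ? 1 : Math.max(1, options.startPage ?? 1);
  // Assumption: with no Last Page given, only the start page is visited.
  const requestedLast = options.scrapeAllPages ? totalPages : (options.lastPage ?? first);
  // Never go past the last page that actually exists.
  const last = Math.min(requestedLast, totalPages);

  const pages: number[] = [];
  for (let page = first; page <= last; page++) {
    pages.push(page);
  }
  return pages;
}
```

For example, under these assumptions `resolvePages({ startPage: 2, lastPage: 4 }, 10)` yields pages 2, 3, and 4, while a Last Page beyond the available results is clamped to the final page.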

Segmentation Options

Options for segmenting the scraped products by price.

Minimum Price

Set the minimum price of the scraped products.

Maximum Price

Set the maximum price of the scraped products.
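A price segmentation of this kind can be pictured as a simple range filter. The sketch below assumes inclusive bounds and applies the filter to already-scraped items; the `Product` and `PriceRange` types and the `filterByPrice` helper are hypothetical, and the actor itself may instead pass the range to the site's own search filters.

```typescript
// Hypothetical, simplified product record used only for this illustration.
interface Product {
  title: string;
  price: number;
}

// Hypothetical option names for the Minimum Price and Maximum Price inputs.
interface PriceRange {
  minPrice?: number;
  maxPrice?: number;
}

// Keeps only the products whose price falls inside the range (bounds inclusive).
// A missing bound is treated as "no limit" on that side.
function filterByPrice(products: Product[], range: PriceRange): Product[] {
  const min = range.minPrice ?? 0;
  const max = range.maxPrice ?? Number.POSITIVE_INFINITY;
  return products.filter((product) => product.price >= min && product.price <= max);
}

// Made-up sample data, not real Home Depot listings.
const sampleProducts: Product[] = [
  { title: "Sample hammer", price: 12.5 },
  { title: "Sample drill", price: 99.0 },
  { title: "Sample table saw", price: 450.0 },
];
```

With this sample data, a range of 50 to 200 would keep only the drill, and leaving both bounds unset would keep every item.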

Note

Please report any errors, or let me know your suggestions for improving this scraper.
