gh-scraper's Introduction

gh-scraper

This simple tool parses saved Grubhub restaurant pages and outputs the menu items in an easy-to-use JSON format.

It was created for use with McD4Me. Note that this is not an actual Grubhub scraper, since Grubhub loads the content for its pages via HTTP requests. Instead, you feed it a saved HTML file, which it parses and converts to JSON.

Usage

First, make sure BeautifulSoup is installed. See this page for details.
Head to the Grubhub page for your desired restaurant.
Save the HTML page after it loads. This can be done by viewing page source and saving or simply hitting CTRL-S/COMMAND-S on the page.
Move the HTML file to the same directory as scrape.py.
In scrape.py, change the 'FILENAME.html' and 'FILENAME.json' parameters (on lines 38 and 59, respectively), replacing FILENAME with the file name of your choice.
Run python3 scrape.py.

The JSON file is written to the same folder as the input HTML file. For a sample input and its output, see the example folder.
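Once the script has run, the resulting file can be read like any other JSON document. The sketch below is only an illustration: it assumes the output is a single JSON array of menu-item objects with group, name, id, and price keys, as described under JSON output. The helper name load_menu_by_group and the sample file it writes are hypothetical and not part of scrape.py.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def load_menu_by_group(json_path):
    """Load a menu JSON file and bucket its items by their "group" field."""
    with open(json_path, encoding="utf-8") as f:
        items = json.load(f)

    by_group = defaultdict(list)
    for item in items:
        by_group[item["group"]].append(item)
    return dict(by_group)


# Stand-in for a file produced by scrape.py (sample data, not real output).
sample_items = [
    {"group": "Kung Fu Classic", "name": "Kung Fu Black Tea",
     "id": "KunFuBlaTea", "price": 3.58},
]
sample_path = Path(tempfile.mkdtemp()) / "FILENAME.json"
sample_path.write_text(json.dumps(sample_items), encoding="utf-8")

menu = load_menu_by_group(sample_path)
for group, group_items in menu.items():
    print(group, [entry["name"] for entry in group_items])
```

Grouping by the group field is just one convenient way to consume the flat array; a downstream project may prefer to index items by id instead.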

JSON output

The JSON output comes in the following format:

{
  "group":"Kung Fu Classic",
  "name":"Kung Fu Black Tea",
  "id":"KunFuBlaTea",
  "price":3.58
}

The tool identifies each menu item's category, name, and price, and generates a string id. It creates one object per menu item with this information and collects all of the objects into a single array.
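The README does not say how the string id is generated. Judging only from the example above, where "Kung Fu Black Tea" becomes "KunFuBlaTea", one plausible rule is to join the first three characters of each word in the name. The sketch below implements that guess; the function names make_item_id and make_menu_item are hypothetical, and the actual logic in scrape.py may differ.

```python
def make_item_id(name):
    """Assumed rule: join the first three characters of each word in the name."""
    return "".join(word[:3] for word in name.split())


def make_menu_item(group, name, price):
    """Assemble one menu-item object in the documented output shape."""
    return {
        "group": group,
        "name": name,
        "id": make_item_id(name),
        "price": price,
    }


item = make_menu_item("Kung Fu Classic", "Kung Fu Black Tea", 3.58)
print(item["id"])  # KunFuBlaTea
```

Note that under this rule two different names can produce the same id, so it should not be treated as guaranteed unique.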
