This simple tool parses saved Grubhub restaurant pages and outputs the menu items in an easy-to-use JSON format.
It was created for use with McD4Me. Note that this is not an actual Grubhub scraper, since Grubhub loads the content for its pages via separate HTTP requests. Instead, feed it a saved HTML file, which it parses and converts to JSON.
- First, make sure BeautifulSoup is installed. See this page for details.
- Head to the Grubhub page for your desired restaurant.
- Save the HTML page after it loads. This can be done by viewing the page source and saving it, or simply by hitting CTRL-S/COMMAND-S on the page.
- Move the HTML file to the same directory as `scrape.py`.
- Change the `'FILENAME.html'` and `'FILENAME.json'` parameters on lines 38 and 59 respectively, replacing `FILENAME` with the file name of your choice.
- Run `python3 scrape.py`.

The JSON file is written to the same folder as the input HTML file. To see an example of the input and output, refer to the `example` folder.
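
The steps above boil down to reading a saved HTML file, walking it with BeautifulSoup, and writing a JSON file next to it. The following is only a rough sketch of that flow, not the contents of `scrape.py`: the function names `parse_menu` and `convert` and the CSS selectors (`div.menu-section`, `div.menu-item`, `.item-name`, `.item-price`) are made up for illustration and will not match Grubhub's real markup. The `id` field is left out here for brevity.

```python
import json
from pathlib import Path

from bs4 import BeautifulSoup


def parse_menu(html):
    """Return a list of menu-item dicts from saved page HTML.

    All selectors below are hypothetical placeholders; inspect the saved
    page to find the class names Grubhub actually uses.
    """
    soup = BeautifulSoup(html, "html.parser")
    items = []
    for section in soup.select("div.menu-section"):
        heading = section.select_one("h3")
        group = heading.get_text(strip=True) if heading else ""
        for node in section.select("div.menu-item"):
            name = node.select_one(".item-name")
            price = node.select_one(".item-price")
            if name is None or price is None:
                continue  # skip entries missing a name or a price
            items.append({
                "group": group,
                "name": name.get_text(strip=True),
                # "$3.58" -> 3.58
                "price": float(price.get_text(strip=True).lstrip("$")),
            })
    return items


def convert(html_path):
    """Parse html_path and write a .json file in the same folder."""
    html_path = Path(html_path)
    items = parse_menu(html_path.read_text(encoding="utf-8"))
    json_path = html_path.with_suffix(".json")
    json_path.write_text(json.dumps(items, indent=2), encoding="utf-8")
    return json_path
```

Calling `convert("FILENAME.html")` would then produce `FILENAME.json` in the same directory, assuming the selectors have been adapted to the saved page.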
Each menu item in the JSON output has the following format:

    {
        "group": "Kung Fu Classic",
        "name": "Kung Fu Black Tea",
        "id": "KunFuBlaTea",
        "price": 3.58
    }

The tool identifies each menu item's category, name, and price, and generates a string id. It creates an object for each menu item with this information and collects all of the objects in a single array.
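
As an illustration of how such records might be assembled, here is a minimal sketch. The id rule used here (the first three characters of each word in the name, concatenated) is only inferred from the single example `"Kung Fu Black Tea"` → `"KunFuBlaTea"`; `scrape.py` may handle punctuation or other cases differently. The helper names `make_id` and `build_record` are hypothetical.

```python
import json


def make_id(name, prefix_len=3):
    """Join the first prefix_len characters of each word in name.

    Assumed rule, inferred from one example; not confirmed by scrape.py.
    """
    return "".join(word[:prefix_len] for word in name.split())


def build_record(group, name, price):
    """Build one menu-item object in the output format."""
    return {
        "group": group,
        "name": name,
        "id": make_id(name),
        "price": price,
    }


# Every menu item becomes one object; all objects go into one array.
menu = [build_record("Kung Fu Classic", "Kung Fu Black Tea", 3.58)]
print(json.dumps(menu, indent=2))
```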