minimum_sugar's Issues
Refactor `menu_histogram` to handle SQLite
Flatten structure of data
The menu item data is presently held in a data structure with unnecessary nesting. It should instead be a flat list of dicts, with each dict representing an individual menu item. Presently, organizing the data by restaurant needlessly duplicates data and adds complexity.
This new, flatter structure requires that I write new tools to filter the data.
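A minimal sketch of the flattening and of the kind of filter tool it calls for. It assumes the nested structure is a dict mapping each restaurant name to a list of item dicts; that layout, the function names `flatten_menu_data` and `filter_items`, and the sample values are all hypothetical and only for illustration.

```python
def flatten_menu_data(nested):
    """Turn {restaurant_name: [item, ...]} into one flat list of item dicts.

    Assumption: this nested layout is hypothetical; the real structure
    in the repo may differ.
    """
    flat = []
    for brand_name, items in nested.items():
        for item in items:
            record = dict(item)  # copy so the input is not mutated
            # Keep the restaurant on each record so flattening loses nothing.
            record.setdefault("brand_name", brand_name)
            flat.append(record)
    return flat


def filter_items(menu_data, key, value):
    """Return the items whose `key` field equals `value`."""
    return [item for item in menu_data if item.get(key) == value]


# Placeholder data, invented for this example only.
nested = {
    "Wendy's": [{"item_name": "Item A", "nf_sugars": 6}],
    "Taco Bell": [{"item_name": "Item B", "nf_sugars": 3}],
}
menu_data = flatten_menu_data(nested)
wendys = filter_items(menu_data, "brand_name", "Wendy's")
print(wendys)
```

With a flat list, every filter is a single pass over the same kind of record, regardless of which field is being matched.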
Check categorization of menu items using reported values
Some of the text reports generated in `report.ipynb` have results that indicate a problem in categorization. For example, in the "Maximum sugar" subsection the code
wendys_menu_items = minimum_sugar.filter_menu_items(menu_data, "brand_name", "Wendy's")
wendys_entree_items = minimum_sugar.filter_menu_items(wendys_menu_items, "menu_category", "entree")
wendys_high_sugar_menu_items = [menu_item for menu_item in wendys_entree_items if menu_item["nf_sugars"] > 18]
for menu_item in wendys_high_sugar_menu_items:
    print menu_item["item_name"] + ":", menu_item["nf_sugars"]
prints "Double Chocolate Chip Cookie: 28". A cookie is not an entree, so that menu item is clearly miscategorized.
Separate non-entree menu items
I only care about the sugar content of the entree menu items for any particular restaurant. Thus I should separate the following categories:
- beverage
- condiment
- side order
- dessert
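One way the separation could look, as a sketch only. It assumes each menu item dict carries a `menu_category` field whose values match the labels in the list above; the exact category strings and the function name `split_entrees` are assumptions, not the repo's actual code.

```python
# Assumed category labels; the strings stored in the real data may differ.
NON_ENTREE_CATEGORIES = {"beverage", "condiment", "side order", "dessert"}


def split_entrees(menu_data):
    """Split a flat list of item dicts into (entrees, non_entrees).

    Items whose category is missing or unrecognized end up in the
    entree list, so they still need a manual check.
    """
    entrees, non_entrees = [], []
    for item in menu_data:
        if item.get("menu_category") in NON_ENTREE_CATEGORIES:
            non_entrees.append(item)
        else:
            entrees.append(item)
    return entrees, non_entrees
```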
Tag and release
I am nearing the point where I can release this report on my blog a la #11. I will tag the commit that gets posted to the blog using the YYYYMMDD rubric (I don't think semver applies here). Additionally, I need to tag a commit for which the notebook cells containing plotting directives have been executed and the plots generated.
Post report to blog on jrsmith3.github.io
Once #10 is closed, post the report to my blog.
Note the url of this repo in `report.ipynb`
Eventually I am going to post `report.ipynb` on my blog (cf. #11). A link to this `minimum_sugar` repo should appear in that file so that people can see the source.
Plot histogram of sugar data
I can plot histograms of the sugar content of the menu items. These would probably make a nice visualization.
Separate report ipynb from data wrangling ipynb
As of 36b5061, `report.ipynb` contains both data wrangling code and report copy/code. The data wrangling component should be separated into its own notebook.
Collect list of all restaurants and corresponding UIDs Nutritionix has
In closing issue #1, I got a small subset of the restaurant names and corresponding UIDs contained in the Nutritionix database. I should get the UIDs for all of the restaurants in the database.
Functionality to grab all of the nutrition data for all of a restaurant's menu items
I need to be able to fetch a list of menu items given an arbitrary restaurant UID; each item in that list should contain all of the nutritional data available.
Refactor `menu_histogram` to use pyplot.hist
See this SO question for an example of using `pyplot.hist` as opposed to the `pyplot.bar` I was using.
Download data I need
I want a local copy of this data so I don't have to keep hitting Nutritionix's server.
Handle duplicate entries as records are added to the database
As noted in #20, the Nutritionix API sometimes returns duplicate menu items. These duplicates need to be handled before attempting to add items to the SQLite database.
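A minimal sketch of one way to drop the duplicates before insertion. It assumes every record returned by the API carries an `item_id` key (as in the snippet from #20) and that keeping the first record seen for each ID is acceptable; the function name `drop_duplicate_items` is hypothetical.

```python
def drop_duplicate_items(menu_items):
    """Keep the first record seen for each item_id, preserving order."""
    seen = set()
    unique = []
    for item in menu_items:
        item_id = item["item_id"]
        if item_id in seen:
            continue  # a record with this ID was already kept
        seen.add(item_id)
        unique.append(item)
    return unique
```

An alternative is to let the database enforce uniqueness with a primary key on the ID column, but filtering in Python first keeps the insert step simple.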
Generate dict mapping restaurant names to UIDs
Nutritionix identifies restaurants by a unique ID number. For example, according to the API documentation, McDonald's ID is 513fbc1283aa2dc80c000053. I frequent the following places and need to determine the Nutritionix ID number for each:
- McDonald's
- Wendy's
- Taco Bell
- Qdoba
- Chipotle
- Five Guys
- Costco
Add functionality to normalize histograms
I need functionality to make the histogram plots look uniform. Currently, the x scale for each restaurant is different because each restaurant has a different distribution of menu items. Additionally, the widths of the boxes in the histogram are different.
I am ultimately going to plot these histograms in a column and so the horizontal and vertical scales should match.
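A sketch of one way to get matching bins: compute a single set of bin edges from all restaurants' values and reuse it for every plot. The function name `common_bin_edges`, the default bin width, and the shape of the input (a dict mapping restaurant name to a list of sugar values) are assumptions for illustration.

```python
def common_bin_edges(values_by_restaurant, bin_width=5):
    """Return one list of bin edges covering every restaurant's values.

    Assumes the values are non-negative (grams of sugar), so the
    edges start at zero.
    """
    all_values = [
        value
        for values in values_by_restaurant.values()
        for value in values
    ]
    upper = max(all_values)
    # Extend one bin past the maximum so the largest value falls in a bin.
    n_bins = int(upper // bin_width) + 1
    return [i * bin_width for i in range(n_bins + 1)]
```

The same list can then be passed as the `bins` argument of `pyplot.hist` for each restaurant, which gives every histogram identical bar widths and the same x range. Creating the column of axes with `pyplot.subplots(..., sharex=True, sharey=True)` should take care of matching the scales.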
Organize code in fewer files
Presently (commit 1a7c11c), there are several files containing python source. The code in these files should be combined into a single library.
The following files should be concatenated:
- `data_manipulation.py`
- `fetch_data.py`
- `fetch_restaurant_ids.py`
The file `restaurant_menus.ipynb` should be updated to reflect the change in the library.
Refactor to use SQLite instead of the list of dicts
I created a file named `menu_data.json` based on Nutritionix API calls, but I never committed that file to this repo. Nutritionix was nice enough to let me use their data, and I'm pretty sure they don't want me publishing it in my repo.
The `menu_data.json` file contains a list of dicts. A lot of the analysis would be easier if the menu data were contained in a SQLite database file. Thus, I need to:
- Move data from `menu_data.json` to `menu_data.db`.
- Refactor code to access `menu_data.db` instead.
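The move from JSON to SQLite could be sketched roughly as follows, using only the standard library. The table name `menu_items`, the choice of columns (limited to fields that appear in these issues), and the function name `json_to_sqlite` are assumptions; the real records contain more nutritional fields than shown here.

```python
import json
import sqlite3

# Assumed subset of fields; the real records hold more columns.
COLUMNS = ("item_id", "item_name", "brand_name", "menu_category", "nf_sugars")


def json_to_sqlite(json_path, db_path):
    """Copy a JSON list of menu item dicts into a SQLite table."""
    with open(json_path) as f:
        menu_data = json.load(f)

    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS menu_items ("
            "item_id TEXT PRIMARY KEY, item_name TEXT, brand_name TEXT, "
            "menu_category TEXT, nf_sugars REAL)"
        )
        # Missing fields become NULL via dict.get.
        rows = [tuple(item.get(col) for col in COLUMNS) for item in menu_data]
        # INSERT OR IGNORE skips rows whose item_id is already present.
        conn.executemany(
            "INSERT OR IGNORE INTO menu_items VALUES (?, ?, ?, ?, ?)", rows
        )
        conn.commit()
    finally:
        conn.close()
```

Once the data is in the database, a filter such as "all entree items for one restaurant" becomes a single `SELECT ... WHERE` query instead of nested list comprehensions.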
`fetch_menu_item_data` returns duplicates
The following code will yield duplicate menu items.
# Assume `credentials` is a dictionary holding Nutritionix API credentials.
import minimum_sugar
import collections
# ID value 513fbc1283aa2dc80c000053 corresponds to McDonald's
menu_items = minimum_sugar.fetch_menu_item_data("513fbc1283aa2dc80c000053", credentials)
item_ids = [menu_item["item_id"] for menu_item in menu_items]
dups = [item for item, count in collections.Counter(item_ids).items() if count > 1]
print len(item_ids)
print len(item_ids) - len(dups)
print len(dups)
# Prints:
# 359
# 347
# 12
Write up report
Once #9 is closed, I need to write up a report of the results along with some development notes.
Refactor `print_max_sugar_menu_item` to handle SQLite
Refactor code to leverage `entree_items` list
Early in `report.ipynb`, the following line of code occurs:
entree_items = minimum_sugar.filter_menu_items(menu_data, "menu_category", "entree")
Many times following that line, I re-sort the entree items from `menu_data`. I should refactor that code to rely on `entree_items` instead so I don't look like an amateur.
Write function to determine max value for particular restaurant menu item
As of 3143fbf, I've repeated code that looks like the following:
restaurant_name = "Taco Bell"
restaurant_menu_items = minimum_sugar.filter_menu_items(menu_data, "brand_name", restaurant_name)
restaurant_entree_items = minimum_sugar.filter_menu_items(restaurant_menu_items, "menu_category", "entree")
max_sugar = max(minimum_sugar.extract_variable(restaurant_entree_items, "nf_sugars"))
print "Max sugar:", max_sugar
menu_items = minimum_sugar.filter_menu_items(restaurant_entree_items, "nf_sugars", max_sugar)
for menu_item in menu_items:
    print "Item name:", menu_item["item_name"]
in order to determine the entree menu item(s) containing the most sugar for a particular restaurant. This code should be factored into its own function instead of copying and pasting it all over the place.
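A sketch of what the factored-out function might look like. It is written against plain lists of dicts rather than the library's own helpers so that it stands alone; the function name `max_sugar_entrees` and its return shape are assumptions, and it assumes `nf_sugars` is numeric for every entree.

```python
def max_sugar_entrees(menu_data, restaurant_name):
    """Return (max_sugar, items) for one restaurant's entree items.

    `items` holds every entree tied for the highest sugar content.
    Returns (None, []) if the restaurant has no entree items.
    """
    entrees = [
        item
        for item in menu_data
        if item["brand_name"] == restaurant_name
        and item["menu_category"] == "entree"
    ]
    if not entrees:
        return None, []
    max_sugar = max(item["nf_sugars"] for item in entrees)
    items = [item for item in entrees if item["nf_sugars"] == max_sugar]
    return max_sugar, items
```

Returning the values instead of printing them leaves the formatting to the notebook, so the same function can serve both the text reports and any later plotting code.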