Subber

Subber is a web-based application that allows active reddit users to discover subreddits based on post history and subscription data.

Prerequisites

To deploy Subber, you must obtain an API key and application ID.

  1. Go to your Reddit authorized apps page.
  2. Click the "are you a developer? create an app..." button at the bottom of the page.
  3. Fill out the form with the appropriate details, ensuring that script is selected.

After submitting the form, add your Reddit API credentials to subber.cfg.

# Copy and edit the example config
cp subber.cfg.example subber.cfg
vi subber.cfg

Run Subber in a container (recommended)

This section assumes you have Docker and GNU Make installed.

NOTE: Your user must be authorized to run Docker commands.

# Build Subber
make

# Run Subber
make run

To stop the container, execute:

make stop

See the "Using Subber" section for usage details.

Run Subber as a Python package (for developers)

NOTE: Running Subber as a Python package on Windows is not supported, due to Gunicorn's dependence on the fcntl module. However, running Subber in a container on Windows is still supported.

# Install Subber and project dependencies
make install

Start the REST API with a 900-second worker timeout:

gunicorn subber.subber:app -t 900

Using Subber

Request subreddit recommendations for a user by opening your browser and visiting 127.0.0.1:8000.

NOTE: This may take a few moments.

Troubleshooting

If a runtime error occurs while Subber is running, Subber will terminate and log detailed error messages in subber.log. If the log does not explain the problem, please file an issue on the Subber issues page and attach a copy of your subber.log file.

subber's Issues

Add error/exception handling

Subber needs to handle all errors and exceptions that could occur due to bad incoming or outgoing Reddit API requests:

  • Bad request API responses
  • Internal server errors (outgoing Reddit API requests)
  • Possible bad data received from Reddit API
  • Bad or no config values

Config: configparser.RawConfigParser().readfp is deprecated

Thank you for your contribution to Subber! Please fill out one of the two templates below (issue or feature request).

Issue:

Expected behavior

Config file should be read according to this

Actual behavior

Config file is read using deprecated method
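
A minimal sketch of the non-deprecated call, assuming the config is an INI-style file: `read_file()` is the replacement for `readfp()`, which has been deprecated since Python 3.2. The `load_config` helper, the `reddit` section, and the `client_id` key are hypothetical names used for illustration, not taken from Subber's code.

```python
import configparser
import io


def load_config(file_obj):
    """Parse an INI-style config from an open file object."""
    parser = configparser.RawConfigParser()
    parser.read_file(file_obj)  # replaces the deprecated readfp()
    return parser


# Hypothetical config contents, for illustration only.
example = io.StringIO("[reddit]\nclient_id = abc123\n")
config = load_config(example)
print(config.get("reddit", "client_id"))
```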

Steps to reproduce the behavior
Relevant logs (subber.log)

Load config in subber.py

Feature request (please include as much detail as possible):

The config file should be loaded in subber.py rather than directly from the Reddit class in reddit.py.

This addresses the confusing import system for the unit tests in #41

Front end: Subreddit metadata

Feature request (please include as much detail as possible):

Subber should include more metadata for each subreddit recommendation displayed to the user.

  • Community age

  • Over 18 community?

  • Subscriber count

Implement strategy to handle larger number of similar users

Feature request (please include as much detail as possible):

Subber needs a strategy to handle a large number of similar users in a short amount of time, preferably less than 3 min.
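
One possible strategy, sketched here under stated assumptions, is to look up similar users concurrently instead of one at a time. `fetch_active_subs` is a hypothetical stand-in for the per-user Reddit lookup, and `max_workers=8` is an arbitrary choice; API rate limits may cap the real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_active_subs(user):
    """Hypothetical stand-in for the per-user Reddit lookup."""
    return ["sub_of_{}".format(user)]


def active_subs_for(users, max_workers=8):
    """Look up several users concurrently and map each user to their subs."""
    users = list(users)  # materialize so the input can be iterated twice
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(fetch_active_subs, users)
        return dict(zip(users, results))


print(active_subs_for(["alice", "bob"]))
```

`pool.map` returns results in input order, so each user stays paired with their own result.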

Post processing for deleted user accounts crashes Subber

Expected behavior

Subber fetches recommended subs for a user without interruption.

Actual behavior

Subber crashes with the following exceptions:

2017-12-15 18:39:29,230 Retrieving user submissions for user agom94
2017-12-15 18:39:29,230 User submissions retrieved for user agom94
2017-12-15 18:39:29,230 User submissions retrieved as <praw.models.listing.generator.ListingGenerator object at 0x7f77c8c28d68>
2017-12-15 18:39:29,231 Fetching: GET https://oauth.reddit.com/user/agom94/submitted
2017-12-15 18:39:29,231 Data: None
2017-12-15 18:39:29,231 Params: {'t': 'all', 'sort': 'top', 'limit': 13, 'raw_json': 1}
2017-12-15 18:39:29,386 https://oauth.reddit.com:443 "GET /user/agom94/submitted?t=all&sort=top&limit=13&raw_json=1 HTTP/1.1" 200 970
2017-12-15 18:39:29,388 Response: 200 (970 bytes)
2017-12-15 18:39:29,389 Fetching: GET https://oauth.reddit.com/comments/60i9qv/
2017-12-15 18:39:29,389 Data: None
2017-12-15 18:39:29,389 Params: {'limit': 2048, 'sort': 'best', 'raw_json': 1}
2017-12-15 18:39:29,602 https://oauth.reddit.com:443 "GET /comments/60i9qv/?limit=2048&sort=best&raw_json=1 HTTP/1.1" 200 2946
2017-12-15 18:39:29,604 Response: 200 (2946 bytes)
2017-12-15 18:39:29,608 Returning commenters and users as ['techminder', 'Zapablast05', 'SreaminGinger', 'eodizzlez', 'lakk'] for user agom94
2017-12-15 18:39:29,608 Retrieving active subs for user techminder
2017-12-15 18:39:29,609 User's comments retrieved for user techminder as <praw.models.listing.generator.ListingGenerator object at 0x7f77c8c28d68>
2017-12-15 18:39:29,609 Retrieving user submissions for user techminder
2017-12-15 18:39:29,609 User submissions retrieved for user techminder
2017-12-15 18:39:29,609 User submissions retrieved as <praw.models.listing.generator.ListingGenerator object at 0x7f77c8c28b00>
2017-12-15 18:39:29,609 Fetching: GET https://oauth.reddit.com/user/techminder/comments
2017-12-15 18:39:29,609 Data: None
2017-12-15 18:39:29,610 Params: {'sort': 'new', 'limit': 30, 'raw_json': 1}
2017-12-15 18:39:29,766 https://oauth.reddit.com:443 "GET /user/techminder/comments?sort=new&limit=30&raw_json=1 HTTP/1.1" 404 38
2017-12-15 18:39:29,768 Response: 404 (38 bytes)
2017-12-15 18:39:29,769 Error combining processed posts and comments for user techminder
Traceback (most recent call last):
  File "/home/drew/Documents/subber/subber/reddit.py", line 170, in process_posts
    for p in posts:
  File "/usr/lib/python3.6/site-packages/praw/models/listing/generator.py", line 52, in __next__
    self._next_batch()
  File "/usr/lib/python3.6/site-packages/praw/models/listing/generator.py", line 62, in _next_batch
    self._listing = self._reddit.get(self.url, params=self.params)
  File "/usr/lib/python3.6/site-packages/praw/reddit.py", line 367, in get
    data = self.request('GET', path, params=params)
  File "/usr/lib/python3.6/site-packages/praw/reddit.py", line 472, in request
    params=params)
  File "/usr/lib/python3.6/site-packages/prawcore/sessions.py", line 179, in request
    params=params, url=url)
  File "/usr/lib/python3.6/site-packages/prawcore/sessions.py", line 124, in _request_with_retries
    raise self.STATUS_EXCEPTIONS[response.status_code](response)
prawcore.exceptions.NotFound: received 404 HTTP response

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/drew/Documents/subber/subber/reddit.py", line 196, in _get_active_subs
    subs = process_posts(comments) + process_posts(submissions)
  File "/home/drew/Documents/subber/subber/reddit.py", line 177, in process_posts
    '{}'.format(user, p))
UnboundLocalError: local variable 'p' referenced before assignment
2017-12-15 18:39:29,770 Exception while getting user recommendations for user agom94
Traceback (most recent call last):
  File "/home/drew/Documents/subber/subber/app.py", line 49, in on_get
    recommendations = reddit.get_user_recommendations(session, user)
  File "/home/drew/Documents/subber/subber/reddit.py", line 68, in get_user_recommendations
    for sub in active_subs:
TypeError: 'NoneType' object is not iterable

Steps to reproduce the behavior

  1. Start the Subber API
  2. curl 127.0.0.1:8000/user/agom94
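
The traceback above shows two linked failures: the error handler references the loop variable before the first item arrives, and the caller then receives None instead of a list. A minimal sketch of a more defensive loop follows. The `NotFound` class is a local stand-in for `prawcore.exceptions.NotFound` so the example runs without PRAW, and the dict-shaped posts are hypothetical.

```python
import logging

logger = logging.getLogger(__name__)


class NotFound(Exception):
    """Stand-in for prawcore.exceptions.NotFound so the sketch runs without PRAW."""


def deleted_user_posts():
    """Mimic a lazy listing that fails on the first fetch, as in the log above."""
    raise NotFound("received 404 HTTP response")
    yield  # unreachable; marks this function as a generator


def process_posts(user, posts):
    """Collect subreddit names, tolerating a listing that fails mid-iteration."""
    subs = []
    try:
        for post in posts:
            subs.append(post["subreddit"])
    except NotFound:
        # Do not reference the loop variable here: it is unbound when the
        # very first fetch fails.
        logger.warning("Could not retrieve posts for user %s; skipping", user)
    return subs  # always a list, never None


print(process_posts("techminder", deleted_user_posts()))
```

Because the function always returns a list, a caller that concatenates or iterates the result no longer hits the TypeError.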

Project: subber container

Feature request (please include as much detail as possible):

Subber should run in a container specified using the Dockerfile format

Ensure a single subreddit suggestion instead of multiple

Issue:

When obtaining subreddit suggestions for a user, repeated suggestions are returned.

Expected behavior:

Only a single suggestion per subreddit should be returned.

Actual behavior:

Multiple suggestions are returned, but the number of repeated suggestions per subreddit varies.

Steps to reproduce the behavior:
  1. Run Subber in a container
  2. Find suggestions for user "agom94"
  3. Observe repeated suggestions:
  • 4 suggestions of r/army
  • 3 suggestions of r/military
  • 2 suggestions of r/pcmasterrace
  • 2 suggestions of r/AskReddit
Relevant logs ('subber.log')

subber.log
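
A minimal sketch of collapsing repeated suggestions, using the counts reported above as input. Ranking by how often a subreddit was suggested is an assumption made for this example, not something the issue asks for, and `unique_suggestions` is a hypothetical helper name.

```python
from collections import Counter

# Repeated suggestions as observed in the report above.
raw_suggestions = (
    ["army"] * 4 + ["military"] * 3 + ["pcmasterrace"] * 2 + ["AskReddit"] * 2
)


def unique_suggestions(suggestions):
    """Return each subreddit once, most frequently suggested first."""
    counts = Counter(suggestions)
    return [name for name, _ in counts.most_common()]


print(unique_suggestions(raw_suggestions))
```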

Not supported on Windows

Subber does not support Windows, and this should be reflected in its documentation.

When running Subber as a Python package on Windows, starting the REST API with Gunicorn fails and gives "ModuleNotFoundError: No module named 'fcntl'".

Add logging

Subber needs to log incoming API requests and outgoing Reddit API requests.

  • Info and Debug messages should be written to stdout
  • Critical and Error logs should be persistent
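
The two requirements above map naturally onto two handlers on one logger. The sketch below assumes a logger named `subber` and a log file path passed in by the caller; `configure_logging` is a hypothetical helper. Note that stdout receives every record, including errors, while the file receives only ERROR and above.

```python
import logging
import sys


def configure_logging(log_path="subber.log"):
    """Send all records to stdout and only ERROR and above to a file."""
    logger = logging.getLogger("subber")
    logger.setLevel(logging.DEBUG)

    console = logging.StreamHandler(sys.stdout)
    console.setLevel(logging.DEBUG)

    persistent = logging.FileHandler(log_path)
    persistent.setLevel(logging.ERROR)

    formatter = logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    for handler in (console, persistent):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

Calling `configure_logging()` once at startup is enough; calling it repeatedly would attach duplicate handlers.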

Reduce number of similar users

Issue:

Sub recommendation takes too long for users with a large number of similar users. The number of similar users should be capped (e.g. at 10) until a permanent caching solution is crafted for a later release.

Expected behavior

Subber processes a user's recommended subs in < 3 min

Actual behavior

Subber takes > 15 min to process a user's recommended subs

Steps to reproduce the behavior
  1. Request recommendations for a user with a large number of posts
Relevant logs (subber.log)
2017-12-28 06:31:46,700    subber.reddit     DEBUG    Returning similar users for user as ['Savasshole', 'Savasshole', 'mg_ridgeview', '1jojo9', 'Gowthampkp', 'AfouToPatisa', 'cmason37', 'k5josh', 'oxipital', 'zreeon', 'k5josh', 'k5josh', 'k5josh', 'AwfulLandlord', '7yearlurkernowposter', 'larry_crime_donkey', 'minorthreat21', 'NJ-JRS', 'somekindofhat', 'Lord_Dreadlow', '7yearlurkernowposter', 'pavanjadhaw', 'cosners', 'Best_coder_NA', 'nile_river7', 'xiomacaroni', 'hattori_heiji', 'Kingoflionbears', 'WarKirby', 'themikemoze', 'aliniazi', 'nangtoi', 'Courtnall14', 'provelcheesebakery', 'Sobie17', 'aspoels', 'SloTek', 'RnBoos', 'keepitwithmine', 'Khazahk', 'silkdurag', 'fast_edo', 'nangtoi', 'hethoma', 'MSimpsonPhotos', 'imTheSnuggler', 'lemontongues']
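
A minimal sketch of the proposed cap, assuming the similar users arrive as any iterable of usernames. `MAX_SIMILAR_USERS` and `cap_similar_users` are hypothetical names, and the placeholder usernames are made up for the example.

```python
from itertools import islice

MAX_SIMILAR_USERS = 10  # the cap suggested in this issue; tune as needed


def cap_similar_users(users, limit=MAX_SIMILAR_USERS):
    """Keep at most `limit` users, preserving their original order."""
    return list(islice(users, limit))


similar = ["user{}".format(i) for i in range(47)]
print(len(cap_similar_users(similar)))
```

`islice` also works on lazy generators, so the cap can stop a listing from being fetched beyond the limit.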

Report invalid username/password

If the Reddit API connection fails and an invalid username, password, secret, or ID is suspected, report this on the server's stdout and in the log.

Validate type and field length in config.py

Feature request (please include as much detail as possible):

Refactor config.py to include records of the type or required field length of a config field, and then validate that the fields parsed from the config conform to the constraints. Because not all config fields are used on startup, this would prevent runtime type errors later on.
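
A minimal sketch of the idea, assuming raw config values arrive as strings keyed by field name. The `FieldSpec` class, the field names, and the constraints in `SPECS` are all hypothetical and only illustrate the shape such a table could take.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldSpec:
    """Expected type and optional exact length of one config value."""
    type_: type
    length: Optional[int] = None  # None means any length is accepted


# Hypothetical field names and constraints, for illustration only.
SPECS = {
    "client_id": FieldSpec(str, length=14),
    "similar_users_limit": FieldSpec(int),
}


def validate(raw):
    """Convert raw string values to their declared types or raise ValueError."""
    validated = {}
    for name, spec in SPECS.items():
        if name not in raw:
            raise ValueError("missing config field: {}".format(name))
        value = raw[name]
        if spec.length is not None and len(value) != spec.length:
            raise ValueError("{} must be {} characters long".format(name, spec.length))
        try:
            validated[name] = spec.type_(value)
        except (TypeError, ValueError):
            raise ValueError("{} must be of type {}".format(name, spec.type_.__name__))
    return validated
```

Running `validate` once at startup surfaces a bad value immediately, even for fields that are only used later at runtime.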

frontend: Subreddit link bug

Issue:

Subreddit links on the results page are invalid because they are missing the http: prefix.

Expected behavior

Subreddit links direct the user to the page of the subreddit.

Actual behavior

Subreddit links redirect to a 404 page while trying to direct the user to 127.0.0.1:8000/{sub-link}
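
A browser treats a link without a scheme as relative to the current page, which explains the redirect to 127.0.0.1:8000. The sketch below assumes the stored link is a host plus path such as `www.reddit.com/r/army`; that form is a guess, and `absolute_link` is a hypothetical helper. It defaults to `https`, although the issue mentions `http:`.

```python
from urllib.parse import urlsplit


def absolute_link(link, scheme="https"):
    """Prefix a scheme when the link has none, so browsers treat it as external."""
    if urlsplit(link).scheme:
        return link
    return "{}://{}".format(scheme, link.lstrip("/"))


print(absolute_link("www.reddit.com/r/army"))
```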

Steps to reproduce the behavior
  1. Run the Subber container
  2. Click on a subreddit link on the results page
Relevant logs (subber.log)

Data Collection for Future Analysis

Feature Request (please include as much detail as possible):

Overview

Keeping in mind the mantra that all data are good data, I propose adding a data collection feature to Subber.
The view presented to a user when viewing recommendations could be modified to allow the user to indicate, for each recommendation, whether it was GOOD, BAD, or ALREADY SUBSCRIBED.
Upon collection of the user's opinions on the recommendations, the opinions could be posted back to a web-service that collects the data in tabular form for future analysis.

Examples of ways this could be useful might be:

  • scoring the performance of different recommendation systems to determine which makes it to production

  • building a database that allows the system to see which subreddits are closely linked, and adjust recommendations accordingly

Breakout

The changes that would need to be made:

  1. Creation of an accessible web service that accepts POST requests (e.g. subber.com/api/report)
  2. Conversion of POST requests to persistent tabular data
  3. Addition of the web service URL to the configuration file
  4. Changing the recommendation view such that recommendations can be rated as GOOD, BAD, ALREADY SUBSCRIBED
  5. Sending user opinions to the web service as they are rated

The case for one database used by all throughout development

All data are good data, and until this application hits production it will be difficult to accumulate data at volume. To accelerate the data collection process, I propose all developers post to the same URL through development. Application keys could be given to developers to stop bogus requests, and we can make DEVELOPER and/or BUILD a required field in the configuration in order to sort/filter.

Suggested Fields for Database

The following data could be useful to collect:

  • Developer
  • Timestamp
  • A unique identifier for the recommendation session
  • Queried User
  • Recommended Subreddit
  • Rating
  • Recommendation System Used
  • User's active subs (this could also be collected server-side upon receipt of the request)
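
The suggested fields can be sketched as a single record type that serializes to one row of tabular data. CSV is used here only as an example of a persistent tabular format; the field types, the `RatingRecord` and `append_record` names, and all sample values are assumptions made for illustration.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class RatingRecord:
    """One row of feedback, mirroring the suggested fields above."""
    developer: str
    timestamp: str
    session_id: str
    queried_user: str
    recommended_subreddit: str
    rating: str  # GOOD, BAD, or ALREADY SUBSCRIBED
    recommendation_system: str
    active_subs: str  # e.g. a comma-separated list


def append_record(stream, record, write_header=False):
    """Append one record to a CSV stream, optionally writing the header first."""
    writer = csv.DictWriter(stream, fieldnames=[f.name for f in fields(RatingRecord)])
    if write_header:
        writer.writeheader()
    writer.writerow(asdict(record))


# Sample values are invented for this example.
buffer = io.StringIO()
record = RatingRecord(
    developer="dev1",
    timestamp="2018-10-06T17:04:15Z",
    session_id="session-1",
    queried_user="agom94",
    recommended_subreddit="army",
    rating="GOOD",
    recommendation_system="baseline",
    active_subs="military,AskReddit",
)
append_record(buffer, record, write_header=True)
print(buffer.getvalue())
```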

Project: Improve error handling, imports, logging

Requirements:

  • Check Reddit API connection during initialization and exit if connection fails

  • Remove package_dir declaration (package in package) in favor of the standard import format

  • Suppress outside loggers (e.g. urllib3)

  • Update logging levels (an error is critical if it is unrecoverable)

  • Exit for unrecoverable errors
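
Two of the requirements above can be sketched with the standard library alone: raising the level of a third-party logger, and exiting when a startup check fails. The `check_connection` callable is a hypothetical placeholder for whatever verifies the Reddit API connection, and the function names are made up for this example.

```python
import logging
import sys

logger = logging.getLogger("subber")


def quiet_third_party_loggers(names=("urllib3",)):
    """Raise the threshold of noisy library loggers so only warnings pass."""
    for name in names:
        logging.getLogger(name).setLevel(logging.WARNING)


def fail_fast(check_connection):
    """Run a startup check and exit with a non-zero status if it raises."""
    try:
        check_connection()
    except Exception:
        # Unrecoverable at startup, so log at CRITICAL and stop.
        logger.critical("Reddit API connection failed; exiting", exc_info=True)
        sys.exit(1)
```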

Eliminate subscribed subreddits from recommendations

Issue:

Subber recommends subreddits to users that the user is already subscribed to.

Expected behavior

Subber does not recommend subreddits to users that the user is already subscribed to.

Actual behavior

Subber recommends subreddits to users that the user is already subscribed to.
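
A minimal sketch of the filter, assuming the user's subscriptions are already available as a collection of subreddit names. How Subber would obtain that list is not covered here, and `filter_subscribed` is a hypothetical helper. The comparison ignores case because subreddit names are case-insensitive.

```python
def filter_subscribed(recommendations, subscribed):
    """Drop recommendations the user already subscribes to (case-insensitive)."""
    subscribed_lower = {name.lower() for name in subscribed}
    return [name for name in recommendations if name.lower() not in subscribed_lower]


print(filter_subscribed(["army", "AskReddit", "military"], ["askreddit"]))
```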

Steps to reproduce the behavior
Relevant logs (subber.log)

Replace `utc_epoch_sec_to_years` function

Issue:

Expected behavior
Actual behavior
Steps to reproduce the behavior
Relevant logs (subber.log)
2018-10-06 17:04:15,078    root              DEBUG    Converting 1279862064.0 seconds from UTC epoch to years to date
2018-10-06 17:04:15,078    root              ERROR    Unable to calculate years to date from UTC epoch timestamp
2018-10-06 17:04:15,746    root              DEBUG    Converting 1228908510.0 seconds from UTC epoch to years to date
2018-10-06 17:04:15,746    root              ERROR    Unable to calculate years to date from UTC epoch timestamp
2018-10-06 17:04:16,452    root              DEBUG    Converting 1446861249.0 seconds from UTC epoch to years to date
2018-10-06 17:04:16,453    root              ERROR    Unable to calculate years to date from UTC epoch timestamp
2018-10-06 17:04:17,123    root              DEBUG    Converting 1201311072.0 seconds from UTC epoch to years to date
2018-10-06 17:04:17,124    root              ERROR    Unable to calculate years to date from UTC epoch timestamp
2018-10-06 17:04:17,792    root              DEBUG    Converting 1423117238.0 seconds from UTC epoch to years to date
2018-10-06 17:04:17,792    root              ERROR    Unable to calculate years to date from UTC epoch timestamp
2018-10-06 17:04:18,512    root              DEBUG    Converting 1527525447.0 seconds from UTC epoch to years to date
2018-10-06 17:04:18,512    root              ERROR    Unable to calculate years to date from UTC epoch timestamp
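
One possible replacement, sketched with the standard `datetime` module. It returns whole calendar years, which is an assumption about what the original function was meant to produce; `years_since` is a hypothetical name.

```python
from datetime import datetime, timezone


def years_since(epoch_seconds, now=None):
    """Return whole calendar years elapsed since a UTC epoch timestamp."""
    created = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # Subtract one year if this year's anniversary has not been reached yet.
    before_anniversary = (now.month, now.day) < (created.month, created.day)
    return now.year - created.year - before_anniversary


# Timestamp taken from the log above; `now` is pinned for reproducibility.
pinned = datetime(2018, 10, 6, tzinfo=timezone.utc)
print(years_since(1279862064.0, now=pinned))
```

Passing `now` explicitly keeps the function deterministic, which makes it straightforward to unit test.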

Feature request (please include as much detail as possible):

Eliminate duplicate similar users

Issue:

Subber does not eliminate duplicate similar users and will process the same similar user multiple times.

Expected behavior

Subber processes each similar user once.

Actual behavior

Subber does not eliminate duplicate similar users and will process the same similar user multiple times.

Steps to reproduce the behavior
  1. Request sub recommendations for a user with a large number of posts
Relevant logs (subber.log)
2017-12-28 06:31:46,700    subber.reddit     DEBUG    Returning similar users for user as ['Savasshole', 'Savasshole', 'mg_ridgeview', '1jojo9', 'Gowthampkp', 'AfouToPatisa', 'cmason37', 'k5josh', 'oxipital', 'zreeon', 'k5josh', 'k5josh', 'k5josh', 'AwfulLandlord', '7yearlurkernowposter', 'larry_crime_donkey', 'minorthreat21', 'NJ-JRS', 'somekindofhat', 'Lord_Dreadlow', '7yearlurkernowposter', 'pavanjadhaw', 'cosners', 'Best_coder_NA', 'nile_river7', 'xiomacaroni', 'hattori_heiji', 'Kingoflionbears', 'WarKirby', 'themikemoze', 'aliniazi', 'nangtoi', 'Courtnall14', 'provelcheesebakery', 'Sobie17', 'aspoels', 'SloTek', 'RnBoos', 'keepitwithmine', 'Khazahk', 'silkdurag', 'fast_edo', 'nangtoi', 'hethoma', 'MSimpsonPhotos', 'imTheSnuggler', 'lemontongues']
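
A minimal sketch of removing repeated usernames while keeping their first-seen order. It relies on dictionaries preserving insertion order, which the language guarantees from Python 3.7 onward; `unique_users` is a hypothetical helper name.

```python
def unique_users(users):
    """Drop repeated usernames while keeping first-seen order."""
    return list(dict.fromkeys(users))


# A short excerpt of the usernames from the log above.
similar = ["Savasshole", "Savasshole", "mg_ridgeview", "k5josh", "oxipital", "k5josh"]
print(unique_users(similar))
```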

reddit.py: bad config import

The import statement `from subber import config` in reddit.py causes an error when launching the API using the instructions in README.md.

Proposed solution:

  • Create setup.py with the root package directory pointing to subber

This solution should not be permitted to break any other imports.

Config: config file not closed

Issue:

Expected behavior

The subber config file is closed after all config data has been loaded.

Actual behavior

The subber config file is not closed after all config data has been loaded.
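
A minimal sketch of one way to guarantee the file is closed: open it in a `with` block so the handle is released as soon as parsing finishes. The `load_config` helper and the config contents are hypothetical, and the handle is returned only so the example can show that it is closed.

```python
import configparser
import os
import tempfile


def load_config(path):
    """Read a config file and close it as soon as parsing finishes."""
    parser = configparser.RawConfigParser()
    with open(path) as config_file:
        parser.read_file(config_file)
    # The with-block has exited, so the file handle is closed here.
    return parser, config_file


# Hypothetical config written to a temporary file for illustration.
with tempfile.NamedTemporaryFile("w", suffix=".cfg", delete=False) as tmp:
    tmp.write("[reddit]\nusername = example\n")

parser, handle = load_config(tmp.name)
os.remove(tmp.name)
print(handle.closed)
```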

Steps to reproduce the behavior

Run subber according to the instructions in README.md

Relevant logs (subber.log)

Dependencies: Update to PRAW 5.3.0

Subber currently uses PRAW 5.2.0 and needs to be upgraded to the latest version.

Requirements:

  • PRAW 5.3.0 must not break any current functionality

  • Subber should install and utilize PRAW 5.3.0 rather than version 5.2.0

Travis container build

Upon submission of a new patch, Travis CI should execute make build to build the Subber container.

Project: Add front end

Feature request (please include as much detail as possible):

Subber needs a front end to display user Subreddit recommendations. While this will be a primitive, proof-of-concept design, it should meet the following requirements:

  • Utilize the Flask framework (and retire usage of the Falcon framework)

  • Include a form to submit a username

  • Display a list of Subreddits (after the form is submitted), each accompanied by their respective name and description
