Giter Club home page Giter Club logo

pittapi's People

Contributors

adrienac avatar azharichenko avatar dependabot[bot] avatar doomki avatar dzheng256 avatar ethanwelsh avatar fishofpitt116 avatar jdsleppy avatar johnlinahan avatar nij-patel avatar omribarak avatar rahi374 avatar rchatrath7 avatar richie78321 avatar ritwikgupta avatar rohit-ganguly avatar steven-jarmell avatar timparenti avatar varughese avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

pittapi's Issues

Integrate Travis CI

Once tests are more developed, I'll integrate Travis CI into this project.

Unit tests should be hermetic

I noticed your tests are making actual calls to Pitt's website each time they are run. While this does shield against API changes from Pitt's side, it does so at the expense of slowing down your test runs and increases the amount of traffic that Pitt's website may have.

Ideally, your unit tests should not make any network, disk, etc. calls. The use of a mocking framework such as Python's mock, will allow you to substitute mock objects which allow you to verify that they were called, what parameters they were called with, etc. Your unit tests should only test your code (within reason).

Create a Library API

We can get JSON results from our library catalog via the following URL: http://pitt.summon.serialssolutions.com/api/search?q=water

q is a URL-encoded parameter that is the search string.

This returns JSON already, just clean up the results somewhat and return it as is!

License - MIT or GPL?

The readme says this is released under the MIT license but the pittAPI.py file notes it is released under GPLv2.

Enforce Unicode for all strings and results

Write now the string handling is very hacky so you'll see Unicode characters being expanded to \u0blah in the responses. Please ensure that proper Unicode encoding is enforced throughout so we don't have this issue. UTF-8 is what we should stick with.

Create a DiningAPI

The Pitt legacy mobile website shows if the dining halls are currently open or not: https://m.pitt.edu/dining/. Find the place where they're getting this info from, hook into it directly, and create an API to return the statuses for all dining places.

Parallelize People calls

The PeopleAPI currently synchronously loops in chunks of 10 to get data. The Requests API lets you make a pool of connections that not only keep one connection alive for re-use, but can be made asynchronously! Switch the PeopleAPI to use this async feature.

ValueError when using BeautifulSoup in lab.py

Experiencing a ValueError when I attempt to use the PittAPI to retrieve a dictionary from lab.py for computer lab availability. I've narrowed it down to a potential issue with BeautifulSoup in _fetch_labs():
soup = BeautifulSoup(page.text, 'lxml')
I added a print statement before and after this line. Print statement before checks out, the one after never prints.

Steps to reproduce:
-Activate the virtual environment, and run server.py
-Make a request to the API for a lab's status.

It will reach the appropriate endpoint, find a match in locations, and attempt to fetch the lab info, but will encounter a ValueError when attempting to parse the span section of the page response's text.

Documentation

So that we know what APIs have what methods with what parameters and what return values.

Without having to pry through the code.

Improving documentation

This issue will deal with making consistent documentation within the project's code and with the docs. The plans below are subject to change and will include more issues/enhancements to deal with as they come up. Each issue will have the pull request or issue that they are resolved in. If there are any other documentation features you would like to be included please comment below.

Documentation Structure

  • Moving from DOCS.md to docs/ ( #35 )
  • Spliting DOCS.md into seperate API DOCS ( #35 )
  • Create template for adding functions to docs

Doc Improvements

Code Documentation

make PittAPI actually an API

Currently PittAPI is essentially a collection of functions in classes. Gold!

However, it would be more useful if it was hosted on, say, an AWS lambda.

Making PittAPI into a proper package

Proposal:

Essentially the change I proposing is moving all of PittAPI code into a folder and modularizing the project. This includes a big move away from OOP programming since there isn't much use of it in this library (this will get explained towards the end). So I'm going to show how the API will look after the change, how it will get structured, and why to get rid of the OOP usage.

API Usage Change:

So let's look at an example from of the current usage from the README.md of how to use the course API...

from pittAPI import CourseAPI

### Courses
course = CourseAPI()
# Will return a list of dictionaries containing courses in subject
big_dict = course.get_courses(term="2161", subject="CS")
# Will return a list of dictionaries containing courses that fulfill req
big_dict = course.get_courses_by_req(term="2161", req="Q")
# Will return a string, which is the description of the course
description = course.get_class_description(class_number="10163", term="2161")

and here is what it will look like after the change.

from PittAPI import course

# Will return a list of dictionaries containing courses in subject
big_dict = course.get_courses(term="2161", subject="CS")
# Will return a list of dictionaries containing courses that fulfill req
big_dict = course.get_courses_by_req(term="2161", req="Q")
# Will return a string, which is the description of the course
description = course.get_class_description(class_number="10163", term="2161")

Details to Notice

  • Notice how the naming is changed especially with CourseAPI turning into course. Thought behind this is that the PittAPI contains different modules that have to access various data from Pitt. Saying API repeatedly is a bit redundant since it's all coming from the PittAPI, instead it's shorten down to just course which is imported from PittAPI.
  • Next, notice how you no longer have to initialize the course API. Instead, you can use it's function directly, making the API even more friendly. This is a change to all the APIs where initialization is no longer required. (I will go into more in-depth why in a later explanation)

Package Structure:

The way the repository is structured is there is a pittAPI.py that has the whole library. First, change is to move this into a folder a.k.a. a package. This will come in handy later if this package gets put on the Python Package Index since, in the setup of submitting, you should already have the code all in a folder (many python packages contain their code into packages like this).

Before

__init__.py
pittAPI.py

After

PittAPI/
  - __init__.py
  - pittAPI.py

Very simple change, but in the current state shown above it makes the usage worse but there is one more change I would like to make.

Moving away from OOP (Making them into modules):

So as I was looking throughout this library I couldn't help but notice the overuses of classes when none are needed. This might seem weird but here me out.

Let's look at that CourseAPI class. One issue is why is __init__ declared since it's not used? Ok that's a bit nitpicky.

class CourseAPI:
    def __init__(self):
        pass

But there is a much bigger observation to make with this class. Besides the functions annotated with @staticmethod. Such as...

@staticmethod
def _retrieve_from_url(url):
    page = s.get(url)
    soup = BeautifulSoup(page.text, 'html.parser')
    courses = soup.findAll("tr", {"class": "odd"})
    courses_even = soup.findAll("tr", {"class": "even"})
    courses.extend(courses_even)
    return courses

Aren't all the functions static? Take a look at this function below notice how there isn't really any usage of self (besides using it to call a static method) and that this function is very self-contained. What's the point of forcing all these static functions into a class, in python, there is a better way of doing this.

def get_courses(self, term, subject):
      subject = subject.upper()

      url = 'http://www.courses.as.pitt.edu/results-subja.asp?TERM={}&SUBJ={}'.format(term, subject)
      courses = self._retrieve_from_url(url)

      course_details = []

      for course in courses:
          details = [course_detail.string.replace(' ', '').strip()
                     for course_detail in course
                     if course_detail.string is not None]

          # Only append details if the list is not empty
          # If the subject code is incorrect, details will be NoneType
          if details:
              course_details.append(
                  {
                      'subject': details[1] if details[1] else "Not Decided",
                      'catalog_number': details[3] if details[3] else "Not Decided",
                      'term': details[5].replace('\r\n\t', '') if details[5] else "Not Decided",
                      'class_number': course.find('a').contents[0] if course.find('a').contents[0] else "Not Decided",
                      'title': details[8] if details[8] else "Not Decided",
                      'instructor': details[10] if details[10] else "Not Decided",
                      'credits': details[12] if details[12] else "Not Decided"
                  }
              )

      if not course_details:
          raise InvalidParameterException("The TERM or SUBJECT is invalid")

      return course_details

This better way is to turn these classes into modules. Essentially all that is being done is dropping the class and moving the all the functions into their own python file. So the structure will go from this ...

Before

PittAPI/
  - __init__.py
  - pittAPI.py

to this with each API have its own python file.

After

PittAPI/
  - __init__.py
  - course.py
  - laundry.py
  - people.py
  - lab.py
  - dining.py

In an instant, this removes all the OOP and makes the project more manageable, with each API having its own module to contain all its functions and any other specifics that apply to that API. This makes calling the API even simpler since there is no longer the need to initialize a class. It also makes importing the API very simple as all that needs to be done is to import the module.


I have already done this and tested it a bit (take a look at my fork of this project), but I would love to hear what your opinion of this idea is. If it all sounds good I would be more than glad to do a pull request of all these changes.


Who am I?

This is all who wonder why someone outside of Pitt is contributing to this project. Hello, I'm Alex Zharichenko and I'm a senior in high school who is currently in his 4th year of computer science (in my school this is basically independent study). I just recently after more than 6 months of waiting (applied on 7/21/16) finally got accepted into Pitt. I'm already fully committed to going to Pitt with my major being Computer Science. Hence the sudden interest into this repository.

Write unit tests for all functions

There is currently 0% coverage and we don't know what breaks. All functions have to be manually tested. Write a test gradle for the repo.

README.md issues

Hey, I was messing around with your example use cases in README.md and noticed a few things. First, there's a call for a deprecated function in there. The line
big_dict = course.get_courses(term="2177", code="CS")
should be
big_dict = course.get_classes(term="2177", code="CS") since get_courses() is deprecated.
You also make a call to course.get_course() when I believe you mean to call course.get_courses(), since course.get_course() doesn't exist (sorry, but it's a little confusing when there's a get_class()).

I'm just working with the course module right now, but you should probably take a look through the examples in README.md. In particular, all the examples related to dining don't seem to be working right now (for me, the first 3 return empty lists, and get_location_by_name isn't a valid function name)--not sure if that's still a work in progress. I just installed and upgraded the package with Pip, so maybe the issues with dining.py have something to do with that version not being updated.

Anyway, sorry for the lengthy thing. Awesome job on this API overall.

Issues with Shuttle API's get_vehicle_route_stop_estimates

Recently the builds are showing an issue with Shuttle API's get_vehicle_route_stop_estimates and this should be investigated.

The issue comming off is....

======================================================================
ERROR: test_vehicle_route_stop_estimates (tests.shuttle_test.ShuttleTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<string>", line 3, in wrapper
  File "/home/travis/build/Pitt-CSC/PittAPI/tests/shuttle_test.py", line 55, in test_vehicle_route_stop_estimates
    stop_estimates = shuttle.get_vehicle_route_stop_estimates(25, 4)
  File "/home/travis/build/Pitt-CSC/PittAPI/PittAPI/shuttle.py", line 43, in get_vehicle_route_stop_estimates
    response = sess.get("http://www.pittshuttle.com/Services/JSONPRelay.svc/GetVehicleRouteStopEstimates", params=payload)
  File "/home/travis/virtualenv/python3.4.6/lib/python3.4/site-packages/requests/sessions.py", line 546, in get
    return self.request('GET', url, **kwargs)
  File "/home/travis/virtualenv/python3.4.6/lib/python3.4/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/travis/virtualenv/python3.4.6/lib/python3.4/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/home/travis/virtualenv/python3.4.6/lib/python3.4/site-packages/responses.py", line 625, in unbound_on_send
    return self._on_request(adapter, request, *a, **kwargs)
  File "/home/travis/virtualenv/python3.4.6/lib/python3.4/site-packages/responses.py", line 600, in _on_request
    raise response
requests.exceptions.ConnectionError: Connection refused: GET http://www.pittshuttle.com/Services/JSONPRelay.svc/GetVehicleRouteStopEstimates?vehicleIdStrings=25&quantity=4

Introduce static type checking into the API

MyPy is a terrific project that lets us introduce static typing into parts of our code. Since there are certain parts of the code which follow a specific type, use MyPy to type check those portions.

MyPy: http://mypy-lang.org/

  • Identify parts of the code that can be type checked (all private functions)
  • Add static typing to them

Making the package name lowercase

I find that the current usage is not very intuitive since the package name is case sensitive and makes typing it a hassle, as shown below.

from PittAPI import course

Instead, let's just make it all lowercase so that typing and adding the package is a breeze, and follows the convention most other packages follow.

from pittapi import course

Note: This is just effecting the users use of the package and don't reflect a change of the name you use to install through pip. Essentially this will still hold pip install PittAPI

Rewrite all urllib2 content in requests

Seems like requests has matured a lot since the last time I looked at it. Most importantly, it allows us to use a connection pool and parallel async requests. Re-write every urllib2 call with an equivalent requests call.

Drop Python 2 support

It's getting weird and complicated trying to support both 2 and 3 in one library.

To embrace the future (technically the present), I move to drop Python 2 support.

As of this term, the Course API is broken

The Dietrich site that was traditionally used is now gone forever and we need to switch it over to using peoplesoft mobile which might have the advantage of being more up to date information than Dietrich.

CourseAPI: Valid subjects/terms are labeled invalid if no courses exist

I caught this while unit testing.
The specific example is subject="ARABIC" and term="2171" or term="2177". For these two terms, no Arabic courses are offered, so the array course_details is empty, and the code then assumes that because the array is empty, then there must be an invalid parameter and it throws an InvalidParameteException

Course Information is incorrect for detail lists of length 6

For some subjects,

    details = [course_detail.string.replace('&nbsp;', '').strip()
                       for course_detail in course
                       if course_detail.string is not None
                       and len(course_detail.string.strip()) > 2]

grabs the subject at index 0. This causes the returned list of course_details to be incorrect, since with the current conditionals, it's setup to have the catalog_number be set to what would then be the subject. This was found for every subject except 4. For now, an easy fix is to check lists of length 6, and accommodate, but the underlying reason for why certain subjects (CS, JS, SA, FP) return lists of length 5 instead of 6 (ie not including the subject in the parsed result).

screen shot 2017-01-13 at 7 25 10 pm

IndexError when getting extra details for section

The function at line 230 of course.py could throw an IndexError if the parameter text does not contain the string ' :'

The function should look like:

def __extract_data_past_colon(text: str) -> str:
        if ': ' in text:
            return text.split(': ')[1]
        else:
            return text

I was getting this issue for some reason and this fixed it

set up pre-commit

  • just adding some hooks for basic formatting
  • tests are already handled with gh action

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.