cobalt-uoft / cobalt

Open data APIs for interfacing with public information from the University of Toronto.

Home Page: https://cobalt.qas.im

License: MIT License

JavaScript 100.00%

uoft open-data toronto

cobalt's Introduction

Notice

This project is no longer maintained, and certain pieces have been archived.

All pieces of the project remain open source, in the event that a new open data initiative at the University of Toronto may benefit from the work we did during the project's lifetime.

Documentation

Data

Current and historical datasets that power Cobalt's APIs are publicly available at cobalt-uoft/datasets.

Contributing

Cobalt is a student-driven project at the University of Toronto. Help contribute towards making this service better. Learn more by reading the contributing guide.

Cobalt is kindly sponsored by ThinkData.

cobalt's People

Contributors


elliottsj g3chench qasim 1vn gitter-badger kashav g3wanghc philipbao hobindar andrewmcewen whizzzkid athasach darwintr mikeyin97 brianli009 mzawadi jsfix-ci

cobalt's Issues

Requires database to be called cobalt

Currently, the documentation seems to imply that, when self-hosting, you can use whatever MongoDB URI you wish, which makes sense. However, at the moment, Cobalt fails if the URI specifies a database name other than 'cobalt'. This seems to be because 'cobalt' is hard-coded where mongoimport is used.

Transportation API

Use cobalt-uoft/uoft-scrapers#32 and cobalt-uoft/uoft-scrapers#33 to create the Transportation API. It will provide data for UTM shuttles as its own endpoint and parking lots / bicycle racks as another endpoint.

Once we have a sense of what the data will look like, we can expand on what endpoints we want exactly.

The documentation for this will also mention the TTC and GO Transit open data APIs, for users who would like to build upon public transit in the city of Toronto and the GTA.

Filter unit tests

Both course and building APIs need filter endpoint unit tests.

- Come up with a method for testing these endpoints (there are a lot of variables in this endpoint, specifically within the q parameter)
- Write the tests

Resurrect the Food API and implement the endpoints

https://github.com/cobalt-uoft/cobalt/tree/master/api/food is where the food API is located. It was abandoned.

It doesn't have a scraper, but map.utoronto.ca is your best friend to implement one at cobalt-uoft/uoft-scrapers#15.

GET courses doesn't return any locations.

For the location field, an empty string is always returned rather than the expected building code.

Campus filter not working?

https://cobalt.qas.im/api/1.0/courses/filter?q=prerequisite:%27MAT135%27&campus:%27UTSG%27

When I make the query above, the second and third courses that show up have UTM as their campus.
Any reason why they are not being filtered out?
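One possible explanation, offered as an assumption rather than a confirmed diagnosis: the `&` in the URL ends the `q` parameter, so `campus:'UTSG'` arrives as a separate query parameter and never reaches the filter. Assuming clauses are joined with `AND` inside `q` (as other issues in this list suggest), the URL could be built as in the sketch below. `buildFilterUrl` is a hypothetical helper, not part of Cobalt.

```javascript
// Hypothetical helper: joins filter clauses with AND and URL-encodes the
// result so the whole expression travels inside the single `q` parameter.
function buildFilterUrl(base, clauses) {
  const q = clauses.join(' AND ');
  return `${base}?q=${encodeURIComponent(q)}`;
}

const url = buildFilterUrl(
  'https://cobalt.qas.im/api/1.0/courses/filter',
  ["prerequisite:'MAT135'", "campus:'UTSG'"]
);
console.log(url);
```

Encoding the whole expression keeps spaces and colons from being misread as URL syntax.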

Move endpoints `courses/list` and `buildings/list` to `/courses/` and `/buildings/` respectively

Fix unit testing not working for text searches (MongoDB related)

Implement a method to validate format of the `q` parameter for filter endpoints

https://github.com/cobalt-uoft/cobalt/blob/master/api/validation/index.js#L60

This should be a loose validation: we only check that the query matches the expected format, not whether it contains valid parameters or values (for example, we do not check that breadth is numeric).
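A minimal sketch of what such a loose check might look like, assuming each clause has the shape `key:[operator]value` and clauses are joined by `AND` or `OR`. The regular expression and the name `isLooselyValidQuery` are hypothetical and are not the code at the linked line.

```javascript
// One clause: key, colon, optional operator, then a quoted string or a bare
// token. Only the shape is checked, not whether key or value is meaningful.
const CLAUSE = /^[A-Za-z_]+:(?:<=|>=|<|>|-)?(?:"[^"]*"|'[^']*'|[^\s"']+)$/;

function isLooselyValidQuery(q) {
  if (typeof q !== 'string' || q.trim() === '') return false;
  return q
    .split(/\s+(?:AND|OR)\s+/)
    .every((clause) => CLAUSE.test(clause.trim()));
}

console.log(isLooselyValidQuery('breadth:>=2 AND code:"CSC108"'));
console.log(isLooselyValidQuery('breadth'));
```

A known limitation of this sketch: a quoted value that itself contains " AND " or " OR " would be split incorrectly.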

Implement a smart way of keeping local MongoDB instance in sync with `cobalt-uoft/datasets`

This is purely a self-hosted scenario, where a user clones Cobalt and then runs it on their own server:

We need a smart way to keep each local instance in sync with the main DB. Perhaps a cron job that checks whether the HEAD of cobalt-uoft/datasets has changed, and updates if so?

Time and Date updates

From now on, uoft-scrapers provides time and date as follows:

time: Seconds since midnight (e.g. 21600)
date: ISO 8601 date formatted string (e.g. 2016-04-01)

The question now is how to move forward with filtering for both of these. On Cobalt's side, along with any date key, we compute an extra key called date_num (not shown to the user, but kept internally). It holds the date as an integer (e.g. 20160401). This helps with filtering, as we can perform normal number comparisons on it.

However, we still need to decide which formats a user can use to query each of these. As it stands, writing time:>21600 to mean "time later than 6 AM" is a little difficult to understand. Maybe they could also write time:>"6:00"?

As for date, users don't know about the date_num, so they shouldn't query date_num directly like date:<20160415. Instead, they'll be doing date:<"2016-04-15" perhaps.

Let me know what you think, and then we can move forward with implementing this the same way across all our filter endpoints.

People looking for specific date and time can do date:"2016-04-20" AND time:"16:20"?

TL;DR: we shouldn't ever have to touch Date objects anymore.
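The two conversions discussed in this issue could be sketched as follows. The function names `toDateNum` and `toSeconds` are hypothetical, and accepting both "HH:MM" and raw seconds is an assumption based on the proposal above.

```javascript
// "2016-04-01" -> 20160401 (the internal date_num described above).
function toDateNum(isoDate) {
  const m = /^(\d{4})-(\d{2})-(\d{2})$/.exec(isoDate);
  if (!m) throw new Error(`Not a YYYY-MM-DD date: ${isoDate}`);
  return Number(m[1] + m[2] + m[3]);
}

// "6:00" or "16:20" -> seconds since midnight; plain numbers pass through.
function toSeconds(time) {
  if (/^\d+$/.test(String(time))) return Number(time);
  const m = /^(\d{1,2}):(\d{2})$/.exec(time);
  if (!m) throw new Error(`Not an HH:MM time: ${time}`);
  return Number(m[1]) * 3600 + Number(m[2]) * 60;
}

console.log(toDateNum('2016-04-15')); // 20160415
console.log(toSeconds('6:00'));       // 21600
```

With these, a user-facing filter like date:<"2016-04-15" can be compared against date_num as a plain integer, with no Date objects involved.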

Consider changing formatting of query parameters

The Web API parameters seem awfully overcomplicated when you could probably just make use of JSON to pass in parameters that are more easily readable like "breadth": "1, 2", "level": "<= 200", etc.

Order by / sorting

We don't really have a way to order results by something. It may be worth looking into how other RESTful public APIs do this and implementing it here.

API 502 responses

It looks like there is an issue with the API that appears identical to #89, which was resolved a couple of months ago.
The API worked fine a couple of days ago. Every call I try now results in a 502 error.

Example:

curl "https://cobalt.qas.im/api/1.0/textbooks/filter?q=course_code:csc&key=MY_KEY&limit=100"

returns

<html>
<head><title>502 Bad Gateway</title></head>
<body bgcolor="white">
<center><h1>502 Bad Gateway</h1></center>
<hr><center>nginx/1.9.12</center>
</body>
</html>

Thanks a bunch again!!

Item endpoints should not need the `/show`

Item endpoints like courses/show/:id and buildings/show/:id shouldn't need the /show.

They should just be accessible as courses/:id and buildings/:id respectively.

API server down?

The website and the API server seem to be down at the moment.
Is it going to come back soon?

Problem in self-hosting

info: Cobalt is listening at http://localhost:4242
info: Connected to MongoDB
info: Synced athletics.
info: Synced parking.
info: Synced food.
info: Synced shuttles.
info: Synced buildings.
info: Synced exams.
info: Synced textbooks.
info: Synced courses.
Mongoose: mpromise (mongoose's default promise library) is deprecated, plug in your own promise library instead: http://mongoosejs.com/docs/promises.html

Labs and printers are not synced. Any suggestions?

Unit testing?

We should have some form of testing as we build this out. How would one unit test an API?

404 for cdf/labs and printers

I queried data with this API URL, https://cobalt.qas.im/api/1.0/cdf/labs?key=MY_KEY, but I got the message {"error":{"code":404,"message":"Not Found"}}. Any ideas are appreciated! Thank you!

Use `food` instead of `foods` as MongoDB collection name for Food vendors

@kshvmdn so turns out having this different is making the database generator really mad. I know I originally said to leave it as is, but let's patch that. My bad!

Exams API

Got started on this.

Add unit tests for buildings

The unit tests for courses are there, we just need test data and cases for buildings next. See https://github.com/cobalt-uoft/cobalt/tree/master/test/buildings.

Date parameters in filter endpoints should support multiple formats?

There are 4 ways to create dates in JS:

new Date() // This moment
new Date(value) // Milliseconds since January 1, 1970
new Date(dateString) // ISO-8601 formatted date or datetime
new Date(year, month[, day[, hour[, minutes[, seconds[, milliseconds]]]]])

Right now, endpoints that take date as part of the filter query support the 4th method. We should add support for the other 3 as well.

I think we'd need a method to differentiate each call in our query parser.

For milliseconds since January 1, 1970, we could take an integer like date:1461461852000.
For dateString, we could check whether the string contains commas. The special case is the 4th method with only a year, like date:"2016", which has no commas in it; maybe we check whether the string is exactly 4 characters long.
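A rough sketch of the heuristic described above, to make the trade-offs concrete. `parseDateParam` is a hypothetical name. The sketch assumes comma-separated values are passed straight to the multi-argument Date constructor (so months stay zero-based, as in JS), and that a bare 4-digit value always means a year.

```javascript
function parseDateParam(raw) {
  const value = String(raw).trim();
  // Comma-separated parts or a bare 4-digit year: constructor-argument style.
  if (value.includes(',') || /^\d{4}$/.test(value)) {
    const parts = value.split(',').map((p) => Number(p.trim()));
    if (parts.some(Number.isNaN)) return null;
    // Always pass a month so a lone year is not read as milliseconds.
    const [year, month = 0, ...rest] = parts;
    return new Date(year, month, ...rest);
  }
  // Any other all-digit value: milliseconds since January 1, 1970.
  if (/^\d+$/.test(value)) return new Date(Number(value));
  // Everything else: treat as an ISO 8601 date or datetime string.
  const parsed = new Date(value);
  return Number.isNaN(parsed.getTime()) ? null : parsed;
}

console.log(parseDateParam('2016,3,20'));
console.log(parseDateParam('2016-04-01T00:00:00Z'));
```

One consequence of the 4-character rule: millisecond values between 1000 and 9999 can no longer be expressed, which is probably acceptable for this API.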

Opinions?

Add code coverage

This guide seems helpful: https://github.com/sindresorhus/ava/blob/master/docs/recipes/code-coverage.md

Tracking / analytics

We're looking for low-cost (resource wise) analytics for the APIs, so we can potentially see how developers are using Cobalt.

Is this something as simple as using a sophisticated logger which writes to a file we can then post-process or is there something better?

Revamp how we filter

The filter code is probably some of the oldest living code in the entire project; it was one of the first things Ivan and I worked on back in 2014. It worked well with one or two endpoints, but we're almost at 10 now and it's about time to revisit this.

I'm going to start working on a query tokenizer module along with a token parser, laying out the groundwork for all future APIs to follow (and we will slowly move old filter code to the new one).

Here's what I'm thinking so far:

Query tokenizer
- Takes the raw user query and splits it into pieces (i.e. date:>"2016-04-28", code:-"CSC108")
- The return value will be a multi-dimensional array: it splits first on AND and then splits each of those pieces on OR (since AND takes precedence)
Token parser
- Takes a piece from what the tokenizer outputs and converts it into an object that has data on what the token is trying to accomplish
- The return value of the token parser will tell us whether errors occurred during parsing and whether a further mapreduce step is required for sub-documents.
- It will also give the operator from the original token, and the raw MongoDB query that is needed to fulfill the filter itself
- Should support the following
  - Numbers, arrays of numbers
  - Strings, arrays of strings
  - Dates in the format of YYYY-MM-DD
  - Times in the format of HH:MM or simply just seconds since midnight
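The plan above could be sketched roughly as follows. This is an illustration under assumptions, not the module that was eventually written: the names `tokenize`, `parseToken` and `OPERATORS` are hypothetical, mapping `-` to MongoDB's `$ne` is a guess at the intended negation semantics, and arrays, dates and times are left out for brevity.

```javascript
// Splits on AND first, then each piece on OR, giving an array of arrays.
function tokenize(query) {
  return query
    .split(/\s+AND\s+/)
    .map((group) => group.split(/\s+OR\s+/).map((t) => t.trim()));
}

// Assumed mapping from filter operators to MongoDB comparison operators.
const OPERATORS = { '<=': '$lte', '>=': '$gte', '<': '$lt', '>': '$gt', '-': '$ne' };

// Turns one token into its key, operator, value and a raw MongoDB query.
function parseToken(token) {
  const m = /^([A-Za-z_]+):(<=|>=|<|>|-)?(.+)$/.exec(token);
  if (!m) return { error: `Malformed token: ${token}` };
  const [, key, operator = '', rawValue] = m;
  const unquoted = rawValue.replace(/^"(.*)"$/, '$1');
  // Only unquoted numeric literals become numbers; quoted values stay strings.
  const isNumber = rawValue === unquoted && /^\d+(\.\d+)?$/.test(unquoted);
  const value = isNumber ? Number(unquoted) : unquoted;
  const mongo = operator
    ? { [key]: { [OPERATORS[operator]]: value } }
    : { [key]: value };
  return { key, operator, value, mongo };
}

const tokens = tokenize('date:>"2016-04-28" AND code:-"CSC108" OR code:-"CSC148"');
console.log(JSON.stringify(tokens));
console.log(JSON.stringify(parseToken(tokens[1][0])));
```

Each inner array holds tokens joined by OR, and the outer array joins those groups with AND, so the parsed pieces could later be assembled into nested `$and` / `$or` clauses.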

Events API

Attempts to add Events to the Cobalt API. ¯\_(ツ)_/¯

UTM courses not returning full course codes for its prerequisites

Ex. ANT327H5S
prerequisites: "ANT(200H5, 201H5)/200Y5"

Write API documentation

Need some more complete documentation on how to use the API, both for developers wishing to consume the API, and to serve as a spec for implementation.

I recommend Apiary.

A way to switch between dataset releases

Cobalt has a few historical datasets; there should be a way to run the APIs on more than just the latest one.

Move continuous integration to TravisCI

It's just the right thing to do

DeprecationWarning: Mongoose: mpromise (mongoose's default promise library) is deprecated, plug in your own promise library instead

(node:2307) DeprecationWarning: Mongoose: mpromise (mongoose's default promise library)
is deprecated, plug in your own promise library instead:
http://mongoosejs.com/docs/promises.html

Any suggestions?

Update tests to use promises / async instead of callbacks

Design a comprehensive analytics system behind the API

Bring code coverage up to 90%

List of things not tested:

A 404 page
A negative or NaN value for skip param
A sort param with length 1
Tests for each filter type (<, >, <=, >=, -) for each filter endpoint
Exclude src/db/index.js from coverage detection

Pass in API key via header instead of URL param

Presently, API keys are passed in via the URL, like so:

https://cobalt.qas.im/api/1.0/buildings/080?key=API_KEY

It's better to pass it in as the Authorization header of the HTTP request, so it isn't as easily exposed.
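A minimal sketch of how key lookup might work during a transition, assuming the server sees lowercased header names (as Node's HTTP module provides them) and that the old `key` URL parameter stays as a fallback. `extractApiKey` is a hypothetical helper, and accepting a "Bearer"-style prefix is an assumption rather than something this issue specifies.

```javascript
// Hypothetical helper: prefers the Authorization header, falling back to the
// existing `key` URL parameter so current clients keep working.
function extractApiKey(headers, query) {
  const header = headers.authorization;
  if (typeof header === 'string' && header.trim() !== '') {
    // Accept either a bare key or a "Bearer <key>"-style value (assumption).
    const parts = header.trim().split(/\s+/);
    return parts[parts.length - 1];
  }
  return query.key || null;
}

console.log(extractApiKey({ authorization: 'API_KEY' }, {}));
console.log(extractApiKey({}, { key: 'API_KEY' }));
```

Keeping the fallback avoids breaking existing clients, while new clients can keep the key out of URLs and access logs.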

Athletics API

Started work on the API for athletics.

Calendar API

It'd be nice to have an API that provided data from places like the Arts & Science important dates pages and other similar things.

Add filter endpoint to buildings

Food API: Move filter to newer ParseQuery model + abstract the `open` filter

502 bad gateway

Hi,
It looks like there is some issue with the API. I noticed today that I get 502 responses. Everything was working a couple days ago.

Would anyone mind looking into this?

Thanks!

Building API: keys out of order

Building API shows keys in different order:

polygon should be below the address key.

Should we discontinue /search in favour of /filter?

Search functionality can be reproduced in filter by doing string comparisons. Search's only advantage is that it can do whole word text searches on pre-computed indexes (some speed increase). But it does add a little bit of confusion for a new user to figure out if that's even worth it. Should we just discontinue search and improve our algorithm for filter string comparisons?