connectedplacescatapult / sharingcitiesdashboard

Personalisable dashboard for Sharing Cities IoT data

License: Apache License 2.0
Once the request is verified, launch a new thread to run the analytics request. The new thread should have access to the task_id, user_id, and the API address to which it needs to send the data. The thread should also have access to the database, to save the computed data.
So that it fits into the setup @arya-hemanshu is creating.
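A minimal sketch of the hand-off described above, assuming the worker is a plain thread that receives the task ID, user ID, API address, and a database handle. The names `run_analytics` and `launch_analytics_thread` are illustrative, not the project's actual API:

```python
import threading

def run_analytics(task_id, user_id, api_address, db):
    # Placeholder for the real analytics job: compute results,
    # POST them to api_address, and persist them via db.
    result = {"task_id": task_id, "user_id": user_id, "status": "done"}
    db.append(result)  # stand-in for a real database save
    return result

def launch_analytics_thread(task_id, user_id, api_address, db):
    """Spawn a worker thread that owns everything it needs:
    the task/user IDs, the callback API address, and a DB handle."""
    thread = threading.Thread(
        target=run_analytics,
        args=(task_id, user_id, api_address, db),
        daemon=True,
    )
    thread.start()
    return thread

# Usage: a list stands in for the database connection.
fake_db = []
t = launch_analytics_thread(42, 7, "http://example.org/callback", fake_db)
t.join()
```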
Ensure that tests are in place to verify that we are getting the data we expect from each of the incoming APIs.
Allow users to pass a date range and train the model using only the data within that range.
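Restricting training data to a range could be a simple timestamp filter, sketched here with pandas (the column name `timestamp` is an assumption):

```python
import pandas as pd

def filter_training_data(df, date_from, date_to, time_col="timestamp"):
    """Keep only rows whose timestamp falls inside the requested range,
    so the model is trained on that window alone."""
    mask = (df[time_col] >= date_from) & (df[time_col] <= date_to)
    return df.loc[mask]

# Usage with a toy frame.
df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2018-03-10", "2018-03-12", "2018-03-15"]),
    "value": [1.0, 2.0, 3.0],
})
subset = filter_training_data(df, "2018-03-11", "2018-03-14")
```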
Based on existing test code and new API client infrastructure.
Based on user input, update the null or empty cells.
Example: http://datos.santander.es/api/rest/datasets/sensores_smart_parking.json
This API returns data whenever you call it. However, the timestamp gets updated only when the status is changed. How do we determine how often to ping it?
With the passed column and table names, retrieve the data from the database and convert it into a dataframe so that it can be passed on for type checking.
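A sketch of that retrieval step, using an in-memory SQLite table as a stand-in for the project's database (table and column names are illustrative; real code should also validate identifiers before interpolating them into SQL):

```python
import sqlite3
import pandas as pd

def table_to_dataframe(conn, table, columns):
    """Fetch the requested columns from a table and return a DataFrame
    ready for type checking. `conn` is any DB-API connection."""
    cols = ", ".join(columns)
    return pd.read_sql_query(f"SELECT {cols} FROM {table}", conn)

# Usage with a throwaway SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (site TEXT, no2 REAL)")
conn.executemany("INSERT INTO readings VALUES (?, ?)", [("a", 1.5), ("b", 2.5)])
df = table_to_dataframe(conn, "readings", ["site", "no2"])
```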
Users may want to plot temperature vs bikes available at a particular site. For this to happen, time has to be normalised between the two. Approaches to this have been discussed and investigated (e.g. forward filling and back filling).
Recommendation for a first implementation: only allow the combination of two fields from two different data sources, no more.
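The forward-filling approach mentioned above can be sketched with pandas: align the two sources on a union of their timestamps, then carry the most recent known value forward. Column names are illustrative:

```python
import pandas as pd

# Two sources sampled at different times: temperature and bikes available.
temp = pd.DataFrame(
    {"temperature": [10.0, 12.0]},
    index=pd.to_datetime(["2018-03-12 00:00", "2018-03-12 02:00"]),
)
bikes = pd.DataFrame(
    {"bikes_available": [5, 3, 4]},
    index=pd.to_datetime(
        ["2018-03-12 00:30", "2018-03-12 01:30", "2018-03-12 02:30"]
    ),
)

# Outer join on timestamps, then forward-fill so every row holds the
# most recent known value from each source.
combined = temp.join(bikes, how="outer").ffill()
```

Back filling (`bfill`) would work the same way in the other direction; which is appropriate depends on the variable's semantics.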
Create a tool to migrate the MySQL database to PostgreSQL.
Pipeline to build automated tests on branch creation and merging branch into master - Backend
For the UI and the analytics engine.
Pipeline to build automated tests on branch creation and merging branch into master - Frontend
Build Docker image
Upload it to Docker Hub
Create a Wercker file
Provide an overview of the project.
Validate the data type of retrieved data
Based on existing test code and new API client infrastructure.
Build deployment pipeline for production environment
i) Message about the status of the request
ii) ID of the request
iii) ID of the requestor
iv) Timestamp of the response
Based on existing test code and new API client infrastructure.
Enabling continuous integration and testing.
ignore_object_tags only considers tags in the first level of the JSON hierarchy.
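One possible fix is to walk the structure recursively instead of only the top level. This is a sketch; the real `ignore_object_tags` signature may differ:

```python
def ignore_object_tags(obj, tags):
    """Recursively drop the given keys at every level of the JSON
    hierarchy, not just the first (sketch of a possible fix)."""
    if isinstance(obj, dict):
        return {k: ignore_object_tags(v, tags)
                for k, v in obj.items() if k not in tags}
    if isinstance(obj, list):
        return [ignore_object_tags(item, tags) for item in obj]
    return obj

# Usage: "id" is stripped at every depth, not only the top level.
data = {"id": 1, "meta": {"id": 2, "name": "x"}, "items": [{"id": 3}]}
cleaned = ignore_object_tags(data, {"id"})
```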
The documentation has an incorrect date range in the request section; update the documentation to reflect the current method.
Pipeline to build automated tests on branch creation and merging branch into master - Analytics
We want the users / cities to have their own API credentials that are used to access the data from each API. We're currently using our development credentials and shouldn't expose these. Ideally we need to:
Create readme for analytics folder
Depending on the selected operation, defaults should be supplied:
E.g. default missingValues = forward fill.
Referring to below:
{
"columnsX": {
"table1": ["col1", "col2", "col3"],
"table2": ["col4", "col5", "col6"]
},
"columnY": "ycola",
"tableY": "table3",
"operation": "classification",
"requestor_id": 456,
"timeseries": "True",
"missingValues": "mean",
"dataRange": "12-03-2018 00:00:00.12-03-2018 00:00:00"
}
Just testing this out
For every request, first check whether there is already an existing model:
i) If yes, return the prediction with the request ID.
ii) If no, check whether the data is under the default limit: if yes, create the model and send the predictions; if no, start processing and send the request ID.
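The request flow above can be sketched as a small dispatcher. The cache, row limit, and toy model here are placeholders, not the project's implementation:

```python
# Sketch of the request flow; names and limits are illustrative.
MODEL_CACHE = {}
DEFAULT_ROW_LIMIT = 10_000

def handle_request(request_id, model_key, data):
    if model_key in MODEL_CACHE:
        # Existing model: predict immediately.
        model = MODEL_CACHE[model_key]
        return {"request_id": request_id, "predictions": model(data)}
    if len(data) <= DEFAULT_ROW_LIMIT:
        # Small enough: train synchronously, cache, then predict.
        def model(rows):  # toy "model": predict the mean everywhere
            return [sum(rows) / len(rows)] * len(rows)
        MODEL_CACHE[model_key] = model
        return {"request_id": request_id, "predictions": model(data)}
    # Too large: kick off background processing, return the ID only.
    return {"request_id": request_id, "status": "processing"}

resp = handle_request(1, "no2-model", [1.0, 2.0, 3.0])
```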
TfL's TIMS feed will be a valuable addition.
http://content.tfl.gov.uk/tims-feed-technical-specification.pdf
Give the user an option in the API to send values on which the data can be filtered, e.g. if the user just wants to predict NO2, they should be able to send that in the POST request.
It would be good if users could view data with different temporal frequencies over a common temporal extent. This would require resampling one variable against the other,
e.g. variable x's temporal frequency is in minutes
while variable y's temporal frequency is in days.
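With pandas this is a resample: downsample the minute-level variable to the daily extent of the other (a sketch; the frequencies follow the example above):

```python
import pandas as pd

# Variable x is sampled every minute; variable y once per day.
x = pd.Series(
    [0.0, 1.0, 2.0],
    index=pd.date_range("2018-03-12", periods=3, freq="min"),
)
# Downsample x to daily means so both variables share a daily frequency.
x_daily = x.resample("D").mean()
```

Upsampling y to minutes with `ffill` is the alternative; downsampling loses less meaning when y is genuinely a daily quantity.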
Feature scale the data so that it can be passed to ML libraries for training purposes
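Feature scaling can be standardisation to zero mean and unit variance, sketched here in plain NumPy (scikit-learn's StandardScaler would do the same job):

```python
import numpy as np

def feature_scale(X):
    """Standardise each column to zero mean and unit variance so the
    matrix can be handed to an ML library for training."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero on constant columns
    return (X - mean) / std

# Usage: second column is constant and maps to all zeros.
X = np.array([[1.0, 10.0], [3.0, 10.0]])
X_scaled = feature_scale(X)
```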
In the date-range check, verify that users have passed valid values: if the from date is later than the to date, reject the request.
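A sketch of that validation, using the day-first timestamp format seen in the example payload's dataRange field:

```python
from datetime import datetime

def validate_date_range(date_from, date_to, fmt="%d-%m-%Y %H:%M:%S"):
    """Parse both dates and reject the request when the 'from' date is
    later than the 'to' date."""
    start = datetime.strptime(date_from, fmt)
    end = datetime.strptime(date_to, fmt)
    if start > end:
        raise ValueError("'from' date must not be later than 'to' date")
    return start, end

# Usage: a valid range parses; a reversed one raises ValueError.
ok = validate_date_range("12-03-2018 00:00:00", "13-03-2018 00:00:00")
```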
To be used by the front end globally and including:
Set up the repo so that users cannot merge directly into master, or merge without code review.
Test different machine learning libraries to identify which one is best suited to the project.
The application / DB will grow through time. Need to decide the following:
If a column or row in a dataframe contains NaN or is empty, fill it with the mean or median.
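Column-wise mean or median imputation is a one-liner in pandas; the `strategy` values here mirror the missingValues option in the example payload:

```python
import pandas as pd

def fill_missing(df, strategy="mean"):
    """Replace NaN cells with the column mean or median."""
    if strategy == "mean":
        return df.fillna(df.mean(numeric_only=True))
    if strategy == "median":
        return df.fillna(df.median(numeric_only=True))
    raise ValueError(f"unknown strategy: {strategy}")

# Usage: the NaN in the middle is replaced by the column mean.
df = pd.DataFrame({"no2": [1.0, None, 3.0]})
filled = fill_missing(df, "mean")
```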
Set up tests for basic functionality and health of the API client.
Create an api endpoint that accepts:
i) Column name for dependent variables
ii) Table names for dependent variables
iii) Column name for Independent Variable
iv) Table name for Independent Variable
v) Operation to be performed on dataset
vi) Requestor Id
vii) A flag to indicate whether the data is timeseries data or not.
viii) A flag to indicate what to do with null rows, if any
ix) Range of data
i) Error message defining what is actually missing and what caused the error while accepting the request.
ii) Request ID (needs more thought; is it even required?)
i) Prediction array
ii) Timestamp for the predictions
iii) Request id
iv) Timestamp of the response
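The accept/reject shapes above can be sketched as a small validator. The field names follow the example payload elsewhere in this document; the request-ID scheme (a UUID) is an assumption:

```python
import time
import uuid

REQUIRED_FIELDS = [
    "columnsX", "columnY", "tableY", "operation",
    "requestor_id", "timeseries", "missingValues", "dataRange",
]

def accept_request(payload):
    """Validate an incoming analytics request and build either the
    error response or the acceptance response described above."""
    missing = [f for f in REQUIRED_FIELDS if f not in payload]
    if missing:
        return {
            "error": f"missing fields: {', '.join(missing)}",
            "request_id": None,
        }
    return {
        "message": "request accepted",
        "request_id": str(uuid.uuid4()),
        "requestor_id": payload["requestor_id"],
        "timestamp": time.time(),
    }

# Usage: an incomplete payload yields an error naming what is missing.
resp = accept_request({"columnY": "ycola"})
```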
There's no guarantee that the incoming data are errorless. We need to find a way to filter out erroneous (but numerical) records.
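One simple first pass at this is a z-score filter: drop numerical readings that sit implausibly far from the mean. This is a sketch, not a full data-quality strategy (it assumes roughly unimodal data):

```python
import numpy as np

def drop_outliers(values, z_threshold=3.0):
    """Remove readings whose z-score exceeds the threshold."""
    arr = np.asarray(values, dtype=float)
    std = arr.std()
    if std == 0:
        return arr  # all values identical: nothing to drop
    z = np.abs((arr - arr.mean()) / std)
    return arr[z <= z_threshold]

# Usage: the wildly large reading is removed, the rest survive.
readings = [20.1, 19.8, 20.3, 20.0, 5000.0]
clean = drop_outliers(readings, z_threshold=1.5)
```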
Include a summary of the data that are included in the app.
For completed analytics runs and when new data is gathered through the API client. Rely on push notifications (EventSource or similar).
Look at all the code created to gather data from various APIs, find a common ground and create an infrastructure to add them to, and any new ones.
Should explain the structure of the code, how to extend it (add new APIs) and how to run tests where appropriate.