sensors's People

Contributors

prashtx

sensors's Issues

Add New Relic monitoring

Percentile stats would be great (95th and/or 99th percentile for service times), as would visibility into individual request types. Right now we only have coarse POST vs. GET via log-deranged + librato, and only mean/max.

We should test first on a dev service with simulated high-rate POSTs. The agent warnings we've seen elsewhere make me nervous. We should look for warnings that the agent's connection has been cut off, and we should monitor the memory usage.
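As a point of reference for the percentile stats requested above, here is a minimal sketch of a nearest-rank percentile over a list of service times. The function name and the sample timings are made up for illustration; a monitoring agent would compute this for us rather than application code.

```javascript
// Nearest-rank percentile over a list of service times in milliseconds.
// `percentile` and the sample data are illustrative, not part of the service.
function percentile(values, p) {
  if (values.length === 0) return undefined;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

const serviceTimesMs = [12, 15, 11, 240, 14, 13, 18, 16, 12, 900];
console.log(percentile(serviceTimesMs, 50)); // 14
console.log(percentile(serviceTimesMs, 95)); // 900
```

The median here is 14 ms while the 95th percentile is 900 ms, which is the kind of tail behavior that mean/max alone does not show clearly.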

Create a notion of sets of sources

A Set has one or more Sources as well as some metadata.

Creating/managing Sets requires some notion of users/permissions, though, which we don't currently have. Alternatively, this can be admin-only functionality until we create a user system.

Requesting an aggregation for a field that doesn't exist gives a 500

Support time-bounded queries

We can query pages of data, but we should also support querying based on time boundaries. We should probably enforce a max number of entries, unless we can cleanly stream the data end-to-end and avoid any memory issues.
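A rough sketch of what time-bounded querying with an enforced maximum could look like. The parameter names `from` and `to`, the limit of 1000, and both function names are assumptions for illustration; the in-memory filter stands in for whatever the database query would do.

```javascript
// Hypothetical query parameters: `from` and `to` as ISO 8601 timestamps.
const MAX_ENTRIES = 1000; // assumed cap on returned entries

function parseTimeBounds(query) {
  const from = new Date(query.from);
  const to = new Date(query.to);
  if (Number.isNaN(from.getTime()) || Number.isNaN(to.getTime())) {
    return { error: 'from and to must be valid ISO 8601 timestamps' };
  }
  if (from >= to) {
    return { error: 'from must be earlier than to' };
  }
  return { from, to, limit: MAX_ENTRIES };
}

// Keep entries with from <= timestamp < to, up to the limit.
function entriesBetween(entries, bounds) {
  return entries
    .filter((entry) => {
      const t = new Date(entry.timestamp);
      return t >= bounds.from && t < bounds.to;
    })
    .slice(0, bounds.limit);
}

const sample = [
  { timestamp: '2015-01-20T13:53:07.000Z' },
  { timestamp: '2015-01-20T13:53:17.000Z' },
  { timestamp: '2015-01-20T13:54:07.000Z' },
];
const bounds = parseTimeBounds({
  from: '2015-01-20T13:53:10Z',
  to: '2015-01-20T13:54:00Z',
});
console.log(entriesBetween(sample, bounds).length); // 1
```

Using a half-open interval (inclusive start, exclusive end) lets clients request adjacent windows without receiving the boundary entry twice.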

Stale cache issues when querying the API for large data sets

Hi,

I'm seeing stale cache issues across the API this morning, and I'm apparently not the only one with this problem.

A cursory look at the source code doesn't provide any good explanation for this issue, as I'm not seeing any HTTP cache headers set at the application level.

cURL'ing the API, however, indicates the presence of an ETag header, possibly added by Heroku middleware(?).

More importantly, this header doesn't seem to change with the body content. For example, querying the same URL twice, two minutes apart, gives the following headers:

$ curl -I "https://localdata-sensors.herokuapp.com/api/v1/sources/ci4lr75sf000602ypyfkxnua3/entries?startIndex=0&count=100000000000"
HTTP/1.1 200 OK
Server: Cowboy
Connection: keep-alive
X-Powered-By: Express
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: Content-Type
Content-Type: application/json; charset=utf-8
Content-Length: 245773
Etag: W/"EcvTQgeOsE1g83G22g9JuQ=="
Vary: Accept-Encoding
Date: Tue, 20 Jan 2015 13:52:59 GMT
Via: 1.1 vegur
$ curl -I "https://localdata-sensors.herokuapp.com/api/v1/sources/ci4lr75sf000602ypyfkxnua3/entries?startIndex=0&count=100000000000"
HTTP/1.1 200 OK
Server: Cowboy
Connection: keep-alive
X-Powered-By: Express
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: Content-Type
Content-Type: application/json; charset=utf-8
Content-Length: 245773
Etag: W/"EcvTQgeOsE1g83G22g9JuQ=="
Vary: Accept-Encoding
Date: Tue, 20 Jan 2015 13:54:20 GMT
Via: 1.1 vegur

You'll notice that while the Date varies, neither the ETag header nor the Content-Length header has changed, even though multiple sensor data points were added to the database in the meantime:

$ curl "https://localdata-sensors.herokuapp.com/api/v1/sources/ci4lr75sf000602ypyfkxnua3/entries?startIndex=7576&count=8" | python -mjson.tool
[
    {
        "data": {
            "airquality": "Fresh",
            "airquality_raw": 17,
            "dust": 1033.2,
            "humidity": 55.6,
            "light": 272,
            "location": [
                6.035054,
                46.1558904
            ],
            "sound": 1348,
            "temperature": 8,
            "uv": 267.74
        },
        "source": "ci4lr75sf000602ypyfkxnua3",
        "timestamp": "2015-01-20T13:53:07.000Z"
    },
    {
        "data": {
            "airquality": "Fresh",
            "airquality_raw": 17,
            "dust": 903.93,
            "humidity": 55.4,
            "light": 301,
            "location": [
                6.035054,
                46.1558904
            ],
            "sound": 1276,
            "temperature": 8,
            "uv": 267.74
        },
        "source": "ci4lr75sf000602ypyfkxnua3",
        "timestamp": "2015-01-20T13:53:17.000Z"
    },
    {
        "data": {
            "airquality": "Fresh",
            "airquality_raw": 17,
            "dust": 903.93,
            "humidity": 55.4,
            "light": 283,
            "location": [
                6.035054,
                46.1558904
            ],
            "sound": 1324,
            "temperature": 8,
            "uv": 267.74
        },
        "source": "ci4lr75sf000602ypyfkxnua3",
        "timestamp": "2015-01-20T13:53:27.000Z"
    },
    {
        "data": {
            "airquality": "Fresh",
            "airquality_raw": 17,
            "dust": 903.93,
            "humidity": 55.4,
            "light": 283,
            "location": [
                6.035054,
                46.1558904
            ],
            "sound": 1336,
            "temperature": 8,
            "uv": 267.74
        },
        "source": "ci4lr75sf000602ypyfkxnua3",
        "timestamp": "2015-01-20T13:53:37.000Z"
    },
    {
        "data": {
            "airquality": "Fresh",
            "airquality_raw": 17,
            "dust": 1349.25,
            "humidity": 55.3,
            "light": 283,
            "location": [
                6.035054,
                46.1558904
            ],
            "sound": 1312,
            "temperature": 8,
            "uv": 267.74
        },
        "source": "ci4lr75sf000602ypyfkxnua3",
        "timestamp": "2015-01-20T13:53:47.000Z"
    },
    {
        "data": {
            "airquality": "Fresh",
            "airquality_raw": 17,
            "dust": 1349.25,
            "humidity": 55.3,
            "light": 265,
            "location": [
                6.035054,
                46.1558904
            ],
            "sound": 1324,
            "temperature": 8,
            "uv": 267.74
        },
        "source": "ci4lr75sf000602ypyfkxnua3",
        "timestamp": "2015-01-20T13:53:57.000Z"
    },
    {
        "data": {
            "airquality": "Fresh",
            "airquality_raw": 17,
            "dust": 1349.25,
            "humidity": 55.2,
            "light": 265,
            "location": [
                6.035054,
                46.1558904
            ],
            "sound": 1348,
            "temperature": 8,
            "uv": 267.74
        },
        "source": "ci4lr75sf000602ypyfkxnua3",
        "timestamp": "2015-01-20T13:54:07.000Z"
    },
    {
        "data": {
            "airquality": "Fresh",
            "airquality_raw": 17,
            "dust": 996.08,
            "humidity": 55.2,
            "light": 265,
            "location": [
                6.035054,
                46.1558904
            ],
            "sound": 1320,
            "temperature": 8,
            "uv": 267.74
        },
        "source": "ci4lr75sf000602ypyfkxnua3",
        "timestamp": "2015-01-20T13:54:17.000Z"
    }
]

Note that this seems to affect only large data sets, which also seem to have incorrect Content-Length headers.
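For comparison, a validator derived from the complete response body changes whenever the body does. The sketch below shows that idea using Node's built-in `crypto` module; it is not the service's actual ETag code, and the function name and sample bodies are made up.

```javascript
const crypto = require('crypto');

// Build a weak ETag from the complete response body (illustrative helper).
function weakEtag(body) {
  const digest = crypto.createHash('sha1').update(body).digest('base64');
  return `W/"${digest}"`;
}

const before = JSON.stringify([{ timestamp: '2015-01-20T13:53:07.000Z' }]);
const after = JSON.stringify([
  { timestamp: '2015-01-20T13:53:07.000Z' },
  { timestamp: '2015-01-20T13:53:17.000Z' },
]);

console.log(weakEtag(before) === weakEtag(after)); // false
```

If the ETag stays the same while new entries are added, the tag is presumably being computed from something other than the full, current body, which would be consistent with the unchanged Content-Length.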

Add an aggregation endpoint

We're going to want some server-side aggregation of stats (daily, weekly, monthly to start?). We'll want it by source and probably additionally grouped by location. If responses are coming in every 10 seconds, that's way more than we want to send over the wire for clients to process.
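To make the daily case concrete, here is a small sketch that groups entries by UTC day and averages one numeric field. It works on in-memory entries shaped like the API's JSON output; a real endpoint would more likely push the grouping into the database. The function name is hypothetical.

```javascript
// Group entries by UTC day and average one numeric field.
// `dailyMean` is an illustrative name, not an existing function in the service.
function dailyMean(entries, field) {
  const buckets = new Map();
  for (const entry of entries) {
    const value = entry.data[field];
    if (typeof value !== 'number') continue; // skip missing or non-numeric values
    const day = entry.timestamp.slice(0, 10); // "YYYY-MM-DD" of an ISO 8601 UTC timestamp
    const bucket = buckets.get(day) || { sum: 0, count: 0 };
    bucket.sum += value;
    bucket.count += 1;
    buckets.set(day, bucket);
  }
  return [...buckets].map(([day, { sum, count }]) => ({ day, count, mean: sum / count }));
}

const entries = [
  { timestamp: '2015-01-20T13:53:07.000Z', data: { light: 272 } },
  { timestamp: '2015-01-20T13:53:17.000Z', data: { light: 301 } },
  { timestamp: '2015-01-21T08:00:00.000Z', data: { light: 100 } },
];
console.log(dailyMean(entries, 'light'));
```

At one reading every 10 seconds, a day holds 8,640 entries per source, so a daily aggregate reduces that to a single row per field.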

Out of range query params should trigger a 4XX response

Currently, the count query param is silently capped at 1000. This is opaque to users of the API, who get the same data set whether the count query param is set to 1000 or 100000000.

Instead of silently capping the value, I suggest returning a 4XX response with an appropriate error message when the count query param is greater than 1000.
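A minimal sketch of the suggested behavior, written as a plain validation function. The choice of 400 as the specific status code, the function name, and the error messages are assumptions; only the cap of 1000 comes from the current behavior.

```javascript
const MAX_COUNT = 1000; // the current cap

// Returns either a usable count or a 4XX-style error description.
// 400 is an assumed choice of status code.
function validateCount(rawCount) {
  const count = Number(rawCount);
  if (!Number.isInteger(count) || count < 1) {
    return { status: 400, error: 'count must be a positive integer' };
  }
  if (count > MAX_COUNT) {
    return { status: 400, error: `count must not exceed ${MAX_COUNT}` };
  }
  return { status: 200, count };
}

console.log(validateCount('1000'));         // accepted
console.log(validateCount('100000000000')); // rejected: exceeds the cap
```

A route handler could call this before querying and send the error as the response body, so clients learn about the limit instead of silently receiving fewer entries.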

Switch to an ORM

Raw queries were great for quick prototyping, but now we should introduce a layer of abstraction. Sequelize seems like a good option, and I think it uses the same low-level driver we currently use.

We should have tests in place (#1) before we do this.

Get details for a source

A GET to http://localdata-sensors.herokuapp.com/api/v1/sources/ci4omwt7k0003nm0u92mt03al should return details (like name, latlng)

Enhance logging

We should stop logging the no-api-version warning, since we have a large deployed population of devices that fail to reference the version.

We aren't adding much info, if any, to the Heroku router's logs for entry POSTs. We can consider logging some info for source POSTs or GETs, in case we want to see user agent strings, for example.

The express logs reference the router's internal IP address rather than the IP address of the original client.
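One possible approach to the client IP problem is to read the `X-Forwarded-For` header. The sketch below assumes a single trusted proxy (the router) that appends the address it saw to that header; the function name and sample addresses are illustrative. Express also has a `trust proxy` application setting that makes `req.ip` honor this header, which may be the simpler route.

```javascript
// Pick the client address from X-Forwarded-For, falling back to the socket address.
// Assumption: exactly one trusted proxy sits in front of the app and appends
// the address it saw, so the last entry is the one that proxy observed.
function clientIp(headers, socketAddress) {
  const forwarded = headers['x-forwarded-for']; // Node lowercases header names
  if (!forwarded) return socketAddress;
  const hops = forwarded.split(',').map((hop) => hop.trim()).filter(Boolean);
  return hops.length > 0 ? hops[hops.length - 1] : socketAddress;
}

console.log(clientIp({ 'x-forwarded-for': '203.0.113.7, 198.51.100.2' }, '10.0.0.1'));
// 198.51.100.2
console.log(clientIp({}, '10.0.0.1'));
// 10.0.0.1
```

Earlier entries in the header are supplied by the client and can be spoofed, which is why the sketch only trusts the entry added by the proxy.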

Provide timezone info for each data point

Displaying the data with meaningful local times requires working out each device's timezone and whether DST applies at each timestamp.

This is doable on the fly, but requires relying on a third party to get the information for every timestamp.

Could this be provided by the API itself?

As the time offset might change with DST, this info would need to be provided for each data point.
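If the API knew an IANA time zone name for each source, the per-point offset could be computed with the built-in `Intl` API. This is only a sketch: the zone name `Europe/Zurich` is a guess used for illustration, the function name is hypothetical, and it relies on the runtime shipping time zone data (recent Node releases do by default).

```javascript
// UTC offset, in minutes, of `timeZone` at the instant `date`.
// The zone name passed in is an assumption; mapping a device to a zone is out of scope here.
function utcOffsetMinutes(date, timeZone) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    hourCycle: 'h23',
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
    hour: '2-digit',
    minute: '2-digit',
    second: '2-digit',
  }).formatToParts(date);
  const get = (type) => Number(parts.find((part) => part.type === type).value);
  // Interpret the zone's wall-clock time as if it were UTC, then compare.
  const wallClockAsUtc = Date.UTC(
    get('year'), get('month') - 1, get('day'),
    get('hour'), get('minute'), get('second')
  );
  const wholeSeconds = Math.floor(date.getTime() / 1000) * 1000;
  return (wallClockAsUtc - wholeSeconds) / 60000;
}

console.log(utcOffsetMinutes(new Date('2015-01-20T13:53:07.000Z'), 'Europe/Zurich')); // 60
console.log(utcOffsetMinutes(new Date('2015-07-20T13:53:07.000Z'), 'Europe/Zurich')); // 120
```

The two results differ because of DST, which illustrates why the offset has to accompany each data point rather than each source.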

Endpoint for finding sources

We will want to find all sources by location, or simply all sources. A GET to /api/v1/sources should return something interesting that helps!

Switch to knex.js

Knex.js has a streaming interface that uses a cursor (via pg-query-stream). Since we're accessing chunks of fairly raw time-series data, rather than objects with more structured relationships, we don't take advantage of Sequelize's higher-level features. Initial investigations suggest that knex might just be nicer to work with, too.

The streaming interface indirectly necessitates the use of the JavaScript pg client (vs. the native client), so we should compare performance before and after.
