
localdata-tiles's People

Contributors

bensheldon, hampelm, mr0grog, prashtx, yuletide


localdata-tiles's Issues

Rendering is broken

We're getting 0 results from the datasource for known-good bounds. Not sure yet whether something happened to the projections or to the queries.

Custom-location entries have insufficient UTFGrid data

If a survey has points that are unassociated with a base feature, then we need to pass along the response ID in the UTFGrid. Otherwise, we have no way of retrieving more information about the item.

Right now, this requires a small change both in tiles.js and in the MongoDB nodetiles datasource.

UTFGrid files have the wrong content type on S3

The browser interprets it as a script anyway, but it's poor form. We use the same S3 caching middleware for tile PNGs and UTFGrid JSON, so I suspect that's the culprit. We send to S3 after we've sent the data to the client, so we should be able to just check the Content-Type there.
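A minimal sketch of that check, assuming the caching middleware still has the response object in hand when it uploads. `s3HeadersFromResponse` and the fallback type are hypothetical names for illustration; `res.getHeader` is the standard Node response method.

```javascript
// Hypothetical helper: build the S3 upload headers from the response that
// was already sent to the client, so the cached copy keeps the same type.
function s3HeadersFromResponse(res, fallbackType) {
  // Prefer the type the handler set; fall back only if none was set.
  const type = res.getHeader('Content-Type') || fallbackType;
  return { 'Content-Type': type };
}
```

With this shape the middleware does not need to know whether it is caching a PNG or a UTFGrid; it copies whatever the handler declared.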

Custom locations crash the renderer

2014-02-27T19:09:01.436692+00:00 app[web.2]:     at /app/node_modules/nodetiles-mongodb/MongoDB.js:166:34
2014-02-27T19:09:01.436692+00:00 app[web.2]:     at /app/node_modules/mongoose/node_modules/mongodb/lib/mongodb/cursor.js:173:9
2014-02-27T19:09:01.436692+00:00 app[web.2]:     at /app/node_modules/mongoose/node_modules/mongodb/lib/mongodb/cursor.js:205:48
2014-02-27T19:09:01.436692+00:00 app[web.2]:     at /app/node_modules/mongoose/node_modules/mongodb/lib/mongodb/cursor.js:855:28
2014-02-27T19:09:01.436692+00:00 app[web.2]:     at /app/node_modules/mongoose/node_modules/mongodb/lib/mongodb/db.js:1806:9
2014-02-27T19:09:01.436864+00:00 app[web.2]:     at Server.Base._callHandler (/app/node_modules/mongoose/node_modules/mongodb/lib/mongodb/connection/base.js:442:41)
2014-02-27T19:09:01.338438+00:00 app[web.2]: 
2014-02-27T19:09:01.389989+00:00 app[web.2]: /app/node_modules/mongoose/node_modules/mongodb/lib/mongodb/connection/base.js:242
2014-02-27T19:09:01.436535+00:00 app[web.2]:         throw message;      
2014-02-27T19:09:01.436635+00:00 app[web.2]:               ^
2014-02-27T19:09:01.436692+00:00 app[web.2]: TypeError: Cannot read property '0' of undefined
2014-02-27T19:09:01.436692+00:00 app[web.2]:     at Object.project.Point (/app/node_modules/nodetiles-core/lib/projector.js:163:21)
2014-02-27T19:09:01.436692+00:00 app[web.2]:     at Object.util.latLonToMeters (/app/node_modules/nodetiles-core/lib/projector.js:70:14)
2014-02-27T19:09:01.436692+00:00 app[web.2]:     at Object.project.FeatureCollection (/app/node_modules/nodetiles-core/lib/projector.js:106:33)
2014-02-27T19:09:01.436692+00:00 app[web.2]:     at Object.project.Feature (/app/node_modules/nodetiles-core/lib/projector.js:115:55)
2014-02-27T19:09:03.759128+00:00 heroku[web.2]: Process exited with status 8
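The trace is consistent with `project.Point` receiving a geometry that has no coordinates array, though that is an inference from the error message rather than a confirmed cause. One possible guard, sketched here with hypothetical helper names and assuming GeoJSON-style features, is to drop such features before they reach the projector:

```javascript
// Returns true only when a GeoJSON-style feature carries a non-empty
// coordinates array. The feature shape is an assumption.
function hasCoordinates(feature) {
  const geometry = feature && feature.geometry;
  return Boolean(geometry) &&
    Array.isArray(geometry.coordinates) &&
    geometry.coordinates.length > 0;
}

// Drop features that would fail during projection and report how many
// were skipped, so the problem stays visible in the logs.
function dropUnprojectable(features) {
  const usable = features.filter(hasCoordinates);
  return { usable, skipped: features.length - usable.length };
}
```

This only keeps the process alive. Custom-location entries would still need a proper geometry to be drawn.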

Is it faster to redirect to S3 or proxy the data?

If we have a cached tile, we currently send a 302 response and redirect the client to the cached tile's S3 URL. There's some overhead in waiting for that response and then initiating a new request. The time to retrieve the S3 object for the Heroku-hosted app should be small, though, so it might be faster to just send the data instead of the 302 response.

Thoughts:

  • We already request the S3 object's headers. If we are going to fetch the whole object anyway, it could make sense to fetch it up front instead of only the headers. That would increase miss latency but decrease hit latency.
  • What will this do to memory, CPU, and network usage?

Use forms to generate the list of questions and answers instead of the survey stats

This requires coordination with the dashboard. We should be able to generate the list of valid questions, and the list of possible answers for each, from the current and past forms. Then we're dealing with a handful (or dozens at the high end) of objects rather than thousands or tens of thousands.

This should speed up the initial tiles that we render for each survey, and it should wreak less havoc on the mongodb cache, once we have a full-fledged instance where the cache is really being helpful.

Empty PNGs should short-circuit the S3 code path

If there is no data for a tile, we still check the S3 cache and potentially redirect the client to S3, even though the result will be identical to every other empty tile.

We should bypass S3 if there is no data. Ideally, we should skip the tile rendering and return a static, "blank" PNG. If we have a background color set for the map style (unlikely for this tile server), then we won't send a fully transparent PNG.

This ought to save the client from making several requests (since we won't redirect to S3), and it should reduce the latency in responding to the remaining requests.
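A rough sketch of the short-circuit, assuming the handler already knows how many features fall inside the tile and that a pre-rendered transparent tile is loaded into a Buffer at startup. The function name and both parameters are hypothetical.

```javascript
// Hypothetical handler step. `blankPng` is assumed to be a Buffer holding a
// pre-rendered transparent tile, loaded once at startup.
function sendBlankTileIfEmpty(featureCount, blankPng, res) {
  if (featureCount > 0) {
    return false; // Caller continues with the S3 check and rendering.
  }
  res.setHeader('Content-Type', 'image/png');
  res.setHeader('Content-Length', blankPng.length);
  res.end(blankPng);
  return true; // Response is complete; skip S3 and the renderer.
}
```

If a map style ever sets a background color, the blank tile would have to be rendered once per style instead of shared.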

Add a "request load" metric

Similar to a CPU load average, which indicates the number of tasks waiting to run, the request load would measure the number of active requests.

The tile server tends to get requests in groups: multiple tile images and their UTFGrids. Understanding the "overlap" in our requests could be informative. For example, the default of 5 MongoDB connections might be inadequate.
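One way to measure this, sketched as Connect/Express-style middleware. `createRequestLoad` is a hypothetical name, and how the numbers get reported (logs, a metrics service) is left open.

```javascript
// Tracks how many requests are in flight and the highest value seen.
function createRequestLoad() {
  let active = 0;
  let peak = 0;

  function middleware(req, res, next) {
    active += 1;
    peak = Math.max(peak, active);

    let done = false;
    function finish() {
      if (done) { return; } // 'finish' and 'close' can both fire.
      done = true;
      active -= 1;
    }
    res.on('finish', finish); // Response fully sent.
    res.on('close', finish);  // Connection closed, possibly early.
    next();
  }

  return {
    middleware,
    active: () => active,
    peak: () => peak
  };
}
```

Comparing the peak against the MongoDB connection pool size would show whether requests are queuing for a connection.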

Exception in tiles.js if no question object for key -- occurred in collector map

Probably occurred in a request for /filter/collectors/tile...

2014-02-06T19:39:47.691856+00:00 app[web.2]: 
2014-02-06T19:39:47.691960+00:00 app[web.2]: /app/lib/controllers/tiles.js:205
2014-02-06T19:39:47.699711+00:00 app[web.2]:       for (i = 0; i < answers.length; i += 1) {
2014-02-06T19:39:47.699711+00:00 app[web.2]:                              ^
2014-02-06T19:39:47.699711+00:00 app[web.2]: TypeError: Cannot read property 'length' of undefined
2014-02-06T19:39:47.699711+00:00 app[web.2]:     at /app/lib/controllers/tiles.js:205:30
2014-02-06T19:39:47.699711+00:00 app[web.2]:     at Promise.<anonymous> (/app/node_modules/mongoose/node_modules/mpromise/lib/promise.js:177:8)
2014-02-06T19:39:47.699711+00:00 app[web.2]:     at Promise.EventEmitter.emit (events.js:95:17)
2014-02-06T19:39:47.699711+00:00 app[web.2]:     at Promise.emit (/app/node_modules/mongoose/node_modules/mpromise/lib/promise.js:84:38)
2014-02-06T19:39:47.699711+00:00 app[web.2]:     at Promise.fulfill (/app/node_modules/mongoose/node_modules/mpromise/lib/promise.js:97:20)
2014-02-06T19:39:47.699711+00:00 app[web.2]:     at Promise.<anonymous> (/app/lib/models/Form.js:64:5)
2014-02-06T19:39:47.699895+00:00 app[web.2]:     at Promise.<anonymous> (/app/node_modules/mongoose/node_modules/mpromise/lib/promise.js:177:8)

and shortly thereafter

2014-02-06T19:39:47.699711+00:00 app[web.2]:     at Promise.resolve (/app/node_modules/mongoose/lib/promise.js:108:15)
2014-02-06T19:39:47.697196+00:00 app[web.1]:                              ^
2014-02-06T19:39:47.697196+00:00 app[web.1]: TypeError: Cannot read property 'length' of undefined
2014-02-06T19:39:47.697196+00:00 app[web.1]:     at /app/lib/controllers/tiles.js:205:30
2014-02-06T19:39:47.697196+00:00 app[web.1]:     at Promise.<anonymous> (/app/lib/models/Form.js:64:5)
2014-02-06T19:39:47.697196+00:00 app[web.1]:     at Promise.<anonymous> (/app/node_modules/mongoose/node_modules/mpromise/lib/promise.js:177:8)
2014-02-06T19:39:47.691679+00:00 app[web.1]: 
2014-02-06T19:39:47.691918+00:00 app[web.1]: /app/lib/controllers/tiles.js:205
2014-02-06T19:39:47.697196+00:00 app[web.1]:     at Promise.EventEmitter.emit (events.js:95:17)
2014-02-06T19:39:47.697196+00:00 app[web.1]:     at Promise.emit (/app/node_modules/mongoose/node_modules/mpromise/lib/promise.js:84:38)
2014-02-06T19:39:47.697196+00:00 app[web.1]:     at Promise.fulfill (/app/node_modules/mongoose/node_modules/mpromise/lib/promise.js:97:20)
2014-02-06T19:39:47.697196+00:00 app[web.1]:     at Promise.resolve (/app/node_modules/mongoose/lib/promise.js:108:15)
2014-02-06T19:39:47.697441+00:00 app[web.1]:     at Promise.<anonymous> (/app/node_modules/mongoose/node_modules/mpromise/lib/promise.js:177:8)
2014-02-06T19:39:47.697441+00:00 app[web.1]:     at Promise.EventEmitter.emit (events.js:95:17)
2014-02-06T19:39:47.697196+00:00 app[web.1]:       for (i = 0; i < answers.length; i += 1) {
2014-02-06T19:39:47.697441+00:00 app[web.1]:     at Promise.emit (/app/node_modules/mongoose/node_modules/mpromise/lib/promise.js:84:38)
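Both traces fail on `answers.length`, which suggests the lookup for the requested key returned nothing. A defensive sketch follows; the `questions` shape is an assumption, since the code around tiles.js:205 is not shown here.

```javascript
// Assumed shape: `questions` maps a question key to { answers: [...] }.
// Keys such as the collector filter may have no entry at all.
function answersFor(questions, key) {
  const question = questions ? questions[key] : undefined;
  return question && Array.isArray(question.answers) ? question.answers : [];
}
```

Returning an empty list lets the existing loop run zero times instead of throwing, at the cost of silently producing no styles for that key.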

Collector filtering is broken

We allow filtering by collector, but we treat it just like any other question. The response isn't structured that way, though, so our generated styles are inconsistent with the data that nodetiles-core will actually see.

Should we pull the MongoDB datasource into this repo?

The MongoDB nodetiles datasource is just one file. We're not using it generically, and there may be advantages to adding some application-specific code to it. If changes and bugfixes might require modifications both to the tile server and the datasource, then I think it makes sense to pull it into the tile server repo.

We can control logging and instrumentation much more easily if we don't have to make changes to a referenced repo.

It seems reasonable to create application-specific nodetiles datasources, considering the lightweight interface.

Improve tilejson generation

// Todo:
// - load from a template
// - use a sensible center (from the survey data)
// - better attribution etc. (use survey data)

missing properties cause a crash

2013-12-10T22:40:38.012120+00:00 app[web.1]: SELECTING { 'geo_info.geometry': 1,
2013-12-10T22:40:38.012120+00:00 app[web.1]:   'geo_info.humanReadableName': 1,
2013-12-10T22:40:38.012120+00:00 app[web.1]:   object_id: 1 }
2013-12-10T22:40:38.048176+00:00 heroku[router]: at=error code=H13 desc="Connection closed without response" method=GET path=/742863d0-ae10-11e2-b9ea-49e82a1613e0/tiles/18/70575/96949.png host=localdata-tiles.herokuapp.com fwd="50.17.56.19" dyno=web.1 connect=3ms service=151ms status=503 bytes=0
2013-12-10T22:40:38.021066+00:00 app[web.1]: SELECTING { 'geo_info.geometry': 1,
2013-12-10T22:40:38.021066+00:00 app[web.1]:   object_id: 1 }
2013-12-10T22:40:38.021066+00:00 app[web.1]:   'geo_info.humanReadableName': 1,
2013-12-10T22:40:38.023247+00:00 app[web.1]: Fetched 2 responses in 13ms
2013-12-10T22:40:38.025752+00:00 app[web.1]: Error connecting to database [TypeError: Property 'undefined' of object #<Object> is not a function]

Bad precedent

npm install fails for reasons that are not apparent to me. I frequently get errors like this around canvas or bson (it's not entirely clear from the log which one is the culprit).

npm ERR! cb() never called!
npm ERR! not ok code 0

Blowing away node_modules and re-installing generally fixes the problem -- but sometimes it takes multiple cycles to get everything working. Maybe things need to be installed or compiled in a specific order to play well together?

@prashtx, didn't you mention you had a random issue with dependencies on heroku, too, or was that on server-api?

It's bad enough that I have to attach this:

prashant

Cache validation will miss nearby objects

When we render a tile, we translate the tile name into a bounding box and then add a buffer. That lets us render objects that are technically outside our tile but will be in view of the tile due to rendered style (a thick border, for example).

When we confirm validity of the image cached in S3, we only look at the tile's bounding box, not the buffered bbox. If a new object pops up right next to our tile, we'll incorrectly serve the cached image.

Our styles are not so fat right now, so this isn't a major issue.

To fix, I think we should modify nodetiles-core to allow a configurable buffer. Then we can use the same buffer for our cache validation that we pass to nodetiles-core.
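The idea can be sketched as a single bounds helper shared by rendering and cache validation. The bounds shape, the `buffer` option name, and the units are assumptions; nodetiles-core's actual interface may differ.

```javascript
// Expand a bounding box by `buffer` on every side. Bounds and buffer are
// assumed to share units (for example, projected meters).
function bufferBounds(bounds, buffer) {
  return {
    minX: bounds.minX - buffer,
    minY: bounds.minY - buffer,
    maxX: bounds.maxX + buffer,
    maxY: bounds.maxY + buffer
  };
}

// Use one function for both rendering and cache validation, so the two
// code paths cannot disagree about which objects affect a tile.
function tileQueryBounds(tileBounds, options) {
  const buffer = options && options.buffer ? options.buffer : 0;
  return bufferBounds(tileBounds, buffer);
}
```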

Cache mongodb responses

The map PNG and UTFGrid both use the same data for a given tile, but right now we make separate requests. We should be able to make one request and reuse the data.

We can cache the data with a short time-to-live, so that we don't explode our memory footprint. We likely need to use promises, since the database requests will go out at about the same time. The second one should see a promise in the cache, which in some cases may have been resolved already.

The effectiveness of an in-memory cache will decrease as we scale up the number of dynos. Ideally the approach should be general enough that we could drop in redis or similar and share the cache across processes.
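A minimal in-memory sketch of the promise cache with a time-to-live. `createPromiseCache` and the key format are hypothetical, and a shared store such as redis would need a different shape because promises cannot be serialized.

```javascript
// Caches the promise returned by `load` under `key` for `ttlMs`.
function createPromiseCache(ttlMs) {
  const entries = new Map();

  return function getOrLoad(key, load) {
    if (entries.has(key)) {
      return entries.get(key); // May be pending or already resolved.
    }
    // The executor runs synchronously, so the entry is in place before
    // the second, near-simultaneous request looks it up.
    const promise = new Promise((resolve) => resolve(load()));
    entries.set(key, promise);

    // Evict after the TTL; unref() keeps the timer from holding the
    // process open.
    const timer = setTimeout(() => entries.delete(key), ttlMs);
    timer.unref();
    return promise;
  };
}
```

The PNG and UTFGrid handlers would both call `getOrLoad` with the same survey/tile key, and whichever arrives second reuses the first query.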

Filter/color according to answer

@hampelm : Using this issue for the discussion around filter architecture.

We need to "color" and "filter".

Color: Apply styles that color entries according to the answer for a specified question.
Filter: Only plot the entries where the answer matches the one specified. The color for those entries should be the same as if we hadn't filtered.

I don't think we need to alter nodetiles-core. We're just giving it data and styles in all of the cases. We should be able to generate a version of the datasource that selects the appropriate fields. We can still cache the resulting Map object, using the survey ID and a string description of the color/filter information.
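As an illustration of the cache key, here is one way to derive a string from the survey ID and the color/filter selection. The option names (`colorBy`, `filterKey`, `filterValue`) are hypothetical.

```javascript
// Build a cache key for a Map object. Each part is URI-encoded so values
// containing the separator cannot collide with other keys.
function mapCacheKey(surveyId, options) {
  const opts = options || {};
  return [surveyId, opts.colorBy, opts.filterKey, opts.filterValue]
    .map((part) => encodeURIComponent(part === undefined ? '' : String(part)))
    .join('/');
}
```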

I've started a filter branch that we can use for the server portion of this. Right now it just has some housekeeping changes.
https://github.com/LocalData/localdata-tiles/tree/filter

Add fine-grained instrumentation

We should get a handle on tile server performance. There are several pieces of work (DB, tile rendering, cache checking, etc.) and multiple paths (S3 cache vs generated tile, UTF grids), and I'd like to know how they each contribute to performance.

I think we can make the tile server faster, but we should know which pieces to work on, and we should track our progress.

StrongOps doesn't have the granularity needed. We could log the metrics, but then we have to do our own log processing, graphing, etc.

NodeTime supports custom metrics, and I think those will work for us:
http://docs.nodetime.com/#custom-metrics

Now what to record?

  • Cache hits/misses for every cache we use (etag, S3, survey stats)
  • Time to get data from database (will this require instrumenting nodetiles-core or nodetiles-mongodb?)
  • Total time to serve tiles
  • Total time to serve UTF grids
  • Time to check validity for ETag cache and S3 cache
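A small timing helper along these lines could wrap each piece of work in the list above. `record(name, ms)` is a stand-in for whichever metrics backend is chosen and is not a real NodeTime call.

```javascript
// Start a timer; the returned function reports elapsed milliseconds to
// `record` and also returns the value.
function startTimer(name, record) {
  const start = process.hrtime();
  return function stop() {
    const [sec, nano] = process.hrtime(start);
    const ms = sec * 1e3 + nano / 1e6;
    record(name, ms);
    return ms;
  };
}

// Count cache hits and misses per cache name (for example 'etag' or 's3').
function createCacheCounters() {
  const counts = {};
  return {
    mark(cache, hit) {
      const entry = counts[cache] || (counts[cache] = { hits: 0, misses: 0 });
      if (hit) { entry.hits += 1; } else { entry.misses += 1; }
    },
    snapshot: () => JSON.parse(JSON.stringify(counts))
  };
}
```

Because `stop` is a plain function, it can be called from a callback or a promise handler without changing the surrounding control flow.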

Cache promises for the survey stats requests

For the first few tiles for a given survey, each tile results in a request for the survey stats, which is a very expensive operation. Once we get the stats, we cache them for future tiles.

Instead, we should cache a promise for the stats. Then the initial tiles can all share the same request.
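A sketch of sharing one in-flight stats request per survey. `fetchStats` stands for the existing expensive lookup and is assumed to return a promise; expiry of successful entries is left to whatever invalidation the current stats cache already uses.

```javascript
// Wrap an expensive per-survey lookup so concurrent callers share one
// request.
function createStatsCache(fetchStats) {
  const pending = new Map();

  return function getStats(surveyId) {
    if (!pending.has(surveyId)) {
      const promise = new Promise((resolve) => resolve(fetchStats(surveyId)));
      // Forget failed lookups so a later tile can retry.
      promise.catch(() => pending.delete(surveyId));
      pending.set(surveyId, promise);
    }
    return pending.get(surveyId);
  };
}
```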

Cannot overwrite `Form` model once compiled.

On master and filter, I get this error when running node tileserver.js

/Users/matth/projects/ld/tiles/localdata-tiles/node_modules/mongoose/lib/index.js:278
      throw new mongoose.Error.OverwriteModelError(name);
            ^
OverwriteModelError: Cannot overwrite `Form` model once compiled.
    at Mongoose.model (/Users/matth/projects/ld/tiles/localdata-tiles/node_modules/mongoose/lib/index.js:278:13)
    at Object.<anonymous> (/Users/matth/projects/ld/tiles/localdata-tiles/lib/models/Form.js:143:38)
    at Module._compile (module.js:456:26)
    at Object.Module._extensions..js (module.js:474:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:312:12)
    at Module.require (module.js:364:17)
    at require (module.js:380:17)
    at Object.<anonymous> (/Users/matth/projects/ld/tiles/localdata-tiles/tileserver.js:37:13)
    at Module._compile (module.js:456:26)
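This error generally means `mongoose.model('Form', schema)` ran twice for the same name, for example when two modules each define the model. A common workaround, sketched below, is to reuse an already-compiled model. It does not address why the definition runs twice, and `getOrCreateModel` is a hypothetical helper.

```javascript
// Reuse the compiled model if one exists; otherwise compile it.
// `mongoose` is passed in so the helper works with whichever instance
// the application already uses.
function getOrCreateModel(mongoose, name, schema) {
  return mongoose.models[name] || mongoose.model(name, schema);
}
```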

Consider compositing old tile images with new data

We don't have any layers that are drawn on top of our main data layer. So if we find new entries for a tile, we could draw them on top of the existing image.

We check the count and time of last modification for cached image validation. We could instead check the count of documents within the bounding box and before the last modification time. If that count doesn't match, we had a deletion and should refetch and rerender. If that count matches, we fetch entries after the last modification time. If there are none, the cached image is good to go. If there are new entries, we can render those to a PNG and composite that with the cached image.

If the composite time is not large, this could save a lot of time when updating tiles at low zoom levels. Our system primarily has additions, so we would seldom have to fetch all of the data for a tile.
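The validation steps above can be written as a small decision function. The argument shapes and action names are assumptions for illustration; the two counts would come from database queries that are not shown.

```javascript
// cached: { count } recorded when the cached image was rendered.
// countBefore: documents in the bbox dated at or before the cached
//   image's last-modified time.
// newEntries: documents in the bbox dated after that time.
function planTileUpdate(cached, countBefore, newEntries) {
  if (countBefore !== cached.count) {
    // Something the cached image shows is gone or changed: start over.
    return { action: 'rerender' };
  }
  if (newEntries.length === 0) {
    return { action: 'serve-cached' };
  }
  // Only additions: draw them on top of the cached image.
  return { action: 'composite', entries: newEntries };
}
```

If editing an entry updates its modification time, an edit lowers `countBefore` and therefore falls into the rerender branch rather than being composited over its old rendering.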

Allow gzip'd delivery of cached UTFGrid data

The UTFGrid data is cached as JSONP text. Cached data is served with a redirect to S3. S3 doesn't gzip responses, though.

One option is to store the raw and gzip'd responses on S3. Based on the Accept-Encoding header, we can redirect to the appropriate item.

Another option is to proxy the S3 data through the server and use the built-in conditional compression. The underlying zlib calls are async, so we should not end up tying up the event loop. This option is appropriate if we find that the client-side overhead is high for redirects.
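For the first option, choosing the redirect target could look roughly like this. The `.gz` key suffix is an assumed naming convention, and the Accept-Encoding check is deliberately naive (it ignores the `*` wildcard).

```javascript
// Naive Accept-Encoding check: true when gzip is listed with a non-zero q.
function acceptsGzip(header) {
  return String(header || '')
    .split(',')
    .some((part) => {
      const [name, ...params] = part.trim().toLowerCase().split(';');
      if (name.trim() !== 'gzip') { return false; }
      const q = params.map((p) => p.trim()).find((p) => p.startsWith('q='));
      return q === undefined || parseFloat(q.slice(2)) > 0;
    });
}

// The '.gz' suffix is an assumed naming convention for the compressed copy.
function cachedGridKey(baseKey, acceptEncoding) {
  return acceptsGzip(acceptEncoding) ? baseKey + '.gz' : baseKey;
}
```

The compressed object would need `Content-Encoding: gzip` stored in its S3 metadata, and the redirect response should carry `Vary: Accept-Encoding` so intermediaries do not hand the compressed URL to a client that cannot decode it.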
