millstone's People

Contributors

abenrob, ansis, csytsma, danzel, dboze, dmitrig01, kkaefer, miccolis, mojodna, tmcw, tomhughes, tpotter7, wrynearson

millstone's Issues

support remote datasource urls without extensions via `content-disposition`

Frequently, urls to geojson don't end with an extension of .json or .geojson. Csv files hosted remotely at google docs present the same issue:

https://docs.google.com/spreadsheet/pub?hl=en_US&hl=en_US&key=0AqV4OJpywingdFNYLXpKMmxqMG1lWTJzNE45ZUVnNlE&single=true&gid=0&output=csv

Without the ability to detect the type of the file, tilemill is unable to tell mapnik which datasource plugin to use to read it, e.g. ogr for .geojson files and csv for .csv files.

So, we need to find more robust ways to support this. One approach in TileMill may be to detect urls without an obvious extension and then require the user to supply the type of file, but good APIs will set the content-disposition, from which a filename and extension can be pulled.

It should be easy to get the content-disposition during download()

            if (!path.extname(filepath)
                && response.headers 
                && response.headers['content-disposition']) {
                var disp = response.headers['content-disposition'];
                var match = disp.match(/^.*filename="(.*)"/i);
                if (match && match[1]) {
                    filepath = path.join(filepath,match[1]);
                }
            }

But our caching framework will still likely break because we call path.extname(uri.pathname) in several places.

tests to validate relative or absolute path detection

In several places in millstone, different behavior is triggered depending on whether paths are relative or absolute on the filesystem, for example at https://github.com/mapbox/millstone/blob/master/lib/millstone.js#L400.

We need to make sure that this logic works cross platform. It obviously will not work if paths start with drive letters, e.g. C:\foo\bar.txt will look like a relative path because it does not start with a path separator.

So, the task here is to either write an is_relative function that detects drive letters, or see if someone in the node world has already done this, or look into whether it is cleaner to normalize paths across millstone into uri components, whereby the drive is kept separate from the absolute path (starting with the path separator).

Then we need good, simple tests for this that can easily be run as mocha tests.
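A minimal sketch of such a helper follows. The name isRelative and the decision to throw on non-string input are assumptions for illustration, not the code in millstone. Paths that start with a separator or with a drive letter followed by a separator are treated as absolute.

```javascript
// Hypothetical helper: returns true when `loc` is a relative filesystem path.
// Treated as absolute: POSIX roots (/foo), backslash-rooted and UNC paths
// (\foo, \\server\share), and Windows drive roots (C:\foo or C:/foo).
function isRelative(loc) {
    if (typeof loc !== 'string' || loc.length === 0) {
        throw new TypeError('isRelative expects a non-empty string path');
    }
    var startsWithSeparator = loc[0] === '/' || loc[0] === '\\';
    var startsWithDrive = /^[a-zA-Z]:[\\\/]/.test(loc);
    return !startsWithSeparator && !startsWithDrive;
}
```

Because the function only inspects the string, the same assertions can run on any platform, which should make it easy to wrap them in mocha test cases.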

small typo

I think there is a small typo in util.js (line 4)

Step = require('Step');

that should be:

Step = require('step');

millstone needs to throw if a projection is not known for layer types where it will not be autodetected

You can easily get blank maps for layer types in tilemill other than geojson and shapefiles, because projections that are unknown are simply passed through as blanks. So, then mapnik aborts rendering and dumps: proj_init_error:failed to initialize projection with: '' in the logs.

Millstone should throw and let the api user (in this case tilemill) know about the need to manually specify the projection.
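One way this check could look is sketched below. The function name, the list of autodetected datasource types, and the layer shape (name, srs, Datasource.type) are assumptions made for the example.

```javascript
// Assumption: datasource types whose projection millstone can autodetect.
var AUTODETECTED_TYPES = ['shape', 'ogr'];

// Hypothetical check: throw early instead of passing a blank srs to mapnik.
function assertProjection(layer) {
    var type = layer.Datasource && layer.Datasource.type;
    var autodetected = AUTODETECTED_TYPES.indexOf(type) !== -1;
    if (!layer.srs && !autodetected) {
        throw new Error('No projection found for layer "' + layer.name +
            '" of type "' + type + '": please specify the srs manually');
    }
    return layer;
}
```

Throwing here would surface the problem to the caller before mapnik ever sees an empty projection string.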

regression in handling shapefiles zipped in nested directory

Pretty sure we properly recursed into directories inside archives from the beginning, but because some shapefiles are failing, my hunch is that this is broken now in tilemill 0.5.0.

error is:

 "Error: No projection found for layer "al" at /Users/dane/Documents/MapBox/project/aaa".

This appears to be spurious as the shapefile does have a valid .prj, but perhaps is not finding it because of the directory?

need to reattempt srs parse if srs.proj4 is undefined

Recently we moved the parsing fallback required for ESRI variant shapefile WKT into node-srs. But node-srs only falls back if there is an error; it does not fall back if only the .proj4 value is undefined (node-srs will return the other srs representations that did work). TileMill requires that the proj4 be known, so millstone still needs this fallback. Ideally the node-srs api would give a single .proj4() call, but until then we need to handle this correctly in millstone for what tilemill expects.

carto believes files with no extension 'could not be downloaded'

From mapbox/carto#61

When trying to run 'tilebatch' to render an mml file that references a remote geojson file with no extension, carto will download the file, but then render nothing.

If you set a breakpoint in lib/carto/renderer.js:95, you discover that it throws the 'could not be downloaded' error. Also, err is set to the local filename that it was downloaded to.

Relatedly, it would be nice if this error were somehow displayed to the user, instead of silently producing an empty mbtiles file.

Add indexing step and add spatial index for sqlite files

The flow of millstone is currently something like this:

  1. Localize all files
  2. Correct any information about the datasources (SRS) or do autodetection if possible
  3. Return resolved MML

The task is to add a step between 2 & 3 that gives us a chance to add indexes to datasources. At the moment we're interested in implementing sqlite index addition but in the future there are other indexing operations (e.g. shapefiles) that we may want to do.
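The flow with the extra step could be sketched as a chain of callback-style steps. All function names here are placeholders, and the indexed flag is only a marker for the example; the real localization, SRS detection and index creation are not shown.

```javascript
// Runs callback-style steps in order, passing the MML from one to the next.
function runSteps(steps, mml, callback) {
    var i = 0;
    (function next(err, current) {
        if (err || i === steps.length) return callback(err, current);
        var step = steps[i++];
        step(current, next);
    })(null, mml);
}

// Placeholder steps standing in for the existing stages.
function localize(mml, cb) { cb(null, mml); }
function detectSrs(mml, cb) { cb(null, mml); }

// The new step: mark sqlite layers as the ones that would get an index.
function addIndexes(mml, cb) {
    mml.Layer.forEach(function(l) {
        if (l.Datasource && l.Datasource.type === 'sqlite') l.indexed = true;
    });
    cb(null, mml);
}

// Example wiring: indexing runs after SRS correction, before returning.
function resolveWithIndexes(mml, callback) {
    runSteps([localize, detectSrs, addIndexes], mml, callback);
}
```

Keeping indexing as its own step means other index types (for example shapefile indexes) could be added later without touching the earlier stages.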

We'll want to add a test for this step as well once it's implemented.

replace node-get with request to enable https support

In order to get csv loading out of google docs we need https support (along with 302 support). I found that switching out node-get with request immediately allowed this to work, and avoided the 'socket hung up' error that I get with node-get.

Potential patch is:

diff --git a/lib/millstone.js b/lib/millstone.js
index dcf1807..18e0bbe 100644
--- a/lib/millstone.js
+++ b/lib/millstone.js
@@ -8,6 +8,7 @@ var EventEmitter = require('events').EventEmitter;
 var _ = require('underscore');
 var srs = require('srs');
 var get = require('get');
+var request = require('request');
 var zipfile = require('zipfile');
 var Step = require('step');
 var sqlite3 = require('sqlite3');
@@ -31,7 +32,7 @@ function download(url, filepath, callback) {

     downloads[url] = new EventEmitter();
     pool.acquire(function(obj) {
-        (new get(url)).toDisk(dl, function(err, file) {
+        var req_dl = request(url, function (err, response, body) {
             pool.release(obj);
             if (err) {
                 downloads[url].emit('done', err);
@@ -44,6 +45,7 @@ function download(url, filepath, callback) {
                 return callback(err, filepath);
             });
         });
+        req_dl.pipe(fs.createWriteStream(dl));
     });
 }

diff --git a/package.json b/package.json
index 8764133..ae22f2f 100644
--- a/package.json
+++ b/package.json
@@ -18,6 +18,7 @@
         "underscore"  : "1.1.x",
         "step": "0.0.x",
         "generic-pool": "1.0.x",
+        "request": "1.9.x",
         "get": "~0.4.2",
         "srs": "~0.2.7",
         "zipfile": "~0.2.2",

Localizing CartoCSS image paths

I'm trying to localize uris passed to marker-file, point-file and the like but I can't seem to see them localized.
Before going any deeper, I went looking in the existing test cases but I found no such case tested there.
Do you have an example ?

What I'm trying is:

{ point-file: url('http://upload.wikimedia.org/wikipedia/commons/7/72/Cup_of_coffee.svg'); }

Millstone says:

[millstone] processing style 'style.mss'
path.exists is now called `fs.exists`.
[millstone] finished processing '/tmp/millstone/base'

And the millstone.resolve callback gets passed a resolved mml which is the same
as the input one.

Indeed I don't see the localizeCartoURIs function in lib/millstone.js use the "file" parameter passed to the callback, but just the error, if any. What's the rationale for that ?
This is millstone master branch:
https://github.com/mapbox/millstone/blob/master/lib/millstone.js#L401

error handling broken for Carto URI localization

URI localization was added in #41 but it appears that no errors are thrown if the url is invalid. This is not good because the lack of a throw ends up leading to an invalid path being constructed in the mapnik xml and then for every feature a console line appears like:

[tilemill] "Mapnik LOG> 2012-08-23 13:12:27:" could not intialize reader for: '/Users/dane/Documents/MapBox/cache/a9b9f043-/a9b9f043-'

This will be mitigated by mapnik/mapnik#1439 slightly but really should be handled earlier in millstone, no?

test failure on master re: sqlite

@willwhite - any sense if this is supposed to pass? Last I checked we don't support sqlite srs detection.

> [email protected] test /Users/dane/projects/millstone
> which expresso | sh


   uncaught: AssertionError: "Unable to determine SRS for layer \"sqlite-attach\" at /Users/dane/projects/millstone/test/cache/layers/countries.sqlite" == "Server returned HTTP 404"
    at /Users/dane/projects/millstone/test/test.js:43:16
    at Function.end (/Users/dane/projects/millstone/lib/millstone.js:485:9)
    at next (/Users/dane/projects/millstone/node_modules/step/lib/step.js:51:23)
    at next (/Users/dane/projects/millstone/node_modules/step/lib/step.js:54:7)
    at next (/Users/dane/projects/millstone/node_modules/step/lib/step.js:54:7)
    at Function.<anonymous> (/Users/dane/projects/millstone/lib/millstone.js:221:29)
    at native

Alternative type detection of remote resources

In addition to parsing content-disposition we should also at least:

  • detect content-type
  • and as a fallback look for likely extensions in the url like 'csv' in: ?foo=bar&type=fun&format=csv&id=4
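The query-string fallback might look like the sketch below. The function name, the list of formats, and the example.com host used when trying it are assumptions; it relies on the WHATWG URL class available in current Node versions.

```javascript
// Illustrative list of values that are treated as likely data formats.
var KNOWN_FORMATS = ['csv', 'json', 'geojson', 'kml', 'zip'];

// Hypothetical fallback: scan query-string values for a known format name
// and return it as an extension, or '' when nothing matches.
function guessFormatFromQuery(uri) {
    var params = new URL(uri).searchParams;
    var found;
    params.forEach(function(value) {
        var v = value.toLowerCase();
        if (!found && KNOWN_FORMATS.indexOf(v) !== -1) found = v;
    });
    return found ? '.' + found : '';
}
```

For a url ending in ?foo=bar&type=fun&format=csv&id=4 this returns '.csv'. Matching on values rather than on a fixed parameter name is a deliberate simplification, since services use different names such as format or output.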

handle escaped urls

If a user pushes an escaped url into TileMill's Layer UI like:

http://examples.cartodb.com/api/v1/sql?format=geojson&q=SELECT%20*%20FROM%20costa_rica_pa%20limit%201

Then it appears by the time it gets to the point of being sent to node-get the url is double escaped:

http://examples.cartodb.com/api/v1/sqlformat=geojson&q=SELECT%2520*%2520FROM%2520costa_rica_pa%2520limit%25201

This breaks of course since that url is no longer valid.

It appears harmless to call unescape on both a raw url (basically unescaping twice) and an escaped url, so this fix can be as easy as calling unescape right here: https://github.com/mapbox/millstone/blob/master/lib/millstone.js#L38
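A small sketch of the idea follows. It uses decodeURI rather than the unescape call suggested above, which is a substitution made for this example, and the decodeOnce name is hypothetical. Input with a stray '%' that is not a valid escape is returned unchanged.

```javascript
// Decode percent-escapes once so the HTTP client can escape exactly once.
// decodeURI throws URIError on malformed escapes; in that case the input
// is returned untouched.
function decodeOnce(uri) {
    try {
        return decodeURI(uri);
    } catch (err) {
        return uri;
    }
}
```

Decoding an already-raw url leaves it as-is, and re-encoding the decoded form with encodeURI gives back the single-escaped url instead of a double-escaped one.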

Missing licensing info

Hello,
I'm packaging millstone for Debian.

The only licensing information is a "BSD" in package.json: please expand this, either by placing a comment in the sourcecode (millstone.js), or by putting a LICENSE/COPYING file in the root directory of the source.

It would be great if you could provide some licensing info in this bugreport, so that I can continue my work, and also so that I don't have to wait for a new release to happen :)

Many thanks,
David

ESRI:: fallback is needed both in millstone and node-srs

a979c06 is invalid.

node-srs will fallback to trying to parse the prj as if it is an ESRI variant if there is an exception thrown internally by ogr. But the design of node-srs is that it will return as much about the projection as it can if ogr internally does not throw. The original idea behind this design is that you might want to know various things about the projection even if you can't know all (like proj4 string or epsg #).

What this means for millstone/tilemill is that millstone needs to fall back to parsing files as ESRI variants if no proj4 value is known for a given prj (which can be the case for ESRI variant files even if they never caused an exception in ogr originally).
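The fallback described here could be sketched as follows. The parser is passed in as a parameter so the example runs without the native module; it is assumed to behave like node-srs's srs.parse, returning an object whose proj4 property may be undefined. The function name is hypothetical.

```javascript
// `parse` stands in for a parser such as srs.parse from node-srs (assumed
// to return an object with an optional `proj4` property).
function parseWithEsriFallback(parse, wkt) {
    var result;
    try {
        result = parse(wkt);
    } catch (err) {
        result = undefined;
    }
    // Retry as an ESRI variant when no proj4 string was detected, even if
    // the first parse did not throw.
    if (!result || !result.proj4) {
        result = parse('ESRI::' + wkt);
    }
    if (!result || !result.proj4) {
        throw new Error('Unable to determine proj4 string for projection');
    }
    return result;
}
```

The key point is that the retry is keyed on the missing proj4 value rather than on an exception, which is the case the node-srs fallback alone does not cover.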

So, the case where the projection is valid enough to be parsed by ogr but not valid enough to have its proj4 representation detected is the most critical case, and it will fail for all tilemill users until millstone falls back to trying a parse with ESRI:: prepended. This is the same regression I fixed in tilemill 0.6.x but had regressed in 0.5.x after working in 0.4.x.

Overall I think the design and api of node-srs is terrible. Ideally soon we can properly wrap libgdal so we can do projection detection on all datasource types and ditch node-srs - calling it a temporary shim that got us this far.

framework for different caching strategies

Driven by https://github.com/mapbox/tilemill/issues/922 we should think about how to make the caching (and cache flushing) functionality more modular.

So far, for the experimentation in the 'live-cache' branch of tilemill, this is all that is needed to trigger re-caching upon save, but we can do better:

diff --git a/lib/millstone.js b/lib/millstone.js
index ef4beb0..fdd70b5 100644
--- a/lib/millstone.js
+++ b/lib/millstone.js
@@ -312,7 +312,7 @@ function resolve(options, callback) {
             if (uri.protocol) {
                 var filepath = path.join(cache, cachepath(l.Datasource.file));
                 path.exists(filepath, function(exists) {
-                    if (exists) {
+                    if (exists && !(l.cache_method && l.cache_method === 'live')) {
                         symlink(filepath);
                     } else {
                         utils.mkdirP(path.dirname(filepath), 0755, function(err) {

Flush _all_ localized resources from a given MML

I was surprised to see that .flush() does not undo the effects of .resolve(), as the README file seems to suggest just that.

It would be useful to be able to clean up the cache by passing the same MML passed to .resolve.

localizeCartoURIs fails to resolve duplicate urls

It appears localizeCartoURIs calls uniq to get all unique matches of urls from a style. This makes sense to avoid hitting the download queue harder than needed. But the problem is that some urls are then not switched out with their localized path, which breaks things.
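One possible way to keep the deduplicated download list while still rewriting every occurrence is sketched below. The function name and the shape of localPaths (a map from remote url to local path) are assumptions for the example.

```javascript
// Replace every occurrence of each remote url in a stylesheet with its
// localized path. `localPaths` maps url -> local file path, so each url is
// downloaded once but substituted everywhere it appears.
function replaceAllUris(style, localPaths) {
    Object.keys(localPaths).forEach(function(uri) {
        // split/join replaces all occurrences without regex escaping.
        style = style.split(uri).join(localPaths[uri]);
    });
    return style;
}
```

A caveat of this simple approach: if one url is a prefix of another, the shorter one would also match inside the longer one, so a real implementation would likely need to match whole url(...) tokens.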

npm test: Error: Cannot find module 'Spec'

After a successful npm install, npm test fails with:

] npm test

> [email protected] test /home/src/cartodb/millstone
> mocha -R Spec --timeout 10000


module.js:340
    throw err;
          ^
Error: Cannot find module 'Spec'
    at Function.Module._resolveFilename (module.js:338:15)
    at Function.Module._load (module.js:280:25)
    at Module.require (module.js:362:17)
    at require (module.js:378:17)
    at Mocha.reporter (/home/src/cartodb/millstone/node_modules/mocha/lib/mocha.js:100:24)
    at Object.<anonymous> (/home/src/cartodb/millstone/node_modules/mocha/bin/_mocha:178:7)
    at Module._compile (module.js:449:26)
    at Object.Module._extensions..js (module.js:467:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:312:12)
    at Module.runMain (module.js:492:10)
    at process.startup.processNextTick.process._tickCallback (node.js:244:9)
npm ERR! Test failed.  See above for more details.
npm ERR! not ok code 0

avoid creating an empty database when sqlite introspection fails

Millstone does some intense introspection of sqlite databases to be able to autodetect primary keys and spatialite geometry type.

When a user in tilemill misspells a database name, the logic fails but leaves behind a blank database (because the default behavior of node-sqlite3 is to create one if the filename passed does not exist).

This then creates very hard to understand errors when this db name is passed to node-mapnik/mapnik in TileMill which indicate the table is missing when really the database filename is wrong.

Because Mapnik latest upstream now can handle auto-detection of primary keys and spatialite geometry types I propose simply removing all this logic (rather than fixing the create if not found behavior). So, assigning to myself to get this cleaned up.
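If some introspection has to stay, a guard like the sketch below could avoid the stray file by checking for the database before any driver opens it. The function name is hypothetical. node-sqlite3 also accepts an open mode such as sqlite3.OPEN_READONLY, which should prevent file creation as well, though that is not exercised here.

```javascript
var fs = require('fs');

// Hypothetical guard: true only when `filename` exists and is a regular
// file, so a misspelled path is reported instead of being created empty.
function databaseFileExists(filename) {
    try {
        return fs.statSync(filename).isFile();
    } catch (err) {
        return false;
    }
}
```

Calling this first and returning an error such as "database not found" would point the user at the wrong filename rather than at a missing table.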

localizing datasources/resources representing more than one layer

Flagging this issue as requiring more thought.

Consider these various issues:

  • Zipfiles - we currently assume it's a shapefile and look for the first .shp, but there could be multiple. And eventually we'll want to support other datasources being zipped - like geojson: https://github.com/mapbox/tilemill/issues/253
  • KML files - can embed more than one layer - currently millstone passes layer_by_index:0 to mapnik to just take the first
  • SQLite databases - would be nice to be able to introspect and find all spatial tables

With the exception of zipped resources, the generic answer to these is using gdal/ogr to introspect the data (and therefore wrapping gdal/ogr as a node c++ addon). But, currently the approach of millstone is to handle each case in a custom way - with the benefit of avoiding the extra dependency and offering advanced functionality.

millstone crashes tilemill when adding postgis layer

On XP in copy mode:


[tilemill] [millstone] processing style 'layer'
[tilemill] 
[tilemill] C:\Program Files\TileMill-v0.10.0-pre\tilemill\node_modules\millstone\lib\millstone.js:184
[tilemill]         return loc[0] !== '\\' && loc.match(/^[a-zA-Z]:\\/) === null;
[tilemill]                   ^
[tilemill] TypeError: Cannot read property '0' of undefined
[tilemill]     at isRelative (C:\Program Files\TileMill-v0.10.0-pre\tilemill\node_modules\millstone\lib\millstone.js:184:19)
[tilemill]     at resolved.Layer.forEach.name (C:\Program Files\TileMill-v0.10.0-pre\tilemill\node_modules\millstone\lib\millstone.js:598:29)
[tilemill]     at Object.oncomplete (fs.js:297:15)
[tilemill] Error: child process: "tile" failed with code "1"

Switch to request?

It might be a good idea to switch to request (more used and developed than node-get).

attachdb path that does not exist crashes tilemill

putting in:

/this/does/not/exist/business.sqlite

for an sqlite layer in tilemill crashes the app with:

/Users/dane/projects/tilemill_master/node_modules/millstone/lib/millstone.js:360
                            if (err) throw err;
                                     ^
Error: SQLITE_CANTOPEN: unable to open database file

naive localization of sqlite attached dbs

I'm going to move forward with a first stab at making it possible to use remote attached dbs. As advertised, I'm not going to be looking too far into the generalization of this functionality.

new tag

TileMill will need a new tag to pull in the srs fixes (#16) and https fixes (#17) once finished.

More informative HTTP errors (moved)

moved from mapbox/carto#67

Errors for failed layer requests like this

{ message: 'Server returned HTTP 403'
, statusCode: 403
}

would be more helpful if they included the url that was being requested. I had to manually check each layer source to see which one was failing (it was a typo on my part). Example 403 error: http://gis-data.s3.amazonaws.com/foobar.zip
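A sketch of what such an error could carry is shown below. The httpError name and the url property are assumptions for illustration; the message prefix and statusCode follow the error shown above.

```javascript
// Build an error for a failed request that names the url involved.
function httpError(statusCode, uri) {
    var err = new Error('Server returned HTTP ' + statusCode + ' for ' + uri);
    err.statusCode = statusCode;
    err.url = uri;
    return err;
}
```

With the url in both the message and a dedicated property, a caller can either print the message directly or map the failure back to the layer that referenced it.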

@springmeyer

curious one @ajashton, I've not seen that before. Just want to add that, as you likely know, missing data usually looks like:

{ message: 'File not found: /Users/dane/projects/arc.js/square.json'
, stack: [Getter/Setter]
}
{ message: 'File not found: /Users/dane/Desktop/route.shp'
, stack: [Getter/Setter]
}
{ message: [Getter/Setter]
, stack: [Getter/Setter]
, type: 'non_object_property_load'
, arguments: [ 'length', undefined ]
}

guessExtension heuristic fails on filenames with quotes

This was the cause of the failure to load KML from https://github.com/mapbox/tilemill/issues/1242.

Given a filename like New York City's Solidarity Economy.kml, guessExtension in millstone tries to deduce the path extension from just 'New York City', so no ext is found and the file type is unknown.

> var s = 'attachment; filename="New York City\'s Solidarity Economy.kml"'
> s.match(/filename=['"]?([^'";]+)['"]?/)
[ 'filename="New York City\'',
  'New York City',
  index: 12,
  input: 'attachment; filename="New York City\'s Solidarity Economy.kml"' ]
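One possible fix is to try a double-quoted match first, so apostrophes inside the name are kept, and only then fall back to an unquoted token. The function names below are hypothetical and this is not the guessExtension code in millstone. Single-quoted values are not handled, on the assumption that servers use double quotes for quoted filenames.

```javascript
var path = require('path');

// Extract the filename from a content-disposition header value.
function filenameFromDisposition(header) {
    // Prefer a double-quoted value, which may contain apostrophes;
    // otherwise take an unquoted token up to the next ';'.
    var match = header.match(/filename="([^"]+)"/i) ||
                header.match(/filename=([^";]+)/i);
    return match ? match[1].trim() : null;
}

// Derive the extension ('' when no filename is present).
function extensionFromDisposition(header) {
    var name = filenameFromDisposition(header);
    return name ? path.extname(name) : '';
}
```

For the header in the example above this yields the full name New York City's Solidarity Economy.kml and therefore the extension .kml.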
