kevinschaul / binify

A command-line tool to better visualize crowded dot density maps.

License: MIT License

Python 100.00%

binify's People

jeremyjbowers albertoconti datadesk jessejajit bradbaker gabrielflorit barrycug smoucka sureddy scdavis50 tommylees112 motiteux iq-scm

binify's Issues

`count_intersections()` is slow

Algorithm probably has to be O(n^2). Look into optimizations.
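
One common way to avoid testing every point against every polygon is a coarse grid index: bucket the points once, then check each polygon only against the buckets its bounding box overlaps. The following is a minimal sketch of that idea, not binify's actual code. `build_point_index` and `candidates_in_bbox` are hypothetical names, and the exact point-in-polygon test still has to be run on the candidates it returns.

```python
from collections import defaultdict
import math


def build_point_index(points, cell_size):
    """Bucket (x, y) points by the coarse square grid cell that contains them."""
    index = defaultdict(list)
    for x, y in points:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        index[key].append((x, y))
    return index


def candidates_in_bbox(index, cell_size, bbox):
    """Yield points from every bucket that overlaps bbox.

    bbox is (min_x, min_y, max_x, max_y). The result is a superset of the
    points inside the polygon, so an exact test is still needed afterwards.
    """
    min_x, min_y, max_x, max_y = bbox
    first_i, last_i = math.floor(min_x / cell_size), math.floor(max_x / cell_size)
    first_j, last_j = math.floor(min_y / cell_size), math.floor(max_y / cell_size)
    for i in range(first_i, last_i + 1):
        for j in range(first_j, last_j + 1):
            # .get() avoids creating empty buckets in the defaultdict
            yield from index.get((i, j), [])
```

With a bucket size close to the hexagon width, each polygon would only look at a handful of buckets instead of the full point set.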

Show progress during calculations

The process can take quite some time, especially `count_intersections()`.
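
A rough sketch of what this could look like using only the standard library. `with_progress` is a hypothetical helper, not an existing binify function; it wraps any loop and writes a percentage to stderr as items are consumed.

```python
import sys


def with_progress(items, total, stream=sys.stderr, every=1):
    """Yield items unchanged while writing a running percentage to stream."""
    for done, item in enumerate(items, start=1):
        # Only redraw every `every` items (and always on the last one)
        if done % every == 0 or done == total:
            stream.write("\r%3d%%" % (100 * done // total))
            stream.flush()
        yield item
    stream.write("\n")
```

A loop over the grid's polygons could then be written as `for polygon in with_progress(polygons, len(polygons)):`, where `polygons` stands in for whatever sequence is being processed.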

Does PyPI install gdal (and other dependencies) correctly?

All my machines have gdal preinstalled. I can test using a virtual box.

What happens when input shapefile does not contain a point layer?

Add custom extent to README.md

Windows support

.. probably does not exist yet. Is there enough interest?

binify other things than density

Hi,
I would like to display the exact thing your tool is displaying.
But my data is not a cloud of points. It is a function which associates a quantity with a location.

There is a difference, since zooming on a bunch of points will disperse them, thus the density will decrease. Zooming on a location doesn't change that location, thus the function will not change its output.
But when zooming in, more hexagons appear, so there are more hexagon colors to compute.

I have the feeling that I am not being very clear, so ask me about anything that doesn't make sense.
I will try to use your code to implement the tool I am describing in 2 or 3 weeks if I can find the time, and I will open a pull request then.

Better documentation

Something using GitHub Pages.

Postgres support

Seems simple enough to do, but right now my patch just segfaults.

https://github.com/datadesk/binify

Output doesn't include .prj file

URL change for Waterloo, IA map

The URL has changed for our crime map that's listed in the "In the wild" section. The new, permanent home for the map is: http://wcfcourier.com/app/crime_map2013/index_wloo.php

Thanks again for listing it!

Readme language is unclear

The description of the project is cryptic

more analysis options beyond count

It would be good if you could average a value in an attribute, for instance.

Theoretical time for execution

I'm trying to bin 30,000 lat/lng pairs with an -n of 120, and it looks like it will take about an hour (2 GHz MacBook Pro, 8 GB memory). Is that to be expected? From what I understand of the theory, it should run pretty quickly. I may be doing something wrong.

Create QGIS plugin

Move main code out of `cli.py`

Only cli logic belongs in this file.

Pass a parameter to COUNT?

In many cases, I have a series of points that are already aggregate values of some data point. For example, I might have a point for every county with an attached field in the .shp file for jobs created in that county in the past month.

My goal in this case would be to create hexagonal bins where COUNT = jobs in that hexagon. Would it be possible for COUNT to represent not just an actual count but a sum of a given field?
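
To make the request concrete, here is a minimal sketch of the aggregation step, assuming each point has already been assigned to a hexagon. `aggregate_by_bin` and its `how` parameter are hypothetical and are not an existing binify option; the sketch only shows that a count, a sum, and a mean can all come from the same pass over the points.

```python
from collections import defaultdict


def aggregate_by_bin(assignments, how="count"):
    """Aggregate (bin_id, value) pairs per bin.

    bin_id identifies the hexagon a point fell into; value is the point's
    attribute (for example, jobs created in a county).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for bin_id, value in assignments:
        totals[bin_id] += value
        counts[bin_id] += 1

    if how == "count":
        return dict(counts)
    if how == "sum":
        return dict(totals)
    if how == "mean":
        return {bin_id: totals[bin_id] / counts[bin_id] for bin_id in counts}
    raise ValueError("unknown aggregation: %r" % how)
```

With `how="sum"`, the value written to COUNT would be the total of the chosen field in each hexagon rather than the number of points.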

Ensure code is standardized

Possibly by using pylint

Implement number of hexagons option

Python 3 support

Option to ignore zero-count hexagons?

I've been playing with binify for some data I gathered by geocoding about 30,000 addresses. I needed a --num-across value of about 200 to get the granularity I wanted, which created a huge .shp file. To get it in shape for graphing with d3, I converted it to GeoJSON and wrote a short Python script to remove all polygons where COUNT=0. After converting the result to TopoJSON, the file came down to a very manageable 169KB.

I thought this might make for a useful flag to the command line tool.

Would also be happy to publish this example as a gist or elsewhere. The final product looks great.
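
A script along these lines might look like the sketch below. It is not the script referred to above, only an illustration that assumes a standard GeoJSON FeatureCollection whose features carry a `COUNT` property; `drop_empty_bins` and `filter_file` are hypothetical names.

```python
import json


def drop_empty_bins(feature_collection, field="COUNT"):
    """Return a copy of the FeatureCollection without zero-count features."""
    kept = [
        feature
        for feature in feature_collection["features"]
        # "properties" may be null in GeoJSON, so fall back to an empty dict
        if (feature.get("properties") or {}).get(field, 0) != 0
    ]
    return dict(feature_collection, features=kept)


def filter_file(src_path, dst_path, field="COUNT"):
    """Read GeoJSON from src_path and write the filtered copy to dst_path."""
    with open(src_path) as src:
        data = json.load(src)
    with open(dst_path, "w") as dst:
        json.dump(drop_empty_bins(data, field), dst)
```

Features with no `COUNT` property at all are treated as empty and dropped, which is a design choice of this sketch rather than documented behavior.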

progressbar 2.3 dependency issue

It seems progressbar v2.3 is unavailable through pip, so installing binify that way (in a virtualenv, per your recommendation) chokes on the dependencies. On PyPI, the last stable version is 2.2, released nearly 8 years ago, which makes me think I'm missing something here...

Any help would be greatly appreciated.

`create_grid()` only works with test shapefile

The math behind create_grid() is nonexistent. The basic logic is there, but the constant multiplier values are off.

Performance issues with 700K points

I have more than 700K points that I want to binify. With n between 20 and 120, it finishes in anywhere from 1 hour to 1 day, but with n=1000 it had completed almost 1% after 18 hours.

Is there anything you suggest for binning these points quickly?