
spatial_access's Introduction

spatial_access: Compute travel times and spatial access metrics at scale

Compute travel times and spatial access measures at scale (millions of origin-destination pairs in minutes). Travel times for three modes: walking, biking, driving. Spatial access measures: provider-to-people ratio, avg. time to nearest provider, count/attribute sum of nearby providers, weighted access scores and floating catchment areas.

Build status: Travis CI
Documentation: Read the Docs
Tested operating systems: Ubuntu, macOS

Components of spatial_access:

spatial_access has two submodules:

  • p2p: Generate many-to-many matrices of travel times for sets of coordinates. Use the walk, bike, or drive network types (import transit from other sources), or get the distance in meters.
  • Models: Contains a suite of models for calculating spatial accessibility to amenities.

To use this service as a REST API, see: https://github.com/GeoDaCenter/spatial_access_api

If you are a Windows user, instructions for installing Ubuntu on a virtual machine are at the bottom of the Readme.

Installation

  1. A modern compiler like gcc or clang.

  2. Dependencies

    • MacOS:

      brew install spatialindex

    • Ubuntu:

      sudo apt-get install libspatialindex-dev

      sudo apt-get install python-tk

  3. Package

    pip3 install spatial_access

More detailed installation instructions are in 0_Reqs_Install.ipynb.

Usage

See the iPython notebooks in docs/ for example usage. The first two notebooks contain installation instructions and run through a simple demo to confirm that your setup is installed correctly:

The remaining notebooks walk through how to run the travel time matrix and spatial access metrics, including main functions and parameters:

The data folder contains the input_data needed to estimate the metrics under sources (for origins) and destinations (for destinations).
In output_data, the matrices folder stores the estimated symmetric and asymmetric matrices.
The models folder contains the results of the models' analyses.
Finally, figures stores the results of maps and plots calculated during the process.

You can also download all of the notebooks in one PDF file here.

Overwriting default configuration values

p2p provides default configuration values for edge weights and node impedance (see spatial_access/Configs.py). You can override these as follows:

from spatial_access.p2p import TransitMatrix
from spatial_access.Configs import Configs
custom_config = Configs()
# set fields of custom_config
tm = TransitMatrix(..., configs=custom_config)
# continue with computation

Maintenance

Instructions for building locally (only for developers):

  • Additional requirements: cython and jinja2
  • To regenerate .pyx files, run: bash cythonize_extension.sh (TravisCI will do this automatically on deployment)
  • To install locally, run: sudo python3 setup.py install from spatial_access root directory
  • Unit tests require the pytest package. From package root directory, run python3 -m pytest tests/ to run all unit tests.

PyPi Maintenance

The package lives at: https://pypi.org/project/spatial-access/

When a branch is pulled into Master and builds/passes all unit tests, Travis CI will automatically deploy the build to PyPi.

To update PyPi access credentials, see .travis.yml and follow the instructions at https://docs.travis-ci.com/user/deployment/pypi/ to generate a new encrypted password.

Installing Ubuntu 18 LTS with dependencies from scratch (recommended for Windows users)

  1. Follow the instructions at this link: https://linus.nci.nih.gov/bdge/installUbuntu.html to set up a virtual machine
  2. sudo apt-get update
  3. sudo add-apt-repository universe
  4. sudo apt-get -y install python3-pip
  5. Continue with Installation Instructions (above)

Questions/Feedback?

[email protected]

Acknowledgments

Developed by Logan Noel at the University of Chicago's Center for Spatial Data Science (CSDS) with support from the Public Health National Center for Innovations (PHNCI), the University of Chicago, and CSDS.

spatial_access's People

Contributors

emochoa, gyoliver, ifarah, lixun910, puttingscienceintodatascience, vidal-anguiano

spatial_access's Issues

run check on spatial extent

Can we add a test on the spatial extent to make sure it's not wider or longer than, e.g., an MSA? Let us know if you want distance inputs for this.

from George:
I was running the code on Jerry's data and it was taking an unexpectedly long time, then I ran out of hard drive space. There are a couple of coordinates in the data which are far outside of Chicago (LA and DC actually), and a number that are in the Chicago suburbs. I think this just underscores the importance of adding some of the data checks. Ultimately, I think we do want input data verification in the web app, but until we have that, it may be good to add a quick check in the back-end code?
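A quick back-end check along these lines could flag stray points before any network is fetched. This is only a sketch under stated assumptions: the function names and the 150 km threshold are hypothetical, not part of spatial_access, and it measures distance from the median center rather than the width of the extent.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def find_distant_points(points, max_km=150.0):
    """Return ids of points farther than max_km from the median center.

    points: dict mapping an id to a (lat, lon) tuple.
    max_km: hypothetical threshold, roughly the scale of a large metro area.
    """
    center_lat = median(lat for lat, _ in points.values())
    center_lon = median(lon for _, lon in points.values())
    return [pid for pid, (lat, lon) in points.items()
            if haversine_km(center_lat, center_lon, lat, lon) > max_km]


# Illustrative coordinates only (not taken from the data in this issue).
points = {
    "loop": (41.8819, -87.6278),
    "hyde_park": (41.7943, -87.5907),
    "evanston": (42.0451, -87.6877),
    "los_angeles": (34.0522, -118.2437),
    "washington_dc": (38.9072, -77.0369),
}
print(find_distant_points(points))  # ['los_angeles', 'washington_dc']
```

Using the median rather than the mean keeps a few far-away points from dragging the center toward them.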

Deprecated Spatial Join

Bring spatial join (ScoreModel) back to get the community areas for each destination without clipping the destinations so we don't lose points outside the city boundaries.

Before:
Deprecated spatial join (ScoreModel) because points outside of the Chicago community boundaries are dropped (losing dest observations). In the health example, 17 points lay outside the city of Chicago boundaries going from 199 to 182 observations.

Protobuf causes linker error

Applies to commit: https://github.com/GeoDaCenter/spatial_access/tree/5fe8717bd6bed34b023a53b47e8f3ec49f48f201

Package builds but is unable to be imported because the .so is not found:

---> 1 from transitMatrixAdapter import *

ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/transitMatrixAdapter.cpython-35m-darwin.so, 2): Symbol not found: __ZN6google8protobuf7Message20DiscardUnknownFieldsEv
  Referenced from: /Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/transitMatrixAdapter.cpython-35m-darwin.so
  Expected in: flat namespace
 in /Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/transitMatrixAdapter.cpython-35m-darwin.so

ERROR: Running driving times

When running these files I got this error:

Screen Shot 2019-04-24 at 5 41 55 PM

Screen Shot 2019-04-24 at 5 41 48 PM

I created a new ID column instead of geoid10, but not sure what the problem is.

Coverage doesn't use specified category field (edge case)

Low priority, but probably worth addressing eventually.

Scenario: an input destination file has a field named "category", but the user points to a different field (e.g., "category_1") as containing category values.
The user also provides a list of values to the "categories" parameter of the Coverage class constructor.

It looks like the code never maps the internal data model's "category" field to the one provided by the user. It looks for the category values in the list provided by the user in the destination file's "category" field instead of the field specified and can't find them.

The main scenario I could see this coming up in is where a user has a legacy field "category", and then a current field called "category_facility" or "category_1" etc containing the 'real' category values. I've seen this kind of thing pretty frequently given the poor data management practices out there, the difficulty of updating schemas...
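The intended behavior can be sketched with plain dictionaries. This is not the Coverage class's code; the function name and the sample rows are hypothetical and only show filtering on the user-specified field instead of a hard-coded "category" field.

```python
def filter_by_category(rows, category_field, categories):
    """Keep rows whose value in the user-specified category_field is wanted."""
    if rows and category_field not in rows[0]:
        raise KeyError("field not found: " + category_field)
    wanted = set(categories)
    return [row for row in rows if row[category_field] in wanted]


# Hypothetical destination rows with a legacy "category" field and the
# 'real' values in "category_1".
rows = [
    {"id": 1, "category": "legacy", "category_1": "clinic"},
    {"id": 2, "category": "legacy", "category_1": "hospital"},
    {"id": 3, "category": "legacy", "category_1": "pharmacy"},
]

selected = filter_by_category(rows, "category_1", ["clinic", "hospital"])
print([row["id"] for row in selected])  # [1, 2]

# Filtering on the legacy field finds nothing, which mirrors the reported bug.
print(filter_by_category(rows, "category", ["clinic", "hospital"]))  # []
```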

specify exception category weight lists for AccessModel

#49

Add the ability for the user to input a category weight dict that overrides the default weight list for the categories specified when calculating Access scores (AccessModel).

This way, users won't have to derive a list of categories each time to set a few exceptions. I think there would be enough demand for such a feature that it makes sense to build it into the module.

@ifarah Any thoughts about this, Irene?
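One possible shape for this feature, as a sketch only: the default weight values, the constant name, and the function name below are hypothetical and do not reflect AccessModel's actual interface.

```python
# Hypothetical default weight list, for illustration only.
EXAMPLE_DEFAULT_WEIGHTS = [1.0, 0.5, 0.25]


def resolve_category_weights(categories, overrides=None,
                             default=EXAMPLE_DEFAULT_WEIGHTS):
    """Build a category -> weight list dict, applying per-category overrides."""
    overrides = overrides or {}
    unknown = set(overrides) - set(categories)
    if unknown:
        raise ValueError("overrides for unknown categories: "
                         + ", ".join(sorted(unknown)))
    # Copy each list so callers cannot mutate the shared default.
    return {cat: list(overrides.get(cat, default)) for cat in categories}


weights = resolve_category_weights(
    ["clinic", "hospital", "pharmacy"],
    overrides={"hospital": [1.0, 1.0]},
)
print(weights["hospital"])  # [1.0, 1.0]
print(weights["clinic"])    # [1.0, 0.5, 0.25]
```

With this shape the user lists only the exceptions, and every other category falls back to the default list.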

Print categories used in dictionary

Print out the categories used in access model.
Whenever the user does not specify a category in the weights dictionary, the default is DIMINISH_WEIGHTS. It would be great if the categories used in the model were printed out so the user has more control over the categories computed.
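A report like this could be produced with a small helper. The sketch below is hypothetical (the function name and message wording are not from the package) and only shows distinguishing user-specified categories from those falling back to the default.

```python
def describe_category_weights(categories, user_weights):
    """Return one line per category saying where its weights come from."""
    lines = []
    for cat in sorted(categories):
        source = "user-specified" if cat in user_weights else "default"
        lines.append("{}: {} weights".format(cat, source))
    return lines


for line in describe_category_weights({"clinic", "hospital"},
                                      {"hospital": [1.0, 1.0]}):
    print(line)
# clinic: default weights
# hospital: user-specified weights
```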

Add 6th “model type” for dest attribute aggregation

Add 6th “model type” for dest attribute aggregation for two new results:

DestSum: Sum of attribute of point destination within boundaries of larger area (like community area)
Example: Total funding going to providers in a cmty area

PCDestSum: Same as DestSum but per capita: Sum of attribute of point destination within boundaries/ Sum of pop within boundaries
Example: Total funding going to providers in a cmty area/ per population in that area
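The two measures can be sketched as follows, assuming each destination has already been assigned to an area (for example by a spatial join). The field names, sample values, and function names are hypothetical.

```python
from collections import defaultdict


def dest_sum(destinations, area_field, value_field):
    """DestSum: total of value_field over the destinations in each area."""
    totals = defaultdict(float)
    for dest in destinations:
        totals[dest[area_field]] += dest[value_field]
    return dict(totals)


def per_capita_dest_sum(totals, population):
    """PCDestSum: DestSum divided by the population of the same area.

    Areas with missing or zero population are skipped.
    """
    return {area: total / population[area]
            for area, total in totals.items() if population.get(area)}


destinations = [
    {"area": "A", "funding": 1000.0},
    {"area": "A", "funding": 500.0},
    {"area": "B", "funding": 300.0},
]
population = {"A": 3000, "B": 600}

totals = dest_sum(destinations, "area", "funding")
print(totals)                                   # {'A': 1500.0, 'B': 300.0}
print(per_capita_dest_sum(totals, population))  # {'A': 0.5, 'B': 0.5}
```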

Package for pysal?

Talking to @ifarah, I'll take a look at how to package/include this in pysal.

For me to get this through, we'd have to make sure this follows the "submodule contract" we wrote to ensure consistency across parts of pysal.

Not quite sure where this is after our website reorganization, but I'll look into it.

add default category weight list to AccessModel

Add the ability for users to specify a default category weight list to AccessModel. Julia agreed that this is useful functionality.

@ifarah Would you prefer that the user has to specify the default list whenever calculating access scores, or, on top of the ability to set a default list when running the code, would it also be useful to have it hard-coded into the module (i.e., in Configs.py) so users don't have to set a default list each time? I believe category weights aren't relevant to AccessCount, AccessSum, AccessTime, right?

increase impedance

I compared Logan's, Dan's, and GoogleMaps' results for a tract in Lincoln Park, Hyde Park, and Midway:
The results increased but are still below Dan's. Could you increase the impedance a bit more so I can rerun and revalidate?

  1. LP:
    Origin: 17031071100 (41.921800, -87.651100)
    Destination: 17031460400 (41.744500, -87.563700)
    Logan:1351 (22.52) NOW: 1357 (22.61)
    Dan:1440 (24)
    Googlemaps: 26-35 min

  2. HP
    Origin: 17031836200 (41.790469, -87.601284)
    Destination: 17031170100 (41.955900, -87.794800)
    Logan:1468 (24.46) NOW: 1762 (29.36)
    Dan:1980 (33)
    Googlemaps: 28-60 min

  3. Midway
    Origin: 17031561100 (41.788869, -87.771909)
    Destination: 17031030400 (41.986952, -87.672079)
    Logan:1674 (27.9) NOW: 1886 (31.43)
    Dan:2040 (34)
    Googlemaps: 30-50 min

Column of blank values in transit matrix output

An extra column of blank string values (rightmost column) is getting written into the transit matrix. TransitMatrix.write_to_file() fails when it tries to cast the blank values as floats.

Below is the error that gets thrown:

Traceback (most recent call last):
  File "command_line_test.py", line 87, in <module>
    upper=int(30))
  File "/Users/georgeyoliver/GitHub/CSDS/GeoDaCenter/spatial_access/spatial_access/CommunityAnalytics.py", line 301, in __init__
    self.load_sp_matrix(sp_matrix_filename)
  File "/Users/georgeyoliver/miniconda3/envs/web_app/lib/python3.6/site-packages/spatial_access/ScoreModel.py", line 307, in load_sp_matrix
    self.dicto[row[0]][int(float(self.dest_2[i]))] = int(float(row[i]))
ValueError: could not convert string to float:
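Until the writer stops emitting the extra column, a reader could tolerate it by dropping trailing blank cells before casting. This is a sketch with a hypothetical helper and a made-up sample file, not the code in ScoreModel.py.

```python
import csv
import io


def parse_matrix_row(row):
    """Split a CSV row into (origin_id, values), dropping trailing blanks."""
    cells = list(row)
    while cells and cells[-1].strip() == "":
        cells.pop()
    return cells[0], [int(float(cell)) for cell in cells[1:]]


# Made-up matrix with a trailing comma on every line, which csv.reader
# turns into an extra empty-string cell.
sample = "id,101,102,\n5,120,340,\n6,90,15,\n"
reader = csv.reader(io.StringIO(sample))
header = next(reader)
for row in reader:
    print(parse_matrix_row(row))
# ('5', [120, 340])
# ('6', [90, 15])
```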

Add AccessSum

Could you add:

AccessSum

Same as our AccessCount, but this would sum an attribute of the destination (e.g. funding amount) within the catchment area of an origin point instead of counting the number of destination points.
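The difference between the two measures for a single origin can be sketched like this. The function names, the travel times, and the funding values are hypothetical, and the catchment is taken to be every destination within `upper` travel-time units.

```python
def access_count(travel_times, upper):
    """Number of destinations reachable within the catchment threshold."""
    return sum(1 for t in travel_times.values() if t <= upper)


def access_sum(travel_times, attribute, upper):
    """Sum of a destination attribute over the same catchment."""
    return sum(attribute[dest] for dest, t in travel_times.items()
               if t <= upper)


# Travel times from one origin to three destinations (made-up values).
times = {"d1": 10, "d2": 25, "d3": 45}
funding = {"d1": 100.0, "d2": 250.0, "d3": 900.0}

print(access_count(times, upper=30))         # 2
print(access_sum(times, funding, upper=30))  # 350.0
```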

Prefetch networks

Add a feature to natively support prefetching networks from OSM, without continuing with matrix computation.

rename

if you could update:

FloatingCatchmentAreaDest --> 2SFCA (if that's the 2nd column = sum of ratios at origin)
FloatingCatchmentArea --> coverage (if that's the ratio only at destination)

output AccessModel

The output of AccessModel gives results for "good_access".
I still don't understand the reasoning for including those results in the output instead of just having the results of the access score.

Screen Shot 2019-04-19 at 10 05 17 AM

ValueError: Latitude must be in the [-90; 90] range.

Hello Logan,
I believe the code is throwing this error for latitudes, reading from the longitude column.
(see error below)
I specified latitudes as coord_y and longitude as coord_x. coord_x is indeed ranging [-91,90], but coord_y is ranging [36,40]
Screen Shot 2019-04-23 at 11 29 48 AM
Screen Shot 2019-04-23 at 11 30 03 AM

I'm also copying and pasting the error just in case it's easier for you.

INFO:spatial_access.p2p:Total number of rows in the dataset: 119
/usr/local/lib/python3.6/site-packages/geopy/point.py:81: UserWarning: Latitude normalization has been prohibited in the newer versions of geopy, because the normalized value happened to be on a different pole, which is probably not what was meant. If you pass coordinates as positional args, please make sure that the order is (latitude, longitude) or (y, x) in Cartesian terms.
UserWarning)

ValueError Traceback (most recent call last)
in ()
----> 1 matrix.process()

/usr/local/lib/python3.6/site-packages/spatial_access-0.1.6.13-py3.6-macosx-10.13-x86_64.egg/spatial_access/p2p.py in process(self)
461 self.network_type, self.epsilon)
462
--> 463 self.prefetch_network()
464
465 is_symmetric = self.secondary_input is None and self.network_type is 'drive'

/usr/local/lib/python3.6/site-packages/spatial_access-0.1.6.13-py3.6-macosx-10.13-x86_64.egg/spatial_access/p2p.py in prefetch_network(self)
440 self.secondary_data,
441 self.secondary_input is not None,
--> 442 self.epsilon)
443
444 def clear_cache(self):

/usr/local/lib/python3.6/site-packages/spatial_access-0.1.6.13-py3.6-macosx-10.13-x86_64.egg/spatial_access/NetworkInterface.py in load_network(self, primary_data, secondary_data, secondary_input, epsilon)
157 self._try_create_cache()
158 self._get_bbox(primary_data, secondary_data,
--> 159 secondary_input, epsilon)
160 if self._network_exists():
161 filename = self.get_filename()

/usr/local/lib/python3.6/site-packages/spatial_access-0.1.6.13-py3.6-macosx-10.13-x86_64.egg/spatial_access/NetworkInterface.py in _get_bbox(self, primary_data, secondary_data, secondary_input, epsilon)
100 self.bbox = [lon_min, lat_min, lon_max, lat_max]
101 if self.area_threshold:
--> 102 approx_area = self._approximate_bbox_area()
103 if approx_area > self.area_threshold:
104 if self.logger:

/usr/local/lib/python3.6/site-packages/spatial_access-0.1.6.13-py3.6-macosx-10.13-x86_64.egg/spatial_access/NetworkInterface.py in _approximate_bbox_area(self)
56 lower_right_point = (self.bbox[3], self.bbox[0])
57 upper_left_point = (self.bbox[1], self.bbox[2])
---> 58 lower_edge = distance.distance(lower_left_point, lower_right_point).km
59 left_edge = distance.distance(lower_left_point, upper_left_point).km
60 area = lower_edge * left_edge

/usr/local/lib/python3.6/site-packages/geopy/distance.py in __init__(self, *args, **kwargs)
382 kwargs.pop('iterations', 0)
383 major, minor, f = self.ELLIPSOID # pylint: disable=W0612
--> 384 super(geodesic, self).__init__(*args, **kwargs)
385
386 def set_ellipsoid(self, ellipsoid):

/usr/local/lib/python3.6/site-packages/geopy/distance.py in __init__(self, *args, **kwargs)
160 elif len(args) > 1:
161 for a, b in util.pairwise(args):
--> 162 kilometers += self.measure(a, b)
163
164 kilometers += units.kilometers(**kwargs)

/usr/local/lib/python3.6/site-packages/geopy/distance.py in measure(self, a, b)
403 # Call geographiclib routines for measure and destination
404 def measure(self, a, b):
--> 405 a, b = Point(a), Point(b)
406 lat1, lon1 = a.latitude, a.longitude
407 lat2, lon2 = b.latitude, b.longitude

/usr/local/lib/python3.6/site-packages/geopy/point.py in __new__(cls, latitude, longitude, altitude)
169 )
170 else:
--> 171 return cls.from_sequence(seq)
172
173 if single_arg:

/usr/local/lib/python3.6/site-packages/geopy/point.py in from_sequence(cls, seq)
408 raise ValueError('When creating a Point from sequence, it '
409 'must not have more than 3 items.')
--> 410 return cls(*args)
411
412 @classmethod

/usr/local/lib/python3.6/site-packages/geopy/point.py in __new__(cls, latitude, longitude, altitude)
181
182 latitude, longitude, altitude =
--> 183 _normalize_coordinates(latitude, longitude, altitude)
184
185 self = super(Point, cls).__new__(cls)

/usr/local/lib/python3.6/site-packages/geopy/point.py in _normalize_coordinates(latitude, longitude, altitude)
80 '(latitude, longitude) or (y, x) in Cartesian terms.',
81 UserWarning)
---> 82 raise ValueError('Latitude must be in the [-90; 90] range.')
83
84 if abs(longitude) > 180:

ValueError: Latitude must be in the [-90; 90] range.
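As the geopy warning in the log says, points passed positionally must be in (latitude, longitude) order. One way to surface a swapped pair earlier is to validate the ranges before any distance code runs. A minimal sketch, with a hypothetical function name and made-up coordinates shaped like the ones described above:

```python
def validate_lat_lon(lats, lons):
    """Raise ValueError if any latitude or longitude is out of range."""
    if any(abs(lat) > 90 for lat in lats):
        raise ValueError("Latitude outside [-90, 90]; are the x (longitude) "
                         "and y (latitude) columns swapped?")
    if any(abs(lon) > 180 for lon in lons):
        raise ValueError("Longitude outside [-180, 180].")


coord_x = [-90.6, -90.2, -91.0]  # longitudes (made-up values)
coord_y = [36.5, 38.1, 39.9]     # latitudes (made-up values)

validate_lat_lon(coord_y, coord_x)  # correct order: no error

try:
    validate_lat_lon(coord_x, coord_y)  # swapped order
except ValueError as err:
    print(err)
```

A range check cannot catch every swap (both values may fall inside [-90, 90]), but it does catch the case reported here.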

Define limit for bounding box

Throw an error if the bounding box is too large. Maybe define a maximum feasible area for the bounding box and throw an error if the box exceeds it?
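A rough version of such a limit is sketched below. It approximates the box as a rectangle on a sphere; the function names and the 50,000 km² limit are hypothetical, and the bounding boxes are illustrative values in [lon_min, lat_min, lon_max, lat_max] order.

```python
import math

EARTH_RADIUS_KM = 6371.0


def approx_bbox_area_km2(lon_min, lat_min, lon_max, lat_max):
    """Approximate area of a lon/lat bounding box in square kilometers."""
    mid_lat = math.radians((lat_min + lat_max) / 2)
    # East-west extent shrinks with the cosine of the latitude.
    width = EARTH_RADIUS_KM * math.radians(lon_max - lon_min) * math.cos(mid_lat)
    height = EARTH_RADIUS_KM * math.radians(lat_max - lat_min)
    return width * height


def check_bbox(bbox, max_area_km2=50000.0):
    """Return the approximate area, or raise ValueError if it is too large."""
    area = approx_bbox_area_km2(*bbox)
    if area > max_area_km2:
        raise ValueError(
            "bounding box is about {:.0f} km^2, above the limit of {:.0f} km^2"
            .format(area, max_area_km2))
    return area


chicago = [-87.94, 41.64, -87.52, 42.02]
print(round(check_bbox(chicago)))  # on the order of 1,500 km^2

chicago_plus_la = [-118.24, 34.05, -87.52, 42.02]
try:
    check_bbox(chicago_plus_la)
except ValueError as err:
    print(err)
```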

Change naming of xcol and ycol in p2p.py

For backend users:
Lines 249-276 in p2p.py label xcol as latitude, although the x coordinate corresponds to longitude. This may confuse users, so rename xcol to longitude and ycol to latitude.

disconnected network

Previously, Shiv implemented Kosaraju's algorithm to find the strongly connected components of the network. Some of the observations of the access score were zero because the nodes were not connected to the network.

These values are now set to 65,535; in the meantime, they should be treated as missing.
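Treating the sentinel as missing could look like the sketch below. The helper names are hypothetical; the only value taken from this issue is 65,535, which is also the largest unsigned 16-bit integer.

```python
UNREACHABLE = 65535  # sentinel value reported in this issue


def mask_unreachable(times, sentinel=UNREACHABLE):
    """Replace the sentinel with None so it is not read as a travel time."""
    return [None if t == sentinel else t for t in times]


def mean_reachable(times, sentinel=UNREACHABLE):
    """Mean travel time over reachable destinations, or None if none exist."""
    reachable = [t for t in times if t != sentinel]
    return sum(reachable) / len(reachable) if reachable else None


row = [300, 65535, 1200]
print(mask_unreachable(row))  # [300, None, 1200]
print(mean_reachable(row))    # 750.0
```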
