
tigernet's Introduction


TigerNet

Network Topology via TIGER/Line Edges


What is TigerNet and how does it work?

TigerNet is an open-source Python library that addresses topological concerns in TIGER/Line data, specifically TIGER/Line edges, and builds accurate spatial network representations from them. This is achieved through a 7-step process that is roughly as follows:

  1. creation of initial TIGER/Line edges subset (features with a road-type MTFCC)
  2. creation of initial segments subset (retain only specified road-type MTFCCs)
  3. welding of limited-access segments (limited-access segments — freeways, etc. — that share a non-articulation point are isolated and welded together)
  4. welding of general segments (surface street segments that share a non-articulation point are isolated and welded together)
  5. splitting of general segments (surface street segments that cross at known intersections are split)
  6. cleansing of the segment data (steps 4 and 5 are repeated until the data is deemed "clean" enough for network instantiation)
  7. building of the network (creation of network topology with the option of further simplification to eliminate all remaining non-articulation points — a pseudo graph-theoretic object — while maintaining spatial accuracy)
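
The welding idea in steps 3 and 4 can be illustrated with a minimal, pure-Python sketch. It assumes that a non-articulation point can be approximated as an endpoint touched by exactly two distinct segments, and it represents each segment as a list of `(x, y)` tuples. The function `weld_segments` is a hypothetical helper for illustration and is not part of the tigernet API, which works on geopandas data and also accounts for MTFCC road types.

```python
from collections import defaultdict


def weld_segments(segments):
    """Weld polylines that meet at an endpoint shared by exactly two segments.

    `segments` is a list of polylines, each a list of (x, y) tuples.
    Hypothetical helper for illustration only.
    """
    segments = [list(seg) for seg in segments]
    while True:
        # Map each endpoint to the indices of the segments that touch it.
        touching = defaultdict(list)
        for idx, seg in enumerate(segments):
            touching[seg[0]].append(idx)
            touching[seg[-1]].append(idx)
        # Look for a point shared by exactly two distinct segments.
        # A ring touches its own endpoint twice and is skipped.
        pair = next(
            (
                (pt, ids)
                for pt, ids in touching.items()
                if len(ids) == 2 and ids[0] != ids[1]
            ),
            None,
        )
        if pair is None:
            return segments
        pt, (i, j) = pair
        a, b = segments[i], segments[j]
        # Orient both lines so that `a` ends at the shared point
        # and `b` starts there.
        if a[-1] != pt:
            a = a[::-1]
        if b[0] != pt:
            b = b[::-1]
        welded = a + b[1:]
        segments = [s for k, s in enumerate(segments) if k not in (i, j)]
        segments.append(welded)


# A chain of two segments meeting at (1, 0), followed by a
# three-way intersection at (2, 0) that must be preserved.
streets = [
    [(0, 0), (1, 0)],
    [(1, 0), (2, 0)],
    [(2, 0), (3, 0)],
    [(2, 0), (2, 1)],
]
welded_streets = weld_segments(streets)
```

In this toy input only the point `(1, 0)` is welded away; the intersection at `(2, 0)` is touched by three segments and therefore stays as a node.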

Important

After some consideration, this repo will serve as a stub for the tigernet implementation developed for Gaboardi (2019), which can be cited in future publications through its DOI. Some of the concepts are already being incorporated into spaghetti, and more of the original tigernet functionality may follow (such as network measures, pysal/spaghetti#126).

Examples

Installation

Currently, tigernet officially supports Python 3.8, 3.9, and 3.10.

(Recommended) Install the current release via conda-forge by running:

$ conda install --channel conda-forge tigernet

Install the current release from PyPI by running:

$ pip install tigernet

Install the most current development version of tigernet by running:

$ pip install git+https://github.com/jGaboardi/tigernet

Support

If you are having issues, please create an issue.

License

The project is licensed under the BSD 3-Clause license.

Citations

@misc{tigernet_gaboardi_2019,
  author  = {James David Gaboardi},
  title   = {jGaboardi/tigernet},
  month   = {aug},
  year    = {2019},
  doi     = {10.5281/zenodo.204572461},
  url     = {https://github.com/jGaboardi/tigernet}
}

Related projects

References

  • The original method for tigernet is described in Chapter 1 of Gaboardi (2019).
    • James D. Gaboardi (2019). Populated Polygons to Networks: A Population-Centric Approach to Spatial Network Allocation. ProQuest Dissertations Publishing.
  • The results of secondary analysis (spatial representations of population) were presented in Gaboardi (2020) and can also be found in Chapter 3 of Gaboardi (2019).
    • James D. Gaboardi (2020, November). Validation of Abstract Population Representations. Presented at the 2019 Atlanta Research Data Center Annual Research Conference at Vanderbilt University (ARDC), Nashville, Tennessee: Zenodo. DOI
  • The WeightedParcels_Leon_FL_2010 dataset is based on that used in Gaboardi (2019), which was produced in Strode et al. (2018).
    • Georgianna Strode, Victor Mesev, and Juliana Maantay (2018). Improving Dasymetric Population Estimates for Land Parcels: Data Pre-processing Steps. Southeastern Geographer 58 (3), 300–316. doi: 10.1353/sgo.2018.0030.

tigernet's People

Contributors

dependabot[bot], jgaboardi, martinfleis

tigernet's Issues

Add edge test cases

Add edge test cases:

  • utils._drop_geoms() 640d661
  • utils._weld_MultiLineString()
    • 3+ parts with one set of equal sp/ep removed code; unnecessary in workflow
    • sp1 almost equals ep2 016b775
  • utils.ring_correction()
  • utils.record_filter()
  • utils.restriction_welder()
  • utils.line_splitter()
    • no __iter__ attribute 4136694 refactored code
  • utils._create_split_lines()
    • unaltered line 7d67f19
    • create endpoint for ring roads
  • utils._find_break_locs()
  • utils._make_break_locs()
    • breaks and standard 56a9600
    • breaks and not standard line/mline 779ef85

customize auto notes

Update labels for notes customization and try PR with more than one label (e.g. enhancement and tiger_edges).

See here.

Find and resolve bottlenecks

By far the two largest time/memory hogs are [n2n_matrix] and [paths] within Network.cost_matrix():

# calculate shortest path length and records paths if desired
n2n_matrix, paths = utils.shortest_path(self, gp=wpaths)

An attempt to speed these up should be made...

  • try numba?
  • simple multiprocessing?
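
As a rough sketch of the "simple multiprocessing" idea: each row of a node-to-node cost matrix is an independent single-source shortest-path problem, so the rows can be farmed out to worker processes. The code below is a standalone illustration using only the standard library. It does not reproduce `utils.shortest_path`; the adjacency format and the names `dijkstra` and `node_cost_matrix` are assumptions made for the example.

```python
import heapq
from concurrent.futures import ProcessPoolExecutor
from functools import partial


def dijkstra(adj, source):
    """Return {node: cost} for every node reachable from `source`.

    `adj` maps node -> {neighbor: edge_cost} (assumed format).
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry; a shorter route was already found
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def node_cost_matrix(adj, workers=None):
    """All-pairs costs as {source: {target: cost}}.

    With `workers=None` the rows are computed serially; otherwise each
    source node is handed to a process pool with that many workers.
    """
    sources = sorted(adj)
    run = partial(dijkstra, adj)
    if workers is None:
        rows = list(map(run, sources))
    else:
        with ProcessPoolExecutor(max_workers=workers) as pool:
            rows = list(pool.map(run, sources))
    return dict(zip(sources, rows))


# Tiny triangle network with one expensive direct edge (0 <-> 2).
toy_adj = {
    0: {1: 1.0, 2: 4.0},
    1: {0: 1.0, 2: 2.0},
    2: {0: 4.0, 1: 2.0},
}
toy_matrix = node_cost_matrix(toy_adj)
```

When passing `workers`, the call should sit under an `if __name__ == "__main__":` guard on platforms that spawn new interpreters. Each task also pickles the adjacency structure, so any gain depends on the network being large enough to outweigh that overhead.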

flesh out README.md

To Do's for README.md:

  • Title/Documentation Link
  • add functionality blurb — #4
  • badges
    • GitHub tag badge — 6a863bc
    • travis build badge
    • python versions badge
    • DOI badge
    • coverage
    • code style (black)
    • anaconda stuff
  • Installation
  • Requirements
  • License
  • DOI badge
  • BibTeX Citation

Add known `discard_segs` in info.py

# known issues with Leon County FL 2010 Tiger Edges
OLD_BLAIRSTONE = [
    618799725, 618799786, 618799785, 634069404, 618799771, 618799763, 610197711
]
SILVER_LAKE = [
    82844664, 82844666, 82844213, 82844669, 82844657, 82844670, 82844652, 82844673
]
discard_segs = OLD_BLAIRSTONE + SILVER_LAKE
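
A minimal sketch of how such a list might be applied when preparing the edges. It assumes the identifiers are TIGER/Line `TLID` values and uses plain dictionaries in place of a GeoDataFrame; `drop_discarded` is a hypothetical helper, not a tigernet function.

```python
def drop_discarded(records, discard, id_field="TLID"):
    """Return the records whose identifier is not in `discard`."""
    discard = set(discard)  # set membership keeps the filter linear
    return [rec for rec in records if rec[id_field] not in discard]


# Two identifiers taken from the lists above plus one made-up edge (TLID 1).
edges = [
    {"TLID": 618799725, "MTFCC": "S1400"},
    {"TLID": 82844664, "MTFCC": "S1400"},
    {"TLID": 1, "MTFCC": "S1400"},
]
kept = drop_discarded(edges, [618799725, 82844664])
```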

upload+update+streamline code base

  • from jGaboardi_dissertation structure (at1866_Master):

    • tigernet/__init__.py
    • tigernet/spaghetti.py
    • tigernet/sauce.py
    • tigernet/utils.py
    • tigernet/dissertation_workflows.py
  • to jGaboardi/tigernet structure:

    • tigernet/__init__.py
    • tigernet/spaghetti.py --> tigernet/tigernet.py
    • tigernet/sauce.py
      • tigernet/dataprep.py, tigernet/netbuild.py, tigernet/point_funcs.py
    • tigernet/utils.py
      • tigernet/utils.py (utils+sauce+dissertation_workflows)?
      • integrate with others?
    • tigernet/dissertation_workflows.py

document tigernet-feedstock, etc.

Following #61

  • add auto merge bot to tigernet-feedstock
  • add conda-forge badge to README
  • add recipe badge to README
  • sync description here with that of tigernet-feedstock

disabled Windows tests

Currently, a sizable portion of the CI tests are disabled when running on Windows due to an undiagnosed issue that produces incorrect results. Periodically go back and check whether these tests are passing.

IndexError in Network.calc_net_stats() --> dict for node2coords?

Probably should convert all lookups from lists to dictionaries. Not sure why I made all these lists in the dissertation implementation...

See also #29

To reproduce:

bbox = (-84.279,30.480,-84.245,30.505)
f = "zip://test_data/Edges_Leon_FL_2010.zip!Edges_Leon_FL_2010.shp"
gdf = geopandas.read_file(f, bbox=bbox)
gdf = gdf.to_crs("epsg:2779")
yes_roads = gdf["MTFCC"].str.startswith("S")
roads = gdf[yes_roads].copy()

kwargs = {"s_data": roads.copy(), "from_raw": True}
attr_kws = {"attr1": ATTR1, "attr2": ATTR2}
kwargs.update(attr_kws)
comp_kws = {"record_components": True, "largest_component":True}
kwargs.update(comp_kws)
graph_elems_kws = {"def_graph_elems": True}
kwargs.update(graph_elems_kws)
geom_kws = {"record_geom": True, "calc_len": True}
kwargs.update(geom_kws)
mtfcc_kws = {"discard_segs": discard_segs, "skip_restr": SKIP_RESTR}
mtfcc_kws.update({"mtfcc_split": INTRST, "mtfcc_intrst": INTRST})
mtfcc_kws.update({"mtfcc_split_grp": SPLIT_GRP, "mtfcc_ramp": RAMP})
mtfcc_kws.update({"mtfcc_split_by": SPLIT_BY, "mtfcc_serv": SERV_DR})
kwargs.update(mtfcc_kws)
net = tigernet.Network(**kwargs)
net.calc_net_stats()
IndexError                                Traceback (most recent call last)
<ipython-input-84-bab5579caf99> in <module>
     14 kwargs.update(mtfcc_kws)
     15 net = tigernet.Network(**kwargs)
---> 16 net.calc_net_stats()

~/tigernet/tigernet/tigernet.py in calc_net_stats(self, conn_stat)
    622 
    623         # Calculate the sinuosity of network segments and provide descriptive stats
--> 624         stats.calc_sinuosity(self)
    625 
    626         # Set node degree attributes

~/tigernet/tigernet/stats.py in calc_sinuosity(net)
     18     # Calculate absolute shortest path along segments
     19     # Euclidean distance from vertex1 to vertex2
---> 20     net.s_data = utils.euc_calc(net, col="euclid")
     21 
     22     # Calculate sinuosity for segments

~/tigernet/tigernet/utils.py in euc_calc(net, col)
   1274     net.s_data[col] = numpy.nan
   1275     for (seg_k, (n1, n2)) in net.segm2node:
-> 1276         p1, p2 = net.node2coords[n1][1][0], net.node2coords[n2][1][0]
   1277         ed = _euc_dist(p1, p2)
   1278         net.s_data.loc[(net.s_data[net.sid_name] == seg_k), col] = ed

IndexError: list index out of range
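
One plausible reading of this failure, offered as an assumption rather than a diagnosis: the list-based lookup is indexed by node id, so it breaks as soon as node ids stop matching list positions (for example, after nodes are pruned). The toy data below mimics the shapes implied by the traceback (`[node_id, [(x, y)]]` and `[segment_id, [n1, n2]]`) and compares positional indexing with a dictionary keyed on node id. All names and values are hypothetical.

```python
import math

# Node ids 1 and 2 are missing, so ids no longer match list positions.
node2coords_list = [
    [0, [(0.0, 0.0)]],
    [3, [(3.0, 4.0)]],
]
segm2node = [[0, [0, 3]]]


def euclidean_by_position(segm2node, node2coords):
    """Index the list by node id, as in the failing line of euc_calc()."""
    out = {}
    for seg_k, (n1, n2) in segm2node:
        p1, p2 = node2coords[n1][1][0], node2coords[n2][1][0]
        out[seg_k] = math.dist(p1, p2)
    return out


def euclidean_by_key(segm2node, node2coords):
    """Build a dict keyed on node id first, then look nodes up by key."""
    lookup = {node: coords[0] for node, coords in node2coords}
    return {
        seg_k: math.dist(lookup[n1], lookup[n2])
        for seg_k, (n1, n2) in segm2node
    }


try:
    euclidean_by_position(segm2node, node2coords_list)
    positional_failed = False
except IndexError:
    positional_failed = True

by_key = euclidean_by_key(segm2node, node2coords_list)
```

With a dictionary, a missing node surfaces as a `KeyError` that names the offending id, which is easier to debug than a positional `IndexError`.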

CI testing in Python 39 [2nd try]

Current failures blocking Python 3.9 support:

======================================================================= short test summary info =======================================================================
FAILED tigernet/tests/test_tigernet_empirical_gdf.py::TestNeworkTopologyEmpiricalGDF::test_network_node2node - AssertionError: Lists differ: [5, 8, 9] != [8, 9, 5]
FAILED tigernet/tests/test_tigernet_empirical_gdf.py::TestNeworkComponentsEmpiricalGDF::test_network_ndata_components - AssertionError: Lists differ: [360, 99, 99, ...
FAILED tigernet/tests/test_tigernet_empirical_gdf.py::TestNeworkComponentsEmpiricalGDF::test_network_node_components_largest - KeyError: 281
FAILED tigernet/tests/test_tigernet_empirical_gdf.py::TestNetworkSimplifyEmpiricalGDF::test_simplify_copy_node_cc - KeyError: 89
FAILED tigernet/tests/test_tigernet_empirical_gdf.py::TestNetworkSimplifyEmpiricalGDF::test_simplify_inplace_node_cc - KeyError: 89
FAILED tigernet/tests/test_tigernet_synthetic.py::TestNetworkSimplifyBarb::test_simplify_copy_segm_cc - AssertionError: {2: [0, 1, 2]} != {1: [0, 1, 2]}
FAILED tigernet/tests/test_tigernet_synthetic.py::TestNetworkSimplifyBarb::test_simplify_inplace_segm_cc - AssertionError: {2: [0, 1, 2]} != {1: [0, 1, 2]}

See first attempt in #37.
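
Two of the assertion failures above differ only in element order (`[5, 8, 9]` vs. `[8, 9, 5]`) or in the component label (`{2: [0, 1, 2]}` vs. `{1: [0, 1, 2]}`). If ordering and labels are not part of what the tests are meant to guarantee, which is an assumption here, the comparisons could be made order- and label-insensitive as sketched below. The helper `canonical_components` and the test class are hypothetical, and the sketch does not address the `KeyError` failures.

```python
import unittest


def canonical_components(cc):
    """Drop arbitrary component labels; keep only the sorted member groups."""
    return sorted(sorted(members) for members in cc.values())


class TestOrderInsensitive(unittest.TestCase):
    def test_neighbors_ignore_order(self):
        observed, known = [5, 8, 9], [8, 9, 5]
        # Passes when both lists hold the same elements, in any order.
        self.assertCountEqual(observed, known)

    def test_components_ignore_labels(self):
        observed, known = {2: [0, 1, 2]}, {1: [0, 1, 2]}
        self.assertEqual(
            canonical_components(observed), canonical_components(known)
        )
```

If the node ordering or the component labels are meaningful downstream, loosening the assertions this way would hide a real regression, so the underlying cause should be checked first.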

.shp read needed?

Remove functionality to read in .shp files. Only relying on geopandas.GeoDataFrame as input is acceptable and more robust.

Bug in network topology

There appears to be a major bug that has cropped up since the dissertation version of this code.

  • It seems to be within utils.update_adj(), but can't pinpoint where exactly it happens.
  • It occurs when def_graph_elems=True

To reproduce:

bbox = (-84.279,30.480,-84.245,30.505)
f = "zip://test_data/Edges_Leon_FL_2010.zip!Edges_Leon_FL_2010.shp"
gdf = geopandas.read_file(f, bbox=bbox)
gdf = gdf.to_crs("epsg:2779")
yes_roads = gdf["MTFCC"].str.startswith("S")
roads = gdf[yes_roads].copy()

kwargs = {"s_data": roads.copy(), "from_raw": True}
attr_kws = {"attr1": ATTR1, "attr2": ATTR2}
kwargs.update(attr_kws)
comp_kws = {"record_components": True, "largest_component":True}
kwargs.update(comp_kws)
graph_elems_kws = {"def_graph_elems": True}
kwargs.update(graph_elems_kws)
geom_kws = {"record_geom": True, "calc_len": True}
kwargs.update(geom_kws)
mtfcc_kws = {"discard_segs": discard_segs, "skip_restr": SKIP_RESTR}
mtfcc_kws.update({"mtfcc_split": INTRST, "mtfcc_intrst": INTRST})
mtfcc_kws.update({"mtfcc_split_grp": SPLIT_GRP, "mtfcc_ramp": RAMP})
mtfcc_kws.update({"mtfcc_split_by": SPLIT_BY, "mtfcc_serv": SERV_DR})
kwargs.update(mtfcc_kws)
net = tigernet.Network(**kwargs)
IndexError: list index out of range

CI failing

Debug unittest failures that started here. At first glance, this appears to be due to the release of numpy v1.20.1.

To Do:

  • start by pinning numpy v1.20.0
  • actually pyproj issue (#67)

Roadmap for tigernet

After some consideration, this repo will serve as a stub for the tigernet implementation developed for Gaboardi (2019), which can be cited in future publications through its DOI. Some of the concepts are already being incorporated into spaghetti, and more of the original tigernet functionality may follow (such as network measures, pysal/spaghetti#126).

  • Gaboardi, James D. 2019. Populated Polygons to Networks: A Population-Centric Approach to Spatial Network Allocation. ProQuest Dissertations Publishing.

new pyproj causes variations in projection results

Something changed between pyproj 2.6.1.post1 and pyproj 3.0.0.post1 that is leading to small variations in projected coordinates, which is then propagated on to point snapping distance, cost matrices, etc. For now, pinning to 2.6.1.post1 is reasonable (#66). A longer term solution is either updating the tests to reflect the altered projection results or figuring out if this is an actual bug. The code chunk below demonstrates the output difference between pyproj 2.6.1.post1 (top) and pyproj 3.0.0.post1 (bottom) with an example through geopandas and a MWE with only pyproj.

@knaaptime @slumnitz @martinfleis Have any of you noticed this in your work?

import warnings
warnings.filterwarnings("ignore")
import geopandas
import numpy
import pyproj
import shapely

packages = [geopandas, numpy, pyproj, shapely]
for p in packages:
    print(f"{p.__name__}: {p.__version__}")

layer = "WeightedParcels_Leon_FL_2010"
fin = f"zip://{layer}.zip!{layer}.shp"
bbox = (-84.279, 30.480, -84.245, 30.505)
gdf = geopandas.read_file(fin, bbox=bbox)
TEST_PARCEL = "1117160000020"
gdf = gdf[gdf["PARCEL_ID"] == TEST_PARCEL].reset_index(drop=True)
point = gdf.geometry[0]
x, y = point.x, point.y

from_crs, to_crs = gdf.crs, "EPSG:2779"

print(f" -- {geopandas.__name__}/{shapely.__name__} --")
print(gdf.crs, point)
gdf = gdf.to_crs(to_crs)
point = gdf.geometry[0]
print(gdf.crs, point)

print(f" -- {pyproj.__name__} --")
transformer = pyproj.Transformer.from_crs(from_crs, to_crs)
print(from_crs, (x, y))
print(to_crs, transformer.transform(y, x))
geopandas: 0.8.1
numpy: 1.19.4
pyproj: 2.6.1.post1
shapely: 1.7.1
 -- geopandas/shapely --
epsg:4326 POINT (-84.25873245332716 30.48436516338524)
EPSG:2779 POINT (623164.270749338 164564.2000569425)
 -- pyproj --
epsg:4326 (-84.25873245332716, 30.48436516338524)
EPSG:2779 (623164.270749338, 164564.20005694253)
geopandas: 0.8.2
numpy: 1.20.1
pyproj: 3.0.0.post1
shapely: 1.7.1
 -- geopandas/shapely --
epsg:4326 POINT (-84.25873245332716 30.48436516338524)
EPSG:2779 POINT (623164.6468275142 164563.5686168194)
 -- pyproj --
epsg:4326 (-84.25873245332716, 30.48436516338524)
EPSG:2779 (623164.6468275142, 164563.56861681942)
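
To put a number on the variation, the two projected points printed above can be compared directly. The sketch below copies the coordinates from the outputs and measures the offset in projected units; the helper names and the idea of a tolerance-based check are suggestions for illustration, not existing tigernet test code.

```python
import math

# Projected coordinates (EPSG:2779) copied from the two outputs above.
pyproj_2 = (623164.270749338, 164564.2000569425)   # pyproj 2.6.1.post1
pyproj_3 = (623164.6468275142, 164563.5686168194)  # pyproj 3.0.0.post1


def offset(p, q):
    """Euclidean distance between two projected points."""
    return math.dist(p, q)


def close_enough(p, q, tol):
    """True when two projected points agree to within `tol` units."""
    return offset(p, q) <= tol


shift = offset(pyproj_2, pyproj_3)
print(f"offset between versions: {shift:.3f} projected units")
```

The shift works out to somewhat under one projected unit, which is far larger than floating-point noise. Whether a tolerance of that size is acceptable for snapping distances and cost matrices is a judgment call for the project.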

Only TIGER/Line Edges?

Remove all references to, and code dealing with, TIGER/Line Roads. Only TIGER/Line Edges are used now.
