Giter Club home page Giter Club logo

Comments (19)

stephhazlitt avatar stephhazlitt commented on September 7, 2024 1

I feel like it might be some of the processing or caching by bcmaps? The raw data file works fine with bcdata (which might be a good workaround for you in the interim @jeromerobles).

The resulting values in the geometry column in the bcmaps tibble has POLYGON or MULTIPOLYGON which is not in the bcdata tibble:

airzones_bcmaps <- bcmaps::airzones()
#> airzones was updated on NULL
coastal_airzones_bcmaps <- airzones_bcmaps[airzones_bcmaps$Airzone %in% c("Coastal", "Georgia Strait"), ]
coastal_airzones_bcmaps
#> Simple feature collection with 2 features and 1 field
#> Geometry type: GEOMETRY
#> Dimension:     XY
#> Bounding box:  xmin: 962121.4 ymin: 578792.6 xmax: 1264425 ymax: 835719.9
#> Projected CRS: NAD83 / BC Albers
#> # A tibble: 2 × 2
#>   Airzone                                                               geometry
#>   <chr>                                                           <GEOMETRY [m]>
#> 1 Coastal        MULTIPOLYGON (((1162356 654773.5, 1162224 654672.7, 1162194 65…
#> 2 Georgia Strait POLYGON ((1264304 579248.2, 1264425 579131.4, 1264360 578846.4…
plot(coastal_airzones_bcmaps)

airzones_bcdata <- bcdata::bcdc_get_data(record = 'e8eeefc4-2826-47bc-8430-85703d328516', 
                                         resource = '4cea4b22-36a3-4ea7-9346-7e872e747076')
#> Reading the data using the read_sf function from the sf package.
coastal_airzones_bcdata <- airzones_bcdata[airzones_bcdata$Airzone %in% c("Coastal", "Georgia Strait"), ]
coastal_airzones_bcdata
#> Simple feature collection with 2 features and 1 field
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 485170.8 ymin: 360742.5 xmax: 1264674 ymax: 1260894
#> Projected CRS: Albers
#> # A tibble: 2 × 2
#>   Airzone                                                               geometry
#>   <chr>                                                       <MULTIPOLYGON [m]>
#> 1 Coastal        (((1162224 654672.7, 1162194 654802.2, 1162271 654852.9, 11623…
#> 2 Georgia Strait (((1263164 579153.2, 1264304 579248.2, 1264425 579131.4, 12643…
plot(coastal_airzones_bcdata)

Created on 2023-07-29 with reprex v2.0.2

from bcmaps.

boshek avatar boshek commented on September 7, 2024 1

Ah yes you said as much before. Sorry for not noticing. So the behaviour you see is introduced here:

ret <- sf::st_make_valid(ret)

I do still wonder though if the raw data is still the issue and the raw geometries aren't valid.

from bcmaps.

ateucher avatar ateucher commented on September 7, 2024 1

It looks like it might be all the things!

The resource you were testing against using bcdata is the shapefile resource, whereas bcmaps uses the geojson resource... and apparently they are slightly different - The projections are different (The shapefile is in a global (not BC) Albers projection (EPSG: 9001), while the geojson is in WGS84 (lat long) - this does make sense as that is the standard for geojson, though the shapefile should probably be EPSG:3005 (BC Albers). @boshek is right that it is the st_make_valid() line that is causing the issue, but it seems we can resolve it by switching the order of st_make_valid() and st_transform() here:

https://github.com/bcgov/bcmaps/blob/d09b4a049903dbf487539a6f0893596cea64aaf9/R/get_data.R#L145C1-L146.

library(sf)
#> Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
library(bcdata)
#> 
#> Attaching package: 'bcdata'
#> The following object is masked from 'package:stats':
#> 
#>     filter

## The shapefile resource
az_shp <- bcdc_get_data(record = 'e8eeefc4-2826-47bc-8430-85703d328516',
                    resource = '4cea4b22-36a3-4ea7-9346-7e872e747076')
#> Reading the data using the read_sf function from the sf package.

st_crs(az_shp)
#> Coordinate Reference System:
#>   User input: Albers 
#>   wkt:
#> PROJCRS["Albers",
#>     BASEGEOGCRS["GCS_GRS 1980(IUGG, 1980)",
#>         DATUM["D_unknown",
#>             ELLIPSOID["GRS80",6378137,298.257222101,
#>                 LENGTHUNIT["metre",1,
#>                     ID["EPSG",9001]]]],
#>         PRIMEM["Greenwich",0,
#>             ANGLEUNIT["Degree",0.0174532925199433]]],
#>     CONVERSION["unnamed",
#>         METHOD["Albers Equal Area",
#>             ID["EPSG",9822]],
#>         PARAMETER["Latitude of false origin",45,
#>             ANGLEUNIT["Degree",0.0174532925199433],
#>             ID["EPSG",8821]],
#>         PARAMETER["Longitude of false origin",-126,
#>             ANGLEUNIT["Degree",0.0174532925199433],
#>             ID["EPSG",8822]],
#>         PARAMETER["Latitude of 1st standard parallel",50,
#>             ANGLEUNIT["Degree",0.0174532925199433],
#>             ID["EPSG",8823]],
#>         PARAMETER["Latitude of 2nd standard parallel",58.5,
#>             ANGLEUNIT["Degree",0.0174532925199433],
#>             ID["EPSG",8824]],
#>         PARAMETER["Easting at false origin",1000000,
#>             LENGTHUNIT["metre",1],
#>             ID["EPSG",8826]],
#>         PARAMETER["Northing at false origin",0,
#>             LENGTHUNIT["metre",1],
#>             ID["EPSG",8827]]],
#>     CS[Cartesian,2],
#>         AXIS["(E)",east,
#>             ORDER[1],
#>             LENGTHUNIT["metre",1,
#>                 ID["EPSG",9001]]],
#>         AXIS["(N)",north,
#>             ORDER[2],
#>             LENGTHUNIT["metre",1,
#>                 ID["EPSG",9001]]]]
st_is_valid(az_shp)
#> [1]  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE

## The resource (geojson) that bcmaps uses in airzones()
az_geojson <- bcdc_get_data(record = 'e8eeefc4-2826-47bc-8430-85703d328516',
                    resource = 'c495d082-b586-4df0-9e06-bd6b66a8acd9')
#> Reading the data using the read_sf function from the sf package.

st_crs(az_geojson)
#> Coordinate Reference System:
#>   User input: WGS 84 
#>   wkt:
#> GEOGCRS["WGS 84",
#>     DATUM["World Geodetic System 1984",
#>         ELLIPSOID["WGS 84",6378137,298.257223563,
#>             LENGTHUNIT["metre",1]]],
#>     PRIMEM["Greenwich",0,
#>         ANGLEUNIT["degree",0.0174532925199433]],
#>     CS[ellipsoidal,2],
#>         AXIS["geodetic latitude (Lat)",north,
#>             ORDER[1],
#>             ANGLEUNIT["degree",0.0174532925199433]],
#>         AXIS["geodetic longitude (Lon)",east,
#>             ORDER[2],
#>             ANGLEUNIT["degree",0.0174532925199433]],
#>     ID["EPSG",4326]]
st_is_valid(az_geojson)
#> [1]  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE

## Transform first, then make valid:
plot(
  st_make_valid(st_transform(az_shp, 3005)),
  main = "shp: transform, then make valid"
)

plot(st_make_valid(
  st_transform(az_geojson, 3005)
), main = "geojson: transform, then make valid")

## Make valid first, then transform
plot(st_transform(
  st_make_valid(az_shp),
  3005
), main = "shp: make valid, then transform")

plot(st_transform(
  st_make_valid(az_geojson),
  3005
), main = "geojson: make valid, then transform")

st_transform(st_make_valid(az_shp), 3005)
#> Simple feature collection with 7 features and 1 field
#> Geometry type: GEOMETRY
#> Dimension:     XY
#> Bounding box:  xmin: 272458 ymin: 360742.5 xmax: 1872244 ymax: 1735332
#> Projected CRS: NAD83 / BC Albers
#> # A tibble: 7 × 2
#>   Airzone                                                               geometry
#> * <chr>                                                           <GEOMETRY [m]>
#> 1 Northeast           POLYGON ((1332343 1684516, 1388377 1033256, 1377903 10285…
#> 2 Northwest           POLYGON ((272458 1735332, 365988.2 1719589, 515080.9 1697…
#> 3 Southern Interior   POLYGON ((1565810 831252.7, 1573864 823888.6, 1576626 828…
#> 4 Lower Fraser Valley POLYGON ((1326899 580398.9, 1334263 578097.6, 1334493 569…
#> 5 Central Interior    POLYGON ((962140 835637.9, 952596.4 831566, 950347.4 8339…
#> 6 Coastal             MULTIPOLYGON (((1162194 654802.2, 1162271 654852.9, 11623…
#> 7 Georgia Strait      MULTIPOLYGON (((1252569 578270.3, 1250496 578097.6, 12495…
st_transform(st_make_valid(az_geojson), 3005)
#> Simple feature collection with 7 features and 1 field
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 272458 ymin: 360742.5 xmax: 1872244 ymax: 1735332
#> Projected CRS: NAD83 / BC Albers
#> # A tibble: 7 × 2
#>   Airzone                                                               geometry
#> * <chr>                                                       <MULTIPOLYGON [m]>
#> 1 Northeast           (((1196093 1673266, 1053593 1669516, 962342.7 1669516, 89…
#> 2 Northwest           (((289126.8 1722367, 292830.9 1723294, 291904.9 1709403, …
#> 3 Southern Interior   (((1554878 833784.1, 1552347 839077.1, 1559826 850928.6, …
#> 4 Lower Fraser Valley (((1326761 579850.8, 1325240 580298.3, 1320637 571553.5, …
#> 5 Central Interior    (((968603.7 825480.6, 976012.1 809208.7, 981700.6 808811.…
#> 6 Coastal             (((1162356 654773.5, 1162271 654852.9, 1162194 654802.2, …
#> 7 Georgia Strait      (((1263105 578917.6, 1263259 578792.6, 1264360 578846.4, …

Created on 2023-07-31 with reprex v2.0.2

So I think on the catalogue backend there could be some work done to post topologically valid geometries in the appropriate projections in the various files in the catalogue. And in bcmaps we could test switching the order of those two lines in get_catalogue_data() to make sure it would work across all data sets?

My hypothesis (that I'm likely not going to test any time soon) is that this change is due to newer versions of {sf} using the s2 geometry engine instead of GEOS for geographic operations, which made st_make_valid() and (more likely) st_transform() behave differently than it did in the past.

from bcmaps.

ateucher avatar ateucher commented on September 7, 2024 1

Yep, basically. There's probably some more efficient way to do it, maybe in a qmd and a loop similar to the tests?

from bcmaps.

stephhazlitt avatar stephhazlitt commented on September 7, 2024 1

LGTM, thanks for doing this @ateucher. I will work with the catalogue to get the source airzone map sorted as well.

from bcmaps.

jeromerobles avatar jeromerobles commented on September 7, 2024 1

Thank you all for looking into this. Let me know if you need to get the map creation history, i.e., whether it was generated using QGIS, or ArcGIS, and any types of processing that were applied in its creation. I know in certain air-related maps, boundaries are redefined and new map iterations are made following consultations.

from bcmaps.

ateucher avatar ateucher commented on September 7, 2024 1

It would be nice to know exactly why. Having the order (which seemed arbitrary) change exactly which geometries are returns isn't ideal...

It doesn't change which geometries are returned - they are all still there, it's just that the Coastal airzone, since it contained non-valid geometries, did something weird ("weird" because I can't pretend to know exactly what the geometry engine is doing) when it transformed the CRS via st_transform(). Then making it topologically valid (st_make_valid()) turned it into a strange MULTIPOLYGON of two tiny polygons, which are likely drawing artifacts from when the original airzones were drawn.

library(sf)
library(bcdata)

az_geojson <- bcdc_get_data(record = 'e8eeefc4-2826-47bc-8430-85703d328516',
                            resource = 'c495d082-b586-4df0-9e06-bd6b66a8acd9')

st_is_valid(st_geometry(az_geojson)[[6]], reason = TRUE)
#> [1] "Hole lies outside shell[-126.559102183576 52.5289837004392]"
plot(st_geometry(az_geojson)[[6]])

a <- st_transform(az_geojson, 3005, check = TRUE)

valid_a <- st_make_valid(a)

plot(st_geometry(valid_a), reset = FALSE)

plot(st_cast(st_geometry(valid_a)[[6]], to = "MULTIPOINT"), col = "red", add = TRUE)

Created on 2023-08-14 with reprex v2.0.2

I'm satisfied with this so am going to tidy up the testing code and do a PR

from bcmaps.

stephhazlitt avatar stephhazlitt commented on September 7, 2024 1

FYI, looks like @jeromerobles and @ateucher are listed as the data providers for this layer in the BCDC. So @jeromerobles if you want to fix this issue at the source (the raw data available in the BCDC) then I guess it is up to you or your team to evaluate and re-submit a clean layer.

from bcmaps.

stephhazlitt avatar stephhazlitt commented on September 7, 2024

I can reproduce this issue @jeromerobles. Let me source the data directly from the catalogue to confirm it is not a data source issue as a first step.

from bcmaps.

stephhazlitt avatar stephhazlitt commented on September 7, 2024

Works fine with bcdata (which is the package bcmaps uses under the hood to source the data from the B.C. Data Catalogue) so this is indeed a bug in bcmaps.

#use bcmaps -- missing two zones!
plot(bcmaps::airzones())
#> airzones was updated on NULL

#use bcdata to get shp file -- all zones
plot(bcdata::bcdc_get_data(record = 'e8eeefc4-2826-47bc-8430-85703d328516', 
                            resource = '4cea4b22-36a3-4ea7-9346-7e872e747076'))
#> Reading the data using the read_sf function from the sf package.

#use bcdata to get geojson file -- all zones
plot(bcdata::bcdc_get_data(record =
                'e8eeefc4-2826-47bc-8430-85703d328516', resource =
                'c495d082-b586-4df0-9e06-bd6b66a8acd9'))
#> Reading the data using the read_sf function from the sf package.

Created on 2023-07-29 with reprex v2.0.2

> session_info()
─ Session info ───────────────────────────────────────────────
 setting  value
 version  R version 4.3.1 (2023-06-16)
 os       macOS Ventura 13.4.1
 system   aarch64, darwin20
 ui       RStudio
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       America/Vancouver
 date     2023-07-29
 rstudio  2023.06.1+524 Mountain Hydrangea (desktop)
 pandoc   3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)

─ Packages ───────────────────────────────────────────────────
 package     * version date (UTC) lib source
 bcdata        0.4.1   2023-03-18 [1] CRAN (R 4.3.0)
 bcmaps        1.2.0   2023-04-11 [1] CRAN (R 4.3.0)

from bcmaps.

boshek avatar boshek commented on September 7, 2024

I think this might be a caching issue. If you tried running:

 bcmaps::delete_cache("airzones.rds")

and then re-running it again, what happens? For me that fixes this.

from bcmaps.

stephhazlitt avatar stephhazlitt commented on September 7, 2024

I did start with a fresh pull using bcmaps (I should have included that above, sorry). No luck for me:

bcmaps::delete_cache("airzones.rds")
plot(bcmaps::airzones())
#> Saving to bcmaps data directory at /Users/stephhazlitt/Library/Caches/org.R-project.R/R/bcmaps
#> Reading the data using the read_sf function from the sf package.

Created on 2023-07-29 with reprex v2.0.2

from bcmaps.

boshek avatar boshek commented on September 7, 2024

I did start with a fresh pull using bcmaps

I should've done the same. Can reproduce now. It was my cache that was invalidated 🤦

from bcmaps.

boshek avatar boshek commented on September 7, 2024

Something weird in the raw data so probably an upstream catalogue issue. I mean it is still there, just not plotting.

library(bcmaps)
#> Loading required package: sf
#> Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
#> Support for Spatial objects (`sp`) was deprecated in {bcmaps} v1.2.0, and will be removed altogether in the Summer 2023. Please use `sf` objects with {bcmaps}.
raw_airzones <- airzones()
#> airzones was updated on NULL

plot(raw_airzones)

(coastal_airzones <- raw_airzones[raw_airzones$Airzone %in% c("Coastal", "Northwest"), ])
#> Simple feature collection with 2 features and 1 field
#> Geometry type: GEOMETRY
#> Dimension:     XY
#> Bounding box:  xmin: 272458 ymin: 654672.7 xmax: 1196788 ymax: 1735332
#> Projected CRS: NAD83 / BC Albers
#> # A tibble: 2 × 2
#>   Airzone                                                               geometry
#>   <chr>                                                           <GEOMETRY [m]>
#> 1 Northwest POLYGON ((272458 1735332, 365988.2 1719589, 515080.9 1697364, 62991…
#> 2 Coastal   MULTIPOLYGON (((1162356 654773.5, 1162224 654672.7, 1162194 654802.…

plot(coastal_airzones)

from bcmaps.

stephhazlitt avatar stephhazlitt commented on September 7, 2024

So I am clear on DOD re:

test switching the order of those two lines in get_catalogue_data() to make sure it would work across all data sets

This means we switch the code, clear our cache and then pull every shortcut function and see if the resulting map looks correct?

from bcmaps.

stephhazlitt avatar stephhazlitt commented on September 7, 2024

I'll open a ticket with the BC Data Catalogue re: topologically valid geometries in the appropriate projections for this file (and presumably others).

from bcmaps.

ateucher avatar ateucher commented on September 7, 2024

Here's a comparison of all of the maps using shortcuts with the order of the two operations switched. I added this commit to allow this check, we can remove that switch if we agree changing the order is ok. It looks good to me.

library(bcmaps)
#> Loading required package: sf
#> Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
#> Support for Spatial objects (`sp`) was deprecated in {bcmaps} v1.2.0, and will be removed altogether in the Summer 2023. Please use `sf` objects with {bcmaps}.
library(sf)
avail <- available_layers()
fn_names <- avail$layer_name[
  !avail$local &
    !grepl("cded_", avail$layer_name) &
    avail$using_shortcuts
]

fn_names <- setdiff(fn_names, c("bec", "tsa"))
par(mfrow = c(1,2))

for (i in seq_along(fn_names)) {
  delete_cache()
  cat("\n", fn_names[i])
  
  layer <- do.call(fn_names[i], list(ask = FALSE, force = TRUE))
  plot(st_geometry(layer), main = "st_transform then st_make_valid")
  
  delete_cache()
  
  withr::with_envvar(c("MAKEVALID_BEFORE_TRANSFORM" = "true"), {
    layer2 <- do.call(fn_names[i], list(ask = FALSE, force = TRUE))
    plot(st_geometry(layer2), main = "st_make_valid then st_transform")
  })
}
#> 
#>  airzones
#> Reading the data using the read_sf function from the sf package.
#> original order
#> Reading the data using the read_sf function from the sf package.
#> trying the reverse thing
#> 
#>  bc_cities
#> original order

#> trying the reverse thing
#> 
#>  census_dissemination_area
#> original order

#> trying the reverse thing
#> 
#>  census_division
#> original order

#> trying the reverse thing
#> 
#>  census_economic
#> original order

#> trying the reverse thing
#> 
#>  census_metropolitan_area
#> original order

#> trying the reverse thing
#> 
#>  census_subdivision
#> original order

#> trying the reverse thing
#> 
#>  census_tract
#> original order

#> trying the reverse thing
#> 
#>  ecoprovinces
#> original order

#> trying the reverse thing
#> 
#>  ecoregions
#> original order

#> trying the reverse thing
#> 
#>  ecosections
#> original order

#> trying the reverse thing
#> 
#>  gw_aquifers
#> original order

#> trying the reverse thing
#> 
#>  health_chsa
#> original order

#> trying the reverse thing
#> 
#>  health_ha
#> original order

#> trying the reverse thing
#> 
#>  health_hsda
#> original order

#> trying the reverse thing
#> 
#>  health_lha
#> original order

#> trying the reverse thing
#> 
#>  hydrozones
#> original order

#> trying the reverse thing
#> 
#>  municipalities
#> original order

#> trying the reverse thing
#> 
#>  nr_areas
#> You are accessing a simplified view of the data - see the catalogue record for details.
#> original order

#> You are accessing a simplified view of the data - see the catalogue record for details.
#> trying the reverse thing
#> 
#>  nr_districts
#> You are accessing a simplified view of the data - see the catalogue record for details.
#> original order

#> You are accessing a simplified view of the data - see the catalogue record for details.
#> trying the reverse thing
#> 
#>  nr_regions
#> You are accessing a simplified view of the data - see the catalogue record for details.
#> original order

#> You are accessing a simplified view of the data - see the catalogue record for details.
#> trying the reverse thing
#> 
#>  regional_districts
#> original order

#> trying the reverse thing
#> 
#>  water_districts
#> original order

#> trying the reverse thing
#> 
#>  water_precincts
#> original order

#> trying the reverse thing
#> 
#>  wsc_drainages
#> original order

#> trying the reverse thing

Created on 2023-08-11 with reprex v2.0.2

from bcmaps.

boshek avatar boshek commented on September 7, 2024

It would be nice to know exactly why. Having the order (which seemed arbitrary) change exactly which geometries are returns isn't ideal...

Am I reading this right that the only one this affected was the airzones?

from bcmaps.

stephhazlitt avatar stephhazlitt commented on September 7, 2024

From a manual/visual inspection, that is all I can see @boshek.

from bcmaps.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.