
geofi's People

Contributors

aa-m-sa, antagomir, dieghernan, janikmiet, jlehtoma, muuankarski, olivroy, ouzor, pitkant, sampoves, statguy



geofi's Issues

Improve wfs_api() error reporting

Currently wfs_api() passes the provided WFS URL as-is and expects a correct response. However, it is not robust against faulty URLs (404), timeouts (408), or other unexpected situations. It would be good if it were. In principle this is up to ows4R, but it is not clear at this point how it deals with errors.
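One way to make the request robust is to check the HTTP status before parsing. This is only a sketch, assuming httr is an acceptable dependency; `safe_wfs_get` and its messages are illustrative, not geofi's actual API.

```r
# Sketch: surface HTTP errors (404, 408, ...) and connection failures
# as informative R errors instead of letting a bad response reach the parser.
library(httr)

safe_wfs_get <- function(wfs_url, timeout_sec = 30) {
  resp <- tryCatch(
    GET(wfs_url, timeout(timeout_sec)),
    error = function(e) {
      stop("WFS request failed: ", conditionMessage(e), call. = FALSE)
    }
  )
  if (http_error(resp)) {
    stop("WFS returned HTTP ", status_code(resp), " for ", wfs_url,
         call. = FALSE)
  }
  resp
}
```

A wrapper like this could run before the response is handed to ows4R, so the user sees the status code rather than a downstream parse error.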

geofi::get_municipalities returns municipality codes and region codes without leading zeroes

According to Statistics Finland, municipality codes always have three digits and leading zeroes are used where needed. The same is true for regions, except there only two digits are used.

geofi::get_municipalities outputs these codes without the leading zeroes. The affected columns are kunta, maakunta_code, and municipality_code.

As per Statistics Finland:
https://www2.tilastokeskus.fi/fi/luokitukset/kunta/
https://www2.tilastokeskus.fi/fi/luokitukset/maakunta/
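Restoring the leading zeroes is a one-liner with `formatC`; a small sketch using the widths from the Statistics Finland classifications (3 digits for municipalities, 2 for regions):

```r
# Pad numeric codes to a fixed width with leading zeroes.
pad_code <- function(x, width) {
  formatC(as.integer(x), width = width, flag = "0")
}

pad_code(5, 3)   # "005" (municipality code)
pad_code(9, 2)   # "09"  (region code)
```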

WFS connection

Hello!
I am trying to build some maps, but I am getting the following error:

(screenshot of the error message; image not preserved)

I have no clue how to proceed here or what is going on.
Any help would be much appreciated!

Thank you

Implement tests

One option is to use httptest, but does it work with ows4R? Let's find out! ([EDIT] Why yes, yes it does.)

Rename wfs()?

I don't recall our last discussion on the function naming conventions exactly, but I don't think wfs() is a very good name for the function after all. To me, wfs implies an abstraction of a WFS, which is what the different classes and methods in ows4R do. wfs() is rather a generic getter function for data available through a WFS, so maybe get_wfs_layer() etc. would be better. Any comments?

Mappings of municipalities to other region types for history data

It would be useful in many cases to have mappings of municipality -> region and municipality -> subregion (seutukunta) from years much earlier than the present. E.g. register-based data from the HILMO care register contains municipality data from the early 1970s, while geofi currently contains mappings only for 2013-2020.

I wonder if some resource or database already holds the municipality boundary data for 1970/1972-2012. In a past study I had to use data from the Finnish Wikipedia article Kuntaliitokset.

For weighting study data with population data, the boundary data are not necessarily needed, so it would be useful to have coarser data (no boundaries) available as well. Mapping tables of that kind could be built based on the link I provided.

`get_municipalities` argument `codes_as_character` seems to not work

Hello,

It would seem to me that the argument codes_as_character of get_municipalities does not work in geofi 1.0.9:

codes_as_character is FALSE

> muns1 <- geofi::get_municipalities(codes_as_character = FALSE) %>% 
+   dplyr::select(kunta)
> 
Requesting response from: http://geo.stat.fi/geoserver/wfs?service=WFS&version=1.0.0&request=getFeature&typename=tilastointialueet%3Akunta4500k_2023
Data is licensed under: Attribution 4.0 International (CC BY 4.0)
Warning message:
Coercing CRS to epsg:3067 (ETRS89 / TM35FIN) 
> 
> muns1
Simple feature collection with 309 features and 1 field
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 83747.59 ymin: 6637032 xmax: 732907.7 ymax: 7776431
Projected CRS: ETRS89 / TM35FIN(E,N)
First 10 features:
   kunta                           geom
1      5 MULTIPOLYGON (((366787.9 70...
2      9 MULTIPOLYGON (((382543.4 71...
3     10 MULTIPOLYGON (((343298.2 69...
4     16 MULTIPOLYGON (((436139.7 67...
5     18 MULTIPOLYGON (((426631 6720...
6     19 MULTIPOLYGON (((263938.3 67...
7     20 MULTIPOLYGON (((328844.1 67...
8     35 MULTIPOLYGON (((176190.4 67...
9     43 MULTIPOLYGON (((92735.28 67...
10    46 MULTIPOLYGON (((600317.4 69...
> 
> sapply(muns1, class)
$kunta
[1] "integer"

$geom
[1] "sfc_MULTIPOLYGON" "sfc"

codes_as_character is TRUE

> muns2 <- geofi::get_municipalities(year = 2022, codes_as_character = TRUE) %>% 
+   dplyr::select(kunta)
Requesting response from: http://geo.stat.fi/geoserver/wfs?service=WFS&version=1.0.0&request=getFeature&typename=tilastointialueet%3Akunta4500k_2022
Data is licensed under: Attribution 4.0 International (CC BY 4.0)
Warning message:
Coercing CRS to epsg:3067 (ETRS89 / TM35FIN) 
> 
> muns2
Simple feature collection with 309 features and 1 field
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 83747.59 ymin: 6637032 xmax: 732907.7 ymax: 7776431
Projected CRS: ETRS89 / TM35FIN(E,N)
First 10 features:
   kunta                           geom
1      5 MULTIPOLYGON (((366787.9 70...
2      9 MULTIPOLYGON (((382543.4 71...
3     10 MULTIPOLYGON (((343298.2 69...
4     16 MULTIPOLYGON (((436139.7 67...
5     18 MULTIPOLYGON (((426631 6720...
6     19 MULTIPOLYGON (((263938.3 67...
7     20 MULTIPOLYGON (((328844.1 67...
8     35 MULTIPOLYGON (((176190.4 67...
9     43 MULTIPOLYGON (((92735.28 67...
10    46 MULTIPOLYGON (((600317.4 69...
> 
> sapply(muns2, class)
$kunta
[1] "integer"

$geom
[1] "sfc_MULTIPOLYGON" "sfc"

Changing the argument value does not introduce leading zeroes to the kunta column, nor does it change the column type to character.

My session:

> sessionInfo()
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22621)

Matrix products: default

locale:
[1] LC_COLLATE=Finnish_Finland.utf8  LC_CTYPE=Finnish_Finland.utf8    LC_MONETARY=Finnish_Finland.utf8 LC_NUMERIC=C                    
[5] LC_TIME=Finnish_Finland.utf8    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] geofi_1.0.9    stringr_1.5.0  stringi_1.7.12 readxl_1.4.2   dplyr_1.1.2   

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.10        cellranger_1.1.0   pillar_1.9.0       compiler_4.2.2     class_7.3-20       tools_4.2.2        odbc_1.3.4        
 [8] digest_0.6.31      bit_4.0.5          lifecycle_1.0.3    tibble_3.2.1       pkgconfig_2.0.3    rlang_1.1.0        DBI_1.1.3         
[15] cli_3.6.0          writexl_1.4.2      curl_5.0.0         yaml_2.3.7         e1071_1.7-12       withr_2.5.0        httr_1.4.5        
[22] xml2_1.3.3         generics_0.1.3     vctrs_0.6.2        hms_1.1.2          classInt_0.4-9     bit64_4.0.5        grid_4.2.2        
[29] tidyselect_1.2.0   glue_1.6.2         sf_1.0-12          R6_2.5.1           fansi_1.0.4        purrr_1.0.1        blob_1.2.3        
[36] magrittr_2.0.3     ellipsis_0.3.2     units_0.8-1        httpcache_1.2.0    utf8_1.2.2         KernSmooth_2.23-20 proxy_0.4-27

Additionally, what's peculiar is that geofi::get_municipalities(codes_as_character = FALSE) works without a specific year, but codes_as_character = TRUE requires an explicit year argument: geofi::get_municipalities(year = 2022, codes_as_character = TRUE). This is obviously a separate matter; I will open an issue for it too if I find the time.
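Until this is fixed, the codes can be coerced and padded manually after fetching. A hedged workaround sketch; the column names (kunta, municipality_code, maakunta_code) are taken from the earlier leading-zeroes issue:

```r
# Workaround: fetch with the default integer codes, then pad by hand.
library(dplyr)

muns <- geofi::get_municipalities(codes_as_character = FALSE) %>%
  mutate(
    kunta             = formatC(kunta, width = 3, flag = "0"),
    municipality_code = formatC(municipality_code, width = 3, flag = "0"),
    maakunta_code     = formatC(maakunta_code, width = 2, flag = "0")
  )
```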

CRAN checks

Test the pkg with --as-cran on command line to ensure CRAN compatibility.
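For reference, the command-line invocation would look roughly like this (run from the package's parent directory; the tarball name depends on the version in DESCRIPTION):

```shell
# Build the source tarball, then run the same checks CRAN runs.
R CMD build geofi
R CMD check --as-cran geofi_*.tar.gz
```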

Error: Topics missing from index: municipality_key_2022

There seems to be a persistent error with pkgdown related to topics missing from index, which prevents new pkgdown site from generating. A recent example: https://github.com/rOpenGov/geofi/runs/4936970331?check_suite_focus=true

> -- Building function reference -------------------------------------------------
> Error: Error: Topics missing from index: municipality_key_2022
> Backtrace:
>      █
>   1. └─pkgdown::deploy_to_branch(new_process = FALSE)
>   2.   └─pkgdown::build_site_github_pages(pkg, ..., clean = clean)
>   3.     └─pkgdown::build_site(...)
>   4.       └─pkgdown:::build_site_local(...)
>   5.         └─pkgdown::build_reference(...)
>   6.           └─pkgdown::build_reference_index(pkg)
>   7.             ├─pkgdown::render_page(...)
>   8.             │ └─pkgdown:::render_page_html(pkg, name = name, data = data, depth = depth)
>   9.             │   └─utils::modifyList(data_template(pkg, depth = depth), data)
>  10.             │     └─base::stopifnot(is.list(x), is.list(val))
>  11.             └─pkgdown:::data_reference_index(pkg)
>  12.               └─pkgdown:::check_missing_topics(rows, pkg)
> Removing worktree 11111111111111111111111111111111111111111111111111111111111
> Running git worktree remove /tmp/Rtmpa4n0lp/file33b93695d10f
> Execution halted
> Error: Process completed with exit code 1.

Any ideas on how to fix this? There are some related issues in pkgdown repo from 2019 but also from recent days:
r-lib/pkgdown#1132
r-lib/pkgdown#1716
r-lib/pkgdown#1951
r-lib/pkgdown#1958

Changing dataset help files to @keywords internal might not be the preferred solution, no?
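If the dataset is meant to stay user-facing, the alternative to @keywords internal is listing the topic explicitly in the reference index. A sketch of the relevant _pkgdown.yml fragment, assuming a "Datasets" section is wanted:

```yaml
# _pkgdown.yml: every exported, non-internal topic must appear
# somewhere under `reference:` or pkgdown aborts the site build.
reference:
- title: Datasets
  contents:
  - municipality_key_2022
```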

Add postinumerotiedostot by Posti

(Translated from Finnish:) The postal code services are free files that provide up-to-date information on the postal codes and addresses of Finnish municipalities and postal code areas. You can combine them with your company's address databases to simplify the maintenance of mailing lists and address registers. The files contain no street names, map data, or other geospatial information.

https://www.posti.fi/webpcode/


Implement cache for wfs_api()

A rudimentary implementation for getting data over WFS already exists in the function wfs(). This function also needs a cache; I suggest using the R.cache package. Cached entities are the GML objects returned from the WFS. Decisions need to be made regarding cache policies such as expiration time, flushing mechanism, etc.
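A minimal sketch of what the R.cache layer could look like, keyed on the request URL. `cached_wfs_get` and the inner `wfs_api()` call are illustrative placeholders, not geofi's actual API:

```r
# Sketch: memoize WFS responses on disk with R.cache, keyed by URL.
cached_wfs_get <- function(url) {
  key <- list(url = url)
  cached <- R.cache::loadCache(key)
  if (!is.null(cached)) {
    return(cached)                      # cache hit: reuse stored GML object
  }
  result <- wfs_api(url)                # placeholder for the actual fetch
  R.cache::saveCache(result, key = key)
  result
}
# Expiration/flushing policy still open; R.cache::clearCache() could back
# a manual flush until a time-based policy is decided.
```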

Travis-CI to GitHub Actions

The old Travis-CI.org service will be shut down after 31 December, requiring either migration to the commercial Travis-CI.com service or a move to a new CI altogether. An rOpenSci blog post describes the pros of moving to GitHub Actions. The process is as follows:

  1. run: usethis::use_github_action_check_standard()
  2. Check if package tests contain skip_on_travis calls and replace them with skip_on_ci
  3. Add a badge to the package README: [![R build status](https://github.com/rOpenGov/PACKAGENAME/workflows/R-CMD-check/badge.svg)](https://github.com/rOpenGov/PACKAGENAME/actions)
  4. Remove travis.yml (and appveyor.yml)
  5. Remove Travis CI badge (and Appveyor badge, since GHA runs tests simultaneously for Linux, macOS and Windows)

Consider VRK's rakennusten osoitetiedot ja äänestysalueet (building addresses and voting districts) data

Väestörekisterikeskus (the Finnish Population Register Centre) publishes annual data containing all buildings in Finland. The data is a zipped delimited file with an .OPT extension and about 3.6 million rows. It can be read and processed (slowly) in R with the following code:

# 2019
library(dplyr)
library(sp)
library(sf)
tmpfile <- tempfile()
tmpdir <- tempdir()
download.file("https://www.avoindata.fi/data/dataset/cf9208dc-63a9-44a2-9312-bbd2c3952596/resource/ae13f168-e835-4412-8661-355ea6c4c468/download/suomi_osoitteet_2019-05-15.zip",
              destfile = tmpfile)
unzip(zipfile = tmpfile,
      exdir = tmpdir)

opt <- read.csv(glue::glue("{tmpdir}/Suomi_osoitteet_2019-05-15.OPT"), 
                sep = ";", 
                stringsAsFactors = FALSE, 
                header = FALSE)

names(opt) <- c("rakennustu","sijaintiku",
                "sijaintima","rakennusty",
                "CoordY","CoordX",
                "osoitenume", "katunimi_f",
                "katunimi_s", "katunumero",
                "postinumer", "vaalipiirikoodi",
                "vaalipiirinimi","tyhja",
                "idx", "date")
if (FALSE) { # subset just to make conversions faster (disabled by default)
  opt_orig <- as_tibble(opt)
  opt <- sample_n(opt_orig, size = 2000)
}

opt$katunimi_f <- iconv(opt$katunimi_f, from = "windows-1252", to = "UTF-8")
opt$katunimi_s <- iconv(opt$katunimi_s, from = "windows-1252", to = "UTF-8")
opt$katunumero <- iconv(opt$katunumero, from = "windows-1252", to = "UTF-8")
opt$vaalipiirinimi <- iconv(opt$vaalipiirinimi, from = "windows-1252", to = "UTF-8")

sp.data <- SpatialPointsDataFrame(opt[, c("CoordX", "CoordY")], 
                                  opt, 
                                  proj4string = CRS("+init=epsg:3067"))

# Project the spatial data to lat/lon
# sp.data <- spTransform(sp.data, CRS("+proj=longlat +datum=WGS84"))

shape <- st_as_sf(sp.data)

st_coordinates(shape)

# shape %>% select(rakennustu) %>% plot()

saveRDS(shape, file = "./sf19_buildings.RDS")

Any ideas on how to incorporate this into geofi? It is useful, for instance, when geocoding sensitive addresses.

However, this would require storage, as the data should be preprocessed. Do you think this is suitable data for geofi, and should we create a data repository such as geofi_data?

Data catalog

Data catalogue: should this be maintained as part of the package, and which data sets should be included? Broad coverage or selective?

municipality_key data seems to hold duplicates for year 2016

#install.packages("geofi")
#trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.0/geofi_1.0.0.tgz'

library(dplyr)
data("municipality_key")
select(municipality_key, kunta, year, municipality_name_fi) %>%
  arrange(kunta, year)

# A tibble: 2,828 x 3
   kunta  year municipality_name_fi
1      5  2013 Alajärvi
2      5  2014 Alajärvi
3      5  2015 Alajärvi
4      5  2016 Alajärvi
5      5  2016 Alajärvi
6      5  2017 Alajärvi

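The duplicates can be confirmed and dropped with dplyr; a sketch against the municipality_key data shipped with geofi:

```r
library(dplyr)
library(geofi)
data("municipality_key")

# rows that appear more than once per (kunta, year) pair
municipality_key %>%
  count(kunta, year) %>%
  filter(n > 1)

# deduplicated key, keeping the first row of each pair
deduped <- distinct(municipality_key, kunta, year, .keep_all = TRUE)
```

Note that `distinct` on (kunta, year) silently keeps the first of each duplicated pair, so it only makes sense if the duplicated 2016 rows are truly identical across all columns.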
