hunzikp / velox
Home Page: https://hunzikp.github.io/velox/
Hello,
While recently re-running a script utilizing the incredible power of velox, I noticed a dramatic slowdown in its execution. It appears that in the meantime the package was updated from version 0.1.0 to 0.2.0.
Basically, the script takes a Digital Elevation Model (DEM) raster in GeoTIFF format and extracts the mean elevation within 9 square windows around stations, yielding an Nx9 matrix of mean elevations.
Please find below a simple example summarizing the problem. I hope it helps you solve this issue.
libs <- c("raster", "rgeos", "rgdal", "sp", "velox", "profvis")
lapply(libs, library, character.only = TRUE)
set.seed(42)
# Large projected raster, mimicking a real DEM
dem <- raster(nrows = 12094, ncols = 12057, xmn = -331805.4, ymn = -47.6398, xmx = 778385.9, ymx = 1113551, crs = "+proj=omerc +lat_0=4 +lonc=102.25 +alpha=323.0257964666666 +k=0.99984 +x_0=804671 +y_0=0 +no_uoff +gamma=323.1301023611111 +ellps=GRS80 +units=m +no_defs", resolution = c(92.07857, 92.07857))
dem <- setValues(dem, 1:ncell(dem))
# One random spatial point, can be more
nb <- 1
p <- as(sampleRandom(dem, nb, xy = TRUE, sp = TRUE), "SpatialPoints")
dem.velox <- velox(dem)
# Side of the main window in meters
R <- 11000
# Half side of sub-window, i.e. R/(3*2)
r <- R/6
# Repeating the coordinates of the point
P <- matrix(rep(p@coords, each = 9), 9, 2)
# Creating the matrix to shift the original coordinates
# This will give the coordinates of the centroid of each polygon
S <- matrix(c(-2, 2, 0, 2, 2, 2, -2, 0, 0, 0, 2, 0, -2, -2, 0, -2, 2, -2), 9, 2, byrow = TRUE)
# Matrix of centroids (original point + 8 new centroids)
centroids <- P + S*r
centroids <- SpatialPoints(centroids)
projection(centroids) <- projection(p)
# Finally the 9 sub-windows from which the means will be extracted
buffers <- gBuffer(centroids, width = r, byid = TRUE, quadsegs = 1, capStyle = "SQUARE")
# Illustration
plot(buffers)
points(p, pch = "+", cex = 2)
points(centroids)
velox version 0.1.0
system.time(dem.velox$extract(buffers, mean))
user system elapsed
0.072 0.000 0.073
# Profiling the extraction
profvis(dem.velox$extract(buffers, mean))
The extraction of 9 polygons takes ~40ms to run.
velox version 0.2.0
system.time(dem.velox$extract(buffers, mean))
user system elapsed
3.140 15.828 18.972
# Profiling the extraction
profvis(dem.velox$extract(buffers, mean))
The extraction of 9 polygons takes ~16,000 ms to run, with boost, boost.VeloxRaster, boostFactory$makePointGrid and .External accounting for most of the time.
Hello,
I'm trying to write my velox raster to a .tif file and receive the following message:
vLC2$write("test.tif", overwrite = TRUE)
Error in if (is_float32) { : missing value where TRUE/FALSE needed
Hello,
I'm trying to extract raster data from a grid of points (right now, n = 8), but I keep getting an ambiguous error message, and I don't have any idea why. This seems like a fairly straightforward operation with velox. I guess it could be something wrong with my code? I have followed the vignette pretty closely, however.
Any help you could offer would be greatly appreciated!
#----Loading NLCD raster----
nlcd=raster(file_name)
#----Generating overall outline of selected states----
states.full = c("Maine", "New Hampshire", "Vermont", "Massachusetts",
"Rhode Island", "Connecticut", "New York", "Pennsylvania",
"New Jersey")
us = raster::getData('GADM', country = 'US', level = 1)
#----Generating regular grid within that outline----
st.contour <- us[us$NAME_1 %in% states.full,]
st.contour = spTransform(st.contour, CRS("+proj=laea +x_0=0 +y_0=0 +lon_0=-74 +lat_0=40 +units=m"))
grid <- makegrid(st.contour, cellsize = 250000) # cell-size in +units=m
grid <- SpatialPoints(grid, proj4string = CRS(proj4string(st.contour)))
date() ; grid <- grid[st.contour, ] ; date()
#----Cropping the nlcd raster----
grid.r = spTransform(grid, crs=crs(nlcd))
grid.r@bbox = matrix(c(1550000.0, 2550000.0, -434099.8, 722809.3), 2,2, byrow=T) # Extending bbox so all buffers fall on raster
nlcd.crop = crop(nlcd, grid.r)
nlcd.vx <- velox(nlcd.crop)
# I have also tried: nlcd.vx = velox(stack(nlcd.crop))
#----Creating polygon buffers----
spol <- gBuffer(grid.r, width=500, byid=TRUE)
spdf <- SpatialPolygonsDataFrame(spol, data.frame(id=1:length(spol)), FALSE)
#----Velox Extract----
date()
ex.mat <- nlcd.vx$extract(spdf)
ex.mat
date()
Error in boostFactory$makePointGrid(origin = origin, dim = dim, res = res) :
std::bad_alloc
Does anybody know how to contact the author of this package? He is being unresponsive to issue requests on GitHub and not responding to CRAN regarding check errors. CRAN maintainers are getting ready to archive the velox package from the repositories.
From CRAN
"We have repeatedly asked for an update fixing the check problems with no reply from the maintainer thus far. Thus, package velox is now scheduled for archival on 2020-03-17, and archiving this will necessitate also archiving its strong reverse dependencies."
The package will not install on my device. I am running Windows 10 and trying to install the package on R 4.0.0. I receive the following error message:
Error: Failed to install 'velox' from GitHub:
(converted from warning) installation of package ‘C:/Users/KJXMO/AppData/Local/Temp/Rtmp888zFR/file217c43367f7b/velox_0.2.0.9002.tar.gz’ had non-zero exit status
Any advice on how to get the package to install will be highly appreciated.
Hello,
I wonder if you have thought of implementing an equivalent of raster::mask in the velox package. I would be very grateful for this function, because it would be faster than raster's mask.
Thank you so much
Best regards,
Fabio Castro
Hi. First up, I think velox is a great and much-needed initiative, kudos there! It has the potential to speed up a bunch of my tasks.
Now that I have buttered you up: is there any plan to include raster arithmetic, like simple +, -, *, /? Maybe even a function like raster::calc? This would be really amazing.
Or do you have a workaround that would still be faster than using the raster package?
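As a possible stopgap (an assumption on my part, not a documented API): velox appears to keep each band as a plain R matrix, so vectorised matrix arithmetic already gives calc-like behaviour without touching the raster package. A minimal base-R sketch of the principle; the `rasterbands` field named in the comment is what I believe holds the band matrices internally, so treat that part as hypothetical:

```r
# A stand-alone band as velox would hold it: a plain numeric matrix
band <- matrix(1:9, 3, 3)
scaled <- band * 2 + 10                         # element-wise + - * /
ndvi_like <- function(a, b) (a - b) / (a + b)   # a calc-style function
res <- ndvi_like(band, scaled)
# On a velox object the same idea would be (hypothetical field access):
#   vx$rasterbands[[1]] <- vx$rasterbands[[1]] * 2 + 10
```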
Your rasterize function is amazing and has sped up my workflow immensely. Can the same principle be applied to going the other way, which often takes even longer?
Hi,
in pointextract.cpp there is an input NumericVector dim which is unused. I wondered if it should be removed for tidiness?
It looks like aggregate is not allowing for NAs? It would be great if it were possible to have a na.rm=TRUE argument.
mat <- matrix(c(1,2,3,NA), 2, 2)
vx <- velox(mat, extent=c(0,1,0,1), res=c(0.5,0.5), crs="+proj=longlat +datum=WGS84 +no_defs")
plot(vx$as.RasterLayer())
vx$aggregate(factor=c(2,2), aggtype="sum")
Thanks!
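Until aggregate grows a na.rm argument, a base-R fallback can do NA-aware block aggregation on the underlying matrix. A sketch, assuming the matrix dimensions are exact multiples of the aggregation factor (note that sum over an all-NA block returns 0):

```r
agg_sum_na <- function(mat, fact = 2) {
  nr <- nrow(mat) %/% fact
  nc <- ncol(mat) %/% fact
  out <- matrix(NA_real_, nr, nc)
  for (i in seq_len(nr)) {
    for (j in seq_len(nc)) {
      block <- mat[((i - 1) * fact + 1):(i * fact),
                   ((j - 1) * fact + 1):(j * fact)]
      out[i, j] <- sum(block, na.rm = TRUE)  # all-NA block -> 0
    }
  }
  out
}

mat <- matrix(c(1, 2, 3, NA), 2, 2)
agg_sum_na(mat)  # 1 + 2 + 3 = 6
```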
After having updated R to version 4.0, it seems that velox is not available. Any chance to upgrade the package? Thanks
I noticed this issue when I was extracting raster data for 6 m radius plots with raster resolution from 1 to 30 meters. At the coarser resolution some of the polygons do not coincide with any pixel centers. Thus, the function (e.g. mean or median) returns NA:
# Reproducible example:
library(velox)
library(raster)
## Make VeloxRaster
mat <- matrix(1:100, 10, 10)
extent <- c(0,1,0,1)
vx <- velox(mat, extent=extent, res=c(0.1,0.1), crs="+proj=longlat +datum=WGS84 +no_defs")
## Make SpatialPolygonsDataFrame
library(sp)
library(rgeos)
set.seed(0)
coords <- cbind(runif(10, extent[1], extent[2]), runif(10, extent[3], extent[4]))
sp <- SpatialPoints(coords)
# Default example
# from https://cran.r-project.org/web/packages/velox/README.html
spol_norm <- gBuffer(sp, width=0.2, byid=TRUE)
spdf_norm <- SpatialPolygonsDataFrame(spol_norm, data.frame(id=1:length(spol_norm)), FALSE)
# Smaller buffer
spol_small<- gBuffer(sp, width=0.05, byid=TRUE)
spdf_small <- SpatialPolygonsDataFrame(spol_small, data.frame(id=1:length(spol_small)), FALSE)
plot(spdf_norm); par(new=F)
plot(spdf_small)
## Extract values and calculate mean, see results
(ex.mat.norm <- vx$extract(spdf_norm, fun=median))
(ex.mat.small <- vx$extract(spdf_small, fun=median)) # -> 3 NA's
# Check problematic polygons
r <- raster(mat)
plot(r, main="Problematic polygons")
lines(spdf_small)
lines(spdf_small[5:7,], col="red") # Plots NAs in red
I can't install velox_0.2.0.tar.gz in RStudio; the error is: fatal error: 'math.h' file not found.
I use macOS Catalina. I tried to fix this issue by running "xcode-select --install" or "xcode-select --reset" in the terminal, but it didn't work.
Please help.
When trying to install the package on my MacBook Pro, I get the following error:
> install_github("hunzikp/velox")
Downloading GitHub repo hunzikp/velox@master
from URL https://api.github.com/repos/hunzikp/velox/zipball/master
Installing velox
'/Library/Frameworks/R.framework/Resources/bin/R' --no-site-file --no-environ --no-save --no-restore --quiet CMD INSTALL \
'/private/var/folders/_z/01gg71zs19g816v6m2dddt8w0000gn/T/RtmpuyMPbb/devtools55ad1309c763/hunzikp-velox-ff7bc3c' \
--library='/Users/thiago/Documents/R-packages' --install-tests
* installing *source* package ‘velox’ ...
** libs
clang++ -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I"/Users/thiago/Documents/R-packages/Rcpp/include" -I"/Users/thiago/Documents/R-packages/BH/include" -I/usr/local/include -fPIC -Wall -g -O2 -c RcppExports.cpp -o RcppExports.o
clang++ -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I"/Users/thiago/Documents/R-packages/Rcpp/include" -I"/Users/thiago/Documents/R-packages/BH/include" -I/usr/local/include -fPIC -Wall -g -O2 -c aggregate.cpp -o aggregate.o
clang++ -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I"/Users/thiago/Documents/R-packages/Rcpp/include" -I"/Users/thiago/Documents/R-packages/BH/include" -I/usr/local/include -fPIC -Wall -g -O2 -c boostgeom.cpp -o boostgeom.o
clang++ -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I"/Users/thiago/Documents/R-packages/Rcpp/include" -I"/Users/thiago/Documents/R-packages/BH/include" -I/usr/local/include -fPIC -Wall -g -O2 -c coordinates.cpp -o coordinates.o
clang++ -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I"/Users/thiago/Documents/R-packages/Rcpp/include" -I"/Users/thiago/Documents/R-packages/BH/include" -I/usr/local/include -fPIC -Wall -g -O2 -c focal.cpp -o focal.o
clang++ -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I"/Users/thiago/Documents/R-packages/Rcpp/include" -I"/Users/thiago/Documents/R-packages/BH/include" -I/usr/local/include -fPIC -Wall -g -O2 -c hittest.cpp -o hittest.o
clang++ -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I"/Users/thiago/Documents/R-packages/Rcpp/include" -I"/Users/thiago/Documents/R-packages/BH/include" -I/usr/local/include -fPIC -Wall -g -O2 -c im2col.cpp -o im2col.o
clang++ -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I"/Users/thiago/Documents/R-packages/Rcpp/include" -I"/Users/thiago/Documents/R-packages/BH/include" -I/usr/local/include -fPIC -Wall -g -O2 -c median.cpp -o median.o
clang++ -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I"/Users/thiago/Documents/R-packages/Rcpp/include" -I"/Users/thiago/Documents/R-packages/BH/include" -I/usr/local/include -fPIC -Wall -g -O2 -c pointextract.cpp -o pointextract.o
clang++ -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I"/Users/thiago/Documents/R-packages/Rcpp/include" -I"/Users/thiago/Documents/R-packages/BH/include" -I/usr/local/include -fPIC -Wall -g -O2 -c rasterize.cpp -o rasterize.o
clang++ -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I"/Users/thiago/Documents/R-packages/Rcpp/include" -I"/Users/thiago/Documents/R-packages/BH/include" -I/usr/local/include -fPIC -Wall -g -O2 -c write_helper.cpp -o write_helper.o
clang++ -std=gnu++11 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/Library/Frameworks/R.framework/Resources/lib -L/usr/local/lib -o velox.so RcppExports.o aggregate.o boostgeom.o coordinates.o focal.o hittest.o im2col.o median.o pointextract.o rasterize.o write_helper.o -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
installing to /Users/thiago/Documents/R-packages/velox/libs
** R
** tests
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
Error: package or namespace load failed for ‘velox’ in .doLoadActions(where, attach):
error in load action .__A__.1 for package velox: Rcpp::loadModule(module = "BOOSTGEOM", what = TRUE, env = ns, : Unable to load module "BOOSTGEOM": vector is too large
Error: loading failed
Execution halted
ERROR: loading failed
* removing ‘/Users/thiago/Documents/R-packages/velox’
Installation failed: Command failed (1)
If I install it from CRAN, I get the same error when loading it.
Any ideas on what can be wrong? My session info below:
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] devtools_1.13.6 gridExtra_2.3 rasterVis_0.45 latticeExtra_0.6-28 RColorBrewer_1.1-2 lattice_0.20-35 raster_2.6-7
[8] sp_1.3-1
loaded via a namespace (and not attached):
[1] Rcpp_0.12.18 git2r_0.23.0 pillar_1.3.0 compiler_3.5.1 class_7.3-14 tools_3.5.1 digest_0.6.15
[8] memoise_1.1.0 gtable_0.2.0 tibble_1.4.2 lubridate_1.7.4 jsonlite_1.5 viridisLite_0.3.0 pkgconfig_2.0.1
[15] rlang_0.2.1 rstudioapi_0.7 DBI_1.0.0 curl_3.2 yaml_2.2.0 rgdal_1.3-4 parallel_3.5.1
[22] hexbin_1.27.2 geojson_0.2.0 geojsonio_0.6.0 spData_0.2.9.3 e1071_1.7-0 withr_2.1.2 jqr_1.0.0
[29] stringr_1.3.1 httr_1.3.1 hms_0.4.2 rgeos_0.3-28 geojsonlint_0.2.0 classInt_0.2-3 grid_3.5.1
[36] jsonvalidate_1.0.0 sf_0.6-3 R6_2.2.2 foreign_0.8-71 readr_1.1.1 magrittr_1.5 rmapshaper_0.4.0
[43] units_0.6-0 maptools_0.9-3 V8_1.5 stringi_1.2.4 lazyeval_0.2.1 crayon_1.3.4 zoo_1.8-3
Several focal functions are available in velox, such as $meanFocal, $medianFocal and $sumFocal. In statistical analysis, other basic functions such as standard deviation, min and max are usually needed as well. Any plans to include these in the future?
"Error in out[p, k] <- fun(valmat[, k]) :
number of items to replace is not a multiple of replacement length"
I'm trying to do table statistics for a velox raster object using "VeloxRaster_extract"
It would be nice to be able to retrieve either the values from the extraction directly or be able to pass into "VeloxRaster_extract" a function that returns more than one value.
Sample code something like..
s.x <- velox(some.stack)
s.p <- spdf(some.spdf)
s.x$extract(s.p, fun=table)
throws the error.
The documentation says fun must be a function accepting a numeric vector as its sole input, so "table" should work. The issue is that extract expects fun to return a single value.
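As a workaround sketch: if extract is called with fun = NULL it should return the raw cell values per polygon (my reading of the docs, so treat this as an assumption), and the tabulation can then be done in plain R:

```r
# Stand-in for the raw extraction result: one numeric vector per polygon
vals <- list(c(1, 1, 2, 3), c(2, 2, 2))  # e.g. from s.x$extract(s.p, fun = NULL)
tabs <- lapply(vals, table)              # per-polygon class counts
tabs[[1]]  # counts of classes 1, 2, 3 in the first polygon
```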
It would be nice to include a na.rm argument for the velox extraction method (see ?VeloxRaster_extract). Although the latter performs considerably faster than raster::extract, I consider the lack of proper handling of missing values a major drawback.
For example, the code below, taken from gis-stackexchange, works fine using a raster-based approach (i.e., extract) along with na.rm = TRUE:
library(raster)
## get country geometry
if (!dir.exists("data")) dir.create("data")
aut <- getData("GADM", country = "AUT", level = 2, path = "data")
## download worldclim data
path <- "data/worldclim"
if (!dir.exists(path)) dir.create(path)
bio <- getData("worldclim", var = "bio", res = 10, path = path)
## extract values
val <- extract(bio, aut, fun = mean, na.rm = TRUE)
val[1:5, 1:5]
[1,] 99.0 97.00 31.00 7392.00 262.00
[2,] 94.5 96.50 32.00 7305.25 256.00
[3,] 93.0 97.50 31.50 7362.50 253.00
[4,] 90.0 98.00 31.00 7390.00 251.50
[5,] 91.0 96.00 32.00 7278.00 252.00
Unfortunately, NAs are not omitted using velox (regardless of the fact that it performs amazingly fast):
library(velox)
vlx <- velox(bio)
xtr <- vlx$extract(aut, fun = function(x) mean(x, na.rm = TRUE))
xtr[1:5, 1:5]
[1,] 99 97.0 31.0 7392.0 262.0
[2,] NA NA NA NA NA
[3,] 93 97.5 31.5 7362.5 253.0
[4,] 90 98.0 31.0 7390.0 251.5
[5,] 91 96.0 32.0 7278.0 252.0
Here is my sessionInfo:
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=de_DE.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=de_DE.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] velox_0.1.0.9000 rgeos_0.3-22 raster_2.5-8 sp_1.2-4 devtools_1.12.0
loaded via a namespace (and not attached):
[1] httr_1.2.1 R6_2.2.0 rgdal_1.2-5 tools_3.3.2 withr_1.0.2 curl_2.3 Rcpp_0.12.9
[8] memoise_1.0.0 grid_3.3.2 git2r_0.18.0 digest_0.6.11 lattice_0.20-34
I tried installing velox on a Mac and got these errors during compilation:
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/gethostuuid.h:39:17: error: C++ requires a type specifier for all declarations
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/unistd.h:662:27: error: unknown type name 'uuid_t'; did you mean 'uid_t'?
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/unistd.h:664:27: error: unknown type name 'uuid_t'; did you mean 'uid_t'?
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/unistd.h:727:31: error: unknown type name 'uuid_t'; did you mean 'uid_t'?
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/unistd.h:729:31: error: unknown type name 'uuid_t'; did you mean 'uid_t'?
5 errors generated.
make: *** [boostgeom.o] Error 1
ERROR: compilation failed for package ‘velox’
sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Catalina 10.15.3
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] udunits2_0.13 BH_1.72.0-3 raster_3.0-12 sp_1.4-1 sf_0.8-1 devtools_2.2.2 usethis_1.5.1
loaded via a namespace (and not attached):
[1] Rcpp_1.0.4 compiler_3.6.1 prettyunits_1.1.1 remotes_2.1.1 class_7.3-15 tools_3.6.1
[7] testthat_2.3.2 digest_0.6.25 pkgbuild_1.0.6 pkgload_1.0.2 lattice_0.20-38 memoise_1.1.0
[13] rlang_0.4.5 cli_2.0.2 DBI_1.1.0 rstudioapi_0.11 curl_4.3 e1071_1.7-3
[19] withr_2.1.2 desc_1.2.0 fs_1.3.1 classInt_0.4-2 rprojroot_1.3-2 grid_3.6.1
[25] glue_1.3.1 R6_2.4.1 processx_3.4.2 fansi_0.4.1 sessioninfo_1.1.1 callr_3.4.2
[31] magrittr_1.5 codetools_0.2-16 backports_1.1.5 ps_1.3.2 ellipsis_0.3.0 units_0.6-7
[37] assertthat_0.2.1 KernSmooth_2.23-15 crayon_1.3.4
I checked the dependencies and installed udunits ... I am not really sure how to proceed, as I do not understand what the errors mean. Any hint?
Hey there,
Very nice package.
Currently only SpatialPolygons are supported by the extract function. Are there any plans to allow the input of SpatialPoints as well?
I am working with RasterStacks and want to process individual coordinates.
Cheers,
Martin
Hi,
I may be well off on this as I'm new to R, but I want to be able to crop and keep the original. In the raster package I do this:
buffer_osg <- gBuffer( postcode_point, width=500, byid=TRUE)
buffer_wgs84 <- spTransform(buffer_osg, LATLON_CRS)
cropped <- crop(ndvi_raster, buffer_wgs84)
vals <- extract(cropped, buffer_wgs84)
And it takes about a second on a raster of dimension : 13561, 26809, 363556849 (nrow, ncol, ncell)
But I need to do it 90,000+ times on a sequence of 10 rasters so faster would be nicer
With velox I do
vx1$extract(buffer_wgs84)
which takes 30+ seconds, so I try cropping first (like with raster):
vx1$crop(extent(buffer_wgs84))
which is really quick, but I lose the original raster since vx1 is treated as mutable.
Is there a way to crop and get the results of the crop and keep the original to use next time around in the loop?
Sorry if this is a dumb question and thanks for sharing your work.
Roger
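A possible answer, hedged: VeloxRaster is an R Reference Class, and Reference Classes come with a built-in copy() method that returns an independent clone, so something like vx2 <- vx1$copy(); vx2$crop(extent(buffer_wgs84)) should preserve the original (assuming velox does not override copy()). The reference-vs-copy semantics can be demonstrated with a plain Reference Class:

```r
library(methods)  # Reference Classes live in the methods package

Counter <- setRefClass("Counter",
  fields  = list(n = "numeric"),
  methods = list(bump = function() { n <<- n + 1 }))

a <- Counter$new(n = 0)
b <- a          # plain assignment shares the same object (reference semantics)
b$bump()
a$n             # 1: mutation through b is visible through a

cc <- a$copy()  # copy() returns an independent clone
cc$bump()
a$n             # still 1; only cc changed
```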
The sf package is the R implementation of Simple Features and is becoming the new standard for working with spatial data in R. More information at https://github.com/edzer/sfr and http://robinlovelace.net/geocompr/spatial-class.html.
@hunzikp do you plan to add support for sf objects?
When trying to run your benchmark, I get:
> ## Import raster into PostGIS and index
> dbSendQuery(con, "DROP TABLE IF EXISTS raster.large;")
NOTICE: schema "raster" does not exist, skipping
I'm trying to speed up a summarize-by-polygon function using your package, but I work with extremely large rasters, generally 2-3 GB GeoTIFFs.
When I try to convert a raster stack object to a velox object, I get ye olde "cannot allocate vector of size N" warning, I assume because velox is trying to place the entire uncompressed TIFF into memory (huge).
It's way beyond my skill set, but it would be amazing to be able to somehow use the velox functions on very large rasters.
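One way around the memory limit, sketched under the assumption that per-tile results can be merged afterwards: split the raster extent into tiles, crop each tile from disk (e.g. with raster::crop, which reads lazily), and build a velox object per tile. The tiling arithmetic itself is plain R:

```r
# Split an extent into nx * ny tile extents: c(xmin, xmax, ymin, ymax)
tile_extents <- function(xmn, xmx, ymn, ymx, nx, ny) {
  xs <- seq(xmn, xmx, length.out = nx + 1)
  ys <- seq(ymn, ymx, length.out = ny + 1)
  tiles <- vector("list", nx * ny)
  k <- 1
  for (i in seq_len(nx)) {
    for (j in seq_len(ny)) {
      tiles[[k]] <- c(xs[i], xs[i + 1], ys[j], ys[j + 1])
      k <- k + 1
    }
  }
  tiles
}

tiles <- tile_extents(0, 100, 0, 100, 2, 2)
# then, per tile (hypothetical, requires the raster and velox packages):
#   vx <- velox(raster::crop(r, raster::extent(tiles[[k]])))
```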
I'm cross posting this from the GIS Stack Exchange: https://gis.stackexchange.com/questions/314950/all-geotiff-raster-values-na-after-converting-to-velox
I'm able to extract average tree cover values for each of the three small counties in the US State of Delaware, but I don't seem to have enough CPU or RAM to do the same thing for all counties in the US.
I'm getting my tree cover data from https://prd-tnm.s3.amazonaws.com/StagedProducts/Small-scale/data/LandCover/tree48i0100a.tif_nt00840.tar.gz. See also https://www.sciencebase.gov/catalog/item/581d0548e4b08da350d52653
library(raster)
library(tigris)
tigris.state.counties <- counties(state = 10)
tree.raster <-
raster("tree48i0100a.tif")
polygon.trees <-
extract(tree.raster,
tigris.state.counties,
fun = mean,
na.rm = TRUE)
report <-
cbind.data.frame(tigris.state.counties$NAME, polygon.trees)
print(report)
tigris.state.counties$NAME polygon.trees
1 New Castle 70.75920
2 Sussex 68.17089
3 Kent 67.00225
So I have set up an R environment on a 64 GB AWS instance, and I'm trying to take advantage of the velox fast extraction. But all of the values I extract are NA.
What am I doing wrong?
UPDATE: I still get NAs even if I use the small polygon option.
FWIW, I was able to replicate the velox extract tutorial and even modify it to get the night light intensity in the Delaware counties.
My attempt to extract tree cover in Delaware counties with velox:
library(velox)
library(tigris)
tree.velox <- velox("tree48i0100a.tif")
tigris.state.counties <- counties(state = 10)
tree.mean.mat <-
tree.velox$extract(
sp = tigris.state.counties,
fun = function(x)
mean(x, na.rm = TRUE)
)
report <-
cbind.data.frame(tigris.state.counties$NAME, tree.mean.mat)
print(report)
tigris.state.counties$NAME tree.mean.mat
591 New Castle NA
1150 Sussex NA
2861 Kent NA
Just found velox earlier today and it looks promising for something I'm working on. This is more an idea than an issue. I need area-weighted means for my current project, and I wonder whether you could drop one particular clause in this if so that it finds all points that intersect or overlap; I could then apply the area weights myself, but I need all the points that overlap or intersect my polygons. In theory, if you dropped length(missing.idx) > 0 from this if, would it then extract all the raster points where the box overlaps as well as where the centroid intersects? Just a thought.
if (small & length(missing.idx) > 0 & !isLine) {
# Create box grid, boost geometries, intersect
boostBoxGrid <- boost(.self, box = TRUE)
missing.boost <- geomc.boost[missing.idx]
intrs.ls <- bg_intersects(missing.boost, boostBoxGrid)
....
In order to create an elevation cross-section using the extract function, I believe there needs to be an along argument, as seen in raster::extract. Here's the description of this argument from the raster package:
along: boolean. Should returned values be ordered to go along the lines?
Does anyone have a workaround for this?
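One workaround, sketched: sample regularly spaced points along the line in vertex order yourself, then feed them to a point-extraction routine; the ordering is then guaranteed by construction. The interpolation step in base R (the line coordinates are hypothetical, and the extraction call itself is omitted):

```r
# A polyline given as an ordered coordinate matrix (hypothetical data)
line <- rbind(c(0, 0), c(1, 0), c(1, 1))
n <- 5
# cumulative distance along the vertices
d <- c(0, cumsum(sqrt(rowSums(diff(line)^2))))
s <- seq(0, max(d), length.out = n)
# linear interpolation of x and y against distance keeps point order
pts <- cbind(x = approx(d, line[, 1], xout = s)$y,
             y = approx(d, line[, 2], xout = s)$y)
pts  # rows are ordered along the line; pass them to a point-extract method
```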
hi,
Thanks for this package. It's very powerful for extracting data.
However, loading a raster is very slow compared to the raster package.
So the full process with velox is not faster than with raster.
Here is an example:
### Install velox with devtools (v0.1.0.9004 with @extract_points())
install.packages("devtools")
library(devtools)
install_github("hunzikp/velox")
### Load library
library(raster)
library(velox)
library(sp)
library(rbenchmark)
### Download a Raster
tf <- tempfile()
td <- tempdir()
download.file(url = 'https://raw.githubusercontent.com/GeoScripting-WUR/IntroToRaster/gh-pages/data/gewata.zip', destfile = tf, method = 'auto')
unzip(tf,exdir = td)
lf<-list.files(path = td ,pattern = '^.*\\.tif$', full.names = T)
### Load Raster
benchmark(
vx<-velox(lf),
r<-raster(lf),
replications = 10
)
#RESULT load raster:
# test replications elapsed relative user.self sys.self user.child sys.child
# 2 r <- raster(lf) 10 0.11 1.000 0.09 0.00 NA NA
# 1 vx <- velox(lf) 10 0.94 8.545 0.71 0.15 NA NA
### Create Point
x_coord<-runif(10,r@extent@xmin,r@extent@xmax)
y_coord<-runif(10,r@extent@ymin,r@extent@ymax)
p<-SpatialPoints(data.frame(x_coord,y_coord))
### Extract value
benchmark(
v_vx<-vx$extract_points(p),
v_r<-raster::extract(r,p),
replications = 10
)
#RESULT extract data:
# test replications elapsed relative user.self sys.self user.child sys.child
# 2 v_r <- raster::extract(r, p) 10 6.1 6.1 6.06 0.01 NA NA
# 1 v_vx <- vx$extract_points(p) 10 1.0 1.0 0.50 0.50 NA NA
velox is 9x slower than raster at loading a raster;
velox is 6x faster at extracting data.
I have 4000 big ASC rasters and I want to extract data from each one.
Can velox load rasters faster, or should I change my code?
(sorry for my bad English)
Hello,
thank you for this package; I am really impressed by the speed.
Currently I am working with some big raster stacks and am obviously trying to get by with your fast methods.
Maybe I am doing something wrong, but often when I call your methods on velox-stacks, I get this error
Error: attempt to apply non-function
I am trying to create a reproducible example, but the error is not easily isolated. What I have found for now is that this works
# create a velox object by explicitly stating the individual raster stack layers
velox.stack <- velox(raster.stack[[1]], raster.stack[[2]])
# apply some fast functions
velox.stack$aggregate
# velox-->raster stack again
new.raster.stack <- velox.stack$as.RasterStack()
while the following code does not work (for some of my files)
# create a velox object by just handing over the whole stack
velox.stack <- velox(raster.stack)
# apply some fast functions
velox.stack$aggregate
Error: attempt to apply non-function
For now I have only checked two functions, aggregate and rasterize, both of which produce the error.
When I can create a nice reproducible example for you, I will post it here.
Regarding my OS specs: I am working on Fedora 23 with R version 3.3.1.
After reviewing the issues (like I should have done first), I am definitely having an issue of velox 0.1.0 vs. 0.2.0, as in #25.
I am working with a pretty big raster (nearly 1 GB) and something is strange: velox is a lot slower at extracting from a huge raster than raster. What is happening here? (Reproducible example below.)
Reproduce DATA:
#To download a huge raster for a reproducible example, you can use my package spaceheater:
devtools::install_github("nbarsch/spaceheater")
library(spaceheater)
getWPdownload("Ethiopia", "Population", "adj", 2015) ###1GB file download, their server is also slow (sorry)
##Generate buffer
library(sf)
buff <- data.frame(lon=38.763, lat=9.005)
buff <- st_as_sf(buff,coords=c("lon","lat"),crs=4326)
buff <- st_transform(buff, crs=32637)
buff <- st_buffer(buff, 20000)
buff <- st_transform(buff, crs=4326)
buff$point <- "1"
Test velox 0.2.0 vs. velox 0.1.0:
library(raster)
library(velox)
vras <- velox("ETHIOPIA_Population_adj_2015.tif")
ras <- raster("ETHIOPIA_Population_adj_2015.tif")
##VELOX 0.2.0 version
velox_extracttime <- system.time(veloxval <- vras$extract(buff, fun=function(x){mean(x,na.rm=T)}))
> velox_extracttime
user system elapsed
12.019 48.024 88.293
#Checking value
> veloxval
[,1]
1 28.12575
###VELOX 0.1.0
> velox_extracttime <- system.time(veloxval <- vras$extract(as(buff,"Spatial"), fun=function(x){mean(x,na.rm=T)}))
> velox_extracttime
user system elapsed
0.052 0.004 0.056
> veloxval
[,1]
[1,] 28.12575
I have been unable to make the point extraction function work using the velox 0.2.0 package. For example, taking the code directly from page 19 of the current velox manual, the following error results:
set.seed(0)
mat1 <- matrix(rnorm(100), 10, 10)
mat2 <- matrix(rnorm(100), 10, 10)
vx <- velox(list(mat1, mat2), extent=c(0,1,0,1), res=c(0.1,0.1),crs="+proj=longlat +datum=WGS84 +no_defs")
library(sp)
library(rgeos)
coord <- cbind(runif(10), runif(10))
spoint <- SpatialPoints(coords=coord)
vx$extract_points(sp=spoint)
Error in envRefInferField(x, what, getClass(class(x)), selfEnv) :
‘extract_points’ is not a valid field or method name for reference class “VeloxRaster”
Does this mean vx$extract_points doesn't work, or is there another reason I am not seeing? Any help appreciated. I think velox is an amazing extension of R's handling of rasters.
Regards
Terry
Hi
I am interested in obtaining the weights of each cell with the function extract, when it is used with fun=NULL (although it could also be useful for fun=fo, weighting results).
I can see two ways:
raster::extract does it in pseudo-code (see the true code in rasterizePolygons.R):
r2 <- disaggregate(raster(x), 10)
r3 <- "rasterize"(r2)
aggregate(r3, 10, sum)
This could be used, although so far the aggregate does not seem to handle NAs (see Issue 9)?
Another approach is to do:
r2 <- raster::disaggregate(raster(x), 10)
r3 <- velox::extract(r2, df=TRUE)
aggregate the data-frame on cell
I use this approach, and I get identical results to the extract ones. Before going into details, is this necessarily the most efficient? Can you think of a faster way?
Code (based on pull request #14 allowing df=TRUE). This for now uses the raster package (for disaggregate) and the tidyverse.
library(testthat)
library(tidyverse)
library(raster)
library(velox)
#####################
### TEST
#####################
extr_weights <- function(x, cellnumbers=FALSE, sp, normalizeWeights=TRUE){
x$cell <- 1:(nrow(x)*ncol(x))
brk_100 <- disaggregate(x, fact=10)
brk_100_vx <- velox(brk_100)
vx.raw.df <- brk_100_vx$extract(sp, fun = NULL, df=TRUE) %>% as.tbl
colnames(vx.raw.df)[-1] <- names(x)
res <- vx.raw.df %>%
count_(colnames(.)) %>%
arrange(ID_sp, cell) %>%
mutate(n=n/100) %>%
rename(weight=n) %>%
dplyr::select(ID_sp, cell, everything())
if(normalizeWeights) {
res <- res %>%
group_by(ID_sp) %>%
mutate(weight=weight/sum(weight)) %>%
ungroup()
}
if(!cellnumbers) res <- dplyr::select(res, -cell)
res
}
#####################
### TEST
#####################
set.seed(0)
mat1 <- matrix(rnorm(100), 10, 10)
mat2 <- matrix(rnorm(100), 10, 10)
brk <- brick(raster(mat1), raster(mat2))
vx <- velox(list(mat1, mat2), extent=c(0,1,0,1), res=c(0.1,0.1),
crs="+proj=longlat +datum=WGS84 +no_defs")
## Make SpatialPolygons
coord <- matrix(c(0, 0, 1, 1, 0.5, 0.55), nrow = 3, ncol = 2, byrow = TRUE)
spoint <- SpatialPoints(coords=coord)
spols <- gBuffer(spgeom=spoint, width=c(0.5, 0.5, 0.04), byid = TRUE)
## Extract raw values as data-frame
rs.raw.df_w_norm <- raster::extract(x = brk, y = spols, fun = NULL, df=TRUE, small=FALSE,
cellnumbers=TRUE, weights=TRUE, normalizeWeights=TRUE) %>% as.tbl
rs.raw.df_w_unnorm <- raster::extract(x = brk, y = spols, fun = NULL, df=TRUE, small=FALSE,
cellnumbers=TRUE, weights=TRUE, normalizeWeights=FALSE) %>% as.tbl
vx.raw.df_w_unnorm <- extr_weights(x=brk, cellnumbers=TRUE, sp=spols, normalizeWeights=FALSE) %>%
rename(ID=ID_sp) %>%
mutate(ID=as.double(ID))
vx.raw.df_w_norm <- extr_weights(x=brk, cellnumbers=TRUE, sp=spols, normalizeWeights=TRUE) %>%
rename(ID=ID_sp) %>%
mutate(ID=as.double(ID))
all.equal(vx.raw.df_w_norm, rs.raw.df_w_norm)
all.equal(vx.raw.df_w_unnorm, rs.raw.df_w_unnorm)
expect_true(isTRUE(all.equal(vx.raw.df_w_norm, rs.raw.df_w_norm)))
Hi Philipp
The package sounds really promising, especially after working with a rather large dataset and finding the raster package terribly slow. Unfortunately, I was not able to use it. I have a large SpatialLinesDataFrame (~84,000 elements) and I'd like to extract values from a digital elevation model of Switzerland (7972 x 18545 cells). I tried it with a subset first, but I always got the same error:
Error in boostFactory$makeBoxGrid(origin = origin, dim = dim, res = res) :
std::bad_alloc
The same happens when I try to rasterize the lines (for a subsequent raster overlay). Did you encounter this before, or do you know how to resolve it? If you need some test data, you can contact me by email: [email protected].
Thanks for the great work.
Best regards,
Chris
When transferring data from the velox to the raster package, the data type is not maintained; e.g. integer values become floats. This increases the file size of the resulting raster.
Simple example:
mat <- matrix(1:9, 3, 3)
class(mat[1, 1])
## [1] "integer"
vx <- velox(mat, extent = c(0, 3, 0, 3), res = c(1, 1), crs = "+proj=longlat +datum=WGS84 +no_defs")
rs <- vx$as.RasterStack()
dataType(rs)
## [1] "FLT4S"
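If the main concern is file size on disk, one hedged workaround is to request an integer storage type explicitly when writing: raster's writeRaster() accepts a datatype argument, so the in-memory float conversion need not inflate the output file. A minimal sketch; the output filename and the choice of "INT4S" are assumptions, and the smallest type that fits your value range should be preferred:

```r
library(raster)
library(velox)

mat <- matrix(1:9, 3, 3)  # integer values
vx <- velox(mat, extent = c(0, 3, 0, 3), res = c(1, 1),
            crs = "+proj=longlat +datum=WGS84 +no_defs")
rs <- vx$as.RasterStack()

# rs reports FLT4S in memory, but we can still force a signed
# 4-byte integer type on disk when writing.
writeRaster(rs, "out.tif", datatype = "INT4S", overwrite = TRUE)
```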
Hi,
Thanks for such a great package. I am looking for a way to access the cell numbers (or lat/long) of the values extracted with velox's extract, so that the values can be modified and then replaced.
I've looked through the examples on your website and the documentation but haven't seen anything.
Ideally, it would return a similar result as ...
# pl is a polyline
# r is a raster
elevations <- raster::extract(r, pl, cellnumbers = TRUE)[[1]]
head(elevations)
cell layer
1 1959129 22.73608
2 1960496 28.27165
3 1961863 35.43027
4 1963230 35.97610
5 1963231 35.62851
6 1964598 36.00765
but with all the benefits of your package!
Either the cell index or lat/lon would be awesome if you know a solution.
Thank you!
Mike
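velox does not expose cell numbers directly, but since a VeloxRaster is a regular grid, the raster-style cell number (row-major, counting from the top-left cell) can be recovered from each extracted point's coordinates given the extent and resolution. A hedged base-R sketch; `cell_from_xy` is a hypothetical helper, and the extent/resolution values in the example are illustrative, not from any real raster:

```r
# Recover raster-style cell numbers from coordinates, assuming a regular
# grid numbered row-major from the top-left: cell = (row - 1) * ncol + col.
cell_from_xy <- function(x, y, xmin, ymax, res_x, res_y, ncol) {
  col <- floor((x - xmin) / res_x) + 1
  row <- floor((ymax - y) / res_y) + 1
  (row - 1) * ncol + col
}

# Illustrative 10 x 10 grid over the unit square with 0.1-degree cells:
cell_from_xy(x = 0.05, y = 0.95, xmin = 0, ymax = 1,
             res_x = 0.1, res_y = 0.1, ncol = 10)  # 1 (top-left cell)
cell_from_xy(x = 0.95, y = 0.05, xmin = 0, ymax = 1,
             res_x = 0.1, res_y = 0.1, ncol = 10)  # 100 (bottom-right cell)
```

With the cell numbers in hand, the modified values could be written back into the matrix returned by vx$as.matrix() (or an equivalent accessor) before converting back to a raster.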
Hi,
the following example always leads to an unexpected crash of my R session.
library(raster)
library(velox)
r <- raster(xmn = 0, xmx = 10, ymn = 0, ymx = 10, res = 0.1)
r[] <- runif(ncell(r))
v <- velox(r)
l <- spLines(cbind(c(2, 8), c(2, 8)))
v$extract(sp = l, fun = mean)
sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────
setting value
version R version 3.5.1 (2018-07-02)
os Ubuntu 18.04.1 LTS
system x86_64, linux-gnu
ui RStudio
language en_US:en
collate en_US.UTF-8
ctype en_US.UTF-8
tz Europe/Berlin
date 2019-02-11
─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
package * version date lib source
assertthat 0.2.0 2017-04-11 [1] CRAN (R 3.5.1)
class 7.3-14 2015-08-30 [4] CRAN (R 3.5.0)
classInt 0.2-3 2018-04-16 [1] CRAN (R 3.5.1)
cli 1.0.1 2018-09-25 [1] CRAN (R 3.5.1)
codetools 0.2-15 2016-10-05 [4] CRAN (R 3.5.0)
crayon 1.3.4 2017-09-16 [1] CRAN (R 3.5.1)
DBI 1.0.0 2018-05-02 [1] CRAN (R 3.5.1)
e1071 1.7-0 2018-07-28 [1] CRAN (R 3.5.1)
lattice 0.20-35 2017-03-25 [4] CRAN (R 3.5.0)
magrittr 1.5 2014-11-22 [1] CRAN (R 3.5.1)
raster * 2.8-4 2018-11-03 [1] CRAN (R 3.5.1)
Rcpp 0.12.19 2018-10-01 [1] CRAN (R 3.5.1)
rgdal 1.3-6 2018-10-16 [1] CRAN (R 3.5.1)
rgeos 0.4-2 2018-11-08 [1] CRAN (R 3.5.1)
rstudioapi 0.8 2018-10-02 [1] CRAN (R 3.5.1)
sessioninfo 1.1.0 2018-09-25 [1] CRAN (R 3.5.1)
sf * 0.7-2 2018-12-20 [1] CRAN (R 3.5.1)
sp * 1.3-1 2018-06-05 [1] CRAN (R 3.5.1)
spData 0.2.9.4 2018-09-15 [1] CRAN (R 3.5.1)
units 0.6-1 2018-09-21 [1] CRAN (R 3.5.1)
velox * 0.2.0 2017-12-01 [1] CRAN (R 3.5.1)
withr 2.1.2 2018-03-15 [1] CRAN (R 3.5.1)
yaml 2.2.0 2018-07-25 [1] CRAN (R 3.5.1)
[1] /home/jsigner/R/x86_64-pc-linux-gnu-library/3.5
[2] /usr/local/lib/R/site-library
[3] /usr/lib/R/site-library
[4] /usr/lib/R/library
Will you be adding mode as a function option to extract?
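In the meantime, since extract accepts an arbitrary function of a numeric vector via fun=, a mode can be supplied as a user-defined function without a dedicated option. A minimal sketch; `stat_mode` and its tie-breaking rule (first value encountered wins) are assumptions, not part of velox:

```r
# A simple statistical mode for numeric vectors: drop NAs, tabulate,
# return the most frequent value (ties break toward the first level).
stat_mode <- function(v) {
  v <- v[!is.na(v)]
  if (length(v) == 0) return(NA_real_)
  tab <- table(v)
  as.numeric(names(tab)[which.max(tab)])
}

stat_mode(c(1, 2, 2, 3, NA))  # 2

# Hypothetical usage, assuming vx is a VeloxRaster and polys a SpatialPolygons:
# vx$extract(sp = polys, fun = stat_mode)
```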
How can one use the copy method to create a deep copy of a VeloxRaster object?
Thanks
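VeloxRaster is an R reference class, so plain assignment only aliases the object, while the built-in $copy() method returns an independent duplicate. The semantics can be demonstrated with a plain base-R reference class; the `Counter` class below is a stand-in for illustration, and the velox line at the end is the hypothetical application:

```r
# Reference-class semantics in base R: assignment aliases, $copy() duplicates.
Counter <- setRefClass("Counter", fields = list(n = "numeric"))

a <- Counter$new(n = 1)
b <- a         # alias: b and a point at the same object
b$n <- 2       # a$n is now 2 as well

cpy <- a$copy()  # independent copy
cpy$n <- 99      # a$n stays 2

# The same pattern should apply to velox (vx being an existing VeloxRaster):
# vx2 <- vx$copy()
```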
In my experience, raster can do almost anything in acceptable time with in-memory operations on machines with up to 200GB of memory. What really is sluggish, compared to what should be possible, is extracting small subsets of data from very large raster stacks or very high resolution rasters.
When I last dug into the raster code, the main reasons seemed to be that the loading code 1) was written as an R for loop rather than in C++, and 2) loaded entire rows into memory and then subset them, rather than loading only the needed chunks of each row. velox would be a very nice place to optimize these sorts of extraction operations.
Hello,
I was trying to use velox to extract points from a wildfire raster and noticed that velox was much slower than raster. I've reproduced this issue on both Linux Mint and OS X. Apologies for the long MWE:
library(raster)
library(velox)
library(sf)
# https://www.fs.usda.gov/rds/archive/Product/RDS-2015-0046
rast <- raster("w001001.adf")
vx <- velox("w001001.adf")
n <- 100
points.df <- data.frame(x=runif(n, extent(rast)@xmin, extent(rast)@xmax),
y=runif(n, extent(rast)@ymin, extent(rast)@ymax))
points.sf <- st_as_sf(points.df, coords = c("x", "y"), crs = proj4string(rast), agr = "constant")
points.sf$test <- 1
points.sp <- as(points.sf, "Spatial")
system.time(extract(rast, points.sp)) # 0.3s
system.time(vx$extract_points(points.sf)) # 145s
system.time(vx$extract_points(points.sp)) # 154s
# Velox is much faster if raster is resaved as .tif
writeRaster(rast, "w001001_fromRaster.tif")
vx3 <- velox("w001001_fromRaster.tif")
rast3 <- raster("w001001_fromRaster.tif")
system.time(vx3$extract_points(points.sf)) # ~0s
system.time(vx3$extract_points(points.sp)) # ~0s
system.time(extract(rast3, points.sp)) # 0.3s
As you can see at the bottom, my workaround at the moment is to resave the raster as GeoTIFF and reload it, at which point velox is once again much faster on OS X and at least comparable to raster on Linux Mint. But I admit I'm a bit confused: both systems have plenty of memory.
Hi,
Trying to use velox for data extraction from large Sentinel composite raster images. However, when loading the raster data the following error message appears: Error: long vectors not supported yet: memory.c:3451. Do we have an unsupported data type stored in the raster, or is the raster simply too large?
/hans
It would be great if the extract function could also return the raw values, indicating which cells a given polygon covers. This would emulate the behaviour of raster::extract() with fun=NULL.
This does not seem too difficult: my understanding is that valmat already contains these values, so what is missing is a way to bypass the fun step when fun=NULL.
One new thing required is to add the polygon ID, since multiple rows correspond to a single polygon. Returning the ID would be useful anyway, even when fun is not NULL.
In pseudo-code (working only for nbands = 1):
Before the p loop:
if (is.null(fun)) out <- vector(mode = "list", length = length(sp))
IDs <- sapply(sp@polygons, function(x) x@ID)
Within the p loop, line 87:
valmat <- hitmat[, 3:ncol(hitmat), drop = FALSE]
if (!is.null(fun) && nrow(valmat) > 0) {
  for (k in 1:ncol(valmat)) {
    out[p, k] <- fun(valmat[, k])
  }
} else {
  out[[p]] <- data.frame(ID = IDs[p], valmat)
}
After the p loop:
if (is.null(fun)) out <- do.call("rbind", out)
Hi all,
I have tried switching to both exactextractr and terra for zonal statistics over very large amounts of polygons, but they both end up being much slower than velox for my applications. I continue to get by using velox, but it becomes an issue when sharing pipelines with collaborators who don't have as much flexibility.
Any solutions for compatibility issues, other than switching over to other libraries?
Cheers,
Arthur