wch / downloader
R package for downloading files with https
source_url and sha_url create temporary files and don't remove them as expected. It seems that on.exit(rm(temp_file)) should be replaced by on.exit(unlink(temp_file)).
To reproduce:
library("downloader")
packageVersion("downloader")
# [1] '0.2.2.99'
list.files(tempdir())
# character(0)
downloader::source_url("https://gist.github.com/wch/dae7c106ee99fe1fdfe7/raw/db0c9bfe0de85d15c60b0b9bf22403c0f5e1fb15/test.r", sha="9b8ff5213e32a871d6cb95cce0bed35c53307f61")
list.files(tempdir())
# [1] "file406779a6d925"
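The difference between the two calls can be illustrated with a small sketch. In R, rm() only removes the variable binding, while unlink() deletes the file on disk. The function name fetch_and_read below is hypothetical and is not the package's actual code; writeLines() stands in for the download step.

```r
# Hypothetical helper, not taken from downloader's source.
fetch_and_read <- function(text) {
  temp_file <- tempfile()
  # unlink() deletes the file itself; rm(temp_file) would only drop the
  # variable holding the path and leave the file in tempdir().
  on.exit(unlink(temp_file), add = TRUE)
  writeLines(text, temp_file)  # stand-in for the download step
  list(path = temp_file, content = readLines(temp_file))
}

res <- fetch_and_read("x <- 1")
file.exists(res$path)
# [1] FALSE
```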
The download was successful, but the PDF could not be opened or read.
library(downloader)
download("https://ssl.www8.hp.com/ww/en/secure/pdf/4aa5-3281enw.pdf"
,"4aa5-3281enw.pdf"
,mode = "wb"
,cacheOK = FALSE
);
trying URL 'https://ssl.www8.hp.com/ww/en/secure/pdf/4aa5-3281enw.pdf'
Content type 'text/html; charset=UTF-8' length 200 bytes
opened URL
downloaded 102 Kb
shell.exec("C:/Users/XXXX/4aa5-3281enw.pdf");
However, I could download the PDF manually, and that copy opens without any issue! Does the 'ssl' in the URL (a secured site) have anything to do with this, such as encryption?
If you want to try the above URL, please paste the link into a browser and fill out the form on the HP site first. Then you can run the R code. Thank you.
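One way to investigate a report like this is to check whether the saved file is a PDF at all. The log above reports a content type of 'text/html', so the server may have returned a web page instead of the document; that reading is an assumption, not something confirmed here. The sketch below relies only on the fact that PDF files begin with the bytes "%PDF". The helper name looks_like_pdf is hypothetical.

```r
# Hypothetical helper: TRUE if the file starts with the PDF magic bytes.
looks_like_pdf <- function(path) {
  header <- readBin(path, what = "raw", n = 4L)
  identical(header, charToRaw("%PDF"))
}

# Usage, assuming the file from the report exists in the working directory:
# looks_like_pdf("4aa5-3281enw.pdf")
```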
$platform
[1] "x86_64-w64-mingw32"
$version.string
[1] "R version 3.1.0 (2014-04-10)"
downloader_0.4
downloader is part of my local dependency tree. For this dependency tree I am attempting to run all package tests for every package, and the use of testthat::context() causes a hiccup when running the tests for downloader:
https://github.com/wch/downloader/blob/master/tests/testthat/test-download.R#L1
https://github.com/wch/downloader/blob/master/tests/testthat/test-sha.R#L1
Any chance of a minor update that omits the use of testthat::context()?
See rstudio/packrat#239.
Downloads frequently fail when getting a large number of files. A nice retry feature (somewhere) would be great. I currently use something like:
downloaded <- FALSE
try_number <- 1
while (!downloaded && try_number <= 10) {
  try_number <- try_number + 1
  downloaded <- TRUE
  tryCatch(out <- download.file(url = url, ...),
           error = function(e) {
             downloaded <<- FALSE
           })
}
but this doesn't handle failures that don't result in an error (I am unsure if these exist and I haven't yet run into them).
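A retry wrapper along these lines could also cover failures that do not raise an error. utils::download.file() returns an integer status, 0 for success and non-zero for failure, so the sketch below checks that value and also checks that a non-empty file was written. The function name download_with_retry and its max_tries, wait, and fetch parameters are hypothetical; fetch exists only so that another download function can be swapped in.

```r
# Hypothetical helper, not part of downloader.
download_with_retry <- function(url, destfile, max_tries = 10, wait = 1,
                                fetch = utils::download.file, ...) {
  for (try_number in seq_len(max_tries)) {
    # Turn an error into a non-zero status so both failure modes look alike.
    status <- tryCatch(
      suppressWarnings(fetch(url, destfile, ...)),
      error = function(e) 1L
    )
    ok <- identical(as.integer(status), 0L) &&
      file.exists(destfile) && file.size(destfile) > 0
    if (ok) return(invisible(try_number))  # number of attempts used
    if (try_number < max_tries) Sys.sleep(wait)
  }
  stop("download failed after ", max_tries, " attempts: ", url)
}

# Usage (assumes `url` is defined):
# download_with_retry(url, "data.zip", mode = "wb")
```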
Dear Winston,
downloader often fails to fetch a file over an encrypted connection (https://), but download.file() with appropriate options such as method = "wget" works.
My system is Ubuntu 16.04, running the latest stable CRAN version of downloader and current R.
Minimal reproducible example (for me):
lad <- paste0("https://census.edina.ac.uk/ukborders/easy_download/prebuilt/",
"shape/England_lad_2011_gen.zip")
downloader::download(lad, destfile = "extdata/lad.zip")
## downloaded 0 bytes
##
## Error in download.file(url, method = method, ...) :
## cannot download all files
## In addition: Warning message:
## In download.file(url, method = method, ...) :
## URL 'https://census.edina.ac.uk/ukborders/easy_download/prebuilt/shape/England_lad_2011_gen.zip': status was '404 Not Found'
But download.file(..., method = "wget") works:
lad <- paste0("https://census.edina.ac.uk/ukborders/easy_download/prebuilt/",
"shape/England_lad_2011_gen.zip")
download.file(lad, destfile = "extdata/lad.zip", method = "wget")
## --2016-06-15 14:43:24-- https://census.edina.ac.uk/ukborders/easy_download/prebuilt/shape/England_lad_2011_gen.zip
## Resolving census.edina.ac.uk (census.edina.ac.uk)... 129.215.41.78
## Connecting to census.edina.ac.uk (census.edina.ac.uk)|129.215.41.78|:443... connected.
## HTTP request sent, awaiting response... 200 OK
## Length: 2364331 (2.3M) [application/zip]
## Saving to: ‘extdata/lad.zip’
##
## 0K .......... .......... .......... .......... .......... 2% 2.35M 1s
## 50K .......... .......... .......... .......... .......... 4% 4.63M 1s
## 100K .......... .......... .......... .......... .......... 6% 21.0M 0s
## 150K .......... .......... .......... .......... .......... 8% 4.37M 0s
## ...truncated...
##
## 2016-06-15 14:43:24 (5.47 MB/s) - ‘extdata/lad.zip’ saved [2364331/2364331]
System info:
Sys.info()
## sysname
## "Linux"
## release
## "4.4.0-24-generic"
## version
## "#43-Ubuntu SMP Wed Jun 8 19:27:37 UTC 2016"
## nodename
## "ggp13pmj"
## machine
## "x86_64"
## login
## "unknown"
## user
## "ggp13pmj"
## effective_user
## "ggp13pmj"
Session info:
R version 3.3.0 (2016-05-03)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04 LTS
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] tools_3.3.0 downloader_0.4 digest_0.6.9 packrat_0.4.7-1
I'll be honest I'm not sure if this is a bug or if my computer isn't quite configured correctly, but I decided to post this because I am able to download the file by manually specifying arguments that I thought downloader should take care of. I'm very happy to hear if I'm just doing something wrong though! :-)
Thanks,
Phil
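When comparing download methods as in the report above, it can help to first list which ones the local R installation can use. The sketch below checks for libcurl support in R and for the external wget and curl programs on the PATH. The function name available_methods is hypothetical, and the result depends on the machine it runs on.

```r
# Hypothetical diagnostic helper, not part of downloader.
available_methods <- function() {
  c(
    libcurl = isTRUE(unname(capabilities("libcurl"))),  # built into R?
    wget    = nzchar(unname(Sys.which("wget"))),        # external program
    curl    = nzchar(unname(Sys.which("curl")))         # external program
  )
}

available_methods()
```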
Hi,
I posted this earlier in the bioc-devel mailing list, and while it's not a downloader-only issue, maybe you have some insight into the problem. You can find my original post at https://stat.ethz.ch/pipermail/bioc-devel/2016-June/009403.html but the basic issue is that downloading an Rdata file from GitHub and then loading it in R fails on Windows (but not on Unix/Mac). Using utils::download.file() directly also fails, whereas downloading through the browser works. I see that something similar has happened to others before (http://stackoverflow.com/questions/28155563/when-loading-data-in-rstudio-getting-error-readitem-unknown-type-161-perhaps#comment61979452_28155854), but there is no clear solution.
Here is the R code:
> library('downloader')
> download('https://github.com/leekgroup/recount-website/blob/master/metadata/metadata_clean_sra.Rdata?raw=true', destfile = 'test.Rdata')
trying URL 'https://github.com/leekgroup/recount-website/blob/master/metadata/metadata_clean_sra.Rdata?raw=true'
Content type 'application/octet-stream' length 2531337 bytes (2.4 MB)
downloaded 2.4 MB
> load('test.Rdata')
Error: ReadItem: unknown type 50, perhaps written by later version of R
> traceback()
1: load("test.Rdata")
> options(width = 120)
> devtools::session_info()
Session info -----------------------------------------------------------------------------------------------------------
setting value
version R version 3.3.0 (2016-05-03)
system x86_64, mingw32
ui Rgui
language (EN)
collate English_United States.1252
tz America/New_York
date 2016-06-17
Packages ---------------------------------------------------------------------------------------------------------------
package * version date source
devtools 1.11.1 2016-04-21 CRAN (R 3.3.0)
digest 0.6.9 2016-01-08 CRAN (R 3.3.0)
downloader * 0.4 2015-07-09 CRAN (R 3.3.0)
memoise 1.0.0 2016-01-29 CRAN (R 3.3.0)
withr 1.0.1 2016-02-04 CRAN (R 3.3.0)
>
The same issue happens when using RawGit's URL (http://rawgit.com/):
> download('https://cdn.rawgit.com/leekgroup/recount-website/master/metadata/metadata_clean_sra.Rdata', destfile = 'test2.Rdata')
trying URL 'https://cdn.rawgit.com/leekgroup/recount-website/master/metadata/metadata_clean_sra.Rdata'
Content type 'application/octet-stream' length unknown
downloaded 2.4 MB
> load('test2.Rdata')
Error: ReadItem: unknown type 50, perhaps written by later version of R
It's not a problem on Mac:
> library('downloader')
> download('https://github.com/leekgroup/recount-website/blob/master/metadata/metadata_clean_sra.Rdata?raw=true', destfile = 'test.Rdata')
trying URL 'https://github.com/leekgroup/recount-website/blob/master/metadata/metadata_clean_sra.Rdata?raw=true'
Content type 'application/octet-stream' length 2531337 bytes (2.4 MB)
==================================================
downloaded 2.4 MB
> load('test.Rdata')
> options(width = 120)
> devtools::session_info()
Session info -----------------------------------------------------------------------------------------------------------
setting value
version R version 3.3.0 RC (2016-05-01 r70572)
system x86_64, darwin13.4.0
ui AQUA
language (EN)
collate en_US.UTF-8
tz America/New_York
date 2016-06-18
Packages ---------------------------------------------------------------------------------------------------------------
package * version date source
devtools 1.11.1 2016-04-21 CRAN (R 3.3.0)
digest 0.6.9 2016-01-08 CRAN (R 3.3.0)
downloader * 0.4 2015-07-09 CRAN (R 3.3.0)
memoise 1.0.0 2016-01-29 CRAN (R 3.3.0)
withr 1.0.1 2016-02-04 CRAN (R 3.3.0)
Best,
Leonardo
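One known cause of this kind of symptom on Windows is a text-mode transfer: utils::download.file() defaults to mode = "w" there, which can corrupt binary files such as .Rdata. Whether that is what happened in this report is an assumption. The sketch below forces binary mode; the wrapper name download_binary is hypothetical.

```r
# Hypothetical wrapper: always transfer in binary mode.
download_binary <- function(url, destfile, ...) {
  utils::download.file(url, destfile = destfile, mode = "wb", ...)
}

# Usage with the URL from the report:
# download_binary(
#   "https://github.com/leekgroup/recount-website/blob/master/metadata/metadata_clean_sra.Rdata?raw=true",
#   "test.Rdata"
# )
# load("test.Rdata")
```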
Hi,
I am trying to query a URL and getting the below error
Peer certificate cannot be authenticated with given CA certificates
Kindly suggest a solution