downloader's People

Contributors

jjallaire, sgibb, wch

downloader's Issues

downloaded pdf file could not be read by Acrobat Reader!

The download was successful, but the PDF could not be opened or read.
library(downloader)
download("https://ssl.www8.hp.com/ww/en/secure/pdf/4aa5-3281enw.pdf",
         "4aa5-3281enw.pdf",
         mode = "wb",
         cacheOK = FALSE)

trying URL 'https://ssl.www8.hp.com/ww/en/secure/pdf/4aa5-3281enw.pdf'
Content type 'text/html; charset=UTF-8' length 200 bytes
opened URL
downloaded 102 Kb

Trying to open the PDF file:

shell.exec("C:/Users/XXXX/4aa5-3281enw.pdf");

Got an error: "Acrobat could not open '4aa5-3281enw.pdf' because it is not a supported file type or because the file has been damaged ....."

However, when I download the PDF file manually, it can be read without any issue! Could this have anything to do with the 'ssl' in the URL, i.e. the fact that this is a secured site (perhaps encryption)?

If you want to try the URL above yourself, paste the link into a browser and fill out the form on the HP site first; then you can run the R code. Thank you.
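
The 'text/html' content type in the output above suggests the server may have returned an HTML page rather than the PDF itself. A quick diagnostic sketch (base R only; the file name is the one used above) to check what was actually saved:

# A genuine PDF starts with the bytes "%PDF".
first_bytes <- readBin("4aa5-3281enw.pdf", what = "raw", n = 4)
rawToChar(first_bytes)
# If this is not "%PDF" (e.g. it looks like "<htm"), the server sent back a
# different document, such as the HTML form page, which would explain why
# Acrobat cannot open the file.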

System information:

$platform
[1] "x86_64-w64-mingw32"
$version.string
[1] "R version 3.1.0 (2014-04-10)"
downloader_0.4

Package tests use the now deprecated `testthat::context()` function

downloader is part of my local dependency tree. For this dependency tree I am attempting to run the package tests of every package, and the use of testthat::context() causes a hiccup when running the tests for downloader.

https://github.com/wch/downloader/blob/master/tests/testthat/test-download.R#L1
https://github.com/wch/downloader/blob/master/tests/testthat/test-sha.R#L1

Any chance for a minor update that omits the use of testthat::context()?
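
For reference, a minimal sketch of the requested change (the test body below is a placeholder, not one of the package's actual tests): dropping the context() call is enough, since recent testthat versions infer the context from the file name.

# tests/testthat/test-download.R (sketch)
# context("download")   # deprecated call removed; testthat infers the context from the file name

test_that("placeholder standing in for the existing download tests", {
  expect_true(TRUE)
})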

downloader() fails where download.file(..., method = "wget") works

Dear Winston,

downloader often fails to fetch a file over an encrypted connection (https://), but download.file() with appropriate options such as method = "wget" works.

My system is Ubuntu 16.04, running the latest stable CRAN version of downloader and a current R.

Minimal reproducible example (for me):

lad <- paste0("https://census.edina.ac.uk/ukborders/easy_download/prebuilt/",
              "shape/England_lad_2011_gen.zip")
downloader::download(lad, destfile = "extdata/lad.zip")

## downloaded 0 bytes
## 
## Error in download.file(url, method = method, ...) : 
##  cannot download all files
## In addition: Warning message:
## In download.file(url, method = method, ...) :
##   URL 'https://census.edina.ac.uk/ukborders/easy_download/prebuilt/shape/England_lad_2011_gen.zip': status was '404 Not Found'

But download.file(..., method = "wget") works:

lad <- paste0("https://census.edina.ac.uk/ukborders/easy_download/prebuilt/",
              "shape/England_lad_2011_gen.zip")
download.file(lad, destfile = "extdata/lad.zip", method = "wget")

## --2016-06-15 14:43:24--  https://census.edina.ac.uk/ukborders/easy_download/prebuilt/shape/England_lad_2011_gen.zip
## Resolving census.edina.ac.uk (census.edina.ac.uk)... 129.215.41.78
## Connecting to census.edina.ac.uk (census.edina.ac.uk)|129.215.41.78|:443... connected.
## HTTP request sent, awaiting response... 200 OK
## Length: 2364331 (2.3M) [application/zip]
## Saving to: ‘extdata/lad.zip’
##
##      0K .......... .......... .......... .......... ..........  2% 2.35M 1s
##     50K .......... .......... .......... .......... ..........  4% 4.63M 1s
##    100K .......... .......... .......... .......... ..........  6% 21.0M 0s
##    150K .......... .......... .......... .......... ..........  8% 4.37M 0s
##    ...truncated...
## 
## 2016-06-15 14:43:24 (5.47 MB/s) - ‘extdata/lad.zip’ saved [2364331/2364331]

System info:

Sys.info()
##                                      sysname 
##                                      "Linux" 
##                                      release 
##                           "4.4.0-24-generic" 
##                                      version 
## "#43-Ubuntu SMP Wed Jun 8 19:27:37 UTC 2016" 
##                                     nodename 
##                                   "ggp13pmj" 
##                                      machine 
##                                     "x86_64" 
##                                        login 
##                                    "unknown" 
##                                         user 
##                                   "ggp13pmj" 
##                               effective_user 
##                                   "ggp13pmj" 

Session info:

R version 3.3.0 (2016-05-03)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04 LTS

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] tools_3.3.0     downloader_0.4  digest_0.6.9    packrat_0.4.7-1

I'll be honest: I'm not sure whether this is a bug or whether my computer isn't quite configured correctly, but I decided to post this because I am able to download the file by manually specifying arguments that I thought downloader would take care of. I'm very happy to hear if I'm just doing something wrong, though! :-)
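
For what it's worth, a small diagnostic sketch (base R only, nothing downloader-specific) that may help narrow down which download machinery is available on a given system:

capabilities("libcurl")               # TRUE if R's built-in libcurl method is available
getOption("download.file.method")     # a system-wide default method, if one has been set
Sys.which(c("wget", "curl"))          # external programs usable via method = "wget"/"curl"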

Thanks,
Phil

source_url/sha_url don't remove temporary files

source_url() and sha_url() create temporary files and don't remove them as expected. It seems that on.exit(rm(temp_file)) should be replaced by on.exit(unlink(temp_file)).

To reproduce:

library("downloader")

packageVersion("downloader")
# [1] '0.2.2.99'

list.files(tempdir())
# character(0)

downloader::source_url("https://gist.github.com/wch/dae7c106ee99fe1fdfe7/raw/db0c9bfe0de85d15c60b0b9bf22403c0f5e1fb15/test.r", sha="9b8ff5213e32a871d6cb95cce0bed35c53307f61")

list.files(tempdir())
# [1] "file406779a6d925"

Windows-only issue with downloading an Rdata file and loading it with R

Hi,

I posted this earlier on the bioc-devel mailing list, and while it's not a downloader-only issue, maybe you have some insight into what the problem is. You can find my original post at https://stat.ethz.ch/pipermail/bioc-devel/2016-June/009403.html, but the basic issue is that downloading an Rdata file from GitHub and trying to load it with R fails on Windows (but not on Unix/Mac). Using utils::download.file() directly also fails, but downloading with the browser works. Something like this seems to have happened to others before (http://stackoverflow.com/questions/28155563/when-loading-data-in-rstudio-getting-error-readitem-unknown-type-161-perhaps#comment61979452_28155854), but there is no clear solution.

Here is the R code:

> library('downloader')
> download('https://github.com/leekgroup/recount-website/blob/master/metadata/metadata_clean_sra.Rdata?raw=true', destfile = 'test.Rdata')
trying URL 'https://github.com/leekgroup/recount-website/blob/master/metadata/metadata_clean_sra.Rdata?raw=true'
Content type 'application/octet-stream' length 2531337 bytes (2.4 MB)
downloaded 2.4 MB

> load('test.Rdata')
Error: ReadItem: unknown type 50, perhaps written by later version of R
> traceback()
1: load("test.Rdata")
> options(width = 120)
> devtools::session_info()
Session info -----------------------------------------------------------------------------------------------------------
 setting  value
 version  R version 3.3.0 (2016-05-03)
 system   x86_64, mingw32
 ui       Rgui
 language (EN)
 collate  English_United States.1252
 tz       America/New_York
 date     2016-06-17

Packages ---------------------------------------------------------------------------------------------------------------
 package    * version date       source
 devtools     1.11.1  2016-04-21 CRAN (R 3.3.0)
 digest       0.6.9   2016-01-08 CRAN (R 3.3.0)
 downloader * 0.4     2015-07-09 CRAN (R 3.3.0)
 memoise      1.0.0   2016-01-29 CRAN (R 3.3.0)
 withr        1.0.1   2016-02-04 CRAN (R 3.3.0)
>

The same issue happens when using a RawGit URL (http://rawgit.com/):

> download('https://cdn.rawgit.com/leekgroup/recount-website/master/metadata/metadata_clean_sra.Rdata', destfile = 'test2.Rdata')
trying URL 'https://cdn.rawgit.com/leekgroup/recount-website/master/metadata/metadata_clean_sra.Rdata'
Content type 'application/octet-stream' length unknown
downloaded 2.4 MB

> load('test2.Rdata')
Error: ReadItem: unknown type 50, perhaps written by later version of R
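
One thing that may be worth trying (a hedged sketch, not a confirmed diagnosis): forcing binary transfer with mode = "wb", which download() passes through to download.file(), as in the PDF issue above. On Windows, binary files written in the default text mode can be silently altered, and the ?raw=true suffix may prevent download.file() from recognising the .Rdata extension and switching to binary mode on its own.

library(downloader)
# Hedged workaround sketch: request binary transfer explicitly on Windows.
download('https://github.com/leekgroup/recount-website/blob/master/metadata/metadata_clean_sra.Rdata?raw=true',
         destfile = 'test_wb.Rdata', mode = 'wb')
load('test_wb.Rdata')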

It's not a problem on Mac:

> library('downloader')
>  download('https://github.com/leekgroup/recount-website/blob/master/metadata/metadata_clean_sra.Rdata?raw=true', destfile = 'test.Rdata')
trying URL 'https://github.com/leekgroup/recount-website/blob/master/metadata/metadata_clean_sra.Rdata?raw=true'
Content type 'application/octet-stream' length 2531337 bytes (2.4 MB)
==================================================
downloaded 2.4 MB

> load('test.Rdata')
> options(width = 120)
> devtools::session_info()
Session info -----------------------------------------------------------------------------------------------------------
 setting  value                                 
 version  R version 3.3.0 RC (2016-05-01 r70572)
 system   x86_64, darwin13.4.0                  
 ui       AQUA                                  
 language (EN)                                  
 collate  en_US.UTF-8                           
 tz       America/New_York                      
 date     2016-06-18                            

Packages ---------------------------------------------------------------------------------------------------------------
 package    * version date       source        
 devtools     1.11.1  2016-04-21 CRAN (R 3.3.0)
 digest       0.6.9   2016-01-08 CRAN (R 3.3.0)
 downloader * 0.4     2015-07-09 CRAN (R 3.3.0)
 memoise      1.0.0   2016-01-29 CRAN (R 3.3.0)
 withr        1.0.1   2016-02-04 CRAN (R 3.3.0)

Best,
Leonardo

Request: add a retry feature

Downloads frequently fail when getting a large number of files. A nice retry feature (somewhere) would be great. I currently use something like:

downloaded <- FALSE
try_number <- 1
# Retry up to 10 times; an error inside download.file() resets the flag.
while (!downloaded && try_number <= 10) {
  try_number <- try_number + 1
  downloaded <- TRUE
  tryCatch(out <- download.file(url = url, ...),
           error = function(e) {
             downloaded <<- FALSE
           })
}

but this doesn't handle failures that don't result in an error (I am unsure if these exist and I haven't yet run into them).
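
A sketch of what such a feature might look like as a wrapper (download_retry is a hypothetical name, not an existing downloader function); checking the destination file afterwards also covers failures that don't raise an error:

download_retry <- function(url, destfile, tries = 10, ...) {
  for (i in seq_len(tries)) {
    ok <- tryCatch({
      downloader::download(url, destfile = destfile, ...)
      # Treat a missing or empty file as a failure even if no error was raised.
      file.exists(destfile) && file.size(destfile) > 0
    }, error = function(e) FALSE)
    if (ok) return(invisible(destfile))
  }
  stop("Failed to download ", url, " after ", tries, " attempts")
}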
