downloader's Issues

source_url/sha_url don't remove temporary files

source_url and sha_url create temporary files but do not remove them as expected. It seems that on.exit(rm(temp_file)) should be replaced by on.exit(unlink(temp_file)): rm() only removes the R variable that holds the path, whereas unlink() deletes the file itself.

To reproduce:

library("downloader")

packageVersion("downloader")
# [1] '0.2.2.99'

list.files(tempdir())
# character(0)

downloader::source_url("https://gist.github.com/wch/dae7c106ee99fe1fdfe7/raw/db0c9bfe0de85d15c60b0b9bf22403c0f5e1fb15/test.r", sha="9b8ff5213e32a871d6cb95cce0bed35c53307f61")

list.files(tempdir())
# [1] "file406779a6d925"
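
The difference between the two cleanup calls can be reproduced without the package. The following is a minimal sketch; the function names leaky() and tidy() are hypothetical and are not the package's code.

```r
# Hypothetical helpers: each writes a temporary file and registers a
# different cleanup handler before returning the file's path.
leaky <- function() {
  temp_file <- tempfile()
  on.exit(rm(temp_file))      # removes only the variable; the file stays
  writeLines("x <- 1", temp_file)
  temp_file
}

tidy <- function() {
  temp_file <- tempfile()
  on.exit(unlink(temp_file))  # deletes the file from disk
  writeLines("x <- 1", temp_file)
  temp_file
}

file.exists(leaky())  # TRUE: the file was left behind
file.exists(tidy())   # FALSE: the file was removed on exit
```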

Downloaded PDF file could not be read by Acrobat Reader

The download succeeded, but the PDF could not be opened or read.

library(downloader)
download("https://ssl.www8.hp.com/ww/en/secure/pdf/4aa5-3281enw.pdf",
         "4aa5-3281enw.pdf",
         mode = "wb",
         cacheOK = FALSE)

trying URL 'https://ssl.www8.hp.com/ww/en/secure/pdf/4aa5-3281enw.pdf'
Content type 'text/html; charset=UTF-8' length 200 bytes
opened URL
downloaded 102 Kb

Trying to open the PDF file:

shell.exec("C:/Users/XXXX/4aa5-3281enw.pdf")

Got an error: "Acrobat could not open '4aa5-3281enw.pdf' because it is not a supported file type or because the file has been damaged ....."

However, when I download the PDF manually, it opens without any issue. Could the problem be related to the 'ssl' in the URL (for example encryption), given that this is a secured site?

To try this with the URL above, first paste the link into a browser and fill out the form on the HP site. After that, you can run the R code. Thank you.

System information:

$platform
[1] "x86_64-w64-mingw32"
$version.string
[1] "R version 3.1.0 (2014-04-10)"
downloader_0.4
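
The transcript above reports Content type 'text/html', which suggests (this is an assumption, not something confirmed in the report) that the server returned an HTML page such as the form rather than the PDF itself. One way to check is to look at the first bytes of the saved file, since PDF files begin with the signature "%PDF-". A minimal sketch, with a hypothetical helper name:

```r
# Hypothetical helper: TRUE if the file starts with the PDF signature "%PDF-".
looks_like_pdf <- function(path) {
  header <- readBin(path, what = "raw", n = 5L)
  identical(header, charToRaw("%PDF-"))
}

# Usage, assuming the file from the report exists in the working directory:
# looks_like_pdf("4aa5-3281enw.pdf")
```

If this returns FALSE, opening the file in a text editor should show whether it contains HTML.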

Package tests use the now deprecated `testthat::context()` function

downloader is part of my local dependency tree. I am attempting to run the tests of every package in this tree, and the use of testthat::context() causes a hiccup when running the tests for downloader.

https://github.com/wch/downloader/blob/master/tests/testthat/test-download.R#L1
https://github.com/wch/downloader/blob/master/tests/testthat/test-sha.R#L1

Any chance for a minor update that omits the use of testthat::context()?

Request: add a retry feature

Downloads frequently fail when getting a large number of files. A nice retry feature (somewhere) would be great. I currently use something like:

downloaded <- FALSE
try_number <- 1
while(!downloaded && try_number <= 10){
  try_number <- try_number + 1
  downloaded <- TRUE
  tryCatch(out <- download.file(url = url, ...),
           error = function(e){
             downloaded <<- FALSE
           })
}

but this doesn't handle failures that don't result in an error (I am unsure if these exist and I haven't yet run into them).
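
On failures that do not raise an error: download.file() returns an integer status code invisibly, 0 for success and non-zero for failure, so a retry loop can check that value as well. The following is a sketch under stated assumptions, not an existing feature of downloader. The name download_with_retry and the arguments max_tries, wait and fetch are hypothetical, and treating an empty file as a failure is a design choice of this sketch.

```r
# Hypothetical wrapper. `fetch` defaults to utils::download.file but can be
# any function that takes (url, destfile, ...) and returns 0 on success.
download_with_retry <- function(url, destfile, ..., max_tries = 10L,
                                wait = 1, fetch = utils::download.file) {
  for (attempt in seq_len(max_tries)) {
    # Capture errors as values instead of aborting the loop.
    status <- tryCatch(fetch(url, destfile, ...), error = function(e) e)
    # Success requires: no error, status code 0, and a non-empty file.
    ok <- !inherits(status, "error") &&
      identical(as.integer(status), 0L) &&
      file.exists(destfile) && file.size(destfile) > 0
    if (ok) return(invisible(attempt))
    if (attempt < max_tries) Sys.sleep(wait)  # pause before the next try
  }
  stop("download failed after ", max_tries, " attempts: ", url)
}
```

The function returns the number of the attempt that succeeded, which makes it easy to log how often retries were needed.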

download() fails where download.file(..., method = "wget") works

Dear Winston,

downloader often fails to fetch a file over an encrypted connection (https://), whereas calling download.file() with an appropriate option such as method = "wget" works.

My system is Ubuntu 16.04 running the latest stable CRAN version of downloader and current R.

Minimal reproducible example (for me):

lad <- paste0("https://census.edina.ac.uk/ukborders/easy_download/prebuilt/",
              "shape/England_lad_2011_gen.zip")
downloader::download(lad, destfile = "extdata/lad.zip")

## downloaded 0 bytes
## 
## Error in download.file(url, method = method, ...) : 
##  cannot download all files
## In addition: Warning message:
## In download.file(url, method = method, ...) :
##   URL 'https://census.edina.ac.uk/ukborders/easy_download/prebuilt/shape/England_lad_2011_gen.zip': status was '404 Not Found'

But download.file(..., method = "wget") works:

lad <- paste0("https://census.edina.ac.uk/ukborders/easy_download/prebuilt/",
              "shape/England_lad_2011_gen.zip")
download.file(lad, destfile = "extdata/lad.zip", method = "wget")

## --2016-06-15 14:43:24--  https://census.edina.ac.uk/ukborders/easy_download/prebuilt/shape/England_lad_2011_gen.zip
## Resolving census.edina.ac.uk (census.edina.ac.uk)... 129.215.41.78
## Connecting to census.edina.ac.uk (census.edina.ac.uk)|129.215.41.78|:443... connected.
## HTTP request sent, awaiting response... 200 OK
## Length: 2364331 (2.3M) [application/zip]
## Saving to: ‘extdata/lad.zip’
##
##      0K .......... .......... .......... .......... ..........  2% 2.35M 1s
##     50K .......... .......... .......... .......... ..........  4% 4.63M 1s
##    100K .......... .......... .......... .......... ..........  6% 21.0M 0s
##    150K .......... .......... .......... .......... ..........  8% 4.37M 0s
##    ...truncated...
## 
## 2016-06-15 14:43:24 (5.47 MB/s) - ‘extdata/lad.zip’ saved [2364331/2364331]

System info:

Sys.info()
##                                      sysname 
##                                      "Linux" 
##                                      release 
##                           "4.4.0-24-generic" 
##                                      version 
## "#43-Ubuntu SMP Wed Jun 8 19:27:37 UTC 2016" 
##                                     nodename 
##                                   "ggp13pmj" 
##                                      machine 
##                                     "x86_64" 
##                                        login 
##                                    "unknown" 
##                                         user 
##                                   "ggp13pmj" 
##                               effective_user 
##                                   "ggp13pmj" 

Session info:

R version 3.3.0 (2016-05-03)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04 LTS

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] tools_3.3.0     downloader_0.4  digest_0.6.9    packrat_0.4.7-1

I'll be honest: I'm not sure whether this is a bug or whether my computer isn't configured correctly, but I decided to post because I can download the file by manually specifying arguments that I thought downloader would take care of. I'm very happy to hear if I'm just doing something wrong, though! :-)

Thanks,
Phil
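
As a workaround for cases like this one, the download methods can be tried in turn until one succeeds. This is only a sketch: the helper name download_any_method, the order of the methods and the fetch argument are assumptions, and it does not explain why one method receives a 404 while another receives a 200.

```r
# Hypothetical helper: try several download.file() methods in order and
# return the name of the first one that reports success (status 0).
download_any_method <- function(url, destfile, ...,
                                methods = c("libcurl", "wget", "curl"),
                                fetch = utils::download.file) {
  for (m in methods) {
    status <- tryCatch(
      suppressWarnings(fetch(url, destfile, method = m, ...)),
      error = function(e) 1L  # treat an error like a non-zero status
    )
    if (identical(as.integer(status), 0L)) return(m)
  }
  stop("all methods failed for ", url)
}
```

The "wget" and "curl" methods require the corresponding command-line tools to be installed.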

Windows-only issue with downloading an Rdata file and loading it with R

Hi,

I posted this earlier on the bioc-devel mailing list, and while it's not a downloader-only issue, maybe you have some insight into the problem. My original post is at https://stat.ethz.ch/pipermail/bioc-devel/2016-June/009403.html. The basic issue is that downloading an Rdata file from GitHub and then loading it in R fails on Windows but works on Unix/Mac. Using utils::download.file() directly also fails, while downloading through the browser works. Something similar has happened to others before (http://stackoverflow.com/questions/28155563/when-loading-data-in-rstudio-getting-error-readitem-unknown-type-161-perhaps#comment61979452_28155854), but there is no clear solution.

Here is the R code:

> library('downloader')
> download('https://github.com/leekgroup/recount-website/blob/master/metadata/metadata_clean_sra.Rdata?raw=true', destfile = 'test.Rdata')
trying URL 'https://github.com/leekgroup/recount-website/blob/master/metadata/metadata_clean_sra.Rdata?raw=true'
Content type 'application/octet-stream' length 2531337 bytes (2.4 MB)
downloaded 2.4 MB

> load('test.Rdata')
Error: ReadItem: unknown type 50, perhaps written by later version of R
> traceback()
1: load("test.Rdata")
> options(width = 120)
> devtools::session_info()
Session info -----------------------------------------------------------------------------------------------------------
 setting  value
 version  R version 3.3.0 (2016-05-03)
 system   x86_64, mingw32
 ui       Rgui
 language (EN)
 collate  English_United States.1252
 tz       America/New_York
 date     2016-06-17

Packages ---------------------------------------------------------------------------------------------------------------
 package    * version date       source
 devtools     1.11.1  2016-04-21 CRAN (R 3.3.0)
 digest       0.6.9   2016-01-08 CRAN (R 3.3.0)
 downloader * 0.4     2015-07-09 CRAN (R 3.3.0)
 memoise      1.0.0   2016-01-29 CRAN (R 3.3.0)
 withr        1.0.1   2016-02-04 CRAN (R 3.3.0)
>

The same issue happens with a RawGit URL (http://rawgit.com/):

> download('https://cdn.rawgit.com/leekgroup/recount-website/master/metadata/metadata_clean_sra.Rdata', destfile = 'test2.Rdata')
trying URL 'https://cdn.rawgit.com/leekgroup/recount-website/master/metadata/metadata_clean_sra.Rdata'
Content type 'application/octet-stream' length unknown
downloaded 2.4 MB

> load('test2.Rdata')
Error: ReadItem: unknown type 50, perhaps written by later version of R

It's not a problem on Mac:

> library('downloader')
>  download('https://github.com/leekgroup/recount-website/blob/master/metadata/metadata_clean_sra.Rdata?raw=true', destfile = 'test.Rdata')
trying URL 'https://github.com/leekgroup/recount-website/blob/master/metadata/metadata_clean_sra.Rdata?raw=true'
Content type 'application/octet-stream' length 2531337 bytes (2.4 MB)
==================================================
downloaded 2.4 MB

> load('test.Rdata')
> options(width = 120)
> devtools::session_info()
Session info -----------------------------------------------------------------------------------------------------------
 setting  value                                 
 version  R version 3.3.0 RC (2016-05-01 r70572)
 system   x86_64, darwin13.4.0                  
 ui       AQUA                                  
 language (EN)                                  
 collate  en_US.UTF-8                           
 tz       America/New_York                      
 date     2016-06-18                            

Packages ---------------------------------------------------------------------------------------------------------------
 package    * version date       source        
 devtools     1.11.1  2016-04-21 CRAN (R 3.3.0)
 digest       0.6.9   2016-01-08 CRAN (R 3.3.0)
 downloader * 0.4     2015-07-09 CRAN (R 3.3.0)
 memoise      1.0.0   2016-01-29 CRAN (R 3.3.0)
 withr        1.0.1   2016-02-04 CRAN (R 3.3.0)

Best,
Leonardo
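
One commonly reported cause of this kind of error on Windows, not confirmed in this thread, is that the file was written in text mode. download.file() has a mode argument that defaults to "w", and the download() calls above do not pass mode = "wb". The sketch below forces binary mode and adds a quick sanity check of the result. The helper names and the fetch argument are hypothetical. The check relies on save() compressing with gzip by default, so it does not apply to files saved with other settings.

```r
# Hypothetical wrapper: forwards to a download function, always forcing
# binary mode so that the bytes are written unchanged.
download_binary <- function(url, destfile, ..., fetch = utils::download.file) {
  fetch(url, destfile, mode = "wb", ...)
}

# Hypothetical check: TRUE if the file starts with the gzip signature 1f 8b.
starts_with_gzip <- function(path) {
  identical(readBin(path, what = "raw", n = 2L), as.raw(c(0x1f, 0x8b)))
}
```

Because download() passes extra arguments on to download.file(), calling download(url, destfile = 'test.Rdata', mode = "wb") directly should have the same effect.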
