r-hub / crandb
Database of CRAN R packages
License: Other
Will only be updated weekly or so.
Just add new documents for them that contain the same info as the version they refer to, and that also contain the link explicitly. E.g.
{
"version": "3.1.1",
"date": "2014-07-10T00:00:00+00:00",
"type": "release"
}
Message issued on:
packages <- list_packages(format = "full", archived = TRUE)
Is releases() (and the corresponding endpoint of the API) still useful given that rversions exists?
How it would fit into the DB schema is an open question, though. Probably there will be a new document for each package that contains the download numbers, or just the latest numbers.
db should be rpkgdb
In particular, if the inferred date/time is in the future, then try to correct it based on the difference between the current time and the inferred time.
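One possible reading of this correction, sketched in R. The helper name correct_future_time() is hypothetical, and the idea that the error is a whole-hour time zone offset is an assumption made for illustration, not something the issue states.

```r
# Hypothetical helper: pull a timestamp that lies in the future back in time.
# Assumption: the error is a whole-hour offset (e.g. a time zone mix-up), so
# the difference to "now" is rounded up to full hours and subtracted.
correct_future_time <- function(inferred, now = Sys.time()) {
  diff_secs <- as.numeric(difftime(inferred, now, units = "secs"))
  if (diff_secs <= 0) {
    return(inferred)  # not in the future, keep as-is
  }
  offset_hours <- ceiling(diff_secs / 3600)
  inferred - offset_hours * 3600
}

now <- as.POSIXct("2014-07-10 12:00:00", tz = "UTC")
future <- as.POSIXct("2014-07-10 13:30:00", tz = "UTC")
fixed <- correct_future_time(future, now)
```

After the correction the timestamp is never later than the reference time; timestamps that are already in the past are returned unchanged.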
probably other R-hub/METACRAN stuff?
r3 <- crandb::cran_releases("2.3.0", format = "full")
#> Error in ping_port(host, port = port, count = 1, timeout = server$timeout * : Cannot resolve host name
Created on 2019-04-17 by the reprex package (v0.2.1)
devtools::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#> setting value
#> version R version 3.5.3 (2019-03-11)
#> os Ubuntu 18.04.2 LTS
#> system x86_64, linux-gnu
#> ui X11
#> language en_US
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz Europe/Paris
#> date 2019-04-17
#>
#> ─ Packages ──────────────────────────────────────────────────────────────
#> package * version date lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.5.3)
#> backports 1.1.4 2019-04-10 [1] CRAN (R 3.5.3)
#> callr 3.2.0 2019-03-15 [1] CRAN (R 3.5.3)
#> cli 1.1.0 2019-03-19 [1] CRAN (R 3.5.3)
#> crandb 1.0.0 2019-04-17 [1] local
#> crayon 1.3.4 2017-09-16 [1] CRAN (R 3.5.3)
#> curl 3.3 2019-01-10 [1] CRAN (R 3.5.3)
#> desc 1.2.0 2018-05-01 [1] CRAN (R 3.5.3)
#> devtools 2.0.2 2019-04-08 [1] CRAN (R 3.5.3)
#> digest 0.6.18 2018-10-10 [1] CRAN (R 3.5.3)
#> evaluate 0.13 2019-02-12 [1] CRAN (R 3.5.3)
#> fs 1.2.7 2019-03-19 [1] CRAN (R 3.5.3)
#> glue 1.3.1 2019-03-12 [1] CRAN (R 3.5.3)
#> highr 0.8 2019-03-20 [1] CRAN (R 3.5.3)
#> htmltools 0.3.6 2017-04-28 [1] CRAN (R 3.5.3)
#> httr 1.4.0 2018-12-11 [1] CRAN (R 3.5.3)
#> jsonlite 1.6 2018-12-07 [1] CRAN (R 3.5.3)
#> knitr 1.22 2019-03-08 [1] CRAN (R 3.5.3)
#> magrittr 1.5 2014-11-22 [1] CRAN (R 3.5.3)
#> memoise 1.1.0 2017-04-21 [1] CRAN (R 3.5.3)
#> parsedate 1.1.3 2017-03-02 [1] CRAN (R 3.5.3)
#> pingr 1.1.2 2017-03-02 [1] CRAN (R 3.5.3)
#> pkgbuild 1.0.3 2019-03-20 [1] CRAN (R 3.5.3)
#> pkgload 1.0.2 2018-10-29 [1] CRAN (R 3.5.3)
#> prettyunits 1.0.2 2015-07-13 [1] CRAN (R 3.5.3)
#> processx 3.3.0 2019-03-10 [1] CRAN (R 3.5.3)
#> ps 1.3.0 2018-12-21 [1] CRAN (R 3.5.3)
#> R6 2.4.0 2019-02-14 [1] CRAN (R 3.5.3)
#> Rcpp 1.0.1 2019-03-17 [1] CRAN (R 3.5.3)
#> remotes 2.0.3 2019-04-09 [1] CRAN (R 3.5.3)
#> rlang 0.3.4 2019-04-07 [1] CRAN (R 3.5.3)
#> rmarkdown 1.12 2019-03-14 [1] CRAN (R 3.5.3)
#> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 3.5.3)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.5.3)
#> spareserver 1.0.1 2015-07-13 [1] CRAN (R 3.5.3)
#> stringi 1.4.3 2019-03-12 [1] CRAN (R 3.5.3)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 3.5.3)
#> testthat 2.0.1 2018-10-13 [1] CRAN (R 3.5.3)
#> usethis 1.5.0 2019-04-07 [1] CRAN (R 3.5.3)
#> withr 2.1.2 2018-03-15 [1] CRAN (R 3.5.3)
#> xfun 0.6 2019-04-02 [1] CRAN (R 3.5.3)
#> yaml 2.2.0 2018-07-25 [1] CRAN (R 3.5.3)
#>
#> [1] /home/maelle/R/x86_64-pc-linux-gnu-library/3.5
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library
Querying maintainers by email: the immediate use case is the CRAN checks API. I need to move away from Ruby to command-line scripts for scraping check pages, and I haven't found a fast CLI way to parse https://cloud.r-project.org/web/checks/check_summary_by_maintainer.html to pull out maintainer email addresses.
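If the Maintainer field of each package were available, the email part could be pulled out with a regular expression. A minimal sketch in R, assuming the usual "Name <email>" layout; the function name maintainer_email() and the sample addresses are made up for illustration and are not part of crandb.

```r
# Hypothetical helper: extract the address between angle brackets from
# Maintainer strings of the form "Name <email>". Entries without an
# address yield NA. Addresses are lower-cased so they can be used as keys.
maintainer_email <- function(maintainer) {
  pattern <- "<[^<>]+@[^<>]+>"
  has_email <- grepl(pattern, maintainer)
  email <- sub("^.*<([^<>]+@[^<>]+)>.*$", "\\1", maintainer)
  email[!has_email] <- NA_character_
  tolower(email)
}

emails <- maintainer_email(c(
  "Jane Doe <Jane.Doe@example.org>",  # made-up sample data
  "No Address Given"
))
```

Grouping package names by the resulting key would then give a maintainer-to-packages lookup.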
data <- list_packages(limit = 1000)
Error in robust_q(service, url, fun, args, servers, odds, timeout_multiplier = timeout_multiplier * :
Cannot do query '/-/desc?start_key=""&limit=1000'.
A minor nit, but it takes close to 2 minutes for list_packages() to fail if the servers can't be pinged. Perhaps I'm just impatient, but it would be nice to be able to shorten that.
Add info about pkgsearch to the README
Should issues related to crandb API stay in this issue tracker or be transferred to another r-hub repo?
Something like http://www.pushmon.com/
Or https://deadmanssnitch.com/, but this is expensive.
library("crandb")
package("ropenaq", version = "all")
#> Error in ping_port(host, port = port, count = 1, timeout = server$timeout * : Cannot resolve host name
devtools::session_info()
#> Session info -------------------------------------------------------------
#> setting value
#> version R version 3.3.1 (2016-06-21)
#> system x86_64, mingw32
#> ui RTerm
#> language (EN)
#> collate Spanish_Spain.1252
#> tz Europe/Paris
#> date 2017-07-14
#> Packages -----------------------------------------------------------------
#> package * version date source
#> assertthat 0.2.0 2017-04-11 CRAN (R 3.3.3)
#> backports 1.0.5 2017-01-18 CRAN (R 3.3.2)
#> base * 3.3.1 2016-06-21 local
#> crandb * 1.0.0 2017-07-14 Github (metacran/crandb@c0c7c21)
#> datasets * 3.3.1 2016-06-21 local
#> devtools 1.13.1 2017-05-13 CRAN (R 3.3.3)
#> digest 0.6.12 2017-01-27 CRAN (R 3.3.2)
#> evaluate 0.10 2016-10-11 CRAN (R 3.3.1)
#> falsy 1.0.1 2017-07-14 Github (gaborcsardi/falsy@ee26873)
#> graphics * 3.3.1 2016-06-21 local
#> grDevices * 3.3.1 2016-06-21 local
#> htmltools 0.3.6 2017-04-28 CRAN (R 3.3.3)
#> httr 1.2.1 2016-07-03 CRAN (R 3.3.1)
#> jsonlite 1.5 2017-06-01 CRAN (R 3.3.3)
#> knitr 1.16 2017-05-18 CRAN (R 3.3.3)
#> magrittr 1.5 2014-11-22 CRAN (R 3.2.2)
#> memoise 1.1.0 2017-04-21 CRAN (R 3.3.3)
#> methods * 3.3.1 2016-06-21 local
#> parsedate 1.1.3 2017-03-02 CRAN (R 3.3.3)
#> pingr 1.1.2 2017-03-02 CRAN (R 3.3.3)
#> prettyunits 1.0.2 2015-07-13 CRAN (R 3.3.2)
#> R6 2.2.1 2017-05-10 CRAN (R 3.3.3)
#> Rcpp 0.12.11 2017-05-22 CRAN (R 3.3.3)
#> rmarkdown 1.5 2017-04-26 CRAN (R 3.3.3)
#> rprojroot 1.2 2017-01-16 CRAN (R 3.3.2)
#> spareserver 1.0.1 2015-07-13 CRAN (R 3.3.3)
#> stats * 3.3.1 2016-06-21 local
#> stringi 1.1.5 2017-04-07 CRAN (R 3.3.3)
#> stringr 1.2.0 2017-02-18 CRAN (R 3.3.3)
#> tools 3.3.1 2016-06-21 local
#> utils * 3.3.1 2016-06-21 local
#> withr 1.0.2 2016-06-20 CRAN (R 3.2.5)
#> yaml 2.1.14 2016-11-12 CRAN (R 3.3.2)
Works:
> package("devtools", "0.6")
CRAN package devtools 0.6, 4 years ago
Title: Tools to make developing R code easier
...
Doesn't:
> package("devtools", "0.6.0")
Error: crandb query:
Would be great if the system could match that properly.
Because everything goes through _show/package, we just need to change the rewrites, I guess, to let versions through verbatim.
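A different, client-side way to make "0.6" and "0.6.0" match is to strip trailing zero components before comparing. This is only a sketch in R of that normalization idea, not the rewrite change proposed above; normalize_version() and same_version() are hypothetical names.

```r
# Hypothetical helper: drop trailing ".0" / "-0" components from a version
# string, so that "0.6.0" and "0.6" normalize to the same value.
normalize_version <- function(version) {
  sub("([.-]0+)+$", "", version)
}

# Compare two version strings after normalization.
same_version <- function(a, b) {
  normalize_version(a) == normalize_version(b)
}

same_version("0.6", "0.6.0")
```

Components such as "10" are left alone, since only components consisting entirely of zeros are removed.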
It brings in Rcpp, and there is not enough memory on Openshift to compile that.
"Database of CRAN R packages" -> "Database of CRAN R packages metadata"?
Easy, just add them.
@ pave probably
If it was archived before an R release, then do not include it in that release. E.g. http://db.r-pkg.org/AMA
Not completely sure where.
So that we don't need to make 300 queries to compare locally installed packages.
In progress. It seems that it will work.
E.g.,
http://crandb.r-pkg.org/geonames
{
Package: "geonames",
Type: "Package",
Title: "Interface to www.geonames.org web service",
Version: "0.998",
Date: "2011-24-11", # <-------------- here
Author: "Barry Rowlingson",
Maintainer: "Barry Rowlingson <[email protected]>",
Depends: {
R: ">= 2.2.0"
},
Imports: {
rjson: "*"
},
Description: "Code for querying the web service at www.geonames.org",
License: "GPL-3",
LazyLoad: "yes",
BugReports: "https://github.com/ropensci/geonames/issues",
Packaged: "2014-12-19 17:40:07 UTC; rowlings",
NeedsCompilation: "no",
Repository: "CRAN",
Date/Publication: "2014-12-19 19:02:02",
crandb_file_date: "2014-12-19 13:03:21",
date: "2014-12-19T19:02:02+00:00",
releases: [ ]
}
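The marked Date value has 24 in the month position, so it is not a valid year-month-day date. A small check in R, as a sketch; is_valid_iso_date() is a hypothetical helper name, and it assumes the field is meant to be in %Y-%m-%d form.

```r
# Hypothetical helper: TRUE if the string parses as a year-month-day date.
# as.Date() returns NA when the string does not match the given format,
# for example when the month number is out of range.
is_valid_iso_date <- function(x) {
  !is.na(as.Date(x, format = "%Y-%m-%d"))
}

is_valid_iso_date(c("2011-24-11", "2011-11-24"))
#> [1] FALSE  TRUE
```

A check like this could flag records whose Date field needs manual review instead of storing the value unchanged.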
Similarly to /-/events and /-/pkgreleases
I'm looking at the number of releases in the ~6500 packages on CRAN around the end of April. There's a huge number with 0 releases (see attached figure). I don't think this is real or right, yes? For example dplyr is one of those packages. Can you help me figure out what I'm looking at? BTW I computed the number of releases from the length of the releases vector. We are getting this info through the API, but below I just use crandb.
> library(crandb)
> package("dplyr")$releases
Argument unicode has been deprecated. YAJL always parses unicode.
list()
> package("lattice")$releases
Argument unicode has been deprecated. YAJL always parses unicode.
list()
> package("grid")$releases
Argument unicode has been deprecated. YAJL always parses unicode.
[1] "2.0.0" "2.0.1" "2.1.0" "2.1.1" "2.2.0" "2.2.1" "2.3.0" "2.3.1"
[9] "2.4.0" "2.4.1" "2.5.0" "2.5.1" "2.6.0" "2.6.1" "2.6.2" "2.7.0"
[17] "2.7.1" "2.7.2" "2.8.0" "2.8.1" "2.9.0" "2.9.1" "2.9.2" "2.10.0"
[25] "2.10.1" "2.11.0" "2.11.1" "2.12.0" "2.12.1" "2.12.2" "2.13.0" "2.13.1"
[33] "2.13.2" "2.14.0" "2.14.1" "2.14.2" "2.15.0" "2.15.1" "2.15.2" "2.15.3"
[41] "3.0.0" "3.0.1"
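The count described above can be reproduced offline on a toy list. This is a sketch in R: the pkgs object is hand-built to mimic the shape of the output shown (an empty list for dplyr, the first three values from grid) and is not fetched from the API.

```r
# Toy records shaped like the package() results above (not real API output).
pkgs <- list(
  dplyr = list(releases = list()),
  grid  = list(releases = c("2.0.0", "2.0.1", "2.1.0"))
)

# Number of releases per package, taken as the length of the releases field.
n_releases <- vapply(pkgs, function(p) length(p$releases), integer(1))

# Names of the packages that report zero releases.
zero_release_pkgs <- names(n_releases)[n_releases == 0]
```

Here an empty releases field is counted as zero, which is exactly how dplyr ends up in the zero-release group.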
For now.
spareserver, falsy
So we don't mess with production.....
Looks quite silly.....
Do not compare the full db, just check the last package in both, and if there is something new, do a proper update.
It could be done using the mtime of the RDS file(s), maybe through rsync, or some other way, e.g. by checking the file sizes only.
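The mtime and file size idea could look roughly like this in R. It is a local sketch only: file_signature() and has_changed() are hypothetical names, and it assumes the RDS file is reachable as a regular file path, which is not how the rsync variant would work.

```r
# Hypothetical helper: record the size and modification time of a file.
file_signature <- function(path) {
  info <- file.info(path)
  list(size = info$size, mtime = info$mtime)
}

# TRUE if the size or the modification time differs from a stored signature.
has_changed <- function(path, previous) {
  current <- file_signature(path)
  !identical(current$size, previous$size) ||
    !identical(current$mtime, previous$mtime)
}

rds_file <- tempfile(fileext = ".rds")
saveRDS(letters, rds_file)
stored <- file_signature(rds_file)
unchanged <- has_changed(rds_file, stored)

# Overwrite with different content, then check again.
saveRDS(c(letters, LETTERS, month.name), rds_file)
changed <- has_changed(rds_file, stored)
```

Only when has_changed() reports a difference would the full, more expensive update need to run.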