citecorp's Issues

Maintenance status / help needed?

👋 @Selbosh!

Do you still intend to become this package's maintainer?

If so, do you need any help? For instance, is there an aspect where you'd appreciate some tips, contributions, or a PR review? Do you need an invitation to our friendly Slack workspace?

Package not working anymore?

It seems that citecorp is not working anymore?

See transcript below.


> library("citecorp")
> oc_doi2ids("10.1097/igc.0000000000000609")
data frame with 0 columns and 0 rows
> devtools::session_info()
─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.1 (2023-06-16)
 os       macOS Ventura 13.4.1
 system   aarch64, darwin20
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       Europe/Zurich
 date     2023-06-26
 pandoc   3.1.3 @ /opt/homebrew/bin/pandoc

─ Packages ───────────────────────────────────────────────────────────────────
 package     * version date (UTC) lib source
 cachem        1.0.7   2023-02-24 [1] CRAN (R 4.3.0)
 callr         3.7.3   2022-11-02 [1] CRAN (R 4.3.0)
 citecorp    * 0.3.0   2020-04-16 [1] CRAN (R 4.3.0)
 cli           3.6.1   2023-03-23 [1] CRAN (R 4.3.0)
 crayon        1.5.2   2022-09-29 [1] CRAN (R 4.3.0)
 crul          1.4.0   2023-05-17 [1] CRAN (R 4.3.0)
 curl          5.0.1   2023-06-07 [1] CRAN (R 4.3.0)
 data.table    1.14.8  2023-02-17 [1] CRAN (R 4.3.0)
 devtools      2.4.5   2022-10-11 [1] CRAN (R 4.3.0)
 digest        0.6.31  2022-12-11 [1] CRAN (R 4.3.0)
 ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.3.0)
 fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.0)
 fauxpas       0.5.2   2023-05-03 [1] CRAN (R 4.3.0)
 fs            1.6.1   2023-02-06 [1] CRAN (R 4.3.0)
 glue          1.6.2   2022-02-24 [1] CRAN (R 4.3.0)
 htmltools     0.5.5   2023-03-23 [1] CRAN (R 4.3.0)
 htmlwidgets   1.6.2   2023-03-17 [1] CRAN (R 4.3.0)
 httpcode      0.3.0   2020-04-10 [1] CRAN (R 4.3.0)
 httpuv        1.6.9   2023-02-14 [1] CRAN (R 4.3.0)
 jsonlite      1.8.5   2023-06-05 [1] CRAN (R 4.3.0)
 later         1.3.0   2021-08-18 [1] CRAN (R 4.3.0)
 lifecycle     1.0.3   2022-10-07 [1] CRAN (R 4.3.0)
 magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.0)
 memoise       2.0.1   2021-11-26 [1] CRAN (R 4.3.0)
 mime          0.12    2021-09-28 [1] CRAN (R 4.3.0)
 miniUI 2018-05-18 [1] CRAN (R 4.3.0)
 pkgbuild      1.4.0   2022-11-27 [1] CRAN (R 4.3.0)
 pkgload       1.3.2   2022-11-16 [1] CRAN (R 4.3.0)
 prettyunits   1.1.1   2020-01-24 [1] CRAN (R 4.3.0)
 processx      3.8.1   2023-04-18 [1] CRAN (R 4.3.0)
 profvis       0.3.7   2020-11-02 [1] CRAN (R 4.3.0)
 promises 2021-02-11 [1] CRAN (R 4.3.0)
 ps            1.7.5   2023-04-18 [1] CRAN (R 4.3.0)
 purrr         1.0.1   2023-01-10 [1] CRAN (R 4.3.0)
 R6            2.5.1   2021-08-19 [1] CRAN (R 4.3.0)
 Rcpp          1.0.10  2023-01-22 [1] CRAN (R 4.3.0)
 remotes       2.4.2   2021-11-30 [1] CRAN (R 4.3.0)
 rlang         1.1.0   2023-03-14 [1] CRAN (R 4.3.0)
 sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.0)
 shiny         1.7.4   2022-12-15 [1] CRAN (R 4.3.0)
 stringi       1.7.12  2023-01-11 [1] CRAN (R 4.3.0)
 stringr       1.5.0   2022-12-02 [1] CRAN (R 4.3.0)
 triebeard     0.4.1   2023-03-04 [1] CRAN (R 4.3.0)
 urlchecker    1.0.1   2021-11-30 [1] CRAN (R 4.3.0)
 urltools      1.7.3   2019-04-14 [1] CRAN (R 4.3.0)
 usethis       2.1.6   2022-05-25 [1] CRAN (R 4.3.0)
 vctrs         0.6.2   2023-04-19 [1] CRAN (R 4.3.0)
 whisker       0.4.1   2022-12-05 [1] CRAN (R 4.3.0)
 xtable        1.8-4   2019-04-21 [1] CRAN (R 4.3.0)

 [1] /Users/rainerkrug/R/library/aarch64-apple-darwin20/4.3
 [2] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library


oc_coci_cites() fails with multiple DOIs

Hi, just starting to play with this package that looks very cool:


pavo1_doi <- "10.1111/2041-210X.12069"
pavo2_doi <- "10.1111/2041-210X.13174"

oc_coci_cites(pavo1_doi)
#> # A tibble: 67 x 7
#>    cited     timespan citing     journal_sc creation oci               author_sc
#>  * <chr>     <chr>    <chr>      <chr>      <chr>    <chr>             <chr>
#>  1 10.1111/… P2Y2M    10.1111/b… no         2015-09… 0200101010136111… no
#>  2 10.1111/… P0Y4M    10.1636/b… no         2013-11  0200106030636110… no
#>  3 10.1111/… P1Y10M   10.1650/c… no         2015-05  0200106050036122… no
#>  4 10.1111/… P3Y7M    10.1186/s… no         2017-02… 0200101080636280… no
#>  5 10.1111/… P5Y5M    10.1155/2… no         2018-12… 0200101050536020… no
#>  6 10.1111/… P5Y4M    10.1111/e… no         2018-11… 0200101010136142… no
#>  7 10.1111/… P5Y7M    10.1111/e… no         2019-02… 0200101010136142… no
#>  8 10.1111/… P2Y6M    10.1002/e… no         2016-01… 0200100000236141… no
#>  9 10.1111/… P3Y10M   10.1101/1… no         2017-05… 0200101000136010… no
#> 10 10.1111/… P4Y1M    10.1101/1… no         2017-08… 0200101000136010… no
#> # … with 57 more rows

oc_coci_cites(pavo2_doi)
#> # A tibble: 4 x 7
#>   cited     timespan  citing    journal_sc creation oci                author_sc
#> * <chr>     <chr>     <chr>     <chr>      <chr>    <chr>              <chr>
#> 1 10.1111/… P0Y4M6D   10.7717/… no         2019-08… 02007070107362514… no
#> 2 10.1111/… P0Y2M3D   10.1007/… no         2019-06… 02001000007362801… no
#> 3 10.1111/… -P0Y0M13D 10.1101/… no         2019-03… 02001010001360508… yes
#> 4 10.1111/… P0Y4M7D   10.1101/… no         2019-08… 02001010001360703… no

oc_coci_cites(c(pavo1_doi, pavo2_doi))
#> # A tibble: 0 x 0

Created on 2020-04-08 by the reprex package (v0.3.0)

According to the documentation, it should work:

doi (character) one or more Digital Object Identifiers

but maybe this just applies to oc_coci_meta() (which does work with multiple DOIs) and not oc_coci_cites()?
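
Until the vectorised call works, one possible workaround is to query each DOI separately and stack the results. This is only a sketch: `cites_each` is a hypothetical helper, not part of citecorp, and it takes the lookup function as an argument so that it can be tried offline with a stub.

```r
# Hypothetical helper (not part of citecorp): call `fetch` once per DOI
# and row-bind the resulting data frames.
cites_each <- function(dois, fetch) {
  results <- lapply(dois, fetch)
  # Drop empty results so rbind() does not fail on 0-column data frames
  results <- Filter(function(x) is.data.frame(x) && nrow(x) > 0, results)
  if (length(results) == 0) return(data.frame())
  do.call(rbind, results)
}

# Offline stub standing in for oc_coci_cites(); returns one row per DOI
stub_fetch <- function(doi) {
  data.frame(cited = doi, citing = "10.0000/example", stringsAsFactors = FALSE)
}

combined <- cites_each(
  c("10.1111/2041-210X.12069", "10.1111/2041-210X.13174"),
  stub_fetch
)
nrow(combined)
```

With the package loaded, the call would presumably be `cites_each(c(pavo1_doi, pavo2_doi), citecorp::oc_coci_cites)`, assuming single-DOI requests keep behaving as in the reprex above.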

Paginate oc_coci_meta() or warn when there are too many DOIs?

Thanks for this great package!

I am trying to scrape and augment the citations of a bunch of articles, so I am obtaining them with oc_coci_cites and then passing the result into oc_coci_meta. However, with more than about 120 citations, the request fails after quite a long time with Request Header Fields Too Large (HTTP 431).

Maybe oc_coci_meta could split the request automatically when too many DOIs are requested? Or alternatively, just issue an explicit warning when there are more than, say, 100? As it stands, it took me rather long to figure out what the problem was (even though, in hindsight, the error message is rather suggestive).
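
The splitting idea could look roughly like the sketch below. `meta_in_batches` is a hypothetical helper, not part of citecorp. The default batch size of 100 is just the figure floated above, not a documented API limit, and the lookup function is passed in so the example runs offline against a stub.

```r
# Hypothetical helper (not part of citecorp): split `dois` into batches of
# at most `size` and call `fetch` once per batch.
meta_in_batches <- function(dois, fetch, size = 100L) {
  if (length(dois) > size) {
    warning(length(dois), " DOIs requested; splitting into batches of ", size)
  }
  # ceiling(index / size) assigns consecutive DOIs to batch 1, 2, 3, ...
  batches <- split(dois, ceiling(seq_along(dois) / size))
  do.call(rbind, lapply(batches, fetch))
}

# Offline stub standing in for oc_coci_meta(); records the batch size
stub_meta <- function(batch) {
  data.frame(doi = batch, batch_size = length(batch), stringsAsFactors = FALSE)
}

dois <- sprintf("10.0000/example.%03d", 1:250)
out <- suppressWarnings(meta_in_batches(dois, stub_meta))
table(out$batch_size)
```

In real use the second argument would presumably be `citecorp::oc_coci_meta`; whether 100 DOIs per request stays under the server's header limit is an assumption that would need checking.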

make sure egs run only if okay

via cran checks

  > if (crul::ok('')) {
    + oc_doi2ids("10.1097/igc.0000000000000609")
... removed
    Error: No description found for code: 520


 > if (crul::ok("")) {
    + # references
    + oc_coci_refs(doi1)
... removed
    Error in fauxpas::find_error_class(x$status_code) :
     no method found for 520
    Calls: oc_coci_refs -> oc_coci_stub -> oc_GET -> errs -> <Anonymous>
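
In the second log the error is raised inside oc_coci_refs(), that is, after the crul::ok() guard, so the guard alone does not seem to be enough. One possible extra safeguard, sketched here with a hypothetical `run_example` wrapper (not part of citecorp or crul), is to turn any error from the example call into a message:

```r
# Hypothetical wrapper (not part of citecorp or crul): evaluate `expr`
# and turn any error into a message plus an invisible NULL.
run_example <- function(expr) {
  tryCatch(
    expr,
    error = function(e) {
      message("Example skipped: ", conditionMessage(e))
      invisible(NULL)
    }
  )
}

# Simulated failure and success; no network access needed
failed <- run_example(stop("no method found for 520"))
worked <- run_example(1 + 1)
```

Because R evaluates arguments lazily, `expr` is only run inside `tryCatch()`, so an example such as `run_example(oc_coci_refs(doi1))` would be skipped rather than fail the check.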

Error in doi2ids - arguments imply differing number of rows: 0, 1

I try to convert DOIs to IDs, without success. And I know for a fact that these DOIs are on the Open Citations Corpus, because that's where I found them!

Minimal working example


Error returned in each case:

Error in data.frame(type = gsub("\\.type", "", names(tmp[, grep("\\.type",  : 
  arguments imply differing number of rows: 0, 1

The problem

This is a common bug due to an unexpected behaviour in how subsetting data frames works. That is, if you subset a data frame and the result is one column, it is automatically collapsed to a vector (not a one-column data frame) unless you specify drop = FALSE.

Here is the culprit

And the bug is triggered whenever the preceding tmp variable contains exactly one column with the suffix .type, for example

  paper.type                           paper.value
1        uri

because if you run the above code on this, you get character(0) as a result, which is not what you want.
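
The collapse can be reproduced in base R without the package. This is a minimal sketch: the data frame below mimics the one-row case described above, and "example-value" is a placeholder, since the real value is not shown in the report.

```r
# A one-row data frame with exactly one ".type" column, as in the report
tmp <- data.frame(paper.type = "uri", paper.value = "example-value",
                  stringsAsFactors = FALSE)

type_cols <- grep("\\.type", names(tmp))

# Default subsetting collapses the single column to a bare vector,
# so names() is NULL and gsub() returns character(0)
dropped <- names(tmp[, type_cols])
collapsed <- gsub("\\.type", "", dropped)

# With drop = FALSE the result stays a data frame and keeps its names
kept <- gsub("\\.type", "", names(tmp[, type_cols, drop = FALSE]))
```

Passing the zero-length `collapsed` to data.frame() alongside a length-one value column is what produces the "differing number of rows: 0, 1" error.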


Whilst I could add drop = FALSE, I would take the opportunity to simplify the code instead. The following works on the examples above.

    # Take the prefixes from the column names instead of subsetting the data frame
    tmp <- data.frame(
      type  = gsub('\\.type', '', grep('\\.type', names(tmp), value = TRUE)),
      value = unname(unlist(tmp[, grep('\\.value', names(tmp))])),
      stringsAsFactors = FALSE
    )
