marton-balazs-kovacs / tenzing

tenzing: documenting contributorship with CRediT

Home Page: https://marton-balazs-kovacs.github.io/tenzing/

License: Other

R 15.68% CSS 0.88% JavaScript 0.19% HTML 83.25%

shinyapp rpackage contributor-roles

tenzing's Introduction

tenzing

Tenzing, an easy-to-use web-based app, allows researchers to generate reports about the contribution of each team member on a project using CRediT, for insertion into their manuscripts and for publishers to potentially incorporate into article metadata.

CRediT (Contributor Roles Taxonomy) is a taxonomy of 14 categories of contributions to scientific scholarly output. Each researcher can indicate which category they contributed to in a scholarly project.

The app is named after the Nepali-Indian Sherpa Tenzing Norgay, who was one of the two individuals who reached the summit of Mount Everest for the first time. Despite his essential contribution, he received less credit than his partner, the New Zealand mountaineer Edmund Hillary.

Features

Tenzing can:

read all the necessary contributorship information from one file (.csv, .tsv or .xlsx)
create a report of the contributions
create the contributors’ affiliation information, designed for inclusion in the first page of a manuscript
create JATS XML with the contributions, suitable for publishers to include in metadata
create a YAML output that will automatically add the contributorship information to the papaja package used by some researchers to write APA-formatted manuscripts

Usage

Tenzing can be used either via the web app or via R.

Using the web app

You can use the app at https://tenzing.club/.

You can alternatively run the app locally on your own computer by following these instructions:

Install the development version (tenzing is not available from CRAN) from GitHub with:

# install.packages("devtools")
devtools::install_github("marton-balazs-kovacs/tenzing")

Run the app with:

tenzing::run_app()

You can read more on how to use the tenzing app in vignette("app_use").

Using the package

You can read more on how to use the tenzing package to create reports from R in vignette("local_use").

Contribution

We are open to new ideas and feature requests. We think Tenzing has the potential to make additional contributorship-related tasks easy for researchers.

Please note that the ‘tenzing’ project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

tenzing's Issues

Add support for role qualifications

I'll summarize our discussion below.

We may want to qualify our roles. Given that this seems important to us, maybe we should implement support for this feature before releasing the app? In terms of how we could implement this in the sheet, I suggest we could turn the tick boxes into a drop-down list to select from (e.g., Lead, Equal, Supporting, None; I just tried this out, it works nicely). This would obviate some of the input validation.

We can also say that if someone does not want to use the more nuanced version they can just set every contributor "equal", as the default choice should be "none" in all cells.

But we should provide help to users about how to decide who is the lead and who is supporting etc.

There are some open questions for the human readable report:

Does this mean that for the reporting of the credit taxonomy we would have the role, then the lead contributor, then the equal contributors, then the supporting? If there are more contributors saying equal or supporting then they will be organized by the order in publication? What do you think? Should I write the lead and supporting roles in parentheses after the name?
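One way the ordering question above could play out is sketched below in base R. It is only an illustration: the data frame layout, the column names (name, order, degree) and the function format_role() are hypothetical and not part of tenzing. Contributors are sorted by degree (lead, equal, supporting), then by their order in the publication, and the degree is written in parentheses after the name.

```r
# Hypothetical input for a single role: one row per contributor.
contrib <- data.frame(
  name   = c("A. Author", "B. Author", "C. Author", "D. Author"),
  order  = c(1, 2, 3, 4),  # order in the publication
  degree = c("Supporting", "Lead", "Equal", "Supporting"),
  stringsAsFactors = FALSE
)

format_role <- function(role, contrib) {
  degree_levels <- c("Lead", "Equal", "Supporting")
  # Drop contributors whose degree is "None" (or anything unexpected).
  contrib <- contrib[contrib$degree %in% degree_levels, ]
  # Sort by degree first, then by publication order within each degree.
  degree_rank <- match(contrib$degree, degree_levels)
  contrib <- contrib[order(degree_rank, contrib$order), ]
  entries <- paste0(contrib$name, " (", tolower(contrib$degree), ")")
  paste0(role, ": ", paste(entries, collapse = ", "), ".")
}

format_role("Conceptualization", contrib)
```

With the sample data this puts B. Author (lead) first, then C. Author (equal), then the two supporting contributors in publication order.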

LATEX formatted output

Based on a suggestion on twitter we might consider adding LATEX formatted output to the contributors' affiliation and the contributor roles outputs.

I wasn't sure that including a separate option is worth the work, as the text output can be copy-pasted and formatted easily in Overleaf.

However, @crsh says that there are several ways to include this information in a LATEX formatted file, and with the inclusion of ORCID ids this option gets more interesting.

JATS entities should be NISO links, CASRAI won’t work any more

Caroline Webber of Aries pointed this out - that for the currently live-website at least, it's still using CASRAI links, which will stop resolving if they haven't already.

Should we show the infosheet in the app even if it fails the main checks?

The validation code still runs: upon loading an infosheet, a modal pops up with the validation result, explaining why the sheet was not validated.

The reason we do not show the sheet when validation throws an error is that custom code highlights the rows with missing credit info, and that code currently needs the credit columns to be present in the infosheet.

We can change that behavior but I think that should be in a different pull request not in this one.

Originally posted by @marton-balazs-kovacs in #44 (comment)

Submodules for output preview and output download

We talked about the idea of creating submodules for the output previews and downloads with @crsh.

Preview:
For all four output types, the preview processes are basically the same, with the exception of the input RMD names.

Download:
The processes are also similar, except for the content function part of the downloadHandler, which differs for the YAML and XML outputs.
We should discuss whether it would make sense to create a write type function that decides which function to use to write the output, or we should create separate download submodules for code type and formatted text typed outputs.

more than 2 affiliations

I am looking into the code... I see that only 2 affiliations are accepted.

Is there a plan to go beyond 2 affiliations?

googlesheet integration

It might be interesting to keep the spreadsheet on a google sheet, and read that spreadsheet with the googlesheets4 package ?

people would have to enter the googlesheet url

jatsxml format problems

the jats4r recommendation is out of date (URL indicated leads to a 404 page).

I would use the ontology ID for now, and try to connect with the jats4r people to have an update. (I am also looking into having the force11 working group do that)

update spreadsheet template

We may want to update the spreadsheet template, the goal:

all information should fit in a unique sheet
work on googlesheet but also as a standalone spreadsheet (tsv format)

get orcid numbers
allow more than 2 affiliations
allow for funding information ?

correct roles (capitals wrong), use NISO terms #39

Visually distinguish links from the rest of the text

Hi @marton-balazs-kovacs, currently links in the app look just like regular text until you mouse-over on them. How do you feel about adding some visual distinction (e.g., changing the color) to make it easier for users to discover them?

Waiter loading bars act weird on download

There are multiple loading bars, some stay on the screen after the download is completed.

add info in first row of tenzing-template-sheet

this would allow working both on googlesheet and tsv.

show spreadsheet should be available on error

I've noticed that not only the output buttons but also the show spreadsheet button is disabled when the infosheet validation raises an error. As the table output code is tailored to the infosheet template, I am not sure that it is possible to change that without raising an error that crashes the app. However, it is worth noting. It would be a better user experience if the error could be checked in-app.

Handling of middle names

Hi Marton,

a brief summary of our discussion thus far. The infosheet includes a field for author middle names. It should handle both full names and initials, which is currently not the case. I just had a quick look at the current behavior for

Full middle name
Initial with dot
Initial without dot
Multiple initials with dot

There are currently no errors. I saw the following behavior:

In the author contribution list everything is printed as is.
In the author affiliation list everything is correctly abbreviated to an initial with dot. However, if there are multiple initials, only the first one is kept.
XML and JATS include everything as is.

I assume we want consistent behavior for all outputs (e.g., abbreviations)?

The line that currently does the abbreviation in the author affiliation list is this one:

tenzing/R/print_contrib_affil.R

Line 33 in 296f99b

paste0(stringr::str_sub(`Middle name`, 1, 1), ".")),

Something like this should work and be reasonably robust:

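# Reduce a single name to its first character followed by a dot.
# Empty strings yield NULL so that they are dropped later.
# Note: this name masks base::abbreviate().
abbreviate <- function(x) {
    if (nzchar(x)) {
        gsub(x, pattern = "^(.).*", replacement = "\\1.")
    } else {
        NULL
    }
}

# Split on spaces and dots, abbreviate each part, and rejoin with spaces.
abbreviate_middle_names <- function(x) {
    split_names <- unlist(strsplit(x, split = " |\\."))
    initials <- unlist(lapply(split_names, abbreviate))
    paste(initials, collapse = " ")
}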

name <- " p k j.p."
abbreviate_middle_names(name)

[1] "p. k. j. p."

This works regardless of whether names are abbreviated with or without a dot, and even if a dot-abbreviated entry is missing a space.

If this looks good to you I could implement this for all outputs in a new PR.

Feature request: conflict of interest statements

Super helpful app - thanks! Not sure if this is in your scope, but as you already collect information for funding statements, I wonder if you see an opportunity to capture information from authors for conflict of interest statements? Even if authors have no conflicts of interest, it would be useful to confirm that explicitly with each author.

support several first/last authors

input:

same number in order
another column with co-first, co-last

output:
subscript star in author name (similar to affiliation) + text below

These authors contributed equally to this work
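The "same number in order" input option could be handled roughly as follows. This is a base R sketch under assumptions: the columns name and order and the function mark_equal_contributors() are hypothetical, and a plain asterisk stands in for the star marker mentioned above.

```r
# Hypothetical input: authors who share an order number share that position.
authors <- data.frame(
  name  = c("Jane Doe", "John Doe", "Alex Smith"),
  order = c(1, 1, 2),
  stringsAsFactors = FALSE
)

mark_equal_contributors <- function(authors) {
  authors <- authors[order(authors$order), ]
  # An order number that occurs more than once marks a shared position.
  repeated <- authors$order[duplicated(authors$order)]
  shared <- authors$order %in% repeated
  labels <- paste0(authors$name, ifelse(shared, "*", ""))
  # The footnote is only added when at least one position is shared.
  note <- if (any(shared)) {
    "* These authors contributed equally to this work"
  } else {
    NULL
  }
  c(paste(labels, collapse = ", "), note)
}

mark_equal_contributors(authors)
```

A real implementation would need a different symbol if the asterisk is also used for corresponding authors.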

Uploading messages contradict each other

After a user clicks Browse to choose a local contributors table file using the local operating system file dialog, the result is a message that says "Upload complete". Unfortunately, this is very confusing because after that you still have to click "Upload from local file". Because I can't find the "Upload complete" text in our code, I'm thinking it's a message provided by some File manipulation widget in a library for which we can't change the message? If so, we'll need to change "Upload from local file" instead, maybe to "Process file"?

Feature request: ORCiD

This is a great tool!

Could you add a column for Orcid numbers for the authors' list generator?

Thanks.

YAML output error: Conceptualization listed for all authors even when no authors indicated for conceptualization

As you can see after uploading this infosheet -
infosheet_template (1).xlsx

in the YAML output, every author is listed for Conceptualization even though none were indicated.

validation text: state which errors make validation fail

errors (in red in the current validation text) make the spreadsheet not usable (the show infosheet button fails).

It should be clearer for the user that it is the case and that the infosheet cannot be used.

Alternatively, we could have no "error validation" but only warnings.

Feature request: show corresponding author in authors list

In the author list, put an asterisk next to the corresponding author(s) name. E.g.:

John Doe¹, Jane Doe²*,
¹ University A
² University B
*Corresponding author
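The requested layout can be sketched in base R as below. The data frame columns (name, affiliation, corresponding) and the function format_author_list() are assumptions made for illustration; they do not reflect how tenzing stores or prints this information.

```r
# Hypothetical input: one row per author.
authors <- data.frame(
  name          = c("John Doe", "Jane Doe"),
  affiliation   = c("University A", "University B"),
  corresponding = c(FALSE, TRUE),
  stringsAsFactors = FALSE
)

format_author_list <- function(authors) {
  # Superscript digits 1 to 3 only, to keep the sketch short.
  superscripts <- c("\u00b9", "\u00b2", "\u00b3")
  affils <- unique(authors$affiliation)
  idx <- match(authors$affiliation, affils)
  # Corresponding authors get an asterisk after the affiliation marker.
  star <- ifelse(authors$corresponding, "*", "")
  names_line <- paste(
    paste0(authors$name, superscripts[idx], star),
    collapse = ", "
  )
  affil_lines <- paste(superscripts[seq_along(affils)], affils)
  footer <- if (any(authors$corresponding)) "*Corresponding author" else NULL
  c(names_line, affil_lines, footer)
}

cat(format_author_list(authors), sep = "\n")
```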

Alternative output type for the contributor roles output

Suggestion made by Simon Kerridge:

Also provide a text output where the contributions are listed by author (as opposed to the current option, which shows authors against contribution). I think both have their uses, but I favour the former for a small number of authors. As a possible addition to this output you could also add in affiliation email (and perhaps ORCID iD if that was added).
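A by-author listing could be derived from a wide table of tick boxes along these lines. The sketch uses base R only; the table layout (a name column plus one logical column per role) and the function roles_by_author() are hypothetical.

```r
# Hypothetical wide table: one row per author, one logical column per role.
roles <- data.frame(
  name              = c("Jane Doe", "John Doe"),
  Conceptualization = c(TRUE, FALSE),
  Methodology       = c(TRUE, TRUE),
  Software          = c(FALSE, TRUE),
  stringsAsFactors = FALSE
)

roles_by_author <- function(roles) {
  role_cols <- setdiff(names(roles), "name")
  vapply(seq_len(nrow(roles)), function(i) {
    # Keep the names of the roles ticked for this author.
    held <- role_cols[unlist(roles[i, role_cols])]
    paste0(roles$name[i], ": ", paste(held, collapse = ", "), ".")
  }, character(1))
}

roles_by_author(roles)
```

Authors without any ticked role would produce an empty list after the colon, so a real implementation would probably skip them.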

reorder the outputs

I think it would make more sense to have it:

Text to copy
- author list as APA text
- contribution text
- funding text
computer readable exports
- yaml (papaja)
- Jats-XML

or simpler:

author list as APA text
contribution text
funding text

yaml (papaja)
Jats-XML

Backup of slack entries

This is an issue to back up interesting Slack messages (via the GitHub app) so that they are not lost.
When the discussion moves to a different issue or is finished, please delete the message.

Decide backend (R, javascript, both?) depending on usability on the long term.

Use initials in the contributor roles output

Users should be able to choose to use initials in the contributor roles output. I think a simple toggle button should be enough.
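A toggle like this could be backed by a small helper. The following base R sketch assumes names arrive as a single string; to_initials() and format_name() are hypothetical helpers, not existing tenzing functions.

```r
# Turn "Marton Kovacs" into "M.K." by taking the first letter of each part.
to_initials <- function(full_name) {
  parts <- strsplit(trimws(full_name), "\\s+")[[1]]
  paste0(toupper(substr(parts, 1, 1)), ".", collapse = "")
}

# The `initials` argument stands in for the state of the toggle button.
format_name <- function(full_name, initials = FALSE) {
  if (initials) to_initials(full_name) else full_name
}

format_name("Marton Kovacs", initials = TRUE)
format_name("Marton Kovacs", initials = FALSE)
```

Two contributors with the same initials would become indistinguishable, which the app might want to warn about.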

Collect usage stats

We need usage statistics to report when applying for grants.

The shinyapps.io deployment provides a logfile of the last 3 months, by going to Applications->tenzing->Logs. Every time someone or something visits the website, a bunch of stuff is spat to the logfile, but that isn't very useful because that could be just a bot or a person checking out the site without using it.

I have tried to work out whether any particular log message is spat out when I load a contributor table into the app. If someone pastes a URL of a Google Sheet, then a message like this is generated:

✔ Reading from "Copy of contributors_table_template".
✔ Range ''Sheet1''.

However I haven't found any diagnostic message that pops up when I load a local file. I initially thought that the following was a telltale, but it seems to happen sometimes even when you're not doing any loading:

2022-06-21T02:41:25.282779+00:00 shinyapps[4527698]: Warning in file(con, "r") :
2022-06-21T02:41:25.282839+00:00 shinyapps[4527698]:   file("") only supports open = "w+" and open = "w+b": using the former

I'm guessing that we can modify our code to get a message recorded to the log when a user loads a contributor table, possibly by issuing a warning ourselves, e.g.

warning("A warning")

According to the shinyapps.io manual, print, cat, and message will also work. But this shinyapps.io log will only give us records of the previous 3 months.

To have records for longer than 3 months, this page says you can use the shinylogs package's function store_googledrive.
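As a rough illustration of the logging idea, a helper like the one below could be called from the server code whenever a contributor table is loaded. It relies only on base R's message(); the function name log_table_load(), the "[tenzing-usage]" tag and the logged fields are assumptions, chosen so that entries are easy to filter out of the log and contain no personal data.

```r
# Hypothetical helper: emit one tagged log line per loaded contributor table.
log_table_load <- function(origin = c("local_file", "google_sheet"), n_rows) {
  origin <- match.arg(origin)
  entry <- sprintf(
    "[tenzing-usage] %s origin=%s rows=%d",
    format(Sys.time(), "%Y-%m-%dT%H:%M:%S"),
    origin,
    as.integer(n_rows)
  )
  # message() writes to stderr, like warning(), but without the warning prefix.
  message(entry)
  invisible(entry)
}

log_table_load("local_file", n_rows = 5)
```

Counting lines that contain the tag would then give the number of tables loaded within the retention window of the log.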

Spreadsheet name

"infosheet" seems not to be a good name; write propositions in the comments.

accept entry without affiliation

sometimes only looking at contribution statement, so not needing the affiliation.
Change from error to warnings when there is no affiliation

end-to-end authoring tools, export to journals

ready:

Overleaf (npg related)
authorea (Wiley)

planned:

https://substance.io/ (previously texture project by coko, jatsxml writing)

open source, markdown based, mostly planned submission:

manubot (github-python)
JOOS (github-python)
modern publishing project: https://oa-pub.hos.tuhh.de/en/#hero (gitlab-based pandoc scholar)
pubpub.org.?
Coko Foundation’s solution (Wax/Editoria https://coko.foundation/category/wax-editor/)

json -> yaml first, and use pandoc scholar ?

I see that the different outputs are created from the spreadsheet information.

I would take some time to think about whether, in the long term, it would be better to produce the yaml version (i.e., a list in R) and then use pandoc/pandoc scholar to create the other outputs. This would probably solve #31 and many other things that might come up once we try to incorporate more info (orcid number, funding, non-author contributors,...)

JATS-XML implementation

As pointed out by a user on Twitter we currently output author names like so

<contrib>
  <name surname="Aczel" given-names="Balazs"/>
</contrib>

when according to NISO JATS 1.2 (as also recommended for CRediT) it should be

<contrib contrib-type="author" corresp="no">
  <name>
    <surname>Aczel</surname>
    <given-names>Balazs</given-names>
  </name>
</contrib>

Also note that the contrib tag has a corresp attribute, which we currently do not use although the information is included in the spreadsheet.
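A minimal base R sketch of producing the nested element, including the corresp attribute, is shown below. The functions escape_xml() and jats_contrib() are hypothetical and build the markup with plain string formatting; this is not how tenzing generates its XML.

```r
# Escape the characters that are not allowed in XML text content.
escape_xml <- function(x) {
  x <- gsub("&", "&amp;", x, fixed = TRUE)
  x <- gsub("<", "&lt;", x, fixed = TRUE)
  gsub(">", "&gt;", x, fixed = TRUE)
}

# Build one <contrib> element with nested <surname> and <given-names>.
jats_contrib <- function(surname, given_names, corresponding = FALSE) {
  paste(
    sprintf(
      '<contrib contrib-type="author" corresp="%s">',
      if (corresponding) "yes" else "no"
    ),
    "  <name>",
    sprintf("    <surname>%s</surname>", escape_xml(surname)),
    sprintf("    <given-names>%s</given-names>", escape_xml(given_names)),
    "  </name>",
    "</contrib>",
    sep = "\n"
  )
}

cat(jats_contrib("Aczel", "Balazs"), "\n")
```

For production use, a dedicated XML library would be a safer choice than string formatting.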

Formatted XML preview

Hi Frederik,

I mentioned to you earlier that I could not find a good way to show the preview of the XML document output. Right now, we are showing the document as a text output (see).

I was trying to use the htmltidy package as it has a shiny feature but I could not get it working. (see).

Can you take a look at this, please?

Thanks,
Marton

Validation rejects authors without middle names

Hi! I am just trying out tenzing for the first time (first manuscript since you released it). Some of my authors do not have middle names, so I cannot complete the form without the "missing middle name" error message appearing.

This is stopping me generating a CREDIT report using tenzing (although the spreadsheet template was still very helpful).

"?" help links don't work

I see now there are large red question marks on the homepage of tenzing which I suppose should be hyperlinks for help. But both in Safari and in Chrome, both for the live site and locally when I pull the repo, nothing happens when I click on them.

error message on googlesheet with restricted permission

We should print a more user-friendly error message if the google sheet is not viewable by all.
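One possible shape for this is a thin wrapper that catches whatever error the reading function throws and replaces it with a readable message. In the sketch below, read_sheet_safely() is a hypothetical helper and the wording of the message is only a suggestion; the reader is passed in as an argument, so in the app it could be googlesheets4::read_sheet.

```r
# Hypothetical wrapper: call `reader(url)` and translate any failure
# into a message that tells the user what to check.
read_sheet_safely <- function(url, reader) {
  tryCatch(
    reader(url),
    error = function(e) {
      stop(
        "The Google Sheet could not be read. Please check that the link is ",
        "correct and that the sheet is shared so that anyone with the link ",
        "can view it.",
        call. = FALSE
      )
    }
  )
}

# Example with a stand-in reader that always fails, as a restricted sheet would.
restricted_reader <- function(url) stop("403: permission denied")
tryCatch(
  read_sheet_safely("https://docs.google.com/spreadsheets/d/example", restricted_reader),
  error = function(e) conditionMessage(e)
)
```

The wrapper hides the original error, so logging conditionMessage(e) before calling stop() may be useful for debugging.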

Feature request: tickbox to list authors in alphabetical order

@anne-urai requested this because her Int'l Brain Laboratory consortium lists authors that way

Add Oxford comma to outputs

To match APA style (but not Nature Publishing Group style), use Oxford comma for output lists such as the list of names:
Marton Kovacs, Alex Holcombe, and Balazs Aczel

This should be very low priority.
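For reference, joining names with an Oxford comma takes only a few lines of base R. The helper name oxford_join() is hypothetical.

```r
# Join a character vector as "A", "A and B", or "A, B, and C".
oxford_join <- function(x) {
  n <- length(x)
  if (n == 0) return("")
  if (n == 1) return(x)
  if (n == 2) return(paste(x, collapse = " and "))
  paste0(paste(x[-n], collapse = ", "), ", and ", x[n])
}

oxford_join(c("Marton Kovacs", "Alex Holcombe", "Balazs Aczel"))
```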

City and country as a must in affiliations

For most journals, city and country should be included in the affiliation. Based on the current layout of tenzing it would be quite hard to make this a requirement in the contributors' table. Maybe we can add checks or add tips and tricks info to the app so that users know what to look out for.

special characters

Hi! Thanks for this useful tool :)
I'm having issues with special characters, e.g. đ, š, ć, ç, ô, etc.: they don't show up in the converted text.
Is there something I can do to prevent this?

Add link to Tenzing app to Google sheet template

I store all my Tenzing sheets in a Google drive folder.
When I open a Google sheet, it would be convenient to get to the Tenzing app with a link.
(Even better: the link automagically loads the Google sheet it came from. I have no idea if that is possible. The Google sheet would have to access its own URL and build that into the link, and the app would need to parse the URL parameter in the link.)
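The two halves of this idea, building a link that carries the sheet URL and reading it back in the app, can be sketched in base R as follows. The query parameter name "sheet" and both helper functions are hypothetical; the app does not currently accept such a parameter. Inside a Shiny app, the query string would typically come from session$clientData$url_search.

```r
# Build a link to the app with the sheet URL percent-encoded as a parameter.
build_app_link <- function(sheet_url, app_url = "https://tenzing.club/") {
  paste0(app_url, "?sheet=", utils::URLencode(sheet_url, reserved = TRUE))
}

# Extract and decode the hypothetical "sheet" parameter from a query string.
parse_sheet_param <- function(query) {
  query <- sub("^\\?", "", query)
  pairs <- strsplit(strsplit(query, "&", fixed = TRUE)[[1]], "=", fixed = TRUE)
  for (p in pairs) {
    if (length(p) == 2 && p[1] == "sheet") {
      return(utils::URLdecode(p[2]))
    }
  }
  NULL  # parameter not present
}

link <- build_app_link("https://docs.google.com/spreadsheets/d/abc123/edit")
parse_sheet_param(sub("^[^?]*", "", link))
```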

merge with contributor role creator ?

see https://colomb.shinyapps.io/contributorlist_creator/

and the corresponding https://github.com/open-science-promoters/contibutor_manager/

got to work with latest credit and CRO ontologies (download the OBO files)
import author information from ORCID (name, institution, funding information)

I will definitely look at your outputs and copy them, but would love to actually merge the two applications (at least their backends).

correct credit column names of spreadsheet template

The column names do not fit the credit role perfectly (Capitals wrong)

Next meeting agenda

Possibilities and directions have multiplied enough that this project has become a real “project management” challenge. Probably we need to formulate a tentative roadmap. It seems we need to schedule another Zoom, with part of the agenda to

decide on a process for formulating the roadmap
have a process for keeping ourselves on track in the future.

set meeting time: https://www.when2meet.com/?10424191-vXBtB

add row for author type

This would allow fitting different author lists in one spreadsheet:
manuscript author: author type = authors
data collector (for example): author type = contributors

While the app does not have to deal with it yet, it would be good to have it in the template.
(?)

orcid number column

the template should get an orcid id column.

We can work on a way to use the information in the output later.

Support CRO roles

I'll summarize our discussion below.

CRO (Contributor Role Ontology) is an extension of CRediT, so perhaps it will gain traction, but at this point we should probably wait and see.

They wrote this manuscript about it using Manubot.
Manubot also uses a YAML metadata format for authors and contributor roles. The YAML format is very similar to that used in papaja, so we could, with not too much extra work, create a Manubot YAML output as well.

Data pull: from orcid

We could think of a way to populate the spreadsheet with information collected on orcid.

We could use the default data input from the contributor list creator (return first name, surname, present affiliation, first funding entry).

At first, we could have simply a "add one or several author" function, so we do not have to deal with updating information that could have changed on orcid.

Get bioRxiv output

bioRxiv has a tsv template that we could offer as an output

Feature request: output image with contribution table

Beyond outputting text-based contribution statements, export a figure that shows all contributions in a table.
Idea by @nsteinme, described here: https://twitter.com/SteinmetzNeuro/status/1147241128858570752

See e.g. https://elifesciences.org/articles/63711, figure 6.