Manuscript: Packaging research artefacts with RO-Crate

Manuscript description

This is the manuscript source for the paper "Packaging research artefacts with RO-Crate", published in the journal Data Science (https://doi.org/10.3233/DS-210053).

The manuscript was written in Markdown in the content/ folder and built in a hybrid mode that combines Manubot and Overleaf:

  1. Overleaf uses LuaTeX to render the PDF according to the Data Science template; see https://www.overleaf.com/read/gbmzkwyhjnzc
  2. Manubot runs automatically on each Git commit and publishes the HTML to https://stain.github.io/ro-crate-paper/
  3. The Overleaf sync with GitHub is triggered manually

The text is all in https://github.com/stain/ro-crate-paper/tree/master/content as Markdown files. Note that some of them use inline LaTeX that may not render well in the HTML. Figures are also shown only in the PDF.

Why keep the HTML? The journal Data Science (https://www.iospress.nl/journal/data-science/) encourages submission in HTML, and I have been a proponent of that as well; I was even a reviewer of the cited paper on RASH. The remaining challenge is how to handle collaborative editing and references. With this hybrid approach I can choose to submit the PDF, the HTML, or both. For now I would focus on the Overleaf approach.

LaTeX requirements

If you have Docker installed, running make will build the manuscript PDF.

Repository directories & files

The directories are as follows:

  • content contains the manuscript source, which includes Markdown files as well as inputs for citations and references. See USAGE.md for more information.
  • latex contains the LaTeX files used by Overleaf, including fonts and bibliography.
  • output contains the outputs (generated files) from Manubot including the resulting manuscripts. You should not edit these files manually, because they will get overwritten.
  • webpage is a directory meant to be rendered as a static webpage for viewing the HTML manuscript.
  • build contains commands and tools for building the manuscript.
  • ci contains files necessary for deployment via continuous integration.

Local execution

The easiest way to run Manubot is to use continuous integration to rebuild the manuscript when the content changes. If you want to build a Manubot manuscript locally, install the conda environment as described in build. Then, you can build the manuscript on POSIX systems by running the following commands from this root directory.

# Activate the manubot conda environment (assumes conda version >= 4.4)
conda activate manubot

# Build the manuscript, saving outputs to the output directory
bash build/build.sh

# At this point, the HTML & PDF outputs will have been created. The remaining
# commands are for serving the webpage to view the HTML manuscript locally.
# This is required to view local images in the HTML output.

# Configure the webpage directory
manubot webpage

# You can now open the manuscript webpage/index.html in a web browser.
# Alternatively, open a local webserver at http://localhost:8000/ with the
# following commands.
cd webpage
python -m http.server

Sometimes it's helpful to monitor the content directory and automatically rebuild the manuscript when a change is detected. The following command, while running, will trigger both the build.sh script and manubot webpage command upon content changes:

bash build/autobuild.sh
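
To illustrate the idea of watching the content directory and rebuilding on change, here is a minimal polling sketch in POSIX shell. It is not the contents of build/autobuild.sh, which may work differently; the function names content_fingerprint and watch_and_rebuild and the polling interval are assumptions made for this example.

```shell
#!/usr/bin/env sh
# Sketch only: NOT the actual build/autobuild.sh script.

# content_fingerprint DIR
# Prints a checksum that changes when a file under DIR is added,
# removed, or modified (hypothetical helper).
content_fingerprint() {
  find "$1" -type f -exec cksum {} + | sort | cksum
}

# watch_and_rebuild DIR INTERVAL
# Polls DIR every INTERVAL seconds and reruns the two build steps
# from this README whenever the fingerprint changes (hypothetical helper).
watch_and_rebuild() {
  last=$(content_fingerprint "$1")
  while sleep "$2"; do
    now=$(content_fingerprint "$1")
    if [ "$now" != "$last" ]; then
      last=$now
      bash build/build.sh && manubot webpage
    fi
  done
}

# Usage, from the repository root (runs until interrupted with Ctrl-C):
#   watch_and_rebuild content 2
```

Polling is simple and portable, at the cost of a short delay and some repeated checksum work; an event-based file watcher would react faster.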

Continuous Integration

Whenever a pull request is opened, CI (continuous integration) will test whether the changes break the build process to generate a formatted manuscript. The build process aims to detect common errors, such as invalid citations. If your pull request build fails, see the CI logs for the cause of failure and revise your pull request accordingly.

When a commit to the master branch occurs (for example, when a pull request is merged), CI builds the manuscript and writes the results to the gh-pages and output branches. The gh-pages branch uses GitHub Pages to host the HTML manuscript at https://stain.github.io/ro-crate-paper/

For continuous integration configuration details, see .github/workflows/manubot.yaml if using GitHub Actions or .travis.yml if using Travis CI.

License

Except when noted otherwise, the entirety of this repository is licensed under a CC BY 4.0 License (LICENSE.md), which allows reuse with attribution. Please attribute by linking to https://github.com/stain/ro-crate-paper or the DOI of the final paper.

Since CC BY is not ideal for code and data, certain repository components are also released under the CC0 1.0 public domain dedication (LICENSE-CC0.md). All files matched by the following glob patterns are dual licensed under CC BY 4.0 and CC0 1.0:

  • *.sh
  • *.py
  • *.yml / *.yaml
  • *.json
  • *.bib
  • *.tsv
  • .gitignore

These files are licensed under the SIL Open Font License, see LICENSE-SIL.md:

  • *.ttf
  • *.otf

All other files are only available under CC BY 4.0, including:

  • *.md
  • *.html
  • *.pdf
  • *.docx
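
The glob patterns above can be read as a lookup from file name to license. The following shell sketch shows one way to express that lookup; the function name license_for is hypothetical and not part of this repository, and the sketch ignores any per-file exceptions covered by "except when noted otherwise".

```shell
#!/usr/bin/env sh
# Hypothetical helper: print the license(s) that apply to a file path,
# following the glob patterns listed in this README.
license_for() {
  name=${1##*/}   # strip leading directories, keep only the file name
  case "$name" in
    *.sh|*.py|*.yml|*.yaml|*.json|*.bib|*.tsv|.gitignore)
      echo "CC BY 4.0 and CC0 1.0" ;;
    *.ttf|*.otf)
      echo "SIL Open Font License" ;;
    *)
      echo "CC BY 4.0" ;;
  esac
}

# Example lookups; example.ttf is a made-up file name for illustration.
license_for build/build.sh
license_for LICENSE.md
license_for latex/example.ttf
```

Matching on the file name alone mirrors how the patterns are written, so a script in any subdirectory is classified the same way.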

Please open an issue for any question related to licensing.

ro-crate-paper's Issues

CWL / NextFlow difference in BCO section is confusing

The last paragraph in the section "Regulatory Sciences" about BCO mentions CWL twice, while the figure and its caption talk only about Nextflow, which I found confusing. I think the figure caption has the balance right, but the paragraph could be made a bit more generic, e.g.:

Specifically, a BCO alone is insufficient for reliable re-execution of a workflow, which would need a compatible workflow engine depending on the workflow definition language. IEEE 2791 recommends using Common Workflow Language [55] for interoperable pipeline execution, but supports any type of engine (from workflow systems like Galaxy and Nextflow to a simple script). Workflows may in turn rely on tool packaging in software containers using e.g. Docker or Conda. Thus, we can consider BCO RO-Crate as a stack: transport-level manifests of files (BagIt), provenance, typing and context of those files (RO-Crate), workflow overview and purpose (BCO), workflow definition (e.g. CWL) and tool distribution (Docker).