checklist-recipe's People

Contributors

damianooldoni, lienreyserhove, peterdesmet

checklist-recipe's Issues

Remove website

Having the docs directory as part of the recipe is rather annoying from a template perspective. You end up with a whole bunch of files in your repository that are not relevant to your project.

Since we don't really showcase a website for the checklist-recipe, but rather link to some example ones at https://github.com/trias-project/checklist-recipe/wiki/Examples, I would just remove the docs directory, but still mention it in the repository overview and leave the site.yml and _index.Rmd files in place to generate it.

Create dwc_mapping.Rmd

Should be functional (i.e. it standardizes the example source data to Darwin Core) and act as an example that can be built upon.

Functionality to include:

  • digest to generate taxonID
  • Remove duplicates to create unique records in taxon core
  • name_parse to get genus and taxonRank
  • static mapping, e.g. datasetName
  • as is mapping, e.g. locality
  • recode mapping: threatStatus from full text to codes
  • case_when mapping: if locality is empty and countryCode is e.g. BE => use Belgium for locality
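
The functionality listed above could be sketched as follows. This is a minimal illustration, not the final dwc_mapping.Rmd: the example rows, the dataset name and the threat status codes are assumptions, and name parsing is left out. It assumes the dplyr and digest packages are installed.

```r
library(dplyr)
library(digest)

# Hypothetical source data; column names follow the data template fields
input_data <- tibble(
  scientific_name = c("Elodea canadensis", "Elodea canadensis", "Procambarus clarkii"),
  country_code    = c("BE", "BE", "BE"),
  locality        = c("Flanders", "Flanders", NA),
  threat_status   = c("least concern", "least concern", "vulnerable")
)

taxon <- input_data %>%
  # Remove duplicate rows so each record is unique
  distinct() %>%
  mutate(
    # digest() is not vectorised, so hash each name separately
    taxonID = vapply(
      scientific_name, digest, character(1),
      algo = "md5", serialize = FALSE, USE.NAMES = FALSE
    ),
    # Static mapping: same value for every record (value is an assumption)
    datasetName = "Example checklist",
    # As is mapping: copy the source value unchanged
    scientificName = scientific_name,
    # Recode mapping: full text to codes (codes are illustrative)
    threatStatus = recode(threat_status, "least concern" = "LC", "vulnerable" = "VU"),
    # case_when mapping: fall back to the country name if locality is empty
    locality = case_when(
      is.na(locality) & country_code == "BE" ~ "Belgium",
      TRUE ~ locality
    )
  )
```

Hashing the scientific name keeps the taxonID stable between runs as long as the name itself does not change.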

Start a recipe document

The getting started guide is a step-by-step walkthrough of how to use the checklist-recipe to standardize your own checklist. It goes in recipe.md in the repository root.

Getting started

  1. Copy checklist-recipe (with invitation link)
  2. Clone repo locally
  3. Open RStudio
  4. Run all cells
  5. Check what happened (nothing, but everything works)

Exercise vs working mapping template

  1. Either follow the exercise (see steps there) or use the functioning dwc_mapping.Rmd

Mapping steps

Note: see https://github.com/inbo/dwc-in-R/blob/master/src/dwc-mapping.Rmd#map-to-darwin-core-archive

  • Read data
  • Pre-processing
  • Creating taxon core (use unique)
  • Creating distribution extension

3 different types of mapping

Note: see https://github.com/inbo/dwc-in-R/blob/master/src/dwc-mapping.Rmd#mapping

  • Mapping static fields
  • Mapping fields as is
  • Recoding fields

Pushing changes back to GitHub

Some basic git skills, using RStudio (alternatively GitHub Desktop)

Using IPT to publish

Links to other material

Examples

Link to existing TrIAS datasets

Create template README

Create a README that is both informative to the user and can serve as the final README of the published checklist.

Comments to user could be written as:

This is regular text

<!-- this is a remark -->

This is regular text

Things to keep in mind when writing the recipe

Use of:

  • Input vs raw data: source data
  • dataset vs dataframe: data frame when talking about R code
  • you vs we (first is correct)
  • field vs variable: dropped the use of variable in favour of term (or field)
  • recipe vs workflow: clear, recipe is the whole thing, workflow is data → script → data

Use of recode() and case_when()

I would also use case_when to map nomenclaturalCode. Also, eventDate still needs to be mapped, for which we can also use case_when.
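
For nomenclaturalCode, a case_when() mapping could look like the sketch below. The kingdom-to-code pairs shown are an assumption about what the checklist needs; other kingdoms are left empty and would have to be added.

```r
library(dplyr)

# Hypothetical taxon data frame with only the column needed here
taxon <- tibble(kingdom = c("Plantae", "Animalia", "Fungi", "Bacteria"))

taxon <- taxon %>%
  mutate(nomenclaturalCode = case_when(
    kingdom == "Animalia"              ~ "ICZN",
    kingdom %in% c("Plantae", "Fungi") ~ "ICN",
    # Anything not listed above stays empty in this sketch
    TRUE                               ~ NA_character_
  ))
```

Unlike recode(), case_when() can test any condition, so several kingdoms can share one code in a single line.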

Add section to dwc_mapping with some stats about the dataset

Such a section could be useful for populating the metadata. Examples:

  1. Number of taxa, number of species

  2. % of taxa over kingdoms and countries, with totals:

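library(dplyr)    # provides %>%
library(janitor)  # provides tabyl() and the adorn_*() helpers

# Share of taxa per kingdom within each country, with a totals row
input_data %>%
  tabyl(input_country_code, input_kingdom) %>%
  adorn_totals("row") %>%
  adorn_percentages("row") %>%
  adorn_pct_formatting(rounding = "half up", digits = 0)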
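
The first example (number of taxa, number of species) could be computed along these lines. The column names input_scientific_name and input_taxon_rank are assumed to follow the same input_ prefix as above, and the rows are made up.

```r
library(dplyr)

# Hypothetical source data with one duplicated species
input_data <- tibble(
  input_scientific_name = c("Elodea canadensis", "Elodea", "Procambarus clarkii", "Elodea canadensis"),
  input_taxon_rank      = c("species", "genus", "species", "species")
)

stats <- input_data %>%
  # Count each taxon once, even if it appears on several rows
  distinct(input_scientific_name, input_taxon_rank) %>%
  summarise(
    n_taxa    = n(),
    n_species = sum(input_taxon_rank == "species")
  )
```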

Dates are missing from the checklist

There are clear spatial boundaries to the checklist, because you specify a country_code and locality, but I think you need temporal bounds for the information. I suggest adding columns for first and last record. These would then be converted to an eventDate by the script.
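
A possible conversion is sketched below. The column names first_record and last_record are hypothetical, and the choice to write a range as first/last (an ISO 8601 interval) is an assumption about the desired eventDate format.

```r
library(dplyr)

# Hypothetical source columns holding the year of first and last record
input_data <- tibble(
  first_record = c(1950L, 2001L, NA),
  last_record  = c(2018L, NA, NA)
)

distribution <- input_data %>%
  mutate(eventDate = case_when(
    # Both years known: write them as an interval
    !is.na(first_record) & !is.na(last_record) ~ paste(first_record, last_record, sep = "/"),
    # Only the first year known: use it on its own
    !is.na(first_record) ~ as.character(first_record),
    # Otherwise leave eventDate empty
    TRUE ~ NA_character_
  ))
```

A row with only a last record is left empty in this sketch; whether that case should be mapped differently is an open question.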

To add to recipe

  • Note about encoding
  • Loaded packages should be installed first
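
Both notes could be illustrated in the recipe with something like the sketch below. The package list and the helper missing_packages() are illustrative, and the example file is written to a temporary location only so the snippet runs on its own.

```r
# Packages the mapping script loads (list is illustrative)
required_packages <- c("dplyr", "janitor", "digest")

# Hypothetical helper: which of the given packages are not installed yet?
missing_packages <- function(packages) {
  setdiff(packages, rownames(installed.packages()))
}

to_install <- missing_packages(required_packages)
if (length(to_install) > 0) {
  message("Install these packages first: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)
}

# Encoding: write a small UTF-8 file, then state the encoding when reading it
example_file <- tempfile(fileext = ".csv")
lines <- c("scientific_name,locality", "Elodea canadensis,Li\u00e8ge")
writeLines(enc2utf8(lines), example_file, useBytes = TRUE)

checklist <- read.csv(example_file, fileEncoding = "UTF-8", stringsAsFactors = FALSE)
```

Stating the encoding explicitly avoids garbled accented characters when the file was saved on a system with a different default encoding.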

Create data template file

Create a template file, named data/raw/checklist.xlsx, with the following fields:

  • scientific_name
  • kingdom
  • taxon_rank
  • country_code
  • locality
  • occurrence_status
  • establishment_means
  • threat_status
  • source
  • remarks

Not to include:

  • taxonomic_status: scope of recipe is thematic checklists, rather than taxonomic checklists. Including this would mean we also have to support synonymy links
  • vernacular_names: out of scope for now, would mean repetition of names for multiple rows of same taxon
  • event_date: very specific to TrIAS
  • habitat: useful for speciesProfile, but not used at GBIF yet, so out of scope for now
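
An empty template with the included fields could be generated as sketched below. It is written as CSV to a temporary file to keep the example free of extra dependencies; producing an .xlsx file would need an additional package, which is not shown here.

```r
# Fields to include in the template, in the order listed above
template_fields <- c(
  "scientific_name", "kingdom", "taxon_rank", "country_code", "locality",
  "occurrence_status", "establishment_means", "threat_status", "source", "remarks"
)

# Build a data frame with the right columns and no rows
template <- data.frame(matrix(nrow = 0, ncol = length(template_fields)))
names(template) <- template_fields

# Write to a temporary path for this sketch; the real file would live in data/raw/
template_path <- tempfile(fileext = ".csv")
write.csv(template, template_path, row.names = FALSE)

# Reading it back gives the header and zero rows
template_check <- read.csv(template_path)
```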

Review pages
