ioos / bio_mobilization_workshop

Marine data mobilization workshop for Biology and Ecosystem Essential Ocean Variables (Bio-Eco EOV) as a Contribution to the UN Decade on Ocean Science for Sustainable Development

Home Page: https://ioos.github.io/bio_mobilization_workshop/

License: Other

Ruby 0.28% Makefile 2.22% HTML 48.38% SCSS 4.20% CSS 1.83% JavaScript 0.71% Python 38.98% R 2.91% Shell 0.48%

bio_mobilization_workshop's Introduction

Marine Biological Data Mobilization Workshop 2023

Workshop Digital Object Identifiers (DOI):

DOI	Year
	ALL
	2024
	2023
	2022

This repository is for participants to get general information and ask questions related to the Biological Data Mobilization Workshop (via the issues section).

Marine Data Mobilization Workshop for Biology and Ecosystem Essential Ocean Variables (Bio-Eco EOV) is a Contribution to the UN Decade on Ocean Science for Sustainable Development and the Marine Life 2030 Decade Action. The workshop is jointly hosted by CIOOS, IOOS, Hakai, MBON, OBIS-USA, and OTN.

This workshop is a small hands-on, interactive virtual workshop focused on mobilizing marine biological observation datasets to the Ocean Biodiversity Information System (OBIS) by helping data providers standardize their data using Darwin Core. This includes species observations from any type of sampling methodologies (e.g. visual surveys, net tows, microscopy, fish trawls, imaging, 'omics, acoustics, telemetry).

Workshop website: https://ioos.github.io/bio_mobilization_workshop/

Contributing

We welcome all contributions to improve the lesson! Maintainers will do their best to help you if you have any questions, concerns, or experience any difficulties along the way.

We'd like to ask you to familiarize yourself with our Contribution Guide and have a look at the [more detailed guidelines][lesson-example] on proper formatting, ways to render the lesson locally, and even how to write new episodes.

Please see the current list of [issues][FIXME] for ideas for contributing to this repository. For making your contribution, we use the GitHub flow, which is nicely explained in the chapter Contributing to a Project in Pro Git by Scott Chacon. Look for the tag . This indicates that the maintainers will welcome a pull request fixing this issue.

Maintainer(s)

Current maintainers of this lesson are

@MathewBiddle
@7yl4r
@albenson-usgs

Authors

A list of contributors to the lesson can be found in AUTHORS

Citation

To cite this lesson, please consult with CITATION

Deploying site locally

See this documentation.

Navigate to the folder that contains the lesson, and use bundle exec jekyll serve to preview the lessons.

If you change headers or menus, run bundle exec jekyll clean before serving.

How this repo is organized

At the completion of each event, this repository will be tagged and a release will be created with the year of the event (following the Calendar Versioning scheme YYYY). A DOI will be minted through Zenodo (see DOI table above). Since the workshop is intended to provide the most up-to-date information on aligning data to Darwin Core, the maintainers decided that we will continually build and update these materials instead of providing access to previous years' materials in subsequent yearly websites. If you would like to rebuild a specific year's website, check out a specific release (e.g. $ git checkout 2023) and build the website from that content.

bio_mobilization_workshop's People

Contributors

timvdstap cperaltab mathewbiddle jdpye rskelly ymgan albenson-usgs dimevil laurabrenskelle

bio_mobilization_workshop's Issues

Make `gh-pages` the default branch

So, it looks like we didn't use the main branch as we were thinking (documented in CONTRIBUTING.md).

I'm thinking we should make the gh-pages branch the default for this repository. It will make contributing easier and allow us to see more details on how the repo might be used (in the insights section). Other than the binder environment file (which didn't work all that great to begin with - and it could be copied to gh-pages) there is no other purpose for main.

I can make gh-pages the default, but should we delete main?

update CITATION

Before TDWG we need to update the citation file. Maybe push to zenodo and get a DOI too?

address build website GHA warnings

I've seen some warnings pop up that we should probably look at. I don't think they need to be fixed ASAP, per se.

https://github.com/ioos/bio_mobilization_workshop/actions/runs/4449218838

build-website

Node.js 12 actions are deprecated. Please update the following actions to use Node.js 16: actions/setup-python@v2. For more information see: https://github.blog/changelog/2022-09-22-github-actions-all-actions-will-begin-running-on-node16-instead-of-node12/.

build-website

The `set-output` command is deprecated and will be disabled soon. Please upgrade to using Environment Files. For more information see: https://github.blog/changelog/2022-10-11-github-actions-deprecating-save-state-and-set-output-commands/
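For reference, a minimal sketch of what "Environment Files" means in practice: a step appends `name=value` lines to the file whose path the runner provides in the `GITHUB_OUTPUT` environment variable. The helper name `write_step_output` is made up for illustration, and this is not taken from the repository's workflow.

```python
def write_step_output(name: str, value: str, output_path: str) -> None:
    """Append one `name=value` line to a GitHub Actions output file.

    `output_path` is expected to be the value of the GITHUB_OUTPUT
    environment variable on a runner (an assumption of this sketch).
    """
    if "\n" in value:
        # Multi-line values need the delimiter syntax, which is not covered here.
        raise ValueError("multi-line values are not supported by this sketch")
    with open(output_path, "a", encoding="utf-8") as handle:
        handle.write(f"{name}={value}\n")
```

On a runner this could be called as `write_step_output("count", "0", os.environ["GITHUB_OUTPUT"])`; a later step would then read the value as `steps.<id>.outputs.count`.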

Create environment files for R and Python

Create environment files which can be picked up by Binder to spin up a Python or R environment, if folks have trouble locally.

Python
R
Do we want a combined environment like the IOOS conda env?

Add link(s) to bio_data_guide datasets

The datasets in https://github.com/ioos/bio_data_guide/tree/main/datasets can be linked to as examples where needed. Specifically the example_* ones could be useful. I am not sure where it makes sense to put those links.

Add more details about the workshop to index or about

We also have information in https://ioos.github.io/bio_mobilization_workshop/about/

It would be helpful if we added this information to one of those pages.

What this workshop will cover:

Darwin Core and the required terms for OBIS and GBIF.
Typical data cleaning tasks needed to standardize the data.
Getting your data into a final Darwin Core format.
Common QA/QC steps, data enhancement, and validation tools.
Required metadata information.
How to get your data into the Integrated Publishing Toolkit.
Tools that will help in all of the above processes.

The goal is that by the end of the workshop you will have a dataset in a final standardized state and shared to OBIS. We are hoping to address some of the blockers that you identified, including: lack of time, training, and specific formatting questions.

We have a short time together, so our focus will be hands-on work in breakout rooms using the dataset you bring to the workshop. We will not have many presentations, and they will be relatively short. Instead, we will have large portions of time for you to work on your data and ask questions when you hit a stumbling block. Therefore, if you do not have a dataset to work on, you may not find this workshop a good use of your time.

What this workshop will not cover:

What is OBIS?
Rationale for sharing data with OBIS.
Using data that’s already in OBIS.

If you would like to learn more about OBIS and a short rationale for sharing data to it, please watch this two minute video and this two minute video and share them with those you want to work with to share data.

Fix section 4 title on main page

Section 4 should reflect the title that we have updated it to: "Metadata, QA/QC, and publishing". I tried to edit this myself on the main page but I think it's actually pulling from the URLs? Or at least I couldn't figure out where the text was to update it.

Do we need `main` branch anymore?

This repo is primarily for hosting the website deployed from the gh-pages branch. Do we even need main anymore?

I think we can get the mybinder stuff to work from gh-pages.

Create eDNA 'extras' page

Need to create the extras page then add content

make breakouts more engaging

moved from #24

If we continue to get folks to provide some boilerplate info on the dataset they plan to work on, maybe we can make breakout rooms topical based on data types instead of lesson pages, e.g. passive acoustic, eDNA, survey, trawl, etc.

Just thinking of other ways the breakout rooms could be more engaging...

Add something about how to continue the conversation after the workshop

We should include a page(?) on where to continue having the discussions after the workshop.

Need to think about where to put it?

Clean up readme

The readme has a bunch of information we don't necessarily need anymore.

Let's clean this up to only contain the pertinent information:

Title
Description
- What do we want to use main for? According to CONTRIBUTING we were hoping to have folks upload their data/code. Is that still the case?
- I think How this repo is organized should be in the description/readme.
mybinder environment links
Link to the workshop website: https://ioos.github.io/bio_mobilization_workshop/

Let's keep it clear and concise.

update schedule

Link for Post-workshop help & survey (https://ioos.github.io/bio_mobilization_workshop/07-post-workshop/index.html) should be to https://ioos.github.io/bio_mobilization_workshop/08-post-workshop/index.html

Add extras page

GBIF DwC Validator
Search NERC

Update Code of Conduct

CoC points to carpentries to report. As we're not officially carpentries, we should have another mechanism to report.

https://ioos.github.io/bio_mobilization_workshop/CODE_OF_CONDUCT.html

Tag 2023 materials, mint doi, attach html pages as assets to release?

2023 is now complete. We should tag the website materials. I think zenodo will automatically mint a new doi for a new version, but also provide a collection DOI for all versions (link).

Zenodo archives your repository and issues a new DOI each time you create a new GitHub release. Follow the steps at "Managing releases in a repository" to create a new one.

For example,

https://doi.org/10.5281/zenodo.7401979 - cite all versions
https://doi.org/10.5281/zenodo.7401980 - cite just 2022

Also, I wonder if there is a way to add the .html files to the release. Just thinking in case jekyll disappears we wouldn't be able to rebuild the website again. If we drop the .html files in as assets to the release, folks would be able to pull those up in a browser. Just not sure if those will be pulled over to the zenodo DOI package.

Review Intro to Darwin Core page

https://ioos.github.io/bio_mobilization_workshop/01-introduction/index.html

Standardize I/you language in the POST survey

from Carolina Peralta:

Some questions are written in the first person (I) and others in the second person (you). I suggest standardizing them (e.g. "Did this workshop help me move past the blockers I identified?" vs. "How comfortable are you with aligning data to Darwin Core?"). Maybe phrase all questions in the first person ("I")?

Survey form is editable in gforms. Let me know if you need edit access granted.

set up checklist / overview .md files for topics & exercises

I want to create .md files that give an overview & relevant links for the topics and exercises listed in the agenda.

pre and post-workshop survey

OBIS has some?

better breakout room descriptions

copied from #24

Create better descriptions of the breakout rooms so people know what room to go to ask their question.

Move from lessons template to workshop template for website?

Right now we are using The Carpentries Lesson template. @albenson-usgs shared that there is The Carpentries Workshop template.

To do this we need to:

Compare workshop template with lesson template to identify differences
Copy out the markdown files we've made/edited
replace all files with workshop template files
replace appropriate markdown files (from above)
update README, CONTRIBUTING, etc.

add example check for `individualCount` to QA/QC

Add an example check for `individualCount`, as it should be an integer.

Originally posted by @MathewBiddle in #57 (comment)
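A possible starting point for such a check, as a rough sketch only. It assumes the occurrence records are available as dictionaries keyed by Darwin Core term, treats empty values as "not recorded", and also flags negative numbers; those choices and the function name are assumptions, not something decided in this issue.

```python
def invalid_individual_counts(rows):
    """Return (row_index, value) pairs where individualCount is not a whole number.

    Empty values are skipped; negative numbers and decimals are flagged
    (both are assumptions of this sketch).
    """
    problems = []
    for index, row in enumerate(rows):
        value = str(row.get("individualCount", "")).strip()
        if value == "":
            continue  # nothing recorded for this row
        # isascii() guards against Unicode digits such as superscripts.
        if not (value.isascii() and value.isdigit()):
            problems.append((index, value))
    return problems
```

Any row index it returns points at a value such as `2.5` or `five` that would need to be corrected, or moved to another term, before publishing.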

outdated information on Darwin Core and Extension Schemas page

I was skimming through the pages for the workshop and noticed some information that may be outdated. I don't know what the correct information should be, so I could not submit a pull request for the Darwin Core and Extension Schemas page.

Occurrence Core + extensions

Using the occurrence core plus relevant extensions means that you can capture more of the data that’s been recorded.

The link https://tools.gbif.org/dwca-validator/extensions.do now points to the new data validator login page rather than to the list of extensions.

In the occurrence extension table, row basisOfRecord:

Pick from these controlled vocabulary terms: HumanObservation, MachineObservation, PreservedSpecimen, LivingSpecimen, FossilSpecimen

There are more vocabularies for basisOfRecord (e.g. MaterialSample)
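To illustrate how a vocabulary check on basisOfRecord could look, here is a small sketch. The allowed set contains only the values named in this issue and is probably not the complete Darwin Core vocabulary, so treat it as a placeholder to be replaced with the authoritative list; the function name is also hypothetical.

```python
# Values mentioned in this issue only; the real vocabulary may contain more.
KNOWN_BASIS_OF_RECORD = {
    "HumanObservation",
    "MachineObservation",
    "PreservedSpecimen",
    "LivingSpecimen",
    "FossilSpecimen",
    "MaterialSample",
}


def unknown_basis_of_record(values, allowed=KNOWN_BASIS_OF_RECORD):
    """Return the distinct values that are not in the allowed set, sorted."""
    return sorted({value for value in values if value not in allowed})
```

The comparison is case-sensitive on purpose, so a value like `humanobservation` is reported instead of being silently accepted.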

add country to pre-workshop survey

from #24

Make sure to ask for country in the pre-workshop survey

Update landing page with the next workshop dates and information

Now that we know we will be doing it again, we should update the landing page to include dates and other relevant information.

Things to think about for next time

pre-workshop survey: https://docs.google.com/forms/d/e/1FAIpQLSeLsNrcwfU2EGtObf_E97juI55LZGR2Fc8VsPpTSpRmXG7ufA/viewform?usp=sf_link

How to Capture participants' stories?

Participants likely have management applications/successes/failures to share.

How can we capture this information in a way that could be used in the motivation section of future proposals?

Do we want statistics like "75% of participants reported that this workshop will help them achieve goal X"?

Review DwC and Extension Schemas

https://ioos.github.io/bio_mobilization_workshop/04-create-schema/index.html

Content for setup.md

https://ioos.github.io/bio_mobilization_workshop/setup.html

Do we want to put anything in the setup page for the workshop?

Typically this contains information on how to install software packages required for the lesson. See Software Carpentry R setup page. I think adding a section for RStudio and a section for Python would be helpful for those who stumble upon the site.

add youtube links on each topical page

from Naomi:

rather than just in the playlist so that video and writeup are together on one page

Create a release, tag, and mint DOI

I'd like to have a DOI for this repo.

TODO:

Create a release
tag said release
mint doi through Zenodo
update citation.cff

Review continuing the conversation page

review the Continuing the Conversation page and update with new post-survey once we have it.

Create website using github pages

We have a couple examples to borrow from:

Add citations for xkcd comics

@albenson-usgs mentioned that we don't have citations for the comics presented in the workshop. Fortunately, we are only providing the links in the markdown files and letting jekyll do the rendering for the website. But, we should be adding a citation underneath each of the images to appropriately cite xkcd.

From https://xkcd.com/about/:

Note: You are welcome to reprint occasional comics pretty much anywhere (presentations, papers, blogs with ads, etc). If you're not outright merchandizing, you're probably fine. Just be sure to attribute the comic to xkcd.com.

get ratings for different sections

moved from #24

Keeping in mind that we don't want to ask too many questions, do we want to ask for ratings on the different sections next time, so we can figure out which ones need improvement?

Change repository name

I propose we change this repository name to bio_mobilization_workshop.

Oceanhackweek had a nice approach of using the active gh-pages branch for the current year. Once the year is complete, move the content to a new branch that indicates the year (eg. 2022_site). Then work on/update the gh-pages branch for the next iteration of the workshop.

We can do something similar for main as well to preserve the datasets/code that were worked on that year. Give the branch a name like 2022_main.

I would like this to be somewhat futureproof in that we can use it for other workshops we might want to host in the future.

Post workshop certification of attendance?

What happened to our discussions about a certification of sorts?

Can we create some PDF that we send folks after the workshop to say they attended? A certification of attendance...

add GTKY+networking time

moved from #24

Should we make the workshop longer (a full week?) to build in GTKY/ networking time?

Speed up build time for webpage

It takes a while (~3 mins) to build and publish site updates. We probably don't need all the RMarkdown stuff...

bio_mobilization_workshop/.github/workflows/website.yml

Lines 44 to 89 in 6bb3039

      - name: Look for R-markdown files
        id: check-rmd
        run: |
          echo "name=count::$(shopt -s nullglob; files=($(find . -iname '*.Rmd'));echo ${#files[@]})" >> $GITHUB_OUTPUT

      - name: Set up R
        if: steps.check-rmd.outputs.count != 0
        uses: r-lib/actions/setup-r@v2
        with:
          r-version: 'release'

      - name: Restore R Cache
        if: steps.check-rmd.outputs.count != 0
        uses: actions/cache@v2
        with:
          path: ${{ env.R_LIBS_USER }}
          key: ${{ runner.os }}-${{ hashFiles('.github/R-version') }}-1-${{ hashFiles('.github/depends.Rds') }}
          restore-keys: ${{ runner.os }}-${{ hashFiles('.github/R-version') }}-1-

      - name: Install needed packages
        if: steps.check-rmd.outputs.count != 0
        run: |
          source('bin/dependencies.R')
          install_required_packages()
        shell: Rscript {0}

      - name: Query dependencies
        if: steps.check-rmd.outputs.count != 0
        run: |
          source('bin/dependencies.R')
          deps <- identify_dependencies()
          create_description(deps)
          use_bioc_repos()
          saveRDS(remotes::dev_package_deps(dependencies = TRUE), ".github/depends.Rds", version = 2)
          writeLines(sprintf("R-%i.%i", getRversion()$major, getRversion()$minor), ".github/R-version")
        shell: Rscript {0}

      - name: Install system dependencies for R packages
        if: steps.check-rmd.outputs.count != 0
        run: |
          while read -r cmd
          do
            eval sudo $cmd || echo "Nothing to update"
          done < <(Rscript -e 'cat(remotes::system_requirements("ubuntu", "20.04"), sep = "\n")')
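Before removing the R steps, it may be worth confirming locally that the repository really has no R Markdown files. The sketch below counts them in roughly the same way as the first step's `find . -iname '*.Rmd'`, except that it only counts regular files; the function name is made up for this example.

```python
from pathlib import Path


def count_rmd_files(root="."):
    """Count regular files under `root` whose extension is .Rmd, ignoring case."""
    return sum(
        1
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() == ".rmd"
    )
```

If this returns 0 for a checkout of the gh-pages branch, the steps guarded by the `count` condition have nothing to do, which is the situation the issue is hinting at.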

Create separate pages for different events?

Say we were to hold this workshop again, how can we host the different workshops materials so they are available online?

I think Oceanhackweek did some neat tricks to make this happen?

cc: @ocefpaf

Review Metadata and Publishing page

https://ioos.github.io/bio_mobilization_workshop/07-validation-and-publishing/index.html

Update/Create favicons

Anyone have a favicon they want to use? For the browser tab and header of the webpage.

IOOS has a couple here

The existing favicons (for the carpentries) can be found here https://github.com/ioos/bio_mobilization_workshop/tree/gh-pages/assets/favicons

To do

create/decide on a favicon
make a new directory (bmw?) and add our favicons to it.
update _config.yml to reference new directory

mybinder environments not building

Error message: Error during build: .0.0 is not valid SemVer string

I think it has to do with not specifying the r-base version number in environment.yml. However, if we specify the r-base version, then the rstudio option doesn't work 😵‍💫

I'll do some tinkering and see what we can do.

Update worms taxa match with species list from Enrique

File: coverconcepcionspecies_matched.txt

Section on website:

bio_mobilization_workshop/_episodes/03-data-cleaning.md

Lines 305 to 332 in 965dfd9

    > ## Using the WoRMS Taxon Match Tool
    > 1. Create a CSV (comma separated value) file with the scientific name of the species of interest. Here we are showing
    >    the contents of the file `animal.csv`.
    >    ```bash
    >    > head animal.csv
    >    Carcharodon carcharias,
    >    ```
    > 2. Upload that file to the [WoRMS Taxon match service](https://www.marinespecies.org/aphia.php?p=match)
    >    * **make sure the option LSID is checked**
    >    ![screenshot]({{ page.root }}/fig/WoRMS_upload.png){: .image-with-shadow }
    >
    > 3. Identify which columns to match to which WoRMS term.
    >    ![screenshot]({{ page.root }}/fig/WoRMS_TaxonMatch_Preview.PNG){: .image-with-shadow }
    >
    > 4. Click `Match`
    >
    > 5. Hopefully, a WoRMS exact match will return
    >
    >    1. In some cases you will have ambiguous matches. Resolve the these rows by using the pull down menu to select the appropriate match.
    >    2. Non-matched taxa will appear in red. You will have to go back to your source file and determine what the appropriate text should be.
    >    ![screenshot]({{ page.root }}/fig/WoRMS_TaxonMatch_MatchOutput.PNG){: .image-with-shadow }
    >
    > 6. Download the response as and XLS, XLSX, or text file and use the information when building the Darwin Core file(s).
    >    ```bash
    >    > head animal_matched.txt
    >    ScientificName,,AphiaID,Match type,LSID,ScientificName,Taxon status,AphiaID_accepted,ScientificName_accepted
    >    Carcharodon carcharias,,105838,exact,urn:lsid:marinespecies.org:taxname:105838,Carcharodon carcharias,accepted,105838,Carcharodon carcharias
    >    ```
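When the new species list is matched, the downloaded file still has to be turned into Darwin Core columns. The sketch below shows one way to do that for a file shaped like `animal_matched.txt` above. It assumes a comma-delimited text export with the same header, reads columns by position because the header repeats `ScientificName`, and uses a hypothetical function name and a non-Darwin-Core helper key (`worms_match_type`).

```python
import csv
import io


def worms_matches_to_dwc(text, delimiter=","):
    """Map rows of a WoRMS taxon-match export to Darwin Core style records."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    lsid_col = header.index("LSID")
    match_col = header.index("Match type")
    records = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        records.append(
            {
                "scientificName": row[0],  # the name as submitted
                "scientificNameID": row[lsid_col],  # WoRMS LSID
                "worms_match_type": row[match_col],  # helper, not a DwC term
            }
        )
    return records
```

Rows whose `worms_match_type` is not `exact` are the ones to review by hand, as described in step 5 of the excerpt.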

QA/QC page should link to required terms table in intro to DwC page

link to the required terms table on the Intro to Darwin Core page. Link should be something like

[link]({{ page.root }}/01-introduction/index.html#what-are-the-required-darwin-core-terms-for-publishing-to-obis)
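To double-check the fragment part of that link, the sketch below approximates how kramdown (Jekyll's default Markdown renderer) derives a heading ID: drop everything except letters, digits, spaces and dashes, strip leading non-letters, lowercase, and turn spaces into dashes. The heading text used in the usage note is reconstructed from the link and may not match the page exactly, and the site's renderer settings are assumed to be the defaults.

```python
import re


def heading_anchor(heading: str) -> str:
    """Approximate the auto-generated ID kramdown assigns to a heading."""
    kept = re.sub(r"[^A-Za-z0-9 \-]", "", heading.strip())  # drop punctuation
    kept = re.sub(r"^[^A-Za-z]+", "", kept)  # IDs start at the first letter
    return kept.lower().replace(" ", "-")
```

For example, `heading_anchor("What are the required Darwin Core terms for publishing to OBIS?")` gives `what-are-the-required-darwin-core-terms-for-publishing-to-obis`, which matches the fragment in the link above.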

GH-action build-website broken?

Error: Unable to resolve action `r-lib/actions@master`, unable to find version `master`

https://github.com/ioos/bio_mobilization_workshop/actions/runs/3321475196/jobs/5489169194
