ioos / bio_mobilization_workshop

Marine data mobilization workshop for Biology and Ecosystem Essential Ocean Variables (Bio-Eco EOV) as a Contribution to the UN Decade on Ocean Science for Sustainable Development

Home Page: https://ioos.github.io/bio_mobilization_workshop/

License: Other

Ruby 0.28% Makefile 2.22% HTML 48.38% SCSS 4.20% CSS 1.83% JavaScript 0.71% Python 38.98% R 2.91% Shell 0.48%

bio_mobilization_workshop's Introduction

Marine Biological Data Mobilization Workshop 2023

Workshop Digital Object Identifiers (DOI):

DOI Year
DOI ALL
DOI 2024
DOI 2023
DOI 2022

This repository is for participants to get general information and ask questions related to the Biological Data Mobilization Workshop (via the issues section).

Marine Data Mobilization Workshop for Biology and Ecosystem Essential Ocean Variables (Bio-Eco EOV) is a Contribution to the UN Decade on Ocean Science for Sustainable Development and the Marine Life 2030 Decade Action. The workshop is jointly hosted by CIOOS, IOOS, Hakai, MBON, OBIS-USA, and OTN.

This workshop is a small hands-on, interactive virtual workshop focused on mobilizing marine biological observation datasets to the Ocean Biodiversity Information System (OBIS) by helping data providers standardize their data using Darwin Core. This includes species observations from any type of sampling methodologies (e.g. visual surveys, net tows, microscopy, fish trawls, imaging, 'omics, acoustics, telemetry).

Workshop website: https://ioos.github.io/bio_mobilization_workshop/

Contributing

We welcome all contributions to improve the lesson! Maintainers will do their best to help you if you have any questions, concerns, or experience any difficulties along the way.

We'd like to ask you to familiarize yourself with our Contribution Guide and have a look at the [more detailed guidelines][lesson-example] on proper formatting, ways to render the lesson locally, and even how to write new episodes.

Please see the current list of [issues][FIXME] for ideas for contributing to this repository. For making your contribution, we use the GitHub flow, which is nicely explained in the chapter Contributing to a Project in Pro Git by Scott Chacon. Look for the tag good_first_issue. This indicates that the maintainers will welcome a pull request fixing this issue.

Maintainer(s)

Current maintainers of this lesson are

  • @MathewBiddle
  • @7yl4r
  • @albenson-usgs

Authors

A list of contributors to the lesson can be found in AUTHORS.

Citation

To cite this lesson, please consult CITATION.

Deploying site locally

See this documentation.

Navigate to the folder that contains the lesson, and use bundle exec jekyll serve to preview the lessons.

If you change headers or menus, run bundle exec jekyll clean before serving.

How this repo is organized

At the completion of each event, this repository will be tagged and a release will be created with the year of the event (following the Calendar Versioning scheme YYYY). A DOI will be minted through Zenodo (see DOI table above). Since the workshop is intended to provide the most up-to-date information on aligning data to Darwin Core, the maintainers decided to continually build and update these materials instead of providing access to previous years' materials in subsequent yearly websites. If you would like to rebuild a specific year's website, check out a specific release (e.g. $ git checkout 2023) and build the website from that content.
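As a rough illustration of the rebuild steps described above, the following Python sketch assembles the commands for a given release tag. The function name `rebuild_commands` is hypothetical, and the sketch assumes that release tags are plain four-digit years and that `bundle exec jekyll serve` is run from the repository root. It only builds the command lists and does not execute them.

```python
import re


def rebuild_commands(release_tag: str) -> list[list[str]]:
    """Return the commands to check out a yearly release and preview the site.

    Assumption: release tags are four-digit years (for example "2023").
    """
    if not re.fullmatch(r"\d{4}", release_tag):
        raise ValueError(f"expected a four-digit year tag, got {release_tag!r}")
    return [
        ["git", "checkout", release_tag],
        ["bundle", "exec", "jekyll", "serve"],
    ]


if __name__ == "__main__":
    for command in rebuild_commands("2023"):
        print(" ".join(command))
```

Each inner list can be passed to `subprocess.run(command, check=True)` if you want to run the steps from a script instead of typing them.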

bio_mobilization_workshop's People

Contributors

7yl4r, albenson-usgs, bbest, dependabot[bot], elilawrence, eqmh, jdpye, laurabrenskelle, mathewbiddle, sarahrdbingo, sformel-usgs, timvdstap, ymgan

bio_mobilization_workshop's Issues

Make `gh-pages` the default branch

So, it looks like we didn't use the main branch as we were thinking (documented in CONTRIBUTING.md).

I'm thinking we should make the gh-pages branch the default for this repository. It will make contributing easier and allow us to see more details on how the repo might be used (in the insights section). Other than the binder environment file (which didn't work all that great to begin with - and it could be copied to gh-pages) there is no other purpose for main.

I can make gh-pages the default, but should we delete main?

update CITATION

Before TDWG we need to update the citation file. Maybe push to zenodo and get a DOI too?

address build website GHA warnings

I've seen some warnings pop up that we should probably look at. I don't think it's a requirement to fix them asap, per se.

https://github.com/ioos/bio_mobilization_workshop/actions/runs/4449218838

build-website

Node.js 12 actions are deprecated. Please update the following actions to use Node.js 16: actions/setup-python@v2. For more information see: https://github.blog/changelog/2022-09-22-github-actions-all-actions-will-begin-running-on-node16-instead-of-node12/.

build-website

The `set-output` command is deprecated and will be disabled soon. Please upgrade to using Environment Files. For more information see: https://github.blog/changelog/2022-10-11-github-actions-deprecating-save-state-and-set-output-commands/
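For context on the second warning, the Environment Files replacement for `set-output` works by appending `name=value` lines to the file whose path GitHub Actions provides in the `GITHUB_OUTPUT` environment variable. Below is a minimal Python sketch of that mechanism. The helper name `set_step_output` is hypothetical, and the sketch handles single-line values only.

```python
import os
import tempfile


def set_step_output(name: str, value: str) -> None:
    """Append a single-line name=value pair to the file named by GITHUB_OUTPUT."""
    output_path = os.environ["GITHUB_OUTPUT"]
    with open(output_path, "a", encoding="utf-8") as handle:
        handle.write(f"{name}={value}\n")


if __name__ == "__main__":
    # Outside GitHub Actions the variable is unset, so point it at a temp file.
    if "GITHUB_OUTPUT" not in os.environ:
        fd, path = tempfile.mkstemp(suffix=".txt")
        os.close(fd)
        os.environ["GITHUB_OUTPUT"] = path
    set_step_output("count", "3")
    with open(os.environ["GITHUB_OUTPUT"], encoding="utf-8") as handle:
        print(handle.read(), end="")
```

In a workflow, a later step would then read the value as `steps.<step-id>.outputs.count`.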

Add more details about the workshop to index or about

We also have information in https://ioos.github.io/bio_mobilization_workshop/about/

It would be helpful if we added this information to one of those pages.

What this workshop will cover:

  • Darwin Core and the required terms for OBIS and GBIF.
  • Typical data cleaning tasks needed to standardize the data.
  • Getting your data into a final Darwin Core format.
  • Common QA/QC steps, data enhancement, and validation tools.
  • Required metadata information.
  • How to get your data into the Integrated Publishing Toolkit.
  • Tools that will help in all of the above processes.

The goal is that by the end of the workshop you will have a dataset in a final standardized state and shared to OBIS. We are hoping to address some of the blockers that you identified, including: lack of time, training, and specific formatting questions.

We have a short time together therefore our focus will be hands-on work in breakout rooms using the dataset you bring to the workshop. We will not have many presentations and they will be relatively short. Instead we will have large portions of time for you to work on your data and ask questions when you hit a stumbling block. Therefore, if you do not have a dataset to work on you may not find this workshop a good use of your time.

What this workshop will not cover:

  • What is OBIS?
  • Rationale for sharing data with OBIS.
  • Using data that’s already in OBIS.

If you would like to learn more about OBIS and a short rationale for sharing data to it, please watch this two minute video and this two minute video and share them with those you want to work with to share data.

Fix section 4 title on main page

Section 4 should reflect the title that we have updated it to: "Metadata, QA/QC, and publishing". I tried to edit this myself on the main page but I think it's actually pulling from the URLs? Or at least I couldn't figure out where the text was to update it.

Do we need `main` branch anymore?

This repo is primarily for hosting the website deployed from the gh-pages branch. Do we even need main anymore?

I think we can get the mybinder stuff to work from gh-pages.

make breakouts more engaging

moved from #24

If we continue to get folks to provide some boilerplate info on the dataset they plan to work on, maybe we can make breakout rooms topical, based on data types instead of lesson pages (e.g. passive acoustic, eDNA, survey, trawl).

Just thinking of other ways the breakout rooms could be more engaging...

Clean up readme

The readme has a bunch of information we don't necessarily need anymore.

Let's clean this up to only contain the pertinent information:

Let's keep it clear and concise.

Tag 2023 materials, mint doi, attach html pages as assets to release?

2023 is now complete. We should tag the website materials. I think zenodo will automatically mint a new doi for a new version, but also provide a collection DOI for all versions (link).

Zenodo archives your repository and issues a new DOI each time you create a new GitHub release. Follow the steps at "Managing releases in a repository" to create a new one.

For example,

Also, I wonder if there is a way to add the .html files to the release. Just thinking in case jekyll disappears we wouldn't be able to rebuild the website again. If we drop the .html files in as assets to the release, folks would be able to pull those up in a browser. Just not sure if those will be pulled over to the zenodo DOI package.

Standardize I/you language in the POST survey

from Carolina Peralta:

Some questions are phrased in the first person (I) and others in the second person (you). I suggest standardizing them (e.g. "Did this workshop help me move past the blockers I identified?" versus "How comfortable are you with aligning data to Darwin Core?"). Maybe phrase all questions in the first person ("I")?

Survey form is editable in gforms. Let me know if you need edit access granted.

outdated information on Darwin Core and Extension Schemas page

Hi

I am just skimming through the pages for the workshop and noticed some information that may be outdated. I don't know what the original information should be, so I could not submit a pull request for the Darwin Core and Extension Schemas page.

Occurrence Core + extensions

Using the occurrence core plus relevant extensions means that you can capture more of the data that’s been recorded.

The link https://tools.gbif.org/dwca-validator/extensions.do now points to the new data validator login page rather than to the list of extensions.


In the occurrence extension table, row basisOfRecord:

Pick from these controlled vocabulary terms: HumanObservation, MachineObservation, PreservedSpecimen, LivingSpecimen, FossilSpecimen

There are more vocabulary terms for basisOfRecord (e.g. MaterialSample).
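To illustrate how a controlled vocabulary check on basisOfRecord might look, here is a minimal Python sketch. The set below contains only the five terms quoted from the page plus MaterialSample from this issue. It is an assumption for illustration, not the complete Darwin Core vocabulary, and the function name is hypothetical.

```python
# Terms quoted on the workshop page, plus MaterialSample mentioned in this issue.
# Assumption: illustrative subset only, not the full Darwin Core vocabulary.
KNOWN_BASIS_OF_RECORD = frozenset(
    {
        "HumanObservation",
        "MachineObservation",
        "PreservedSpecimen",
        "LivingSpecimen",
        "FossilSpecimen",
        "MaterialSample",
    }
)


def find_invalid_basis_of_record(values):
    """Return distinct values not in KNOWN_BASIS_OF_RECORD, in first-seen order.

    Surrounding whitespace is ignored; the comparison is case-sensitive.
    """
    invalid = []
    for value in values:
        cleaned = value.strip()
        if cleaned not in KNOWN_BASIS_OF_RECORD and cleaned not in invalid:
            invalid.append(cleaned)
    return invalid


if __name__ == "__main__":
    column = ["HumanObservation", "humanobservation", "MaterialSample"]
    print(find_invalid_basis_of_record(column))
```

Keeping the comparison case-sensitive surfaces values such as `humanobservation` that would otherwise need normalizing before publication.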

How to Capture participants' stories?

Participants likely have management applications/successes/failures to share.

How can we capture this information in a way that could be used in the motivation section of future proposals?

Do we want statistics like "75% of participants reported that this workshop will help them achieve goal X"?

Add citations for xkcd comics

@albenson-usgs mentioned that we don't have citations for the comics presented in the workshop. Fortunately, we are only providing the links in the markdown files and letting jekyll do the rendering for the website. But, we should be adding a citation underneath each of the images to appropriately cite xkcd.

From https://xkcd.com/about/:

Note: You are welcome to reprint occasional comics pretty much anywhere (presentations, papers, blogs with ads, etc). If you're not outright merchandizing, you're probably fine. Just be sure to attribute the comic to xkcd.com.

get ratings for different sections

moved from #24

Keeping in mind that we don't want to ask too many questions, do we want to ask for ratings on the different sections next time so we can figure out which ones need improvement?

Change repository name

I propose we change this repository name to bio_mobilization_workshop.

Oceanhackweek had a nice approach of using the active gh-pages branch for the current year. Once the year is complete, move the content to a new branch that indicates the year (eg. 2022_site). Then work on/update the gh-pages branch for the next iteration of the workshop.

We can do something similar for main as well to preserve the datasets/code that were worked on that year. Give the branch a name like 2022_main.

I would like this to be somewhat futureproof in that we can use it for other workshops we might want to host in the future.

Post workshop certification of attendance?

What happened to our discussions about a certification of sorts?

Can we create some PDF that we send folks after the workshop to say they attended? A certification of attendance...

Speed up build time for webpage

Build time takes a bit to publish site updates (~3 mins). We probably don't need all the RMarkdown stuff...

- name: Look for R-markdown files
  id: check-rmd
  run: |
    echo "name=count::$(shopt -s nullglob; files=($(find . -iname '*.Rmd'));echo ${#files[@]})" >> $GITHUB_OUTPUT
- name: Set up R
  if: steps.check-rmd.outputs.count != 0
  uses: r-lib/actions/setup-r@v2
  with:
    r-version: 'release'
- name: Restore R Cache
  if: steps.check-rmd.outputs.count != 0
  uses: actions/cache@v2
  with:
    path: ${{ env.R_LIBS_USER }}
    key: ${{ runner.os }}-${{ hashFiles('.github/R-version') }}-1-${{ hashFiles('.github/depends.Rds') }}
    restore-keys: ${{ runner.os }}-${{ hashFiles('.github/R-version') }}-1-
- name: Install needed packages
  if: steps.check-rmd.outputs.count != 0
  run: |
    source('bin/dependencies.R')
    install_required_packages()
  shell: Rscript {0}
- name: Query dependencies
  if: steps.check-rmd.outputs.count != 0
  run: |
    source('bin/dependencies.R')
    deps <- identify_dependencies()
    create_description(deps)
    use_bioc_repos()
    saveRDS(remotes::dev_package_deps(dependencies = TRUE), ".github/depends.Rds", version = 2)
    writeLines(sprintf("R-%i.%i", getRversion()$major, getRversion()$minor), ".github/R-version")
  shell: Rscript {0}
- name: Install system dependencies for R packages
  if: steps.check-rmd.outputs.count != 0
  run: |
    while read -r cmd
    do
      eval sudo $cmd || echo "Nothing to update"
    done < <(Rscript -e 'cat(remotes::system_requirements("ubuntu", "20.04"), sep = "\n")')
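The first step in the workflow above counts R Markdown files so that the later R steps can be skipped when there are none. As a rough local equivalent, this Python sketch performs the same case-insensitive count, which can help confirm whether the repository still contains any `.Rmd` files before removing the R steps. The function names are hypothetical, and unlike `find`, the sketch counts regular files only.

```python
import tempfile
from pathlib import Path


def count_rmd_files(root) -> int:
    """Count files under root whose extension is .Rmd, ignoring case."""
    return sum(
        1
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() == ".rmd"
    )


def needs_r_setup(root) -> bool:
    """Return True when at least one R Markdown file is present."""
    return count_rmd_files(root) > 0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "lesson.Rmd").write_text("# demo\n", encoding="utf-8")
        Path(tmp, "index.md").write_text("# demo\n", encoding="utf-8")
        print(count_rmd_files(tmp), needs_r_setup(tmp))
```

Calling `count_rmd_files(".")` from the repository root reports how many files would trigger the R setup.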

mybinder environments not building

Error message: Error during build: .0.0 is not valid SemVer string

I think it has to do with not specifying the r-base version number in environment.yml. However, if we specify the r-base version, then the rstudio option doesn't work 😵‍💫

I'll do some tinkering and see what we can do.

Update worms taxa match with species list from Enrique

File: coverconcepcionspecies_matched.txt

Section on website:

> ## Using the WoRMS Taxon Match Tool
> 1. Create a CSV (comma separated value) file with the scientific name of the species of interest. Here we are showing
> the contents of the file `animal.csv`.
> ```bash
> > head animal.csv
> Carcharodon carcharias,
> ```
> 2. Upload that file to the [WoRMS Taxon match service](https://www.marinespecies.org/aphia.php?p=match)
> * **make sure the option LSID is checked**
> ![screenshot]({{ page.root }}/fig/WoRMS_upload.png){: .image-with-shadow }
>
> 3. Identify which columns to match to which WoRMS term.
> ![screenshot]({{ page.root }}/fig/WoRMS_TaxonMatch_Preview.PNG){: .image-with-shadow }
>
> 4. Click `Match`
>
> 5. Hopefully, WoRMS will return an exact match.
>     1. In some cases you will have ambiguous matches. Resolve these rows by using the pull-down menu to select the appropriate match.
>     2. Non-matched taxa will appear in red. You will have to go back to your source file and determine what the appropriate text should be.
>
> ![screenshot]({{ page.root }}/fig/WoRMS_TaxonMatch_MatchOutput.PNG){: .image-with-shadow }
>
> 6. Download the response as an XLS, XLSX, or text file and use the information when building the Darwin Core file(s).
> ```bash
> > head animal_matched.txt
> ScientificName,,AphiaID,Match type,LSID,ScientificName,Taxon status,AphiaID_accepted,ScientificName_accepted
> Carcharodon carcharias,,105838,exact,urn:lsid:marinespecies.org:taxname:105838,Carcharodon carcharias,accepted,105838,Carcharodon carcharias
> ```
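As one possible way to use the downloaded file, the Python sketch below reads the comma-separated output shown above and builds a lookup from the submitted name to its WoRMS LSID, which is the value typically placed in the Darwin Core `scientificNameID` term. The function name `build_lsid_lookup` is hypothetical, and the sketch assumes the first column holds the submitted name and that an `LSID` column is present, as in the sample.

```python
import csv
import io

# Sample content copied from the animal_matched.txt output shown above.
MATCHED_TEXT = """\
ScientificName,,AphiaID,Match type,LSID,ScientificName,Taxon status,AphiaID_accepted,ScientificName_accepted
Carcharodon carcharias,,105838,exact,urn:lsid:marinespecies.org:taxname:105838,Carcharodon carcharias,accepted,105838,Carcharodon carcharias
"""


def build_lsid_lookup(handle):
    """Map each submitted name (first column) to its LSID, skipping unmatched rows."""
    reader = csv.reader(handle)
    header = next(reader)
    # The header repeats some column names, so locate LSID by position.
    lsid_col = header.index("LSID")
    lookup = {}
    for row in reader:
        if len(row) <= lsid_col or not row[lsid_col]:
            continue  # blank line or a taxon without a match
        lookup[row[0]] = row[lsid_col]
    return lookup


if __name__ == "__main__":
    for name, lsid in build_lsid_lookup(io.StringIO(MATCHED_TEXT)).items():
        print(name, lsid)
```

Reading by column position rather than with `csv.DictReader` avoids losing data to the duplicated `ScientificName` header.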
