Comments (16)

servilla avatar servilla commented on August 17, 2024 1

Hi @mbjones - I would be glad to introduce the topic from EDI's perspective. Do you have any tentative dates for this call?

from community-calls.

mbjones avatar mbjones commented on August 17, 2024

Hi @servilla the community calls are typically held monthly on the first Thursday of each month (see details: https://www.dataone.org/community-calls/), but we've been on hiatus for the summer, and hadn't resumed yet. @karlbenedict helps with scheduling, but I think we could do it soonish, possibly even Oct 7 if the folks contributing wanted to do so.

karlbenedict avatar karlbenedict commented on August 17, 2024

mbjones avatar mbjones commented on August 17, 2024

Yeah, it is very soon. But that might also be ok if we only had a smallish group for the discussion, while still being open to anyone attending. Or we could wait until November if there isn't a rush on this -- it was triggered by specific needs at EDI, so hopefully folks there can weigh in on the timing.

servilla avatar servilla commented on August 17, 2024

vchendrix avatar vchendrix commented on August 17, 2024

We have been thinking about this on ESS-DIVE and have a few initial use cases we are trying to support in the near term. I have forwarded this community call to my colleague who has been thinking about this for a while.

jeanetteclark avatar jeanetteclark commented on August 17, 2024

Happy to present for the Arctic Data Center - we often need to do cross repo linking/replication.

vchendrix avatar vchendrix commented on August 17, 2024

ESS-DIVE external linking (under review)

The following is the current work under review for ESS-DIVE.
(ESS-DIVE: @JEDamerow @shreddd)

The ability to provide a link to data file(s) outside of ESS-DIVE.
Instead of uploading data files to a dataset the user
could provide a link to the data along with metadata about
the data package being linked to. Our initial
use cases will support the ability to link out to (meta)data at other
repositories or ESS-DIVE Tier 2 storage.

Use Cases

  1. External link to data file(s) distributed as part of the dataset
  2. External link to a complete copy of the data in the dataset
  3. External link to original publication of dataset where metadata
    and data can be found.

ESS-DIVE uses EML as the underlying metadata format and also has a REST API
which translates ESS-DIVE generated EML to JSON-LD. Thus, there is a
requirement to be able to express external linking in both
JSON-LD and EML. We have had conversations with several folks internal
and external to ESS-DIVE and gone through several iterations (6+) of thought
exercises on ways to capture external linking in both metadata formats that
allow for a smooth translation between the two formats. I will not
go over these iterations. The following is a description of our current thinking.

Our current iteration (close to final, pending team review) is to use
schema.org metadata to express the three use cases mentioned above in both
metadata formats (EML, JSON-LD).

We will use EML annotations to create a semantic triple
([subject] [predicate] [object]).

  • subject: The dataset
  • predicate: 'has part', 'same as' or 'archived at'
  • object: The externally linked resource
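As a minimal sketch of how such a triple could be written out as an EML annotation (not ESS-DIVE's actual implementation), the Python below builds the annotation element with the standard library. The `build_annotation` helper, the `example-dataset` id, and the `example.org` link are hypothetical; the predicate URIs are the schema.org ones used in the use cases in this comment.

```python
import xml.etree.ElementTree as ET

# Predicate labels from the list above, mapped to the schema.org URIs
# used in the use cases in this comment.
PREDICATES = {
    "has part": "https://schema.org/hasPart",
    "same as": "https://schema.org/sameAs",
    "archived at": "https://schema.org/archivedAt",
}


def build_annotation(predicate_label, object_uri, object_label):
    """Build an <annotation> element for one (dataset, predicate, object) triple."""
    annotation = ET.Element("annotation")
    prop = ET.SubElement(annotation, "propertyURI", label=predicate_label)
    prop.text = PREDICATES[predicate_label]
    value = ET.SubElement(annotation, "valueURI", label=object_label)
    value.text = object_uri
    return annotation


# The dataset is the subject; the id and link below are placeholders.
dataset = ET.Element("dataset", id="example-dataset")
dataset.append(
    build_annotation("has part", "https://example.org/data/", "example external data")
)
xml_text = ET.tostring(dataset, encoding="unicode")
print(xml_text)
```

Keeping the label-to-URI mapping in one table means the three use cases differ only in which predicate label is passed in.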

Use Case 1: External link to data file(s) distributed as part of the dataset

One or more files that are part of a data package reside outside of the main
archive. This could be a link to an individual file or a directory.

EML (Annotation on the dataset)

Uses schema.org vocabulary to describe the external links. In this case, "Dataset has part orthomosaiced estimated reflectance data". The inverse would be "orthomosaiced estimated reflectance data is part of the Dataset".

<dataset id="<identifier>">
...
<annotation>
       <propertyURI label="has part">
            https://schema.org/hasPart
      </propertyURI>
      <valueURI label="orthomosaiced estimated reflectance data">
            https://portal.nersc.gov/wfsfa/doi-10-15485-16181314/
      </valueURI>
</annotation>
...
</dataset>
...

JSON

This use case translates to schema.org hasPart

{
  "@type": "Dataset",
  "hasPart": {
         "@type": "WebPage",
         "name": "orthomosaiced estimated reflectance data",
         "url": "https://portal.nersc.gov/wfsfa/doi-10-15485-16181314/"
   }
}
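One possible way to do the EML-to-JSON-LD translation for this use case is sketched below in Python. It assumes that the JSON-LD property name is the last path segment of the propertyURI and that every linked resource is typed as a WebPage, as in the examples in this comment. The `annotations_to_jsonld` helper and the `example-id` identifier are hypothetical, and this is not the ESS-DIVE REST API code.

```python
import json
import xml.etree.ElementTree as ET

# Use Case 1 annotation from above; "example-id" is a placeholder identifier.
EML_FRAGMENT = """
<dataset id="example-id">
  <annotation>
    <propertyURI label="has part">
      https://schema.org/hasPart
    </propertyURI>
    <valueURI label="orthomosaiced estimated reflectance data">
      https://portal.nersc.gov/wfsfa/doi-10-15485-16181314/
    </valueURI>
  </annotation>
</dataset>
"""


def annotations_to_jsonld(eml_text):
    """Map dataset-level EML annotations to schema.org JSON-LD properties."""
    dataset = ET.fromstring(eml_text)
    doc = {"@type": "Dataset"}
    for annotation in dataset.findall("annotation"):
        prop = annotation.find("propertyURI")
        value = annotation.find("valueURI")
        # Assumption: "https://schema.org/hasPart" maps to the key "hasPart".
        key = prop.text.strip().rsplit("/", 1)[-1]
        doc[key] = {
            "@type": "WebPage",
            "name": value.get("label"),
            "url": value.text.strip(),
        }
    return doc


jsonld = annotations_to_jsonld(EML_FRAGMENT)
print(json.dumps(jsonld, indent=2))
```

The whitespace around the URIs in the EML is stripped before use, since the annotation text in these examples is indented on its own line.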

ESS-DIVE UI Example

[Screenshot: ESS-DIVE UI example, 2021-10-05]

Use Case 2: External link to a complete copy of the data in the dataset

Another complete copy of the data in the data package resides outside of the main archive.

EML annotations

This use case uses the schema.org archivedAt property to describe the relationship. In this case, "Dataset archived at Globus Copy at NERSC".

<dataset>
...
<annotation>
    <propertyURI label="archived at">
        https://schema.org/archivedAt
    </propertyURI>
    <valueURI label="Globus Copy at NERSC">
        https://app.globus.org/file-manager?origin_id=211394dc-e1a0-11ea-9ef9-0aba3c43875b&amp;origin_path=%2Fdoi-10-15486-ngt-1770776%2F
    </valueURI>
</annotation>
...
</dataset>

JSON

This use case translates to schema.org archivedAt which is pending implementation feedback and adoption from applications and websites.

{
  "@type": "Dataset",
  "archivedAt": {
    "@type": "WebPage",
    "name": "Globus Copy at NERSC",
    "url": "https://app.globus.org/file-manager?origin_id=211394dc-e1a0-11ea-9ef9-0aba3c43875b&origin_path=%2Fdoi-10-15486-ngt-1770776%2F"
  }
}

Use Case 3: External link to original publication of dataset where metadata and data can be found

The original landing page where the data can be found.

EML Annotations

This use case translates to schema.org sameAs and identifier.
In this case, "Dataset same as https://doi.org/10.25581/spruce.048/1425889".

<dataset>
...
<annotation>
    <propertyURI label="sameAs">
        https://schema.org/sameAs
    </propertyURI>
    <valueURI label="doi:10.25581/spruce.048/1425889">
        https://doi.org/10.25581/spruce.048/1425889
    </valueURI>
</annotation>
...
</dataset>

JSON

This use case translates to schema.org sameAs
and identifier

{
  "@type": "Dataset",
  "identifier": {
       "@type": "PropertyValue",
       "propertyID": "DOI",
       "value":  "10.25581/spruce.048/1425889"
   },
   "sameAs": "https://dx.doi.org/10.25581/spruce.048/1425889"
}
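A small sketch of how the identifier and sameAs values could both be derived from one DOI string, in Python. The `doi_to_jsonld` helper is hypothetical, and resolving through `https://doi.org/` rather than `https://dx.doi.org/` is an assumption of this sketch, not something decided in this comment.

```python
# Prefixes a DOI may arrive with; all are reduced to the bare DOI.
DOI_PREFIXES = ("https://doi.org/", "https://dx.doi.org/", "doi:")


def doi_to_jsonld(doi):
    """Return the schema.org identifier and sameAs entries for a DOI."""
    bare = doi.strip()
    for prefix in DOI_PREFIXES:
        if bare.lower().startswith(prefix):
            bare = bare[len(prefix):]
            break
    return {
        "@type": "Dataset",
        "identifier": {
            "@type": "PropertyValue",
            "propertyID": "DOI",
            "value": bare,
        },
        # Assumption: use the doi.org resolver for the sameAs link.
        "sameAs": f"https://doi.org/{bare}",
    }


record = doi_to_jsonld("doi:10.25581/spruce.048/1425889")
print(record["sameAs"])
```

Normalizing to the bare DOI first keeps the EML valueURI label, the JSON-LD identifier, and the sameAs link consistent with each other.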

ESS-DIVE UI Example

[Screenshot: ESS-DIVE UI example, 2021-10-05]

Future work

Ability to link precisely to related resources, which are
important for interpretation, search, access, integration,
and reuse - particularly for interdisciplinary data.
This could include related datasets, sample metadata, sample data,
methods/protocols, and the paper associated with a dataset.

For this we will explore the use of the DataCite metadata
schema relationType from relatedIdentifiers with EML annotations.
In JSON-LD, we will experiment with mapping the DataCite vocabulary
in @context.

aebudden avatar aebudden commented on August 17, 2024

It seems that November is preferable and more reasonable given the October date is tomorrow. The first Thursday would be November 7th. We hold these at either 1000 Pacific or 1700 Pacific - alternating between the two. Unfortunately, Matt, Jeanette and I are running a training activity all that week. Given conflicts, and the summer break, do we want to find a different date vs waiting until December?

JEDamerow avatar JEDamerow commented on August 17, 2024

Any update on when this may take place?

aebudden avatar aebudden commented on August 17, 2024

Scheduling for Wednesday Nov 10th 1700 UTC
@JEDamerow @jeanetteclark @servilla - does that work for you?

jeanetteclark avatar jeanetteclark commented on August 17, 2024

@aebudden I'll be on vacation that day

mbjones avatar mbjones commented on August 17, 2024

We discussed this in the Arctic Data Center call today, and proposed that Natasha Haycock-Chavez (@nhchavez) present instead of Jeanette for the Arctic Data Center. She agreed to work with @jeanetteclark and me on it, and she can give a nice intro to the topic generally, as well as how we have handled replication and linking at the ADC. On the TT call today, @karlbenedict agreed to update the website with the new bio info. We also need to confirm if @JEDamerow or someone from ESS-DIVE would be able to help with the framing of the space.

With all of this, we need to keep in mind that the speaking part of the session should take up a total of no more than 20 minutes, so that the majority of the session is available for structured discussion. So that probably means max 5 minutes each to frame the discussion.

mbjones avatar mbjones commented on August 17, 2024

@nhchavez this is the issue discussing the community call... thanks.

JEDamerow avatar JEDamerow commented on August 17, 2024

Scheduling for Wednesday Nov 10th 1700 UTC @JEDamerow @jeanetteclark @servilla - does that work for you?

That works for me.

aebudden avatar aebudden commented on August 17, 2024

Zoom line for the community call tomorrow: https://ucsb.zoom.us/j/94309556242
Hack pad for notes: https://hackmd.io/EKi9azkVTzW2FzmsZPjIgw
See you at 1700 UTC, 0900 Pacific
