Comments (6)

atn38 commented on August 20, 2024

Might also take this opportunity to reconsider some of the mapping itself since this directly feeds back into how one uses metabase: e.g. does it make sense for pubdate to come from pkg_mgmt.pkg_state.update_data_catalog?

from lter-core-metabase.

gastil commented on August 20, 2024

In the view for Perl named mb2eml.vw_eml_pubdate, the date comes from the column "PublicRelease" in the table "DataSet". Ideas for defining pubDate:
(1) the last file modification date for any of the entities (the data files' mod date on the filesystem)
(2) the date of the last significant change to data or metadata (ignoring trivial file mods or trivial metadata mods, and thus entered by a human)
Of course, the user or a script can put that date into "DataSet"."PublicRelease" and the view can read it from there. I will try not to conflate the data model with how it is populated and read.
I do not like pubDate coming from update_date_catalog because that column is supposed to hold the actual date the EML doc was uploaded to PASTA, i.e. it is populated after the successful upload. Note its name is not update_data_catalog; it is update_date_catalog. The comment on that column is
-- Date package last updated in catalog (same as pathquery update date)
pathquery was a Metacat thing, pre-PASTA.
I have seen some EML-writing scripts use the current date for the pubDate.
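
To make the "view reads from PublicRelease" idea concrete, here is a minimal sketch. It is not the actual metabase definition: an in-memory SQLite database stands in for PostgreSQL, schemas are left out, and the "DataSetID" column, the view's output column names, the helper pubdate_for, and the sample row are all assumptions made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the PostgreSQL metabase; schemas are omitted.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE "DataSet" (
        "DataSetID" INTEGER PRIMARY KEY,   -- assumed key column
        "PublicRelease" TEXT               -- ISO date entered by a person or a script
    );
    -- The view only reads the stored date; it does not decide what the date means.
    CREATE VIEW vw_eml_pubdate AS
        SELECT "DataSetID" AS datasetid, "PublicRelease" AS pubdate
        FROM "DataSet";
    INSERT INTO "DataSet" VALUES (99, '2019-03-14');  -- made-up sample row
    """
)


def pubdate_for(dataset_id):
    """Return the pubDate string the view exposes, or None if there is no row."""
    row = conn.execute(
        "SELECT pubdate FROM vw_eml_pubdate WHERE datasetid = ?", (dataset_id,)
    ).fetchone()
    return row[0] if row else None


print(pubdate_for(99))
```

However the date gets into the column (by hand or by script), the view stays the same, which keeps the data model separate from how it is populated.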


twhiteaker commented on August 20, 2024

What is meant by publication date?

A. The date the dataset and metadata were successfully uploaded? - update_date_catalog
B. The date the dataset and metadata were finalized in metabase - dbupdatetime?
C. The date when the PI said, "Here ya go. Publish this dataset." - data_receipt_date
D. The date when the dataset was submitted to PASTA? - SubmitDate


gastil commented on August 20, 2024

E. The date the data files and metadata were last significantly revised.

The emphasis is on "significantly". At my site, the pubDate tells data users whether their copy is up to date. If I add an ORCID, a cksum, a keyword, or correct the spelling of a genus, I'm not going to update the pubDate. If I correct erroneous data or clarify a method to avoid data misuse, then I will update pubDate. Thus, it is a human, not a script, setting that date at my site.

I had not noticed the column "DataSet"."SubmitDate" before. I do not use it. I would have said that was a vestigial column. But now I'm wondering if maybe that's the place to store pubDate. It looks empty for SBC too. The name is a bit of a misnomer when one receives data a month before submitting it, though.


atn38 commented on August 20, 2024

@gastil, if going with

E. The date the data files and metadata were last significantly revised.

Then can the latest changeDate in the maintenance tracker be the pubDate?
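
A rough sketch of what that could look like, assuming the tracker rows are already loaded into Python. The ChangeEntry class, the function name, the sample history, and especially the significant flag are hypothetical; the thread only mentions changeDate, and the flag is one way a person could mark which changes count.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass
class ChangeEntry:
    change_date: date   # corresponds to changeDate in the maintenance tracker
    description: str
    significant: bool   # hypothetical flag, set by a person rather than a script


def latest_change_date(
    entries: Iterable[ChangeEntry], significant_only: bool = True
) -> Optional[date]:
    """Return the newest change date, optionally ignoring trivial changes."""
    dates = [e.change_date for e in entries if e.significant or not significant_only]
    return max(dates) if dates else None


# Made-up history: one significant revision followed by a trivial one.
history = [
    ChangeEntry(date(2018, 5, 1), "Corrected erroneous data values", True),
    ChangeEntry(date(2019, 2, 10), "Added a keyword", False),
]

print(latest_change_date(history))                          # 2018-05-01
print(latest_change_date(history, significant_only=False))  # 2019-02-10
```

Without some marker of significance, the plain latest changeDate would also move for trivial edits such as the added keyword, which is the case definition E is meant to exclude.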


gastil commented on August 20, 2024

#27 This is a documentation issue. We want to make clear what content goes into which elements at which stage of the EML creation and generation workflow.

Example: Until recently, it was not clear which column in which table should hold the content for the EML element pubDate.

Example: Conventions for using column content were not clear. For instance, when a role in DataSetPersonell is not creator, that person goes into an associatedParty tree, not a creator tree, in the EML.

This sort of workflow guidance needs to get into the documentation.
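
The role convention in the second example can be sketched as below. This is only an illustration, not code from metabase or from any EML-writing script: the function name, the (name, role) row shape, and the sample people are assumptions.

```python
from collections import defaultdict


def group_by_eml_element(personnel_rows):
    """Split (name, role) pairs into EML creator and associatedParty groups."""
    grouped = defaultdict(list)
    for name, role in personnel_rows:
        if role.strip().lower() == "creator":
            grouped["creator"].append({"name": name})
        else:
            # Anyone who is not a creator becomes an associatedParty,
            # and the role travels with the entry.
            grouped["associatedParty"].append({"name": name, "role": role})
    return dict(grouped)


# Made-up rows standing in for the personnel table content.
rows = [("A. Researcher", "creator"), ("B. Helper", "field technician")]
print(group_by_eml_element(rows))
```

Writing the rule down this way is the kind of thing the documentation could state in one sentence next to the table description.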
