Comments (8)

tomgee avatar tomgee commented on August 16, 2024

If all of the datasets referred to by the portal are in the conp-dataset repo, then an automatic poll would make the most sense. However, if there are (or will be) datasets that do not reside directly in the repo, then presumably any changes to those externally hosted datasets would involve pushing notification of the changes into the CONP portal/repo through a versioned registration process.

I think I'm not completely clear on the relationship between the portal and the repo, so perhaps my thought isn't quite worth the $0.02 I was hoping it was?

from conp-portal.

jbpoline avatar jbpoline commented on August 16, 2024

All datasets should be in conp-pcno/conp-dataset, pointing (as submodules) to their original datasets, i.e., repositories. Some, but not all, of these repositories live in the conpdatasets organization on GitHub (the names are super confusing), but they can certainly live elsewhere as well. So yes, I think you are right, Tom: as long as we have access to the datasets from conp-pcno/conp-dataset, we should be able to populate the portal.

jbpoline avatar jbpoline commented on August 16, 2024

Note that it would require:
1. cloning conp-dataset
2. running datalad install . for all datasets, so that we get the dats.json files
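
Once both steps have run, one way to check which datasets actually expose a metadata file is sketched below. This is only an illustration: the `projects` sub-directory name and the `find_dats_files` helper are assumptions, not something stated in this thread, and the file name is matched case-insensitively because the thread writes both dats.json and DATS.json.

```python
from pathlib import Path
from typing import Dict, Optional


def find_dats_files(repo_root: Path, datasets_dir: str = "projects") -> Dict[str, Optional[Path]]:
    """Map each dataset directory name to its dats.json path, or None if absent.

    `datasets_dir` is a hypothetical location for the dataset submodules
    inside the conp-dataset checkout; adjust it to the real layout.
    """
    found: Dict[str, Optional[Path]] = {}
    base = repo_root / datasets_dir
    if not base.is_dir():
        return found
    for dataset in sorted(p for p in base.iterdir() if p.is_dir()):
        # A submodule that was never installed is typically an empty directory,
        # so it ends up mapped to None here.
        matches = [
            f for f in dataset.iterdir()
            if f.is_file() and f.name.lower() == "dats.json"
        ]
        found[dataset.name] = matches[0] if matches else None
    return found
```

Datasets mapped to `None` are the ones that still need to be installed (or that lack a metadata file altogether).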

shots47s avatar shots47s commented on August 16, 2024

Thanks @jbpoline. Really, all we need in the repository is the metadata, so I would qualify your previous point by saying that there needs to be, at a minimum, the dats.json with the appropriate information to populate the front page. There also needs to be a means to access the data where it lives, whether through datalad or some other means. Right now, the portal stores a URL to the data, which is not a robust way to define how to access the data.

But to populate the portal page with appropriate information, the metadata that describes the dataset should be in the conp-dataset repository.

@jbpoline, have you and the team settled on a definitive list of things that are necessary in the dats.json? I think so, but I thought it would be good to provide a pointer here in case others would like to comment.

jbpoline avatar jbpoline commented on August 16, 2024

@shots47s: yes, the dats files are compulsory; tests should fail if they don't exist!
Yes, there is a list. I need to dig out where it is (in Jen's documents), but here are the fields:

  • the schema.org required fields (title + description)
  • license / data usage agreement
  • authors / contact
  • doi

So, 5 required fields (we intentionally kept that to a minimum).
@[email protected]: could you point us to your document on this?
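
A test along those lines could be sketched as follows. The key names in `REQUIRED_FIELDS` are hypothetical stand-ins for the five fields listed above; the actual keys used in dats.json may differ, and the helper names are made up for illustration.

```python
import json
from typing import Any, Dict, List

# Hypothetical key names for the five required fields listed above;
# the real dats.json keys may be spelled or nested differently.
REQUIRED_FIELDS = ("title", "description", "license", "authors", "doi")


def missing_required_fields(dats: Dict[str, Any]) -> List[str]:
    """Return the required fields that are absent or empty in a parsed dats.json."""
    return [name for name in REQUIRED_FIELDS if not dats.get(name)]


def check_dats_text(text: str) -> List[str]:
    """Parse the raw contents of a dats.json file and report missing fields."""
    return missing_required_fields(json.loads(text))
```

A dataset test could then fail whenever `missing_required_fields` returns a non-empty list.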

driusan avatar driusan commented on August 16, 2024

Is there any documentation somewhere about how they're currently statically created at install time?

Based on the above thread, it sounds to me like a simple cron job that does a git pull would be sufficient, and I don't see why they would need to be completely dynamic if that is significantly more work for marginal benefit.

shots47s avatar shots47s commented on August 16, 2024

@driusan, thank you for the clarification. Currently, the data is being populated from a manually created SQL dump file, so not dynamic at all. When I say dynamic, I mean "created from the repo" or something like that, not something that is being populated by manual entry. I would agree, some periodic updating through something like cron would be appropriate.

xlecours avatar xlecours commented on August 16, 2024

#120 and #115 fix this issue.

The Datalad Python module is used to clone https://github.com/CONP-PCNO/conp-dataset and then install, update, and merge its subdatasets.

This can be done with flask update_datasets and has been wrapped in a cron job that runs every 15 minutes.

The dataset details are read from their respective DATS.json files.
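
For readers who want a feel for the periodic update, here is a rough sketch. It is not the code from #120 or #115: it swaps in plain git commands instead of the Datalad module, and every function name is hypothetical. Only the 15-minute interval comes from this thread. Note that `git submodule update` checks out the commits recorded by conp-dataset; it does not merge newer upstream changes into the subdatasets.

```python
import subprocess
import time
from typing import Callable, List

# Interval taken from the thread: the update runs every 15 minutes.
UPDATE_INTERVAL_SECONDS = 15 * 60


def update_commands(repo_dir: str) -> List[List[str]]:
    """Build the git commands that refresh the checkout and its submodules."""
    return [
        ["git", "-C", repo_dir, "pull", "--ff-only"],
        ["git", "-C", repo_dir, "submodule", "update", "--init", "--recursive"],
    ]


def _run_checked(cmd: List[str]) -> None:
    # Raises subprocess.CalledProcessError if the command exits non-zero.
    subprocess.run(cmd, check=True)


def update_once(repo_dir: str, run: Callable[[List[str]], None] = _run_checked) -> None:
    """Run one update pass; `run` is injectable so the logic can be tested offline."""
    for cmd in update_commands(repo_dir):
        run(cmd)


def run_forever(repo_dir: str, interval: int = UPDATE_INTERVAL_SECONDS) -> None:
    """Simple stand-in for the cron job: update, sleep, repeat."""
    while True:
        try:
            update_once(repo_dir)
        except subprocess.CalledProcessError as exc:
            print(f"update failed: {exc}")
        time.sleep(interval)
```

In practice a cron entry calling a single update pass is simpler than a long-running loop; `run_forever` is included only to make the schedule explicit.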
