
Comments (3)

broeder-j commented on May 19, 2024

@apirogov: The current compose in codemetapy is a simple overwrite at the triple level: triples that are not in the new graph are removed, rather than an RDF merge being performed. No entity resolution is implemented in codemetapy, but this is also stated in the README.

I can imagine that one can do better.

A simple RDF merge could already be better in some cases, but it would not be enough, since it only works for objects that have identifiers in both graphs.
It would at least merge the email if the second person also has an ORCID as identifier, as a usual RDF merge would; please check whether this is the case. I am not sure how blank nodes are handled in detail in codemetapy.
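
To make the identifier point concrete, here is a minimal sketch of an RDF-style merge in plain Python. It does not use codemetapy or any RDF library: graphs are modelled as sets of `(subject, predicate, object)` tuples, blank nodes as strings starting with `_:`, and the names `rdf_merge`, `rename_bnodes`, the `ex:` and `schema:` prefixes and the ORCID value are all assumptions made for illustration.

```python
# A graph is a set of (subject, predicate, object) tuples.
# Blank nodes are modelled as strings starting with "_:" (an assumption of this
# sketch; a literal that happens to start with "_:" would be renamed too).

def rename_bnodes(graph, suffix):
    """Give blank nodes a graph-specific name so they never collide."""
    def rename(term):
        if isinstance(term, str) and term.startswith("_:"):
            return f"{term}_{suffix}"
        return term
    return {(rename(s), p, rename(o)) for s, p, o in graph}

def rdf_merge(*graphs):
    """Union of all triples after keeping each graph's blank nodes apart."""
    merged = set()
    for index, graph in enumerate(graphs):
        merged |= rename_bnodes(graph, f"g{index}")
    return merged

ORCID = "https://orcid.org/0000-0000-0000-0000"  # placeholder identifier

# Case 1: the person has the same identifier in both graphs.
g1 = {("ex:software", "schema:author", ORCID),
      (ORCID, "schema:familyName", "Doe")}
g2 = {("ex:software", "schema:author", ORCID),
      (ORCID, "schema:email", "jane@example.org")}
merged = rdf_merge(g1, g2)
person_props = {p for s, p, o in merged if s == ORCID}

# Case 2: the person is a blank node in both graphs.
g3 = {("ex:software", "schema:author", "_:b0"),
      ("_:b0", "schema:familyName", "Doe")}
g4 = {("ex:software", "schema:author", "_:b0"),
      ("_:b0", "schema:email", "jane@example.org")}
authors = {o for s, p, o in rdf_merge(g3, g4) if p == "schema:author"}
```

In case 1 the email ends up on the same person node as the family name. In case 2 the merge yields two separate author nodes, which is why a plain merge is not enough without entity resolution.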

from codemetapy.

proycon commented on May 19, 2024

The current compose in codemetapy is a simple overwrite at the triple level: triples that are not in the new graph are removed, rather than an RDF merge being performed. No entity resolution is implemented in codemetapy, but this is also stated in the README.

Correct, it overwrites the entire triple. This behaviour is by design so you
can compose a codemeta file from multiple input files, where the ordering
determines which takes priority. This behaviour is used by
codemeta-harvester.
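
As a rough illustration of that priority-by-ordering behaviour, and of why an email can get lost, here is a minimal sketch on plain dictionaries. This is not codemetapy's actual code: the `compose` function below is hypothetical, works on JSON-like dicts rather than RDF triples, and the example values are made up.

```python
def compose(*documents):
    """Later documents take priority: a property present in a later document
    replaces the earlier value entirely, with no merging of nested values."""
    result = {}
    for document in documents:
        for key, value in document.items():
            result[key] = value
    return result

# First input: author without identifier, but with an email.
first = {
    "name": "mytool",
    "author": [{"givenName": "Jane", "familyName": "Doe",
                "email": "jane@example.org"}],
}
# Second input: the same author with an ORCID, but without the email.
second = {
    "author": [{"@id": "https://orcid.org/0000-0000-0000-0000",
                "givenName": "Jane", "familyName": "Doe"}],
}

composed = compose(first, second)
has_email = "email" in composed["author"][0]
```

Here `name` survives from the first input, but the `author` value of the second input replaces the first one wholesale, so the email is gone.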

A simple rdf merge could already be better (in some cases), but would not be enough, since it only works for objects with identifiers in both graphs.

Yes. If you want a merge, the only way to do so currently is to ensure the authors
have the same @id. So if everything already has ORCIDs, it'll work fine.
I realize this is sub-optimal and that a better mechanism could be implemented.

However, merging multiple instances of a person is trickier than it might
seem. Names are not always consistent (an extra middle name, a missing
diacritic, etc.). Which one do you then choose? We definitely don't want to end up
with multiple givenName and familyName properties. Multiple emails or urls
may be ok.
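
One possible policy for this, sketched in plain Python: keep a single givenName and familyName (the first record wins) and collect emails and urls from both records. It assumes the two records have already been identified as the same person, which is the hard part and is not addressed here. The `merge_person` function and the choice of which properties are single-valued are assumptions for illustration, not something codemetapy provides.

```python
SINGLE_VALUED = ("givenName", "familyName")  # keep exactly one value
MULTI_VALUED = ("email", "url")              # several values are acceptable

def merge_person(primary, secondary):
    """Merge two records of the same person; `primary` wins on conflicts."""
    merged = {}
    for key in SINGLE_VALUED:
        value = primary.get(key) or secondary.get(key)
        if value:
            merged[key] = value
    for key in MULTI_VALUED:
        values = []
        for record in (primary, secondary):
            raw = record.get(key, [])
            # Accept both a single string and a list of strings.
            for item in ([raw] if isinstance(raw, str) else raw):
                if item not in values:
                    values.append(item)
        if values:
            merged[key] = values
    return merged

a = {"givenName": "José", "familyName": "García",
     "email": "jose@example.org"}
b = {"givenName": "Jose", "familyName": "Garcia",
     "email": ["jose@example.org", "j.garcia@example.com"]}
merged = merge_person(a, b)
```

With this policy the spelling from the first record is kept and the second spelling is silently dropped, so the ordering of inputs still decides which name wins.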

Another challenge arises with a graph of multiple SoftwareSourceCode
instances (which codemetapy supports) where an author appears in multiple
projects: what if they have different affiliations in each context?


proycon commented on May 19, 2024

Closing as 'invalid' since it's not a bug but by design. But of course the question and discussion itself (feel free to continue here) is very valid, and a better solution may be devised.
