Comments (3)
@apirogov: The current compose in codemetapy is a simple overwrite at the triple level: triples that are not in the new graph are removed rather than merged as in an RDF merge. There is no entity resolution implemented in codemetapy, but this is also stated in the README.
I can imagine that one can do better.
A simple rdf merge could already be better (in some cases), but would not be enough, since it only works for objects with identifiers in both graphs.
But a usual RDF merge would at least merge the email if the second person also has an ORCID as identifier; please check whether this is the case. I am not sure how blank nodes are handled in detail in codemetapy.
> The current compose in codemetapy is a simple overwrite at the triple level: triples that are not in the new graph are removed rather than merged as in an RDF merge. There is no entity resolution implemented in codemetapy, but this is also stated in the README.
Correct, it overwrites the entire triple. This behaviour is by design so you can compose a codemeta file from multiple input files, where the ordering determines which takes priority. This behaviour is used by codemeta-harvester.
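To make the difference between the two strategies concrete, here is a minimal sketch in plain Python. It is not codemetapy's implementation: graphs are modelled as sets of (subject, predicate, object) tuples, the function names are made up for illustration, and the ORCID is a placeholder. It also assumes that "overwrite" means the higher-priority input replaces all values for a given subject and predicate.

```python
# Simplified model: a graph is a set of (subject, predicate, object) tuples.
# Illustration only, not how codemetapy is implemented.

def compose_overwrite(low_priority, high_priority):
    """Any (subject, predicate) defined by the high-priority graph
    replaces all values the low-priority graph had for that pair."""
    overridden = {(s, p) for s, p, _ in high_priority}
    kept = {(s, p, o) for s, p, o in low_priority if (s, p) not in overridden}
    return kept | set(high_priority)

def rdf_merge(a, b):
    """Plain merge of two graphs without blank nodes: the union of their triples."""
    return set(a) | set(b)

PERSON = "https://orcid.org/0000-0000-0000-0000"  # placeholder identifier

low_priority = {
    (PERSON, "givenName", "Jane"),
    (PERSON, "email", "jane@old.example"),
}
high_priority = {
    (PERSON, "email", "jane@new.example"),
}

overwritten = compose_overwrite(low_priority, high_priority)
merged = rdf_merge(low_priority, high_priority)
```

In this model the overwrite keeps the given name but ends up with only the new email address, whereas the union keeps both addresses.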
> A simple rdf merge could already be better (in some cases), but would not be enough, since it only works for objects with identifiers in both graphs.
Yes. If you want a merge, the only way to do so currently is to ensure the authors have the same @id. So if everything already has ORCIDs it'll work fine.
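Why a shared @id matters can be sketched as follows. This is a hypothetical helper operating on JSON-LD-style dictionaries, not a codemetapy function, and the ORCID and email addresses are placeholders. Records with the same @id are combined, while records without one cannot be matched and remain separate entries.

```python
def merge_by_id(persons_a, persons_b):
    """Combine person records from two inputs; only records sharing an @id are merged."""
    merged = {}     # @id -> combined properties
    anonymous = []  # records without @id cannot be matched, so they stay separate
    for person in list(persons_a) + list(persons_b):
        pid = person.get("@id")
        if pid is None:
            anonymous.append(dict(person))
            continue
        target = merged.setdefault(pid, {"@id": pid})
        for key, value in person.items():
            target.setdefault(key, value)  # first value seen wins in this sketch
    return list(merged.values()) + anonymous

ORCID = "https://orcid.org/0000-0000-0000-0000"  # placeholder identifier

first = [{"@id": ORCID, "givenName": "Jane", "familyName": "Doe"}]
second = [
    {"@id": ORCID, "email": "jane@example.org"},
    {"givenName": "Jane", "familyName": "Doe", "email": "jane@example.org"},
]

result = merge_by_id(first, second)
```

Here the two records carrying the ORCID collapse into one person that gains the email, while the record without an @id stays a separate, duplicate-looking person.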
I realize it's sub-optimal and some better mechanism could be implemented. However, merging multiple instances of persons is trickier than it might seem. Names are not always consistent (an extra middle name, a missing diacritic, etc.). Then which do you choose? We definitely don't want to end up with multiple givenName and familyName properties. Multiple emails or urls may be ok.
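One possible policy along these lines, purely as a sketch: compare names after stripping diacritics and ignoring extra middle names, keep the names of whichever record is preferred, and collect all emails and urls. The helper names and the matching heuristic are assumptions made for illustration, not anything codemetapy provides, and such a heuristic can wrongly match two different people who share a name.

```python
import unicodedata

MULTI_VALUED = ("email", "url")  # assumption: several values are acceptable here

def fold(name):
    """Lowercase a name and strip diacritics, for comparison only."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()

def probably_same_person(a, b):
    """Heuristic: same family name and same first given name after folding."""
    if fold(a["familyName"]) != fold(b["familyName"]):
        return False
    return fold(a["givenName"]).split()[:1] == fold(b["givenName"]).split()[:1]

def merge_person(preferred, other):
    """Keep the preferred record's names; collect all emails and urls from both."""
    result = dict(preferred)
    for key in MULTI_VALUED:
        values = []
        for record in (preferred, other):
            value = record.get(key, [])
            for item in (value if isinstance(value, list) else [value]):
                if item not in values:
                    values.append(item)
        if values:
            result[key] = values
    # Other properties of `other` are only used to fill gaps, never to add a second name.
    for key, value in other.items():
        result.setdefault(key, value)
    return result

a = {"givenName": "José", "familyName": "García", "email": "jose@example.org"}
b = {"givenName": "Jose Luis", "familyName": "Garcia", "email": "jl.garcia@example.com"}

same = probably_same_person(a, b)
combined = merge_person(a, b)
```

Which record counts as preferred is still a policy decision; the sketch simply takes the first argument, mirroring the idea that input ordering determines priority.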
Another challenge arises with a graph of multiple SoftwareSourceCode instances (which codemetapy supports) where an author appears in multiple projects: what if they have different affiliations in such a context?
Closing as 'invalid' since it's not a bug but by design. But of course the question and discussion itself (feel free to continue here) is very valid, and a better solution may be devised.