Comments (7)
Yes, it does: https://info.orcid.org/faq/how-do-i-find-orcid-record-holders-at-my-institution/
BUT (this is what I figured could be wrong): users' emails are by default not visible to the outside; a user has to upgrade this to either internal or public on a per-email level. So only if people have done this do you have a chance of finding them by email via an authorized query to the API. I think most people do not change the default, so I expect this approach to yield around 10%. (Test query: https://pub.orcid.org/v3.0/csv-search/?q=affiliation-org-name:ORCID&fl=orcid,given-names,family-name,current-institution-affiliation-name,email)
A better way could be to find people by name plus affiliation, i.e. institution name or identifier.
Here codemetapy probably only has a chance if the institution is given or it can get it from the metadata already there...
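A minimal sketch of what such a name-plus-affiliation lookup could look like, only building the query URL and not sending it. The endpoint and the `fl` field list are copied from the test query above; using `given-names` and `family-name` as search fields, and the helper name `build_search_url`, are assumptions for illustration and not something codemetapy provides.

```python
from urllib.parse import urlencode

# Endpoint and output fields taken from the test query in this thread.
CSV_SEARCH = "https://pub.orcid.org/v3.0/csv-search/"
FIELDS = "orcid,given-names,family-name,current-institution-affiliation-name,email"


def _quote(value):
    """Wrap a value in double quotes so multi-word names stay one phrase."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_search_url(given_names, family_name, affiliation=None):
    """Hypothetical helper: search by name, optionally narrowed by affiliation."""
    clauses = [
        f"given-names:{_quote(given_names)}",   # assumed search field
        f"family-name:{_quote(family_name)}",   # assumed search field
    ]
    if affiliation:
        # Field name as used in the test query above.
        clauses.append(f"affiliation-org-name:{_quote(affiliation)}")
    return CSV_SEARCH + "?" + urlencode({"q": " AND ".join(clauses), "fl": FIELDS})


print(build_search_url("Jane", "Doe", "Example University"))
```

A name-only query is likely to return several candidates, so any result would still need a manual or heuristic check before an ORCID is attached to a contributor.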
I do not know how to do this, since contributors can come from anywhere; maybe a first step would be to allow for a list to try.
Let me know if you plan to work on this. I have a layout of what I want, but have not implemented anything yet, and it is currently not on my to-do list.
In terms of existing code, I found these, which are old and may or may not work:
https://github.com/ORCID/python-orcid
https://github.com/scholrly/orcid-python
from codemetapy.
> emails of users are per default not visible to the outside, a member has to upgrade this to either internal or public on a per email level. So only if people have done this you have a chance to find them via an authorized query to the API by email. I think most people do not change the default, so i expect this way to yield 10%.
Too bad, this would be the ideal method but if it yields only 10% it's not very useful indeed.
> A better way could be to find people over name, plus affiliation, i.e. institution name or identifier.
That sounds viable yes, though one issue with affiliations is that people tend to come and go in institutions.
> ..maybe a first thing would be to allow for a list to try.
Like explicitly passing a tsv file to codemetapy with say emails and orcids? That would work yes, though it isn't as fully automated as we'd want ideally.
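A rough sketch of how such a mapping file could be read, assuming a two-column TSV with the e-mail first and the ORCID second. The column order, the `#` comment convention and the function name `load_orcid_map` are all assumptions for illustration; codemetapy does not define such a format.

```python
import csv
import io


def load_orcid_map(tsv_file):
    """Read a two-column TSV (email, orcid) into a dict keyed by lower-cased email."""
    mapping = {}
    for row in csv.reader(tsv_file, delimiter="\t"):
        # Skip blank lines, comment lines and rows without two columns.
        if len(row) < 2 or row[0].startswith("#"):
            continue
        email, orcid = row[0].strip().lower(), row[1].strip()
        # Accept bare identifiers as well as full URLs.
        if not orcid.startswith("http"):
            orcid = "https://orcid.org/" + orcid
        mapping[email] = orcid
    return mapping


sample = io.StringIO("# email\torcid\nJane.Doe@example.org\t0000-0002-1825-0097\n")
print(load_orcid_map(sample))
```

Lower-casing the e-mail makes the lookup tolerant of inconsistent capitalisation in commit metadata, at the cost of treating addresses that differ only in case as the same person.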
An addition to this: codemetapy parses the CITATION.cff file, but it does not use the ORCIDs in there as the author/contributor IDs; instead it uses the GitLab ID (account page), e.g. "@id": "https://iffgit.fz-juelich.de/fleur/fleur/person/cmax347".
Ideally one would keep both pieces of information, i.e. record somewhere that the ORCID and the git ID refer to the same person.
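One way this could look, as a sketch only: prefer the ORCID as `@id` when the CITATION.cff author entry has one, and keep the other identifier alongside it. The keys `given-names`, `family-names` and `orcid` are the CITATION.cff author keys; recording the second identifier under `sameAs`, and the function name `person_node`, are assumptions here and not how codemetapy currently behaves.

```python
def person_node(cff_author, fallback_id=None):
    """Build a schema.org Person dict from a CITATION.cff author entry."""
    node = {"@type": "Person"}
    orcid = cff_author.get("orcid")
    if orcid:
        # The ORCID wins as the node identifier.
        node["@id"] = orcid
        if fallback_id and fallback_id != orcid:
            # Assumption: keep the other identifier as sameAs.
            node["sameAs"] = fallback_id
    elif fallback_id:
        node["@id"] = fallback_id
    if cff_author.get("given-names"):
        node["givenName"] = cff_author["given-names"]
    if cff_author.get("family-names"):
        node["familyName"] = cff_author["family-names"]
    return node


author = {
    "given-names": "Jane",
    "family-names": "Doe",
    "orcid": "https://orcid.org/0000-0002-1825-0097",
}
print(person_node(author, fallback_id="https://example.org/person/jdoe"))
```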
Also in that context, the familyName and givenName parsing is not optimal if the link of the person does not contain the name. Example:
{
"@id": "https://iffgit.fz-juelich.de/fleur/fleur/person/cmax347",
"@type": "Person",
"email": "[email protected]@gmail.com",
"familyName": "",
"givenName": "cMax347",
"position": 71
},
{
"@id": "https://iffgit.fz-juelich.de/fleur/fleur/person/christian-roman-gerhorst",
"@type": "Person",
"email": "[email protected]",
"familyName": "Gerhorst",
"givenName": "Christian-Roman",
"position": 72
}
So it also has problems with middle names. I would assume that these would be easier to parse from a CITATION.cff file.
> An add on to this. codemetapy parses the Citation.cff file, but it does not use the orcids in there for authors/contributors Ids
Hmm.. Agreed, if there are ORCIDs then they shouldn't be overwritten. I wonder if it's an issue in codemetapy or in https://github.com/citation-file-format/cff-converter-python, we don't do the CITATION.cff parsing ourselves.
> but instead the gitlab id (account page) "@id": "https://iffgit.fz-juelich.de/fleur/fleur/person/cmax347".
(it's not the gitlab id, see #34)
> Ideally one would keep both information... i.e that the orcid and the git id are same as somewhere.
> also in that context the familyName and givenName parsing is also not optimal if the link of the person does not contain the name, example:
> { "@id": "https://iffgit.fz-juelich.de/fleur/fleur/person/cmax347", "@type": "Person", "email": "[email protected]@gmail.com", "familyName": "", "givenName": "cMax347", "position": 71 },
Yes, we'd better just use schema:name if we can't decipher given and family names; this needs some fine-tuning. That e-mail looks malformed too.
For the actual name parsing from arbitrary strings I'm using nameparser.
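The fallback described above could be sketched roughly as follows. This is a deliberately naive stand-in that splits on whitespace; it is not the nameparser API and not codemetapy's actual logic, and the function name `name_properties` is made up for the example.

```python
def name_properties(raw):
    """Split 'Given [Middle] Family' naively; fall back to a plain name.

    When the string has no whitespace (e.g. a user handle), given and
    family names cannot be told apart, so only 'name' is returned.
    """
    parts = raw.strip().split()
    if len(parts) < 2:
        return {"name": raw.strip()}
    # Everything before the last token is treated as given/middle names.
    return {"givenName": " ".join(parts[:-1]), "familyName": parts[-1]}


print(name_properties("cMax347"))
print(name_properties("Christian-Roman Gerhorst"))
```

With this rule a handle such as `cMax347` ends up as a plain name instead of a `givenName` with an empty `familyName`. Multi-word family names would still be split incorrectly, which is where a dedicated parser is needed.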
I've been giving this some more thought and there are some challenges to solve, mostly related to 'affiliations':
- In the current implementation, whenever an author appears in multiple
  software metadata projects (or even multiple times in the same one), there
  is a high risk of properties getting conflated if they are not consistently named.
  The most notable one is 'affiliation': if an author has different affiliations
  at various points (or even the same one, inconsistently named), these will all
  be propagated to all instances when the full graph of multiple software projects is loaded.
- Related to the above: 'affiliation' is a property of a schema:Person. But
  that means it is no longer attached to any specific software project,
  meaning we can't differentiate between affiliations at the time of the
  software project or later/before. We'd always get all of them, which may be
  less informative than desired. It's common for people to have (had) multiple
  affiliations throughout their career. We do use schema:producer to tie
  software projects to institutions directly, so at least that is expressible
  (relates to codemeta/codemeta#286).
- We already ascertained that automatically going from names or e-mails to
  ORCIDs is hard. We probably need a custom mapping as input (like a tsv
  file).
- The reverse, going from ORCIDs to all the names/emails/urls, is fairly easy: we can
  just query orcid.org and request application/ld+json to get a schema.org
  representation that is compatible with codemeta. Some caveats there:
  * It does not contain the e-mail, even if it is public. The turtle
    output, however, does (it uses a completely different vocabulary than
    the JSON-LD serialisation).
  * The JSON-LD output lists all affiliations it knows (including those
    that have ended, but that information is not outputted). The turtle
    output lists no affiliations at all.
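The ORCID-to-metadata direction from the last point can be sketched with the standard library alone: request the ORCID URL with an `Accept: application/ld+json` header. The helper names are hypothetical, the identifier in the usage note is only a placeholder, and nothing is assumed here about which properties the returned document contains.

```python
import json
import urllib.request


def orcid_request(orcid_id):
    """Build a request for the JSON-LD (schema.org) view of an ORCID record."""
    url = "https://orcid.org/" + orcid_id
    # Content negotiation: ask for JSON-LD instead of the HTML page.
    return urllib.request.Request(url, headers={"Accept": "application/ld+json"})


def fetch_person(orcid_id, timeout=10):
    """Fetch and decode the record; needs network access."""
    with urllib.request.urlopen(orcid_request(orcid_id), timeout=timeout) as resp:
        return json.load(resp)


req = orcid_request("0000-0002-1825-0097")
print(req.full_url, req.get_header("Accept"))
```

Calling `fetch_person("0000-0002-1825-0097")` would perform the actual lookup. Given the caveats above, the result should not be expected to contain the e-mail, and any affiliations it lists may include ones that have ended.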