Comments (4)
Yes, I think it makes sense to complement the PMID lookup like that.
The main issue then is that the publisher metadata are not well maintained. When a PMID becomes invalid at PubMed (which is the reference), it is apparently not removed from the publisher metadata, so we would get invalid mappings via the ISTEX data file.
The submodule pubmed-glutton maps DOI/PMID to MeSH classes using the official PubMed metadata dump files. One approach to control the PMID information would be to exploit this additional mapping. I still need to work on it a bit more, though.
So I would suggest waiting until I review the pubmed-glutton submodule. This module will produce an additional PMID/MeSH classes mapping to enrich the biblio-glutton records, and we could use it to check which PMIDs are valid.
from biblio-glutton.
It does not appear to be in the PMID source file:
(base) Johan:consolidationData lfoppiano$ zgrep 16262981 PMID_PMCID_DOI.csv.gz
(base) Johan:consolidationData lfoppiano$
from biblio-glutton.
But it is present in the ISTEX IDs file:
lopez@work:/mnt/data/biblio$ zgrep 16262981 ~/biblio-glutton/data/istex/istexIds.all.gz
{"corpusName":"cambridge","istexId":"CC91E0F1789978CE79D653533100BA315CA337B3","ark":["ark:/67375/6GQ-9RTTRBZ7-G"],"doi":["10.1017/S0266462305050762"],"pmid":["16262981"],"pii":["S0266462305050762"]}
So it may be present in the ISTEX metadata as provided by the publishers, but it is not in the DOI/PMID mapping file. It is actually not a valid PMID: https://www.ncbi.nlm.nih.gov/pubmed/16262981
This DOI does not correspond to a real article but to an index of a Cambridge journal.
I think it's fine for the lookup service not to map it to anything, as it's not a valid PMID anymore. But I am not sure how to deal with it in the full record.
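The manual zgrep check above could be automated along these lines. This is only a minimal sketch, not how biblio-glutton does it: the function names are hypothetical, the mapping file is assumed to have a header row with a `PMID` column, and the ISTEX record is assumed to carry its PMIDs in a `pmid` list as in the line shown above.

```python
import csv
import json


def load_valid_pmids(csv_lines):
    """Collect PMIDs from the rows of the authoritative mapping file.

    Assumption: the first line is a header containing a 'PMID' column.
    """
    reader = csv.DictReader(csv_lines)
    return {
        (row.get("PMID") or "").strip()
        for row in reader
        if (row.get("PMID") or "").strip()
    }


def split_istex_pmids(istex_json_line, valid_pmids):
    """Split the PMIDs of one ISTEX ids record into (kept, dropped).

    A PMID is kept only if it also appears in the authoritative set.
    """
    record = json.loads(istex_json_line)
    pmids = record.get("pmid", [])
    kept = [p for p in pmids if p in valid_pmids]
    dropped = [p for p in pmids if p not in valid_pmids]
    return kept, dropped
```

Both functions take plain lines, so they can be fed from `gzip.open(path, "rt")` for the `.gz` files. A dropped PMID would simply be left out of the lookup; what to do with it in the full record remains the open question.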
from biblio-glutton.
Right... one thing I notice is that we do not use the PMID information from the ISTEX mapping but only from the PMID file (which should be the authority here, shouldn't it?).
Perhaps we should complement the PMID lookup with information from the ISTEX mapping?
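To make the suggestion concrete, here is a rough sketch of such a complemented lookup. It is purely illustrative: the two dictionaries stand in for whatever stores hold the mappings, and `lookup_doi_by_pmid` is a hypothetical name, not an existing biblio-glutton function. The PMID file is consulted first, and the ISTEX mapping is used only as a fallback, with the source reported so the caller can treat ISTEX-only hits with more caution.

```python
from typing import Dict, Optional, Tuple


def lookup_doi_by_pmid(
    pmid: str,
    pmid_file_index: Dict[str, str],
    istex_index: Dict[str, str],
) -> Optional[Tuple[str, str]]:
    """Return (doi, source) for a PMID, or None if neither mapping knows it.

    The PMID file is treated as the authority; ISTEX is a fallback only.
    """
    doi = pmid_file_index.get(pmid)
    if doi is not None:
        return doi, "pmid_file"
    doi = istex_index.get(pmid)
    if doi is not None:
        return doi, "istex"
    return None
```

Returning the source alongside the DOI is a design choice made for this sketch: it leaves room to filter or flag ISTEX-only results later without changing the lookup itself.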
from biblio-glutton.