Comments (4)
Thank you very much for getting in touch! Your project looks fantastic. It is something we have also wanted to do for a long time. Really thrilled to see where this will go!
(I quickly went through your bibliography...very cool!)
For the specific questions:
First of all, the data on lotus.naturalproducts.net is not up to date. We currently have some limitations in updating it, so please take the data from Wikidata, Zenodo (https://zenodo.org/communities/the-lotus-initiative), or PubChem (which is itself fed from Zenodo) instead.
- We strongly discourage searching for chemical compounds by name. Names are not "identifiers" at all and can indeed lead to huge discrepancies. Checking the names of hundreds of thousands of compounds is not trivial, so many of them are possibly incorrect in many sources.
- This is exactly what I wanted to suggest. InChIKeys are the way to go.
- The names present on the website were generated using proprietary software (molconvert by ChemAxon). This is not the case anymore, which is why names can change. There are additional limitations in Wikidata: labels cannot be more than 250 characters long, so sometimes you might not find the name on Wikidata. Moreover, there is currently no "chemical name" property on Wikidata, so we rely only on the label, which anyone can change or adapt to their language. It looks very intuitive to search for "limonene", but if you want to do so across the whole tree of life, you will have to forget it...
- What you mention here are tautomers. We have some of them in the LOTUS corpus, and in the end not all of them can be perfectly standardized. (The chemical "truth" is rather an equilibrium between the different species, changing depending on solvent, pH, etc.) This "problem" has been known in cheminformatics for many years, but I think there is still no real solution to it.
- The data on Wikidata changes every second. If someone considers a "found in taxon" statement incorrect and removes it, it won't appear anymore. If someone adds (like you did, thank you 😊 ) new statements, they won't appear on the other LOTUS endpoints instantly. We usually try to release quarterly versions of LOTUS that include all the new changes made on Wikidata; these are then stored on Zenodo.
- Wow, you went deep into digging, beautiful! Those statements (on the references) were actually made by one of our collaborators and were based on the "found in taxon" statements we had at the time. They will probably drift out of sync over time, as most people (probably 99%) will only update the data on one side. The tagging of "main subject" on articles was made to identify literature matching given subjects, mainly in the context of Scholia. See https://scholia.toolforge.org/taxon/Q135389 for example. This might change in the future following some of our recent discussions (https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Chemistry/Natural_products#Mapping_near_to_ubiquituous_compounds)
- True, because there is no "found in taxon" Acanthus statement anymore. It was removed from Wikidata (correctly or not: as with any community-based curation, 99% of it is good; we cannot avoid human errors, but overall it moves toward the better).
- I think you already found https://www.wikidata.org/wiki/Wikidata:WikiProject_Chemistry/Natural_products#Queries to guide some of your queries. Your query is correct, no issue. 6 of the 16 compounds present in the outdated data were removed.
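To make the points above about InChIKeys and live Wikidata data more concrete, here is a minimal Python sketch, not an official LOTUS tool. It checks that a string has the InChIKey layout and builds a SPARQL query for compounds with a "found in taxon" statement. The function names are hypothetical, and the property IDs (P235 for InChIKey, P703 for "found in taxon", P171 for parent taxon) are my assumptions about the current Wikidata schema, so please check them before relying on the output. The network request itself is left out on purpose; the sketch only builds the URL.

```python
import re
from urllib.parse import urlencode

# Public Wikidata SPARQL endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# InChIKey layout: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
INCHIKEY_RE = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")


def is_inchikey(value: str) -> bool:
    """Return True if the string has the shape of an InChIKey (format check only)."""
    return bool(INCHIKEY_RE.match(value))


def compounds_in_taxon_query(taxon_qid: str, include_subtaxa: bool = True) -> str:
    """Build a SPARQL query listing compounds (with InChIKey) found in a taxon.

    Hypothetical helper. Assumed property IDs: P703 = found in taxon,
    P235 = InChIKey, P171 = parent taxon.
    """
    if not re.fullmatch(r"Q[1-9][0-9]*", taxon_qid):
        raise ValueError(f"not a Wikidata item ID: {taxon_qid!r}")
    if include_subtaxa:
        # Follow zero or more "parent taxon" links up to the requested taxon.
        taxon_pattern = f"?taxon wdt:P171* wd:{taxon_qid} ."
    else:
        taxon_pattern = f"VALUES ?taxon {{ wd:{taxon_qid} }}"
    return (
        "SELECT DISTINCT ?compound ?inchikey ?taxon WHERE {\n"
        f"  {taxon_pattern}\n"
        "  ?compound wdt:P703 ?taxon ;\n"
        "            wdt:P235 ?inchikey .\n"
        "}"
    )


def query_url(sparql: str) -> str:
    """Return a GET URL asking the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})


if __name__ == "__main__":
    # Q135389 is the taxon item used in the Scholia link above.
    print(query_url(compounds_in_taxon_query("Q135389")))
```

Because the results are keyed by InChIKey rather than by label, they can be joined with a Zenodo snapshot without depending on names. Keep in mind that the same query can return different rows from one day to the next, since statements are added and removed continuously.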
We are really happy to discuss anything in more depth! Please reach out so we can see how to best help you achieve what you want (also including chemical similarity in the speciation gradient, for example, etc.)
from lotus-web.
Your work looks amazing (and we clearly need plant taxonomists!).
If there is anything we can do to help, very happy to!
What you suggest looks good.
I would recommend using https://decimer.ai/, developed by some of our collaborators, for structure recognition from images.
Do not hesitate to contact us for more details if needed.
PS: As a starting point: https://w.wiki/6bt5
@tatyanalivshultz By the way, thanks to some amazing collaborators, a huge list of novel alkaloid occurrences was added to Wikidata, see: https://www.wikidata.org/w/index.php?title=Special:Contributions/NPImporterBot&target=NPImporterBot&offset=&limit=500