Comments (4)

Adafede avatar Adafede commented on July 30, 2024

Hi @tatyanalivshultz!

Thank you very much for getting in touch! Your project looks fantastic. It is something we have also wanted to do for a long time. Really thrilled to see where this will go!
(I quickly went through your bibliography...very cool!)

For the specific questions:
First of all, the data on lotus.naturalproducts.net is not up to date. We currently have some limitations in updating it, so it is better to take the data from Wikidata, Zenodo (https://zenodo.org/communities/the-lotus-initiative), or PubChem (which is also fed from Zenodo).

  1. Searching for chemical compounds by name is strongly discouraged. Names are not "identifiers" at all and can lead to huge discrepancies. Checking the names of hundreds of thousands of compounds is not trivial, so many of them are possibly incorrect in many sources.
  2. This is exactly what I wanted to suggest. InChIKeys are the way to go.
  3. The names present on the website were generated using proprietary software (molconvert by ChemAxon). This is not the case anymore, which is why names can change. There are additional limitations in Wikidata: labels cannot be more than 250 characters long, so sometimes you might not find the name on Wikidata. Moreover, there is currently no "chemical name" property on Wikidata, so we rely only on the label, which anyone can change and possibly adapt to their language. Searching for "limonene" looks very intuitive, but if you want to do so across the whole tree of life, you will have to give up on names...
  4. What you mention here are tautomers. We have some of them in the LOTUS corpus, and possibly not all of them can be perfectly standardized. (The chemical "truth" is rather an equilibrium between the different species, which shifts depending on solvent, pH, etc.) This "problem" has been known in cheminformatics for many years, but I think there is still no real solution to it.
  5. The data on Wikidata changes constantly. If someone considered a "found in taxon" statement incorrect and removed it, it won't appear anymore. If someone adds new statements (like you did, thank you 😊), they won't appear on the other LOTUS endpoints instantly. We usually try to release quarterly versions of LOTUS that include all the new changes made on Wikidata; these are then stored on Zenodo.
  6. Wow, you dug deep, beautiful! Those statements (on the references) were actually made by one of our collaborators and were based on the "found in taxon" statements we had at the time. They will probably fall out of sync over time, as most people (probably 99%) will only update the data on one side. The tagging of "main subject" on articles was made to identify literature matching given subjects, mainly in the context of Scholia. See https://scholia.toolforge.org/taxon/Q135389 for example. This might change in the future following some of our recent discussions (https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Chemistry/Natural_products#Mapping_near_to_ubiquituous_compounds).
  7. True, because there is no "found in taxon" Acanthus statement anymore. It was removed from Wikidata (correctly or not: as with any community-based curation, 99% of it is good; we cannot avoid human errors, but overall it moves toward the better).
  8. I think you already found https://www.wikidata.org/wiki/Wikidata:WikiProject_Chemistry/Natural_products#Queries to guide your queries. Your query is correct, no issue: 6 of the 16 compounds present in the outdated data were removed.
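
To make points 2, 5 and 8 concrete, here is a minimal Python sketch of an identifier-based lookup against the live Wikidata data. It assumes the public SPARQL endpoint at https://query.wikidata.org/sparql and the Wikidata properties P703 ("found in taxon") and P235 (InChIKey). The function names and the User-Agent string are made up for illustration and are not part of any LOTUS tooling. The query only matches statements made directly on the given taxon item, not on its child taxa.

```python
import json
import re
import urllib.parse
import urllib.request

# Assumption: the public Wikidata SPARQL endpoint.
WIKIDATA_SPARQL = "https://query.wikidata.org/sparql"

# Standard InChIKey layout: 14 letters, 10 letters, 1 letter, hyphen-separated.
INCHIKEY_RE = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")


def is_inchikey(value: str) -> bool:
    """Return True if the string has the shape of a standard InChIKey."""
    return bool(INCHIKEY_RE.match(value))


def build_query(taxon_qid: str) -> str:
    """Build a SPARQL query listing compounds stated as found in one taxon."""
    if not re.fullmatch(r"Q[1-9]\d*", taxon_qid):
        raise ValueError(f"not a Wikidata item ID: {taxon_qid!r}")
    return (
        "SELECT DISTINCT ?compound ?inchikey WHERE {\n"
        f"  ?compound wdt:P703 wd:{taxon_qid} ;\n"  # P703 = found in taxon
        "            wdt:P235 ?inchikey .\n"        # P235 = InChIKey
        "}"
    )


def parse_bindings(payload: dict) -> dict:
    """Map InChIKey -> compound URI from a SPARQL JSON results document."""
    found = {}
    for row in payload["results"]["bindings"]:
        key = row["inchikey"]["value"]
        if is_inchikey(key):
            found[key] = row["compound"]["value"]
    return found


def fetch_compounds(taxon_qid: str, timeout: float = 60.0) -> dict:
    """Run the query against Wikidata (needs network access)."""
    url = WIKIDATA_SPARQL + "?" + urllib.parse.urlencode(
        {"query": build_query(taxon_qid)}
    )
    request = urllib.request.Request(
        url,
        headers={
            "Accept": "application/sparql-results+json",
            # Hypothetical User-Agent; replace with your own contact details.
            "User-Agent": "lotus-example-sketch/0.1 (illustrative only)",
        },
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_bindings(json.load(response))
```

Calling `fetch_compounds("Q135389")` would query the taxon item from the Scholia link in point 6. Because Wikidata changes constantly, two runs on different days can return different sets of InChIKeys, which is why comparing by InChIKey against a dated Zenodo release is more reproducible than comparing by name.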

We are really happy to discuss anything in more depth! Please reach out so we can see how best to help you achieve what you want (including, for example, chemical similarity in the speciation gradient).

from lotus-web.

tatyanalivshultz avatar tatyanalivshultz commented on July 30, 2024


Adafede avatar Adafede commented on July 30, 2024

Your work looks amazing (and we clearly need plant taxonomists!).

If there is anything we can do to help, very happy to!

What you suggest looks good.
I would recommend using https://decimer.ai/, developed by some of our collaborators, for structure recognition from images.

Do not hesitate to contact us for more details if needed.

PS: As a starting point: https://w.wiki/6bt5


Adafede avatar Adafede commented on July 30, 2024

@tatyanalivshultz By the way, thanks to some amazing collaborators, a huge list of novel alkaloid occurrences was added to Wikidata, see: https://www.wikidata.org/w/index.php?title=Special:Contributions/NPImporterBot&target=NPImporterBot&offset=&limit=500

