
Comments (10)

Adafede commented on July 30, 2024

Something like

SELECT DISTINCT ?structure ?structureLabel ?structure_smiles ?structure_cas ?structure_inchikey ?organism ?organism_name WHERE {
  VALUES ?taxon {
    wd:Q55925442
  }
  ?organism (wdt:P171*) ?taxon;
    wdt:P225 ?organism_name.
  ?structure (p:P703/ps:P703) ?organism.
  OPTIONAL { ?structure wdt:P235 ?structure_inchikey. }
  OPTIONAL { ?structure wdt:P233 ?structure_smiles. }
  OPTIONAL { ?structure wdt:P231 ?structure_cas. }
  BIND (CONCAT(COALESCE(?structure_inchikey,""), COALESCE(?structure_smiles,""), COALESCE(?structure_cas,"")) AS ?key)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  FILTER (STRLEN(?key) > 1)
}
LIMIT 100000

should do the trick. I do not think there are any "fully empty" entries to test but anyway...

from lotus-web.

Adafede commented on July 30, 2024

Hi @alrichardbollans,

You are perfectly right. What you were missing is OPTIONAL, which allows a property to be absent.

Here is probably what you were looking for:

SELECT DISTINCT ?structure ?structureLabel ?structure_smiles ?structure_cas ?structure_inchikey ?organism ?organism_name WHERE {
  VALUES ?taxon {
    wd:Q21754
  }
  ?organism (wdt:P171*) ?taxon;
    wdt:P225 ?organism_name.
  ?structure wdt:P233 ?structure_smiles;
    (p:P703/ps:P703) ?organism.
  OPTIONAL {
    ?structure wdt:P231 ?structure_cas;
      wdt:P235 ?structure_inchikey.
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 100000

Hope this answers your question, happy to elaborate if not. 👍🏼


alrichardbollans commented on July 30, 2024

Aha this is great, thanks! Still getting my head around SPARQL so this is really handy. How would I also make the SMILES key optional?


Adafede commented on July 30, 2024

The issue you might face by making it optional is that you would end up with things that are not necessarily small molecules. You would then need to restrict the results to given instances at the beginning of the query (as in https://www.wikidata.org/wiki/Wikidata:WikiProject_Chemistry_Natural_products#What_was_already_there?), and I am not sure you would get many more results.
You could also swap the roles of InChIKey and SMILES in the current query if you want to try.
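
A minimal sketch of that swap, assuming the query is run on the Wikidata Query Service (where the wd:, wdt:, p:, ps:, wikibase: and bd: prefixes are predefined). The InChIKey (P235) becomes the required identifier and the SMILES (P233) moves into its own OPTIONAL. Whether this returns more results than the SMILES-anchored version has not been checked here.

```sparql
SELECT DISTINCT ?structure ?structureLabel ?structure_smiles ?structure_cas ?structure_inchikey ?organism ?organism_name WHERE {
  VALUES ?taxon {
    wd:Q21754
  }
  ?organism (wdt:P171*) ?taxon;             # the taxon itself and all its descendants
    wdt:P225 ?organism_name.                # taxon name
  ?structure wdt:P235 ?structure_inchikey;  # InChIKey is now the required identifier
    (p:P703/ps:P703) ?organism.             # found in taxon
  OPTIONAL { ?structure wdt:P233 ?structure_smiles. }  # SMILES may be absent
  OPTIONAL { ?structure wdt:P231 ?structure_cas. }     # CAS may be absent
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 100000
```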


alrichardbollans commented on July 30, 2024

OK, thanks for this!


alrichardbollans commented on July 30, 2024

I've just noticed that the InChIKey isn't being returned for metabolites in some taxa, even though the InChIKey is given in LOTUS/Wikidata.
For example, with the query:

SELECT DISTINCT ?structure ?structureLabel ?structure_smiles ?structure_cas ?structure_inchikey ?organism ?organism_name WHERE {
  VALUES ?taxon {
    wd:Q55925442
  }
  ?organism (wdt:P171*) ?taxon; # Include children taxa
    wdt:P225 ?organism_name. # Get organism name
  ?structure wdt:P233 ?structure_smiles; # Get the SMILES
    (p:P703/ps:P703) ?organism.   # Found in given taxon/taxa
  OPTIONAL {
    ?structure wdt:P231 ?structure_cas;
      wdt:P235 ?structure_inchikey.
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 100000

The structure wd:Q104888293 is returned but no value is provided for its structure_inchikey. Why is this?


Adafede commented on July 30, 2024

Good catch!

Something like

SELECT DISTINCT ?structure ?structureLabel ?structure_smiles ?structure_cas ?structure_inchikey ?organism ?organism_name WHERE {
  VALUES ?taxon {
    wd:Q55925442
  }
  ?organism (wdt:P171*) ?taxon;
    wdt:P225 ?organism_name.
  ?structure wdt:P233 ?structure_smiles;
    (p:P703/ps:P703) ?organism.
  OPTIONAL { ?structure wdt:P235 ?structure_inchikey. }
  OPTIONAL { ?structure wdt:P231 ?structure_cas. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 100000

should solve this. Do not hesitate to reopen if needed.
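
The likely reason the grouped form fails: an OPTIONAL block matches as a whole, so when the CAS and InChIKey patterns sit in the same block and a structure lacks one of them (presumably the CAS here), neither variable gets bound. A small sketch to check this on the structure mentioned above, assuming the Wikidata Query Service with its predefined prefixes. The `_grouped` and `_separate` variable names are made up for illustration.

```sparql
SELECT ?structure ?cas_grouped ?inchikey_grouped ?cas_separate ?inchikey_separate WHERE {
  VALUES ?structure {
    wd:Q104888293
  }
  # Grouped: both triples must match, otherwise neither variable is bound.
  OPTIONAL {
    ?structure wdt:P231 ?cas_grouped;
      wdt:P235 ?inchikey_grouped.
  }
  # Separate: each identifier is bound independently of the other.
  OPTIONAL { ?structure wdt:P231 ?cas_separate. }
  OPTIONAL { ?structure wdt:P235 ?inchikey_separate. }
}
```

If the structure has an InChIKey but no CAS, the grouped columns should come back empty while ?inchikey_separate is filled.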


alrichardbollans commented on July 30, 2024

This is great! Is it possible to make the SMILES optional as well, or is this redundant? My attempt is:

SELECT DISTINCT ?structure ?structureLabel ?structure_smiles ?structure_cas ?structure_inchikey ?organism ?organism_name WHERE {
  VALUES ?taxon {
    wd:Q55925442
  }
  ?organism (wdt:P171*) ?taxon;
    wdt:P225 ?organism_name.
  ?structure (p:P703/ps:P703) ?organism.
  OPTIONAL { ?structure wdt:P235 ?structure_inchikey. }
  OPTIONAL { ?structure wdt:P233 ?structure_smiles. }
  OPTIONAL { ?structure wdt:P231 ?structure_cas. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 100


Adafede commented on July 30, 2024

I would not recommend it, but it is feasible. The problem is not redundancy but rather having something you can trust. I would hardly trust an (almost) empty entry with no SMILES, no CAS, and no InChIKey.


alrichardbollans commented on July 30, 2024

OK thanks, this is good to know. My intention is to incorporate this into my data by matching on CAS, SMILES or InChIKey, so effectively those instances with none of these would be ignored. I guess ideally the query would return all those metabolites with at least one of CAS, SMILES or InChIKey.
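
One way to express "at least one of the three" directly in the query is a FILTER over BOUND checks. This is only a sketch, assuming the Wikidata Query Service with its predefined prefixes. It keeps all three identifiers optional and drops rows where none of them was bound.

```sparql
SELECT DISTINCT ?structure ?structureLabel ?structure_smiles ?structure_cas ?structure_inchikey ?organism ?organism_name WHERE {
  VALUES ?taxon {
    wd:Q55925442
  }
  ?organism (wdt:P171*) ?taxon;
    wdt:P225 ?organism_name.
  ?structure (p:P703/ps:P703) ?organism.
  OPTIONAL { ?structure wdt:P235 ?structure_inchikey. }
  OPTIONAL { ?structure wdt:P233 ?structure_smiles. }
  OPTIONAL { ?structure wdt:P231 ?structure_cas. }
  # Keep a row only if at least one identifier was bound.
  FILTER (BOUND(?structure_inchikey) || BOUND(?structure_smiles) || BOUND(?structure_cas))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 100000
```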

