Comments (10)
Something like
SELECT DISTINCT ?structure ?structureLabel ?structure_smiles ?structure_cas ?structure_inchikey ?organism ?organism_name WHERE {
  VALUES ?taxon {
    wd:Q55925442
  }
  ?organism (wdt:P171*) ?taxon;                 # Include children taxa
    wdt:P225 ?organism_name.                    # Get organism name
  ?structure (p:P703/ps:P703) ?organism.        # Found in given taxon/taxa
  OPTIONAL { ?structure wdt:P235 ?structure_inchikey. }
  OPTIONAL { ?structure wdt:P233 ?structure_smiles. }
  OPTIONAL { ?structure wdt:P231 ?structure_cas. }
  # Concatenate whichever identifiers are bound; unbound ones count as ""
  BIND (CONCAT(COALESCE(?structure_inchikey,""), COALESCE(?structure_smiles,""), COALESCE(?structure_cas,"")) AS ?key)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  # Drop structures that have none of the three identifiers
  FILTER (STRLEN(?key) > 1)
}
LIMIT 100000
should do the trick. I do not think there are any "fully empty" entries to test it against, but anyway...
from lotus-web.
You are perfectly right. What you were missing is OPTIONAL, which allows a property to be absent.
Here is probably what you were looking for:
SELECT DISTINCT ?structure ?structureLabel ?structure_smiles ?structure_cas ?structure_inchikey ?organism ?organism_name WHERE {
  VALUES ?taxon {
    wd:Q21754
  }
  ?organism (wdt:P171*) ?taxon;
    wdt:P225 ?organism_name.
  ?structure wdt:P233 ?structure_smiles;
    (p:P703/ps:P703) ?organism.
  OPTIONAL {
    ?structure wdt:P231 ?structure_cas;
      wdt:P235 ?structure_inchikey.
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 100000
Hope this answers your question, happy to elaborate if not. 👍🏼
from lotus-web.
Aha this is great, thanks! Still getting my head around SPARQL so this is really handy. How would I also make the SMILES key optional?
from lotus-web.
The issue you might face by making it optional is that you would end up with things that are not necessarily small molecules. You would then have to force given instances at the beginning of the query (as in https://www.wikidata.org/wiki/Wikidata:WikiProject_Chemistry_Natural_products#What_was_already_there?), and I am not sure you would get many more results.
You could swap InChIKey and SMILES in the current query if you want to try.
from lotus-web.
OK, thanks for this!
from lotus-web.
I've just noticed that the InChIKey isn't being returned for metabolites in some taxa, even though the InChIKey is given in LOTUS/Wikidata.
For example, with the query:
SELECT DISTINCT ?structure ?structureLabel ?structure_smiles ?structure_cas ?structure_inchikey ?organism ?organism_name WHERE {
  VALUES ?taxon {
    wd:Q55925442
  }
  ?organism (wdt:P171*) ?taxon;           # Include children taxa
    wdt:P225 ?organism_name.              # Get organism name
  ?structure wdt:P233 ?structure_smiles;  # Get the SMILES
    (p:P703/ps:P703) ?organism.           # Found in given taxon/taxa
  OPTIONAL {
    ?structure wdt:P231 ?structure_cas;
      wdt:P235 ?structure_inchikey.
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 100000
The structure wd:Q104888293 is returned but no value is provided for its structure_inchikey. Why is this?
from lotus-web.
Good catch!
Something like
SELECT DISTINCT ?structure ?structureLabel ?structure_smiles ?structure_cas ?structure_inchikey ?organism ?organism_name WHERE {
  VALUES ?taxon {
    wd:Q55925442
  }
  ?organism (wdt:P171*) ?taxon;
    wdt:P225 ?organism_name.
  ?structure wdt:P233 ?structure_smiles;
    (p:P703/ps:P703) ?organism.
  OPTIONAL { ?structure wdt:P235 ?structure_inchikey. }
  OPTIONAL { ?structure wdt:P231 ?structure_cas. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 100000
should solve this. Do not hesitate to reopen if needed.
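The difference between the two queries above comes from how OPTIONAL works: an OPTIONAL group matches as a whole, so a block that asks for both the CAS number and the InChIKey binds neither when one of them is absent. The Python sketch below is only a toy model of that behaviour, not a SPARQL engine. The data, the placeholder values, and the function names are all hypothetical, and it ignores the fact that a real query can return several rows per structure.

```python
# Hypothetical toy data: identifier values per structure (placeholders, not Wikidata content).
FACTS = {
    "S1": {"smiles": "SMILES-1", "inchikey": "KEY-1", "cas": "CAS-1"},
    "S2": {"smiles": "SMILES-2", "inchikey": "KEY-2"},  # no CAS number
}


def optional_group(row, facts, props):
    """Mimic OPTIONAL { ... }: bind every property of the group, or none of them."""
    if all(p in facts for p in props):
        return {**row, **{p: facts[p] for p in props}}
    return dict(row)  # the group did not match as a whole: keep the row, bind nothing


def grouped(structure):
    """One OPTIONAL block holding both the CAS number and the InChIKey."""
    facts = FACTS[structure]
    row = {"structure": structure, "smiles": facts["smiles"]}
    return optional_group(row, facts, ["cas", "inchikey"])


def separate(structure):
    """One OPTIONAL block per property."""
    facts = FACTS[structure]
    row = {"structure": structure, "smiles": facts["smiles"]}
    row = optional_group(row, facts, ["inchikey"])
    return optional_group(row, facts, ["cas"])


if __name__ == "__main__":
    print(grouped("S2"))   # InChIKey is lost because the CAS number is missing
    print(separate("S2"))  # InChIKey is kept, CAS number simply stays unbound
```

In this model, `grouped("S2")` returns the row without an InChIKey while `separate("S2")` keeps it, which is the same effect as splitting the single OPTIONAL block into two.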
from lotus-web.
This is great! Is it possible to make the SMILES optional as well, or is this redundant? My attempt is:
SELECT DISTINCT ?structure ?structureLabel ?structure_smiles ?structure_cas ?structure_inchikey ?organism ?organism_name WHERE {
  VALUES ?taxon {
    wd:Q55925442
  }
  ?organism (wdt:P171*) ?taxon;
    wdt:P225 ?organism_name.
  ?structure (p:P703/ps:P703) ?organism.
  OPTIONAL { ?structure wdt:P235 ?structure_inchikey. }
  OPTIONAL { ?structure wdt:P233 ?structure_smiles. }
  OPTIONAL { ?structure wdt:P231 ?structure_cas. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 100
from lotus-web.
I would not recommend it, but it is feasible. The problem is not the redundancy but rather having something you can trust. I would hardly trust an (almost) empty entry with no SMILES, no CAS, and no InChIKey.
from lotus-web.
OK thanks, this is good to know. My intention is to incorporate this into my data by matching on CAS, SMILES or InChIKey, so effectively those instances with none of these would be ignored. I guess ideally the query would return all those metabolites with at least one of CAS, SMILES or InChIKey.
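The matching step described here could look roughly like the following Python sketch. It assumes the query results have already been loaded as a list of dictionaries keyed by the query's variable names, and that local records use the same field names; both are assumptions, as are the helper names. Identifiers are compared as exact strings, so two different SMILES spellings of the same molecule would not match.

```python
# Identifier fields, in the order they are tried when matching.
ID_FIELDS = ("structure_inchikey", "structure_smiles", "structure_cas")


def has_identifier(row):
    """True if the row carries at least one non-empty identifier."""
    return any(row.get(field) for field in ID_FIELDS)


def build_index(rows):
    """Index result rows by (field, value); rows without any identifier add nothing."""
    index = {}
    for row in rows:
        for field in ID_FIELDS:
            value = row.get(field)
            if value:
                # Keep the first row seen for a given identifier value.
                index.setdefault((field, value), row)
    return index


def match(record, index):
    """Return the first result row sharing an identifier with the record, else None."""
    for field in ID_FIELDS:
        value = record.get(field)
        if value and (field, value) in index:
            return index[(field, value)]
    return None


if __name__ == "__main__":
    # Hypothetical rows with placeholder values, not real query output.
    rows = [
        {"structure": "A", "structure_inchikey": "KEY-A"},
        {"structure": "B"},  # no identifier at all: can never be matched
        {"structure": "C", "structure_cas": "CAS-C"},
    ]
    index = build_index(rows)
    print(match({"structure_cas": "CAS-C"}, index))
```

With this approach, rows lacking all three identifiers are ignored automatically, whether or not the query filters them out.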
from lotus-web.