Comments (15)
I'll make a pull request. Docs with notes here:
https://docs.google.com/spreadsheets/d/1syH5W4o9uWDApb5LgT5conTAFCiN62yPgzu9FzwbuH0/edit?usp=sharing
@cmungall Can you comment on some of the issues in the notes?
from biolink-model.
cell https://www.wikidata.org/wiki/Q7868 vs cell type https://www.wikidata.org/wiki/Q189118 in wd
It looks like neuron is an instance of cell-type and a subclass of cell. This makes sense. I don't really see the use case for WD to have both, it's unlikely you would have actual cell instances in WD (maybe an instance for the ur-cell, or the ancestral cell of all euks and archaea?).
I found one case of a dual instance/subclass. I made a note here but I don't really know if I should be notifying a bot:
https://www.wikidata.org/wiki/Talk:Q2619679
The class/metaclass distinction is quite useful for organism vs taxon though
from biolink-model.
spoke too soon, here are some instances of cell (should be subclasses):
$ pq-wd "Cell=wd:'Q7868',instance_of(X,Cell),enlabel(X,XN)"
http://www.wikidata.org/entity/Q7868,http://www.wikidata.org/entity/Q5010870,CFU-E
http://www.wikidata.org/entity/Q7868,http://www.wikidata.org/entity/Q3493700,Splenocyte
http://www.wikidata.org/entity/Q7868,http://www.wikidata.org/entity/Q5712474,Hemocyte
http://www.wikidata.org/entity/Q7868,http://www.wikidata.org/entity/Q28000183,Medlar bodies
http://www.wikidata.org/entity/Q7868,http://www.wikidata.org/entity/Q101026,platelet
http://www.wikidata.org/entity/Q7868,http://www.wikidata.org/entity/Q574674,Anti-HBs
http://www.wikidata.org/entity/Q7868,http://www.wikidata.org/entity/Q963397,synoviocyte
http://www.wikidata.org/entity/Q7868,http://www.wikidata.org/entity/Q2619679,Akinete
http://www.wikidata.org/entity/Q7868,http://www.wikidata.org/entity/Q2382063,promyelocyte
http://www.wikidata.org/entity/Q7868,http://www.wikidata.org/entity/Q3108891,Glioblast
http://www.wikidata.org/entity/Q7868,http://www.wikidata.org/entity/Q632518,T helper cell
http://www.wikidata.org/entity/Q7868,http://www.wikidata.org/entity/Q1543282,Granulosa cell
http://www.wikidata.org/entity/Q7868,http://www.wikidata.org/entity/Q2594789,Band cell
http://www.wikidata.org/entity/Q7868,http://www.wikidata.org/entity/Q3270846,myelocyte
And subclasses of cell type:
$ pq-wd "CellType=wd:'Q189118',subclass_of(X,CellType),enlabel(X,XN)"
http://www.wikidata.org/entity/Q189118,http://www.wikidata.org/entity/Q47088881,nucleated cell
http://www.wikidata.org/entity/Q189118,http://www.wikidata.org/entity/Q6810199,Meiocyte
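For anyone without pq-wd, the two lookups above can probably be approximated in plain SPARQL on the Wikidata Query Service. A minimal sketch, assuming P31 ("instance of") and P279 ("subclass of") are the properties behind `instance_of` and `subclass_of`; the helper `direct_members_query` is hypothetical and only builds the query text, it does not run it:

```python
# Hypothetical helper, not part of pq-wd: it only builds SPARQL text.
INSTANCE_OF = "P31"    # Wikidata property "instance of"
SUBCLASS_OF = "P279"   # Wikidata property "subclass of"


def direct_members_query(class_qid, prop):
    """Return SPARQL listing items directly linked to class_qid via prop."""
    return (
        "SELECT ?x ?xLabel WHERE {\n"
        f"  ?x wdt:{prop} wd:{class_qid} .\n"
        '  ?x rdfs:label ?xLabel . FILTER(LANG(?xLabel) = "en")\n'
        "}"
    )


# Direct instances of cell (Q7868), i.e. the items that should be subclasses.
cell_instances = direct_members_query("Q7868", INSTANCE_OF)
# Direct subclasses of cell type (Q189118).
cell_type_subclasses = direct_members_query("Q189118", SUBCLASS_OF)
print(cell_instances)
```

The printed text can be pasted into the query service, which predefines the `wd:`, `wdt:` and `rdfs:` prefixes.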
from biolink-model.
Added some notes here https://www.wikidata.org/wiki/Talk:Q7868
from biolink-model.
symptom in wikidata. First there seems to be some confusion about classes vs instances.
# transitive subclass of symptom
$ pq-wd "isa_symptom(S),enlabel(S,SN)" S-SN | wc
786 2351 53252
# inferred instance of symptom
$ pq-wd "symptom_inf(S),enlabel(S,SN)" S-SN | wc
290 678 17496
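The two counts above could also be expressed as SPARQL property paths. This is a sketch under assumptions: that `isa_symptom` means reflexive-transitive subclass (`wdt:P279*`; the later `isa_trait` output lists trait itself, which suggests it is reflexive) and that `symptom_inf` means instance of any such subclass (`wdt:P31/wdt:P279*`). The function names are hypothetical, and since the symptom item's QID is not given in this thread the usage line uses cell (Q7868) instead:

```python
# Hypothetical helpers: they only build SPARQL text, assuming the pq-wd
# predicates correspond to the property paths described above.
def transitive_subclass_count_query(class_qid):
    """Count items that are class_qid itself or a transitive subclass of it."""
    return (
        "SELECT (COUNT(DISTINCT ?s) AS ?n) WHERE {\n"
        f"  ?s wdt:P279* wd:{class_qid} .\n"
        "}"
    )


def inferred_instance_count_query(class_qid):
    """Count items that are an instance of class_qid or of any subclass."""
    return (
        "SELECT (COUNT(DISTINCT ?s) AS ?n) WHERE {\n"
        f"  ?s wdt:P31/wdt:P279* wd:{class_qid} .\n"
        "}"
    )


# Q7868 (cell) is used only because it appears earlier in the thread.
subclass_q = transitive_subclass_count_query("Q7868")
instance_q = inferred_instance_count_query("Q7868")
print(subclass_q)
print(instance_q)
```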
all dual instance/subclass
$ pq-wd "symptom_inf(S),isa_symptom(S),enlabel(S,SN)" S-SN
http://www.wikidata.org/entity/Q86,headache
http://www.wikidata.org/entity/Q183425,abdominal pain
http://www.wikidata.org/entity/Q35830,sneeze
http://www.wikidata.org/entity/Q3002092,abdominal cramps
http://www.wikidata.org/entity/Q2673323,malaise
http://www.wikidata.org/entity/Q3589142,epigastric pain
http://www.wikidata.org/entity/Q245455,chromosome 5q deletion syndrome
http://www.wikidata.org/entity/Q186889,nausea
http://www.wikidata.org/entity/Q270421,muscle weakness
http://www.wikidata.org/entity/Q537297,heartburn
http://www.wikidata.org/entity/Q1338684,tension headache
http://www.wikidata.org/entity/Q693058,chest pain
http://www.wikidata.org/entity/Q21109236,urinary burning
http://www.wikidata.org/entity/Q8038367,wrist pain
http://www.wikidata.org/entity/Q21077144,abnormal hematologic indices
http://www.wikidata.org/entity/Q21109840,chest tightness
http://www.wikidata.org/entity/Q21120091,discomfort
http://www.wikidata.org/entity/Q21117104,weak feet
http://www.wikidata.org/entity/Q21117872,deep pain
http://www.wikidata.org/entity/Q21402621,greatly increased nasal secretions and oral secretions
http://www.wikidata.org/entity/Q21120264,precordial pain
http://www.wikidata.org/entity/Q21077147,eye ache
http://www.wikidata.org/entity/Q21110281,dry burning throat
http://www.wikidata.org/entity/Q21119839,eye pain
http://www.wikidata.org/entity/Q21120154,pain on inspiration
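The dual instance/subclass list is effectively a set intersection of the two earlier result sets. A rough sketch of doing that offline on saved output, assuming two-column `uri,label` rows like the ones above; `parse_rows` and `dual_typed` are hypothetical names, and the `example.org` rows are made-up placeholders so the intersection has something to exclude:

```python
def parse_rows(text):
    """Map entity URI -> label from 'uri,label' lines.

    Splits on the first comma only, so labels containing commas survive.
    """
    rows = {}
    for line in text.strip().splitlines():
        uri, label = line.split(",", 1)
        rows[uri] = label
    return rows


def dual_typed(subclass_rows, instance_rows):
    """Return URIs present in both result sets, sorted for stable output."""
    return sorted(subclass_rows.keys() & instance_rows.keys())


# headache and nausea come from the listing above; the example.org rows
# are placeholders, not real query results.
subclass_out = """http://www.wikidata.org/entity/Q86,headache
http://www.wikidata.org/entity/Q186889,nausea
http://example.org/placeholder-subclass-only,placeholder"""
instance_out = """http://www.wikidata.org/entity/Q86,headache
http://www.wikidata.org/entity/Q186889,nausea
http://example.org/placeholder-instance-only,placeholder"""

dual = dual_typed(parse_rows(subclass_out), parse_rows(instance_out))
print(dual)
```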
symptom in WD appears to be general constitutive symptoms?
Then we have terms like https://www.wikidata.org/wiki/Q1100988 micrognathism which is neither isa nor part of
it's under:
https://www.wikidata.org/wiki/Q6869195
which is under 'clinical sign'
https://www.wikidata.org/wiki/Q1441305
oh no, please not the sign vs symptom distinction...
going up we have
https://www.wikidata.org/wiki/Q28807560 clinical finding
which is a
https://www.wikidata.org/wiki/Q639907 medical finding
I don't know the distinction between medical and clinical finding, but I think medical-finding is closer to phenotypic feature. And symptom is already a subtype of clinical finding.
So Q639907 may be the best, but this is highly inappropriate for model organism phenotypic features...
from biolink-model.
We also have https://www.wikidata.org/wiki/Q1921834 feature
another general question is what to do with traits in biolink-model. Do we treat as phenotypes? Or have a separate class?
wd trait subclasses
$ pq-wd "isa_trait(T),enlabel(T,TN)"
http://www.wikidata.org/entity/Q1211967,trait
http://www.wikidata.org/entity/Q7243545,primitive
http://www.wikidata.org/entity/Q23786,eye color
and instances
$ pq-wd "trait_inf(T),enlabel(T,TN)"
http://www.wikidata.org/entity/Q80157,temperament
http://www.wikidata.org/entity/Q17122705,brown
http://www.wikidata.org/entity/Q16939403,blue-green
http://www.wikidata.org/entity/Q27839441,purple
http://www.wikidata.org/entity/Q27777837,yellow
http://www.wikidata.org/entity/Q30069237,dark eyes
http://www.wikidata.org/entity/Q30069240,light eyes
http://www.wikidata.org/entity/Q42845936,blue-gray
http://www.wikidata.org/entity/Q17126729,red
http://www.wikidata.org/entity/Q17122854,green
http://www.wikidata.org/entity/Q17122740,hazel
http://www.wikidata.org/entity/Q17122834,blue
http://www.wikidata.org/entity/Q17244465,black
http://www.wikidata.org/entity/Q17291407,amber
http://www.wikidata.org/entity/Q19359739,Midphalangeal hair
http://www.wikidata.org/entity/Q17245659,grey
http://www.wikidata.org/entity/Q17244894,dark brown
Midphalangeal hair is a phenotypic feature (or trait value). The others are values.
Anyway this may be sorted out with a general alignment of WD to phenotype ontologies, trait ontologies (e.g. OBA) and PATO (primitive attributes and values)
from biolink-model.
As you may have noticed, these types of entities don't have a widespread and systematic structure... (We, the genewiki team, haven't done much specifically with symptoms, findings, phenotypes, cell types, etc.) Being able to make well-structured queries on them will have to wait until they are better organized (i.e. aligned with some ontology). In the meantime, we can construct custom queries to get some of the more useful parts out...
from biolink-model.
Just to give one of Wikidata's statements (as of this moment) that speaks to the quality of its model of the world: the set cell is a subset of the set cellular component (part of a cell), i.e. "cell is part of a cell" = a set is a subset of its own subset (Russell's paradox?)
from biolink-model.
Thanks @stuppie - hope to have some time to help with cells and findings in WD. The mappings thus far in your spreadsheet seem good, will work on more later
@fractaler - the statement makes sense to me. cell subClassOf cellular component (which is consistent with GO). CC is a confusing term though, it doesn't mean cell part, it is a designation for things at a certain level of granularity
from biolink-model.
Right, @fractaler, this aligns with GO. The cellular component is not "a component of a cell", it's the root node for the CC ontology. Cell is a type of cellular component.
from biolink-model.
I am just taking what Wikidata says. And Wikidata says: cell: the basic structural and functional unit of all organisms. OK, let's take, for example, a one-cell organism: unicellular organism (organism that consists of only one cell). Substitute that value for the variable "organism" and we get: "cell is the basic structural and functional unit of organism that consists of only one cell". I would not recommend this model to anyone.
Wikidata currently does not allow creating an accurate, scientific model; it is dominated by terminological chaos. For example, it contains homonyms: parent = parents, child = children, chemical = chemicals, sibling = siblings, ancestor = ancestors, first-degree relative = first-degree relatives, etc. "Cell" is also a homonym: 1) unit of multicellular organism structure, 2) unicellular organism.
from biolink-model.
Hello, @cmungall, @stuppie and @fractaler,
I found this issue while looking at the talk page of the item cell.
I am planning to work on cleaning up the problems around cell type definitions on Wikidata in the near future, and it is good to see that these issues affect other Wikidata users. I mean, it is good to see that solving them may have practical value.
I am specifically focused on the issues about cell types. If you have any suggestions of issues that, if solved on Wikidata, would improve its value for the Cell Ontology / OBO community, that would be great.
Also, if there are people actively working on this in 2020, I would love to join and help.
from biolink-model.
What is the status of this?
from biolink-model.
Needs more input from @cmungall
from biolink-model.
@cmungall - I think I am going to close this for now. Nomi and I traced the PR associated with this case and it got merged (and then WD ids were removed from the model). We can definitely reopen if necessary.
from biolink-model.