Comments (2)

anergictcell commented on May 26, 2024

Hi @bheavner,

Interesting question. I had to look into it a bit, and the answer is not straightforward.

  1. PyHPO currently uses the HPO masterdata from 2022-04-14. There is one newer release from June that I haven't updated to yet (I will do so once I find some time). However, that does not explain the discrepancy.
  2. PyHPO relies on the masterdata from JAX, and when I test PyHPO, I always compare it to the HPO browser from JAX: https://hpo.jax.org/app/.
  3. On the JAX HPO browser, the list of child terms for HP:0003674 is identical to the one from PyHPO:

     [Screenshot of the JAX HPO browser, taken 2022-07-13]

  4. HPOTerm.children returns only the list of direct children. It does not go down another level to grandchildren etc.

Some of the terms in the EBI browser are not direct children of HP:0003674, e.g.:
Childhood onset | HP:0011463

[Term]
id: HP:0011463
name: Childhood onset
alt_id: HP:0003586
alt_id: HP:0003617
def: "Onset of disease at the age of between 1 and 5 years." [DDD:hfirth]
comment: This term refers to ages up to but not including the fifth birthday (see Juvenile onset).
synonym: "Symptoms begin in childhood" EXACT layperson [orcid.org/0000-0002-6548-5200]
xref: UMLS:C1837352
is_a: HP:0410280 ! Pediatric onset
created_by: peter
creation_date: 2012-03-25T07:16:20Z

The actual hierarchy for Childhood onset looks like:

- HP:0003674 ! Onset
    -  HP:0410280 ! Pediatric onset
        - HP:0011463 ! Childhood onset

This means that EBI is also listing indirect children and not just direct children, which could explain the difference.
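The difference between direct and indirect children can be sketched without PyHPO. This is a minimal illustration only: the `DIRECT_CHILDREN` dict and the `all_descendants` helper are hypothetical stand-ins built from the three terms above, not PyHPO's data model or API.

```python
from collections import deque

# Toy parent -> direct children mapping, limited to the three terms shown above.
# Assumption: a plain dict stands in for the ontology; this is not PyHPO's model.
DIRECT_CHILDREN = {
    "HP:0003674": ["HP:0410280"],  # Onset -> Pediatric onset
    "HP:0410280": ["HP:0011463"],  # Pediatric onset -> Childhood onset
    "HP:0011463": [],              # Childhood onset (treated as a leaf in this toy example)
}


def all_descendants(term_id, direct_children):
    """Return every direct and indirect child of term_id (breadth-first walk)."""
    seen = set()
    queue = deque(direct_children.get(term_id, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # a term can be reachable through more than one parent
        seen.add(current)
        queue.extend(direct_children.get(current, []))
    return seen


direct = set(DIRECT_CHILDREN["HP:0003674"])
everything = all_descendants("HP:0003674", DIRECT_CHILDREN)
indirect_only = everything - direct

print(sorted(direct))         # direct children of Onset
print(sorted(indirect_only))  # terms reachable only through another child
```

In this toy hierarchy, Childhood onset shows up only in `indirect_only`, which is the kind of term a browser lists when it mixes both levels.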

We can get a list of all (direct and indirect) children through a function like:

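from pyhpo import Ontology

# Initialize the ontology once before looking up terms
_ = Ontology()


def all_children(term):
    """Return all direct and indirect children of an HPOTerm."""
    children = set()
    for child in term.children:
        children.add(child)
        # Recurse so that grandchildren etc. are included as well
        children.update(all_children(child))
    return children


# Ontology[3674] looks up HP:0003674 (Onset) by its integer id
for term in sorted(all_children(Ontology[3674]), key=lambda t: t.name):
    print(term)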

# HP:0003581 | Adult onset
# HP:0030674 | Antenatal onset
# HP:0011463 | Childhood onset
# HP:0003577 | Congenital onset
# HP:0025708 | Early young adult onset
# HP:0011460 | Embryonal onset
# HP:0011461 | Fetal onset
# HP:0003593 | Infantile onset
# HP:0025709 | Intermediate young adult onset
# HP:0003621 | Juvenile onset
# HP:0034199 | Late first trimester onset
# HP:0003584 | Late onset
# HP:0025710 | Late young adult onset
# HP:0003596 | Middle age onset
# HP:0003623 | Neonatal onset
# HP:0410280 | Pediatric onset
# HP:4000040 | Puerpural onset
# HP:0034198 | Second trimester onset
# HP:0034197 | Third trimester onset
# HP:0011462 | Young adult onset

Interestingly, this list now has more children than are shown in the EBI browser. I tried looking into which terms are shown at EBI and which ones aren't, but I could not find a good explanation.
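For anyone who wants to repeat that comparison, a rough sketch with plain sets might look like the following. The `compare_term_ids` helper is hypothetical, and the `other_browser_ids` list is a made-up placeholder, not the actual EBI listing; it would need to be replaced with the ids copied from the browser being compared.

```python
def compare_term_ids(ours, theirs):
    """Split two collections of HPO ids into shared / only-ours / only-theirs."""
    ours, theirs = set(ours), set(theirs)
    return {
        "shared": sorted(ours & theirs),
        "only_ours": sorted(ours - theirs),
        "only_theirs": sorted(theirs - ours),
    }


# A few ids taken from the PyHPO output above (a subset, for brevity).
pyhpo_ids = ["HP:0003581", "HP:0011463", "HP:0410280", "HP:0034199"]

# PLACEHOLDER data: not the real EBI list. Paste the ids from the other browser here.
other_browser_ids = ["HP:0003581", "HP:0011463"]

result = compare_term_ids(pyhpo_ids, other_browser_ids)
for label, ids in result.items():
    print(label, ids)
```

The `only_ours` bucket is the interesting one here, since it lists the terms PyHPO reports that the other browser does not show.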

So, long story short: The EBI browser shows direct children and some indirect children. I'm sure there is some logic behind it, but I don't have experience with EBI myself and thus cannot really explain this difference.
I usually use the JAX HPO browser https://hpo.jax.org/app/ for browsing the HPO ontology visually. It is developed by the same team that maintains the HPO itself, so I trust that it shows the most up-to-date and correct data at any time.

bheavner commented on May 26, 2024

Thank you, this is very helpful! Your suggestion resolves my immediate use case, but I'm not sure if I should close this issue or not.
