
txt2hpo's Introduction

txt2hpo

txt2hpo is a Python library for extracting HPO-encoded phenotypes from text. txt2hpo recognizes differences in inflection (e.g. hypotonic vs. hypotonia), handles negation and comes with a built-in medical spellchecker.

Installation

Install using pip

pip install txt2hpo

Install from GitHub

git clone https://github.com/GeneDx/txt2hpo.git
cd txt2hpo
python setup.py install

Library usage

from txt2hpo.extract import Extractor
extract = Extractor()

result = extract.hpo("patient with developmental delay and hypotonia")

print(result.hpids)


["HP:0001263", "HP:0001290"]
    

txt2hpo will attempt to correct spelling errors by default, at the cost of slower processing. This feature can be turned off by setting the correct_spelling flag to False.

from txt2hpo.extract import Extractor
extract = Extractor(correct_spelling = False)
result = extract.hpo("patient with devlopental delay and hyptonia")

print(result.hpids)

[]
 
    

txt2hpo handles negation using negspaCy. To remove negated phenotypes, set the remove_negated flag to True. Both the extracted and the negated HPO terms can then be retrieved.

from txt2hpo.extract import Extractor
extract = Extractor(remove_negated=True)
result = extract.hpo("patient has developmental delay but no hypotonia")

print(result.hpids)

["HP:0001263"]

print(result.negated_hpids)

["HP:0001252"]
    

txt2hpo picks the longest overlapping phenotype by default. To disable this feature, set the remove_overlapping flag to False.

from txt2hpo.extract import Extractor
extract = Extractor(remove_overlapping=False)
result = extract.hpo("patient with polycystic kidney disease")

print(result.hpids)

["HP:0000113", "HP:0000112"]


extract = Extractor(remove_overlapping=True)
result = extract.hpo("patient with polycystic kidney disease")

print(result.hpids)

["HP:0000113"]
 
    

txt2hpo outputs a valid JSON string, which contains information about extracted HPIDs, their character span and matched string.

from txt2hpo.extract import Extractor
extract = Extractor()

result = extract.hpo("patient with developmental delay and hypotonia")

print(result.json)


'[{"hpid": ["HP:0001290"], "index": [37, 46], "matched": "hypotonia"}, 
{"hpid": ["HP:0001263"], "index": [13, 32], "matched": "developmental delay"}]'
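The JSON string can be decoded with the standard library. The following is a minimal sketch that uses the sample output above as a literal (in practice it would come from result.json); the helper name spans_in_text_order is hypothetical and not part of txt2hpo. It sorts the matches by character offset and checks each span against the input text.

```python
import json

# Sample output copied from the example above; in practice this would be result.json.
raw = (
    '[{"hpid": ["HP:0001290"], "index": [37, 46], "matched": "hypotonia"}, '
    '{"hpid": ["HP:0001263"], "index": [13, 32], "matched": "developmental delay"}]'
)
text = "patient with developmental delay and hypotonia"


def spans_in_text_order(raw_json):
    """Return (start, end, matched, hpids) tuples sorted by start offset.

    Hypothetical helper for illustration, not part of txt2hpo.
    """
    entries = json.loads(raw_json)
    rows = [(e["index"][0], e["index"][1], e["matched"], e["hpid"]) for e in entries]
    return sorted(rows)


for start, end, matched, hpids in spans_in_text_order(raw):
    # The "index" pair slices the original text back to the matched string.
    print(start, end, matched, hpids, text[start:end] == matched)
```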

    

txt2hpo's People

Contributors

arvkevi, bennwei, dependabot[bot], jamienoss, mcgeestephen, rebeccaito, vgainullin, vincent-ustach


txt2hpo's Issues

Installation deprecated

I'm having issues with installing the package. Trying to install the package using setup.py, I get

Processing dependencies for txt2hpo==2021.0
Searching for gensim==3.8.1
Reading https://pypi.org/simple/gensim/
Download error on https://pypi.org/simple/gensim/: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997) -- Some packages may not be found!
Couldn't find index page for 'gensim' (maybe misspelled?)
Scanning index of all packages (this may take a while)
Reading https://pypi.org/simple/
Download error on https://pypi.org/simple/: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997) -- Some packages may not be found!
No local packages or working download links found for gensim==3.8.1
error: Could not find suitable distribution for Requirement.parse('gensim==3.8.1')

Is this a problem with gensim or txt2hpo?

Installation through pip gives:

`Installing build dependencies ... error
error: subprocess-exited-with-error

× pip subprocess to install build dependencies did not run successfully.
│ exit code: 1
╰─> [212 lines of output]
Collecting setuptools
Using cached setuptools-62.3.2-py3-none-any.whl (1.2 MB)
Collecting wheel
Using cached wheel-0.37.1-py2.py3-none-any.whl (35 kB)
Collecting cython>=0.25
Using cached Cython-0.29.30-py2.py3-none-any.whl (985 kB)
Collecting cymem<2.1.0,>=2.0.2
Using cached cymem-2.0.6-cp310-cp310-macosx_11_0_arm64.whl (30 kB)
Collecting preshed<3.1.0,>=3.0.2
Using cached preshed-3.0.6-cp310-cp310-macosx_11_0_arm64.whl (101 kB)
Collecting murmurhash<1.1.0,>=0.28.0
Using cached murmurhash-1.0.7-cp310-cp310-macosx_11_0_arm64.whl (19 kB)
Collecting thinc==7.4.0
Using cached thinc-7.4.0.tar.gz (1.3 MB)
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'done'
Collecting blis<0.5.0,>=0.4.0
Using cached blis-0.4.1.tar.gz (1.8 MB)
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'done'
Collecting wasabi<1.1.0,>=0.0.9
Using cached wasabi-0.9.1-py3-none-any.whl (26 kB)
Collecting srsly<1.1.0,>=0.0.6
Using cached srsly-1.0.5-cp310-cp310-macosx_10_9_universal2.whl
Collecting catalogue<1.1.0,>=0.0.7
Using cached catalogue-1.0.0-py2.py3-none-any.whl (7.7 kB)
Collecting numpy>=1.7.0
Using cached numpy-1.22.4-cp310-cp310-macosx_11_0_arm64.whl (12.8 MB)
Collecting plac<1.2.0,>=0.9.6
Using cached plac-1.1.3-py2.py3-none-any.whl (20 kB)
Collecting tqdm<5.0.0,>=4.10.0
Using cached tqdm-4.64.0-py2.py3-none-any.whl (78 kB)
Building wheels for collected packages: thinc, blis
Building wheel for thinc (setup.py): started
Building wheel for thinc (setup.py): finished with status 'error'
error: subprocess-exited-with-error

    × python setup.py bdist_wheel did not run successfully.
    │ exit code: 1
    ╰─> [32 lines of output]
        /Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/installer.py:27: SetuptoolsDeprecationWarning: setuptools.installer is deprecated. Requirements should be satisfied by a PEP 517 installer.
          warnings.warn(
        Traceback (most recent call last):
          File "<string>", line 2, in <module>
          File "<pip-setuptools-caller>", line 34, in <module>
          File "/private/var/folders/wp/j33q8lm90yx5crb6z4192vbw0000gn/T/pip-install-t6ymh64w/thinc_94f9b50ea8df4fe28f30b0ef670e6c3e/setup.py", line 264, in <module>
            setup_package()
          File "/private/var/folders/wp/j33q8lm90yx5crb6z4192vbw0000gn/T/pip-install-t6ymh64w/thinc_94f9b50ea8df4fe28f30b0ef670e6c3e/setup.py", line 201, in setup_package
            setup(
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/__init__.py", line 86, in setup
            _install_setup_requires(attrs)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/__init__.py", line 80, in _install_setup_requires
            dist.fetch_build_eggs(dist.setup_requires)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/dist.py", line 861, in fetch_build_eggs
            resolved_dists = pkg_resources.working_set.resolve(
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/pkg_resources/__init__.py", line 789, in resolve
            dist = best[req.key] = env.best_match(
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/pkg_resources/__init__.py", line 1075, in best_match
            return self.obtain(req, installer)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/pkg_resources/__init__.py", line 1087, in obtain
            return installer(requirement)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/dist.py", line 941, in fetch_build_egg
            return fetch_build_egg(self, req)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/installer.py", line 87, in fetch_build_egg
            wheel.install_as_egg(dist_location)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/wheel.py", line 96, in install_as_egg
            self._install_as_egg(destination_eggdir, zf)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/wheel.py", line 104, in _install_as_egg
            self._convert_metadata(zf, destination_eggdir, dist_info, egg_info)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/wheel.py", line 148, in _convert_metadata
            os.rename(dist_info, egg_info)
        OSError: [Errno 66] Directory not empty: '/private/var/folders/wp/j33q8lm90yx5crb6z4192vbw0000gn/T/pip-install-t6ymh64w/thinc_94f9b50ea8df4fe28f30b0ef670e6c3e/.eggs/numpy-1.22.4-py3.10-macosx-10.9-universal2.egg/numpy-1.22.4.dist-info' -> '/private/var/folders/wp/j33q8lm90yx5crb6z4192vbw0000gn/T/pip-install-t6ymh64w/thinc_94f9b50ea8df4fe28f30b0ef670e6c3e/.eggs/numpy-1.22.4-py3.10-macosx-10.9-universal2.egg/EGG-INFO'
        [end of output]
  
    note: This error originates from a subprocess, and is likely not a problem with pip.
    ERROR: Failed building wheel for thinc
    Running setup.py clean for thinc
    Building wheel for blis (setup.py): started
    Building wheel for blis (setup.py): finished with status 'error'
    error: subprocess-exited-with-error
  
    × python setup.py bdist_wheel did not run successfully.
    │ exit code: 1
    ╰─> [31 lines of output]
        BLIS_COMPILER? None
        /Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/installer.py:27: SetuptoolsDeprecationWarning: setuptools.installer is deprecated. Requirements should be satisfied by a PEP 517 installer.
          warnings.warn(
        Traceback (most recent call last):
          File "<string>", line 2, in <module>
          File "<pip-setuptools-caller>", line 34, in <module>
          File "/private/var/folders/wp/j33q8lm90yx5crb6z4192vbw0000gn/T/pip-install-t6ymh64w/blis_2c4c791d2a814128855adaa7496d734e/setup.py", line 239, in <module>
            setup(
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/__init__.py", line 86, in setup
            _install_setup_requires(attrs)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/__init__.py", line 80, in _install_setup_requires
            dist.fetch_build_eggs(dist.setup_requires)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/dist.py", line 861, in fetch_build_eggs
            resolved_dists = pkg_resources.working_set.resolve(
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/pkg_resources/__init__.py", line 789, in resolve
            dist = best[req.key] = env.best_match(
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/pkg_resources/__init__.py", line 1075, in best_match
            return self.obtain(req, installer)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/pkg_resources/__init__.py", line 1087, in obtain
            return installer(requirement)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/dist.py", line 941, in fetch_build_egg
            return fetch_build_egg(self, req)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/installer.py", line 87, in fetch_build_egg
            wheel.install_as_egg(dist_location)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/wheel.py", line 96, in install_as_egg
            self._install_as_egg(destination_eggdir, zf)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/wheel.py", line 104, in _install_as_egg
            self._convert_metadata(zf, destination_eggdir, dist_info, egg_info)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/wheel.py", line 148, in _convert_metadata
            os.rename(dist_info, egg_info)
        OSError: [Errno 66] Directory not empty: '/private/var/folders/wp/j33q8lm90yx5crb6z4192vbw0000gn/T/pip-install-t6ymh64w/blis_2c4c791d2a814128855adaa7496d734e/.eggs/numpy-1.22.4-py3.10-macosx-10.9-universal2.egg/numpy-1.22.4.dist-info' -> '/private/var/folders/wp/j33q8lm90yx5crb6z4192vbw0000gn/T/pip-install-t6ymh64w/blis_2c4c791d2a814128855adaa7496d734e/.eggs/numpy-1.22.4-py3.10-macosx-10.9-universal2.egg/EGG-INFO'
        [end of output]
  
    note: This error originates from a subprocess, and is likely not a problem with pip.
    ERROR: Failed building wheel for blis
    Running setup.py clean for blis
    error: subprocess-exited-with-error
  
    × python setup.py clean did not run successfully.
    │ exit code: 1
    ╰─> [31 lines of output]
        BLIS_COMPILER? None
        /Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/installer.py:27: SetuptoolsDeprecationWarning: setuptools.installer is deprecated. Requirements should be satisfied by a PEP 517 installer.
          warnings.warn(
        Traceback (most recent call last):
          File "<string>", line 2, in <module>
          File "<pip-setuptools-caller>", line 34, in <module>
          File "/private/var/folders/wp/j33q8lm90yx5crb6z4192vbw0000gn/T/pip-install-t6ymh64w/blis_2c4c791d2a814128855adaa7496d734e/setup.py", line 239, in <module>
            setup(
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/__init__.py", line 86, in setup
            _install_setup_requires(attrs)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/__init__.py", line 80, in _install_setup_requires
            dist.fetch_build_eggs(dist.setup_requires)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/dist.py", line 861, in fetch_build_eggs
            resolved_dists = pkg_resources.working_set.resolve(
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/pkg_resources/__init__.py", line 789, in resolve
            dist = best[req.key] = env.best_match(
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/pkg_resources/__init__.py", line 1075, in best_match
            return self.obtain(req, installer)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/pkg_resources/__init__.py", line 1087, in obtain
            return installer(requirement)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/dist.py", line 941, in fetch_build_egg
            return fetch_build_egg(self, req)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/installer.py", line 87, in fetch_build_egg
            wheel.install_as_egg(dist_location)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/wheel.py", line 96, in install_as_egg
            self._install_as_egg(destination_eggdir, zf)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/wheel.py", line 104, in _install_as_egg
            self._convert_metadata(zf, destination_eggdir, dist_info, egg_info)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/wheel.py", line 148, in _convert_metadata
            os.rename(dist_info, egg_info)
        OSError: [Errno 66] Directory not empty: '/private/var/folders/wp/j33q8lm90yx5crb6z4192vbw0000gn/T/pip-install-t6ymh64w/blis_2c4c791d2a814128855adaa7496d734e/.eggs/numpy-1.22.4-py3.10-macosx-10.9-universal2.egg/numpy-1.22.4.dist-info' -> '/private/var/folders/wp/j33q8lm90yx5crb6z4192vbw0000gn/T/pip-install-t6ymh64w/blis_2c4c791d2a814128855adaa7496d734e/.eggs/numpy-1.22.4-py3.10-macosx-10.9-universal2.egg/EGG-INFO'
        [end of output]
  
    note: This error originates from a subprocess, and is likely not a problem with pip.
    ERROR: Failed cleaning build dir for blis
  Failed to build thinc blis
  Installing collected packages: wasabi, srsly, plac, murmurhash, cymem, wheel, tqdm, setuptools, preshed, numpy, cython, catalogue, blis, thinc
    Running setup.py install for blis: started
    Running setup.py install for blis: finished with status 'error'
    error: subprocess-exited-with-error
  
    × Running setup.py install for blis did not run successfully.
    │ exit code: 1
    ╰─> [31 lines of output]
        BLIS_COMPILER? None
        /Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/installer.py:27: SetuptoolsDeprecationWarning: setuptools.installer is deprecated. Requirements should be satisfied by a PEP 517 installer.
          warnings.warn(
        Traceback (most recent call last):
          File "<string>", line 2, in <module>
          File "<pip-setuptools-caller>", line 34, in <module>
          File "/private/var/folders/wp/j33q8lm90yx5crb6z4192vbw0000gn/T/pip-install-t6ymh64w/blis_2c4c791d2a814128855adaa7496d734e/setup.py", line 239, in <module>
            setup(
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/__init__.py", line 86, in setup
            _install_setup_requires(attrs)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/__init__.py", line 80, in _install_setup_requires
            dist.fetch_build_eggs(dist.setup_requires)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/dist.py", line 861, in fetch_build_eggs
            resolved_dists = pkg_resources.working_set.resolve(
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/pkg_resources/__init__.py", line 789, in resolve
            dist = best[req.key] = env.best_match(
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/pkg_resources/__init__.py", line 1075, in best_match
            return self.obtain(req, installer)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/pkg_resources/__init__.py", line 1087, in obtain
            return installer(requirement)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/dist.py", line 941, in fetch_build_egg
            return fetch_build_egg(self, req)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/installer.py", line 87, in fetch_build_egg
            wheel.install_as_egg(dist_location)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/wheel.py", line 96, in install_as_egg
            self._install_as_egg(destination_eggdir, zf)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/wheel.py", line 104, in _install_as_egg
            self._convert_metadata(zf, destination_eggdir, dist_info, egg_info)
          File "/Users/tuomas.poutanen/koodi/txt2hpo/venv/lib/python3.10/site-packages/setuptools/wheel.py", line 148, in _convert_metadata
            os.rename(dist_info, egg_info)
        OSError: [Errno 66] Directory not empty: '/private/var/folders/wp/j33q8lm90yx5crb6z4192vbw0000gn/T/pip-install-t6ymh64w/blis_2c4c791d2a814128855adaa7496d734e/.eggs/numpy-1.22.4-py3.10-macosx-10.9-universal2.egg/numpy-1.22.4.dist-info' -> '/private/var/folders/wp/j33q8lm90yx5crb6z4192vbw0000gn/T/pip-install-t6ymh64w/blis_2c4c791d2a814128855adaa7496d734e/.eggs/numpy-1.22.4-py3.10-macosx-10.9-universal2.egg/EGG-INFO'
        [end of output]
  
    note: This error originates from a subprocess, and is likely not a problem with pip.
  error: legacy-install-failure
  
  × Encountered error while trying to install package.
  ╰─> blis
  
  note: This is an issue with the package mentioned above, not pip.
  hint: See above for output from the failure.
  WARNING: You are using pip version 22.0.4; however, version 22.1.2 is available.
  You should consider upgrading via the '/Users/tuomas.poutanen/koodi/txt2hpo/venv/bin/python -m pip install --upgrade pip' command.
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× pip subprocess to install build dependencies did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.`

AttributeError: vocab attribute removed from Keyed Vector in Gensim 4.0.0

Odd AttributeError with the following query:

from txt2hpo.extract import Extractor
extract=Extractor()
extract.hpo('blood group').hpids

Traceback (most recent call last):
  File "", line 1, in
  File "/home/stephen/anaconda3/lib/python3.8/site-packages/txt2hpo-0.2.2-py3.8.egg/txt2hpo/extract.py", line 236, in hpo
    extracted_terms.resolve_conflicts()
  File "/home/stephen/anaconda3/lib/python3.8/site-packages/txt2hpo-0.2.2-py3.8.egg/txt2hpo/extract.py", line 98, in resolve_conflicts
    similarity_scores.append(similarity_term_to_context(term, entry['context'], self.model))
  File "/home/stephen/anaconda3/lib/python3.8/site-packages/txt2hpo-0.2.2-py3.8.egg/txt2hpo/nlp.py", line 79, in similarity_term_to_context
    term_tokens = remove_out_of_vocab(remove_stopwords(hpo_term_definition).split())
  File "/home/stephen/anaconda3/lib/python3.8/site-packages/txt2hpo-0.2.2-py3.8.egg/txt2hpo/nlp.py", line 75, in remove_out_of_vocab
    return [x for x in tokens if x in model.vocab]
  File "/home/stephen/anaconda3/lib/python3.8/site-packages/txt2hpo-0.2.2-py3.8.egg/txt2hpo/nlp.py", line 75, in
    return [x for x in tokens if x in model.vocab]
  File "/home/stephen/anaconda3/lib/python3.8/site-packages/gensim-4.0.0b0-py3.8-linux-x86_64.egg/gensim/models/keyedvectors.py", line 648, in vocab
    raise AttributeError(
AttributeError: The vocab attribute was removed from KeyedVector in Gensim 4.0.0.
Use KeyedVector's .key_to_index dict, .index_to_key list, and methods .get_vecattr(key, attr) and .set_vecattr(key, attr, new_val) instead.
See https://github.com/RaRe-Technologies/gensim/wiki/Migrating-from-Gensim-3.x-to-4
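One possible workaround, sketched here under assumptions, is a vocabulary lookup that works with both Gensim 3.x and 4.x. The function below mirrors the name remove_out_of_vocab from the traceback, but it takes the model as an explicit argument, which is an assumption made so the example is self-contained. The two stub classes are hypothetical stand-ins for KeyedVectors objects, not real Gensim classes.

```python
def remove_out_of_vocab(tokens, model):
    """Keep only tokens that the embedding model knows about.

    Gensim 4.x exposes the vocabulary as `key_to_index`; Gensim 3.x used `vocab`.
    """
    # Look for key_to_index first: touching .vocab on Gensim 4.x raises AttributeError.
    vocabulary = getattr(model, "key_to_index", None)
    if vocabulary is None:
        vocabulary = model.vocab
    return [token for token in tokens if token in vocabulary]


class Gensim3Like:
    """Hypothetical stand-in for a Gensim 3.x KeyedVectors object."""
    vocab = {"blood": object(), "group": object()}


class Gensim4Like:
    """Hypothetical stand-in for a Gensim 4.x KeyedVectors object."""
    key_to_index = {"blood": 0, "group": 1}

    @property
    def vocab(self):
        raise AttributeError("The vocab attribute was removed from KeyedVector in Gensim 4.0.0.")


print(remove_out_of_vocab(["blood", "group", "xyz"], Gensim3Like()))
print(remove_out_of_vocab(["blood", "group", "xyz"], Gensim4Like()))
```

Pinning gensim to a 3.x release, as the setup.py log in the installation issue suggests the project does, is the other obvious route.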

TypeError: expected str, bytes or os.PathLike object, not NoneType

I have an issue with the example code. When I run the code below, it raises a TypeError. Could you help me?

  • code:
from txt2hpo.extract import Extractor
extract = Extractor()
result = extract.hpo("patient with developmental delay and hypotonia")
print(result.hpids)
  • results
Traceback (most recent call last):
  File "D:/PPD/2hpo.py", line 2, in <module>
    from txt2hpo.extract import Extractor
  File "D:\ProgramData\Anaconda3\envs\python36\lib\site-packages\txt2hpo\extract.py", line 7, in <module>
    from txt2hpo.build_tree import update_progress, hpo_network
  File "D:\ProgramData\Anaconda3\envs\python36\lib\site-packages\txt2hpo\build_tree.py", line 4, in <module>
    from txt2hpo.config import logger, config
  File "D:\ProgramData\Anaconda3\envs\python36\lib\site-packages\txt2hpo\config.py", line 29, in <module>
    config_directory = os.path.join(os.environ.get('HOME'), f'.{__project__}')
  File "D:\ProgramData\Anaconda3\envs\python36\lib\ntpath.py", line 76, in join
    path = os.fspath(path)
TypeError: expected str, bytes or os.PathLike object, not NoneType
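The traceback shows os.environ.get('HOME') returning None, which is common on Windows where HOME is usually not set. A minimal sketch of a more portable lookup follows; the function name config_directory, its parameters, and the fallback order are assumptions for illustration, not the project's actual fix.

```python
import os
from pathlib import Path


def config_directory(project="txt2hpo", environ=None):
    """Return the per-user config directory for `project`.

    Hypothetical helper: tries HOME, then USERPROFILE (the usual Windows
    variable), and finally lets pathlib work out the home directory.
    """
    environ = os.environ if environ is None else environ
    home = environ.get("HOME") or environ.get("USERPROFILE")
    base = Path(home) if home else Path.home()
    return base / f".{project}"


print(config_directory())
```

As a stopgap that needs no code change, setting a HOME environment variable before importing txt2hpo should also avoid the None value.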

ACB-69

The following should return HP:0002107

from txt2hpo.extract import Extractor
ext = Extractor(remove_negated=True)
ext.hpo("The trachea is midline and there is no pneumothorax").negated_hpids

[]

The following does return HP:0002107 but only because there are two spaces between "no" and "pneumothorax"

ext.hpo("The trachea is midline and there is no  pneumothorax").negated_hpids

['HP:0002107']

The following should return HP:0002108 but instead returns the term for "pneumothorax"

ext.hpo("The trachea is midline and there is no spontaneous pneumothorax").negated_hpids

['HP:0002107']

Retaining duplicate terms

This may sound niche, but I was thinking of using txt2hpo to extract terms for tf-idf calculations, and term frequency is nearly impossible to calculate without counting duplicates. Is there a quick and dirty way to retain duplicates through an option? Or could one be added to the code?
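One way to approach this, sketched under a loud assumption, is to count HPIDs from the per-match JSON output instead of from hpids. The sample string below is hypothetical: it assumes each occurrence in the text yields its own JSON entry, which is not confirmed for txt2hpo. The helper name term_frequencies is also made up for illustration.

```python
import json
from collections import Counter

# Hypothetical JSON in the same shape as result.json, for the text
# "hypotonia noted; hypotonia persists with developmental delay".
# Assumption: repeated mentions appear as separate entries.
sample = (
    '[{"hpid": ["HP:0001290"], "index": [0, 9], "matched": "hypotonia"}, '
    '{"hpid": ["HP:0001290"], "index": [17, 26], "matched": "hypotonia"}, '
    '{"hpid": ["HP:0001263"], "index": [41, 60], "matched": "developmental delay"}]'
)


def term_frequencies(raw_json):
    """Count how often each HPID occurs across all matched spans."""
    counts = Counter()
    for entry in json.loads(raw_json):
        counts.update(entry["hpid"])
    return counts


print(term_frequencies(sample))
```

If the JSON output turns out to be deduplicated as well, the counting would have to happen inside the extractor itself.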

Removing "A" as a stop word

Querying "blood group" returns two HPIDs when resolve_conflicts=False:

from txt2hpo.extract import Extractor
extract=Extractor(resolve_conflicts=False,correct_spelling=False)
extract.hpo('blood group').hpids

['HP:0032370','HP:0032223']

These refer to the "Blood group A" and "Blood group" phenotypes, respectively. The model considers the two phenotypes identical because "A" is treated as a stop word.
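The collision can be reproduced without txt2hpo. The sketch below uses an illustrative stop-word set and a hypothetical normalize function; neither is taken from the library, and its real stop-word list and normalization steps may differ.

```python
# Illustrative stop-word subset; txt2hpo's actual list is not shown in this document.
STOP_WORDS = frozenset({"a", "an", "the", "of"})


def normalize(name, stop_words=STOP_WORDS):
    """Lowercase a phenotype name and drop stop words (hypothetical helper)."""
    return " ".join(w for w in name.lower().split() if w not in stop_words)


names = {"HP:0032370": "Blood group A", "HP:0032223": "Blood group"}

# With "a" as a stop word, both names collapse to the same string.
collapsed = {hpid: normalize(name) for hpid, name in names.items()}
# With "a" removed from the stop words, the two names stay distinct.
distinct = {hpid: normalize(name, STOP_WORDS - {"a"}) for hpid, name in names.items()}

print(collapsed)
print(distinct)
```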

Fix installation

Relates to #59

  • Need to identify the minimum working version of Python for the project and state it explicitly in the documentation
  • Investigate installation options/requirements for M1 Mac users

Returning HPO terms that do NOT apply

There is interest in augmenting txt2hpo to be able to return phenotypes that explicitly do NOT apply. Currently, the remove_negated=True argument identifies these scenarios (via negspaCy), but subsequently removes the HPO terms from the output. This could be implemented in one of two ways:

  1. The addition of a new feature, such as a boolean flag (e.g. retain_negated=True), could permit the software to output this information in the form of a separate list.
  2. Adjusting the Data class object to retain the list of negated terms (edit to the following module)

TypeError: type() takes 1 or 3 arguments

When using Python 3.9, I'm getting the following error:

TypeError: type() takes 1 or 3 arguments
Traceback:
File "/Users/tarunmamidi/opt/anaconda3/envs/hazel/lib/python3.9/site-packages/streamlit/script_runner.py", line 430, in _run_script
    exec(code, module.__dict__)
File "/Users/tarunmamidi/Documents/Development/uab-meter/src/streamlit.py", line 39, in <module>
    main()
File "/Users/tarunmamidi/Documents/Development/uab-meter/src/streamlit.py", line 35, in main
    page = page()
File "/Users/tarunmamidi/Documents/Development/uab-meter/src/diagnosis/diagnosis.py", line 80, in hd
    data, id_to_name, res, gene2hpo, flat_ls, staged_data, extract = load_data()
File "/Users/tarunmamidi/opt/anaconda3/envs/hazel/lib/python3.9/site-packages/streamlit/legacy_caching/caching.py", line 573, in wrapped_func
    return get_or_create_cached_value()
File "/Users/tarunmamidi/opt/anaconda3/envs/hazel/lib/python3.9/site-packages/streamlit/legacy_caching/caching.py", line 557, in get_or_create_cached_value
    return_value = func(*args, **kwargs)
File "/Users/tarunmamidi/Documents/Development/uab-meter/src/diagnosis/diagnosis.py", line 75, in load_data
    extract = Extractor(remove_negated=True)
File "/Users/tarunmamidi/opt/anaconda3/envs/hazel/lib/python3.9/site-packages/txt2hpo/extract.py", line 186, in __init__
    self.negation_model = nlp_model(negation_language=negation_language)
File "/Users/tarunmamidi/opt/anaconda3/envs/hazel/lib/python3.9/site-packages/txt2hpo/nlp.py", line 14, in nlp_model
    nlp.add_pipe(nlp.create_pipe('sentencizer'))
File "/Users/tarunmamidi/opt/anaconda3/envs/hazel/lib/python3.9/site-packages/spacy/language.py", line 302, in create_pipe
    return factory(self, **config)
File "/Users/tarunmamidi/opt/anaconda3/envs/hazel/lib/python3.9/site-packages/spacy/language.py", line 1045, in factory
    return obj.from_nlp(nlp, **cfg)
File "pipes.pyx", line 1465, in spacy.pipeline.pipes.Sentencizer.from_nlp

I saw the similar issue here.

Can you please help me with this?

Skipping parents/sibling phenotypes

A male patient with the following phrase - “mother was treated for trichomonas vaginitis during pregnancy” returns HP:0030683 for "Vaginitis". Can we adapt txt2hpo to identify/remove phenotypes from parents and siblings?
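Until txt2hpo supports this directly, one crude option is to filter the text before extraction. The sketch below drops any sentence that mentions a family member. The term list, the regex-based sentence splitting, and the function name drop_family_sentences are all assumptions for illustration; the heuristic will also discard sentences that mention a relative alongside a finding in the patient.

```python
import re

# Illustrative, deliberately short list of family terms (an assumption).
FAMILY_TERMS = ("mother", "father", "brother", "sister", "sibling", "parent")
FAMILY_PATTERN = re.compile(r"\b(?:" + "|".join(FAMILY_TERMS) + r")s?\b", re.IGNORECASE)


def drop_family_sentences(text):
    """Remove sentences that mention a family member (hypothetical pre-filter)."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(s for s in sentences if not FAMILY_PATTERN.search(s))


note = (
    "Patient has hypotonia. "
    "Mother was treated for trichomonas vaginitis during pregnancy."
)
print(drop_family_sentences(note))
```

The filtered string would then be passed to extract.hpo() in place of the raw note.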
