genewalk's Issues

KeyError: 'ensembl_id'

Hi, I tried running the newest version with Ensembl IDs, and after around 1 hour of running time using 20 cores this is what I get, which looks like a bug to me:

INFO: [2019-09-23 17:22:32] gensim.models.base_any2vec - worker thread finished; awaiting finish of 2 more threads
INFO: [2019-09-23 17:22:32] gensim.models.base_any2vec - worker thread finished; awaiting finish of 1 more threads
INFO: [2019-09-23 17:22:32] gensim.models.base_any2vec - worker thread finished; awaiting finish of 0 more threads
INFO: [2019-09-23 17:22:32] gensim.models.base_any2vec - EPOCH - 5 : training on 164576000 raw words (164576000 effective words) took 119.9s, 1372717 effective words/s
INFO: [2019-09-23 17:22:32] gensim.models.base_any2vec - training on a 822880000 raw words (822880000 effective words) took 566.7s, 1452084 effective words/s
INFO: [2019-09-23 17:22:32] genewalk.deepwalk - Generating node vectors done in 610.30s
INFO: [2019-09-23 17:22:33] genewalk.cli - Saving into /home/carnold/genewalk/cll_test/deepwalk_node_vectors_rand_3.pkl...
INFO: [2019-09-23 17:22:41] genewalk.cli - Saving into /home/carnold/genewalk/cll_test/genewalk_rand_simdists.pkl...
INFO: [2019-09-23 17:22:41] genewalk.cli - Loading /home/carnold/genewalk/cll_test/multi_graph.pkl...
INFO: [2019-09-23 17:22:41] genewalk.cli - Loading /home/carnold/genewalk/cll_test/genes.pkl...
INFO: [2019-09-23 17:22:41] genewalk.cli - Loading /home/carnold/genewalk/cll_test/deepwalk_node_vectors_1.pkl...
INFO: [2019-09-23 17:22:42] genewalk.cli - Loading /home/carnold/genewalk/cll_test/deepwalk_node_vectors_2.pkl...
INFO: [2019-09-23 17:22:42] genewalk.cli - Loading /home/carnold/genewalk/cll_test/deepwalk_node_vectors_3.pkl...
INFO: [2019-09-23 17:22:42] genewalk.cli - Loading /home/carnold/genewalk/cll_test/genewalk_rand_simdists.pkl...
Traceback (most recent call last):
File "bla/TOOLS/miniconda/lib/python3.7/site-packages/pandas/core/indexes/", line 2897, in get_loc
return self._engine.get_loc(key)
File "pandas/_libs/index.pyx", line 107, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 131, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1607, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1614, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'ensembl_id'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "bla/TOOLS/miniconda/bin/genewalk", line 10, in
File "bla/TOOLS/miniconda/lib/python3.7/site-packages/genewalk/", line 203, in main
File "bla/TOOLS/miniconda/lib/python3.7/site-packages/genewalk/", line 178, in generate_output
df[base_id_type] = df[base_id_type].astype('category')
File "bla/TOOLS/miniconda/lib/python3.7/site-packages/pandas/core/", line 2980, in getitem
indexer = self.columns.get_loc(key)
File "bla/TOOLS/miniconda/lib/python3.7/site-packages/pandas/core/indexes/", line 2899, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas/_libs/index.pyx", line 107, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 131, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1607, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1614, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'ensembl_id'


Criteria for barplot output

Hi all,

Quick question: can you explain the criteria used to determine which barplots are automatically generated in the output files? The GitHub page says "barplots with GO annotations ranked by relevance for each input gene that GeneWalk was able to generate results for," but I'm not sure whether this is meant to explain why only a subset of the barplots is produced and on what basis they are selected.


extending to KEGG


It's a very cool method. I'm wondering if the method could be extended to enrichment on KEGG terms?


Code for regulator/moonlighter plots


I was curious if you have a python or R script for reproducing the scatterplots such that individual genes of interest can be labeled and the size of the plot can be adjusted for easier visualization for publication?


Ensembl IDs with dots cause problems

Hi, another Ensembl ID related issue: when the IDs contain the ".X" version notation, like ".3", the mapping fails for all of them. The pipeline still runs through but produces an empty file at the end. I think this should be improved like this:

  1. If none of the IDs could be mapped, abort right away.
  2. For Ensembl IDs, if an ID ends with ".X", X being any integer, remove that suffix from the ID and then do the mapping.

We can of course also remove them ourselves, but this should be stated somewhere, and the nicest option is of course to do it automatically for the user :)
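
The two suggestions above could be sketched roughly as follows. This is only an illustration, not GeneWalk's actual code: `strip_ensembl_version`, `map_or_abort`, and the `known_ids` set are hypothetical names, and the sketch assumes the version suffix is always a dot followed by digits at the end of the ID.

```python
import re

# Trailing ".<integer>" version suffix, e.g. ".3" in "ENSG00000141510.3".
_VERSION_SUFFIX = re.compile(r"\.\d+$")


def strip_ensembl_version(gene_id: str) -> str:
    """Return the ID without a trailing version suffix (if any)."""
    return _VERSION_SUFFIX.sub("", gene_id.strip())


def map_or_abort(gene_ids, known_ids):
    """Strip version suffixes, keep IDs found in known_ids, abort if none map.

    known_ids is a hypothetical set of mappable IDs used for illustration.
    """
    stripped = [strip_ensembl_version(g) for g in gene_ids]
    mapped = [g for g in stripped if g in known_ids]
    if not mapped:
        # Suggestion 1: fail right away instead of producing an empty result file.
        raise ValueError("None of the input IDs could be mapped.")
    return mapped
```

Until something like this is built in, the same stripping can be applied to the gene list file before it is passed to `--genes`.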

Importing the numpy c-extensions failed.


Thank you for the super interesting package.
I successfully installed it with Anaconda on my local Windows 10 machine.
Now I am trying to run genewalk on our cluster (Ubuntu, 2.6.32-431.20.3.el6.x86_64).

genewalk --project qki --genes /home/gitpycode/Documents/genes.csv --id_type mgi_id

I have already set up the whole installation multiple times, using virtual environments and different versions of Python (3.5.0 and 3.7.0), and I always get the same error message:

Traceback (most recent call last):
  File "/home/gitpycode/gwalk1/lib/python3.7/site-packages/numpy/core/", line 17, in <module>
    from . import multiarray
  File "/home/gitpycode/gwalk1/lib/python3.7/site-packages/numpy/core/", line 14, in <module>
    from . import overrides
  File "/home/gitpycode/gwalk1/lib/python3.7/site-packages/numpy/core/", line 7, in <module>
    from numpy.core._multiarray_umath import (
ImportError: PyCapsule_Import could not import module "datetime"

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/gitpycode/gwalk1/bin/genewalk", line 5, in <module>
    from genewalk.cli import main
  File "/home/gitpycode/gwalk1/lib/python3.7/site-packages/genewalk/", line 8, in <module>
    import numpy as np
  File "/home/gitpycode/gwalk1/lib/python3.7/site-packages/numpy/", line 142, in <module>
    from . import core
  File "/home/gitpycode/gwalk1/lib/python3.7/site-packages/numpy/core/", line 47, in <module>
    raise ImportError(msg)


Importing the numpy c-extensions failed.
- Try uninstalling and reinstalling numpy.
- If you have already done that, then:
  1. Check that you expected to use Python3.7 from "/home/gitpycode/gwalk1/bin/python3",
     and that you have no directories in your PATH or PYTHONPATH that can
     interfere with the Python and numpy version "1.17.3" you're trying to use.
  2. If (1) looks fine, you can open a new issue at  Please include details on:
     - how you installed Python
     - how you installed numpy
     - your operating system
     - whether or not you have multiple versions of Python installed
     - if you built from source, your compiler versions and ideally a build log

- If you're working with a numpy git repository, try `git clean -xdf`
  (removes all files not under version control) and rebuild numpy.

Note: this error has many possible causes, so please don't comment on
an existing issue about this - open a new one instead.

Original error was: PyCapsule_Import could not import module "datetime"

Segmentation fault

Can somebody help me to identify the problem?
Thank you for your help!

compiling the resource folder - error


Thanks again for making this and making it available.

I'm new to Python. I apologize if my error is something basic, but I'd appreciate anyone taking a look:

I installed genewalk using
pip install genewalk

got one error:
indra 1.15.1 has requirement networkx<=2.3,>=2, but you'll have networkx 2.4 which is incompatible.

but then indra (1.15.1) installs anyway and appears in the list when I run
conda list

I've attempted to run the following command in both a Python 3.7 and a Python 3.5 environment and got basically the same error each time. Any ideas?

(py35) osx2560:~ James$ genewalk --project QKI --genes ~/Downloads/QKI_forGW.csv --id_type mgi_id
INFO: [2019-10-30 08:50:15] genewalk.cli - Creating project folder at /Users/James/genewalk/QKI
INFO: [2019-10-30 08:50:15] genewalk.resources - Using /Users/James/genewalk/resources as resource folder.
INFO: [2019-10-30 08:50:15] genewalk.resources - Downloading into /Users/James/genewalk/resources/go.obo
Traceback (most recent call last):
  File "/Users/James/miniconda3/envs/py35/bin/genewalk", line 11, in <module>
  File "/Users/James/miniconda3/envs/py35/lib/python3.5/site-packages/genewalk/", line 145, in main
  File "/Users/James/miniconda3/envs/py35/lib/python3.5/site-packages/genewalk/", line 51, in download_all
  File "/Users/James/miniconda3/envs/py35/lib/python3.5/site-packages/genewalk/", line 20, in get_go_obo
  File "/Users/James/miniconda3/envs/py35/lib/python3.5/site-packages/genewalk/", line 59, in download_go
    urllib.request.urlretrieve(url, fname)
  File "/Users/James/miniconda3/envs/py35/lib/python3.5/urllib/", line 188, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/Users/James/miniconda3/envs/py35/lib/python3.5/urllib/", line 163, in urlopen
    return, data, timeout)
  File "/Users/James/miniconda3/envs/py35/lib/python3.5/urllib/", line 472, in open
    response = meth(req, response)
  File "/Users/James/miniconda3/envs/py35/lib/python3.5/urllib/", line 582, in http_response
    'http', request, response, code, msg, hdrs)
  File "/Users/James/miniconda3/envs/py35/lib/python3.5/urllib/", line 510, in error
    return self._call_chain(*args)
  File "/Users/James/miniconda3/envs/py35/lib/python3.5/urllib/", line 444, in _call_chain
    result = func(*args)
  File "/Users/James/miniconda3/envs/py35/lib/python3.5/urllib/", line 590, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

Trouble Using GeneWalk

Hello Churchman Lab Team,

I have recently tried to implement your module for an enrichment analysis on genes I got from a differential gene expression analysis. However, an error keeps on recurring and I am unsure of what the problem is.

My Python version is 3.8.6, which should be able to run GeneWalk. I also installed the module with no errors. The following lines show up when I try to run the module:

$ genewalk --project PMS --genes PMSUpGenesOnly.txt --id_type hgnc_symbol
INFO: [2021-03-01 13:54:37] genewalk.cli - Creating PMS folder at /Users/sinjiafan/genewalk/PMS
INFO: [2021-03-01 13:54:37] genewalk.resources - Using /Users/sinjiafan/genewalk/resources as resource folder.
INFO: [2021-03-01 13:54:37] genewalk.resources - Downloading into /Users/sinjiafan/genewalk/resources/hgnc_entries.tsv
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/", line 1350, in do_open
    h.request(req.get_method(), req.selector,, headers,
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/", line 1255, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/", line 1301, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/", line 1250, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/", line 1010, in _send_output
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/", line 950, in send
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/", line 1424, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/", line 1040, in _create
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/", line 1309, in do_handshake
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1124)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/bin/genewalk", line 8, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/genewalk/", line 157, in main
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/genewalk/", line 195, in run_main
    genes = read_gene_list(args.genes, args.id_type, rm)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/genewalk/", line 31, in read_gene_list
    gene_mapper = GeneMapper(resource_manager)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/genewalk/", line 232, in __init__
    self.hgnc_file = self.resource_manager.get_hgnc()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/genewalk/", line 78, in get_hgnc
    download_url(url, fname)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/genewalk/", line 125, in download_url
    urllib.request.urlretrieve(url, fname)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/", line 247, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/", line 222, in urlopen
    return, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/", line 525, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/", line 502, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/", line 1393, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/", line 1353, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1124)>

I have attached the gene list I am trying to run as well.


I hope you can help me resolve this issue. Thank you so much in advance!

Sinja (Xuanjia) Fan


Hi, a quick question: is it possible to also support Ensembl IDs as input?

Installation failed.

Hi there,

I created a fresh conda environment with
conda create -n genewalk python=3.5

and installed genewalk using
pip install git+

but genewalk -h gave me this error:

Traceback (most recent call last):
  File "/exports/igmm/eddie/Glioblastoma-WGS/anaconda/envs/genewalk/bin/genewalk", line 5, in <module>
    from genewalk.cli import main
  File "/exports/igmm/eddie/Glioblastoma-WGS/anaconda/envs/genewalk/lib/python3.5/site-packages/genewalk/", line 11, in <module>
    from genewalk.nx_mg_assembler import load_network
  File "/exports/igmm/eddie/Glioblastoma-WGS/anaconda/envs/genewalk/lib/python3.5/site-packages/genewalk/", line 6, in <module>
    from indra.databases import go_client
  File "/exports/igmm/eddie/Glioblastoma-WGS/anaconda/envs/genewalk/lib/python3.5/site-packages/indra/databases/", line 7, in <module>
    from .identifiers import get_identifiers_url, parse_identifiers_url, \
  File "/exports/igmm/eddie/Glioblastoma-WGS/anaconda/envs/genewalk/lib/python3.5/site-packages/indra/databases/", line 302
    if not db_id.startswith(f'{db_ns}{colon}'):
SyntaxError: invalid syntax

Could you help me troubleshoot please?


AttributeError: module 'typing' has no attribute 'NoReturn'

genewalk --project RNAseq9 --genes cluster9genelist.txt --id_type custom --network_source sif_annot --network_file fullnetwork.txt --base_folder Genewalk --nproc 8


Traceback (most recent call last):
  File "/n/groups/churchman/Genewalk/genewalkenv/bin/genewalk", line 5, in <module>
    from genewalk.cli import main
  File "/n/groups/churchman/Genewalk/genewalkenv/lib/python3.6/site-packages/genewalk/", line 20, in <module>
    from genewalk.plot import GW_Plotter
  File "/n/groups/churchman/Genewalk/genewalkenv/lib/python3.6/site-packages/genewalk/", line 10, in <module>
    import as px
  File "/n/groups/churchman/Genewalk/genewalkenv/lib/python3.6/site-packages/plotly/", line 34, in <module>
    from plotly import (
  File "/n/groups/churchman/Genewalk/genewalkenv/lib/python3.6/site-packages/plotly/io/", line 6, in <module>
    from . import orca, kaleido
  File "/n/groups/churchman/Genewalk/genewalkenv/lib/python3.6/site-packages/plotly/io/", line 1, in <module>
    from ._orca import (
  File "/n/groups/churchman/Genewalk/genewalkenv/lib/python3.6/site-packages/plotly/io/", line 15, in <module>
    import tenacity
  File "/n/groups/churchman/Genewalk/genewalkenv/lib/python3.6/site-packages/tenacity/", line 184, in <module>
    class RetryError(Exception):
  File "/n/groups/churchman/Genewalk/genewalkenv/lib/python3.6/site-packages/tenacity/", line 191, in RetryError
    def reraise(self) -> t.NoReturn:
AttributeError: module 'typing' has no attribute 'NoReturn'

Genewalk$ pip freeze


IndexError: list index out of range

I ran genewalk from a virtual environment (to avoid conflicting version dependencies) and received the following error (“IndexError”):

Traceback (most recent call last):
File "/software/genewalk/Python-3.9.0-genewalk-1.4.0-venv/bin/genewalk", line 5, in
from genewalk.cli import main
File "/software/genewalk/Python-3.9.0-genewalk-1.4.0-venv/lib/python3.9/site-packages/genewalk/", line 12, in
from genewalk.gene_lists import read_gene_list
File "/software/genewalk/Python-3.9.0-genewalk-1.4.0-venv/lib/python3.9/site-packages/genewalk/", line 8, in
from indra.databases import hgnc_client, uniprot_client
File "/software/genewalk/Python-3.9.0-genewalk-1.4.0-venv/lib/python3.9/site-packages/indra/databases/", line 10, in
from protmapper.uniprot_client import *
File "/software/genewalk/Python-3.9.0-genewalk-1.4.0-venv/lib/python3.9/site-packages/protmapper/", line 16, in
from protmapper.api import ProtMapper, MappedSite
File "/software/genewalk/Python-3.9.0-genewalk-1.4.0-venv/lib/python3.9/site-packages/protmapper/", line 712, in
File "/software/genewalk/Python-3.9.0-genewalk-1.4.0-venv/lib/python3.9/site-packages/protmapper/", line 1221, in _build_hgnc_mappings
uniprot_id = row[6]
IndexError: list index out of range

Any thoughts on why this error arises?

Error when calling word2vec: unexpected keyword argument 'size'

GeneWalk needs updating because of the Gensim 4.0.0 release

For users running into the following error:
  File "/lib64/python3.6/site-packages/genewalk/", line 138, in word2vec
    sample=sample)
TypeError: __init__() got an unexpected keyword argument 'size'

Immediate fix: downgrade gensim to previous version before running genewalk:
pip install --upgrade gensim==3.8.3

Long term solution that I will implement very soon: make GeneWalk compatible with gensim 4.0.0.

More info on the Gensim migration:
In Word2Vec, the size constructor parameter is now consistently named vector_size.
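
As an illustration of what a version-tolerant call could look like, here is a minimal sketch. It is not the fix that went into GeneWalk: `word2vec_dim_kwargs` is a hypothetical helper, and `walks` in the usage comment stands in for whatever list of random walks is being embedded.

```python
def word2vec_dim_kwargs(gensim_version: str, dim: int) -> dict:
    """Return the keyword argument that sets the embedding dimension.

    Gensim 4.0.0 renamed the Word2Vec constructor parameter `size`
    to `vector_size`; older releases only accept `size`.
    """
    major = int(gensim_version.split(".")[0])
    key = "vector_size" if major >= 4 else "size"
    return {key: dim}


# Hypothetical usage (requires gensim; `walks` is an assumed list of walks):
#   import gensim
#   from gensim.models import Word2Vec
#   model = Word2Vec(sentences=walks, **word2vec_dim_kwargs(gensim.__version__, 8))
```

Pinning gensim to 3.8.3 as described above remains the simpler workaround when the GeneWalk code itself cannot be changed.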

Bioconda integration

GeneWalk fits very well into Bioconda. Do you have plans to add it to Bioconda as well, so that it can also be installed via "conda install"? That would be great, the tool looks very promising!

AttributeError: type object 'object' has no attribute 'dtype'

Hello, thank you for developing this great algorithm!

  1. I installed on my macbook, as recommended, without error.
  2. It went smooth for the first set of genes.
  3. Before the second set, I realized it was complaining about the NumPy version,
  4. so I ran pip install numpy --upgrade.
  5. Then I ran it for the second set of genes, and it failed after the moonlight plot.
  6. I reran, failed
  7. I reran the 1st gene set, failed as well now.
INFO: [2021-02-09 15:29:42] gensim.models.base_any2vec - EPOCH 5 - PROGRESS: at 96.25% examples, 2188362 words/s, in_qsize 8, out_qsize 0
INFO: [2021-02-09 15:29:43] gensim.models.base_any2vec - EPOCH 5 - PROGRESS: at 97.75% examples, 2190384 words/s, in_qsize 8, out_qsize 0
INFO: [2021-02-09 15:29:44] gensim.models.base_any2vec - EPOCH 5 - PROGRESS: at 99.33% examples, 2194058 words/s, in_qsize 8, out_qsize 0
INFO: [2021-02-09 15:29:44] gensim.models.base_any2vec - worker thread finished; awaiting finish of 3 more threads
INFO: [2021-02-09 15:29:44] gensim.models.base_any2vec - worker thread finished; awaiting finish of 2 more threads
INFO: [2021-02-09 15:29:44] gensim.models.base_any2vec - worker thread finished; awaiting finish of 1 more threads
INFO: [2021-02-09 15:29:44] gensim.models.base_any2vec - worker thread finished; awaiting finish of 0 more threads
INFO: [2021-02-09 15:29:44] gensim.models.base_any2vec - EPOCH - 5 : training on 155070000 raw words (155070000 effective words) took 70.6s, 2195957 effective words/s
INFO: [2021-02-09 15:29:44] gensim.models.base_any2vec - training on a 775350000 raw words (775350000 effective words) took 330.0s, 2349202 effective words/s
INFO: [2021-02-09 15:29:44] genewalk.deepwalk - Generating node vectors done in 374.79s
INFO: [2021-02-09 15:29:45] genewalk.cli - Saving into ~/genewalk/SEO_cl6/deepwalk_node_vectors_rand_3.pkl...
INFO: [2021-02-09 15:29:48] genewalk.cli - Saving into ~/genewalk/SEO_cl6/genewalk_rand_simdists.pkl...
INFO: [2021-02-09 15:29:48] genewalk.cli - Loading ~/genewalk/SEO_cl6/multi_graph.pkl...
INFO: [2021-02-09 15:29:48] genewalk.cli - Loading ~/genewalk/SEO_cl6/genes.pkl...
INFO: [2021-02-09 15:29:48] genewalk.cli - Loading ~/genewalk/SEO_cl6/deepwalk_node_vectors_1.pkl...
INFO: [2021-02-09 15:29:48] genewalk.cli - Loading ~/genewalk/SEO_cl6/deepwalk_node_vectors_2.pkl...
INFO: [2021-02-09 15:29:48] genewalk.cli - Loading ~/genewalk/SEO_cl6/deepwalk_node_vectors_3.pkl...
INFO: [2021-02-09 15:29:49] genewalk.cli - Loading ~/genewalk/SEO_cl6/genewalk_rand_simdists.pkl...
INFO: [2021-02-09 15:29:49] genewalk.cli - Saving final results into ~/genewalk/SEO_cl6/genewalk_results.csv
INFO: [2021-02-09 15:29:49] genewalk.cli - Creating figures folder at ~/genewalk/SEO_cl6/figures
INFO: [2021-02-09 15:29:49] genewalk.cli - Creating barplots folder at ~/genewalk/SEO_cl6/figures/barplots
INFO: [2021-02-09 15:29:49] genewalk.plot - Scatter plot data output to genewalk_scatterplots.csv...
INFO: [2021-02-09 15:29:51] genewalk.plot - Regulator genes plotted in regulators_x_gene_con_y_frac_rel_go...
INFO: [2021-02-09 15:29:51] genewalk.plot - Regulator genes listed in genewalk_regulators.csv...
INFO: [2021-02-09 15:29:52] genewalk.plot - Moonlighting genes plotted in moonlighters_x_go_con_y_frac_rel_go...
Traceback (most recent call last):
  File "~/miniconda3/bin/genewalk", line 11, in <module>
  File "~/miniconda3/lib/python3.7/site-packages/genewalk/", line 146, in main
  File "~/miniconda3/lib/python3.7/site-packages/genewalk/", line 235, in run_main
  File "~/miniconda3/lib/python3.7/site-packages/genewalk/", line 53, in generate_plots
    moonlight_html = self.scatterplot_moonlighters()
  File "~/miniconda3/lib/python3.7/site-packages/genewalk/", line 221, in scatterplot_moonlighters
    df = pd.DataFrame(sorted(moonlighters), columns=['gw_moonlighter'])
  File "~/miniconda3/lib/python3.7/site-packages/pandas/core/", line 453, in __init__
    mgr = init_dict({}, index, columns, dtype=dtype)
  File "~/miniconda3/lib/python3.7/site-packages/pandas/core/internals/", line 196, in init_dict
  File "~/miniconda3/lib/python3.7/site-packages/pandas/core/dtypes/", line 1175, in construct_1d_arraylike_from_scalar
    dtype = dtype.dtype
AttributeError: type object 'object' has no attribute 'dtype'

Error after rerunning with the gene set that worked:

INFO: [2021-02-09 21:21:22] genewalk.plot - Moonlighting genes plotted in moonlighters_x_go_con_y_frac_rel_go...
Traceback (most recent call last):
  File "~/miniconda3/bin/genewalk", line 11, in <module>
  File "~/miniconda3/lib/python3.7/site-packages/genewalk/", line 146, in main
  File "~/miniconda3/lib/python3.7/site-packages/genewalk/", line 235, in run_main
  File "~/miniconda3/lib/python3.7/site-packages/genewalk/", line 53, in generate_plots
    moonlight_html = self.scatterplot_moonlighters()
  File "~/miniconda3/lib/python3.7/site-packages/genewalk/", line 221, in scatterplot_moonlighters
    df = pd.DataFrame(sorted(moonlighters), columns=['gw_moonlighter'])
  File "~/miniconda3/lib/python3.7/site-packages/pandas/core/", line 453, in __init__
    mgr = init_dict({}, index, columns, dtype=dtype)
  File "~/miniconda3/lib/python3.7/site-packages/pandas/core/internals/", line 196, in init_dict
  File "~/miniconda3/lib/python3.7/site-packages/pandas/core/dtypes/", line 1175, in construct_1d_arraylike_from_scalar
    dtype = dtype.dtype
AttributeError: type object 'object' has no attribute 'dtype'

I guess I would have to downgrade NumPy. Or do you have a version working with the latest NumPy?


No regulators identified?


I've run 10 gene sets through genewalk from an RNA-seq experiment (various treatment conditions with up- or down-regulated genes).

While 9/10 gene sets have produced expected results, one particular gene set fails to identify any regulators: the scatterplot is empty except for a few dots on the x-axis, and genewalk_regulators.csv is empty. Despite this, the barplot folder is populated with 688 figures, so it's not clear to me whether this is a true reflection of the gene set I've provided or some kind of error. I've re-run this analysis on a few different occasions after re-generating the source gene list file (a wild guess that it might have been corrupted in some way). Nothing has seemed to help.

For reference, the analysis is being conducted on MacOS 11.2.1 with Python v3.8. The code I'm using for the analysis is below:

$ genewalk --project UTD24_DOWN --genes UTD24_DOWN.txt --id_type ensembl_id --nproc 4 --nreps_graph 10 --nreps_null 10

I've also attached the output log file, results, scatter plot and regulators spreadsheet.


Mouse ids are not working with genewalk?


I have tried to run a mouse gene list (from my differential expression data, around 250 genes) using mouse_entrez IDs. In the Regulator and Moonlight gene plots I got "Number of GO annotations per gene" on the X axis, but "the fraction of relevant GO terms" was 0 and "Connections with other genes" was also 0.
I was wondering whether the format of my mouse_entrez IDs is incorrect, or whether there are simply no GO terms associated with these genes (from the human orthologs that GeneWalk uses). Please let me know if the format of my IDs is not correct (I had the list of all my genes in gene symbol format before I converted them to Entrez IDs for genewalk).
The command I ran for genewalk is:

genewalk --project genewalkspermlongRNAseq --genes mouse_entrez_ids_list.txt --id_type mouse_entrez --nproc 8
I have included several files with this issue:

  1. A folder with the plots that I received from genewalk (the plots of Regulator genes and Moonlight genes)
  2. My raw list of mouse_entrez genes (as a zipped file, but it's basically a .csv file)
  3. The list of errors that I receive

INFO: [2021-03-26 14:58:49] genewalk.cli - Creating sperm_downregulated_entrez_mouse_26032021_anara folder at /home/anara/genewalk/sperm_downregulated_entrez_mouse_26032021_anara
INFO: [2021-03-26 14:58:49] genewalk.resources - Using /home/anara/genewalk/resources as resource folder.
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not get HGNC ID for MGI ID 3608415
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not find an MGI mapping for Entrez ID AC102224.2
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not get HGNC ID for MGI ID 2142174
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not find an MGI mapping for Entrez ID AC124977.2
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not find an MGI mapping for Entrez ID AC133523.2
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not find an MGI mapping for Entrez ID AC133902.3
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not get HGNC ID for MGI ID 2141341
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not find an MGI mapping for Entrez ID AC138299.1
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not find an MGI mapping for Entrez ID AC140364.2
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not get HGNC ID for MGI ID 3796981
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not find an MGI mapping for Entrez ID AC158352.1
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not find an MGI mapping for Entrez ID AC161409.5
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not find an MGI mapping for Entrez ID AC164544.5
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not find an MGI mapping for Entrez ID Astx2
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not find an MGI mapping for Entrez ID Atcayos
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not find an MGI mapping for Entrez ID CH25-501L8.4
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not find an MGI mapping for Entrez ID CH36-169F23.5
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not get HGNC ID for MGI ID 107303
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not find an MGI mapping for Entrez ID Dlx1as
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not find an MGI mapping for Entrez ID Gm10217
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not find an MGI mapping for Entrez ID Gm17571
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not find an MGI mapping for Entrez ID Gm22690
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not find an MGI mapping for Entrez ID Gm26381
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not find an MGI mapping for Entrez ID Gm26545
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not get HGNC ID for MGI ID 3646599
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not find an MGI mapping for Entrez ID Kat6b-ps1
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not find an MGI mapping for Entrez ID Kif22-ps
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not find an MGI mapping for Entrez ID Lincmd1
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not find an MGI mapping for Entrez ID LINE/L1?
WARNING: [2021-03-26 14:58:49] genewalk.gene_lists - Could not get HGNC ID for MGI ID 1888480
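
Several warnings in the log above report values such as Astx2 or AC102224.2 as Entrez IDs. Entrez Gene identifiers are plain integers, so these entries look like gene symbols or clone names that ended up in the input list. A minimal pre-flight check along those lines could be sketched as below; `split_entrez_ids` is a hypothetical helper, not part of GeneWalk, and it only tests the format of each entry, not whether the ID actually exists.

```python
def split_entrez_ids(values):
    """Separate purely numeric entries (plausible Entrez IDs) from the rest."""
    numeric, suspicious = [], []
    for value in values:
        cleaned = value.strip()
        if cleaned.isascii() and cleaned.isdigit():
            numeric.append(cleaned)
        else:
            # Symbols, clone names, empty lines: not valid Entrez Gene IDs.
            suspicious.append(cleaned)
    return numeric, suspicious
```

Running this over the gene list before calling genewalk shows which entries need to be converted or dropped.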

GO Terms


Not an issue but more of a question - is it possible to restrict the GO terms utilized in Genewalk to only those of a specific category (e.g. biological process, etc.)? I'm curious if I can exclude GO terms I'm not particularly interested in (cellular component, for example) and derive more meaningful, significant GO term associations for identified regulators. I surveyed the options in genewalk --help but it didn't seem like any of the commands could be used to modify the GO terms.


version argument missing

A --version argument would be good to have, so it becomes easier to quickly check the version and extract it via automated pipelines that integrate genewalk.
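
For reference, Python's argparse has a built-in `version` action that covers this request. The sketch below uses a stand-alone, hypothetical parser (`build_parser` and the way the version string is passed in are assumptions), not GeneWalk's actual command-line code.

```python
import argparse


def build_parser(version: str) -> argparse.ArgumentParser:
    """Build a hypothetical parser that supports --version."""
    parser = argparse.ArgumentParser(prog="genewalk")
    # The built-in "version" action prints the string and exits with status 0.
    parser.add_argument(
        "--version", action="version", version=f"%(prog)s {version}"
    )
    return parser
```

With this in place, an automated pipeline can capture the output of the `--version` call without starting an analysis.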

Rat genome

I want to analyze genes in the rat genome. Is this possible?
Kindly let me know.


Illustration of the GeneWalk network

Thanks a lot for your work. I ran a GeneWalk analysis and would like to visualise the network generated. I think that's saved in multi_graph.pkl? I tried to draw it with networkx & pyplot, but it didn't turn out very pretty. Do you have a script?


Error while downloading resources - PathwayCommons11.All.hgnc.sif.gz

Hi there,

I was trying to get genewalk going on my data. However, when running genewalk like this

genewalk --project test --genes ./input.csv --id_type hgnc_symbol --nproc 4

I'm presented with the following error message(s):

INFO: [2019-10-31 12:37:46] genewalk.cli - Creating project folder at /users/lule/genewalk/test
INFO: [2019-10-31 12:37:46] genewalk.resources - Using /users/lule/genewalk/resources as resource folder.
INFO: [2019-10-31 12:37:46] genewalk.resources - Downloading and extracting into /users/lule/genewalk/resources/PathwayCommons11.All.hgnc.sif
Traceback (most recent call last):
  File "/software/2020/software/python/3.6.6-foss-2018b/lib/python3.6/urllib/", line 1318, in do_open
  File "/software/2020/software/python/3.6.6-foss-2018b/lib/python3.6/http/", line 1239, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/software/2020/software/python/3.6.6-foss-2018b/lib/python3.6/http/", line 1285, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/software/2020/software/python/3.6.6-foss-2018b/lib/python3.6/http/", line 1234, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/software/2020/software/python/3.6.6-foss-2018b/lib/python3.6/http/", line 1026, in _send_output
  File "/software/2020/software/python/3.6.6-foss-2018b/lib/python3.6/http/", line 964, in send
  File "/software/2020/software/python/3.6.6-foss-2018b/lib/python3.6/http/", line 936, in connect
    (,self.port), self.timeout, self.source_address)
  File "/software/2020/software/python/3.6.6-foss-2018b/lib/python3.6/", line 724, in create_connection
    raise err
  File "/software/2020/software/python/3.6.6-foss-2018b/lib/python3.6/", line 713, in create_connection
OSError: [Errno 113] No route to host

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/users/lule/.local/bin/genewalk", line 11, in <module>
  File "/users/lule/.local/lib/python3.6/site-packages/genewalk/", line 145, in main
  File "/users/lule/.local/lib/python3.6/site-packages/genewalk/", line 53, in download_all
  File "/users/lule/.local/lib/python3.6/site-packages/genewalk/", line 37, in get_pc
    download_gz(fname, url_pc)
  File "/users/lule/.local/lib/python3.6/site-packages/genewalk/", line 65, in download_gz
    urllib.request.urlretrieve(url, gz_file)
  File "/software/2020/software/python/3.6.6-foss-2018b/lib/python3.6/urllib/", line 248, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/software/2020/software/python/3.6.6-foss-2018b/lib/python3.6/urllib/", line 223, in urlopen
    return, data, timeout)
  File "/software/2020/software/python/3.6.6-foss-2018b/lib/python3.6/urllib/", line 526, in open
    response = self._open(req, data)
  File "/software/2020/software/python/3.6.6-foss-2018b/lib/python3.6/urllib/", line 544, in _open
    '_open', req)
  File "/software/2020/software/python/3.6.6-foss-2018b/lib/python3.6/urllib/", line 504, in _call_chain
    result = func(*args)
  File "/software/2020/software/python/3.6.6-foss-2018b/lib/python3.6/urllib/", line 1346, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "/software/2020/software/python/3.6.6-foss-2018b/lib/python3.6/urllib/", line 1320, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 113] No route to host>

Is the PathwayCommons11.All.hgnc.sif.gz file no longer available under the URL?
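
The traceback ends in "[Errno 113] No route to host", which is raised while opening the connection, before any HTTP response is received. That may point to a network restriction on the machine (for example a firewall or a compute node without internet access) rather than to a removed file, which would more typically surface as an HTTP error such as 404. A minimal sketch for telling the two cases apart is shown below; check_url is a hypothetical helper, and the URL to test is left as a parameter because the actual download address is not shown in the log.

```python
import urllib.error
import urllib.request


def check_url(url, timeout=10):
    """Return (True, None) if the URL can be opened, else (False, reason)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True, None
    except urllib.error.URLError as err:
        # err.reason holds the underlying cause: an OSError such as
        # "No route to host" for network problems, or a status phrase
        # such as "Not Found" when the server answers with an HTTP error.
        return False, str(err.reason)
```

Running check_url on the same machine that runs genewalk, with the resource URL, shows whether the host can reach the server at all.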


Install issue

Hi all,

I'm having an issue installing GeneWalk: when it gets to the point of installing gensim, the install keeps failing with exit status 1, even when run in a virtual environment. I'm using macOS v11.1 and Python v3.9.

The code I'm using is as follows:

$ python3 -m venv tutorial-env
$ source tutorial-env/bin/activate
$ pip install genewalk

The error I get (alongside the program log) is:

ERROR: Command errored out with exit status 1: /Users/npokoryznski/tutorial-env/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/jc/43vp9sr55j714y8tzqs948580000gp/T/pip-install-3uba9rue/gensim/'"'"'; file='"'"'/private/var/folders/jc/43vp9sr55j714y8tzqs948580000gp/T/pip-install-3uba9rue/gensim/'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);'"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /private/var/folders/jc/43vp9sr55j714y8tzqs948580000gp/T/pip-record-3x8walux/install-record.txt --single-version-externally-managed --compile --install-headers /Users/npokoryznski/tutorial-env/include/site/python3.9/gensim Check the logs for full command output.

I've tried upgrading pip and setuptools to resolve the issue, but neither helped. I've also tried a variety of other commands to work around administrative barriers, but since none resolved the issue I thought I would keep it simple. I'm very much a novice when it comes to Python, bash, etc., so I fully expect to be making a trivial error somewhere, but I can't figure out the problem. Any help is appreciated!

TypeError: Input graph is not a networkx graph type

Are there any additional inputs required for running GeneWalk on a list of human gene IDs? I am running the following command on a file that has about 80 gene names from a DE experiment.

genewalk --project test --genes /results.txt --id_type hgnc_symbol

Which returned this:

INFO: [2019-09-11 12:01:29] genewalk.nx_mg_assembler - Adding gene edges from Pathway Commons to graph.
Traceback (most recent call last):
File "/anaconda3/lib/python3.6/site-packages/networkx/", line 46, in _prep_create_using
TypeError: clear() missing 1 required positional argument: 'self'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/anaconda3/bin/genewalk", line 11, in <module>
File "/anaconda3/lib/python3.6/site-packages/genewalk/", line 151, in main
File "/anaconda3/lib/python3.6/site-packages/genewalk/", line 38, in load_network
mg = PcNxMgAssembler(genes, resource_manager=resource_manager)
File "/anaconda3/lib/python3.6/site-packages/genewalk/", line 196, in __init__
File "/anaconda3/lib/python3.6/site-packages/genewalk/", line 214, in add_pc_edges
File "/anaconda3/lib/python3.6/site-packages/networkx/", line 313, in from_pandas_edgelist
g = _prep_create_using(create_using)
File "/anaconda3/lib/python3.6/site-packages/networkx/", line 48, in _prep_create_using
raise TypeError("Input graph is not a networkx graph type")

Any insight?

rdflib=4.2.2 and python 3 version conflict

I am using Anaconda with Python 3.6. Adding the missing library with

conda install -n py36 rdflib=4.2

ends up with a report of a conflict between Python 3.6 and rdflib=4.2.2,
saying that rdflib=4.2 -> python=3.4


network source file

I ran GeneWalk using the following command:
genewalk --project context1 --genes /home/amit/genewalk/gene_list_DE_ER_UBT.txt --id_type hgnc_id --stage all --base_folder /home/amit/genewalk/chigozie/ --network_source /home/amit/genewalk/chigozie/resources/PathwayCommons12.All.hgnc_current.sif --nproc 6

but it gave me an error:
genewalk: error: argument --network_source: invalid choice: '/home/amit/genewalk/chigozie/resources/PathwayCommons12.All.hgnc_current.sif' (choose from 'pc', 'indra', 'edge_list', 'sif')

I looked at the command-line arguments and found these:
--network_source {pc,indra,edge_list,sif}
The source of the network to be used.Possible values
are: pc, indra, edge_list, and sif. In case of indra,
edge_list, and sif, the network_file argument must be
specified. Default: pc
--network_file NETWORK_FILE
If network_source is indra, this argument points to a
Python pickle file in which a list of INDRA Statements
constituting the network is contained. In case
network_source is edge_list or sif, the network_file
argument points to a text file representing the
Could you explain where these files come from, or whether the user has to supply them?
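
Going by the help text quoted above, --network_source takes one of the keywords pc, indra, edge_list, or sif, while the path to the file goes into the separate --network_file option. The sketch below assembles a command along those lines using the paths from this report. The helper genewalk_command is hypothetical, and whether this particular SIF file is in the layout GeneWalk expects is an assumption that the help text does not settle.

```python
import shlex


def genewalk_command(project, genes, id_type, network_source="pc",
                     network_file=None):
    """Assemble a genewalk argument list (hypothetical helper)."""
    allowed = {"pc", "indra", "edge_list", "sif"}  # choices from the help text
    if network_source not in allowed:
        raise ValueError(f"network_source must be one of {sorted(allowed)}")
    # The help text says network_file must be given for indra, edge_list, sif.
    if network_source != "pc" and network_file is None:
        raise ValueError("network_file is required unless network_source is 'pc'")
    cmd = ["genewalk", "--project", project, "--genes", genes,
           "--id_type", id_type, "--network_source", network_source]
    if network_file is not None:
        cmd += ["--network_file", network_file]
    return cmd


cmd = genewalk_command(
    "context1",
    "/home/amit/genewalk/gene_list_DE_ER_UBT.txt",
    "hgnc_id",
    network_source="sif",
    network_file="/home/amit/genewalk/chigozie/resources/"
                 "PathwayCommons12.All.hgnc_current.sif",
)
# Print a shell-ready command line with each argument safely quoted.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Passing the file path as network_source, as in the failing command, raises ValueError here for the same reason argparse reported "invalid choice".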


INDRA script generation

Hello, I was looking to explore creating a custom INDRA input. I was wondering if you could provide the script used to create the INDRA network from the paper.



The documentation shows that genewalk supports ensembl_id. However, after a simple installation (pip install genewalk) and a test run, it reports an error saying that ensembl_id is not supported.

usage: genewalk [-h] --project PROJECT --genes GENES --id_type
[--stage {all,node_vectors,null_distribution,statistics}]
[--base_folder BASE_FOLDER]
[--network_source {pc,indra,edge_list,sif}]
[--network_file NETWORK_FILE] [--nproc NPROC]
[--nreps_graph NREPS_GRAPH] [--nreps_null NREPS_NULL]
[--alpha_fdr ALPHA_FDR] [--save_dw SAVE_DW]
[--random_seed RANDOM_SEED]

Id_type for mouse genes

I have a text file of mouse genes with their MGI IDs (e.g. MGI:894679). However, when I ran genewalk I received errors like: genewalk.gene_lists - Could not get HGNC ID for MGI ID, although the code kept running. Is this an issue, and if so, should I convert the genes to HGNC IDs instead?
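
To gauge how many input genes are affected, one option is to tally the warnings from a saved log. The sketch below is only an illustration: collect_unmapped is a hypothetical helper, and its two patterns are based on the genewalk.gene_lists warning formats that appear in logs on this page, so other message variants would not be counted.

```python
import re
from collections import defaultdict

# Patterns follow the two gene_lists warning formats seen in the logs.
PATTERNS = {
    "no_hgnc_for_mgi": re.compile(r"Could not get HGNC ID for MGI ID (\S+)"),
    "no_mgi_for_entrez": re.compile(
        r"Could not find an MGI mapping for Entrez ID (.+)$"),
}


def collect_unmapped(lines):
    """Group the IDs named in mapping warnings by warning type."""
    unmapped = defaultdict(list)
    for line in lines:
        text = line.rstrip("\n")
        for kind, pattern in PATTERNS.items():
            match = pattern.search(text)
            if match:
                unmapped[kind].append(match.group(1).strip())
    return dict(unmapped)
```

Calling collect_unmapped on the lines of a log file returns a dictionary whose list lengths can be compared with the size of the input gene list.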

Visualization of the results

Hi, I have a result table now, and I am wondering whether you or anyone else already has an R or Python script to visualize a GeneWalk result table in an automated fashion, similar to what you show in the publication. I can code it for myself, but why reinvent the wheel? :)
