kausmees / GenoCAE
Convolutional autoencoder for genotype data
License: BSD 3-Clause "New" or "Revised" License
Hello!
How would I go about extracting feature importance scores from a trained GenoCAE model? Some methods I'd like to try out:
I think part of my issue is not being sure how to reconstruct the model from the GenoCAE outputs: I see the weights are all stored in the weights subfolder, but I'm not sure how to import them.
As a side note, I'm more familiar with the format where the entire model (architecture, weights) is saved as one .h5 file. Is there a way to save GenoCAE models in this way?
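To make the question concrete, this is roughly what I have in mind (a sketch with a stand-in architecture and made-up paths; load_weights and save are standard tf.keras calls, and I have not checked any of this against GenoCAE's actual model-building code):

import tensorflow as tf

# Stand-in architecture only: the real layers would need to be rebuilt the
# same way run_gcae.py builds them before the checkpointed weights can load.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(75, activation="elu", input_shape=(9259,)),
    tf.keras.layers.Dense(2, name="encoded"),
])

model.load_weights("trained_model_dir/weights/2")  # checkpoint prefix; made-up path
model.save("genocae_model.h5")  # architecture + weights in one .h5 file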
Thanks so much!
Brian
From, for example, 'Best Practices for Maintainers' one can learn that it is a good idea to have guidelines on the rules for contributors, e.g. in a CONTRIBUTING.md document (here is one of my own CONTRIBUTING.md documents as an example). One of the many benefits of this is that it makes it easier to say 'no' to undesired features.
I suggest adding such a CONTRIBUTING.md document and volunteer to create a first sketch of one. Of course, the current maintainers are the boss, so I do not expect the rules I put in to become the actual rules :-)
Good idea?
The error occurred at the end of the building procedure.
I think the version format in the requirements.txt file is the problem.
=> ERROR [6/6] RUN python3 -m pip install -r /workspace/requirements.txt and &&rm /workspace/requirements.txt 1.2s
------
> [6/6] RUN python3 -m pip install -r /workspace/requirements.txt and &&rm /workspace/requirements.txt:
#8 1.023 ERROR: Could not find a version that satisfies the requirement and (from versions: none)
#8 1.024 ERROR: No matching distribution found for and
------
executor failed running [/bin/bash -c python3 -m pip install -r /workspace/requirements.txt and &&rm /workspace/requirements.txt]: exit code: 1
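Judging from the error itself rather than from the version pins, pip seems to parse the stray word 'and' in the RUN command as a package name (hence 'No matching distribution found for and'). If that is right, changing the Dockerfile line to 'RUN python3 -m pip install -r /workspace/requirements.txt && rm /workspace/requirements.txt' should get past this step.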
I'm currently trying to train GenoCAE on a dataset of 15 million SNPs across 67 individuals, but seem to be running into memory issues, despite the fact that I'm using an AMD Threadripper workstation with 252 GB of memory and 64 cores (128 threads).
I suspect this may be due to the large number of SNPs I'm including, since the example data (which runs fine) only contains 9,259 SNPs, and in the original paper 161k were used:
2,067 individuals typed at 160,858 SNPs
From my limited experience with these models, the number of input features drastically affects memory usage (much more so than sample size). So I think my first step will be to filter the number of variants I'm training the model on, based on some of the guidelines provided in the paper:
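As a back-of-envelope check of that intuition, here is a sketch under my own assumptions (a GenoCAE-like decoder whose widest Dense layer has on the order of n_markers * 8 units fed from 75 hidden units, float32 weights, and Adam keeping two extra moment tensors per weight):

# Rough memory estimate for the widest decoder layer alone.
n_markers = 15_000_000
hidden = 75
params = hidden * n_markers * 8   # weight matrix of the Dense layer
bytes_total = params * 4 * 3      # float32, times 3 for Adam's moment slots
print(f"~{bytes_total / 1e9:.0f} GB")  # ~108 GB, before activations etc.

So with 15M markers the weights of a single layer could already be in the hundred-GB range, which would explain the crashes even on 252 GB of RAM.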
Dear GenoCAE maintainer,
Here I suggest allowing a user to run GCAE from any folder, instead of forcing him/her to work from the GenoCAE folder.
When running the 'training' example code from the GenoCAE folder, the training works great:
Here I run the command:
richel@N141CU:~/.local/share/gcaer/gcae_v1_0$ /home/richel/.local/share/r-miniconda/envs/r-reticulate/bin/python \
~/.local/share/gcaer/gcae_v1_0/run_gcae.py train --datadir ~/.local/share/gcaer/gcae_v1_0/example_tiny/ \
--data HumanOrigins249_tiny --model_id M1 --epochs 20 --save_interval 2 --train_opts_id ex3 --data_opts_id b_0_4
Here is part of the result:
2021-06-28 14:50:13.776150: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2021-06-28 14:50:13.776180: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
tensorflow version 2.3.3
______________________________ arguments ______________________________
train : True
datadir : /home/richel/.local/share/gcaer/gcae_v1_0/example_tiny/
data : HumanOrigins249_tiny
model_id : M1
...
However, when I work from another folder, say, one folder up ...
richel@N141CU:~/.local/share/gcaer$ /home/richel/.local/share/r-miniconda/envs/r-reticulate/bin/python \
~/.local/share/gcaer/gcae_v1_0/run_gcae.py train --datadir ~/.local/share/gcaer/gcae_v1_0/example_tiny/ \
--data HumanOrigins249_tiny --model_id M1 --epochs 20 --save_interval 2 --train_opts_id ex3 --data_opts_id b_0_4
I get an error message that "data_opts/" + data_opts_id + ".json" cannot be found, here in the code:
2021-06-28 14:50:53.728916: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2021-06-28 14:50:53.728947: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
tensorflow version 2.3.3
Traceback (most recent call last):
File "/home/richel/.local/share/gcaer/gcae_v1_0/run_gcae.py", line 396, in <module>
with open("data_opts/" + data_opts_id+".json") as data_opts_def_file:
FileNotFoundError: [Errno 2] No such file or directory: 'data_opts/b_0_4.json'
The problem here is the hardcoded "data_opts/" part, which forces me to work in the same folder as GenoCAE. It feels clumsy to work with, as I have to change the working directory when calling GenoCAE. Note that, looking at the code, the same applies to train_opts and models.
I would enjoy a way to either (my favorites are first :-) ):
- a --data_opts_folder CLI argument (see the sketch after this list), so the code becomes data_opts_folder + "data_opts/" + data_opts_id + ".json", or
- a --data_opts myfolder/b_0_4.json argument, so the code becomes data_opts (which is now a filename), or
- reusing --datadir (data_dir + "/" + data_opts_id + ".json").
Would one of these options be doable?
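For the first option, a minimal sketch of what I mean (hypothetical code, not the actual run_gcae.py):

import json
import os

# 'arguments' stands in for the dict that docopt returns in run_gcae.py;
# when --data_opts_folder is not given, fall back to the current behaviour.
arguments = {"--data_opts_folder": None, "--data_opts_id": "b_0_4"}

data_opts_folder = arguments["--data_opts_folder"] or "."
data_opts_id = arguments["--data_opts_id"]
path = os.path.join(data_opts_folder, "data_opts", data_opts_id + ".json")
with open(path) as data_opts_def_file:
    data_opts = json.load(data_opts_def_file)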
This is a note to self, as I cannot assign myself as I am not a Collaborator. Hence I assign myself in text :-)
Hi @kausmees,
GCAE seems awesome to me! What I feel is missing is a reference to the paper at BioRxiv. I suggest to add it as a reference, something like I do below. Sure, I volunteer to do so myself via a Pull Request :-)
Dear GenoCAE maintainer,
Thanks so much for having example files and example code: I find those very useful!
I did find something unexpected: the file extension of HumanOrigins249_tiny.eigenstratgeno. This appears to be a PLINK .bed file, as it follows the same structure as described in the PLINK .bed file format doc. Also, genio (an R package to read PLINK files) cannot read .bed files if they do not have that extension.
I suggest to rename the file to what any PLINK user would expect for a .bed file, which is HumanOrigins249_tiny.bed.
I volunteer to do so.
I'm trying to project sample metadata onto a trained GenoCAE model.
I'm wondering if this is a formatting issue with my --superpops input file (e.g. I have >2 columns, and some rows have NAs for missing metadata), or if GenoCAE is hard-coded to only accept superpopulation-type metadata.
I'm attaching here an example of my metadata.
merged.metadata.csv
python3 run_gcae.py project --datadir /shared/bms20/projects/MND_ALS/SNP_VCFs/merged/ --data merged.SNPS_filtered.plink --trainedmodeldir als_out --model_id M1 --train_opts_id ex3 --data_opts_id b_0_4 --superpops /home/bms20/projects/MND_ALS/SNP_VCFs/merged/merged.metadata.csv
...
...
Projecting epochs: [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
Already projected: []
In DG.get_train_set: number of -1.0 genotypes in train: 5689140
In DG.get_train_set: number of -9 genotypes in train: 0
In DG.get_train_set: number of 0 values in train mask: 0
2022-06-23 12:39:48.063862: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
______________________________ Building model ______________________________
Adding layer: Conv1D: {'filters': 8, 'kernel_size': 5, 'padding': 'same', 'strides': 1}
Adding layer: BatchNormalization: {}
Adding layer: ResidualBlock2: {'filters': 8, 'kernel_size': 5}
--- conv1d filters: 8 kernel_size: 5
--- batch normalization
--- conv1d filters: 8 kernel_size: 5
--- batch normalization
Adding layer: MaxPooling1D: {'pool_size': 5, 'strides': 2, 'padding': 'same'}
Adding layer: Conv1D: {'filters': 8, 'kernel_size': 5, 'padding': 'same', 'activation': 'elu'}
Adding layer: BatchNormalization: {}
Adding layer: Flatten: {}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 75}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 75, 'activation': 'elu'}
Adding layer: Dense: {'units': 2, 'name': 'encoded'}
Adding layer: Dense: {'units': 75, 'activation': 'elu'}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 75, 'activation': 'elu'}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 621496}
Adding layer: Reshape: {'target_shape': (77687, 8), 'name': 'i_msvar'}
Adding layer: Conv1D: {'filters': 8, 'kernel_size': 5, 'padding': 'same', 'activation': 'elu'}
Adding layer: BatchNormalization: {}
Adding layer: Reshape: {'target_shape': (77687, 1, 8)}
Adding layer: UpSampling2D: {'size': (2, 1)}
Adding layer: Reshape: {'target_shape': (155374, 8)}
Adding layer: ResidualBlock2: {'filters': 8, 'kernel_size': 5}
--- conv1d filters: 8 kernel_size: 5
--- batch normalization
--- conv1d filters: 8 kernel_size: 5
--- batch normalization
Adding layer: Conv1D: {'filters': 8, 'kernel_size': 5, 'padding': 'same', 'activation': 'elu', 'name': 'nms'}
Adding layer: BatchNormalization: {}
Adding layer: Conv1D: {'filters': 1, 'kernel_size': 1, 'padding': 'same'}
Adding layer: Flatten: {'name': 'logits'}
########################### epoch 2 ###########################
Reading weights from /shared/bms20/projects/GenoCAE/als_out/ae.M1.ex3.b_0_4.merged.SNPS_filtered.plink/weights/2
Traceback (most recent call last):
File "/shared/bms20/projects/GenoCAE/run_gcae.py", line 1011, in <module>
plot_coords_by_superpop(coords_by_pop,"{0}/dimred_e_{1}_by_superpop".format(results_directory, epoch), superpopulations_file, plot_legend = epoch == epochs[0])
File "/shared/bms20/projects/GenoCAE/utils/visualization.py", line 203, in plot_coords_by_superpop
max_num_pops = max([len(superpop_dict[spop]) for spop in superpops])
ValueError: max() arg is an empty sequence
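In case it helps to narrow this down, here is a quick format check one could run (a sketch; my assumption from the traceback is that plot_coords_by_superpop expects exactly two columns per row, population then superpopulation, with no header and no missing values):

import csv

with open("merged.metadata.csv") as f:
    for i, row in enumerate(csv.reader(f), start=1):
        if len(row) != 2 or any(v in ("", "NA") for v in row):
            print(f"row {i} may not match the expected pop,superpop format: {row}")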
Dear GenoCAE maintainers, hi Carl and Kristiina,
Thanks for GenoCAE and its tests using GitHub Actions, showing off how awesome it is!
However, upstream something has happened that causes the builds of all of my Python-dependent work to fail. Sadly, it happened to GenoCAE as well. As you are superior with Python, I hope you will help me/us :-)
Currently, the last GitHub Actions trigger of the repo passed, which was (as of today) 5 days ago. That seems great! However, today this build fails. I figured this out by simply forking this repo and triggering a rebuild. From the GitHub Actions log one can read:
Run python3 run_gcae.py --help
2022-02-07 14:01:16.923333: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-02-07 14:01:16.923388: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
RuntimeError: module compiled against API version 0xe but this version of numpy is 0xd
Traceback (most recent call last):
File "run_gcae.py", line 32, in <module>
import tensorflow as tf
File "/home/runner/.local/lib/python3.8/site-packages/tensorflow/__init__.py", line 37, in <module>
from tensorflow.python.tools import module_util as _module_util
File "/home/runner/.local/lib/python3.8/site-packages/tensorflow/python/__init__.py", line 37, in <module>
from tensorflow.python.eager import context
File "/home/runner/.local/lib/python3.8/site-packages/tensorflow/python/eager/context.py", line 35, in <module>
from tensorflow.python.client import pywrap_tf_session
File "/home/runner/.local/lib/python3.8/site-packages/tensorflow/python/client/pywrap_tf_session.py", line 19, in <module>
from tensorflow.python.client._pywrap_tf_session import *
ImportError: SystemError: <built-in method __contains__ of dict object at 0x7f9dbba31580> returned a result with an error set
The problem is obviously:
RuntimeError: module compiled against API version 0xe but this version of numpy is 0xd
I have been trying all day to fix this, but I did not dare to meddle with requirements.txt. I will continue trying, yet I hope you will beat me to fixing this :-)
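PS My current (unverified) reading of the error: 0xd is the C API version of numpy 1.19 and older, while 0xe is that of numpy 1.20 and newer, so some freshly released wheel upstream was compiled against a newer numpy than the one requirements.txt installs. Printing the installed version shows which side of the fence the runner is on:

import numpy

# numpy <= 1.19 exposes C API 0xd; numpy >= 1.20 exposes 0xe
# (my understanding, not verified against numpy's cversions table).
print(numpy.__version__)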
Dear GenoCAE maintainers,
Thanks for the example code and examples, I find these very useful!
What is unexpected, however, is that somehow the genetic input files in the folder example_tiny are set to be executables, as can be seen in the screenshot of my terminal (see below, green indicates an executable) and by the File Manager asking me to run a text file when I open it (see below, at the right-hand side).
I guess a chmod +x was messed up somewhere :-)
I suggest to remove the executable flag of these simple text files.
I volunteer to do so.
Dear GenoCAE maintainer, hi Carl and Kristiina,
Thanks for GenoCAE as well as the Docker container script: It's great for running GenoCAE on a computer cluster :-)
This Issue is related to #26, which is probably also caused by an upstream update: the Docker file does not work anymore. The installation instructions at https://github.com/kausmees/GenoCAE#docker-installation are great! Doing the suggested command (note I added sudo), i.e. ...
sudo docker build -t gcae/genocae:build -f docker/build.dockerfile .
... results in a failed build, with a full error log below.
I have been trying the whole day (for example, there are 6 failed attempts here), but could not fix this.
Does the Docker build work for you? Do you have an idea how to fix the Docker file?
A temporary workaround could be to upload an existing Docker container to Docker hub. Do you happen to have one? Would be awesome!
I hope it will be easy for you to help me solve this. I am not very experienced with Docker nor Python, so I can imagine an easy fix being possible (on the other hand, the 6 Stack Overflow 'solutions' hint that the problem is there).
To reproduce, I have created a script to build the Docker container, together with a GitHub Actions script with an error log here.
I hope you can help me out here! Thanks and cheers, Richel
Sending build context to Docker daemon 186.6MB
Step 1/15 : ARG CUDA_VERSION=11.1.1
Step 2/15 : ARG OS_VERSION=20.04
Step 3/15 : FROM nvidia/cuda:${CUDA_VERSION}-cudnn8-devel-ubuntu${OS_VERSION}
---> 1189781af5ec
Step 4/15 : LABEL maintainer="Dong Wang"
---> Using cache
---> 9ae2635141d3
Step 5/15 : ENV PATH="/root/miniconda3/bin:${PATH}"
---> Using cache
---> f907151c27bd
Step 6/15 : ARG PATH="/root/miniconda3/bin:${PATH}"
---> Using cache
---> 76b31f23bd5e
Step 7/15 : SHELL ["/bin/bash", "-c"]
---> Using cache
---> e52cb6a4a70e
Step 8/15 : RUN apt-get update && apt-get upgrade -y && apt-get install -y wget
---> Using cache
---> f779a2c5021a
Step 9/15 : RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh && mkdir /root/.conda && bash Miniconda3-latest-Linux-x86_64.sh -b && rm -f Miniconda3-latest-Linux-x86_64.sh
---> Using cache
---> e86837b5b18e
Step 10/15 : RUN pip3 install --upgrade pip
---> Running in 7a97102a4336
Requirement already satisfied: pip in /root/miniconda3/lib/python3.9/site-packages (21.1.3)
Collecting pip
Downloading pip-22.0.3-py3-none-any.whl (2.1 MB)
Installing collected packages: pip
Attempting uninstall: pip
Found existing installation: pip 21.1.3
Uninstalling pip-21.1.3:
Successfully uninstalled pip-21.1.3
Successfully installed pip-22.0.3
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Removing intermediate container 7a97102a4336
---> b48bafcfa623
Step 11/15 : RUN pip3 install --upgrade setuptools
---> Running in ea564922654f
Requirement already satisfied: setuptools in /root/miniconda3/lib/python3.9/site-packages (52.0.0.post20210125)
Collecting setuptools
Downloading setuptools-60.8.1-py3-none-any.whl (1.1 MB)
━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 13.7 MB/s eta 0:00:00
Installing collected packages: setuptools
Attempting uninstall: setuptools
Found existing installation: setuptools 52.0.0.post20210125
Uninstalling setuptools-52.0.0.post20210125:
Successfully uninstalled setuptools-52.0.0.post20210125
Successfully installed setuptools-60.8.1
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Removing intermediate container ea564922654f
---> 874907ea03a5
Step 12/15 : WORKDIR /workspace
---> Running in ca79d4cb1457
Removing intermediate container ca79d4cb1457
---> a96a63e990ee
Step 13/15 : ADD ./requirements.txt /workspace
---> ad981a4056bd
Step 14/15 : RUN pip3 install -r /workspace/requirements.txt and && rm /workspace/requirements.txt
---> Running in 47c7e4f5b30c
Collecting and
Downloading and-0.1.1-py3-none-any.whl (2.0 kB)
Collecting docopt
Downloading docopt-0.6.2.tar.gz (25 kB)
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'done'
Collecting grpcio
Downloading grpcio-1.43.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.1 MB)
━━━━━━━━━━━━━━━━━━━━ 4.1/4.1 MB 24.1 MB/s eta 0:00:00
Collecting setuptools==47.1.1
Downloading setuptools-47.1.1-py3-none-any.whl (583 kB)
━━━━━━━━━━━━━━━━━━━━ 583.2/583.2 KB 32.3 MB/s eta 0:00:00
Collecting tensorflow>=2.2.0
Downloading tensorflow-2.8.0-cp39-cp39-manylinux2010_x86_64.whl (497.6 MB)
━━━━━━━━━━━━━━━━━━━━ 497.6/497.6 MB 4.1 MB/s eta 0:00:00
Collecting numpy==1.18.4
Downloading numpy-1.18.4.zip (5.4 MB)
━━━━━━━━━━━━━━━━━━━━ 5.4/5.4 MB 26.8 MB/s eta 0:00:00
Installing build dependencies: started
Installing build dependencies: finished with status 'done'
Getting requirements to build wheel: started
Getting requirements to build wheel: finished with status 'done'
Preparing metadata (pyproject.toml): started
Preparing metadata (pyproject.toml): finished with status 'done'
Collecting scikit-learn
Downloading scikit_learn-1.0.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (26.4 MB)
━━━━━━━━━━━━━━━━━━━━ 26.4/26.4 MB 25.0 MB/s eta 0:00:00
Collecting matplotlib==3.2.1
Downloading matplotlib-3.2.1.tar.gz (40.3 MB)
━━━━━━━━━━━━━━━━━━━━ 40.3/40.3 MB 20.9 MB/s eta 0:00:00
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'done'
Collecting seaborn
Downloading seaborn-0.11.2-py3-none-any.whl (292 kB)
━━━━━━━━━━━━━━━━━━━━ 292.8/292.8 KB 26.8 MB/s eta 0:00:00
Collecting scipy==1.4.1
Downloading scipy-1.4.1.tar.gz (24.6 MB)
━━━━━━━━━━━━━━━━━━━━ 24.6/24.6 MB 24.9 MB/s eta 0:00:00
Installing build dependencies: started
Installing build dependencies: still running...
Installing build dependencies: finished with status 'done'
Getting requirements to build wheel: started
Getting requirements to build wheel: finished with status 'done'
Preparing metadata (pyproject.toml): started
Preparing metadata (pyproject.toml): finished with status 'error'
error: subprocess-exited-with-error
× Preparing metadata (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [171 lines of output]
setup.py:418: UserWarning: Unrecognized setuptools command ('dist_info --egg-base /tmp/pip-modern-metadata-jxp98lbc'), proceeding with generating Cython sources and expanding templates
warnings.warn("Unrecognized setuptools command ('{}'), proceeding with "
Running from scipy source directory.
lapack_opt_info:
lapack_mkl_info:
customize UnixCCompiler
libraries mkl_rt not found in ['/root/miniconda3/lib', '/usr/local/lib', '/usr/lib64', '/usr/lib', '/usr/lib/x86_64-linux-gnu']
NOT AVAILABLE
openblas_lapack_info:
customize UnixCCompiler
customize UnixCCompiler
libraries openblas not found in ['/root/miniconda3/lib', '/usr/local/lib', '/usr/lib64', '/usr/lib', '/usr/lib/x86_64-linux-gnu']
NOT AVAILABLE
openblas_clapack_info:
customize UnixCCompiler
customize UnixCCompiler
libraries openblas,lapack not found in ['/root/miniconda3/lib', '/usr/local/lib', '/usr/lib64', '/usr/lib', '/usr/lib/x86_64-linux-gnu']
NOT AVAILABLE
flame_info:
customize UnixCCompiler
libraries flame not found in ['/root/miniconda3/lib', '/usr/local/lib', '/usr/lib64', '/usr/lib', '/usr/lib/x86_64-linux-gnu']
NOT AVAILABLE
atlas_3_10_threads_info:
Setting PTATLAS=ATLAS
customize UnixCCompiler
libraries lapack_atlas not found in /root/miniconda3/lib
customize UnixCCompiler
libraries tatlas,tatlas not found in /root/miniconda3/lib
customize UnixCCompiler
libraries lapack_atlas not found in /usr/local/lib
customize UnixCCompiler
libraries tatlas,tatlas not found in /usr/local/lib
customize UnixCCompiler
libraries lapack_atlas not found in /usr/lib64
customize UnixCCompiler
libraries tatlas,tatlas not found in /usr/lib64
customize UnixCCompiler
libraries lapack_atlas not found in /usr/lib
customize UnixCCompiler
libraries tatlas,tatlas not found in /usr/lib
customize UnixCCompiler
libraries lapack_atlas not found in /usr/lib/x86_64-linux-gnu
customize UnixCCompiler
libraries tatlas,tatlas not found in /usr/lib/x86_64-linux-gnu
<class 'numpy.distutils.system_info.atlas_3_10_threads_info'>
NOT AVAILABLE
atlas_3_10_info:
customize UnixCCompiler
libraries lapack_atlas not found in /root/miniconda3/lib
customize UnixCCompiler
libraries satlas,satlas not found in /root/miniconda3/lib
customize UnixCCompiler
libraries lapack_atlas not found in /usr/local/lib
customize UnixCCompiler
libraries satlas,satlas not found in /usr/local/lib
customize UnixCCompiler
libraries lapack_atlas not found in /usr/lib64
customize UnixCCompiler
libraries satlas,satlas not found in /usr/lib64
customize UnixCCompiler
libraries lapack_atlas not found in /usr/lib
customize UnixCCompiler
libraries satlas,satlas not found in /usr/lib
customize UnixCCompiler
libraries lapack_atlas not found in /usr/lib/x86_64-linux-gnu
customize UnixCCompiler
libraries satlas,satlas not found in /usr/lib/x86_64-linux-gnu
<class 'numpy.distutils.system_info.atlas_3_10_info'>
NOT AVAILABLE
atlas_threads_info:
Setting PTATLAS=ATLAS
customize UnixCCompiler
libraries lapack_atlas not found in /root/miniconda3/lib
customize UnixCCompiler
libraries ptf77blas,ptcblas,atlas not found in /root/miniconda3/lib
customize UnixCCompiler
libraries lapack_atlas not found in /usr/local/lib
customize UnixCCompiler
libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib
customize UnixCCompiler
libraries lapack_atlas not found in /usr/lib64
customize UnixCCompiler
libraries ptf77blas,ptcblas,atlas not found in /usr/lib64
customize UnixCCompiler
libraries lapack_atlas not found in /usr/lib
customize UnixCCompiler
libraries ptf77blas,ptcblas,atlas not found in /usr/lib
customize UnixCCompiler
libraries lapack_atlas not found in /usr/lib/x86_64-linux-gnu
customize UnixCCompiler
libraries ptf77blas,ptcblas,atlas not found in /usr/lib/x86_64-linux-gnu
<class 'numpy.distutils.system_info.atlas_threads_info'>
NOT AVAILABLE
atlas_info:
customize UnixCCompiler
libraries lapack_atlas not found in /root/miniconda3/lib
customize UnixCCompiler
libraries f77blas,cblas,atlas not found in /root/miniconda3/lib
customize UnixCCompiler
libraries lapack_atlas not found in /usr/local/lib
customize UnixCCompiler
libraries f77blas,cblas,atlas not found in /usr/local/lib
customize UnixCCompiler
libraries lapack_atlas not found in /usr/lib64
customize UnixCCompiler
libraries f77blas,cblas,atlas not found in /usr/lib64
customize UnixCCompiler
libraries lapack_atlas not found in /usr/lib
customize UnixCCompiler
libraries f77blas,cblas,atlas not found in /usr/lib
customize UnixCCompiler
libraries lapack_atlas not found in /usr/lib/x86_64-linux-gnu
customize UnixCCompiler
libraries f77blas,cblas,atlas not found in /usr/lib/x86_64-linux-gnu
<class 'numpy.distutils.system_info.atlas_info'>
NOT AVAILABLE
accelerate_info:
NOT AVAILABLE
lapack_info:
customize UnixCCompiler
libraries lapack not found in ['/root/miniconda3/lib', '/usr/local/lib', '/usr/lib64', '/usr/lib', '/usr/lib/x86_64-linux-gnu']
NOT AVAILABLE
/tmp/pip-build-env-jf9lnjy9/overlay/lib/python3.9/site-packages/numpy/distutils/system_info.py:1712: UserWarning:
Lapack (http://www.netlib.org/lapack/) libraries not found.
Directories to search for the libraries can be specified in the
numpy/distutils/site.cfg file (section [lapack]) or by setting
the LAPACK environment variable.
if getattr(self, '_calc_info_{}'.format(lapack))():
lapack_src_info:
NOT AVAILABLE
/tmp/pip-build-env-jf9lnjy9/overlay/lib/python3.9/site-packages/numpy/distutils/system_info.py:1712: UserWarning:
Lapack (http://www.netlib.org/lapack/) sources not found.
Directories to search for the sources can be specified in the
numpy/distutils/site.cfg file (section [lapack_src]) or by setting
the LAPACK_SRC environment variable.
if getattr(self, '_calc_info_{}'.format(lapack))():
NOT AVAILABLE
Traceback (most recent call last):
File "/root/miniconda3/lib/python3.9/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 363, in <module>
main()
File "/root/miniconda3/lib/python3.9/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 345, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
File "/root/miniconda3/lib/python3.9/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 164, in prepare_metadata_for_build_wheel
return hook(metadata_directory, config_settings)
File "/tmp/pip-build-env-jf9lnjy9/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 173, in prepare_metadata_for_build_wheel
self.run_setup()
File "/tmp/pip-build-env-jf9lnjy9/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 266, in run_setup
super(_BuildMetaLegacyBackend,
File "/tmp/pip-build-env-jf9lnjy9/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 157, in run_setup
exec(compile(code, __file__, 'exec'), locals())
File "setup.py", line 540, in <module>
setup_package()
File "setup.py", line 536, in setup_package
setup(**metadata)
File "/tmp/pip-build-env-jf9lnjy9/overlay/lib/python3.9/site-packages/numpy/distutils/core.py", line 137, in setup
config = configuration()
File "setup.py", line 435, in configuration
raise NotFoundError(msg)
numpy.distutils.system_info.NotFoundError: No lapack/blas resources found.
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
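PS Reading the tail of the log: scipy==1.4.1 predates the Python 3.9 that Miniconda now ships, so there is no prebuilt wheel and pip falls back to building it from source, which then fails on the missing LAPACK/BLAS ('No lapack/blas resources found'). If that is right (I am not sure!), installing the build prerequisites before the pip step, e.g. 'RUN apt-get install -y gfortran libopenblas-dev liblapack-dev', or using an image with Python 3.8, might get the build further. The stray 'and' in Step 14 would also need to go.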
Dear GenoCAE maintainer,
Thanks for the conversion of the example files to PLINK format! I checked the .bim and .fam file and they match the PLINK doc (this time, I checked more carefully :-) ).
Sadly, this conversion resulted in files that cannot be read by PLINK (note I ran into the same problems as well :-), and I also found out that convertf is available as a .deb package on Ubuntu). I can let PLINK2 do something, but this does not result in PLINK-readable files either. Below some notes, mostly reminders to self.
Would you try again?
Cheers, Richel
./plink --bfile ~/GitHubs/GenoCAE/example_tiny/HumanOrigins249_tiny --assoc --out ~/test --noweb
@----------------------------------------------------------@
| PLINK! | v1.07 | 10/Aug/2009 |
|----------------------------------------------------------|
| (C) 2009 Shaun Purcell, GNU General Public License, v2 |
|----------------------------------------------------------|
| For documentation, citation & bug-report instructions: |
| http://pngu.mgh.harvard.edu/purcell/plink/ |
@----------------------------------------------------------@
Skipping web check... [ --noweb ]
Writing this text to log file [ /home/richel/test.log ]
Analysis started: Wed Jun 30 07:47:48 2021
Options in effect:
--bfile /home/richel/GitHubs/GenoCAE/example_tiny/HumanOrigins249_tiny
--assoc
--out /home/richel/test
--noweb
Reading map (extended format) from [ /home/richel/GitHubs/GenoCAE/example_tiny/HumanOrigins249_tiny.bim ]
ERROR: Problem reading BIM file, line 1
./plink --bfile ~/GitHubs/GenoCAE/example_tiny/HumanOrigins249_tiny --assoc --out ~/test
PLINK v1.90b6.22 64-bit (16 Apr 2021) www.cog-genomics.org/plink/1.9/
(C) 2005-2021 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to /home/richel/test.log.
Options in effect:
--assoc
--bfile /home/richel/GitHubs/GenoCAE/example_tiny/HumanOrigins249_tiny
--out /home/richel/test
7652 MB RAM detected; reserving 3826 MB for main workspace.
Error: Invalid chromosome code 'rs6515824' on line 1 of .bim file.
(Use --allow-extra-chr to force it to be accepted.)
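A quick way to see what PLINK is complaining about, in Python (a sketch, assuming the standard six-column .bim layout; if the 'chr' column shows rs IDs, the columns are shifted):

import pandas as pd

# Standard .bim column order: chromosome, variant ID, cM position,
# bp position, allele 1, allele 2.
bim = pd.read_csv("example_tiny/HumanOrigins249_tiny.bim", sep=r"\s+",
                  header=None, names=["chr", "id", "cm", "pos", "a1", "a2"])
print(bim.head())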
./plink2 --bfile ~/GitHubs/GenoCAE/example_tiny/HumanOrigins249_tiny --glm --out ~/test
PLINK v2.00a2.3LM 64-bit Intel (24 Jan 2020) www.cog-genomics.org/plink/2.0/
(C) 2005-2020 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to /home/richel/test.log.
Options in effect:
--bfile /home/richel/GitHubs/GenoCAE/example_tiny/HumanOrigins249_tiny
--glm
--out /home/richel/test
Start time: Wed Jun 30 07:49:08 2021
7652 MiB RAM detected; reserving 3826 MiB for main workspace.
Using up to 8 compute threads.
249 samples (0 females, 0 males, 249 ambiguous; 249 founders) loaded from
/home/richel/GitHubs/GenoCAE/example_tiny/HumanOrigins249_tiny.fam.
Error: Invalid chromosome code 'rs6515824' on line 1 of .pvar file.
(Use --allow-extra-chr to force it to be accepted.)
End time: Wed Jun 30 07:49:08 2021
Following the suggestion results in:
./plink2 --bfile ~/GitHubs/GenoCAE/example_tiny/HumanOrigins249_tiny --glm --allow-extra-chr --out ~/test
PLINK v2.00a2.3LM 64-bit Intel (24 Jan 2020) www.cog-genomics.org/plink/2.0/
(C) 2005-2020 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to /home/richel/test.log.
Options in effect:
--allow-extra-chr
--bfile /home/richel/GitHubs/GenoCAE/example_tiny/HumanOrigins249_tiny
--glm
--out /home/richel/test
Start time: Wed Jun 30 07:49:35 2021
7652 MiB RAM detected; reserving 3826 MiB for main workspace.
Using up to 8 compute threads.
249 samples (0 females, 0 males, 249 ambiguous; 249 founders) loaded from
/home/richel/GitHubs/GenoCAE/example_tiny/HumanOrigins249_tiny.fam.
9259 variants loaded from
/home/richel/GitHubs/GenoCAE/example_tiny/HumanOrigins249_tiny.bim.
1 binary phenotype loaded (0 cases, 249 controls).
Calculating allele frequencies... done.
--glm: Skipping case/control phenotype 'PHENO1' since all samples are controls.
End time: Wed Jun 30 07:49:35 2021
Aha, so the .bim file can be read! Let's re-create it:
./plink2 --bfile ~/GitHubs/GenoCAE/example_tiny/HumanOrigins249_tiny --allow-extra-chr --make-bpgen --out ~/HumanOrigins249_tiny
Something is successfully created:
PLINK v2.00a2.3LM 64-bit Intel (24 Jan 2020) www.cog-genomics.org/plink/2.0/
(C) 2005-2020 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to /home/richel/HumanOrigins249_tiny.log.
Options in effect:
--allow-extra-chr
--bfile /home/richel/GitHubs/GenoCAE/example_tiny/HumanOrigins249_tiny
--make-bpgen
--out /home/richel/HumanOrigins249_tiny
Start time: Wed Jun 30 07:53:16 2021
7652 MiB RAM detected; reserving 3826 MiB for main workspace.
Using up to 8 compute threads.
249 samples (0 females, 0 males, 249 ambiguous; 249 founders) loaded from
/home/richel/GitHubs/GenoCAE/example_tiny/HumanOrigins249_tiny.fam.
9259 variants loaded from
/home/richel/GitHubs/GenoCAE/example_tiny/HumanOrigins249_tiny.bim.
1 binary phenotype loaded (0 cases, 249 controls).
Writing /home/richel/HumanOrigins249_tiny.fam ... done.
Writing /home/richel/HumanOrigins249_tiny.bim ... done.
Writing /home/richel/HumanOrigins249_tiny.pgen ... done.
End time: Wed Jun 30 07:53:16 2021
Sadly, in R, the files cannot be read.
Here is genio's response:
genio::read_bed(
bed_filename,
names_loci = bim_table$id,
names_ind = fam_table$id
)
Reading: /home/richel/.local/share/gcaer/gcae_v1_0/example_tiny//HumanOrigins249_tiny.bed
Error in read_bed_cpp(file, m_loci, n_ind) :
Row 1 padding was non-zero. Either the specified number of individuals is incorrect or the input file is corrupt!
Here is ARTP2's response:
ARTP2::read.bed(bed = bed_filename, bim = bim_filename, fam = fam_filename)
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
scan() expected 'an integer', got 'rs6515824'
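For cross-checking outside R, this is what I would try in Python (a sketch; pandas-plink is a third-party package, not part of GenoCAE):

from pandas_plink import read_plink1_bin

# A malformed .bed/.bim/.fam trio fails here much like genio and ARTP2 do in R.
G = read_plink1_bin("HumanOrigins249_tiny.bed",
                    "HumanOrigins249_tiny.bim",
                    "HumanOrigins249_tiny.fam", verbose=False)
print(G.shape)  # (samples, variants) when the files are consistent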
Dear GenoCAE maintainers,
I enjoy GenoCAE quite a bit and especially the examples are great!
What would make me like GenoCAE even better is to have clearer error messages from the CLI. I think redirecting the user to the help is great, but a clearer error message to guide the user to the next step would be even better.
Some examples:
This is not something a user will blame you for, it is more of an opening to the next example.
python run_gcae.py train
I get:
2021-07-02 11:35:47.399470: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
tensorflow version 2.3.3
Invalid command. Run 'python run_gcae.py --help' for more information.
I expected something like:
`datadir` is missing. Please specify the data folder using `--datadir [data dir]`, e.g. `--datadir example_tiny/`
This is what I had myself:
python run_gcae.py train --datadir example_tiny/ --data HumanOrigins249_tiny --model_id M1 --train_opts_id ex3 --data_opts_id b_0_4
I got:
2021-07-02 11:35:25.100815: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
tensorflow version 2.3.3
Invalid command. Run 'python run_gcae.py --help' for more information.
I expected something like:
`epochs` is missing. Please specify the number of epochs using `--epochs [number]`, e.g. `--epochs 20`
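To sketch the kind of check I mean (hypothetical code, not the actual run_gcae.py; the option names are taken from the help text):

import sys

# Hypothetical pre-check before handing sys.argv to docopt: report the first
# missing required option for the 'train' command with a concrete example.
REQUIRED_FOR_TRAIN = {
    "--datadir": "--datadir example_tiny/",
    "--epochs": "--epochs 20",
    "--save_interval": "--save_interval 2",
}

if len(sys.argv) > 1 and sys.argv[1] == "train":
    given = {arg.split("=")[0] for arg in sys.argv[2:]}
    for opt, example in REQUIRED_FOR_TRAIN.items():
        if opt not in given:
            sys.exit(f"`{opt.lstrip('-')}` is missing. Please specify it, e.g. `{example}`")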
GCAE uses the exponential linear unit ('ELU') as an activation function. In [1] it is claimed that 'the Swish activation function would be better in all cases [over ELU]'.
I am unsure if you think it would be worthwhile to try out Swish? The improvements in accuracy shown in [1] are only minor.
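For what it is worth, the change itself would be tiny (a sketch, assuming the activations are set via Keras activation strings; tf.keras accepts 'swish' since TF 2.2, so also in the 2.3.3 used here):

import tensorflow as tf

# A one-word swap in the layer definition:
layer_elu = tf.keras.layers.Dense(75, activation="elu")
layer_swish = tf.keras.layers.Dense(75, activation="swish")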
Dear GenoCAE maintainers,
I quote from the GitHub docs:
Releases are deployable software iterations you can package and make available for a wider audience to download and use.
I would love to have a named release version, e.g. v1.0 (as that is the version shown in the help), that I can use (over the commit hash) for the gcaer R package I am writing to install this cool tool.
It's easy to make one:
Sure, I volunteer to do this, but for that I need more access rights than you may want to give (which I'd understand :-) ).
When I run GCAE from the command line with a nonsense argument, e.g. nonsense:
python run_gcae.py nonsense
I get to see the help file:
2021-06-24 06:34:30.118604: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2021-06-24 06:34:30.118629: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
tensorflow version 2.3.3
Usage:
run_gcae.py train --datadir=<name> --data=<name> --model_id=<name> --train_opts_id=<name> --data_opts_id=<name> --save_interval=<num> --epochs=<num> [--resume_from=<num> --trainedmodeldir=<name> ]
run_gcae.py project --datadir=<name> [ --data=<name> --model_id=<name> --train_opts_id=<name> --data_opts_id=<name> --superpops=<name> --epoch=<num> --trainedmodeldir=<name> --pdata=<name> --trainedmodelname=<name>]
run_gcae.py plot --datadir=<name> [ --data=<name> --model_id=<name> --train_opts_id=<name> --data_opts_id=<name> --superpops=<name> --epoch=<num> --trainedmodeldir=<name> --pdata=<name> --trainedmodelname=<name>]
run_gcae.py animate --datadir=<name> [ --data=<name> --model_id=<name> --train_opts_id=<name> --data_opts_id=<name> --superpops=<name> --epoch=<num> --trainedmodeldir=<name> --pdata=<name> --trainedmodelname=<name>]
run_gcae.py evaluate --datadir=<name> --metrics=<name> [ --data=<name> --model_id=<name> --train_opts_id=<name> --data_opts_id=<name> --superpops=<name> --epoch=<num> --trainedmodeldir=<name> --pdata=<name> --trainedmodelname=<name>]
I think it is already friendly to show the help, yet I would not expect the output to be exactly the same as when doing python run_gcae.py --help. I suggest to add an error message (and a nonzero error code) if an invalid CLI argument is given.
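A minimal sketch of what I mean (hypothetical code, not the actual run_gcae.py):

import sys

# Exit with a short message and a nonzero code on an unknown command,
# instead of printing the full help.
KNOWN_COMMANDS = {"train", "project", "plot", "animate", "evaluate"}

if len(sys.argv) < 2 or sys.argv[1] not in KNOWN_COMMANDS:
    bad = sys.argv[1] if len(sys.argv) > 1 else "(none)"
    print(f"Invalid command '{bad}'. Run 'python run_gcae.py --help' for more information.",
          file=sys.stderr)
    sys.exit(2)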
Hi @kausmees, I seem to be having some issues setting up the Docker container, which I think stems from installing the python requirements.
git clone https://github.com/kausmees/GenoCAE.git
cd GenoCAE
docker build -t gcae/genocae:build -f docker/build.dockerfile .
Here is the full output, but the main error comes at the very end.
Sending build context to Docker daemon 5.337MB
Step 1/14 : ARG CUDA_VERSION=11.1.1
Step 2/14 : ARG OS_VERSION=20.04
Step 3/14 : FROM nvidia/cuda:${CUDA_VERSION}-cudnn8-devel-ubuntu${OS_VERSION}
---> 75f53d2b5da8
Step 4/14 : LABEL maintainer="Dong Wang"
---> Using cache
---> 05c68a023e26
Step 5/14 : ENV PATH="/root/miniconda3/bin:${PATH}"
---> Using cache
---> 84aafea13cc7
Step 6/14 : ARG PATH="/root/miniconda3/bin:${PATH}"
---> Using cache
---> d5d84110b3c8
Step 7/14 : ARG DEBIAN_FRONTEND=noninteractive
---> Running in 79a5ae18bf31
Removing intermediate container 79a5ae18bf31
---> a9ec0c06e8ee
Step 8/14 : SHELL ["/bin/bash", "-c"]
---> Running in 07a8435b31cd
Removing intermediate container 07a8435b31cd
---> bb8387371d5a
Step 9/14 : RUN apt-get update && apt-get upgrade -y &&apt-get install -y wget python3-pip
---> Running in 07b6d6f53352
Get:1 http://archive.ubuntu.com/ubuntu focal InRelease [265 kB]
Get:2 http://archive.ubuntu.com/ubuntu focal-updates InRelease [114 kB]
Get:3 http://archive.ubuntu.com/ubuntu focal-backports InRelease [108 kB]
Get:4 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 InRelease [1581 B]
Get:5 http://security.ubuntu.com/ubuntu focal-security InRelease [114 kB]
Get:6 http://archive.ubuntu.com/ubuntu focal/restricted amd64 Packages [33.4 kB]
Get:7 http://archive.ubuntu.com/ubuntu focal/main amd64 Packages [1275 kB]
Get:8 http://archive.ubuntu.com/ubuntu focal/universe amd64 Packages [11.3 MB]
Get:9 http://archive.ubuntu.com/ubuntu focal/multiverse amd64 Packages [177 kB]
Get:10 http://archive.ubuntu.com/ubuntu focal-updates/multiverse amd64 Packages [30.3 kB]
Get:11 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 Packages [1161 kB]
Get:12 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages [2415 kB]
Get:13 http://archive.ubuntu.com/ubuntu focal-updates/restricted amd64 Packages [1404 kB]
Get:14 http://archive.ubuntu.com/ubuntu focal-backports/universe amd64 Packages [27.1 kB]
Get:15 http://archive.ubuntu.com/ubuntu focal-backports/main amd64 Packages [54.2 kB]
Get:16 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 Packages [579 kB]
Get:17 http://security.ubuntu.com/ubuntu focal-security/multiverse amd64 Packages [27.5 kB]
Get:18 http://security.ubuntu.com/ubuntu focal-security/restricted amd64 Packages [1324 kB]
Get:19 http://security.ubuntu.com/ubuntu focal-security/universe amd64 Packages [881 kB]
Get:20 http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages [1974 kB]
Fetched 23.3 MB in 2s (13.5 MB/s)
Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
Calculating upgrade...
The following packages have been kept back:
libcudnn8 libcudnn8-dev libnccl-dev libnccl2
The following packages will be upgraded:
apt ca-certificates dpkg dpkg-dev e2fsprogs libapt-pkg6.0 libc-bin
libcom-err2 libdpkg-perl libext2fs2 libpcre3 libsepol1 libss2 libssl1.1
libsystemd0 libudev1 linux-libc-dev login logsave openssl passwd
21 upgraded, 0 newly installed, 0 to remove and 4 not upgraded.
Need to get 10.6 MB of archives.
After this operation, 22.5 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 dpkg amd64 1.19.7ubuntu3.2 [1128 kB]
Get:2 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 login amd64 1:4.8.1-1ubuntu5.20.04.2 [220 kB]
Get:3 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 libc-bin amd64 2.31-0ubuntu9.9 [633 kB]
Get:4 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 libsystemd0 amd64 245.4-4ubuntu3.17 [269 kB]
Get:5 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 libudev1 amd64 245.4-4ubuntu3.17 [76.5 kB]
Get:6 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 libapt-pkg6.0 amd64 2.0.9 [839 kB]
Get:7 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 apt amd64 2.0.9 [1294 kB]
Get:8 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 logsave amd64 1.45.5-2ubuntu1.1 [10.2 kB]
Get:9 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 libext2fs2 amd64 1.45.5-2ubuntu1.1 [183 kB]
Get:10 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 e2fsprogs amd64 1.45.5-2ubuntu1.1 [527 kB]
Get:11 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 libpcre3 amd64 2:8.39-12ubuntu0.1 [232 kB]
Get:12 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 libsepol1 amd64 3.0-1ubuntu0.1 [252 kB]
Get:13 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 passwd amd64 1:4.8.1-1ubuntu5.20.04.2 [797 kB]
Get:14 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 libcom-err2 amd64 1.45.5-2ubuntu1.1 [9548 B]
Get:15 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 libss2 amd64 1.45.5-2ubuntu1.1 [11.3 kB]
Get:16 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 libssl1.1 amd64 1.1.1f-1ubuntu2.15 [1321 kB]
Get:17 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 openssl amd64 1.1.1f-1ubuntu2.15 [623 kB]
Get:18 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 ca-certificates all 20211016~20.04.1 [144 kB]
Get:19 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 dpkg-dev all 1.19.7ubuntu3.2 [679 kB]
Get:20 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 libdpkg-perl all 1.19.7ubuntu3.2 [231 kB]
Get:21 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 linux-libc-dev amd64 5.4.0-120.136 [1113 kB]
debconf: delaying package configuration, since apt-utils is not installed
Fetched 10.6 MB in 0s (55.0 MB/s)
(Reading database ... 12626 files and directories currently installed.)
Preparing to unpack .../dpkg_1.19.7ubuntu3.2_amd64.deb ...
Unpacking dpkg (1.19.7ubuntu3.2) over (1.19.7ubuntu3) ...
Setting up dpkg (1.19.7ubuntu3.2) ...
(Reading database ... 12626 files and directories currently installed.)
Preparing to unpack .../login_1%3a4.8.1-1ubuntu5.20.04.2_amd64.deb ...
Unpacking login (1:4.8.1-1ubuntu5.20.04.2) over (1:4.8.1-1ubuntu5.20.04.1) ...
Setting up login (1:4.8.1-1ubuntu5.20.04.2) ...
(Reading database ... 12626 files and directories currently installed.)
Preparing to unpack .../libc-bin_2.31-0ubuntu9.9_amd64.deb ...
Unpacking libc-bin (2.31-0ubuntu9.9) over (2.31-0ubuntu9.7) ...
Setting up libc-bin (2.31-0ubuntu9.9) ...
(Reading database ... 12626 files and directories currently installed.)
Preparing to unpack .../libsystemd0_245.4-4ubuntu3.17_amd64.deb ...
Unpacking libsystemd0:amd64 (245.4-4ubuntu3.17) over (245.4-4ubuntu3.16) ...
Setting up libsystemd0:amd64 (245.4-4ubuntu3.17) ...
(Reading database ... 12626 files and directories currently installed.)
Preparing to unpack .../libudev1_245.4-4ubuntu3.17_amd64.deb ...
Unpacking libudev1:amd64 (245.4-4ubuntu3.17) over (245.4-4ubuntu3.16) ...
Setting up libudev1:amd64 (245.4-4ubuntu3.17) ...
(Reading database ... 12626 files and directories currently installed.)
Preparing to unpack .../libapt-pkg6.0_2.0.9_amd64.deb ...
Unpacking libapt-pkg6.0:amd64 (2.0.9) over (2.0.6) ...
Setting up libapt-pkg6.0:amd64 (2.0.9) ...
(Reading database ... 12626 files and directories currently installed.)
Preparing to unpack .../archives/apt_2.0.9_amd64.deb ...
Unpacking apt (2.0.9) over (2.0.6) ...
Setting up apt (2.0.9) ...
Removing obsolete conffile /etc/kernel/postinst.d/apt-auto-removal ...
(Reading database ... 12625 files and directories currently installed.)
Preparing to unpack .../logsave_1.45.5-2ubuntu1.1_amd64.deb ...
Unpacking logsave (1.45.5-2ubuntu1.1) over (1.45.5-2ubuntu1) ...
Preparing to unpack .../libext2fs2_1.45.5-2ubuntu1.1_amd64.deb ...
Unpacking libext2fs2:amd64 (1.45.5-2ubuntu1.1) over (1.45.5-2ubuntu1) ...
Setting up libext2fs2:amd64 (1.45.5-2ubuntu1.1) ...
(Reading database ... 12625 files and directories currently installed.)
Preparing to unpack .../e2fsprogs_1.45.5-2ubuntu1.1_amd64.deb ...
Unpacking e2fsprogs (1.45.5-2ubuntu1.1) over (1.45.5-2ubuntu1) ...
Preparing to unpack .../libpcre3_2%3a8.39-12ubuntu0.1_amd64.deb ...
Unpacking libpcre3:amd64 (2:8.39-12ubuntu0.1) over (2:8.39-12build1) ...
Setting up libpcre3:amd64 (2:8.39-12ubuntu0.1) ...
(Reading database ... 12625 files and directories currently installed.)
Preparing to unpack .../libsepol1_3.0-1ubuntu0.1_amd64.deb ...
Unpacking libsepol1:amd64 (3.0-1ubuntu0.1) over (3.0-1) ...
Setting up libsepol1:amd64 (3.0-1ubuntu0.1) ...
(Reading database ... 12625 files and directories currently installed.)
Preparing to unpack .../passwd_1%3a4.8.1-1ubuntu5.20.04.2_amd64.deb ...
Unpacking passwd (1:4.8.1-1ubuntu5.20.04.2) over (1:4.8.1-1ubuntu5.20.04.1) ...
Setting up passwd (1:4.8.1-1ubuntu5.20.04.2) ...
(Reading database ... 12625 files and directories currently installed.)
Preparing to unpack .../0-libcom-err2_1.45.5-2ubuntu1.1_amd64.deb ...
Unpacking libcom-err2:amd64 (1.45.5-2ubuntu1.1) over (1.45.5-2ubuntu1) ...
Preparing to unpack .../1-libss2_1.45.5-2ubuntu1.1_amd64.deb ...
Unpacking libss2:amd64 (1.45.5-2ubuntu1.1) over (1.45.5-2ubuntu1) ...
Preparing to unpack .../2-libssl1.1_1.1.1f-1ubuntu2.15_amd64.deb ...
Unpacking libssl1.1:amd64 (1.1.1f-1ubuntu2.15) over (1.1.1f-1ubuntu2.13) ...
Preparing to unpack .../3-openssl_1.1.1f-1ubuntu2.15_amd64.deb ...
Unpacking openssl (1.1.1f-1ubuntu2.15) over (1.1.1f-1ubuntu2.13) ...
Preparing to unpack .../4-ca-certificates_20211016~20.04.1_all.deb ...
Unpacking ca-certificates (20211016~20.04.1) over (20210119~20.04.2) ...
Preparing to unpack .../5-dpkg-dev_1.19.7ubuntu3.2_all.deb ...
Unpacking dpkg-dev (1.19.7ubuntu3.2) over (1.19.7ubuntu3) ...
Preparing to unpack .../6-libdpkg-perl_1.19.7ubuntu3.2_all.deb ...
Unpacking libdpkg-perl (1.19.7ubuntu3.2) over (1.19.7ubuntu3) ...
Preparing to unpack .../7-linux-libc-dev_5.4.0-120.136_amd64.deb ...
Unpacking linux-libc-dev:amd64 (5.4.0-120.136) over (5.4.0-113.127) ...
Setting up libssl1.1:amd64 (1.1.1f-1ubuntu2.15) ...
Setting up linux-libc-dev:amd64 (5.4.0-120.136) ...
Setting up libcom-err2:amd64 (1.45.5-2ubuntu1.1) ...
Setting up libss2:amd64 (1.45.5-2ubuntu1.1) ...
Setting up libdpkg-perl (1.19.7ubuntu3.2) ...
Setting up logsave (1.45.5-2ubuntu1.1) ...
Setting up openssl (1.1.1f-1ubuntu2.15) ...
Setting up e2fsprogs (1.45.5-2ubuntu1.1) ...
Setting up dpkg-dev (1.19.7ubuntu3.2) ...
Setting up ca-certificates (20211016~20.04.1) ...
Updating certificates in /etc/ssl/certs...
rehash: warning: skipping ca-certificates.crt,it does not contain exactly one certificate or CRL
7 added, 8 removed; done.
Processing triggers for libc-bin (2.31-0ubuntu9.9) ...
Processing triggers for ca-certificates (20211016~20.04.1) ...
Updating certificates in /etc/ssl/certs...
0 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...
done.
Reading package lists...
Building dependency tree...
Reading state information...
The following additional packages will be installed:
file libexpat1 libexpat1-dev libmagic-mgc libmagic1 libmpdec2 libpsl5
libpython3-dev libpython3-stdlib libpython3.8 libpython3.8-dev
libpython3.8-minimal libpython3.8-stdlib mime-support publicsuffix
python-pip-whl python3 python3-dev python3-distutils python3-lib2to3
python3-minimal python3-pkg-resources python3-setuptools python3-wheel
python3.8 python3.8-dev python3.8-minimal zlib1g-dev
Suggested packages:
python3-doc python3-tk python3-venv python-setuptools-doc python3.8-venv
python3.8-doc binfmt-support
The following NEW packages will be installed:
file libexpat1 libexpat1-dev libmagic-mgc libmagic1 libmpdec2 libpsl5
libpython3-dev libpython3-stdlib libpython3.8 libpython3.8-dev
libpython3.8-minimal libpython3.8-stdlib mime-support publicsuffix
python-pip-whl python3 python3-dev python3-distutils python3-lib2to3
python3-minimal python3-pip python3-pkg-resources python3-setuptools
python3-wheel python3.8 python3.8-dev python3.8-minimal wget zlib1g-dev
0 upgraded, 30 newly installed, 0 to remove and 4 not upgraded.
Need to get 14.9 MB of archives.
After this operation, 63.1 MB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 libpython3.8-minimal amd64 3.8.10-0ubuntu1~20.04.4 [717 kB]
Get:2 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 libexpat1 amd64 2.2.9-1ubuntu0.4 [74.4 kB]
Get:3 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 python3.8-minimal amd64 3.8.10-0ubuntu1~20.04.4 [1899 kB]
Get:4 http://archive.ubuntu.com/ubuntu focal/main amd64 python3-minimal amd64 3.8.2-0ubuntu2 [23.6 kB]
Get:5 http://archive.ubuntu.com/ubuntu focal/main amd64 mime-support all 3.64ubuntu1 [30.6 kB]
Get:6 http://archive.ubuntu.com/ubuntu focal/main amd64 libmpdec2 amd64 2.4.2-3 [81.1 kB]
Get:7 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 libpython3.8-stdlib amd64 3.8.10-0ubuntu1~20.04.4 [1675 kB]
Get:8 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 python3.8 amd64 3.8.10-0ubuntu1~20.04.4 [387 kB]
Get:9 http://archive.ubuntu.com/ubuntu focal/main amd64 libpython3-stdlib amd64 3.8.2-0ubuntu2 [7068 B]
Get:10 http://archive.ubuntu.com/ubuntu focal/main amd64 python3 amd64 3.8.2-0ubuntu2 [47.6 kB]
Get:11 http://archive.ubuntu.com/ubuntu focal/main amd64 libmagic-mgc amd64 1:5.38-4 [218 kB]
Get:12 http://archive.ubuntu.com/ubuntu focal/main amd64 libmagic1 amd64 1:5.38-4 [75.9 kB]
Get:13 http://archive.ubuntu.com/ubuntu focal/main amd64 file amd64 1:5.38-4 [23.3 kB]
Get:14 http://archive.ubuntu.com/ubuntu focal/main amd64 python3-pkg-resources all 45.2.0-1 [130 kB]
Get:15 http://archive.ubuntu.com/ubuntu focal/main amd64 libpsl5 amd64 0.21.0-1ubuntu1 [51.5 kB]
Get:16 http://archive.ubuntu.com/ubuntu focal/main amd64 publicsuffix all 20200303.0012-1 [111 kB]
Get:17 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 wget amd64 1.20.3-1ubuntu2 [348 kB]
Get:18 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 libexpat1-dev amd64 2.2.9-1ubuntu0.4 [117 kB]
Get:19 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 libpython3.8 amd64 3.8.10-0ubuntu1~20.04.4 [1625 kB]
Get:20 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 libpython3.8-dev amd64 3.8.10-0ubuntu1~20.04.4 [3952 kB]
Get:21 http://archive.ubuntu.com/ubuntu focal/main amd64 libpython3-dev amd64 3.8.2-0ubuntu2 [7236 B]
Get:22 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 python-pip-whl all 20.0.2-5ubuntu1.6 [1805 kB]
Get:23 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 zlib1g-dev amd64 1:1.2.11.dfsg-2ubuntu1.3 [155 kB]
Get:24 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 python3.8-dev amd64 3.8.10-0ubuntu1~20.04.4 [514 kB]
Get:25 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 python3-lib2to3 all 3.8.10-0ubuntu1~20.04 [76.3 kB]
Get:26 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 python3-distutils all 3.8.10-0ubuntu1~20.04 [141 kB]
Get:27 http://archive.ubuntu.com/ubuntu focal/main amd64 python3-dev amd64 3.8.2-0ubuntu2 [1212 B]
Get:28 http://archive.ubuntu.com/ubuntu focal/main amd64 python3-setuptools all 45.2.0-1 [330 kB]
Get:29 http://archive.ubuntu.com/ubuntu focal/universe amd64 python3-wheel all 0.34.2-1 [23.8 kB]
Get:30 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 python3-pip all 20.0.2-5ubuntu1.6 [231 kB]
debconf: delaying package configuration, since apt-utils is not installed
Fetched 14.9 MB in 0s (81.4 MB/s)
Selecting previously unselected package libpython3.8-minimal:amd64.
(Reading database ... 12624 files and directories currently installed.)
Preparing to unpack .../libpython3.8-minimal_3.8.10-0ubuntu1~20.04.4_amd64.deb ...
Unpacking libpython3.8-minimal:amd64 (3.8.10-0ubuntu1~20.04.4) ...
Selecting previously unselected package libexpat1:amd64.
Preparing to unpack .../libexpat1_2.2.9-1ubuntu0.4_amd64.deb ...
Unpacking libexpat1:amd64 (2.2.9-1ubuntu0.4) ...
Selecting previously unselected package python3.8-minimal.
Preparing to unpack .../python3.8-minimal_3.8.10-0ubuntu1~20.04.4_amd64.deb ...
Unpacking python3.8-minimal (3.8.10-0ubuntu1~20.04.4) ...
Setting up libpython3.8-minimal:amd64 (3.8.10-0ubuntu1~20.04.4) ...
Setting up libexpat1:amd64 (2.2.9-1ubuntu0.4) ...
Setting up python3.8-minimal (3.8.10-0ubuntu1~20.04.4) ...
Selecting previously unselected package python3-minimal.
(Reading database ... 12915 files and directories currently installed.)
Preparing to unpack .../0-python3-minimal_3.8.2-0ubuntu2_amd64.deb ...
Unpacking python3-minimal (3.8.2-0ubuntu2) ...
Selecting previously unselected package mime-support.
Preparing to unpack .../1-mime-support_3.64ubuntu1_all.deb ...
Unpacking mime-support (3.64ubuntu1) ...
Selecting previously unselected package libmpdec2:amd64.
Preparing to unpack .../2-libmpdec2_2.4.2-3_amd64.deb ...
Unpacking libmpdec2:amd64 (2.4.2-3) ...
Selecting previously unselected package libpython3.8-stdlib:amd64.
Preparing to unpack .../3-libpython3.8-stdlib_3.8.10-0ubuntu1~20.04.4_amd64.deb ...
Unpacking libpython3.8-stdlib:amd64 (3.8.10-0ubuntu1~20.04.4) ...
Selecting previously unselected package python3.8.
Preparing to unpack .../4-python3.8_3.8.10-0ubuntu1~20.04.4_amd64.deb ...
Unpacking python3.8 (3.8.10-0ubuntu1~20.04.4) ...
Selecting previously unselected package libpython3-stdlib:amd64.
Preparing to unpack .../5-libpython3-stdlib_3.8.2-0ubuntu2_amd64.deb ...
Unpacking libpython3-stdlib:amd64 (3.8.2-0ubuntu2) ...
Setting up python3-minimal (3.8.2-0ubuntu2) ...
Selecting previously unselected package python3.
(Reading database ... 13317 files and directories currently installed.)
Preparing to unpack .../00-python3_3.8.2-0ubuntu2_amd64.deb ...
Unpacking python3 (3.8.2-0ubuntu2) ...
Selecting previously unselected package libmagic-mgc.
Preparing to unpack .../01-libmagic-mgc_1%3a5.38-4_amd64.deb ...
Unpacking libmagic-mgc (1:5.38-4) ...
Selecting previously unselected package libmagic1:amd64.
Preparing to unpack .../02-libmagic1_1%3a5.38-4_amd64.deb ...
Unpacking libmagic1:amd64 (1:5.38-4) ...
Selecting previously unselected package file.
Preparing to unpack .../03-file_1%3a5.38-4_amd64.deb ...
Unpacking file (1:5.38-4) ...
Selecting previously unselected package python3-pkg-resources.
Preparing to unpack .../04-python3-pkg-resources_45.2.0-1_all.deb ...
Unpacking python3-pkg-resources (45.2.0-1) ...
Selecting previously unselected package libpsl5:amd64.
Preparing to unpack .../05-libpsl5_0.21.0-1ubuntu1_amd64.deb ...
Unpacking libpsl5:amd64 (0.21.0-1ubuntu1) ...
Selecting previously unselected package publicsuffix.
Preparing to unpack .../06-publicsuffix_20200303.0012-1_all.deb ...
Unpacking publicsuffix (20200303.0012-1) ...
Selecting previously unselected package wget.
Preparing to unpack .../07-wget_1.20.3-1ubuntu2_amd64.deb ...
Unpacking wget (1.20.3-1ubuntu2) ...
Selecting previously unselected package libexpat1-dev:amd64.
Preparing to unpack .../08-libexpat1-dev_2.2.9-1ubuntu0.4_amd64.deb ...
Unpacking libexpat1-dev:amd64 (2.2.9-1ubuntu0.4) ...
Selecting previously unselected package libpython3.8:amd64.
Preparing to unpack .../09-libpython3.8_3.8.10-0ubuntu1~20.04.4_amd64.deb ...
Unpacking libpython3.8:amd64 (3.8.10-0ubuntu1~20.04.4) ...
Selecting previously unselected package libpython3.8-dev:amd64.
Preparing to unpack .../10-libpython3.8-dev_3.8.10-0ubuntu1~20.04.4_amd64.deb ...
Unpacking libpython3.8-dev:amd64 (3.8.10-0ubuntu1~20.04.4) ...
Selecting previously unselected package libpython3-dev:amd64.
Preparing to unpack .../11-libpython3-dev_3.8.2-0ubuntu2_amd64.deb ...
Unpacking libpython3-dev:amd64 (3.8.2-0ubuntu2) ...
Selecting previously unselected package python-pip-whl.
Preparing to unpack .../12-python-pip-whl_20.0.2-5ubuntu1.6_all.deb ...
Unpacking python-pip-whl (20.0.2-5ubuntu1.6) ...
Selecting previously unselected package zlib1g-dev:amd64.
Preparing to unpack .../13-zlib1g-dev_1%3a1.2.11.dfsg-2ubuntu1.3_amd64.deb ...
Unpacking zlib1g-dev:amd64 (1:1.2.11.dfsg-2ubuntu1.3) ...
Selecting previously unselected package python3.8-dev.
Preparing to unpack .../14-python3.8-dev_3.8.10-0ubuntu1~20.04.4_amd64.deb ...
Unpacking python3.8-dev (3.8.10-0ubuntu1~20.04.4) ...
Selecting previously unselected package python3-lib2to3.
Preparing to unpack .../15-python3-lib2to3_3.8.10-0ubuntu1~20.04_all.deb ...
Unpacking python3-lib2to3 (3.8.10-0ubuntu1~20.04) ...
Selecting previously unselected package python3-distutils.
Preparing to unpack .../16-python3-distutils_3.8.10-0ubuntu1~20.04_all.deb ...
Unpacking python3-distutils (3.8.10-0ubuntu1~20.04) ...
Selecting previously unselected package python3-dev.
Preparing to unpack .../17-python3-dev_3.8.2-0ubuntu2_amd64.deb ...
Unpacking python3-dev (3.8.2-0ubuntu2) ...
Selecting previously unselected package python3-setuptools.
Preparing to unpack .../18-python3-setuptools_45.2.0-1_all.deb ...
Unpacking python3-setuptools (45.2.0-1) ...
Selecting previously unselected package python3-wheel.
Preparing to unpack .../19-python3-wheel_0.34.2-1_all.deb ...
Unpacking python3-wheel (0.34.2-1) ...
Selecting previously unselected package python3-pip.
Preparing to unpack .../20-python3-pip_20.0.2-5ubuntu1.6_all.deb ...
Unpacking python3-pip (20.0.2-5ubuntu1.6) ...
Setting up libpsl5:amd64 (0.21.0-1ubuntu1) ...
Setting up mime-support (3.64ubuntu1) ...
Setting up wget (1.20.3-1ubuntu2) ...
Setting up libmagic-mgc (1:5.38-4) ...
Setting up libmagic1:amd64 (1:5.38-4) ...
Setting up file (1:5.38-4) ...
Setting up libexpat1-dev:amd64 (2.2.9-1ubuntu0.4) ...
Setting up zlib1g-dev:amd64 (1:1.2.11.dfsg-2ubuntu1.3) ...
Setting up python-pip-whl (20.0.2-5ubuntu1.6) ...
Setting up libmpdec2:amd64 (2.4.2-3) ...
Setting up libpython3.8-stdlib:amd64 (3.8.10-0ubuntu1~20.04.4) ...
Setting up python3.8 (3.8.10-0ubuntu1~20.04.4) ...
Setting up publicsuffix (20200303.0012-1) ...
Setting up libpython3-stdlib:amd64 (3.8.2-0ubuntu2) ...
Setting up python3 (3.8.2-0ubuntu2) ...
Setting up python3-wheel (0.34.2-1) ...
Setting up libpython3.8:amd64 (3.8.10-0ubuntu1~20.04.4) ...
Setting up python3-lib2to3 (3.8.10-0ubuntu1~20.04) ...
Setting up python3-pkg-resources (45.2.0-1) ...
Setting up python3-distutils (3.8.10-0ubuntu1~20.04) ...
Setting up python3-setuptools (45.2.0-1) ...
Setting up libpython3.8-dev:amd64 (3.8.10-0ubuntu1~20.04.4) ...
Setting up python3-pip (20.0.2-5ubuntu1.6) ...
Setting up python3.8-dev (3.8.10-0ubuntu1~20.04.4) ...
Setting up libpython3-dev:amd64 (3.8.2-0ubuntu2) ...
Setting up python3-dev (3.8.2-0ubuntu2) ...
Processing triggers for libc-bin (2.31-0ubuntu9.9) ...
Removing intermediate container 07b6d6f53352
---> bdad80779900
Step 10/14 : RUN python3 -m pip install --no-cache-dir --upgrade pip
---> Running in a7851101a995
Collecting pip
Downloading pip-22.1.2-py3-none-any.whl (2.1 MB)
Installing collected packages: pip
Attempting uninstall: pip
Found existing installation: pip 20.0.2
Not uninstalling pip at /usr/lib/python3/dist-packages, outside environment /usr
Can't uninstall 'pip'. No files were found to uninstall.
Successfully installed pip-22.1.2
Removing intermediate container a7851101a995
---> fdd991e1849d
Step 11/14 : WORKDIR /workspace
---> Running in 69aa6a53c3e4
Removing intermediate container 69aa6a53c3e4
---> 325c1b83c37e
Step 12/14 : ADD ./requirements.txt /workspace
---> 1757a4388050
Step 13/14 : RUN python3 -m pip install -r /workspace/requirements.txt and &&rm /workspace/requirements.txt
---> Running in df9be31afa55
ERROR: Could not find a version that satisfies the requirement and (from versions: none)
ERROR: No matching distribution found for and
The command '/bin/bash -c python3 -m pip install -r /workspace/requirements.txt and &&rm /workspace/requirements.txt' returned a non-zero code: 1
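The failure is caused by the stray word `and` (plus the missing space in `&&rm`) in the Dockerfile's `RUN` instruction: pip tries to install a package literally named `and`. Presumably the intended step looks like this (a sketch based only on the paths visible in the build log):

```dockerfile
# Step 13/14 with the stray "and" removed and "&&rm" split into "&& rm":
RUN python3 -m pip install -r /workspace/requirements.txt \
    && rm /workspace/requirements.txt
```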
One way to avoid this might be to use conda environments with less restrictive version requirements. I've created a YAML file that can be used to set up all the dependencies, though I haven't yet tried running GenoCAE with it.
Perhaps this could be used when setting up the Docker container, instead of the requirements.txt file? (PS: I only added the .txt suffix so the file could be uploaded to GH Issues.)
Thanks! Really looking forward to using GenoCAE!
env.yml.txt
conda env create -f env.yml.txt
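For illustration, a hedged sketch of how the Docker image could be built from the conda environment instead; the base image and the environment name (`genocae`) are assumptions on my part, not something taken from the repository:

```dockerfile
# Hypothetical sketch: create the environment from the attached env.yml
# (renamed from env.yml.txt) instead of installing requirements.txt with pip.
FROM continuumio/miniconda3
WORKDIR /workspace
ADD ./env.yml /workspace
RUN conda env create -f /workspace/env.yml && rm /workspace/env.yml
# Put the environment's interpreter first on the PATH for later steps.
ENV PATH=/opt/conda/envs/genocae/bin:$PATH
```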
Best,
Brian
Dear GenoCAE maintainers,
Thanks for GenoCAE and its Continuous Integration (GitHub Actions) script!
When I run GenoCAE with the added/experimental phenotype, I can now (thanks to #19) train the neural network. Great!
However, when I want to project the genotypes, I can get it to run (after fixing #21), but in the end it fails.
Training goes great, as confirmed by this example GitHub Actions log:
python3 run_gcae.py train --datadir example_tiny --data issue_6_bin --model_id M1 --epochs 3 --save_interval 1 --train_opts_id ex3 --data_opts_id b_0_4 --pheno_model_id=p1
The last line of the output is also clear:
Done training. Wrote to /home/runner/work/GenoCAE/GenoCAE/ae_out/ae.M1.ex3.b_0_4.issue_6_bin.p1
When I start using the `project` option (after fixing #21), it starts running but fails in the visualization:
When I run on GHA like this:
python3 run_gcae.py project --datadir example_tiny --data issue_6_bin --model_id M1 --train_opts_id ex3 --data_opts_id b_0_4 --superpops example_tiny/HO_superpopulations --pheno_model_id=p1
The projection runs until the visualization step, where it gives the following error (as copied from the GHA log; the full error message is at the bottom of this Issue):
Traceback (most recent call last):
File "run_gcae.py", line 1616, in <module>
main()
File "run_gcae.py", line 1365, in main
plot_coords_by_superpop(coords_by_pop,"{0}/dimred_e_{1}_by_superpop".format(results_directory, epoch), superpopulations_file, plot_legend = epoch == epochs[0])
File "/home/runner/work/GenoCAE/GenoCAE/utils/visualization.py", line 222, in plot_coords_by_superpop
max_num_pops = max([len(superpop_dict[spop]) for spop in superpops])
ValueError: max() arg is an empty sequence
Error: Process completed with exit code 1.
I expected the projection to work the same with or without the phenotype, as it gives a useful visualization of the dimensionality reduction, as shown in the Ausmees & Nettelblad paper [1].
How do I get this to work?
Thanks and cheers, Richel
Copied from a GHA log:
2021-12-07 12:55:17.080095: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-12-07 12:55:17.080130: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-12-07 12:55:19.686377: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2021-12-07 12:55:19.686415: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2021-12-07 12:55:19.686436: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (fv-az74-543): /proc/driver/nvidia/version does not exist
2021-12-07 12:55:19.686793: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
tensorflow version 2.7.0
______________________________ arguments ______________________________
train : False
datadir : example_tiny
data : issue_6_bin
model_id : M1
train_opts_id : ex3
data_opts_id : b_0_4
save_interval : None
epochs : None
resume_from : None
trainedmodeldir : None
pheno_model_id : p1
project : True
superpops : example_tiny/HO_superpopulations
epoch : None
pdata : None
trainedmodelname : None
plot : False
animate : False
evaluate : False
metrics : None
______________________________ data opts ______________________________
sparsifies : [0.0, 0.1, 0.2, 0.3, 0.4]
norm_opts : {'flip': False, 'missing_val': -1.0}
norm_mode : genotypewise01
impute_missing : True
validation_split : 0.2
______________________________ train opts ______________________________
learning_rate : 0.00032
batch_size : 10
noise_std : 0.0032
n_samples : -1
loss : {'module': 'tf.keras.losses', 'class': 'CategoricalCrossentropy', 'args': {'from_logits': False}}
regularizer : {'reg_factor': 1e-07, 'module': 'tf.keras.regularizers', 'class': 'l2'}
lr_scheme : {'module': 'tf.keras.optimizers.schedules', 'class': 'ExponentialDecay', 'args': {'decay_rate': 0.96, 'decay_steps': 100, 'staircase': False}}
______________________________
Imputing originally missing genotypes to most common value.
Reading ind pop list from /home/runner/work/GenoCAE/GenoCAE/example_tiny/issue_6_bin.fam
Reading ind pop list from /home/runner/work/GenoCAE/GenoCAE/example_tiny/issue_6_bin.fam
Mapping files: 0%| | 0/3 [00:00<?, ?it/s]
Mapping files: 100%|██████████| 3/3 [00:00<00:00, 227.43it/s]
array([[ 0.10205683, -0.50682646, -0.7242572 , -0.41514382],
[ 0.08256383, -0.4811394 , -0.66083604, -0.38157633],
[ 0.04545861, -0.38441584, -0.52315164, -0.36843315],
[ 0.10497452, -0.50911087, -0.73314375, -0.42168617],
[ 0.06072002, -0.40441108, -0.57263076, -0.39334926]],
dtype=float32)
[[0.5 0.5 0 0.5]
[0.5 0.5 0.5 0.5]
[0.5 0.5 0.5 0.5]
[0.5 0.5 0 0.5]
[0.5 0.5 0.5 0.5]] array([[1. , 1. , 0.5, 1. ],
[1. , 0.5, 1. , 0.5],
[1. , 0. , 0.5, 1. ],
[0.5, 0.5, 0.5, 0.5],
[1. , 0.5, 0. , 0.5]])
Encoded data file not found: /home/runner/work/GenoCAE/GenoCAE/ae_out/ae.M1.ex3.b_0_4.issue_6_bin.p1/issue_6_bin/encoded_data.h5
Projecting epochs: [1, 2, 3]
Already projected: []
In DG.get_train_set: number of -1.0 genotypes in train: 0
In DG.get_train_set: number of -9 genotypes in train: 0
In DG.get_train_set: number of 0 values in train mask: 0
______________________________ Building model ______________________________
Adding layer: Conv1D: {'filters': 8, 'kernel_size': 5, 'padding': 'same', 'strides': 1}
Adding layer: BatchNormalization: {}
Adding layer: ResidualBlock2: {'filters': 8, 'kernel_size': 5}
--- conv1d filters: 8 kernel_size: 5
--- batch normalization
--- conv1d filters: 8 kernel_size: 5
--- batch normalization
Adding layer: MaxPooling1D: {'pool_size': 5, 'strides': 2, 'padding': 'same'}
Adding layer: Conv1D: {'filters': 8, 'kernel_size': 5, 'padding': 'same', 'activation': 'elu'}
Adding layer: BatchNormalization: {}
Adding layer: Flatten: {}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 75}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 75, 'activation': 'elu'}
Adding layer: Dense: {'units': 2, 'name': 'encoded'}
Adding layer: Dense: {'units': 75, 'activation': 'elu'}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 75, 'activation': 'elu'}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 16}
Adding layer: Reshape: {'target_shape': (2, 8), 'name': 'i_msvar'}
Adding layer: Conv1D: {'filters': 8, 'kernel_size': 5, 'padding': 'same', 'activation': 'elu'}
Adding layer: BatchNormalization: {}
Adding layer: Reshape: {'target_shape': (2, 1, 8)}
Adding layer: UpSampling2D: {'size': (2, 1)}
Adding layer: Reshape: {'target_shape': (4, 8)}
Adding layer: ResidualBlock2: {'filters': 8, 'kernel_size': 5}
--- conv1d filters: 8 kernel_size: 5
--- batch normalization
--- conv1d filters: 8 kernel_size: 5
--- batch normalization
Adding layer: Conv1D: {'filters': 8, 'kernel_size': 5, 'padding': 'same', 'activation': 'elu', 'name': 'nms'}
Adding layer: BatchNormalization: {}
Adding layer: Conv1D: {'filters': 1, 'kernel_size': 1, 'padding': 'same'}
Adding layer: Flatten: {'name': 'logits'}
______________________________ Building model ______________________________
Adding layer: Dense: {'units': 75}
Adding layer: LeakyReLU: {}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 75}
Adding layer: LeakyReLU: {}
Adding layer: Dense: {'units': 75}
Adding layer: LeakyReLU: {}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 75}
Adding layer: LeakyReLU: {}
Adding layer: Dense: {'units': 1}
No marker specific variable.
########################### epoch 1 ###########################
Reading weights from /home/runner/work/GenoCAE/GenoCAE/ae_out/ae.M1.ex3.b_0_4.issue_6_bin.p1/weights/1
tf.Tensor(
[0.23319791 0.19988029 0.19377704 0.23756738 0.21141517 0.19966617
0.18867628 0.22778517 0.22095694 0.23621069 0.24232931 0.23751874
0.21052796 0.2478469 0.22860327 0.24310602 0.2248713 0.22607517
0.21316327 0.24836378 0.24003232 0.2017554 0.2420473 0.25501102
0.236629 0.22140262 0.20744076 0.20671275 0.22881663 0.19617875], shape=(30,), dtype=float32)
(30,)
tf.Tensor(
[0.2380066 0.19335003 0.23319791 0.23863165 0.23831964 0.18697073
0.23600692 0.21887155 0.1588867 0.1949499 0.21993566 0.25195104
0.18325783 0.24391052 0.18994804 0.23802298 0.20401272 0.22448331
0.2229189 0.21869858 0.23501374 0.22200371 0.23621069 0.20525226
0.2003792 0.24328303 0.24873887 0.23385888 0.24009213 0.2101993 ], shape=(30,), dtype=float32)
(30,)
tf.Tensor(
[0.21908474 0.22722876 0.2339882 0.23049714 0.20231368 0.2011057
0.21599561 0.17069077 0.23896992 0.2420473 0.24056283 0.20440042
0.24473031 0.23399888 0.2080764 0.21408066 0.23621069 0.20678237
0.20441745 0.19976138 0.200915 0.22420047 0.21946532 0.2136519
0.23442683 0.2120275 0.23950182 0.21574992 0.23319791 0.2304342 ], shape=(30,), dtype=float32)
(30,)
tf.Tensor(
[0.24093255 0.23797578 0.23640822 0.19804403 0.21179074 0.24339268
0.20556012 0.24795108 0.22094505 0.25103822 0.2339133 0.18515958
0.23047343 0.23206557 0.20824197 0.23773867 0.22685748 0.18689398
0.21542913 0.23442683 0.24944112 0.24592474 0.22365497 0.22963423
0.19812118 0.24454156 0.23143443 0.21166426 0.21157375 0.2214748 ], shape=(30,), dtype=float32)
(30,)
tf.Tensor(
[0.2144528 0.23790193 0.20847306 0.17789963 0.22853369 0.22519661
0.22575805 0.23663682 0.2309236 0.21082726 0.19669218 0.1876471
0.18697073 0.22914475 0.20111184 0.2027495 0.22810453 0.24159692
0.24206525 0.19896871 0.22794227 0.21941704 0.21471435 0.19822605
0.20103996 0.23831964 0.18830639 0.20552994 0.23621069 0.23235762], shape=(30,), dtype=float32)
(30,)
tf.Tensor(
[0.24529332 0.2289039 0.23621069 0.19376357 0.23415062 0.22575805
0.21179074 0.21793306 0.23040852 0.21893027 0.24770258 0.19905189
0.21635695 0.2532666 0.24553452 0.1958462 ], shape=(16,), dtype=float32)
(16,)
Traceback (most recent call last):
File "run_gcae.py", line 1616, in <module>
main()
File "run_gcae.py", line 1365, in main
plot_coords_by_superpop(coords_by_pop,"{0}/dimred_e_{1}_by_superpop".format(results_directory, epoch), superpopulations_file, plot_legend = epoch == epochs[0])
File "/home/runner/work/GenoCAE/GenoCAE/utils/visualization.py", line 222, in plot_coords_by_superpop
max_num_pops = max([len(superpop_dict[spop]) for spop in superpops])
ValueError: max() arg is an empty sequence
Error: Process completed with exit code 1.
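My reading of the traceback (not a confirmed diagnosis): the list comprehension in plot_coords_by_superpop is evaluated over an empty list of superpopulations, i.e. none of the populations in the projection were matched against the HO_superpopulations file, and max() then crashes on the empty sequence. A minimal defensive sketch, purely hypothetical with respect to how utils/visualization.py is actually structured:

```python
# Hypothetical guard in plot_coords_by_superpop (utils/visualization.py):
# fail with a readable message when no superpopulations were matched,
# instead of letting max() crash on an empty sequence.
num_pops_per_superpop = [len(superpop_dict[spop]) for spop in superpops]
if not num_pops_per_superpop:
    raise ValueError(
        "No superpopulations matched the populations in the projection; "
        "check that the population labels in the data agree with the "
        "--superpops file."
    )
max_num_pops = max(num_pops_per_superpop)
```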
This is a note to self, as I cannot assign myself as I am not a Collaborator. Hence I assign myself in text :-)
Dear GCAE maintainer,
Here I try to convince you to give a shorter error message when --datadir is absent.
Thanks for the GCAE examples provided; these are very helpful!
When I run the first GCAE training example ...
python3 run_gcae.py train --datadir example_tiny/ --data HumanOrigins249_tiny --model_id M1 --epochs 20 --save_interval 2 --train_opts_id ex3 --data_opts_id b_0_4
I get a clear-but-long error message:
2021-06-28 13:48:01.293305: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2021-06-28 13:48:01.293338: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
tensorflow version 2.3.3
Traceback (most recent call last):
File "/home/richel/.local/share/gcaer/gcae_v1_0/run_gcae.py", line 396, in <module>
with open("data_opts/" + data_opts_id+".json") as data_opts_def_file:
FileNotFoundError: [Errno 2] No such file or directory: 'data_opts/b_0_4.json'
The drawback is that this is too long of an error message for R to display (here I use the gcaer R package):
Also, one could argue that initializing Tensorflow and looking for CUDA should be done after checking if the CLI arguments are valid.
In that way, the error message would shorten to the lines below and I would be happy:
Traceback (most recent call last):
File "/home/richel/.local/share/gcaer/gcae_v1_0/run_gcae.py", line 396, in <module>
with open("data_opts/" + data_opts_id+".json") as data_opts_def_file:
FileNotFoundError: [Errno 2] No such file or directory: 'data_opts/b_0_4.json'
An alternative would be the ability to suppress these TensorFlow warnings via a CLI argument.
What I suggest is one of these options:
- check the CLI arguments (and the files they point to) before initializing TensorFlow and looking for CUDA, or
- add a `--verbose` argument and only show the TensorFlow output when it is enabled.
What do you think about this idea?
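As a sketch of the first option (hypothetical; the actual argument handling in run_gcae.py may differ), the expensive TensorFlow import can simply be deferred until the option files are known to exist:

```python
# Hypothetical sketch: check that the option files exist before importing
# tensorflow, so a missing data_opts JSON fails fast with a short message.
import os
import sys

def check_data_opts_exist(data_opts_id):
    path = "data_opts/" + data_opts_id + ".json"
    if not os.path.exists(path):
        sys.exit("Error: data opts file not found: " + path)

check_data_opts_exist("b_0_4")  # fails fast, no TensorFlow banner
import tensorflow as tf         # only imported once the arguments are valid
```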
Dear GenoCAE maintainers,
Thanks for GenoCAE and its Continuous Integration (GitHub Actions) script!
When I run GenoCAE with the added/experimental phenotype, I can now (thanks to #19) train the neural network. Great!
However, when I want to project the genotypes, I get misleading error messages that appear too early.
Training goes great, as confirmed by this example GitHub Actions log:
python3 run_gcae.py train --datadir example_tiny --data issue_2_bin --model_id M1 --epochs 20 --save_interval 2 --train_opts_id ex3 --data_opts_id b_0_4 --pheno_model_id=p1
The last line of the output is also clear:
Done training. Wrote to /home/runner/work/GenoCAE/GenoCAE/ae_out/ae.M1.ex3.b_0_4.issue_2_bin.p1
Note the `.p1` addition to the folder name, which is not there when not working with a phenotype.
When I start using the `project` option, which I copied from the doc, I get unexpected and/or premature error messages:
When I run on GHA like this:
python3 run_gcae.py project --datadir example_tiny --data issue_2_bin --model_id M1 --train_opts_id ex3 --data_opts_id b_0_4 --superpops example_tiny/HO_superpopulations --pheno_model_id=p1
I get the error:
Invalid command. Run 'python run_gcae.py --help' for more information.
as if `--pheno_model_id=p1` is not supported yet.
Sure, I can delete that flag altogether, but then I get:
FileNotFoundError: [Errno 2] No such file or directory: '/home/runner/work/GenoCAE/GenoCAE/ae_out/ae.M1.ex3.b_0_4.issue_2_bin/weights'
Note the absence of `.p1` in the folder name.
The error I would expect is that the dataset used (`issue_2_bin`) does not work with the file specified with `--superpops example_tiny/HO_superpopulations` (although it might work by sheer luck).
How can I use `project` on a neural net that can also do a phenotype?
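If run_gcae.py parses its commands from a docopt-style usage string (an assumption on my part), the `project` pattern would also need to accept the flag that `train` already accepts; a purely hypothetical sketch, with the other options elided:

```
Usage:
  run_gcae.py train   ... [--pheno_model_id=<name>]
  run_gcae.py project ... [--pheno_model_id=<name>]
```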
Dear GenoCAE maintainers, hi @kausmees and @cnettel,
When I run the GenoCAE experimental `Pheno` branch, I get an error that I have no idea what to do with. Below is the reprex.
Currently, the GitHub Actions script runs GenoCAE with the `--help` flag, showing the help successfully.
On my fork of GenoCAE in the GitHub Actions 'check.yaml' script, I added the following command to run:
python3 run_gcae.py train --datadir example_tiny --data issue_2_bin --model_id M1 --epochs 20 --save_interval 2 --train_opts_id ex3 --data_opts_id b_0_4 --pheno_model_id=p1
The `issue_2_bin` data files (passed via `--data`) are the ones I supplied to Carl at this Issue; they are already in the `example_tiny` folder of my 'GenoCAE' fork.
GitHub Actions gives the following error:
ValueError: in user code:
File "/home/richel/GitHubs/GenoCAE/run_gcae.py", line 424, in run_optimization *
loss_value += tf.math.reduce_sum(((-y_pred) * y_true)) * 1e-6
ValueError: Dimensions must be equal, but are 2 and 4 for '{{node mul_21}} = Mul[T=DT_FLOAT](Neg_2, one_hot_2)' with input shapes: [2,4], [2,4,3].
Below is the full error log, which can also be found in this GitHub Actions log.
What does the error mean?
Thanks and cheers, Richel
richel@N141CU:~/GitHubs/GenoCAE$ python3 run_gcae.py train --datadir example_tiny --data issue_2_bin --model_id M1 --epochs 20 --save_interval 2 --train_opts_id ex3 --data_opts_id b_0_4 --pheno_model_id=p1
2021-11-30 13:41:10.460286: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-11-30 13:41:10.460312: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-11-30 13:41:13.244831: E tensorflow/stream_executor/cuda/cuda_driver.cc:271] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2021-11-30 13:41:13.244903: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (N141CU): /proc/driver/nvidia/version does not exist
2021-11-30 13:41:13.245186: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
tensorflow version 2.7.0
______________________________ arguments ______________________________
train : True
datadir : example_tiny
data : issue_2_bin
model_id : M1
train_opts_id : ex3
data_opts_id : b_0_4
save_interval : 2
epochs : 20
resume_from : None
trainedmodeldir : None
pheno_model_id : p1
project : False
superpops : None
epoch : None
pdata : None
trainedmodelname : None
plot : False
animate : False
evaluate : False
metrics : None
______________________________ data opts ______________________________
sparsifies : [0.0, 0.1, 0.2, 0.3, 0.4]
norm_opts : {'flip': False, 'missing_val': -1.0}
norm_mode : genotypewise01
impute_missing : True
validation_split : 0.2
______________________________ train opts ______________________________
learning_rate : 0.00032
batch_size : 10
noise_std : 0.0032
n_samples : -1
loss : {'module': 'tf.keras.losses', 'class': 'CategoricalCrossentropy', 'args': {'from_logits': False}}
regularizer : {'reg_factor': 1e-07, 'module': 'tf.keras.regularizers', 'class': 'l2'}
lr_scheme : {'module': 'tf.keras.optimizers.schedules', 'class': 'ExponentialDecay', 'args': {'decay_rate': 0.96, 'decay_steps': 100, 'staircase': False}}
______________________________
Imputing originally missing genotypes to most common value.
Reading ind pop list from /home/richel/GitHubs/GenoCAE/example_tiny/issue_2_bin.fam
Reading ind pop list from /home/richel/GitHubs/GenoCAE/example_tiny/issue_2_bin.fam
Mapping files: 100%|███████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 362.20it/s]
Using learning rate schedule tf.keras.optimizers.schedules.ExponentialDecay with {'decay_rate': 0.96, 'decay_steps': 100, 'staircase': False}
______________________________ Data ______________________________
N unique train samples: 800
--- training on : 800
N valid samples: 200
N markers: 4
______________________________ Building model ______________________________
Adding layer: Conv1D: {'filters': 8, 'kernel_size': 5, 'padding': 'same', 'strides': 1}
Adding layer: BatchNormalization: {}
Adding layer: ResidualBlock2: {'filters': 8, 'kernel_size': 5}
--- conv1d filters: 8 kernel_size: 5
--- batch normalization
--- conv1d filters: 8 kernel_size: 5
--- batch normalization
Adding layer: MaxPooling1D: {'pool_size': 5, 'strides': 2, 'padding': 'same'}
Adding layer: Conv1D: {'filters': 8, 'kernel_size': 5, 'padding': 'same', 'activation': 'elu'}
Adding layer: BatchNormalization: {}
Adding layer: Flatten: {}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 75}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 75, 'activation': 'elu'}
Adding layer: Dense: {'units': 2, 'name': 'encoded'}
Adding layer: Dense: {'units': 75, 'activation': 'elu'}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 75, 'activation': 'elu'}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 16}
Adding layer: Reshape: {'target_shape': (2, 8), 'name': 'i_msvar'}
Adding layer: Conv1D: {'filters': 8, 'kernel_size': 5, 'padding': 'same', 'activation': 'elu'}
Adding layer: BatchNormalization: {}
Adding layer: Reshape: {'target_shape': (2, 1, 8)}
Adding layer: UpSampling2D: {'size': (2, 1)}
Adding layer: Reshape: {'target_shape': (4, 8)}
Adding layer: ResidualBlock2: {'filters': 8, 'kernel_size': 5}
--- conv1d filters: 8 kernel_size: 5
--- batch normalization
--- conv1d filters: 8 kernel_size: 5
--- batch normalization
Adding layer: Conv1D: {'filters': 8, 'kernel_size': 5, 'padding': 'same', 'activation': 'elu', 'name': 'nms'}
Adding layer: BatchNormalization: {}
Adding layer: Conv1D: {'filters': 1, 'kernel_size': 1, 'padding': 'same'}
Adding layer: Flatten: {'name': 'logits'}
______________________________ Building model ______________________________
Adding layer: Conv1D: {'filters': 8, 'kernel_size': 5, 'padding': 'same', 'strides': 1}
Adding layer: BatchNormalization: {}
Adding layer: ResidualBlock2: {'filters': 8, 'kernel_size': 5}
--- conv1d filters: 8 kernel_size: 5
--- batch normalization
--- conv1d filters: 8 kernel_size: 5
--- batch normalization
Adding layer: MaxPooling1D: {'pool_size': 5, 'strides': 2, 'padding': 'same'}
Adding layer: Conv1D: {'filters': 8, 'kernel_size': 5, 'padding': 'same'}
Adding layer: BatchNormalization: {}
Adding layer: Flatten: {}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 75}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 75}
Adding layer: Dense: {'units': 2, 'name': 'encoded'}
Adding layer: Dense: {'units': 75}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 75}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 16}
Adding layer: Reshape: {'target_shape': (2, 8), 'name': 'i_msvar'}
Adding layer: Conv1D: {'filters': 8, 'kernel_size': 5, 'padding': 'same'}
Adding layer: BatchNormalization: {}
Adding layer: Reshape: {'target_shape': (2, 1, 8)}
Adding layer: UpSampling2D: {'size': (2, 1)}
Adding layer: Reshape: {'target_shape': (4, 8)}
Adding layer: ResidualBlock2: {'filters': 8, 'kernel_size': 5}
--- conv1d filters: 8 kernel_size: 5
--- batch normalization
--- conv1d filters: 8 kernel_size: 5
--- batch normalization
Adding layer: Conv1D: {'filters': 8, 'kernel_size': 5, 'padding': 'same', 'name': 'nms'}
Adding layer: BatchNormalization: {}
Adding layer: Conv1D: {'filters': 1, 'kernel_size': 1, 'padding': 'same'}
Adding layer: Flatten: {'name': 'logits'}
______________________________ Building model ______________________________
Adding layer: Dense: {'units': 75}
Adding layer: LeakyReLU: {}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 75}
Adding layer: LeakyReLU: {}
Adding layer: Dense: {'units': 75}
Adding layer: LeakyReLU: {}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 75}
Adding layer: LeakyReLU: {}
Adding layer: Dense: {'units': 1}
No marker specific variable.
ALLVARS [<tf.Variable 'autoencoder/conv1d/kernel:0' shape=(5, 3, 8) dtype=float32>, <tf.Variable 'autoencoder/conv1d/bias:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder/batch_normalization/gamma:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder/batch_normalization/beta:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder/residual_block2/conv1d_1/kernel:0' shape=(5, 8, 8) dtype=float32>, <tf.Variable 'autoencoder/residual_block2/conv1d_1/bias:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder/residual_block2/batch_normalization_1/gamma:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder/residual_block2/batch_normalization_1/beta:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder/residual_block2/conv1d_2/kernel:0' shape=(5, 8, 8) dtype=float32>, <tf.Variable 'autoencoder/residual_block2/conv1d_2/bias:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder/residual_block2/batch_normalization_2/gamma:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder/residual_block2/batch_normalization_2/beta:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder/conv1d_3/kernel:0' shape=(5, 8, 8) dtype=float32>, <tf.Variable 'autoencoder/conv1d_3/bias:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder/batch_normalization_3/gamma:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder/batch_normalization_3/beta:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder/dense/kernel:0' shape=(16, 75) dtype=float32>, <tf.Variable 'autoencoder/dense/bias:0' shape=(75,) dtype=float32>, <tf.Variable 'autoencoder/dense_1/kernel:0' shape=(75, 75) dtype=float32>, <tf.Variable 'autoencoder/dense_1/bias:0' shape=(75,) dtype=float32>, <tf.Variable 'autoencoder/encoded/kernel:0' shape=(75, 2) dtype=float32>, <tf.Variable 'autoencoder/encoded/bias:0' shape=(2,) dtype=float32>, <tf.Variable 'autoencoder/dense_2/kernel:0' shape=(2, 75) dtype=float32>, <tf.Variable 'autoencoder/dense_2/bias:0' shape=(75,) dtype=float32>, <tf.Variable 'autoencoder/dense_3/kernel:0' shape=(75, 75) dtype=float32>, <tf.Variable 'autoencoder/dense_3/bias:0' shape=(75,) dtype=float32>, <tf.Variable 'autoencoder/dense_4/kernel:0' shape=(75, 16) dtype=float32>, <tf.Variable 'autoencoder/dense_4/bias:0' shape=(16,) dtype=float32>, <tf.Variable 'autoencoder/conv1d_4/kernel:0' shape=(5, 10, 8) dtype=float32>, <tf.Variable 'autoencoder/conv1d_4/bias:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder/batch_normalization_4/gamma:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder/batch_normalization_4/beta:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder/residual_block2_1/conv1d_5/kernel:0' shape=(5, 8, 8) dtype=float32>, <tf.Variable 'autoencoder/residual_block2_1/conv1d_5/bias:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder/residual_block2_1/batch_normalization_5/gamma:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder/residual_block2_1/batch_normalization_5/beta:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder/residual_block2_1/conv1d_6/kernel:0' shape=(5, 8, 8) dtype=float32>, <tf.Variable 'autoencoder/residual_block2_1/conv1d_6/bias:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder/residual_block2_1/batch_normalization_6/gamma:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder/residual_block2_1/batch_normalization_6/beta:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder/nms/kernel:0' shape=(5, 8, 8) dtype=float32>, <tf.Variable 'autoencoder/nms/bias:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder/batch_normalization_7/gamma:0' shape=(9,) 
dtype=float32>, <tf.Variable 'autoencoder/batch_normalization_7/beta:0' shape=(9,) dtype=float32>, <tf.Variable 'autoencoder/conv1d_7/kernel:0' shape=(1, 9, 1) dtype=float32>, <tf.Variable 'autoencoder/conv1d_7/bias:0' shape=(1,) dtype=float32>, <tf.Variable 'Variable:0' shape=(1, 4) dtype=float32>, <tf.Variable 'Variable:0' shape=(1, 4) dtype=float32>, <tf.Variable 'autoencoder_1/conv1d_8/kernel:0' shape=(5, 3, 8) dtype=float32>, <tf.Variable 'autoencoder_1/conv1d_8/bias:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder_1/batch_normalization_8/gamma:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder_1/batch_normalization_8/beta:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder_1/residual_block2_2/conv1d_9/kernel:0' shape=(5, 8, 8) dtype=float32>, <tf.Variable 'autoencoder_1/residual_block2_2/conv1d_9/bias:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder_1/residual_block2_2/batch_normalization_9/gamma:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder_1/residual_block2_2/batch_normalization_9/beta:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder_1/residual_block2_2/conv1d_10/kernel:0' shape=(5, 8, 8) dtype=float32>, <tf.Variable 'autoencoder_1/residual_block2_2/conv1d_10/bias:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder_1/residual_block2_2/batch_normalization_10/gamma:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder_1/residual_block2_2/batch_normalization_10/beta:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder_1/conv1d_11/kernel:0' shape=(5, 8, 8) dtype=float32>, <tf.Variable 'autoencoder_1/conv1d_11/bias:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder_1/batch_normalization_11/gamma:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder_1/batch_normalization_11/beta:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder_1/dense_5/kernel:0' shape=(16, 75) dtype=float32>, <tf.Variable 'autoencoder_1/dense_5/bias:0' shape=(75,) dtype=float32>, <tf.Variable 'autoencoder_1/dense_6/kernel:0' shape=(75, 75) dtype=float32>, <tf.Variable 'autoencoder_1/dense_6/bias:0' shape=(75,) dtype=float32>, <tf.Variable 'autoencoder_1/encoded/kernel:0' shape=(75, 2) dtype=float32>, <tf.Variable 'autoencoder_1/encoded/bias:0' shape=(2,) dtype=float32>, <tf.Variable 'autoencoder_1/dense_7/kernel:0' shape=(2, 75) dtype=float32>, <tf.Variable 'autoencoder_1/dense_7/bias:0' shape=(75,) dtype=float32>, <tf.Variable 'autoencoder_1/dense_8/kernel:0' shape=(75, 75) dtype=float32>, <tf.Variable 'autoencoder_1/dense_8/bias:0' shape=(75,) dtype=float32>, <tf.Variable 'autoencoder_1/dense_9/kernel:0' shape=(75, 16) dtype=float32>, <tf.Variable 'autoencoder_1/dense_9/bias:0' shape=(16,) dtype=float32>, <tf.Variable 'autoencoder_1/conv1d_12/kernel:0' shape=(5, 10, 8) dtype=float32>, <tf.Variable 'autoencoder_1/conv1d_12/bias:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder_1/batch_normalization_12/gamma:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder_1/batch_normalization_12/beta:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder_1/residual_block2_3/conv1d_13/kernel:0' shape=(5, 8, 8) dtype=float32>, <tf.Variable 'autoencoder_1/residual_block2_3/conv1d_13/bias:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder_1/residual_block2_3/batch_normalization_13/gamma:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder_1/residual_block2_3/batch_normalization_13/beta:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder_1/residual_block2_3/conv1d_14/kernel:0' shape=(5, 8, 8) dtype=float32>, <tf.Variable 
'autoencoder_1/residual_block2_3/conv1d_14/bias:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder_1/residual_block2_3/batch_normalization_14/gamma:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder_1/residual_block2_3/batch_normalization_14/beta:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder_1/nms/kernel:0' shape=(5, 8, 8) dtype=float32>, <tf.Variable 'autoencoder_1/nms/bias:0' shape=(8,) dtype=float32>, <tf.Variable 'autoencoder_1/batch_normalization_15/gamma:0' shape=(9,) dtype=float32>, <tf.Variable 'autoencoder_1/batch_normalization_15/beta:0' shape=(9,) dtype=float32>, <tf.Variable 'autoencoder_1/conv1d_15/kernel:0' shape=(1, 9, 1) dtype=float32>, <tf.Variable 'autoencoder_1/conv1d_15/bias:0' shape=(1,) dtype=float32>, <tf.Variable 'Variable:0' shape=(1, 4) dtype=float32>, <tf.Variable 'Variable:0' shape=(1, 4) dtype=float32>, <tf.Variable 'autoencoder_2/dense_10/kernel:0' shape=(2, 75) dtype=float32>, <tf.Variable 'autoencoder_2/dense_10/bias:0' shape=(75,) dtype=float32>, <tf.Variable 'autoencoder_2/dense_11/kernel:0' shape=(75, 75) dtype=float32>, <tf.Variable 'autoencoder_2/dense_11/bias:0' shape=(75,) dtype=float32>, <tf.Variable 'autoencoder_2/dense_12/kernel:0' shape=(75, 75) dtype=float32>, <tf.Variable 'autoencoder_2/dense_12/bias:0' shape=(75,) dtype=float32>, <tf.Variable 'autoencoder_2/dense_13/kernel:0' shape=(75, 75) dtype=float32>, <tf.Variable 'autoencoder_2/dense_13/bias:0' shape=(75,) dtype=float32>, <tf.Variable 'autoencoder_2/dense_14/kernel:0' shape=(75, 1) dtype=float32>, <tf.Variable 'autoencoder_2/dense_14/bias:0' shape=(1,) dtype=float32>] ###
Traceback (most recent call last):
File "/home/richel/GitHubs/GenoCAE/run_gcae.py", line 1616, in <module>
main()
File "/home/richel/GitHubs/GenoCAE/run_gcae.py", line 1014, in main
run_optimization(autoencoder, autoencoder2, optimizer, optimizer2, loss_func, input_init, targets_init, True, phenomodel=pheno_model, phenotargets=phenotargets_init)
File "/home/richel/miniconda3/lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/home/richel/miniconda3/lib/python3.9/site-packages/tensorflow/python/framework/func_graph.py", line 1129, in autograph_handler
raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:
File "/home/richel/GitHubs/GenoCAE/run_gcae.py", line 424, in run_optimization *
loss_value += tf.math.reduce_sum(((-y_pred) * y_true)) * 1e-6
ValueError: Dimensions must be equal, but are 2 and 4 for '{{node mul_21}} = Mul[T=DT_FLOAT](Neg_2, one_hot_2)' with input shapes: [2,4], [2,4,3].
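For context on what the message means (my interpretation, not a confirmed diagnosis): y_pred has shape [2, 4] while the one-hot y_true has shape [2, 4, 3], and these two shapes cannot be broadcast for the element-wise multiplication. A minimal repro sketch:

```python
import tensorflow as tf

y_pred = tf.ones([2, 4])                                        # shape [2, 4]
y_true = tf.one_hot(tf.zeros([2, 4], dtype=tf.int32), depth=3)  # shape [2, 4, 3]

# This line reproduces the ValueError: [2, 4] and [2, 4, 3] do not broadcast.
# loss = tf.math.reduce_sum((-y_pred) * y_true) * 1e-6

# One way the shapes *could* be made compatible (whether this is the intended
# semantics is for the maintainers to say): add a trailing axis to y_pred.
loss = tf.math.reduce_sum((-tf.expand_dims(y_pred, -1)) * y_true) * 1e-6
```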
Dear GenoCAE maintainers, hi @cnettel and @kausmees,
Thanks for GenoCAE and the experimental `Pheno` branch!
What I would enjoy is a toy `Mx` model (e.g. `M0`) and a toy `px` model (e.g. `p0`) that would be the smallest neural networks possible, respecting the dimensions of the input and output (or: they 'just work', although their predictions will be bad).
I have tried modifying the `/models/M1.json` and `/models/p2.json` files (the latter only available on the `Pheno` branch), but I feel this will take you seconds to create.
I would enjoy this because it would speed up my GitHub Actions test suite: training alone now takes 150 seconds, whereas I am (usually) only interested in whether it creates some files, not in the output being useful (for useful output I would use the regular models).
Would it be easy to add a toy model `Mx` (e.g. `models/M0.json`) and a toy model `px` (e.g. `models/p0.json`)?
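To make the request concrete, here is a purely hypothetical sketch of what a minimal phenotype model might look like. The layer-list JSON format is an assumption on my part, extrapolated from the 'Adding layer' lines in the build logs above; the real schema of models/*.json may differ:

```json
{
  "layers": [
    {"module": "tf.keras.layers", "class": "Dense", "args": {"units": 2}},
    {"module": "tf.keras.layers", "class": "Dense", "args": {"units": 1}}
  ]
}
```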
If I underestimate how hard this is, just let me know, and I will try harder :-)
Thanks and cheers, Richel
Dear GenoCAE maintainers, hi @cnettel and @kausmees,
Now that you are back: here I submit something I found unexpected (discussed from my point of view). If you also did not expect this, I'd happily create a minimal reproducible example.
When using `evaluate` with a `superpops` file, in one of my cases I got the following:
Population | num samples | f1_score_3 | f1_score_5 |
---|---|---|---|
C | 333 | 0.0000 | 0.0000 |
B | 334 | 0.2431 | 0.0000 |
A | 333 | 0.4400 | 0.4996 |
avg (micro) | 1000 | 0.3100 | 0.3330 |
The unexpectedness is in the last line: it suggests an average is calculated, but it appears to do different things per column (and I understand that the first column (`num_samples`) uses a sum there :-) ).
I would expect the averages to be:
Population | num samples | f1_score_3 | f1_score_5 |
---|---|---|---|
C | 333 | 0.0000 | 0.0000 |
B | 334 | 0.2431 | 0.0000 |
A | 333 | 0.4400 | 0.4996 |
avg (micro) | 333 | 0.2277 | 0.1665 |
I checked: these 'averages' are also neither the harmonic nor geometric mean.
What are those values?
If you think these are weird as well, I will happily create a reproducible example. Else, I am happy to learn what these values are :-)
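One possible explanation (an assumption on my part, not confirmed from the GenoCAE source): 'avg (micro)' may be a micro-averaged F1 score, which pools the true/false positive and negative counts across all classes before computing a single F1, rather than averaging the per-class scores. For single-label multiclass data this equals the overall accuracy, and it generally differs from the arithmetic, harmonic, or geometric mean of the per-class values. A small sketch with scikit-learn:

```python
from sklearn.metrics import f1_score

# Toy labels for three populations A, B, C (values are illustrative only).
y_true = ["A", "A", "A", "B", "C"]
y_pred = ["A", "A", "B", "B", "B"]

print(f1_score(y_true, y_pred, average=None))     # per-class F1: [0.8, 0.5, 0.0]
print(f1_score(y_true, y_pred, average="macro"))  # mean of per-class F1: ~0.433
print(f1_score(y_true, y_pred, average="micro"))  # pooled counts: 0.6 (= accuracy)
```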
Dear GenoCAE maintainer,
Thanks so much for having example files and example code: I find those very useful!
I did find something unexpected: the file extension of `HumanOrigins249_tiny.snp`. This appears to be a PLINK `.bim` file, as it follows the same structure as described in the PLINK .bim file format doc.
I suggest renaming the file to what any PLINK user would expect for a `.bim` file, which is `HumanOrigins249_tiny.bim`. I volunteer to do so.
Hi @kausmees,
Currently, when I create an Issue that my supervisor needs to be informed about, I cannot tag her (i.e. use `@AasaJohanssonUU`), as Åsa is not a Collaborator.
Could Åsa be added as a Collaborator (`AasaJohanssonUU`) so I can tag her in Issues, allowing her to stay in the loop better? Would be great!
Continuous integration is the workflow in which, after every `git push` (among other events), the project is tested to confirm it still works. Not only is this helpful for speeding up development, it also lets one see whether code from contributors (via a Pull Request) keeps the build intact.
I suggest adding a minimal GitHub Actions script that simply does the steps in the README.md; a sketch follows below.
I volunteer to write and maintain it, as I have plenty of experience with this (e.g. plinkr, but there are dozens if not hundreds of examples).
Good idea?
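A minimal sketch of what such a script could look like, following the install-and-train steps from the README (the file name, action versions, and reduced epoch count are my assumptions):

```yaml
# Hypothetical .github/workflows/check.yaml: run the README steps on every push.
name: check
on: [push, pull_request]
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: "3.8"
      - run: python3 -m pip install -r requirements.txt
      - run: python3 run_gcae.py --help
      - run: python3 run_gcae.py train --datadir example_tiny/ --data HumanOrigins249_tiny --model_id M1 --epochs 2 --save_interval 1 --train_opts_id ex3 --data_opts_id b_0_4
```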