ksrates is a tool to position whole-genome duplications relative to speciation events using substitution-rate-adjusted mixed paralog-ortholog Ks distributions.

Home Page: https://ksrates.readthedocs.io

License: GNU General Public License v3.0

Languages: Dockerfile 0.13%, Python 92.25%, Nextflow 7.48%, Singularity 0.14%
Topics: wgd, wgm, whole-genome-duplication, evolution, substitution-rate, ks-distributions

ksrates's Introduction

VIB-UGent Center for Plant Systems Biology—Evolutionary Systems Biology Lab

ksrates

ksrates is a tool to position whole-genome duplications* (WGDs) relative to speciation events using substitution-rate-adjusted mixed paralog–ortholog distributions of synonymous substitutions per synonymous site (KS).

* or, more generally, whole-genome multiplications (WGMs), but we will simply use the more common WGD to refer to any multiplication

Quick overview

To position ancient WGD events with respect to speciation events in a phylogeny, the KS values of WGD paralog pairs in a species of interest are often compared with the KS values of ortholog pairs between this species and other species. For example, it is common practice to superimpose ortholog and paralog KS distributions in a mixed plot. However, if the lineages involved exhibit different substitution rates, such a direct, naive comparison of paralog and ortholog KS estimates can be misleading and result in phylogenetic misinterpretation of WGD signatures.

ksrates is a user-friendly command-line tool and Nextflow pipeline to compare paralog and ortholog KS distributions derived from genomic or transcriptomic sequences. ksrates estimates differences in synonymous substitution rates among the lineages involved and generates an adjusted mixed plot of paralog and ortholog KS distributions that makes it possible to assess the relative phylogenetic positioning of presumed WGD and speciation events.

For more details, see the related publication and the documentation below.
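
As a rough illustration of the rate adjustment described above, the sketch below splits the ortholog KS distance between a focal species A and a sister species B into branch-specific contributions, using an outgroup O. The function name, argument names and numbers are hypothetical, and the decomposition is the standard relative-rate formula assumed here for illustration; the publication describes the exact procedure ksrates uses.

```python
def branch_specific_ks(ks_ab, ks_ao, ks_bo):
    """Split the A-B ortholog KS distance into the parts accumulated on
    the branch leading to A and on the branch leading to B, using outgroup O.

    Assumed relative-rate decomposition (not taken from the README):
        k_a = (ks_ab + ks_ao - ks_bo) / 2
        k_b = (ks_ab + ks_bo - ks_ao) / 2
    """
    k_a = (ks_ab + ks_ao - ks_bo) / 2.0
    k_b = (ks_ab + ks_bo - ks_ao) / 2.0
    return k_a, k_b


# Hypothetical ortholog KS peaks in which A evolves more slowly than B.
k_a, k_b = branch_specific_ks(ks_ab=1.0, ks_ao=1.5, ks_bo=2.0)

# Divergence rescaled to the focal species' own rate: twice its branch.
adjusted_ab = 2.0 * k_a
```

With these made-up numbers, the naive A-B divergence of 1.0 shrinks to 0.5 on the focal species' KS scale. A shift of that size can change whether a speciation event appears older or younger than a paralog WGD peak.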

Documentation

Documentation
Tutorial
FAQ

Quick start

ksrates can be executed using either a Nextflow pipeline (recommended) or a manual command-line interface. The latter is available via Docker and Singularity containers, and as a Python package to integrate into existing genomics toolsets and workflows.

In the following sections we briefly describe how to install, configure and run the Nextflow pipeline and the basic usage of the command-line interface for the Docker or Singularity containers. For detailed usage information, a full tutorial and additional installation options, please see the full documentation.

Example datasets

To illustrate how to use ksrates, two example datasets are provided for a simple example use case analyzing WGD signatures in monocot plants with oil palm (Elaeis guineensis) as the focal species.

  • example: a full dataset which contains the complete sequence data for the focal species and two other species and may require hours of computation depending on the available computing resources. We advise running this dataset on a compute cluster; the ksrates Nextflow pipeline should make it fairly easy to configure this for a variety of HPC schedulers.

  • test: a small test dataset that contains only a small subset of the sequence data for each of the species and takes only a few minutes to run. This is intended only as a quick check of the tool and can be run locally, e.g. on a laptop. The results are not very meaningful.

See the Usage sections below and the Tutorial for more detail.

Nextflow pipeline

Installation

  1. Install Nextflow. The official instructions are here, but briefly:

    1. If you do not have Java installed, install Java (8 or later, up to 15); on Linux you can use:

      sudo apt-get install default-jdk
      
    2. Install Nextflow using either:

      wget -qO- https://get.nextflow.io | bash
      

      or:

      curl -fsSL https://get.nextflow.io | bash
      

      It creates the nextflow executable file in the current directory. You may want to move it to a folder accessible from your $PATH, for example:

      mv nextflow /usr/local/bin
      
  2. Install either Singularity (recommended, but see here) or Docker. This is needed to run the ksrates Singularity or Docker container, which contains all other required software dependencies, so nothing else needs to be installed.

  3. Install ksrates: When using Nextflow, ksrates and the ksrates Singularity or Docker container are downloaded automatically the first time you launch the ksrates pipeline, and they are stored and reused for any further executions (see Nextflow pipeline sharing). It is therefore not necessary to install ksrates manually; simply continue with the Usage section below.

Usage

We briefly illustrate here how to run the ksrates Nextflow pipeline on the test dataset.

  1. Get the example datasets.

    1. Clone the repository to get the test datasets:

      git clone https://github.com/VIB-PSB/ksrates
      
    2. You may want to copy the dataset folder you want to use to another location, for example your home folder, and then change to that folder:

      cp -r ksrates/test ~
      cd ~/test
      
  2. Prepare the configuration files.

    The test directory already contains:

    • A pre-filled ksrates configuration file (config_elaeis.txt) for the oil palm use case.

    • A Nextflow configuration file template (nextflow.config) to configure the executor to be used (i.e., a local computer or a compute cluster) and the resources made available to Nextflow, such as the number of CPUs. It also configures whether to use the ksrates Singularity or Docker container. The configuration file may need to be adapted to your available resources.

      See the full documentation and the Nextflow documentation for more detail on Nextflow configuration, e.g. for different HPC schedulers. We also provide additional, more general template Nextflow configuration files in the doc directory in the repository.

  3. Launch the ksrates Nextflow pipeline.

    Note: If this is the first time you launch the pipeline, Nextflow will first download the ksrates Nextflow pipeline and the ksrates Singularity or Docker container.

    nextflow run VIB-PSB/ksrates --config ./config_elaeis.txt
    

    The path to the ksrates configuration file is specified through the --config parameter. If the Nextflow configuration file is named nextflow.config and located in the launch folder, it is detected automatically. Alternatively, the user can specify a custom file by using the -C option (see Nextflow documentation).

    Note: To generate a new ksrates configuration file template for a new analysis, use the --config option to specify its file name or file path. If the specified file does not exist (at the given path), the pipeline will generate the template and then exit. Edit and fill in this generated configuration file (see the full documentation for more detail) and then rerun the same command above to relaunch the pipeline.

Command-line interface

Installation

Install either Singularity (recommended, but see here) or Docker. This is needed to run the ksrates Singularity or Docker container, which contains ksrates and all other required software dependencies, so nothing else needs to be installed. The publicly accessible ksrates Singularity or Docker container is downloaded automatically the first time you execute a ksrates command with it, and it is stored and reused for any further command executions.

Usage

We briefly illustrate here how to run ksrates using the Singularity or Docker container.

  • ksrates comes with a command-line interface. Its basic syntax is:

    ksrates [OPTIONS] COMMAND [ARGS]...
    
  • To execute a ksrates command using the Singularity container the syntax is:

    singularity exec docker://vibpsb/ksrates ksrates [OPTIONS] COMMAND [ARGS]...
    
  • Or to execute a ksrates command using the Docker container the syntax is:

    docker run --rm -v $PWD:/temp -w /temp vibpsb/ksrates ksrates [OPTIONS] COMMAND [ARGS]...
    

Some example ksrates commands are:

Show usage and all available COMMANDs and OPTIONS:

ksrates -h

Generate a template configuration file for the focal species:

ksrates generate-config config_elaeis.txt

Show usage and ARGS for a specific COMMAND:

ksrates orthologs-ks -h

Run the ortholog KS analysis between two species using four threads/CPU cores:

ksrates orthologs-ks config_elaeis.txt elaeis oryza --n-threads 4

Please see the full documentation for more details and the complete set of commands.
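
When ksrates commands are launched from a script rather than typed by hand, the Docker invocation shown above can be assembled programmatically. The following is a minimal sketch; the helper name `docker_ksrates_command` and the `/data/elaeis` path are hypothetical, while the image name, mount point and ksrates arguments are copied from the examples above.

```python
import os
import shlex


def docker_ksrates_command(ksrates_args, workdir=None):
    """Return the argument list for running a ksrates command in Docker.

    Mirrors: docker run --rm -v $PWD:/temp -w /temp vibpsb/ksrates ksrates ...
    """
    workdir = workdir or os.getcwd()
    return [
        "docker", "run", "--rm",
        "-v", f"{workdir}:/temp",   # mount the working directory
        "-w", "/temp",              # run inside the mounted directory
        "vibpsb/ksrates", "ksrates",
        *ksrates_args,
    ]


cmd = docker_ksrates_command(
    ["orthologs-ks", "config_elaeis.txt", "elaeis", "oryza", "--n-threads", "4"],
    workdir="/data/elaeis",  # hypothetical path
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Keeping it as a list avoids shell quoting problems with paths that contain spaces.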

Support

If you come across a bug or have any question or suggestion, please open an issue.

Citation

If you publish results generated using ksrates, please cite:

Sensalari C., Maere S. and Lohaus R. (2021) ksrates: positioning whole-genome duplications relative to speciation events in KS distributions. Bioinformatics, btab602, doi: https://doi.org/10.1093/bioinformatics/btab602

ksrates's People

Contributors

bedroesb, cecilia-sensalari, lohausr

ksrates's Issues

Incompatibility with recent Nextflow version

As pointed out in issue #38, the Nextflow pipeline implemented in main.nf is not compatible anymore with recent Nextflow versions (the issue refers to version 22.04.0.5697). It gives the following error message:

$ nextflow run VIB-PSB/ksrates --config ./config_elaeis.txt
N E X T F L O W  ~  version 22.04.0
Launching `https://github.com/VIB-PSB/ksrates` [determined_bernard] DSL2 - revision: bfbb623720 [master]


K S R A T E S   -   N E X T F L O W   P I P E L I N E   (v1.1.1)
----------------------------------------------------------------

Configuration file:                    ./config_elaeis.txt
Logs folder:                           logs_3cf4463c
Preserve leftover files:               false

Command line:               nextflow run VIB-PSB/ksrates --config ./config_elaeis.txt
Launch directory:           /Users/wickell/ksrates
Work directory:             /Users/wickell/ksrates/work
ksrates directory:          /Users/wickell/.nextflow/assets/VIB-PSB/ksrates

Start time:                 2022-04-28T09:05:59.358787-04:00


No such variable: outCheckConfig

 -- Check script '/Users/wickell/.nextflow/assets/VIB-PSB/ksrates/main.nf' at line: 252 or see '.nextflow.log' file for more details

A quick way to work around this problem is to use the older Nextflow version with which ksrates was developed (21.10.6.5660) by setting the NXF_VER environment variable on the command line (see Nextflow docs):

 NXF_VER=21.10.6 nextflow run VIB-PSB/ksrates --config ./config_elaeis.txt

The actual fix will have to identify the cause of this incompatibility and edit the affected lines in main.nf.
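
For runs started from a Python script, the same workaround can be applied by setting NXF_VER in the environment passed to the child process. This is only a sketch; the helper name `pinned_nextflow_env` is hypothetical, and the version and command are the ones quoted above.

```python
import os


def pinned_nextflow_env(version="21.10.6", base_env=None):
    """Return a copy of the environment with NXF_VER set to `version`."""
    env = dict(os.environ if base_env is None else base_env)
    env["NXF_VER"] = version
    return env


command = ["nextflow", "run", "VIB-PSB/ksrates", "--config", "./config_elaeis.txt"]
env = pinned_nextflow_env()
# To launch the pipeline: subprocess.run(command, env=env, check=True)
```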

No output files

Hi,
I'm using the ksrates Docker image. I have an issue with not getting the relevant output files. Here are the commands I have run:

/usr/local/bin/ksrates init config_oxalis.txt
/usr/local/bin/ksrates paralogs-ks config_oxalis.txt --n-threads 16
/usr/local/bin/ksrates orthologs-ks config_oxalis.txt oxalis populus --n-threads 16
/usr/local/bin/ksrates orthologs-ks config_oxalis.txt oxalis arabidopsis --n-threads 16
/usr/local/bin/ksrates orthologs-ks config_oxalis.txt oxalis vitis --n-threads 16
/usr/local/bin/ksrates orthologs-analysis config_oxalis.txt
/usr/local/bin/ksrates plot-orthologs config_oxalis.txt
/usr/local/bin/ksrates orthologs-adjustment config_oxalis.txt
/usr/local/bin/ksrates plot-paralogs config_oxalis.txt
/usr/local/bin/ksrates plot-tree config_oxalis.txt
/usr/local/bin/ksrates paralogs-analyses config_oxalis.txt

When I checked the logs, I found this error related to the i-ADHoRe step:

INFO	Done
INFO	---
INFO	Running wgd colinearity Ks pipeline...
INFO	No colinearity anchor pair Ks data, will run wgd colinearity Ks analysis
INFO	Checking external software...
INFO	This is i-ADHoRe v3.0.
INFO	Creating i-ADHoRe tmp directory /var/lib/condor/execute/slot1/dir_774720/paralog_distributions/wgd_oxalis/oxalis.ks_anchors_tmp
INFO	Parsing GFF file
INFO	Writing gene lists for i-ADHoRe
INFO	Writing families file for i-ADHoRe
INFO	Writing i-ADHoRe configuration file
INFO	Running i-ADHoRe 3.0...
INFO	i-adhore /var/lib/condor/execute/slot1/dir_774720/paralog_distributions/wgd_oxalis/oxalis.ks_anchors_tmp/i-adhore.conf
ERROR	i-ADHoRe execution failed with return code: -6
ERROR	i-ADHoRe standard error output:
i-adhore: /i-adhore/src/Profile.cpp:95: void Profile::createNodes(const std::set<Link>&, const std::vector<GeneList*>&, std::vector<std::vector<Node*> >&, bool, bool) const: Assertion `geneX.isPairWith(geneY)' failed.
ERROR	Exiting.
INFO	- - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

Here are the other errors and warnings I got,

INFO	Finished 1242/1243 gene family analyses...
INFO	Finished 1243/1243 gene family analyses...
INFO	Finished all gene family analyses
INFO	Analysis done
INFO	Making results data frame
INFO	Removing tmp directory
INFO	---
INFO	Done
INFO	Fri Mar  8 16:10:48 2024
INFO	Done
INFO	- - - - - - - - - - - - - - - - - - - - - - - - - - - - 
INFO	Computing ortholog distribution peaks with related error
INFO	Fri Mar  8 16:10:48 2024
INFO	- - - - - - - - - - - - - - - - - - - - - - - - - - - - 
INFO	Loading parameters and input files
WARNING	Ortholog peak database [ortholog_peak_db.tsv] not found or empty: a new one is now generated.
WARNING	Ortholog Ks list database [ortholog_ks_list_db.tsv] not found or empty: a new one is now generated.
INFO	
INFO	Oxalis oulophora and Populus trichocarpa:
INFO	- Extracting ortholog Ks list
INFO	- Computing distribution peak through bootstrap (200 iterations)
INFO	- Adding peak to ortholog peak database
INFO	- Adding Ks list to ortholog Ks list database
INFO	
INFO	Oxalis oulophora and Vitis vinifera:
INFO	- Extracting ortholog Ks list
INFO	- Computing distribution peak through bootstrap (200 iterations)
INFO	- Adding peak to ortholog peak database
INFO	- Adding Ks list to ortholog Ks list database
INFO	
INFO	Populus trichocarpa and Vitis vinifera:
INFO	- Extracting ortholog Ks list
WARNING	  Ortholog Ks TSV file not found [populus_vitis.ks.tsv]. Skipping peak estimate.
INFO	
INFO	Arabidopsis thaliana and Oxalis oulophora:
INFO	- Extracting ortholog Ks list
INFO	- Computing distribution peak through bootstrap (200 iterations)
INFO	- Adding peak to ortholog peak database
INFO	- Adding Ks list to ortholog Ks list database
INFO	
INFO	Arabidopsis thaliana and Populus trichocarpa:
INFO	- Extracting ortholog Ks list
WARNING	  Ortholog Ks TSV file not found [arabidopsis_populus.ks.tsv]. Skipping peak estimate.
INFO	
INFO	Arabidopsis thaliana and Vitis vinifera:
INFO	- Extracting ortholog Ks list
WARNING	  Ortholog Ks TSV file not found [arabidopsis_vitis.ks.tsv]. Skipping peak estimate.
INFO	
WARNING	Number of failed peak computations: 3
WARNING	
INFO	All done
INFO	- - - - - - - - - - - - - - - - - - - - - - - - - - - 
INFO	Plotting ortholog distributions for all ortholog trios
INFO	Fri Mar  8 16:10:51 2024
INFO	- - - - - - - - - - - - - - - - - - - - - - - - - - - 
INFO	Loading parameters and input files
INFO	
INFO	Plotting ortholog Ks distributions for species pair [Oxalis oulophora - Populus trichocarpa]
WARNING	- Skipping all outspecies: not enough ortholog data available (PDF figure not generated)
INFO	
INFO	Plotting ortholog Ks distributions for species pair [Oxalis oulophora - Vitis vinifera]
WARNING	- Skipping all outspecies: not enough ortholog data available (PDF figure not generated)
WARNING	
WARNING	The species pairs listed below are not (yet) available in the ortholog databases
WARNING	The trios involving such species pairs have not been plotted
WARNING	
WARNING	Species pairs not yet available in both Ks peak and Ks list ortholog databases:
WARNING	  Arabidopsis thaliana - Populus trichocarpa
WARNING	  Arabidopsis thaliana - Vitis vinifera
WARNING	  Populus trichocarpa - Vitis vinifera
WARNING	
WARNING	Please compute their ortholog Ks data and/or add the ortholog data to the databases,
WARNING	then rerun this step.
INFO	
INFO	All done
INFO	- - - - - - - - - - - - - - - - - - - - - - 
INFO	Rate-adjustment of ortholog Ks distributions
INFO	Fri Mar  8 16:10:52 2024
INFO	- - - - - - - - - - - - - - - - - - - - - - 
INFO	Loading parameters and input files
INFO	
INFO	Performing rate-adjustment of each divergent pair by using one or more outgroups:
INFO	 - Adjusting the peak for [Oxalis oulophora] and [Populus trichocarpa] with outspecies [Vitis vinifera]
WARNING	Couldn't process trio [Oxalis oulophora, Populus trichocarpa, Vitis vinifera]:
WARNING	 - [Populus trichocarpa_Vitis vinifera] not in ortholog peak database.
INFO	 - Adjusting the peak for [Oxalis oulophora] and [Populus trichocarpa] with outspecies [Arabidopsis thaliana]
WARNING	Couldn't process trio [Oxalis oulophora, Populus trichocarpa, Arabidopsis thaliana]:
WARNING	 - [Arabidopsis thaliana_Populus trichocarpa] not in ortholog peak database.
INFO	 - Adjusting the peak for [Oxalis oulophora] and [Vitis vinifera] with outspecies [Arabidopsis thaliana]
WARNING	Couldn't process trio [Oxalis oulophora, Vitis vinifera, Arabidopsis thaliana]:
WARNING	 - [Arabidopsis thaliana_Vitis vinifera] not in ortholog peak database.
INFO	Rate-adjustment results for each trio saved in TSV format [adjustment_table_oxalis_all.tsv]
INFO	
INFO	Rate-adjustment results as consensus values saved in TSV format [adjustment_table_oxalis.tsv]
INFO	
INFO	All done
INFO	- - - - - - - - - - - - - - - - - - - - - - - - - - 
INFO	Generating mixed paralog and ortholog distributions
INFO	Fri Mar  8 16:10:53 2024
INFO	- - - - - - - - - - - - - - - - - - - - - - - - - - 
INFO	Loading parameters and input files
ERROR	Anchor pair Ks TSV file not found at default position [paralog_distributions/wgd_oxalis/oxalis.ks_anchors.tsv].
ERROR	Exiting
INFO	- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
INFO	Generating PDF of input tree with branch length equal to Ks distances
INFO	Fri Mar  8 16:10:54 2024
INFO	- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
INFO	Loading parameters and input files
WARNING	Rate-adjustment data not available yet: PDF figure of phylogenetic tree not generated.
INFO	Exiting
INFO	- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
INFO	Clustering anchorpoints Ks values to reconstruct recent WGD events
INFO	Fri Mar  8 16:10:55 2024
INFO	- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
INFO	Loading parameters and input files
INFO	 - maximum EM iterations: 600
INFO	 - number of EM initializations: 10
ERROR	anchorpoints.txt file not found at default position [paralog_distributions/wgd_oxalis/oxalis_i-adhore/anchorpoints.txt].
ERROR	multiplicons.txt file not found at default position [paralog_distributions/wgd_oxalis/oxalis_i-adhore/anchorpoints.txt].
ERROR	segments.txt file not found at default position [paralog_distributions/wgd_oxalis/oxalis_i-adhore/segments.txt].
ERROR	multiplicon_pairs.txt file not found at default position [paralog_distributions/wgd_oxalis/oxalis_i-adhore/multiplicon_pairs.txt].
ERROR	list_elements.txt file not found at default position [paralog_distributions/wgd_oxalis/oxalis_i-adhore/list_elements.txt].
ERROR	oxalis.ks_anchors.tsv file not found at default position [paralog_distributions/wgd_oxalis/oxalis.ks_anchors.tsv].
ERROR	Exiting

Do you have any suggestions to fix this error?
Thanks so much!
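
Reading the log above, the three "Ortholog Ks TSV file not found" warnings all concern species pairs that do not include the focal species, whereas the command list only runs orthologs-ks for pairs that do. The sketch below lists the pair files each trio appears to need and subtracts the pairs that were computed. It is an interpretation of the log, not a confirmed diagnosis: the helper names are hypothetical, the alphabetical file naming is inferred from the warning messages, and the separate i-ADHoRe assertion failure is not addressed.

```python
def ortholog_pair_file(species_a, species_b):
    """File name for a species pair, names in alphabetical order as in the log."""
    first, second = sorted([species_a, species_b])
    return f"{first}_{second}.ks.tsv"


def required_pairs(trios):
    """Every trio (focal, sister, outgroup) needs all three of its pairs."""
    pairs = set()
    for focal, sister, outgroup in trios:
        pairs.add(ortholog_pair_file(focal, sister))
        pairs.add(ortholog_pair_file(focal, outgroup))
        pairs.add(ortholog_pair_file(sister, outgroup))
    return pairs


# Trios reported in the rate-adjustment part of the log above.
trios = [
    ("oxalis", "populus", "vitis"),
    ("oxalis", "populus", "arabidopsis"),
    ("oxalis", "vitis", "arabidopsis"),
]
# Pairs for which orthologs-ks was run in the command list above.
computed = {
    ortholog_pair_file("oxalis", "populus"),
    ortholog_pair_file("oxalis", "arabidopsis"),
    ortholog_pair_file("oxalis", "vitis"),
}

missing = sorted(required_pairs(trios) - computed)
print(missing)
```

The three files left over are the ones named in the warnings, which suggests that orthologs-ks may also need to be run for the populus-vitis, arabidopsis-populus and arabidopsis-vitis pairs before the rate adjustment can process any trio.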

Warning: Dubious indirect gene relationship - closest genes get same color in alignment

I'm currently trying to solve this error:

ERROR	i-ADHoRe execution failed with standard error output:
Warning: Dubious indirect gene relationship between evm.model.HiC_scaffold_10.80 and evm.model.HiC_scaffold_10.99, closest genes get same color in alignment
Warning: Dubious indirect gene relationship between evm.model.HiC_scaffold_10.99 and evm.model.HiC_scaffold_10.80, closest genes get same color in alignment
Warning: Dubious indirect gene relationship between evm.model.HiC_scaffold_10.119 and evm.model.HiC_scaffold_10.65, closest genes get same color in alignment
ERROR	Exiting.

Any recommendation on how to fix this? Thanks

A tree with branch length set to "rate-adjusted mixed Ks distances" and its Newick string?

Hi,

I could find the "input phylogenetic tree in PDF format with branch length set to KS distances estimated from ortholog KS distributions (tree_species_distances.pdf)." in the output folder "rate_adjustment/species: this directory collects the output files of the substitution rate-adjustment relative to the focal species."

I have two questions:

  1. Where could I find the corresponding tree with branch length set to "rate-adjusted mixed Ks distances"? I guess that Figure 1 in the main paper has such a figure.
  2. Where could I find the Newick string for the tree shown in tree_species_distances.pdf?

Thank you for the great development.

TypeError: cannot convert the series to <class 'float'>

Hi,
I ran into an error at the step "ksrates orthologs-adjustment". I do not know how to resolve it; can you help me? I would appreciate it. In the log below I have replaced the species names with A, B, C, D, and so on.

INFO    - - - - - - - - - - - - - - - - - - - - - -
INFO    Rate-adjustment of ortholog Ks distributions
INFO    Wed Jun 22 23:03:01 2022
INFO    - - - - - - - - - - - - - - - - - - - - - -
INFO    Loading parameters and input files
INFO
INFO    Performing rate-adjustment of each divergent pair by using one or more outgroups:
INFO     - Adjusting the peak for [A] and [B] with outspecies [C]
INFO     - Adjusting the peak for [A] and [B] with outspecies [D]
INFO     - Adjusting the peak for [A] and [B] with outspecies [E]
INFO     - Adjusting the peak for [A] and [B] with outspecies [F]
INFO     - Adjusting the peak for [A] and [B] with outspecies [G]
INFO     - Adjusting the peak for [A] and [B] with outspecies [H]
INFO     - Adjusting the peak for [A] and [B] with outspecies [I]
Traceback (most recent call last):
  File "/home/zhaoxy/python3.8/ksrates/bin/ksrates", line 33, in <module>
    sys.exit(load_entry_point('ksrates', 'console_scripts', 'ksrates')())
  File "/home/zhaoxy/python3.8/ksrates/bin/ksrates/lib/python3.8/site-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/home/zhaoxy/python3.8/ksrates/lib/python3.8/site-packages/click/core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "/home/zhaoxy/python3.8/ksrates/lib/python3.8/site-packages/click/core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/zhaoxy/python3.8/ksrates/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/zhaoxy/python3.8/ksrates/lib/python3.8/site-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/home/zhaoxy/biosoft/ksrates/ksrates_cli.py", line 136, in orthologs_adjustment
    correct(config_file, expert, trios)
  File "/home/zhaoxy/biosoft/ksrates/ksrates/correct.py", line 71, in correct
    rate_species, rate_species_sd, rate_sister, rate_sister_sd = fcCorrect.decompose_ortholog_ks(db, species_sister, species_out, sister_out, peak_stats)
  File "/home/zhaoxy/biosoft/ksrates/ksrates/fc_rrt_correction.py", line 45, in decompose_ortholog_ks
    rel_rate_species_sd = sqrt(pow(sd_sp_out, 2) + pow(sd_sp_sis, 2) + pow(sd_sis_out, 2)) / 2.0 # also called k_AO_sd
  File "/home/zhaoxy/python3.8/ksrates/lib/python3.8/site-packages/pandas/core/series.py", line 185, in wrapper
    raise TypeError(f"cannot convert the series to {converter}")
TypeError: cannot convert the series to <class 'float'>
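
The line that fails in the traceback combines three standard deviations with `sqrt`, which expects plain numbers. pandas raises this TypeError when a Series holding more than one value is converted to a float, so one possibility (an assumption, not confirmed by the log) is that a database lookup returned several rows for one species pair. The sketch below reproduces the arithmetic of the traceback line with hypothetical scalar values, to show what the expression computes when each input is a single number.

```python
from math import sqrt


def rel_rate_sd(sd_sp_out, sd_sp_sis, sd_sis_out):
    """Same expression as in the traceback line, for scalar inputs."""
    return sqrt(pow(sd_sp_out, 2) + pow(sd_sp_sis, 2) + pow(sd_sis_out, 2)) / 2.0


# Hypothetical standard deviations of the three ortholog KS peaks.
sd = rel_rate_sd(sd_sp_out=0.03, sd_sp_sis=0.04, sd_sis_out=0.12)
```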

ERROR Unexpected internal error during analysis of gene family GF_000001

Hello, when I run ksrates on the test data and on my own data, I get the following error at the paralogs-ks step. Could you give me some advice? Thank you very much.

ksrates paralogs-ks config_elaeis.txt --n-threads=20

INFO	- - - - - - - - - - - - - - - - - - - - - 
INFO	Paralog wgd analysis for species [elaeis]
INFO	Tue Apr 11 16:51:53 2023
INFO	- - - - - - - - - - - - - - - - - - - - - 
INFO	Checking if sequence data files exist and if sequence IDs are compatible with wgd pipeline...
INFO	Completed
INFO	Running wgd paralog Ks pipeline...
INFO	---
INFO	Checking external software...
INFO	makeblastdb: 2.12.0+
INFO	blastp: 2.12.0+
INFO	mcl 14-137
INFO	muscle 5.1.linux64 []
INFO	AAML in paml version 4.9, March 2015
INFO	Usage for FastTree version 2.1.11 Double precision (No SSE3):
INFO	Creating output directory /home/jwang/my_apps/ksrates/test/paralog_distributions/wgd_elaeis
INFO	Translating CDS file elaeis.fasta...
INFO	---
INFO	Running all versus all Blastp
INFO	Writing protein Blastdb sequences to /home/jwang/my_apps/ksrates/test/paralog_distributions/wgd_elaeis/elaeis.blast_tmp/...
INFO	Writing protein query sequences to /home/jwang/my_apps/ksrates/test/paralog_distributions/wgd_elaeis/elaeis.blast_tmp/...
INFO	Performing all versus all Blastp (this might take a while)...
INFO	Making Blastdb
INFO	makeblastdb -in /home/jwang/my_apps/ksrates/test/paralog_distributions/wgd_elaeis/elaeis.blast_tmp/elaeis.db.fasta -dbtype prot
INFO	makeblastdb output:
Building a new DB, current time: 04/11/2023 16:51:56
New DB name:   /home/jwang/my_apps/ksrates/test/paralog_distributions/wgd_elaeis/elaeis.blast_tmp/elaeis.db.fasta
New DB title:  /home/jwang/my_apps/ksrates/test/paralog_distributions/wgd_elaeis/elaeis.blast_tmp/elaeis.db.fasta
Sequence type: Protein
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 500 sequences in 0.0105121 seconds.
INFO	Running Blastp
INFO	blastp -db /home/jwang/my_apps/ksrates/test/paralog_distributions/wgd_elaeis/elaeis.blast_tmp/elaeis.db.fasta -query /home/jwang/my_apps/ksrates/test/paralog_distributions/wgd_elaeis/elaeis.blast_tmp/elaeis.query.fasta -evalue 1e-10 -outfmt 6 -num_threads 20 -out /home/jwang/my_apps/ksrates/test/paralog_distributions/wgd_elaeis/elaeis.blast.tsv
INFO	All versus all Blastp done
INFO	Removing tmp directory
INFO	---
INFO	Running gene family construction (MCL clustering with inflation factor = 2.0)
INFO	Started MCL clustering (mcl)
INFO	---
INFO	Running whole paranome Ks analysis...
INFO	Started analysis of 66 gene families in parallel using 20 threads
INFO	Performing analysis on gene family GF_000001 (size 13)
INFO	Performing analysis on gene family GF_000002 (size 9)
INFO	Performing analysis on gene family GF_000003 (size 6)
INFO	Performing analysis on gene family GF_000004 (size 6)
ERROR	Unexpected internal error during analysis of gene family GF_000001:
Traceback (most recent call last):
  File "/home/jwang/miniconda3/envs/wgd/lib/python3.7/site-packages/wgd_ksrates/ks_distribution.py", line 280, in analyse_family_try_except
    n_families, is_last_family)
  File "/home/jwang/miniconda3/envs/wgd/lib/python3.7/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/home/jwang/miniconda3/envs/wgd/lib/python3.7/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/jwang/my_apps/ksrates/test/paralog_distributions/wgd_elaeis/elaeis.ks_tmp/GF_000001.fasta.msa'
ERROR	Skipping gene family
ERROR	Too many gene family analyses failed, terminating threads...
ERROR	Too many gene family analyses failed, terminating threads...
ERROR	Too many gene family analyses failed, terminating threads...
ERROR	Unexpected internal error during analysis of gene family GF_000003:
Traceback (most recent call last):
  File "/home/jwang/miniconda3/envs/wgd/lib/python3.7/site-packages/wgd_ksrates/ks_distribution.py", line 280, in analyse_family_try_except
    n_families, is_last_family)
  File "/home/jwang/miniconda3/envs/wgd/lib/python3.7/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/home/jwang/miniconda3/envs/wgd/lib/python3.7/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/jwang/my_apps/ksrates/test/paralog_distributions/wgd_elaeis/elaeis.ks_tmp/GF_000003.fasta.msa'
ERROR	Skipping gene family
ERROR	Unexpected internal error during analysis of gene family GF_000002:
Traceback (most recent call last):
  File "/home/jwang/miniconda3/envs/wgd/lib/python3.7/site-packages/wgd_ksrates/ks_distribution.py", line 280, in analyse_family_try_except
    n_families, is_last_family)
  File "/home/jwang/miniconda3/envs/wgd/lib/python3.7/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/home/jwang/miniconda3/envs/wgd/lib/python3.7/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/jwang/my_apps/ksrates/test/paralog_distributions/wgd_elaeis/elaeis.ks_tmp/GF_000002.fasta.msa'
ERROR	Skipping gene family
ERROR	Too many gene family analyses failed, terminating threads...
ERROR	Too many gene family analyses failed, terminating threads...
ERROR	Too many gene family analyses failed, terminating threads...
ERROR	Too many gene family analyses failed, terminating threads...
ERROR	Unexpected internal error during analysis of gene family GF_000004:
Traceback (most recent call last):
  File "/home/jwang/miniconda3/envs/wgd/lib/python3.7/site-packages/wgd_ksrates/ks_distribution.py", line 280, in analyse_family_try_except
    n_families, is_last_family)
  File "/home/jwang/miniconda3/envs/wgd/lib/python3.7/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/home/jwang/miniconda3/envs/wgd/lib/python3.7/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/jwang/my_apps/ksrates/test/paralog_distributions/wgd_elaeis/elaeis.ks_tmp/GF_000004.fasta.msa'
ERROR	Skipping gene family
ERROR	Too many gene family analyses failed, terminating threads...
ERROR	Too many gene family analyses failed, terminating threads...
ERROR	Too many gene family analyses failed, terminating threads...
ERROR	Too many gene family analyses failed, terminating threads...
ERROR	Too many gene family analyses failed, terminating threads...
ERROR	Too many gene family analyses failed, terminating threads...
ERROR	Too many gene family analyses failed, terminating threads...
ERROR	Too many gene family analyses failed, terminating threads...
ERROR	Too many gene family analyses failed, terminating threads...
ERROR	Too many gene family analyses failed, terminating threads...
ERROR	Too many gene family analyses failed, terminating threads...
ERROR	Too many gene family analyses failed, terminating threads...
ERROR	Too many gene family analyses failed, terminating threads...
ERROR	Too many gene family analyses failed, terminating threads...
ERROR	--
ERROR	The analyses of more than 1% of gene families [4/66] have failed due to unexpected internal errors
ERROR	Please check the nature of the error(s), remove the tmp directory [/home/jwang/my_apps/ksrates/test/paralog_distributions/wgd_elaeis/elaeis.ks_tmp] and rerun the Ks analysis
ERROR	See the tracebacks above for the following gene family IDs:
ERROR	GF_000001, GF_000002, GF_000003, GF_000004
ERROR	Exiting
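
A log like the one above repeats the same traceback for every failed family, so it can help to condense it before looking for the cause. The following is a minimal sketch and not part of ksrates: the function name `summarize_failures` and the regular expressions are assumptions based only on the message format visible in this log.

```python
import posixpath
import re

# Pattern assumed from the FileNotFoundError lines shown in the log above.
MISSING_FILE = re.compile(
    r"FileNotFoundError: \[Errno 2\] No such file or directory: '([^']+)'"
)
# Gene family IDs in the log look like GF_000004.
FAMILY_ID = re.compile(r"GF_\d+")


def summarize_failures(log_text):
    """Return the gene families and directories named in missing-file errors."""
    families = set()
    directories = set()
    for path in MISSING_FILE.findall(log_text):
        directories.add(posixpath.dirname(path))
        match = FAMILY_ID.search(posixpath.basename(path))
        if match:
            families.add(match.group(0))
    return {"families": sorted(families), "directories": sorted(directories)}


if __name__ == "__main__":
    # Hypothetical sample line in the same format as the pasted log.
    sample = (
        "FileNotFoundError: [Errno 2] No such file or directory: "
        "'/tmp/demo.ks_tmp/GF_000004.fasta.msa'\n"
        "ERROR\tSkipping gene family\n"
    )
    print(summarize_failures(sample))
```

Run on the pasted log text, it lists each affected gene family once, together with the temporary directory in which the missing `.msa` files were expected.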

Difficulty in Output Plots Interpretation

Hi Cecilia,
I have two questions regarding my outputs:

  1. I didn't get the output file "mixed_oxalis_lmm_colinearity.pdf" mentioned in your tutorial (I ran with the expert configuration file). Instead, I got the output file "mixed_species_lmm_paranome.pdf". What is the reason for that?

  2. This is regarding the main output file "mixed_oxalis_anchor_clusters.pdf". In this plot, I got just one peak. From the literature, I know that the ancient γ event occurred. How can I tell whether this one peak corresponds to the ancient γ event or to a species-specific WGD? I have attached the "mixed_oxalis_anchor_clusters.pdf" and "mixed_oxalis_adjusted.pdf" files. Could you please help me interpret these results?
    mixed_oxalis_anchor_clusters.pdf
    mixed_oxalis_adjusted.pdf

Where would I find the equation or lognorm fit parameters for the Ks distributions?

Hi Cecilia,

This looks like a great application! I have a few questions:

  1. Does the ksrates output provide the equation and/or lognorm fit parameters for the mixture-model fits? I'd like to see the actual fit parameters so that I can reproduce the distribution and the fit.

  2. This software can also be run on a single genome, correct? It does not require a species trio? (Understood that this may affect the rate adjustments)

Thanks so much,
Tamsen
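
For context on what such parameters would look like: a single lognormal component is described by μ and σ, the mean and standard deviation of ln(Ks). The sketch below is a generic illustration in plain Python, assuming one component and made-up Ks values; it does not reproduce the ksrates mixture-model code or its output files, and the function names are hypothetical.

```python
import math
import statistics


def fit_lognormal(ks_values):
    """Maximum-likelihood mu and sigma of ln(Ks) for one lognormal component."""
    logs = [math.log(v) for v in ks_values if v > 0]
    mu = statistics.fmean(logs)
    # Population standard deviation is the maximum-likelihood estimate of sigma.
    sigma = statistics.pstdev(logs, mu=mu)
    return mu, sigma


def lognormal_pdf(x, mu, sigma):
    """Density of a lognormal distribution with parameters mu and sigma at x."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def lognormal_mode(mu, sigma):
    """Location of the peak of the lognormal density."""
    return math.exp(mu - sigma ** 2)


if __name__ == "__main__":
    # Made-up Ks values, for illustration only.
    ks = [0.42, 0.55, 0.61, 0.70, 0.48, 0.66]
    mu, sigma = fit_lognormal(ks)
    print("mu:", mu, "sigma:", sigma, "mode:", lognormal_mode(mu, sigma))
```

With μ and σ in hand, the fitted curve can be redrawn by evaluating `lognormal_pdf` over the Ks range of the histogram; a mixture would additionally need one weight per component.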

Issue with wgd paralog Ks analysis

Hi.

I am trying to perform a Ks analysis comparing a species of interest with two others. I was able to generate the configuration file below.

[SPECIES]
focal_species = poatr
# informal name of the species that will be used to perform the rate-adjustment

newick_tree = ((poatr, horvu), brac);
# input phylogenetic tree in newick format; use the informal names of the species

latin_names = poatr:Poa trivialis, horvu:Hordeum vulgare, brac:Brachypodium distachyon
# informal names associated to their scientific names through a colon and separated by comma

fasta_filenames = poatr:cds_final.fasta, horvu:Hordeum_vulgare.MorexV3_pseudomolecules_assembly.cds.all_edited.fa, brac:Brachypodium_distachyon.Brachypodium_distachyon_v3.0.cds.all.fa
gff_filename = poatr:poatr_cds.gff
# informal names associated to their fasta or gff filenames/paths through a colon and separated by commas

peak_database_path = ortholog_peak_db.tsv
ks_list_database_path = ortholog_ks_list_db.tsv
# filenames/paths of the ortholog data databases


[ANALYSIS SETTING]
paranome = yes
collinearity = no
# analysis type for paralog data; allowed values: 'yes' and 'no'

gff_feature = cds
# keyword to parse the sequence type from the gff file (column 3); can be 'gene', 'mrna'...

gff_attribute = id
# keyword to parse gene id from the gff file (column 9); can be 'id', 'name'...

max_number_outgroups = 4
# maximum number of outspecies/trios selected to rate-adjust each divergent species pair (default: 4)

consensus_mode_for_multiple_outgroups = mean among outgroups
# allowed values: 'mean among outgroups' or 'best outgroup' (default: 'mean among outgroups')


[PARAMETERS]
x_axis_max_limit_paralogs_plot = 5
# highest value of the x axis in the mixed distribution plot (default: 5)

bin_width_paralogs = 0.1
# bin width in paralog ks histograms (default: 0.1, ten bins per unit)

y_axis_max_limit_paralogs_plot = None
# highest value of the y axis in the mixed distribution plot  (default: none)

num_bootstrap_iterations = 200
# number of bootstrap iterations for ortholog peak estimate

divergence_colors = Red, MediumBlue, DarkGoldenrod, ForestGreen, HotPink, DarkCyan, SaddleBrown, Black
# color of the divergence lines drawn in correspondence of the ortholog peaks
# use color names/codes separated by comma and use at least as many colors as the number of divergence nodes

x_axis_max_limit_orthologs_plots = 5
# highest value of the x axis in the ortholog distribution plots (default: 5)

bin_width_orthologs = 0.1
# bin width in ortholog ks histograms (default: 0.1, ten bins per unit)

max_ks_paralogs = 5
# maximum paralog ks value accepted from ks data table (default: 5)

max_ks_orthologs = 10
# maximum ortholog ks value accepted from ks data table (default: 10)
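
The `name:filename` pairs in a configuration like the one above can be read back programmatically, for example to check that each listed file exists before starting a run. This is a minimal sketch using Python's standard `configparser`; it assumes the file follows the INI layout shown here, and the helper names `parse_name_map`, `load_species_files` and `missing_files` are hypothetical, not ksrates functions.

```python
import configparser
import os


def parse_name_map(value):
    """Split 'a:x, b:y' into {'a': 'x', 'b': 'y'}."""
    mapping = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, target = item.partition(":")
        mapping[name.strip()] = target.strip()
    return mapping


def load_species_files(config_text):
    """Read the [SPECIES] section and return the focal species and file maps."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    species = parser["SPECIES"]
    return {
        "focal": species["focal_species"].strip(),
        "fasta": parse_name_map(species["fasta_filenames"]),
        "gff": parse_name_map(species.get("gff_filename", fallback="")),
    }


def missing_files(name_map):
    """Return the informal names whose file path does not exist on disk."""
    return sorted(n for n, path in name_map.items() if not os.path.isfile(path))


# Shortened excerpt of the configuration above, used only for demonstration.
EXAMPLE = """
[SPECIES]
focal_species = poatr
# informal name of the focal species
fasta_filenames = poatr:cds_final.fasta, brac:Brachypodium_distachyon.Brachypodium_distachyon_v3.0.cds.all.fa
gff_filename = poatr:poatr_cds.gff
"""

if __name__ == "__main__":
    info = load_species_files(EXAMPLE)
    print(info["focal"], info["fasta"], info["gff"])
    print("FASTA files not found:", missing_files(info["fasta"]))
```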

I then ran the following command to estimate the whole-paranome KS values.

ksrates paralogs-ks config_filename.txt

However, I am getting this error. I tried multiple things, including installing and reinstalling packages and changing versions, but nothing seems to work. My understanding is that the alignment files aren't being generated, but I can't figure out why. Any thoughts on how I can make this work?

INFO    - - - - - - - - - - - - - - - - - - - -
INFO    Paralog wgd analysis for species [poatr]
INFO    Wed Jun  1 21:01:56 2022
INFO    - - - - - - - - - - - - - - - - - - - -
INFO    Checking if sequence data files exist and if sequence IDs are compatible with wgd pipeline...
WARNING Poa trivialis: sequence IDs in FASTA file [cds_final.fasta] could raise an error due to:
WARNING  - ID length longer than 50 characters, it is advised to shorten them
WARNING  - ID name contains one or more characters that are not allowed: =
INFO    Completed
INFO    Running wgd paralog Ks pipeline...
INFO    Paralog blast data poatr.blast.tsv already exists, will skip wgd all versus all Blastp
INFO    Paralog gene family data poatr.mcl.tsv already exists, will skip wgd mcl
INFO    No paralog Ks data, will run wgd Ks analysis
INFO    ---
INFO    Checking external software...
INFO    muscle 5.1.linux64 []
INFO    AAML in paml version 4.9j, February 2020
INFO    Usage for FastTree version 2.1.11 Double precision (No SSE3):
INFO    Translating CDS file cds_final.fasta...
INFO    ---
INFO    Running whole paranome Ks analysis...
WARNING Filtered out the 3 largest gene families because their size is > 200
WARNING If you want to analyse these large families anyhow, please raise the `max_gene_family_size` parameter
INFO    Started analysis of 5019 gene families in parallel using 4 threads
INFO    Performing analysis on gene family GF_000004 (size 165)
ERROR   Unexpected internal error during analysis of gene family GF_000004:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000004.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000005 (size 138)
ERROR   Unexpected internal error during analysis of gene family GF_000005:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000005.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000006 (size 129)
ERROR   Unexpected internal error during analysis of gene family GF_000006:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000006.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000007 (size 122)
ERROR   Unexpected internal error during analysis of gene family GF_000007:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000007.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000008 (size 122)
ERROR   Unexpected internal error during analysis of gene family GF_000008:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000008.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000009 (size 120)
ERROR   Unexpected internal error during analysis of gene family GF_000009:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000009.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000010 (size 117)
ERROR   Unexpected internal error during analysis of gene family GF_000010:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000010.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000011 (size 115)
ERROR   Unexpected internal error during analysis of gene family GF_000011:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000011.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000012 (size 111)
ERROR   Unexpected internal error during analysis of gene family GF_000012:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000012.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000013 (size 110)
ERROR   Unexpected internal error during analysis of gene family GF_000013:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000013.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000014 (size 107)
ERROR   Unexpected internal error during analysis of gene family GF_000014:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000014.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000015 (size 95)
ERROR   Unexpected internal error during analysis of gene family GF_000015:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000015.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000016 (size 92)
ERROR   Unexpected internal error during analysis of gene family GF_000016:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000016.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000017 (size 89)
ERROR   Unexpected internal error during analysis of gene family GF_000017:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000017.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000018 (size 89)
ERROR   Unexpected internal error during analysis of gene family GF_000018:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000018.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000019 (size 86)
ERROR   Unexpected internal error during analysis of gene family GF_000019:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000019.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000020 (size 81)
ERROR   Unexpected internal error during analysis of gene family GF_000020:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000020.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000021 (size 79)
ERROR   Unexpected internal error during analysis of gene family GF_000021:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000021.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000022 (size 79)
ERROR   Unexpected internal error during analysis of gene family GF_000022:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000022.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000023 (size 75)
ERROR   Unexpected internal error during analysis of gene family GF_000023:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000023.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000024 (size 73)
ERROR   Unexpected internal error during analysis of gene family GF_000024:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000024.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000025 (size 72)
ERROR   Unexpected internal error during analysis of gene family GF_000025:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000025.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000026 (size 69)
ERROR   Unexpected internal error during analysis of gene family GF_000026:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000026.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000027 (size 67)
ERROR   Unexpected internal error during analysis of gene family GF_000027:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000027.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000028 (size 67)
ERROR   Unexpected internal error during analysis of gene family GF_000028:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000028.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000029 (size 64)
ERROR   Unexpected internal error during analysis of gene family GF_000029:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000029.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000030 (size 64)
ERROR   Unexpected internal error during analysis of gene family GF_000030:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000030.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000031 (size 62)
ERROR   Unexpected internal error during analysis of gene family GF_000031:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000031.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000032 (size 61)
ERROR   Unexpected internal error during analysis of gene family GF_000032:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000032.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000033 (size 59)
ERROR   Unexpected internal error during analysis of gene family GF_000033:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000033.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000034 (size 57)
ERROR   Unexpected internal error during analysis of gene family GF_000034:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000034.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000035 (size 56)
ERROR   Unexpected internal error during analysis of gene family GF_000035:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000035.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000036 (size 55)
ERROR   Unexpected internal error during analysis of gene family GF_000036:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000036.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000037 (size 55)
ERROR   Unexpected internal error during analysis of gene family GF_000037:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000037.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000038 (size 52)
ERROR   Unexpected internal error during analysis of gene family GF_000038:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000038.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000039 (size 51)
ERROR   Unexpected internal error during analysis of gene family GF_000039:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000039.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000040 (size 50)
ERROR   Unexpected internal error during analysis of gene family GF_000040:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000040.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000041 (size 49)
ERROR   Unexpected internal error during analysis of gene family GF_000041:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000041.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000042 (size 48)
ERROR   Unexpected internal error during analysis of gene family GF_000042:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000042.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000043 (size 48)
ERROR   Unexpected internal error during analysis of gene family GF_000043:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000043.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000044 (size 46)
ERROR   Unexpected internal error during analysis of gene family GF_000044:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000044.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000045 (size 46)
ERROR   Unexpected internal error during analysis of gene family GF_000045:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000045.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000046 (size 45)
ERROR   Unexpected internal error during analysis of gene family GF_000046:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000046.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000047 (size 45)
ERROR   Unexpected internal error during analysis of gene family GF_000047:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000047.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000048 (size 45)
ERROR   Unexpected internal error during analysis of gene family GF_000048:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000048.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000049 (size 45)
ERROR   Unexpected internal error during analysis of gene family GF_000049:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000049.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000050 (size 45)
ERROR   Unexpected internal error during analysis of gene family GF_000050:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000050.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000051 (size 45)
ERROR   Unexpected internal error during analysis of gene family GF_000051:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000051.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000052 (size 44)
ERROR   Unexpected internal error during analysis of gene family GF_000052:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000052.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000053 (size 44)
ERROR   Unexpected internal error during analysis of gene family GF_000053:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000053.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000054 (size 44)
ERROR   Unexpected internal error during analysis of gene family GF_000054:
Traceback (most recent call last):
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 278, in analyse_family_try_except
    analysis_function(family_id, family, nucleotide, tmp, codeml, preserve,
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/storage/home/cpb5881/.local/lib/python3.8/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp/GF_000054.fasta.msa'
ERROR   Skipping gene family
ERROR   Too many gene family analyses failed, terminating threads...
ERROR   Too many gene family analyses failed, terminating threads...
ERROR   Too many gene family analyses failed, terminating threads...
ERROR   --
ERROR   The analyses of more than 1% of gene families [51/5019] have failed due to unexpected internal errors
ERROR   Please check the nature of the error(s), remove the tmp directory [/storage/group/cpb5881/default/WGD/wgd/paralog_distributions/wgd_poatr/poatr.ks_tmp] and rerun the Ks analysis
ERROR   See the tracebacks above for the following gene family IDs:
ERROR   GF_000004
ERROR   GF_000005
ERROR   GF_000006
ERROR   GF_000007
ERROR   GF_000008
ERROR   GF_000009
ERROR   GF_000010
ERROR   GF_000011
ERROR   GF_000012
ERROR   GF_000013
ERROR   GF_000014
ERROR   GF_000015
ERROR   GF_000016
ERROR   GF_000017
ERROR   GF_000018
ERROR   GF_000019
ERROR   GF_000020
ERROR   GF_000021
ERROR   GF_000022
ERROR   GF_000023
ERROR   GF_000024
ERROR   GF_000025
ERROR   GF_000026
ERROR   GF_000027
ERROR   GF_000028
ERROR   GF_000029
ERROR   GF_000030
ERROR   GF_000031
ERROR   GF_000032
ERROR   GF_000033
ERROR   GF_000034
ERROR   GF_000035
ERROR   GF_000036
ERROR   GF_000037
ERROR   GF_000038
ERROR   GF_000039
ERROR   GF_000040
ERROR   GF_000041
ERROR   GF_000042
ERROR   GF_000043
ERROR   GF_000044
ERROR   GF_000045
ERROR   GF_000046
ERROR   GF_000047
ERROR   GF_000048
ERROR   GF_000049
ERROR   GF_000050
ERROR   GF_000051
ERROR   GF_000052
ERROR   GF_000053
ERROR   GF_000054
ERROR   Exiting
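
For anyone reading the log above: the run stops at 51 failed families because that is the first count exceeding 1% of 5019. A minimal sketch of that arithmetic, assuming the cutoff is a strict "more than 1%" as the message words it. The function name is hypothetical and this is not the wgd_ksrates implementation.

```python
def min_failures_to_abort(n_families: int, max_percent: int = 1) -> int:
    """Smallest failure count that is strictly more than max_percent % of n_families."""
    # failures * 100 > n_families * max_percent, solved for the smallest integer
    return (n_families * max_percent) // 100 + 1


print(min_failures_to_abort(5019))
```

The same rule gives 44 for a run with 4368 families, which matches the `[44/4368]` summary in the next report.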

ERROR Unexpected internal error during analysis of gene family GF_000001

Hello, I get the error below during the whole-paranome Ks analysis. Could you give me some advice? Thank you very much.

INFO    Paralog wgd analysis for species [Dede]
INFO    Mon Jul 10 17:18:26 2023
INFO    - - - - - - - - - - - - - - - - - - - -
INFO    Checking if sequence data files exist and if sequence IDs are compatible with wgd pipeline...
INFO    Completed
INFO    Creating directory [paralog_distributions/]
INFO    Running wgd paralog Ks pipeline...
INFO    ---
INFO    Checking external software...
INFO    makeblastdb: 2.11.0+
INFO    blastp: 2.11.0+
INFO    mcl 14-137
INFO    muscle 5.1.linux64 []
INFO    AAML in paml version 4.9j, February 2020
INFO    FastTree Version 2.1.11 Double precision (No SSE3)
INFO    Creating output directory /data/jwang/5-dendrobium_devonianum_genome/7.2.ksrates/paralog_distributions/wgd_Dede
INFO    Translating CDS file Dede.cds...
WARNING Sequence length != multiple of 3 for De_Chr06G000001.mRNA1!
WARNING Invalid codon  AG in De_Chr06G000001.mRNA1
WARNING There were 2 warnings during translation
INFO    ---
INFO    Running all versus all Blastp
INFO    Writing protein Blastdb sequences to /data/jwang/7.2.ksrates/paralog_distributions/wgd_Dede/Dede.blast_tmp/...
INFO    Writing protein query sequences to /data/jwang/7.2.ksrates/paralog_distributions/wgd_Dede/Dede.blast_tmp/...
INFO    Performing all versus all Blastp (this might take a while)...
INFO    Making Blastdb
INFO    makeblastdb -in /data/jwang/7.2.ksrates/paralog_distributions/wgd_Dede/Dede.blast_tmp/Dede.db.fasta -dbtype prot
INFO    makeblastdb output:
Building a new DB, current time: 07/10/2023 17:18:36
New DB name:   /data/jwang/7.2.ksrates/paralog_distributions/wgd_Dede/Dede.blast_tmp/Dede.db.fasta
New DB title:  /data/jwang/7.2.ksrates/paralog_distributions/wgd_Dede/Dede.blast_tmp/Dede.db.fasta
Sequence type: Protein
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 30408 sequences in 0.42232 seconds.
INFO    Running Blastp
INFO    blastp -db /data/jwang/5-dendrobium_devonianum_genome/7.2.ksrates/paralog_distributions/wgd_Dede/Dede.blast_tmp/Dede.db.fasta -query /data/jwang/5-dendrobium_devonianum_genome/7.2.ksrates/paralog_distributions/wgd_Dede/Dede.blast_tmp/Dede.query.fasta -evalue 1e-10 -outfmt 6 -num_threads 30 -out /data/jwang/5-dendrobium_devonianum_genome/7.2.ksrates/paralog_distributions/wgd_Dede/Dede.blast.tsv
INFO    All versus all Blastp done
INFO    Removing tmp directory
INFO    ---
INFO    Running whole paranome Ks analysis...
WARNING Filtered out the 3 largest gene families because their size is > 200
WARNING If you want to analyse these large families anyhow, please raise the `max_gene_family_size` parameter
INFO    Started analysis of 4368 gene families in parallel using 30 threads
INFO    Performing analysis on gene family GF_000004 (size 159)
ERROR   Unexpected internal error during analysis of gene family GF_000004:
Traceback (most recent call last):
  File "/data/jwang/software/miniconda3/envs/ksrate/lib/python3.7/site-packages/wgd_ksrates/ks_distribution.py", line 280, in analyse_family_try_except
    n_families, is_last_family)
  File "/data/jwang/software/miniconda3/envs/ksrate/lib/python3.7/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/data/jwang/software/miniconda3/envs/ksrate/lib/python3.7/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/data/jwang/7.2.ksrates/paralog_distributions/wgd_Dede/Dede.ks_tmp/GF_000004.fasta.msa'
ERROR   Skipping gene family
INFO    Performing analysis on gene family GF_000005 (size 130)
ERROR   Unexpected internal error during analysis of gene family GF_000005:
Traceback (most recent call last):
  File "/data/jwang/software/miniconda3/envs/ksrate/lib/python3.7/site-packages/wgd_ksrates/ks_distribution.py", line 280, in analyse_family_try_except
    n_families, is_last_family)
  File "/data/jwang/software/miniconda3/envs/ksrate/lib/python3.7/site-packages/wgd_ksrates/ks_distribution.py", line 371, in analyse_family
    msa_path, stats, successful = prepare_aln(msa_path_protein, nucleotide)
  File "/data/jwang/software/miniconda3/envs/ksrate/lib/python3.7/site-packages/wgd_ksrates/alignment.py", line 43, in prepare_aln
    with open(msa_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/data/jwang/7.2.ksrates/paralog_distributions/wgd_Dede/Dede.ks_tmp/GF_000005.fasta.msa'
.....
ERROR   Skipping gene family
ERROR   Too many gene family analyses failed, terminating threads...
ERROR   Too many gene family analyses failed, terminating threads...
ERROR   Too many gene family analyses failed, terminating threads...
ERROR   --
ERROR   The analyses of more than 1% of gene families [44/4368] have failed due to unexpected internal errors

I suspected the issue was with i-adhore, but it runs normally:

/data/jwang/software/miniconda3/envs/ksrate/bin/i-adhore
Usage: /data/jwang/software/miniconda3/envs/ksrate/bin/i-adhore [configuration file]

installation error

Hello, I was able to install and use Singularity, but I'm having an issue with Nextflow. When I run "java -version" I get:
openjdk version "19.0.1" 2022-10-18
OpenJDK Runtime Environment Homebrew (build 19.0.1)
OpenJDK 64-Bit Server VM Homebrew (build 19.0.1, mixed mode, sharing)

but when I use "wget -qO- https://get.nextflow.io | bash", I get this error:

NOTE: Nextflow is not tested with Java 1.8.0_331 -- It's recommended the use of version 11 up to 18

Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.UnsupportedClassVersionError: org/eclipse/jgit/api/errors/GitAPIException has been compiled by a more recent version of the Java Runtime (class file version 55.0), this version of the Java Runtime only recognizes class file versions up to 52.0
at java.lang.ClassLoader.defineClass1(Native Method)
...

I'm not sure why the Nextflow installation is using the wrong version of java and I'm having trouble finding how to solve this online. I know this is only tangentially related to ksrates, but I would appreciate any help you can provide.
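
The two class file versions in the exception identify the Java releases involved. A minimal sketch of the mapping, relying on the fact that class file major version N corresponds to Java release N minus 44 for Java 5 and later. The function name is hypothetical.

```python
def java_release(class_file_major: int) -> int:
    """Map a class file major version (49 or higher) to its Java release number."""
    if class_file_major < 49:
        raise ValueError("this mapping only holds for Java 5 (major 49) and later")
    return class_file_major - 44


print(java_release(55), java_release(52))
```

So the class was compiled for Java 11, while the runtime that launched Nextflow only supports class files up to Java 8, consistent with the `Java 1.8.0_331` note in the output. Since `java -version` reports 19.0.1, a second Java installation is probably being picked up by the Nextflow launcher. One possibility, stated here as an assumption, is a `JAVA_HOME` variable pointing at the older install.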

singularity

Hello
@Cecilia-Sensalari
When I try to run the pipeline, I get the error below. Can you help me? Thank you very much.
`N E X T F L O W ~ version 21.10.6
Launching VIB-PSB/ksrates [nauseous_torricelli] - revision: bfbb623 [master]

K S R A T E S - N E X T F L O W P I P E L I N E (v1.1.1)
Configuration file: ./config_elaeis.txt
Logs folder: logs_f55ff9e5
Preserve leftover files: false

Command line: nextflow run VIB-PSB/ksrates --config ./config_elaeis.txt
Launch directory: /share/home/stu_wangtianyou/soft/ksrates/test
Work directory: /share/home/stu_wangtianyou/soft/ksrates/test/work
ksrates directory: /share/home/stu_wangtianyou/.nextflow/assets/VIB-PSB/ksrates

Start time: 2022-05-12T16:46:58.801640+08:00

[- ] process > checkConfig -
[- ] process > checkConfig -
[- ] process > setupAdjustment -
[- ] process > setParalogAnalysis -
[- ] process > setOrthologAnalysis -
[- ] process > estimatePeaks -
[- ] process > wgdParalogs -
[- ] process > wgdOrthologs -
[- ] process > plotOrthologDistrib -
[- ] process > doRateAdjustment -
[- ] process > paralogsAnalyses -
[- ] process > drawTree -
Pulling Singularity image docker://vibpsb/ksrates:latest [cache /share/home/stu_wangtianyou/soft/ksrates/test/work/singularity/vibpsb-ksrates-latest.img]
WARN: Singularity cache directory has not been defined -- Remote image will be stored in the path: /share/home/stu_wangtianyou/soft/ksrates/test/work/singularity -- Use env variable NXF_SINGULARITY_CACHEDIR to specify a different location
Error executing process > 'checkConfig'

Caused by:
Failed to pull singularity image
command: singularity pull --name vibpsb-ksrates-latest.img.pulling.1652345220904 docker://vibpsb/ksrates:latest > /dev/null
status : 1
message:
[- ] process > checkConfig -
[- ] process > setupAdjustment -
[- ] process > setParalogAnalysis -
[- ] process > setOrthologAnalysis -
[- ] process > estimatePeaks -
[- ] process > wgdParalogs -
[- ] process > wgdOrthologs -
[- ] process > plotOrthologDistrib -
[- ] process > doRateAdjustment -
[- ] process > paralogsAnalyses -
[- ] process > drawTree -
Pulling Singularity image docker://vibpsb/ksrates:latest [cache /share/home/stu_wangtianyou/soft/ksrates/test/work/singularity/vibpsb-ksrates-latest.img]
WARN: Singularity cache directory has not been defined -- Remote image will be stored in the path: /share/home/stu_wangtianyou/soft/ksrates/test/work/singularity -- Use env variable NXF_SINGULARITY_CACHEDIR to specify a different location
Error executing process > 'checkConfig'

Caused by:
Failed to pull singularity image
command: singularity pull --name vibpsb-ksrates-latest.img.pulling.1652345220904 docker://vibpsb/ksrates:latest > /dev/null
status : 1 message:
/share/home/stu_wangtianyou/soft/ksrates/singularity: line 1: Bootstrap:: command not found
/share/home/stu_wangtianyou/soft/ksrates/singularity: line 2: From:: command not found
/share/home/stu_wangtianyou/soft/ksrates/singularity: line 4: fg: no job control
/share/home/stu_wangtianyou/soft/ksrates/singularity: line 5: AUTHOR: command not found
/share/home/stu_wangtianyou/soft/ksrates/singularity: line 7: fg: no job control
/share/home/stu_wangtianyou/soft/ksrates/singularity: line 8: warning: setlocale: LC_ALL: cannot change locale (C.UTF-8): No such file or directory
/share/home/stu_wangtianyou/soft/ksrates/singularity: line 8: warning: setlocale: LC_ALL: cannot change locale (C.UTF-8)
/share/home/stu_wangtianyou/soft/ksrates/singularity: line 12: fg: no job control
bash: warning: setlocale: LC_ALL: cannot change locale (C.UTF-8)
/share/home/stu_wangtianyou/soft/ksrates/requirements.txt: line 9: scikit-learn==0.24.2: command not found
bash: warning: setlocale: LC_ALL: cannot change locale (C.UTF-8)
/share/home/stu_wangtianyou/soft/ksrates/setup.py: line 1: from: command not found
/share/home/stu_wangtianyou/soft/ksrates/setup.py: line 2: from: command not found
/share/home/stu_wangtianyou/soft/ksrates/setup.py: line 4: syntax error near unexpected token (' /share/home/stu_wangtianyou/soft/ksrates/setup.py: line 4: with open("README.md", 'r') as f:'
/share/home/stu_wangtianyou/soft/ksrates/singularity: line 15: ksrates: command not found
/share/home/stu_wangtianyou/soft/ksrates/singularity: line 16: wgd_ksrates: command not found
bash: warning: setlocale: LC_ALL: cannot change locale (C.UTF-8)
/share/home/stu_wangtianyou/soft/ksrates/README.md: line 1: syntax error near unexpected token (' /share/home/stu_wangtianyou/soft/ksrates/README.md: line 1: Test pipeline CI'
bash: warning: setlocale: LC_ALL: cannot change locale (C.UTF-8)
[- ] process > checkConfig -
[- ] process > setupAdjustment -
[- ] process > setParalogAnalysis -
[- ] process > setOrthologAnalysis -
[- ] process > estimatePeaks -
[- ] process > wgdParalogs -
[- ] process > wgdOrthologs -
[- ] process > plotOrthologDistrib -
[- ] process > doRateAdjustment -
[- ] process > paralogsAnalyses -
[- ] process > drawTree -
Pulling Singularity image docker://vibpsb/ksrates:latest [cache /share/home/stu_wangtianyou/soft/ksrates/test/work/singularity/vibpsb-ksrates-latest.img]
WARN: Singularity cache directory has not been defined -- Remote image will be stored in the path: /share/home/stu_wangtianyou/soft/ksrates/test/work/singularity -- Use env variable NXF_SINGULARITY_CACHEDIR to specify a different location
Error executing process > 'checkConfig'

Caused by:
Failed to pull singularity image
command: singularity pull --name vibpsb-ksrates-latest.img.pulling.1652345220904 docker://vibpsb/ksrates:latest > /dev/null
status : 1
message:
/share/home/stu_wangtianyou/soft/ksrates/singularity: line 1: Bootstrap:: command not found
/share/home/stu_wangtianyou/soft/ksrates/singularity: line 2: From:: command not found
/share/home/stu_wangtianyou/soft/ksrates/singularity: line 4: fg: no job control
/share/home/stu_wangtianyou/soft/ksrates/singularity: line 5: AUTHOR: command not found
/share/home/stu_wangtianyou/soft/ksrates/singularity: line 7: fg: no job control
/share/home/stu_wangtianyou/soft/ksrates/singularity: line 8: warning: setlocale: LC_ALL: cannot change locale (C.UTF-8): No such file or directory
/share/home/stu_wangtianyou/soft/ksrates/singularity: line 8: warning: setlocale: LC_ALL: cannot change locale (C.UTF-8)
/share/home/stu_wangtianyou/soft/ksrates/singularity: line 12: fg: no job control
bash: warning: setlocale: LC_ALL: cannot change locale (C.UTF-8)
/share/home/stu_wangtianyou/soft/ksrates/requirements.txt: line 9: scikit-learn==0.24.2: command not found
bash: warning: setlocale: LC_ALL: cannot change locale (C.UTF-8)
/share/home/stu_wangtianyou/soft/ksrates/setup.py: line 1: from: command not found
/share/home/stu_wangtianyou/soft/ksrates/setup.py: line 2: from: command not found
/share/home/stu_wangtianyou/soft/ksrates/setup.py: line 4: syntax error near unexpected token `('
/share/home/stu_wangtianyou/soft/ksrates/setup.py: line 4: `with open("README.md", 'r') as f:'
/share/home/stu_wangtianyou/soft/ksrates/singularity: line 15: ksrates: command not found
/share/home/stu_wangtianyou/soft/ksrates/singularity: line 16: wgd_ksrates: command not found
bash: warning: setlocale: LC_ALL: cannot change locale (C.UTF-8)
/share/home/stu_wangtianyou/soft/ksrates/README.md: line 1: syntax error near unexpected token `('
/share/home/stu_wangtianyou/soft/ksrates/README.md: line 1: `Test pipeline CI'
bash: warning: setlocale: LC_ALL: cannot change locale (C.UTF-8)
/share/home/stu_wangtianyou/soft/ksrates/ksrates_cli.py: line 1: import: command not found
/share/home/stu_wangtianyou/soft/ksrates/ksrates_cli.py: line 2: import: command not found
/share/home/stu_wangtianyou/soft/ksrates/ksrates_cli.py: line 3: from: command not found
/share/home/stu_wangtianyou/soft/ksrates/ksrates_cli.py: line 4: from: command not found
/share/home/stu_wangtianyou/soft/ksrates/ksrates_cli.py: line 6: syntax error near unexpected token `context_settings={'help_option_names':'
/share/home/stu_wangtianyou/soft/ksrates/ksrates_cli.py: line 6: `@click.group(context_settings={'help_option_names': ['-h', '--help']})'
/share/home/stu_wangtianyou/soft/ksrates/singularity: line 20: fg: no job control
/share/home/stu_wangtianyou/soft/ksrates/singularity: line 24: apt-get: command not found
/share/home/stu_wangtianyou/soft/ksrates/singularity: line 29: apt-get: command not found
/bin/sh: warning: setlocale: LC_ALL: cannot change locale (C.UTF-8)
Directory '/install' is not installable. File 'setup.py' not found.

======================================================================================
The pipeline terminated during process 'checkConfig' with the following error message:

Failed to pull singularity image

More details may be found in the error report above and/or in .nextflow.log

My command line was: NXF_VER=21.10.6 nextflow run VIB-PSB/ksrates --config ./config_elaeis.txt
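
One possible reading of the log above: the messages are prefixed with /share/home/stu_wangtianyou/soft/ksrates/singularity and complain about "Bootstrap:" and "From:", which suggests the shell ran a file named singularity in the ksrates checkout as a script rather than the Singularity program itself. The following is a minimal diagnostic sketch, not part of ksrates; the helper names resolve_command and looks_like_recipe are hypothetical, and the "starts with Bootstrap:" check is only a heuristic.

```python
import shutil


def resolve_command(name, search_path=None):
    """Return the file a shell would run for `name`, or None if not found.

    `search_path` defaults to the PATH environment variable.
    """
    return shutil.which(name, path=search_path)


def looks_like_recipe(path):
    """Heuristic (assumption): a Singularity recipe starts with 'Bootstrap:'."""
    with open(path, "r", errors="replace") as handle:
        return handle.readline().startswith("Bootstrap:")


if __name__ == "__main__":
    resolved = resolve_command("singularity")
    if resolved is None:
        print("singularity was not found on PATH")
    elif looks_like_recipe(resolved):
        print(f"{resolved} looks like a recipe file, not the Singularity program")
    else:
        print(f"singularity resolves to {resolved}")
```

If the resolved path points into the ksrates directory, removing that directory from PATH (or launching from a different directory) would be the first thing to try.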

ksrates paralogs-ks seemingly freezing on i-adhore step

Hello,
I've run ksrates paralogs-ks for a few different genome assemblies and they typically finish pretty quickly:
Sp. A (32 threads) - finished in 1 hr 24 min and used 3.74 GB memory
Sp. B (32 threads) - finished in 1 hr 8 min and used 3.82 GB memory
Sp. C (32 threads) - finished in 3 hr 1 min and used 5.43 GB memory

But one publicly available species in particular, Eustoma grandiflorum (https://plantgarden.jp/en/list/t52518/genome/t52518.G001), has been giving me issues. When I ran it with 32 threads, it exceeded the maximum memory that I was giving it. The job reported that it used ~40 GB of memory. I think this value was multiplied across the threads, making it ~1,280 GB, which was more than what I was allocating. I eventually lowered the number of threads and increased the memory, but the current run hasn't made progress in at least 27 hours (anywhere between 27 and 48 hours).

run 1 (32 threads) - stopped at 10 hours due to memory issues, used 40.39 GB (1,292 GB by my understanding)
run 2 (12 threads) - reached time limit at 24 hours, used 38.69 GB memory (464 GB/512 GB allocated)
run 3 (12 threads) - reached time limit at 48 hours, used 33.77 GB memory (405 GB/512 GB allocated)
run 4 (12 threads) - currently at ~48 hours out of my university's limit of 72 hours, but I haven't seen any change in the output since I mistakenly touched the files ~27 hours ago.

I knew Eustoma grandiflorum was going to use more memory and take longer than other species, because it also took a bit longer when it was one of the species I was using in the orthologs-ks step, but this amount of time is worrying given the time and memory limits at my university. I know it's possible that the job finishes before the 72 hours are up, but I wanted to reach out in case you had any recommendations for speeding up this process in future runs. Thank you for any help you can provide.
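
The totals in parentheses in the run list above can be reproduced with a few lines of arithmetic. This sketch only restates the poster's own assumption that the scheduler's reported figure is per thread; whether the scheduler actually reports memory that way is not confirmed anywhere in the thread, and the function name is hypothetical.

```python
# (run label, threads, reported memory in GB), as quoted in the post
runs = [
    ("run 1", 32, 40.39),
    ("run 2", 12, 38.69),
    ("run 3", 12, 33.77),
]


def total_memory_gb(threads, reported_gb):
    """Total memory, ASSUMING the reported figure is per thread."""
    return threads * reported_gb


for label, threads, reported in runs:
    total = total_memory_gb(threads, reported)
    print(f"{label}: {threads} threads x {reported} GB = {total:.0f} GB")
```

Under that assumption, runs 2 and 3 stay below the 512 GB allocation, while run 1 does not.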

Error when executing ksrates test in nextflow

I am trying to get ksrates running on my laptop (MacBook Pro M1 running macOS Monterey) using Nextflow. Nextflow itself runs successfully, but when I try to run the test example I get the following error:

$ nextflow run VIB-PSB/ksrates --config ./config_elaeis.txt
N E X T F L O W  ~  version 22.04.0
Launching `https://github.com/VIB-PSB/ksrates` [determined_bernard] DSL2 - revision: bfbb623720 [master]


K S R A T E S   -   N E X T F L O W   P I P E L I N E   (v1.1.1)
----------------------------------------------------------------

Configuration file:                    ./config_elaeis.txt
Logs folder:                           logs_3cf4463c
Preserve leftover files:               false

Command line:               nextflow run VIB-PSB/ksrates --config ./config_elaeis.txt
Launch directory:           /Users/wickell/ksrates
Work directory:             /Users/wickell/ksrates/work
ksrates directory:          /Users/wickell/.nextflow/assets/VIB-PSB/ksrates

Start time:                 2022-04-28T09:05:59.358787-04:00


No such variable: outCheckConfig

 -- Check script '/Users/wickell/.nextflow/assets/VIB-PSB/ksrates/main.nf' at line: 252 or see '.nextflow.log' file for more details

When I check the .nextflow.log file, it doesn't tell me much more than the initial error message.
I have also tried running the Nextflow pipeline on our server (running Ubuntu 14.04.5) and get the same error.

Any idea what might be going wrong?

Updated Errors

Hello!
I am unable to update ksrates-1.1.3 to ksrates-1.1.4 with the following commands:

pip3 uninstall ksrates
git clone https://github.com/VIB-PSB/ksrates
cd ksrates
pip3 install .

The update looks like the following:

liu@admin1 14:08:06 /vol2/software/ksrates-1.1.4
$ /vol2/software/Python-3.7.12/bin/pip3 install .
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Processing /vol2/software/ksrates-1.1.4
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: click==8.0.1 in /vol2/software/Python-3.7.12/lib/python3.7/site-packages (from ksrates==1.1.3) (8.0.1)
Requirement already satisfied: numpy==1.21.2 in /public/home/liu/.local/lib/python3.7/site-packages (from ksrates==1.1.3) (1.21.2)
Requirement already satisfied: scipy==1.7.1 in /public/home/liu/.local/lib/python3.7/site-packages (from ksrates==1.1.3) (1.7.1)
......
Successfully built ksrates
Installing collected packages: ksrates
  Attempting uninstall: ksrates
    Found existing installation: ksrates 1.1.3
    Uninstalling ksrates-1.1.3:
      Successfully uninstalled ksrates-1.1.3
Successfully installed ksrates-1.1.3

Thank you!
Xuping
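
pip reports the version declared in the package metadata of the checkout, not the name of the directory it was installed from, so the final "Successfully installed ksrates-1.1.3" line may reflect the version string in the cloned source rather than a failed update. A minimal sketch for checking what pip has recorded, assuming nothing about ksrates beyond its distribution name; the helper installed_version is hypothetical, and the pkg_resources fallback is there because the log above shows Python 3.7.

```python
try:
    # Available in the standard library from Python 3.8 onward.
    from importlib.metadata import PackageNotFoundError, version
except ImportError:
    # Python 3.7: fall back to pkg_resources (shipped with setuptools).
    version = None


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    if version is not None:
        try:
            return version(package)
        except PackageNotFoundError:
            return None
    import pkg_resources
    try:
        return pkg_resources.get_distribution(package).version
    except pkg_resources.DistributionNotFound:
        return None


if __name__ == "__main__":
    print("ksrates:", installed_version("ksrates"))
```

Comparing this value with the version string declared in the cloned repository would show whether the installation itself or only the declared version is out of date.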
