tetools's Introduction

Dfam TE Tools Container

Note that as of version 1.88 TETools contains RepeatMasker 4.1.6. This version of RepeatMasker uses a new version of FamDB with a new format. This README contains instructions for modifying the files available to the TETools container, but the README file in the FamDB repository contains more information on the format itself.

Dfam TE Tools includes RepeatMasker, RepeatModeler, and coseg. This container is an easy way to get a minimal yet fully functional installation of RepeatMasker and RepeatModeler and is additionally useful for testing or reproducibility purposes.

You should consider using the Dfam TE Tools container if:

  • You or your computing environment already use docker or singularity
  • You are uncomfortable compiling dependencies by hand or have had problems doing so
  • You do not need to use any of the other tools (e.g. RMBlast, HMMER) for anything besides RepeatMasker and RepeatModeler

You should install RepeatMasker and/or RepeatModeler manually if:

  • You do not have the necessary system permissions to install and/or run docker or singularity
  • You need to install a different version of a dependency
  • You need to compile a dependency in a specific way
  • You need to use RepeatMasker as part of another pipeline
  • You need to use AB-BLAST or Cross-Match. Either AB-BLAST or Cross-Match can probably be used with this container, but it is inconvenient and not tested.

Using the Container

Requirements

  • A 64-bit Linux operating system, or appropriate virtualization software. Docker for Mac is known to work, including the wrapper script, but we do not regularly test this platform ourselves.
  • singularity or docker installed with permissions to run containers. For docker, this usually means being in the docker group or running the container as the root user.

Using the Wrapper Script

We provide a wrapper script, dfam-tetools.sh, which does most of the work automatically.

The wrapper script does the following:

  • Runs the container from Docker Hub. You can also use the --container option to specify a different version of the container or an .sif file for singularity.
  • Uses singularity if it is available, or docker. You can choose between them with the --docker or --singularity options.
  • Runs the container as the current user, with the current working directory accessible from within the container. Depending on the environment and the software used, this directory appears inside the container at its original location and/or at the path /work.
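The runtime preference described in the list above can be illustrated with a short shell sketch. The function name pick_runtime is hypothetical, and this is not the code of dfam-tetools.sh itself; it only shows the "singularity if available, otherwise docker" order using the standard command -v builtin.

```shell
# Hypothetical helper: report which container runtime would be preferred.
# Follows the order stated in the README (singularity first, then docker).
pick_runtime() {
    if command -v singularity >/dev/null 2>&1; then
        echo singularity
    elif command -v docker >/dev/null 2>&1; then
        echo docker
    else
        echo none
        return 1
    fi
}

runtime="$(pick_runtime)" || echo "No container runtime found on PATH" >&2
echo "Runtime: ${runtime}"
```

The --docker and --singularity options of the wrapper script override this kind of automatic choice.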

NOTE: When using the wrapper script with the --docker option, it will attempt to mount the host $(pwd)/Libraries folder to the container /opt/RepeatMasker/Libraries folder. This is intended to enable the modification of local FamDB files, but it will also overwrite the library files within the container. To avoid this, copy those files to the host system using the steps below under Customizing the RepeatMasker libraries.

curl -sSLO https://github.com/Dfam-consortium/TETools/raw/master/dfam-tetools.sh
chmod +x dfam-tetools.sh
./dfam-tetools.sh

Inside the container, the included tools are now available:

BuildDatabase -name genome1 genome1.fa
RepeatModeler -database genome1 [-LTRStruct] [-threads 16]

RepeatMasker genome1.fa [-lib library.fa] [-pa 8]

runcoseg.pl -d -m 50 -c ALU.cons -s ALU.seqs -i ALU.ins
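As a rough sketch of how these commands fit together, the function below prints (but does not run) a three-step workflow for one assembly. The function name print_te_workflow and its arguments are hypothetical. It assumes that RepeatModeler writes its consensus library to a file named after the database with a -families.fa suffix, as seen in the issue reports further down this page, and that this file is what gets passed to RepeatMasker through -lib.

```shell
# Dry run: print the commands for one assembly; nothing is executed.
# Arguments: FASTA file, database name, RepeatModeler threads, RepeatMasker -pa value.
print_te_workflow() {
    fasta="$1"
    db="$2"
    threads="${3:-16}"
    pa="${4:-8}"
    echo "BuildDatabase -name ${db} ${fasta}"
    echo "RepeatModeler -database ${db} -LTRStruct -threads ${threads}"
    # Assumption: RepeatModeler's output library is <database>-families.fa
    echo "RepeatMasker ${fasta} -lib ${db}-families.fa -pa ${pa}"
}

print_te_workflow genome1.fa genome1 16 8
```

The thread counts are placeholders; choose values that match the resources available inside the container.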

Running the Container Manually

The container can also be run manually, bypassing the wrapper script:

  • docker run -it --rm dfam/tetools:latest
  • singularity run docker://dfam/tetools:latest

When running the container manually, you will also need to set the UID/GID, directories to mount, and so on according to your needs. By default singularity mounts the current directory and your HOME directory; docker does neither.
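For docker, one way to cover the UID/GID and mount settings mentioned above is sketched below. The helper name tetools_docker_cmd is hypothetical; it only prints a command built from standard docker run options (--user, -v, --workdir) so that it can be inspected before being run. Mounting the current directory at /work mirrors what the wrapper script is described as doing, but the exact options the wrapper uses are not shown here.

```shell
# Print (rather than execute) a docker command that runs the container as
# the invoking user, with the current directory mounted at /work.
tetools_docker_cmd() {
    image="${1:-dfam/tetools:latest}"
    printf 'docker run -it --rm --user %s:%s -v %s:/work --workdir /work %s\n' \
        "$(id -u)" "$(id -g)" "$(pwd)" "$image"
}

tetools_docker_cmd
```

Paths containing spaces would need quoting before the printed command is pasted into a shell.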

You can also use singularity pull to download the container to a file:

singularity pull dfam-tetools-latest.sif docker://dfam/tetools:latest
singularity run dfam-tetools-latest.sif

Customizing the RepeatMasker libraries

By default, RepeatMasker is only packaged with the root (0th) FamDB file. Details about the contents of the files are available from the FamDB repo, and the files themselves are available to download from Dfam.org.

Additional setup is needed to install additional FamDB files and/or to install RepBase RepeatMasker Edition for use with the container. This is a different process from using a custom library of FASTA or HMM models, which can be accomplished by using the -lib option to RepeatMasker.

Modifying the container itself can become quite complex, depending on which software and versions are being used and how the system is configured. Instead, these instructions create a modifiable local Libraries folder that can be mounted back into the container.

The first step is to copy the RepeatMasker Libraries out of the container to the host filesystem:

# Enter the container tagged with <tag>. This command will also mount your current directory to 
# /work inside the container
docker run -it --rm --workdir /work -v $(pwd):/work dfam/tetools:<tag>

# confirm visibility to your working directory
ls

# Make a copy of the original RepeatMasker Libraries/ directory on your host filesystem
cp -r /opt/RepeatMasker/Libraries/ ./

# exit the container
exit

# Make sure you own the Libraries folder
chown -R $USER ./Libraries/

To include additional FamDB files, download them, unzip them, and place them in Libraries/famdb. They will be detected and included in queries automatically. Note that you will need a copy of the 0th partition in the host Libraries folder in addition to any additional files. It should be copied out of the container using the steps above.
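The "unzip and include" step could be sketched as follows. The helper name install_famdb_partition is hypothetical, and it assumes the downloaded partition is a gzip-compressed file ending in .h5.gz; check the actual file names on Dfam.org before relying on this.

```shell
# Hypothetical helper: decompress a downloaded FamDB partition (*.h5.gz)
# into <Libraries>/famdb, leaving the original archive in place.
install_famdb_partition() {
    archive="$1"                  # path to the downloaded .h5.gz file
    libdir="${2:-./Libraries}"    # host copy of the Libraries directory
    case "$archive" in
        *.h5.gz) ;;
        *) echo "expected a .h5.gz file: $archive" >&2; return 1 ;;
    esac
    mkdir -p "$libdir/famdb" || return 1
    target="$libdir/famdb/$(basename "$archive" .gz)"
    # Remove a partial file if decompression fails
    gunzip -c "$archive" > "$target" || { rm -f "$target"; return 1; }
    echo "$target"
}
```

A call such as install_famdb_partition /path/to/partition.h5.gz ./Libraries prints the path of the decompressed file. The libraries still need to be regenerated afterwards.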

Whenever you modify the FamDB files, the RepeatMasker libraries must be regenerated.

# run the container binding your host Libraries directory over the RepeatMasker directory
docker run -it --rm -v <host path>/Libraries:/opt/RepeatMasker/Libraries dfam/tetools:<tag>

# navigate to the RepeatMasker folder
cd /opt/RepeatMasker

# unlock libraries for reconfiguring
rm ./Libraries/famdb/rmlib.config

# run the reconfigure script
./tetoolsDfamUpdate.pl

To include RepBase data, download and unzip RepBaseRepeatMaskerEdition-#######.tar.gz into Libraries and run the updater script.

# Unpack the RepBase data
tar -xvzf /work/path/to/Libraries/RepBaseRepeatMaskerEdition-#######.tar.gz

# run the container. You can also use the first command above again
./dfam-tetools.sh --docker

# navigate to the RepeatMasker folder
cd /opt/RepeatMasker

# unlock libraries for reconfiguring
rm ./Libraries/famdb/rmlib.config

# rerun the reconfigure script
./tetoolsDfamUpdate.pl

When using the image with the new Libraries folder, mount it to the container using the -v argument. All paths must be absolute. ./dfam-tetools.sh does this automatically.

-v <host filesystem path to Libraries>/Libraries:/opt/RepeatMasker/Libraries

Note that the process for modifying the files in a Singularity container is identical, but the arguments vary slightly. For example, instead of -v Singularity uses -B. dfam-tetools.sh also does not currently automatically mount host library directories for Singularity.
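Because mount paths must be absolute and the flag differs between the two runtimes, a small helper can build the argument. This is a sketch only; libraries_mount_arg is a hypothetical name, and the container-side path is the one given in this README.

```shell
# Hypothetical helper: print the bind-mount argument for a host Libraries
# directory, resolved to an absolute path.
# Usage: libraries_mount_arg docker|singularity <path to Libraries>
libraries_mount_arg() {
    runtime="$1"
    dir="$2"
    abs="$(cd "$dir" 2>/dev/null && pwd)" || {
        echo "no such directory: $dir" >&2
        return 1
    }
    case "$runtime" in
        docker) flag="-v" ;;
        singularity) flag="-B" ;;
        *) echo "unknown runtime: $runtime" >&2; return 1 ;;
    esac
    echo "$flag $abs:/opt/RepeatMasker/Libraries"
}
```

For example, libraries_mount_arg singularity ./Libraries prints a -B argument that can be added to a singularity run command.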


You can now specify this `Libraries/` directory by setting the `LIBDIR` environment variable, for example with the `export LIBDIR=` command or `-e LIBDIR=` depending on how you are running the container.

When you use a new version of the TETools container (particularly a new RepeatMasker), you should re-create the Libraries directory for the new version.

# Set the LIBDIR environment variable before running RepeatMasker
export LIBDIR=/path/to/Libraries
RepeatMasker genome.fa
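Before starting a long run, it can help to confirm that LIBDIR points at a usable directory. The check below is a minimal sketch under two assumptions: that the FamDB files live in a famdb subdirectory, as described above, and that they use the .h5 extension. The function name check_libdir is hypothetical.

```shell
# Hypothetical sanity check for a custom Libraries directory.
# Uses the first argument if given, otherwise the LIBDIR environment variable.
check_libdir() {
    dir="${1:-${LIBDIR:-}}"
    if [ -z "$dir" ]; then
        echo "LIBDIR is not set" >&2
        return 1
    fi
    if [ ! -d "$dir/famdb" ]; then
        echo "missing directory: $dir/famdb" >&2
        return 1
    fi
    # Assumption: FamDB partitions are stored as *.h5 files
    for f in "$dir"/famdb/*.h5; do
        if [ -e "$f" ]; then
            echo "ok: $dir"
            return 0
        fi
    done
    echo "no .h5 files found in $dir/famdb" >&2
    return 1
}
```

Running check_libdir "$LIBDIR" inside the container before RepeatMasker gives an early, readable error if the mount or the variable is wrong.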

Building the Container

You will need:

  • curl
  • docker, with permissions to build containers
  • singularity (optional) - if building a singularity container

# Download dependencies
./getsrc.sh

# Build a docker container
docker build -t org/name:tag .

# (optional) build a singularity container
singularity build dfam-tetools.sif dfam-tetools.def

Multi-Platform Docker Build

docker buildx build --platform=linux/amd64,linux/arm64 --output=type=registry -t dfam/tetools:<tag> .

Included software

The following software is included in the Dfam TE Tools container (version 1.88.5):

RepeatModeler 2.0.5 http://www.repeatmasker.org/RepeatModeler/
RepeatMasker 4.1.6 http://www.repeatmasker.org/RMDownload.html
coseg 0.2.3 http://www.repeatmasker.org/COSEGDownload.html
RMBlast 2.14.1 http://www.repeatmasker.org/RMBlast.html
HMMER 3.4 http://hmmer.org/
TRF 4.09.1 https://github.com/Benson-Genomics-Lab/TRF
RepeatScout 1.0.6 http://www.repeatmasker.org/RepeatScout-1.0.6.tar.gz
RECON 1.08 http://www.repeatmasker.org/RepeatModeler/RECON-1.08.tar.gz
cd-hit 4.8.1 https://github.com/weizhongli/cdhit
genometools 1.6.4 https://github.com/genometools/genometools
LTR_retriever 2.9.0 https://github.com/oushujun/LTR_retriever/
MAFFT 7.471 https://mafft.cbrc.jp/alignment/software/
NINJA 0.99-cluster_only https://github.com/TravisWheelerLab/NINJA
UCSC utilities* v413 http://hgdownload.soe.ucsc.edu/admin/exe/

* Selected tools only: faToTwoBit, twoBitInfo, twoBitToFa

License

The Dfam TE Tools container project (the Dockerfile and associated build and run scripts) is licensed under the CC0 1.0 Universal Public Domain Dedication. The software packages included in the container have their own associated license terms; see the individual software packages for details.

tetools's People

Contributors

asgray, epaule, jebrosen, rmhubley

tetools's Issues

RepeatMasker genome1.fa -lib library.fa -pa 8

$ RepeatMasker genome1.fa [-lib library.fa] [-pa 8]
Hello, in the example you put on the website, I don't know what -lib library.fa refers to.
If I have an assembled pig genome, how should I set library.fa here?

LTRPipeline : Error - could not open clusters.dat!

Hello,

Thank you for maintaining the useful tools.
I encountered an error when I ran RepeatModeler to generate species-specific library for masking a de novo assembly.

Describe the bug
The error was

Clustering...LTRPipeline: Error - could not cluster MAFFT results.
             : 00:00:00 (hh:mm:ss) Elapsed Time
LTRPipeline : Error - could not open /work/RM_13.SatJun290653562024/LTR_1007365.SunJun300251312024/clusters.dat! at /opt/RepeatModeler/LTRPipeline line 333.

To Reproduce
The command was:

SCAFFOLD_FASTA=out_JBAT.FINAL.fa    ### My assembly file
PREFIX=scaffold
BuildDatabase -name ${PREFIX} ${SCAFFOLD_FASTA}
RepeatModeler -database ${PREFIX} -LTRStruct -threads 40

I mounted my local dir on /work with following command:

sudo docker run -it --rm -v $(pwd):/work dfam/tetools:latest

Host system (please complete as much of the following information as you can find out):

  • My OS is: Ubuntu 20.04.6 LTS
  • Docker info: Docker version 24.0.2, build cb74dfc
  • Docker image: latest one created 4 months ago
  • Computing environment setup: single computer

Additional context
As the error said, I can't find clusters.dat in /work/RM_13.SatJun290653562024/LTR_1007365.SunJun300251312024.
The directory includes four files:

LtrRetriever-redundant-results.fa
LtrRetriever-redundant-results.fa.no_orient
mafft-alignment.fa
raw-struct-results.txt

Best,
Yasuto

Error in Job submitting of Slurm on cluster : if it is possible to disable the need for an internet connection

Dear Authors of program.

Thanks for the very nice program.
I am now running RepeatModeler with the Dfam TE Tools container (using Singularity).
The job works when run directly on the command line, but when I submit it as a job, it fails.

My Job:
(
#!/bin/bash
#SBATCH --time=1:00:00 # hh:mm:ss
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=32
#SBATCH --mem-per-cpu=4000 # megabytes
singularity run -B /scratch/ulg/bbasv/daomhai/Docker docker://dfam/tetools:latest RepeatModeler -database Catfish22 -engine ncbi -pa 32 )

The result in Slurm:
(
time="2021-12-26T19:39:42+01:00" level=warning msg=""/run/user/3003415" directory set by $XDG_RUNTIME_DIR does not exist. Either create the directory or unset $XDG_RUNTIME_DIR.: stat /run/user/3003415: no such file or directory: Trying to pull image in the event that it is a public image."
FATAL: Unable to handle docker://dfam/tetools:latest uri: failed to get checksum for docker://dfam/tetools:latest: error pinging docker registry registry-1.docker.io: Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io on [::1]:53: read udp [::1]:47463->[::1]:53: read: connection refused.
)

I contacted the cluster manager, who said that the compute nodes do not have direct internet access.
So I would like to ask the authors whether it is possible to disable the need for an internet connection when running the program as a job on the cluster.
Or do you have any other advice?

Thank you for your help and I look forward to hearing your advice.

Thanks in advance
Sincerely yours
Hai
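One possible approach, offered as a sketch rather than an official answer: the error shows singularity contacting Docker Hub to resolve the docker:// URI, so pulling the image to a local .sif file on a node with internet access (as described under "Running the Container Manually" above) and referencing that file in the job avoids the network lookup. The function name make_offline_job and the paths below are hypothetical; the #SBATCH values are copied from the job in this report.

```shell
# Run once on a login node that has internet access:
#   singularity pull dfam-tetools-latest.sif docker://dfam/tetools:latest

# Hypothetical generator for a Slurm job script that uses the local .sif file.
# Arguments: path to .sif, database name, CPU count, optional directory to bind.
make_offline_job() {
    sif="$1"
    db="$2"
    cpus="${3:-32}"
    bind="${4:-}"
    bind_opt=""
    if [ -n "$bind" ]; then
        bind_opt="-B $bind "
    fi
    cat <<EOF
#!/bin/bash
#SBATCH --time=1:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=${cpus}
#SBATCH --mem-per-cpu=4000
singularity run ${bind_opt}${sif} RepeatModeler -database ${db} -threads ${cpus}
EOF
}

make_offline_job /path/to/dfam-tetools-latest.sif Catfish22 32
```

The printed script can be redirected to a file and submitted with sbatch; the .sif file must be stored somewhere the compute nodes can read.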

RepeatModeler BuildDatabase can not open file

Describe the bug
BuildDatabase can not open file.translation from fasta files in directory.

To Reproduce
I tried this script on a HPC in singularity container:

module load singularity
singularity exec --no-home --env TEMP=/temp/ --workdir /data/ \
 --bind /temps/:/temp/ \
 --bind /repeatmodeler/results/:/data/ \
 --bind /input/fasta/:/files/ \
 /singularity_files/dfam-tetools-latest.sif BuildDatabase -name file -engine ncbi -dir /files/

I got this error:

BuildDatabase: Cannot open file file.translation

The fasta files contain a de novo assembly with 8.5 million sequences.

Expected behavior
I expected that BuildDatabase would generate the database from the fasta files in the directory.

Host system (please complete as much of the following information as you can find out):

 LSB Version:	:core-4.1-amd64:core-4.1-noarch
Distributor ID:	RedHatEnterprise
Description:	Red Hat Enterprise Linux release 8.6 (Ootpa)
Release:	8.6
Codename:	Ootpa

Singularity-CE 4.0.1

famdb.py: command not found

Hello,
Thank you for the tools. I encountered an error when I tried to use famdb.py in the singularity container.
I am using a singularity container that was released on Dec 12, 2023, and it worked well until Jan 30, 2024.
However, it did not work when I tried using it again today.
Here is my command line and error.

singularity run dfam-tetools-latest.sif
Singularity> famdb.py names -h
bash: famdb.py: command not found

I updated h5py also using pip3 install --upgrade h5py

How can I use famdb.py again in this singularity container? Should I download the latest version of it released 3 weeks ago?

How to make and use a RepeatMasker custom library inside the singularity container?

Hi,

I have installed the Rmodeler2 singularity container and I have merged the RepBase library to the RMasker library which is integrated inside the container using below instruction:

# Navigate to an appropriate directory that is persistent outside the container
$ cd /work

# Make a copy of RepeatMasker's Libraries directory here
$ cp -r /opt/RepeatMasker/Libraries/ ./

# Extract RepBase (the .tar.gz file unpacks files into Libraries/)
$ tar -x -f /work/path/to/RepBaseRepeatMaskerEdition-#######.tar.gz

# Run the 'addRepBase.pl' script (part of the RepeatMasker package) to merge the databases,
# specifying the custom Libraries directory.
$ addRepBase.pl -libdir Libraries/

# Run RepeatMasker with the LIBDIR environment variable set
$ export LIBDIR=/path/to/Libraries

I have run the Repeatmodeler on a non-model organism's genome assembly:
singularity exec --bind $PWD:$PWD ./tetools.sif RepeatModeler -database monCan3F9 -pa 40 -LTRStruct

now before running the RepeatMasker, I want to merge the monCan3F9-families.fa with the "Dfam.h5" library, how should I do that?

Also, regarding your new RM version, should I utilize the "queryRepeatDatabase" option to separately drag Drosophila library or not? because I saw it in older versions of RM but couldn't find in your current documentation with the container.

this is how inside my Library looks like now.
capture4

Not sure how I should deal with the forecast results and the results of the existing database

Hello! Thank you very much for developing such good software. Recently, I have a few unsure questions that I would like to ask you.
We assembled a genome and predicted repeats with RepeatModeler. We ran RepeatMasker using the predicted set of repeat sequences (***-families.fa) and found that the repeat content of the genome was as high as 38.75%. This may be right. However, we also used the repetitive sequences of existing species from the Dfam and RepBase databases, and we found that the repeat content reached 41.31%. I also tried merging the prediction results with the database results and found that the repeat content was as high as 44.71%. This result was much higher than our expectations and may be wrong.
I am not sure which result we should use for subsequent genome annotation analysis, I would like to ask you.
Looking forward to hearing from you!

MAFFT failed while running RepeatModeler

Describe the bug
While running RepeatModeler, I am consistently getting error at the point where the de novo LTR sequences found by LtrRetriever are being aligned with MAFFT. The RepeatScout / Recon pipeline is working and those sequences are included in the final consensus sequences, but the LTR pipeline seemingly fails after LtrRetriever is complete and MAFFT does not run correctly, and therefore the LTR sequences are not included in the final consensus sequences. I have run RepeatModeler several times on the same data and received the same error message. I have attached a screenshot of the error message. Here is the full line with the error message:

/opt/mafft/bin/mafft: line 2718: 2108323 Killed "$prefix/disttbfast" -q $npickup -E $cycledisttbfast -V "-"$gopdist -s $unalignlevel $legacygapopt $mergearg -W $tuplesize $termgapopt $outnum $addarg $add2ndhalfarg -C $numthreads-$numthreadstb $memopt $weightopt $treeinopt $treeoutopt $distoutopt $seqtype $model -g $gexp -f "-"$gop -Q $spfactor -h $aof $param_fft $algopt $treealg $scoreoutarg $anchoropt -x $maxanchorseparation $oneiterationopt < infile > pre 2>> "$progressfile"

Screenshot 2024-07-15 at 10 18 00 AM

To Reproduce
Genome I used: https://www.ncbi.nlm.nih.gov/datasets/genome/GCA_000778455.1/

Making blast database:
singularity run $dfam BuildDatabase -name DmelFixDfamDb GCA_000778455.1_CA_8.2_MHAP_genomic.fna

Running RepeatModeler:
nohup singularity run $dfam RepeatModeler -database DmelFixDfamDb -threads 20 -LTRStruct >& run2.out &
(or running without nohup, receive same error message:)
singularity run $dfam RepeatModeler -database PpecDfamDb -threads 20 -LTRStruct

Expected behavior
The final fasta files with consensus families should include both sequences from the Recon/RepeatScout pipeline and the LTR pipeline, but I am not getting any LTR families. Since this genome was used for benchmarking in the publication RepeatModeler2 was presented in, I know I should be expecting ~734 families, however I am only getting ~430 families whenever I run it.

Host system (please complete as much of the following information as you can find out):
This was run on a computing cluster on a linux operating system. More info:
LSB Version: :core-4.1-amd64:core-4.1-noarch
Distributor ID: Rocky
Description: Rocky Linux release 8.9 (Green Obsidian)
Release: 8.9
Codename: GreenObsidian

Singularity version: apptainer version 1.3.1-1.el8
The singularity container was downloaded on July 2, 2024

RepeatModeler was run on a computing cluster using 1 node, 4 cores, and 3G per core. Job efficiency:
CPU Utilized: 2-02:29:28
CPU Efficiency: 81.56% of 2-13:54:16 core-walltime
Job Wall-clock time: 15:28:34
Memory Utilized: 9.82 GB
Memory Efficiency: 81.83% of 12.00 GB

Taxonomy::new() needs a path for a famdb directory!

I'm trying to use RepeatMasker and execute it for a chromosome sequence that is in fasta format but I'm getting this message: Taxonomy::new() needs a path for a famdb directory!

I pulled the docker image dfam/tetools.
I cloned this repository.
I executed the dfam-tetools.sh file and entered the container created with the image.
I'm entering: RepeatMasker Libraries/c_elegans_chromosome_I.fasta
Libraries/c_elegans_chromosome_I.fasta is the path of the sequence I want to use.

And I'm getting:
RepeatMasker version 4.1.6
Search Engine: NCBI/RMBLAST [ 2.14.1+ ]
Taxonomy::new() needs a path for a famdb directory!
at /opt/RepeatMasker/RepeatMasker line 682.

I have tried other similar commands but the result is the same.

I'm using a single computer with Windows 10 with WSL 2 (Ubuntu-22.04).
The command ./dfam-tetools.sh is being executed in the WSL 2.
Docker image: dfam-tetools 1.88.5

I have also tried the command:
RepeatMasker -lib /opt/RepeatMasker/Libraries -dir /tmp/ Libraries/c_elegans_chromosome_I.fasta

Because I had the error of not specifying the output folder. I'm using /tmp/ because I haven't been able to give writing permissions to any other directory.

How can I solve this issue so that I can use RepeatMasker with a sequence?

forksys: Program terminated by a signal 9.

................................
forksys: Program terminated by a signal 9.
The executing command was: /opt/RepeatMasker/ProcessRepeats -lib RM_919634.MonOct101112052022/consensi.fa.classified -orifile GCA_018340775.1_ASM1834077v1_genomic.fna -maskSource /PBIO2/Taxus/communicationbiology/RM_3932828.ThuOct131112242022/GCA_018340775.1_ASM1834077v1_genomic.fna /PBIO2/Taxus/communicationbiology/RM_3932828.ThuOct131112242022/GCA_018340775.1_ASM1834077v1_genomic.fna.cat.gz
Singularity> ls CommunicationBioQC.tar.gz Cplant.03.nni
CommunicationBioQCtest Cplant.03.nog
Cplant-families.fa Cplant.03.nsq
Cplant-families.stk Cplant.nal
Cplant-rmod.log Cplant.translation
Cplant.00.nhr Crun.out
Cplant.00.nin GCA_018340775.1_ASM1834077v1_genomic.fna
Cplant.00.nnd RM_919634.MonOct101112052022
Cplant.00.nni SRR14089425
Cplant.00.nog SRR14089426
Cplant.00.nsq SRR14089427
Cplant.01.nhr SRR14089428
Cplant.01.nin SRR14089429
Cplant.01.nnd SRR14089430
Cplant.01.nni SRR14089431
Cplant.01.nog SRR14089432
Cplant.01.nsq SRR14089433
Cplant.02.nhr SRR14089434
Cplant.02.nin SRR14089435
Cplant.02.nnd SRR14089436
Cplant.02.nni SRR14453978
Cplant.02.nog SRR14453979
Cplant.02.nsq SRR14865989
Cplant.03.nhr commfastq
Cplant.03.nin repeatmaskout
Cplant.03.nnd
Singularity> RepeatMasker -nolow -no_is -norna -pa 16 -lib RM_919634.MonOct101112052022/consensi.fa.classified -dir repeatmaskout GCA_018340775.1_ASM1834077v1_genomic.fna

Above is what I got after running for a long time. I think something is broken. Can you help me figure it out? Thanks.

famdef and eleredef steps fail when using the docker image on a Mac

Hi :-)

Hope this finds you well! Thank you so much for the tool.

I successfully downloaded the docker image onto my Mac as per the instructions on the README page. I also managed to execute the BuildDatabase command fine; however, when I try to execute the RepeatModeler command (-database <name_of_database> -pa 10 -LTRStruct), it fails. So far it has failed twice and I'm re-running it again in case it solves itself :-)

The first time the error stated: "famdef failed. Exit code 1024"
The second time, after re-running with the -recover_Dir option: "eleredef failed. Exit code 11"

The log files appear intact, and the only clue that something is wrong comes from the Exit code details.

Have you encountered this issue with a Mac? Or is it simply a matter of trying it a couple more times to see if it runs to completion? I should say that both times it failed after round_3, so I could still use the files in this folder or from the earlier rounds, but I just wanted to see if I can run it to completion.

Any advice would be greatly appreciated!

Cheers,
kevin

Recovering run that failed while executing LTR_retriever

Used version: TETools git commit bf94dc1

Hi, I ran RepeatModeler on my assembly through the Docker container and it failed while executing LTR_retriever. The LTR_retriever.log file in the LRET_49037.WedFeb260356552020 contained the following error: RepeatMasker is not running properly!. As I have managed to complete a run of RepeatMasker through the Docker container in the past, I believe that a simple rerun of RepeatModeler from that point may solve the issue.

However, I am unable to rerun RepeatModeler using the --recoverDir option, as it tells me that the working directory appears to contain a successful run. Do you have any suggestions on how I could finish this run without starting from scratch?

Below you will find the LTR_retriever.log file, please let me know if you need any additional details from my side.

LTR_retriever.log

Perhaps CONS-Dfam_3.2 missing from container?

Describe the bug
Not sure, but maybe the RepeatMasker Libraries directory is missing a file/dir like CONS-Dfam_3.2? Should this work, or is it misguided?

To Reproduce
Using LIBDIR, which is simply a writable, private copy of /opt/RepeatMasker/Libraries. The command seems to fail because it's trying to access Libraries files it expects, but which do not exist.

$ LIBDIR=/projects/hpcrcf/mcolema5/mcurrey-repeatmasker/Libraries  singularity exec --bind=/projects,/packages docker://dfam/tetools:1.2 RepeatMasker  -species liliopsida empty.fa
RepeatMasker version 4.1.1
Search Engine: NCBI/RMBLAST [ 2.10.0+ ]

Using Master RepeatMasker Database: /projects/hpcrcf/mcolema5/mcurrey-repeatmasker/Libraries/RepeatMaskerLib.h5
  Title    : Dfam
  Version  : 3.2
  Date     : 2020-07-02
  Families : 6,953

Species/Taxa Search:
  Liliopsida [NCBI Taxonomy ID: 4447]
  Lineage: root;cellular organisms;Eukaryota;Viridiplantae;
           Streptophyta;Streptophytina;Embryophyta;Tracheophyta
  9 families in ancestor taxa; 0 lineage-specific families

Building general libraries in: /projects/hpcrcf/mcolema5/mcurrey-repeatmasker/Libraries/CONS-Dfam_3.2/general
RepeatMasker::createLib(): Error invoking /opt/rmblast/bin/makeblastdb on file /projects/hpcrcf/mcolema5/mcurrey-repeatmasker/Libraries/CONS-Dfam_3.2/general/is.lib.

Expected behavior
Not very familiar with RM, but was expecting it to try to extract/build the sub-library for the named species. ??

Alternatively, is there any reasonably simple way to add this to my writable copy?

Host system (please complete as much of the following information as you can find out):

  • RHEL 7.8
  • Singularity 3.3.0-1
  • Container image 1.2 (same behavior with latest)

RepeatMasker returning impossible coordinates

Describe the bug

Many repeat hit coordinates in the ".out" file extend beyond the length of the query repeat.
For example, the line below is from the ".out" file; it is a hit to a consensus repeat (DR0021180) that is 3217 bp long, yet it reports a hit at 4804-5218:

15803 11.6 1.8 1.5 ctg_1 35113158 35113313 (105938908) + DR0021180 LINE/RTE-BovB 4804 5218 (195) 43774

However, the likely corresponding line from the ".align" file does not have this issue:
15803 7.67 1.17 0.28 ctg_1 35110690 35112821 (105939400) DR0021180#LINE/RTE-BovB 1 2151 (1066) m_b606s001i18 43774

Of note, there are hits from other BovBs that overlap these coordinates (from .align):

1366 29.30 3.74 8.29 ctg_1 35112325 35113180 (105939041) BovB_Ml#LINE/RTE-BovB 456 1275 (174) m_b606s001i19 43774
2098 17.41 0.51 1.81 ctg_1 35112819 35113210 (105939011) DR0087524#LINE/RTE-BovB 1444 1830 (438) m_b606s001i21 43774
520 31.22 9.94 4.46 ctg_1 35112938 35113299 (105938922) DR0143386#LINE/RTE-BovB 4838 5218 (195) m_b606s001i22 43774
714 17.39 5.07 0.00 ctg_1 35113176 35113313 (105938908) DR0020736#LINE/RTE-BovB 2466 2610 (1) m_b606s001i23 43774

The library used was from

https://www.dfam.org/releases/Dfam_3.5/families/Dfam.h5.gz

To Reproduce
Steps to reproduce the behavior, including the full command line used to start the container and any commands run inside the container.

RepeatMasker -species reptilia -pa 32 genome.fasta

Expected behavior

The repeats coordinates should not be outside that of the reference sequence.

Host system (please complete as much of the following information as you can find out):
Linux 5.4.0-94-generic #106-Ubuntu
The above data is from a manual install from the RepeatMasker website, but I have had the same issue on multiple other reptile genomes using RepeatMasker installed from the docker container.

Additional context
This is on a species of snake. I have an assumption this may be due to a growing level of redundancy with the reptile Dfam sequences, along with misclassified sequences.

Bump version to 2.0

Describe the bug

TETools 1.88 does not look like just another release after 1.87. Users should be given a better chance to notice that it comes with breaking changes in FamDB 1.0 and is 10 times bigger in size.

To Reproduce

docker pull dfam/tetools:1
Surprise!

Expected behavior

TETools with RepeatMasker >=4.1.6 and FamDB >=1.0 should be tagged with 2.
The tag 1 should point to 1.87, not 1.88.

Docker Image Cannot Run LTRStruct pipeline

Hello,

I am having trouble running the LTRstruct pipeline with the (latest) docker image, I received the following error:
Dependency checking: Error: The RMblast engine is not installed in RepeatMasker!

This is how I am executing repeatmodeler:
shifter --module=none --image=dfam/tetools@sha256:e1e1a6f1cd8badf25746865ca8978760f86ab13ec37684ff0fd81b5b8f37ca2c BuildDatabase -engine ncbi -name arabi Athaliana_167.fa

shifter --module=none --image=dfam/tetools@sha256:e1e1a6f1cd8badf25746865ca8978760f86ab13ec37684ff0fd81b5b8f37ca2c RepeatModeler -database arabi -threads 120 -engine ncbi -LTRStruct &> repeatmodeler.log

I am using the NERSC computing system (x86_64) and shifter to pull the docker image. When I run the image, I can see that rmblast is there and in the path, so I am unsure why the dependency check is failing. Additionally, I dug into the LRET logs, and makeblastdb.log contains the following error: BLAST Database creation error: mdb_env_open: Function not implemented

Any guidance is much appreciated!

Singularity Container Issues

Having some issues running the container via singularity. RepeatModeler works perfectly fine, but RepeatMasker crashes.

josephguhlin@biochemcompute1 /V/a/d/g/j/a/m/M/assemblies> bash repeatmasker.sh
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
        LANGUAGE = (unset),
        LC_ALL = (unset),
        LC_TIME = "en_NZ.UTF-8",
        LC_MONETARY = "en_NZ.UTF-8",
        LC_ADDRESS = "en_NZ.UTF-8",
        LC_TELEPHONE = "en_NZ.UTF-8",
        LC_NAME = "en_NZ.UTF-8",
        LC_MEASUREMENT = "en_NZ.UTF-8",
        LC_IDENTIFICATION = "en_NZ.UTF-8",
        LC_NUMERIC = "en_NZ.UTF-8",
        LC_PAPER = "en_NZ.UTF-8",
        LANG = "C"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
RepeatMasker version 4.1.0
Search Engine: NCBI/RMBLAST [ 2.10.0+ ]
Master RepeatMasker Database: /opt/RepeatMasker/Libraries/RepeatMaskerLib.embl ( Complete Database: CONS-Dfam_3.1 )
Custom Repeat Library: ./MH-families.fa



analyzing file mhype.assembly.fa

Checking for E. coli insertion elements

Checking for E. coli insertion elements

Checking for E. coli insertion elements

Checking for E. coli insertion elements

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-1.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-2.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-3.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-4.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-5.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-6.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-7.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-8.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.

Checking for E. coli insertion elements

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-10.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-9.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-11.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-12.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-13.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-14.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-15.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-16.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: Retrying batch ( 1 ) [ 255,, 72915]...

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-1.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: Retrying batch ( 9 ) [ 255,, 67740]...
WARNING: Retrying batch ( 10 ) [ 255,, 33838]...

Checking for E. coli insertion elements
WARNING: Retrying batch ( 13 ) [ 255,, 38600]...

Checking for E. coli insertion elements

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-9.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: Retrying batch ( 2 ) [ 255,, 48537]...
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-10.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: Retrying batch ( 12 ) [ 255,, 75686]...

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-13.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: Retrying batch ( 3 ) [ 255,, 74351]...

Checking for E. coli insertion elements

Checking for E. coli insertion elements
WARNING: Retrying batch ( 16 ) [ 255,, 47190]...
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-2.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-12.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-3.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-16.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: Retrying batch ( 5 ) [ 255,, 12318]...

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-5.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: Retrying batch ( 14 ) [ 255,, 47190]...

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-14.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: Retrying batch ( 6 ) [ 255,, 50273]...
WARNING: Retrying batch ( 4 ) [ 255,, 70609]...

Checking for E. coli insertion elements
WARNING: Retrying batch ( 7 ) [ 255,, 50274]...

Checking for E. coli insertion elements

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-6.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-4.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-7.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: Retrying batch ( 8 ) [ 255,, 71856]...
WARNING: Retrying batch ( 11 ) [ 255,, 33838]...

Checking for E. coli insertion elements
WARNING: Retrying batch ( 15 ) [ 255,, 47190]...

Checking for E. coli insertion elements

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-8.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-11.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-15.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: Retrying batch ( 9 ) [ 255,, 67740]...

Checking for E. coli insertion elements
WARNING: Retrying batch ( 1 ) [ 255,, 72915]...
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-9.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-1.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: Retrying batch ( 2 ) [ 255,, 48537]...

Checking for E. coli insertion elements
WARNING: Retrying batch ( 13 ) [ 255,, 38600]...
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-2.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.

Checking for E. coli insertion elements
WARNING: Retrying batch ( 10 ) [ 255,, 33838]...

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-13.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-10.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: Retrying batch ( 5 ) [ 255,, 12318]...

Checking for E. coli insertion elements
WARNING: Retrying batch ( 6 ) [ 255,, 50273]...
WARNING: Retrying batch ( 16 ) [ 255,, 47190]...

Checking for E. coli insertion elements
WARNING: Retrying batch ( 3 ) [ 255,, 74351]...
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-5.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.

Checking for E. coli insertion elements

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-6.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: Retrying batch ( 7 ) [ 255,, 50274]...
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-16.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: Retrying batch ( 14 ) [ 255,, 47190]...

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-3.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-7.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-14.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: Retrying batch ( 11 ) [ 255,, 33838]...

Checking for E. coli insertion elements
WARNING: Retrying batch ( 4 ) [ 255,, 70609]...
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-11.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-4.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: Retrying batch ( 12 ) [ 255,, 75686]...

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-12.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: Retrying batch ( 8 ) [ 255,, 71856]...

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-8.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.
WARNING: Retrying batch ( 15 ) [ 255,, 47190]...

Checking for E. coli insertion elements
WARNING: The search engine returned an error (255, status = 255 )
Engine parameters: /opt/rmblast/bin/rmblastn  -num_alignments 9999999 -db /Volumes/userdata/staff_users/josephguhlin/.RepeatMaskerCache/CONS-Dfam_3.1/general/is.lib -query /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020/mhype.assembly.fa_batch-15.masked -gapopen 12 -gapextend 2 -complexity_adjust  -word_size 15 -xdrop_ungap 34 -xdrop_gap_final 17 -xdrop_gap 8  -min_raw_gapped_score 17 -dust no  -num_threads 4  -matrix identity.matrix 
A search phase could not complete on this batch.
The batch file will be re-run and if possible the
program will resume.


FATAL ERROR: RepeatMasker giving up. One or more
batches failed!  Unfortunately this type of error
cannot be recovered from. Please submit the following
details to the feedback page at the repeatmasker
website:

       http://www.repeatmasker.org

RepeatMasker Version: 4.1.0
Library Version: CONS-Dfam_3.1
Search Engine: ncbi [ 2.10.0+ ]
Command Line: /opt/RepeatMasker/RepeatMaskermhype.assembly.fa -e ncbi -pa 16 -xsmall -html -gff -lib ./MH-families.fa
Batch Number: 3
Disk Space:
Filesystem        1K-blocks         Used   Available Use% Mounted on
archive        507380835584 427940422680 79440412904  85% /Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies

System Memory:
MemTotal:       1056114176 kB
MemFree:        865090320 kB
MemAvailable:   879192996 kB
Cached:         127629024 kB
SwapCached:            0 kB
SwapTotal:      45855740 kB
SwapFree:       45855740 kB
Further details about this problem may be found in
the directory: **/Volumes/archive/deardenlab/guhlin/johnskelly/annotations/microctonus/MH/assemblies/RM_108032.MonFeb30920372020**
/V/a/d/g/j/a/m/M/a/RM_75721.MonFeb30911162020> cat ncResults-1580674289-99751.err
Error: mdb_dbi_open: MDB_NOTFOUND: No matching key/data pair found

Default output directory cannot be modified

Describe the bug
Hi,

I cannot change the default output directory when mounting the image in Singularity; output is always written to my home directory. There should be an option to specify the output directory and/or the working directory. LOCAL is the local scratch directory. Even though I explicitly try to override the home directory, everything is still written there. Please advise.

To Reproduce

CWD=$(pwd)
cp repeatmodeler.sif ${LOCAL}
cp ${MYFASTA} ${LOCAL}
cd ${LOCAL}

export HOME=${LOCAL}
HOME=${LOCAL}
MOUNTDIR=/data
singularity instance start --bind $(pwd):${MOUNTDIR} repeatmodeler.sif repeatmodeler

SPECIES=myspecies

singularity exec instance://repeatmodeler export HOME=${MOUNTDIR}
singularity exec instance://repeatmodeler BuildDatabase -name ${SPECIES} ${MOUNTDIR}/$MYFASTA

Expected behavior
The user should be able to specify the output and working directories.

This is using singularity 3.7.1
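
A possible workaround, sketched under assumptions rather than confirmed for this image: each `singularity exec` call starts a separate process, so a `HOME` exported in one call is not visible to the next. Binding the scratch directory and making it the working directory with `--pwd` may be enough, assuming BuildDatabase writes its database files relative to the current directory. The helper name `run_in_workdir` and the `/data` mount point are illustrative and not part of TETools.

```shell
# Hypothetical helper: run one TETools command with a chosen host directory
# bound at /data and used as the working directory inside the container.
run_in_workdir() {
    workdir=$1   # host directory that should receive the output
    image=$2     # path to the .sif image
    shift 2
    singularity exec --bind "${workdir}:/data" --pwd /data "${image}" "$@"
}

# Example call (paths are placeholders):
# run_in_workdir "${LOCAL}" "${LOCAL}/repeatmodeler.sif" BuildDatabase -name myspecies /data/genome.fa
```

Because the bind and the working directory are passed on every call, no persistent instance or exported variable is needed.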

Docker build from Dockerfile fails: no source files were specified

I'm trying to build the container from the Dockerfile, however I keep getting the following:

Sending build context to Docker daemon  7.168kB
Step 1/28 : FROM debian:9 AS builder
 ---> 5ddf6ebdcdb4
Step 2/28 : RUN apt-get -y update && apt-get -y install     curl gcc g++ make zlib1g-dev libgomp1     perl     python3-h5py     libfile-which-perl     libtext-soundex-perl     libjson-perl liburi-perl libwww-perl
 ---> Using cache
 ---> 9778373078cc
Step 3/28 : COPY src/* /opt/src/
COPY failed: no source files were specified

To Reproduce
docker build - < Dockerfile

Docker version: Docker version 20.10.10, build b485636

Main workstation running Ubuntu 20.04 LTS.
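
A likely cause, offered as a sketch: `docker build - < Dockerfile` sends only the Dockerfile to the daemon, with no build context, so `COPY src/*` has nothing to copy. Building from a directory that contains both the Dockerfile and a populated `src/` avoids that. The helper name `build_tetools` and the default tag are illustrative assumptions.

```shell
# Hypothetical helper: build from a checkout directory so COPY can see src/.
build_tetools() {
    context=$1                 # directory containing Dockerfile and src/
    tag=${2:-tetools-local}    # image tag; the default is a placeholder name
    if [ ! -f "${context}/Dockerfile" ]; then
        echo "no Dockerfile in ${context}" >&2
        return 1
    fi
    # COPY src/* needs at least one file in src/ inside the build context.
    if [ -z "$(ls -A "${context}/src" 2>/dev/null)" ]; then
        echo "${context}/src is missing or empty" >&2
        return 1
    fi
    docker build -t "${tag}" "${context}"
}

# Example call from the repository root:
# build_tetools . my-tetools
```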

rmblast does not work in new docker image for TETools 1.86.

In the latest version of docker image (dfam/tetools:latest 1.86), running rmblastn gives this error message:
/opt/rmblast/bin/rmblastn: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found (required by /opt/rmblast/bin/rmblastn)

Command line fasta file scaffolds_final.fa does not exist!

Hi all,

I'm running RepeatModeler using singularity with command:

singularity run dfam-tetools-latest.sif BuildDatabase -name myDB scaffolds_final.fa

However, when I execute the command on my university's HPC, I receive the error:

Command line fasta file scaffolds_final.fa does not exist!

I should mention that when I run:

singularity run dfam-tetools-latest.sif BuildDatabase -h

I receive the help documentation, as expected. Any suggestions?

Thank you!

Dfam Browse download

Additional context
Dear developers,
I'm working with the GFF3 of the C. elegans annotation from WormBase ParaSite.
The rows for repetitive elements look like this:

I RepeatMasker repeat_region 1622 1744 431 + . Target=LONGPAL1 136 261 +

I would like to annotate these repeats at the class or family level using the "Target" information.
Most of the targets have a hit in Dfam Browse, but my GFF3 has 116933 rows. Is it possible to download the Dfam Browse table?
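
As a first step that does not depend on any Dfam download, the distinct family names can be pulled out of the GFF3 so that only those need to be looked up. This is a minimal sketch; the function name `list_targets` is illustrative, and it assumes the name follows `Target=` and ends at a space, tab, or semicolon, as in the row above.

```shell
# Print each distinct Target name found in GFF3 rows read from stdin.
list_targets() {
    awk '
        # Locate "Target=" plus the name that follows it.
        match($0, /Target=[^ \t;]+/) {
            name = substr($0, RSTART + 7, RLENGTH - 7)
            if (!(name in seen)) { seen[name] = 1; print name }
        }
    '
}

# Example call (file names are placeholders):
# list_targets < annotations.gff3 > target_names.txt
```

With 116933 rows, the resulting list of unique names is likely to be far shorter than the input.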

RepeatMasker "forksys: Program terminated by a signal 7"

Hi all,

I'm running RepeatMasker using TETools through the singularity wrapper script. And it seems to be terminating at the same spot. I get the following error message while it is generating the output:

forksys: Program terminated by a signal 7. The executing command was: /opt/RepeatMasker/ProcessRepeats -lib consensi.fa.classified -gff -poly -u -source -html -xm -orifile assembly_v1.0.fasta -maskSource /home/roger/Analysis/Assembly_v1.0/RM_1798.MonMar92212332020/assembly_v1.0.fasta -xsmall /home/roger/Analysis/Assembly_v1.0/RM_1798.MonMar92212332020/assembly_v1.0.fasta.cat.gz

I'm using this command:
RepeatMasker -pa 30 -gff -html -xsmall -lib consensi.fa.classified -xm -u -poly -q -source assembly_v1.0.fasta

Any ideas what the problem could be?

Same genome assembly, different RM versions: output comparison

Describe the bug
I noticed that the newer versions, rmblast 2.10.0+ and RepeatMasker 4.1.1, return fewer known TE families but more unknown TEs.

To Reproduce
Commands used with the previous version of RMBlast and RepeatMasker 4.0.1, without the container:

/RepeatModeler.v2.0.1/BuildDatabase -name drosophila_XXX -engine ncbi ../../5_freeze_v0/monCan3F9.ctg.v0.fa
RepeatModeler -engine ncbi -pa 64 -database drosophila_XXX
queryRepeatDatabase.pl -species drosophila > drosophila.repeat.lib
cat consensi.fa.classified drosophila.repeat.lib > drosophila.monCan3F9.repeat.lib
/RepeatMasker-4.1.0/RepeatMasker -e ncbi -pa 64 -s -lib drosophila.monCan3F9.repeat.lib -dir repeatmasker_final -xsmall -html -gff ../../5_freeze_v0/monCan3F9.ctg.v0.fa

commands used with the newer singularity container version:

singularity exec --bind $PWD:$PWD ../RepeatModeler/tetools.sif BuildDatabase -name monCan3F9 monCan3F9.fa
singularity exec --bind $PWD:$PWD ../RepeatModeler/tetools.sif RepeatModeler  -database monCan3F9 -pa 40 -LTRStruct 
singularity> famdb.py -i Libraries/RepeatMaskerLib.h5 families --format fasta_name --include-class-in-name --ancestors --descendants 7215 > drosophila.rm.fa
singularity> famdb.py -i Libraries/RepeatMaskerLib.h5 families --format fasta_name --include-class-in-name --ancestors --descendants 32281 > drosophila.subgenus.rm.fa
cat monCan3F9-families.fa drosophila.rm.fa drosophila.subgenus.rm.fa > drosophila.monCan3F9_newlib.fa
singularity exec --bind $PWD:$PWD ../RepeatModeler/tetools.sif RepeatMasker -lib drosophila.monCan3F9_newlib.fa  -dir repeatmasker_new  monCan3F9.fa -engine ncbi -pa 40 -nolow

Expected behavior
I expected to recover more TEs with the newer version, since my new library is even larger (8,23 MG) than the older one (7,12 MG) and the input file is the same. Instead, the total number of masked bases and the number of known TEs are lower, and only unknown TEs are higher.

Below is the comparison of output tables:
[Screenshot: capture4]

P.S. I have tried masking with several different combinations: running RM 4.1.1 with only the 7215 library and with the RM 4.0.1 output library, and running RM 4.0.1 with the new (4.1.1) library. I could recover slightly more unclassified TEs, but the known families are always fewer than with the old approach.

P.P.S. I have added and merged the RepBase library into RepeatMasker 4.1.1 inside the container, but the output table file does not indicate this. Is that normal?

LTRStruct can't find RMblast engine

Hello,

I ran singularity exec repmasker.simg RepeatModeler -database mydb -LTRStruct. I have a families.fa file, but my output log says the following:

LTR Structural Analysis

Running LtrHarvest... : 00:00:30 (hh:mm:ss) Elapsed Time
Running Ltr_retriever...LTRPipeline: No results after LTR_Retriever filtering.
LTRPipeline Time: 00:00:35 (hh:mm:ss) Elapsed Time

RepeatClassifier Version 2.0.1

Search Engine = rmblast

  • Looking for Simple and Low Complexity sequences..
  • Looking for similarity to known repeat proteins..
  • Looking for similarity to known repeat consensi..
    Classification Time: 00:15:23 (hh:mm:ss) Elapsed Time

The LTR_retriever.log says the following:
Dependency checking: Error: The RMblast engine is not installed in RepeatMasker!

I'm not sure (a) how the program can't find a dependency within a container or (b) how rmblast worked for RepeatClassifier but not the LTR pipeline.

Host system
-linux x86.64
-singularity 3.6.4
-container installed as follows:
singularity build repmasker.simg docker://dfam/tetools:latest
(I couldn't get a successful install with the wrapper script)

Thank you

BuildDatabase step problem

Hello, TEtools team

Describe the bug
The BuildDatabase command does not work this time.

To Reproduce
BuildDatabase -name JEC21 /data/JEC21/db/JEC21.fasta

Expected behavior
BuildDatabase: Cannot open file JEC21.translation
I'm confused because I could run this last time. I am just using a new *.fasta file, and I'm sure that the file is the same as the one before.

Host system (please complete as much of the following information as you can find out):

  • uname -a
  • Linux bbc0927150e3 5.10.47-linuxkit #1 SMP Sat Jul 3 21:51:47 UTC 2021 x86_64 GNU/Linux
  • lsb_release -a
  • No LSB modules are available
  • Distributor ID: Debian
  • Description: Debian GNU/Linux 9.13 (stretch)
  • Release: 9.13 Codename: stretch
  • docker --version
  • 20.10.10, build b485636

Looking forward to your reply.
Best Regards!

error of repeatmasker in container

I used the container with RepeatModeler version 2.0.5 and RepeatMasker version 4.1.5. RepeatModeler completed successfully, but RepeatMasker reported an error. Here is my error message. What can I do to make it work?

Standard Output:
RepeatMasker version 4.1.5
Search Engine: NCBI/RMBLAST [ 2.14.1+ ]

Standard Error:
Taxonomy::new() needs a path for a famdb file!
at /opt/RepeatMasker/RepeatMasker line 658.

runcoseg.pl -d -m 50 -c ALU.cons -s ALU.seqs -i ALU.ins

After I ran RepeatMasker genome1.fa [-lib library.fa] [-pa 8],
how can I use the next command? runcoseg.pl -d -m 50 -c ALU.cons -s ALU.seqs -i ALU.ins
These are the output files I have at present:
genome1.fa genome1.fa.cat.gz genome1.fa.masked genome1-families.fa genome1-families.stk genome1.fa.out genome1.fa.tbl genome1.nhr genome1.nin genome1.nnd genome1.nni genome1.nog genome1.nsq genome1.translation
Thank you!

Customizing RepeatMasker libraries: Absent

Hello,
Thank you for the tools. I ran into some errors when I tried to customize my own RepeatMasker libraries.
I use the tools through Singularity:
singularity pull dfam-tetools-latest.sif docker://dfam/tetools:latest

Then I started the Singularity container with this command:
singularity run dfam-tetools-latest.sif

Next, I copied /opt/RepeatMasker/Libraries to /home/$USER and, after exiting the container, downloaded dfam38_full.8.h5.gz into that folder.
cp -r Libraries /home/$USER
exit
cd /home/$USER/Libraries/famdb
wget https://www.dfam.org/releases/Dfam_3.8/families/FamDB/dfam38_full.8.h5.gz && gunzip dfam38_full.8.h5.gz
chown -R $USER ./Libraries

And I mounted my home directory to /work inside the container:
cd /home/$USER
singularity run -B $(pwd):/work dfam-tetools-latest.sif

Finally, I tried to regenerate my own libraries:
cd /opt/RepeatMasker
./tetoolsDfamUpdate.pl
but I got a log like this. It seems the tool cannot find Partition 8.

perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = "en_US.utf-8",
LANG = "C"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").

Checking for libraries...

  • Found a FamDB root partition

Building RMBlast frozen libraries..
The program is installed with a the following repeat libraries:

FamDB Directory : /opt/RepeatMasker/Libraries/famdb
FamDB Generator : famdb.py v1.0
FamDB Format Version: 1.0
FamDB Creation Date : 2023-11-15 11:30:15.311827

Database: Dfam
Version : 3.8
Date : 2023-11-14

Dfam - A database of transposable element (TE) sequence alignments and HMMs.

1 Partitions Present
Total consensus sequences present: 295552
Total HMMs present : 295552

Partition Details

Partition 0 [dfam38_full.0.h5]: root - Mammalia, Amoebozoa, Bacteria , Choanoflagellata, Rhodophyta, Haptista, Metamonada, Fungi, Sar, Placozoa, Ctenophora , Filasterea, Spiralia, Discoba, Cnidaria, Porifera, Viruses
Consensi: 295552, HMMs: 295552

Partition 1 [ Absent ]: Obtectomera

Partition 2 [ Absent ]: Euteleosteomorpha

Partition 3 [ Absent ]: Sarcopterygii - Sauropsida, Coelacanthimorpha, Amphibia, Dipnomorpha

Partition 4 [ Absent ]: Diptera

Partition 5 [ Absent ]: Viridiplantae

Partition 6 [ Absent ]: Deuterostomia - Chondrichthyes, Hemichordata, Cladistia, Holostei, Tunicata, Cephalochordata, Cyclostomata , Osteoglossocephala, Otomorpha, Elopocephalai, Echinodermata, Chondrostei

Partition 7 [ Absent ]: Hymenoptera

Partition 8 [ Absent ]: Ecdysozoa - Nematoda, Gelechioidea, Yponomeutoidea, Incurvarioidea, Chelicerata, Collembola, Polyneoptera, Tineoidea, Apoditrysia, Monocondylia, Strepsiptera, Palaeoptera, Neuropterida, Crustacea, Coleoptera, Siphonaptera, Trichoptera, Paraneoptera, Myriapoda, Scalidophora

Further documentation on the program may be found here:
/opt/RepeatMasker/repeatmasker.help

Is there anything wrong with the progress of the task? How can I fix it?

hangup error on round5 of RepeatModeler on singularity sif v1.8, v1.85

Describe the bug

The container .sif release versions v1.8 and v1.85 fail in round 5 under Singularity CE v4.0.0 and Singularity 3.11.
Platform: Linux Ubuntu 20.04, a single computer
command: singularity --debug exec --bind /usr/lib/locale/,/media/xx/data12T /media/xx/data12T/tetools/dfam-tetools-latest.sif RepeatModeler -database genome1 -LTRStruct -threads 20

Error information:
Warning: [Query 4144-4145] Query_1 gi|4144 gi|9:40001-80000: Could not calculate ungapped Karlin-Altschul parameters due to an invalid query sequence or its translation. Please verify the query sequence(s) and/or filtering options

tetools_sif.sh: line 74: 29800 Hangup singularity --debug exec --bind /usr/lib/locale/,/media/xx/data12T /media/xx/data12T/tetools/dfam-tetools-latest.sif RepeatModeler -database genome1 -LTRStruct -threads 20

TRF path error while running a Docker

Hello,
I was trying to install RepeatModeler2 via the Docker container, but I am facing a TRF PATH error. I am attaching a screenshot; please help me solve this issue.

[Screenshot: TRF ERROR]

Error running repeatmodeler in container

Hi,

I installed repeatmodeler using singularity pull as follows:

singularity pull dfam-tetools-latest.sif docker://dfam/tetools:latest
singularity run dfam-tetools-latest.sif

Then I have run this (in Ubuntu):

Apptainer> BuildDatabase -engine rmblast -name genome_db genome.fa

But got this error:

/usr/bin/perl: symbol lookup error: /home/mani/local/.t_coffee/perl/lib/perl5/x86_64-linux-gnu-thread-multi/auto/Encode/Encode.so: undefined symbol: Perl_xs_apiversion_bootcheck

Any solutions?

reasonaTE "https://github.com/DerKevinRiehl/transposon_annotation_reasonaTE"

Dear jebrosen

Thank you for the tools. If I use this Docker image, can I run the following commands?

  1. RepeatModeler -help
    BuildDatabase -name demo_index -engine ncbi demo.fasta
    RepeatModeler -engine ncbi -pa 10 -database demo_index

  2. RepeatMasker -help
    RepeatMasker -pa 10 demo.fasta

I ask because I want to combine the results with reasonaTE to get GFF3.

Please give me your suggestion.

Problems configuring RepeatClassifier on docker.

When classifying repeats for Daphnia pulex, a Crustacean, I get >95% unknown families. I've downloaded dfam38_full.8.h5, and have dfam38_full.0.h5 in the RepeatMasker directory. I delete rmlib.config, and when I run ./tetoolsDfamUpdate.pl it seems to correctly detect the dfam library. I set the environmental variable, but when I run RepeatClassifier I still get mostly unknowns. Does RepeatClassifier use RepeatMasker's library? Is there some step I am forgetting? Thanks in advance!

ERRO[10728] error waiting for container: unexpected EOF

I ran "RepeatModeler -pa 10 -database Test -LTRStruct >& 1.run.out"

The log showed "Comparison Time: 00:07:52 (hh:mm:ss) Elapsed Time, 1368539 HSPs Collected

  • RECON: Running imagespread..
    RECON Elapsed: 00:00:02 (hh:mm:ss) Elapsed Time
  • RECON: Running initial definition of elements ( eledef )..
    RECON Elapsed: 00:00:14 (hh:mm:ss) Elapsed Time
  • RECON: Running re-definition of elements ( eleredef ).."

ERRO[10728] error waiting for container: unexpected EOF

I do not know how to solve it. Can you help me?
Thank you!

addRepbase.pl: no such file

Describe the bug

(dfam-tetools /data/fungi/db)$ tar -x -f RepBaseRepeatMaskerEdition-20181026.tar.gz
(dfam-tetools /data/NJ103/db)$ ls Libraries/
README.RMRBSeqs  RMRBSeqs.embl 
(dfam-tetools /data/NJ103/db)$ addRepBase.pl -lib Libraries/
Rebuilding RepeatMaskerLib.h5 master library
  - Read in 49011 sequences from Libraries//RMRBSeqs.embl
    Reading metadata database...EMBL::_parseFromFile() Unable to open RepeatMasker/EMBL file Libraries//RMRBMeta.embl: No such file or directory

I use TETools 1.7 for Docker.
Looking forward to your reply!
Best wishes!
