
Comments (4)

jebrosen commented on August 11, 2024

This is very unfortunate - one of the main goals of the container is to avoid this sort of problem!

The LTR_retriever.log says the following:
Dependency checking: Error: The RMblast engine is not installed in RepeatMasker!

This error message is printed when LTR_retriever attempts to detect if the -e rmblast option will work, by running RepeatMasker -e rmblast on a sample file and checking if it completes. I have seen this fail before, but we test the images before publishing to make sure we don't release the container with errors like this one. It is also working fine for me in my tests so far with -LTRStruct.

I'm not sure (a) how the program can't find a dependency within a container or (b) how rmblast worked for RepeatClassifier but not the LTR pipeline.

LTR_retriever (used in the LTR search pipeline) has its own separate configuration file; in our container image it's hardcoded to the correct paths. My best guess so far (since it has worked elsewhere before) is that either the test LTR_retriever runs happened to fail on your machine, or something else in the environment is interfering (e.g. running out of memory or disk space). Other causes I have seen before include using a network filesystem instead of a local disk, and the memory or CPU usage quotas common on some job-batching systems.

Here are a few things you could try right away with what you already have:

  • Can you check that you have a file named dummy060817.fa.<random>? It should be in the same directory that you found LTR_retriever.log, and it should have a single line of sequence. This is the test file that LTR_retriever uses to determine if RepeatMasker can run.
  • Run the program again, just in case you were unlucky. You can also run LTRPipeline genome.fa standalone to skip RepeatModeler.
  • Try the next preview release of the container. Neither RepeatMasker nor LTR_retriever changed, but it's possible that a bug was fixed in one of the base OS packages (e.g. perl interpreter or libraries). This image is tagged dfam/tetools:1.3-beta-1 on Docker Hub.
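
For the first check in the list above, a small shell helper along these lines might save some typing. It is only a sketch: the function name check_dummy_files is made up for illustration, and it assumes the test file may or may not start with a FASTA header line (">"), so it counts only non-header, non-empty lines.

```shell
# Hypothetical helper: list every dummy060817.fa.* file in a directory
# (default: current directory) and count its sequence lines.
check_dummy_files() {
    local dir="${1:-.}" found=0 f n
    for f in "$dir"/dummy060817.fa.*; do
        [ -e "$f" ] || continue          # glob matched nothing
        found=1
        n=$(grep -c '^[^>]' "$f")        # non-empty lines that are not FASTA headers
        printf '%s: %s sequence line(s)\n' "$f" "$n"
    done
    if [ "$found" -ne 1 ]; then
        echo "no dummy060817.fa.* file in $dir" >&2
        return 1
    fi
}

# Example: check_dummy_files /path/to/directory/with/LTR_retriever.log
```

If the function reports one sequence line, the test input itself looks as expected and the failure is more likely in the RepeatMasker run.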

Further troubleshooting would need some modifications to the LTR_retriever program to have it log more details about what it is testing or how it is failing; I can prepare some kind of patch or alternative container image to assist with this if we need it.


> (I couldn't get a successful install with the wrapper script)

If you don't mind sharing, what went wrong with that approach? If there is anything we can fix or change to make the wrapper script more usable, we definitely want to do that too.


LRFreeborn commented on August 11, 2024

Thanks for the reply. My responses are interleaved below.

> This error message is printed when LTR_retriever attempts to detect if the -e rmblast option will work, by running RepeatMasker -e rmblast

Out of curiosity, I tried RepeatMasker -e rmblast with a random fasta and got the following error:
Building general libraries in: /N/u/layfreeb/Carbonate/.RepeatMaskerCache/CONS-Dfam_3.2/general
RepeatMasker::createLib(): Error invoking /opt/rmblast/bin/makeblastdb on file /N/u/layfreeb/Carbonate/.RepeatMaskerCache/CONS-Dfam_3.2/general/is.lib.

I tried binding this path to is.lib to my .simg but got the same error.

> Can you check that you have a file named dummy060817.fa.<random>? It should be in the same directory that you found LTR_retriever.log, and it should have a single line of sequence. This is the test file that LTR_retriever uses to determine if RepeatMasker can run.

Yes, I do.

> Run the program again, just in case you were unlucky. You can also run LTRPipeline genome.fa standalone to skip RepeatModeler.

I ran singularity exec repmasker.simg LTRPipeline my.fasta but got the same RMBlast engine error.

> Try the next preview release of the container. Neither RepeatMasker nor LTR_retriever changed, but it's possible that a bug was fixed in one of the base OS packages (e.g. perl interpreter or libraries). This image is tagged dfam/tetools:1.3-beta-1 on Docker Hub.

Unfortunately, this didn't fix the problem.

> If you don't mind sharing, what went wrong with that approach? If there is anything we can fix or change to make the wrapper script more usable, we definitely want to do that too.

I'm including a screenshot of what happens when I try using the wrapper script. I tried clearing my cache with singularity cache clean but that didn't seem to work.

And full disclosure, I've just started using containers, so it's totally possible I'm missing something obvious to get dfam-tetools.sh working!

Thank you!

[Screenshot: Screen Shot 2021-02-02 at 3 53 33 PM]


LRFreeborn commented on August 11, 2024

I installed the non-containerized version of RepeatModeler, including all the LTRPipeline dependencies. Oddly enough, I got the same error about the RMBlast engine.

With the non-containerized version, I can run LTR_retriever and LTRPipeline on their own just fine, using my original genome file, e.g.
LTR_retriever -genome mygenome.fasta -inharvest raw-struct-results.txt -noanno
and
LTRPipeline mygenome.fasta

Do you have any idea why I can run these separately but not with -LTRStruct?

From what I can tell looking through GitHub, there's no easy way to combine LTRPipeline with a previous RM run. Is this still true?

Thank you!


jebrosen commented on August 11, 2024

> I'm including a screenshot of what happens when I try using the wrapper script. I tried clearing my cache with singularity cache clean but that didn't seem to work.

I have a guess for why this happened. The default behavior of singularity is to reuse some of your environment from the host system, including your $HOME which includes .bash_profile and/or .bashrc. If you or a system administrator has commands such as module load in any of these files, that command can fail when running inside the container since module is not available.
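
One quick way to test this guess is to look for module calls in those startup files. The helper below is only a sketch: find_module_calls is a hypothetical name, and it checks just the two files mentioned above, so anything they source in turn would need a separate look.

```shell
# Hypothetical helper: print file:line:text for every line in
# ~/.bash_profile and ~/.bashrc that starts with a `module` command.
find_module_calls() {
    local home_dir="${1:-$HOME}" f
    for f in "$home_dir/.bash_profile" "$home_dir/.bashrc"; do
        if [ -f "$f" ]; then
            grep -nH -E '^[[:space:]]*module[[:space:]]+' "$f"
        fi
    done
    return 0   # finding no matches is not an error
}

# Example: find_module_calls
```

Any line it prints is a candidate for the failure inside the container.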

I think you are not seeing this issue with your own commands because you are running a command directly (e.g. singularity exec file.simg RepeatMasker). You can actually do this with dfam-tetools.sh too: dfam-tetools.sh -- RepeatMasker ... . The README does not currently document this ability; I can add that, especially if it is helpful for you or solves the original problem!

Another solution might be to use the --no-home flag to avoid using the host system .bashrc inside the container: https://sylabs.io/guides/3.7/user-guide/bind_paths_and_mounts.html#using-no-home-and-containall-flags. You could try this yourself by adding it to the singularity exec line around line 128 of dfam-tetools.sh. We might add this option to dfam-tetools.sh in the future, since it is a bit closer to the way the container is run if you are using docker. However, some users might prefer the default behavior so we may need to consider other options or approaches.
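
Before editing dfam-tetools.sh, the flag can be tried directly on the command line. The wrapper below is a rough sketch rather than anything from dfam-tetools.sh itself: run_no_home is a made-up name, repmasker.simg is the image file name used earlier in this thread, and the SINGULARITY variable is only an illustrative override so the command can be previewed without running it. If your input files live under $HOME, you may also need to bind that directory explicitly with --bind.

```shell
# Hypothetical wrapper: run a command inside an image with --no-home,
# so the host's .bashrc / .bash_profile are not picked up.
run_no_home() {
    local image="$1"
    shift
    # SINGULARITY is an assumed override hook, e.g. SINGULARITY=echo for a dry run.
    "${SINGULARITY:-singularity}" exec --no-home "$image" "$@"
}

# Example: run_no_home repmasker.simg RepeatMasker -e rmblast my.fasta
```

Setting SINGULARITY=echo in front of the call prints the arguments that would be passed to singularity instead of running them.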


As for the issue you saw with the other container image: you will probably still have this problem if you use dfam-tetools.sh, depending on the reason it failed.

> Out of curiosity, I tried RepeatMasker -e rmblast with a random fasta and got the following error:
> Building general libraries in: /N/u/layfreeb/Carbonate/.RepeatMaskerCache/CONS-Dfam_3.2/general
> RepeatMasker::createLib(): Error invoking /opt/rmblast/bin/makeblastdb on file /N/u/layfreeb/Carbonate/.RepeatMaskerCache/CONS-Dfam_3.2/general/is.lib.

Do you have a file /N/u/layfreeb/Carbonate/.RepeatMaskerCache/CONS-Dfam_3.2/general/rmblastdb.log? It might include the original cause of the error. This is another issue that might be fixed or fail in a different way if you use the --no-home flag with singularity.
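
Something like the following would show the end of that log if it exists. This is a sketch with assumptions: show_rmblast_log is a hypothetical name, and the default path assumes that /N/u/layfreeb/Carbonate is your $HOME; pass the full path as an argument if it is not.

```shell
# Hypothetical helper: print the last 20 lines of rmblastdb.log, or
# report that the log is missing or empty.
show_rmblast_log() {
    # Default path is an assumption based on the cache directory in the error message.
    local log="${1:-$HOME/.RepeatMaskerCache/CONS-Dfam_3.2/general/rmblastdb.log}"
    if [ -s "$log" ]; then
        tail -n 20 "$log"
    else
        echo "no log (or empty log) at $log" >&2
        return 1
    fi
}

# Example: show_rmblast_log
```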


> I installed the non-containerized version of RepeatModeler, including all the LTRPipeline dependencies. Oddly enough, I got the same error about the RMBlast engine.
>
> With the non-containerized version, I can run LTR_retriever and LTRPipeline on their own just fine, using my original genome file, e.g.
> LTR_retriever -genome mygenome.fasta -inharvest raw-struct-results.txt -noanno
> and
> LTRPipeline mygenome.fasta

> Do you have any idea why I can run these separately but not with -LTRStruct?

This is surprising to me! -LTRStruct should not be too different from running LTRPipeline individually, since it simply runs LTRPipeline as one of the steps. I wonder if this is the same or another problem with running makeblastdb; we have been getting more and more reports of failures related to these steps on some machines and environments but only for some users.

> From what I can tell looking through GitHub, there's no easy way to combine LTRPipeline with a previous RM run. Is this still true?

This is still true (sorry to say!); we have not yet had the time and attention for that specific issue.
