metablast's People

Contributors

maestsi

Forkers

vikash84

metablast's Issues

recover sequences by taxonomy

Dear Simone,
Thanks for developing this useful tool.
Could you add a script that extracts the blasted sequences according to the taxonomy assigned in the collapsed table?
Cheers,
Lapo
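
One way such an extraction could look is sketched below. It is not part of MetaBlast: the function name extract_by_taxon and the two-column, tab-separated table (read ID, then taxon name) are assumptions made for illustration, and the real collapsed table may need to be joined back to per-read BLAST hits first.

```shell
#!/usr/bin/env bash
# Sketch only. Assumed (hypothetical) inputs:
#   table: tab-separated, column 1 = read ID, column 2 = taxon name
#   fasta: the reads that were blasted
# Neither layout is confirmed by the MetaBlast documentation.

extract_by_taxon() {
    local taxon="$1" table="$2" fasta="$3"
    awk -F'\t' -v taxon="$taxon" '
        # First file (the table): remember IDs assigned to the requested taxon.
        FILENAME == ARGV[1] { if ($2 == taxon) keep[$1] = 1; next }
        # Second file (the FASTA): on each header, take the ID up to the first
        # whitespace and decide whether this record should be printed.
        /^>/ { id = substr($0, 2); sub(/[ \t].*/, "", id); printing = (id in keep) }
        # Print header and sequence lines of records that were selected.
        printing
    ' "$table" "$fasta"
}
```

Called as extract_by_taxon "Escherichia coli" hits.tsv reads.fasta (both file names are placeholders), it writes the matching FASTA records to standard output.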

Problem running MetaBlast using your docker container

Hi,

We run all of our software on our compute cluster using Docker containers. I attempted to use your Docker implementation of MetaBlast (maestsi/metablast:latest), but I am encountering some errors. I followed the procedure described in your documentation for downloading the nt_db from NCBI, and that is what I am blasting against. I have a FASTA file of 150 bp Illumina reads that I am attempting to blast. These are reads from a human sample that did not map to the CHM13 human reference sequence. I am trying to use MetaBlast as a contamination screen to determine what these unmapped reads are.

Below are the steps that I am following:

  1. We run everything in an LSF cluster. This is the command I used on our system to pull your Docker image and start an interactive shell in the container:
LSF_DOCKER_PRESERVE_ENVIRONMENT=false LSF_DOCKER_VOLUMES="/storage1/fs1/hprc/Active:/storage1/fs1/hprc/Active /scratch1/fs1/hprc:/scratch1/fs1/hprc" bsub -n 32 -Is -G compute-hprc -q ccdg-interactive -R "select[mem>260000] rusage[mem=260000] span[hosts=1]" -a 'docker(maestsi/metablast:latest)' /bin/bash
Job <525263> is submitted to queue <ccdg-interactive>.
<<Waiting for dispatch ...>>
<<Starting on compute1-exec-224.ris.wustl.edu>>
latest: Pulling from maestsi/metablast
Digest: sha256:800406014a57958091bd34a82dffdc3623925b4300bee230c9d411912398b37b
Status: Image is up to date for maestsi/metablast:latest
docker.io/maestsi/metablast:latest
(base) ctomlins@compute1-exec-224:~$ 
  2. Next I ran the command to activate the MetaBlast_env environment. I then ran which blastn to show where the blastn executable is found within the container:
(base) ctomlins@compute1-exec-224:~$ conda activate MetaBlast_env
(MetaBlast_env) ctomlins@compute1-exec-224:~$ which blastn 
/opt/conda/envs/MetaBlast_env/bin/blastn
  3. I then cd to the directory on my filesystem where I want the MetaBlast output files to be generated and run the following command. The FASTA file of reads, HG00621_replicate_1_liftoff_CHM13.UNMAPPED.fasta, is in my working directory:

(MetaBlast_env) ctomlins@compute1-exec-224:/storage1/fs1/hprc/Active/shared/HPRC_Annotation_PhaseI/Contamination_Screen/TEST_1_HG00621_1_CHM13_liftoff_Screen$ /home/tools/MetaBlast/MetaBlast.sh -f HG00621_replicate_1_liftoff_CHM13.UNMAPPED.fasta -db /storage1/fs1/hprc/Active/shared/HPRC_Annotation_PhaseI/Contamination_Screen/NCBI_nt_db/nt

Below is what is printed to the terminal immediately after executing the above command:

/home/tools/MetaBlast/config_MetaBlast.sh: line 32: -: No such file or directory
/home/tools/MetaBlast/config_MetaBlast.sh: line 33: -: No such file or directory
Fasta reads: /storage1/fs1/hprc/Active/shared/HPRC_Annotation_PhaseI/Contamination_Screen/TEST_1_HG00621_1_CHM13_liftoff_Screen/HG00621_replicate_1_liftoff_CHM13.UNMAPPED.fasta
Blast indexed db: /storage1/fs1/hprc/Active/shared/HPRC_Annotation_PhaseI/Contamination_Screen/NCBI_nt_db/nt
/home/tools/MetaBlast/MetaBlast.sh: line 59: bc: command not found

Then the citation notice for GNU Parallel prints to stdout, immediately followed by a large number of lines reading
/bin/bash: line 1: /envs/MetaBlast_env/bin/blastn: No such file or directory

Academic tradition requires you to cite works you base your article on.
If you use programs that use GNU Parallel to process data for an article in a
scientific publication, please cite:

  Tange, O. (2022, February 22). GNU Parallel 20220222 ('Donetsk Luhansk').
  Zenodo. https://doi.org/10.5281/zenodo.6213471

This helps funding further development; AND IT WON'T COST YOU A CENT.
If you pay 10000 EUR you should feel free to use GNU Parallel without citing.

More about funding GNU Parallel and the citation notice:
https://www.gnu.org/software/parallel/parallel_design.html#Citation-notice

To silence this citation notice: run 'parallel --citation' once.

/bin/bash: line 1: /envs/MetaBlast_env/bin/blastn: No such file or directory
/bin/bash: line 1: /envs/MetaBlast_env/bin/blastn: No such file or directory
/bin/bash: line 1: /envs/MetaBlast_env/bin/blastn: No such file or directory
/bin/bash: line 1: /envs/MetaBlast_env/bin/blastn: No such file or directory
/bin/bash: line 1: /envs/MetaBlast_env/bin/blastn: No such file or directory

After several thousand lines of /bin/bash: line 1: /envs/MetaBlast_env/bin/blastn: No such file or directory
the following prints to stdout and then the job just hangs:

/bin/bash: line 1: /envs/MetaBlast_env/bin/blastn: No such file or directory
/bin/bash: line 1: /envs/MetaBlast_env/bin/blastn: No such file or directory
Option j requires an argument
Usage:

parallel [options] [command [arguments]] < list_of_arguments
parallel [options] [command [arguments]] (::: arguments|:::: argfile(s))...
cat ... | parallel --pipe [options] [command [arguments]]

-j n            Run n jobs in parallel
-k              Keep same order
-X              Multiple arguments with context replace
--colsep regexp Split input on regexp for positional replacements
{} {.} {/} {/.} {#} {%} {= perl code =} Replacement strings
{3} {3.} {3/} {3/.} {=3 perl code =}    Positional replacement strings
With --plus:    {} = {+/}/{/} = {.}.{+.} = {+/}/{/.}.{+.} = {..}.{+..} =
                {+/}/{/..}.{+..} = {...}.{+...} = {+/}/{/...}.{+...}

-S sshlogin     Example: foo@server.example.com
--slf ..        Use ~/.parallel/sshloginfile as the list of sshlogins
--trc {}.bar    Shorthand for --transfer --return {}.bar --cleanup
--onall         Run the given command with argument on all sshlogins
--nonall        Run the given command with no arguments on all sshlogins

--pipe          Split stdin (standard input) to multiple jobs.
--recend str    Record end separator for --pipe.
--recstart str  Record start separator for --pipe.

GNU Parallel can do much more. See 'man parallel' for details

Academic tradition requires you to cite works you base your article on.
If you use programs that use GNU Parallel to process data for an article in a
scientific publication, please cite:

  Tange, O. (2022, February 22). GNU Parallel 20220222 ('Donetsk Luhansk').
  Zenodo. https://doi.org/10.5281/zenodo.6213471

This helps funding further development; AND IT WON'T COST YOU A CENT.
If you pay 10000 EUR you should feel free to use GNU Parallel without citing.

I was wondering if you had any suggestions for what I might need to modify to get this to run on my system.
I have not modified anything within your maestsi/metablast:latest Docker image, and I am relying entirely on the code stored in it (none of the code is installed on my local system). Do I need to modify the config_MetaBlast.sh file within the image to get this to work? It seems to have the correct settings already specified.

Any assistance that you could provide to get this up and running would be very helpful.

Best,
Chad
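
The messages quoted in this report point at two things that can be checked from inside the container before rerunning MetaBlast.sh: bc is reported as not found, and blastn is being called as /envs/MetaBlast_env/bin/blastn while which blastn resolves to /opt/conda/envs/MetaBlast_env/bin/blastn. The sketch below only tests for those conditions. The helper names missing_tools and check_executable are made up for illustration, and the idea that the missing bc leaves the parallel -j value empty is a guess, not something the log confirms.

```shell
#!/usr/bin/env bash
# Pre-flight sketch to run inside the container. The tool names and both
# blastn paths are copied from the messages in this report; nothing here is
# part of MetaBlast itself.

# Print the name of every listed command that is not found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Report whether the given path exists and is executable.
check_executable() {
    if [ -x "$1" ]; then
        printf 'ok: %s\n' "$1"
    else
        printf 'not executable or missing: %s\n' "$1"
    fi
}

# Tools that appear in the error output or the pipeline's messages.
missing_tools bc parallel blastn

# The path from the error lines, then the path reported by `which blastn`.
check_executable /envs/MetaBlast_env/bin/blastn
check_executable /opt/conda/envs/MetaBlast_env/bin/blastn
```

If bc is listed as missing, or only the /opt/conda path is reported as ok, that would be consistent with the errors above and would narrow down which settings in config_MetaBlast.sh to look at.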
