aaftf's People

Contributors

gamcil, hyphaltip, nextgenusfs

aaftf's Issues

masurca polca.sh breaks with recent samtools

polca.sh (part of @alekseyzimin/masurca) appears to need an older version of samtools. I have provided a fix for this in the patches folder of the current code, which can be used to update the polca.sh script installed with masurca in place.

Support single-end and interleaved fastq data

Support single-end read data for cleanup and assembly.

Support interleaved read data. The reads may still have to be split into forward and reverse files, but the program should do this rather than require the user to.
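
The splitting step mentioned above could look roughly like the sketch below. It is not AAFTF code: the function names are hypothetical, and it assumes strict 4-line FASTQ records with mates stored back to back.

```python
import gzip
from itertools import islice


def open_maybe_gzip(path, mode="rt"):
    """Open a plain or gzip-compressed text file based on its extension."""
    if str(path).endswith(".gz"):
        return gzip.open(path, mode)
    return open(path, mode)


def split_interleaved(interleaved, fwd_out, rev_out):
    """Write alternating 4-line FASTQ records to forward and reverse files.

    Assumes each record is exactly 4 lines and that mates are adjacent.
    Returns the number of read pairs written.
    """
    pairs = 0
    with open_maybe_gzip(interleaved) as src, \
            open_maybe_gzip(fwd_out, "wt") as fwd, \
            open_maybe_gzip(rev_out, "wt") as rev:
        while True:
            record = list(islice(src, 8))  # two 4-line records = one pair
            if not record:
                break
            if len(record) != 8:
                raise ValueError("interleaved FASTQ ended mid-pair")
            fwd.writelines(record[:4])
            rev.writelines(record[4:])
            pairs += 1
    return pairs
```

A call such as `split_interleaved("reads.fastq.gz", "reads_1.fastq.gz", "reads_2.fastq.gz")` (file names are placeholders) would produce the two files that the paired-end steps expect.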

AAFTF pipeline always fails at vecscreen step

I had no issues running each step of the pipeline independently, but when I tried to run AAFTF in pipeline mode it always failed at the vecscreen step. It turns out there is simply a typo on line 137 of the pipeline.py script: a stray space inside the filename suffix. Can you remove the space?

# please change
    if not checkfile(basename + ' .vecscreen.fasta'):
# to
    if not checkfile(basename + '.vecscreen.fasta'):

Generate a command list from a top-level run

Run the tool with all-in-one options so that it generates a Bash script or Makefile that runs all the sub-steps, so that the user does not have to write a script covering every step.

Support "smart" restart options in the pipeline so that previously run steps are not re-done.

This seems like a job for snakemake or makefiles rather than shell scripts.
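
As a rough illustration of the "smart" restart idea, a wrapper could skip any step whose output file already exists and is non-empty. This is only a sketch under that assumption; `run_step` and its arguments are hypothetical and are not part of AAFTF.

```python
import os
import subprocess


def run_step(name, command, output, runner=subprocess.run):
    """Run `command` only if `output` is missing or empty.

    `runner` defaults to subprocess.run and can be swapped out for testing.
    Returns True if the command was executed, False if it was skipped.
    """
    if os.path.isfile(output) and os.path.getsize(output) > 0:
        print(f"[skip] {name}: {output} already exists")
        return False
    print(f"[run]  {name}")
    runner(command, check=True)  # raises CalledProcessError on failure
    return True
```

Checking only for a non-empty output file is a weaker guarantee than the timestamp and dependency tracking that make or snakemake provide, which is the trade-off noted in the comment above.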

AAFTF pipeline exiting when mito fails

Hello,

Is it possible to run the AAFTF pipeline while skipping the mito step? I get an error from NOVOPlasty about an invalid seed even though the seed is valid, so it might be that no mitochondrial reads are present. Is there a way to make the pipeline ignore the mito step and continue with the rest?

Thank you in advance.
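
When the steps are run one at a time from a wrapper script instead of in pipeline mode, a step can be treated as optional so that its failure does not stop the rest. The sketch below is an assumption about how such a wrapper might look; `run_optional` is a hypothetical helper, not an AAFTF option.

```python
import subprocess
import sys


def run_optional(name, command):
    """Run a step that is allowed to fail; report the error and continue.

    Returns True if the command exited successfully, False otherwise.
    """
    try:
        subprocess.run(command, check=True)
    except (subprocess.CalledProcessError, FileNotFoundError) as err:
        print(f"[warn] optional step '{name}' failed: {err}", file=sys.stderr)
        return False
    return True
```

The wrapper would call `run_optional` for the mito step and an ordinary `subprocess.run(..., check=True)` for the steps that must succeed.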

aaftf filter URL issue

Hello! I tried to run the script for aaftf filter, but the terminal always gave me the same error:
Running AAFTF v0.5.0
error with url https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/819/615/GCF_000819615.1_ViralProj14015/GCF_000819615.1_ViralProj14015_genomic.fna.gz aaftf-filter_85caa12f/GCF_000819615.1_ViralProj14015_genomic.fna.gz
Traceback (most recent call last):
  File "/mnt/home/dematth2/anaconda3/envs/aaftf/bin/AAFTF", line 8, in <module>
    sys.exit(main())
  File "/mnt/home/dematth2/anaconda3/envs/aaftf/lib/python3.7/site-packages/AAFTF/AAFTF_main.py", line 1113, in main
    args.func(parser, args)
  File "/mnt/home/dematth2/anaconda3/envs/aaftf/lib/python3.7/site-packages/AAFTF/AAFTF_main.py", line 53, in run_subtool
    submodule.run(parser, args)
  File "/mnt/home/dematth2/anaconda3/envs/aaftf/lib/python3.7/site-packages/AAFTF/filter.py", line 65, in run
    earliest_file_age = os.path.getctime(acc_file)
  File "/mnt/home/dematth2/anaconda3/envs/aaftf/lib/python3.7/genericpath.py", line 65, in getctime
    return os.stat(filename).st_ctime
FileNotFoundError: [Errno 2] No such file or directory: 'aaftf-filter_85caa12f/GCF_000819615.1_ViralProj14015_genomic.fna.gz'
======= This is where my script ends! =========

I tried modifying the link: if I delete the last part and paste it into my browser, I can download the Escherichia coli sequence, but nothing changed in the terminal and I got the same error.
I also tried modifying the script, first by adding the genome directory (I had downloaded the genome beforehand) with the -u option, but it keeps insisting on the URL. I then tried supplying the correct URL instead, but it still reports an error for that initial URL.

Thank you 👍
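
The traceback above shows `os.path.getctime` being called on a file whose download had already failed. One possible way to guard against that is sketched below; it is not the actual filter.py code, and `fetch_if_needed` and `oldest_ctime` are hypothetical names.

```python
import os
import urllib.error
import urllib.request


def fetch_if_needed(url, dest):
    """Download `url` to `dest` unless a non-empty copy already exists.

    Returns the path on success, or None if the download failed.
    """
    if os.path.isfile(dest) and os.path.getsize(dest) > 0:
        return dest
    os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
    try:
        urllib.request.urlretrieve(url, dest)
    except (urllib.error.URLError, OSError) as err:
        print(f"could not download {url}: {err}")
        return None
    return dest if os.path.isfile(dest) else None


def oldest_ctime(paths):
    """Return the earliest ctime among files that actually exist, else None."""
    times = [os.path.getctime(p) for p in paths if p and os.path.isfile(p)]
    return min(times) if times else None
```

With this shape, a failed download yields `None` and is skipped, so the age check only ever sees files that are present on disk.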

error when running AAFTF filter and AAFTF vecscreen

Hi, dear AAFTF team,
When I run the AAFTF filter and AAFTF vecscreen commands, I run into the following error:

Traceback (most recent call last):
  File "/home/liangdong/opt/anaconda3/bin/AAFTF", line 8, in <module>
    sys.exit(main())
  File "/home/liangdong/opt/anaconda3/lib/python3.10/site-packages/AAFTF/AAFTF_main.py", line 936, in main
    args.func(parser, args)
  File "/home/liangdong/opt/anaconda3/lib/python3.10/site-packages/AAFTF/AAFTF_main.py", line 47, in run_subtool
    submodule.run(parser, args)
  File "/home/liangdong/opt/anaconda3/lib/python3.10/site-packages/AAFTF/vecscreen.py", line 285, in run
    urllib.request.urlretrieve(url, file)
  File "/home/liangdong/opt/anaconda3/lib/python3.10/urllib/request.py", line 241, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/home/liangdong/opt/anaconda3/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/home/liangdong/opt/anaconda3/lib/python3.10/urllib/request.py", line 525, in open
    response = meth(req, response)
  File "/home/liangdong/opt/anaconda3/lib/python3.10/urllib/request.py", line 634, in http_response
    response = self.parent.error(
  File "/home/liangdong/opt/anaconda3/lib/python3.10/urllib/request.py", line 563, in error
    return self._call_chain(*args)
  File "/home/liangdong/opt/anaconda3/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/home/liangdong/opt/anaconda3/lib/python3.10/urllib/request.py", line 643, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

I'm not sure what happened during this process. Is it an issue with my internet connection? If so, is there a proxy or mirror site I can use in China?
Anyway, here are my commands:

(1) AAFTF filter -c 16 --memory 48 --aligner bbduk -o ${sp}_filter --left ${sp}_trim_1P.fastq.gz --right ${sp}_trim_2P.fastq.gz --pipe --AAFTF_DB ./ref_genome
(2) AAFTF vecscreen -c 16 -i $sp.spades.assembly.fa -o $sp.assembly.vecscreen.out -s high --pipe

The following are my Python version and installation path:
python version: 3.10.9 (main, Mar 1 2023, 18:23:06) [GCC 11.2.0] on linux
installation path: /home/liangdong/opt/anaconda3/bin/python
I installed AAFTF with pip install.

Thanks and best regards
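
On the connection question: an `HTTPError` such as the 404 above means the server was reached and answered, whereas a plain `URLError` usually indicates that the host could not be contacted at all. The sketch below shows one way to tell the two apart with the standard library; `classify_fetch_error` and `probe` are hypothetical helpers, not part of AAFTF.

```python
import urllib.error
import urllib.request


def classify_fetch_error(err):
    """Describe whether a urllib error came from the server or the network."""
    # HTTPError is a subclass of URLError, so it must be checked first.
    if isinstance(err, urllib.error.HTTPError):
        return f"server reachable, returned HTTP {err.code}"
    if isinstance(err, urllib.error.URLError):
        return f"could not reach server: {err.reason}"
    return f"unexpected error: {err}"


def probe(url, timeout=30):
    """Try to open `url` and return 'ok' or a short diagnosis string."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return "ok"
    except OSError as err:  # URLError and HTTPError both derive from OSError
        return classify_fetch_error(err)
```

If `probe` reports an HTTP status for the URL that vecscreen.py requests, the more likely cause is that the file has moved on the server, not a local connection problem.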

spades assembly failed

Hello,

I am trying to run the AAFTF pipeline to assemble several Pleurotus genomes (testing it on one genome only), and I decided to run it as a pipeline at first. I am getting the error below. SPAdes seems to fail, but I cannot find any SPAdes .log file anywhere. What do you think?
I am running it on an HPC cluster that uses SLURM; please see the attached SLURM output and submitted sbatch file.

Additionally,

  1. I am not totally sure of the difference between these options (see the AAFTF pipeline -h output below):
    --tmpdir TMPDIR Assembler temporary dir and -w WORKDIR, --workdir WORKDIR temp directory
  2. I am also unsure how to pass parameters to SPAdes using --assembler_args ASSEMBLER_ARGS Additional SPAdes/Megahit arguments, if that is possible, for example different k-mer sizes.
    Please let me know if you would like me to put these in a different issue ticket.
    Thanks much!
    Gian
[benucci@dev-amd20 code]$ conda activate aaftf
(aaftf) [benucci@dev-amd20 code]$ AAFTF pipeline -h
usage: AAFTF pipeline [-h] [-q] [--tmpdir TMPDIR] [--assembler_args ASSEMBLER_ARGS] [--method METHOD] -l LEFT [-r RIGHT] -o BASENAME [-c cpus]
                      [-m MEMORY] [-ml MINLEN] [-a [SCREEN_ACCESSIONS ...]] [-u [SCREEN_URLS ...]] [-it ITERATIONS] [-mc MINCONTIGLEN]
                      [--AAFTF_DB AAFTF_DB] [-w WORKDIR] [-v] -p PHYLUM [PHYLUM ...] [--sourdb SOURDB] [--mincovpct MINCOVPCT]

Run entire AAFTF pipeline automagically

options:
  -h, --help            show this help message and exit
  -q, --quiet           Do not output warnings to stderr
  --tmpdir TMPDIR       Assembler temporary dir
  --assembler_args ASSEMBLER_ARGS
                        Additional SPAdes/Megahit arguments
  --method METHOD       Assembly method: spades, dipspades, megahit
  -l LEFT, --left LEFT  left/forward reads of paired-end FASTQ or single-end FASTQ.
  -r RIGHT, --right RIGHT
                        right/reverse reads of paired-end FASTQ.
  -o BASENAME, --out BASENAME
                        Output basename, default to base name of --left reads
  -c cpus, --cpus cpus  Number of CPUs/threads to use.
  -m MEMORY, --memory MEMORY
                        Memory (in GB) setting for SPAdes. Default is Auto
  -ml MINLEN, --minlen MINLEN
                        Minimum read length after trimming, default: 75
  -a [SCREEN_ACCESSIONS ...], --screen_accessions [SCREEN_ACCESSIONS ...]
                        Genbank accession number(s) to screen out from initial reads.
  -u [SCREEN_URLS ...], --screen_urls [SCREEN_URLS ...]
                        URLs to download and screen out initial reads.
  -it ITERATIONS, --iterations ITERATIONS
                        Number of Pilon Polishing iterations to run
  -mc MINCONTIGLEN, --mincontiglen MINCONTIGLEN
                        Minimum length of contigs to keep
  --AAFTF_DB AAFTF_DB   Path to AAFTF resources, defaults to $AAFTF_DB
  -w WORKDIR, --workdir WORKDIR
                        temp directory
  -v, --debug           Provide debugging messages
  -p PHYLUM [PHYLUM ...], --phylum PHYLUM [PHYLUM ...]
                        Phylum or Phyla to keep matches, i.e. Ascomycota
  --sourdb SOURDB       SourMash LCA k-31 taxonomy database
  --mincovpct MINCOVPCT
                        Minimum percent of N50 coverage to remove

aaftf_piperun.zip
