ioga's Introduction

IOGA

Iterative Organellar Genome Assembly

⚠️ This project is no longer maintained, and no updates or new versions will be released ⚠️

IOGA was used to assemble chloroplast genomes for a range of herbarium samples, and was published in The Biological Journal of the Linnean Society on 07/08/2015. This repository contains the code that was used for the paper (specifically, code at commit b65d22a14cffdc72f295c85f9e02ee8d5f923d5b), and mainly serves as documentation.

If you use IOGA, please cite:

Bakker et al. 2015, Herbarium genomics: plastome sequence assembly from a range of herbarium specimens using an Iterative Organelle Genome Assembly pipeline, Biol. J. Linnean Soc.


  • Typical runtime on 4 threads is ~20 minutes.
  • Written in Python.
  • Uses the BBmap suite to map reads and to do quality-filtering/adapter-trimming.
  • Comes with a script to download and install dependencies: setup_IOGA.py

Dependencies: Python2, BioPython, BBmap, SOAPdenovo2, SeqTK, SPAdes.py, ALE, Samtools 0.1.19, Picardtools

INSTALL:

  • run setup_IOGA.py to download the dependencies; this creates IOGA_config.json
  • run IOGA.py -h

NOTES:

  • BBmap outputs per-contig coverage stats, which can be used to identify the chloroplast inverted repeats.
  • A final step that BLASTs the assembly against the input reference, to filter out contigs with no hits at all, is still required.
  • Random subsampling to counter excessive coverage is not implemented. If your sample has a lot of organellar reads, you probably want to reduce the number of reads to work with. This is generally the case, and it also speeds things up considerably, so you might want to do it anyway. Use seqtk sample [reads.fastq] 1000000 > [1million.reads.fastq] to reduce excessive coverage.
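The subsampling note above can also be sketched in Python for paired-end data, where the same read pairs must be kept in both files. This is only an illustration and not part of IOGA: the function names, the seed value, and the assumption of unwrapped four-line FASTQ records are all hypothetical. It uses reservoir sampling, so the kept pairs are held in memory.

```python
import random


def read_fastq(handle):
    """Yield FASTQ records as 4-line tuples (assumes unwrapped 4-line records)."""
    while True:
        record = tuple(handle.readline() for _ in range(4))
        if not record[0]:
            return
        yield record


def subsample_pairs(forward_path, reverse_path, out_forward, out_reverse, n, seed=100):
    """Keep n random read pairs (reservoir sampling); return the number kept."""
    rng = random.Random(seed)
    reservoir = []
    with open(forward_path) as fwd, open(reverse_path) as rev:
        for i, pair in enumerate(zip(read_fastq(fwd), read_fastq(rev))):
            if i < n:
                # Fill the reservoir with the first n pairs.
                reservoir.append(pair)
            else:
                # Replace an existing entry with probability n / (i + 1).
                j = rng.randint(0, i)
                if j < n:
                    reservoir[j] = pair
    with open(out_forward, "w") as fwd_out, open(out_reverse, "w") as rev_out:
        for fwd_rec, rev_rec in reservoir:
            fwd_out.writelines(fwd_rec)
            rev_out.writelines(rev_rec)
    return len(reservoir)
```

Because each forward record is sampled together with its mate, the two output files stay in sync. If there are fewer than n pairs in the input, all of them are kept.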

ioga's People

Contributors

holmrenser, rvosa, saravandekerke

ioga's Issues

Running error

Hi Renser,

When I was using this command "python IOGA.py -r ~/reference/reference.fasta -n name -1 file_1.fq -2 file_2.fq -i 300 -t 5 -m 0 -v" to run, an error was reported as below:

Traceback (most recent call last):
  File "/lustre/apps/bio-software/IOGA/IOGA.py", line 432, in <module>
    main(args.reference,args.name,args.forward,args.reverse,args.threads,args.insertsize,args.maxrounds,args.verbose)
  File "/lustre/apps/bio-software/IOGA/IOGA.py", line 377, in main
    source,FP,RP,final_iteration = IOGA_loop(name,ref,forward,reverse,insertsize,threads,maxrounds)
  File "/lustre/apps/bio-software/IOGA/IOGA.py", line 350, in IOGA_loop
    FP,RP = extract_reads(folder,prefix,rmdup_merged,forward,reverse)
  File "/lustre/apps/bio-software/IOGA/IOGA.py", line 179, in extract_reads
    for line in subprocess.check_output([config['samtools'],'view','-S',samfile],stderr=fnull).split('\n'):
  File "/lustre/apps/bio-software/Anaconda2/lib/python2.7/subprocess.py", line 566, in check_output
    process = Popen(stdout=PIPE, *popenargs, **kwargs)
  File "/lustre/apps/bio-software/Anaconda2/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/lustre/apps/bio-software/Anaconda2/lib/python2.7/subprocess.py", line 1335, in _execute_child
    raise child_exception
OSError: [Errno 13] Permission denied

Could you please tell me how to fix it? Thanks.

Best wishes,
Wen-Bin
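The traceback above fails while launching the program stored in config['samtools'], and Errno 13 at that point usually means the configured file exists but the current user is not allowed to execute it. A minimal check, assuming the configuration is a flat mapping of tool name to file path (the real layout of IOGA_config.json may differ, and find_unusable_tools is a hypothetical helper), could look like this:

```python
import os


def find_unusable_tools(config):
    """Return {tool: reason} for configured paths that cannot be executed.

    `config` is assumed to map tool names to file paths.
    """
    problems = {}
    for tool, path in config.items():
        if not os.path.isfile(path):
            problems[tool] = "no such file"
        elif not os.access(path, os.X_OK):
            problems[tool] = "not executable by the current user"
    return problems
```

The mapping could be loaded with json.load from the config file before calling the helper. An empty result means every configured path is present and executable.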

SOAPdenovo2 segfaults on Arch Linux

Dear Holm,

I am running IOGA on a dataset (paired-end Illumina reads) from which I had successfully assembled complete plastid genomes in the past. When running IOGA on the same dataset today, IOGA reports that SOAPdenovo2 fails under every k-mer level evaluated. IOGA then exits with an index error for variable best, which is probably empty:

Iteration 1
[BAR02A_S1_L001.1] BBmap
[BAR02A_S1_L001.1] Sorting SAM files
[BAR02A_S1_L001.1] Removing duplicates
[BAR02A_S1_L001.1] Extracting reads
[BAR02A_S1_L001.1] Writing forward reads
[BAR02A_S1_L001.1] Writing reverse reads
[BAR02A_S1_L001.1] Old size = 0
[BAR02A_S1_L001.1] New size = 3093513
[BAR02A_S1_L001.1] Running SOAPdenovo2 k = 33 -- 10:32:42.149468
[BAR02A_S1_L001.1] SOAPdenovo2 k = 33 failed
[BAR02A_S1_L001.1] Running SOAPdenovo2 k = 37 -- 10:32:42.313427
[BAR02A_S1_L001.1] SOAPdenovo2 k = 37 failed
[BAR02A_S1_L001.1] Running SOAPdenovo2 k = 41 -- 10:32:42.499712
[BAR02A_S1_L001.1] SOAPdenovo2 k = 41 failed
[BAR02A_S1_L001.1] Running SOAPdenovo2 k = 45 -- 10:32:42.672296
[BAR02A_S1_L001.1] SOAPdenovo2 k = 45 failed
[BAR02A_S1_L001.1] Running SOAPdenovo2 k = 49 -- 10:32:42.864960
[BAR02A_S1_L001.1] SOAPdenovo2 k = 49 failed
[BAR02A_S1_L001.1] Running SOAPdenovo2 k = 53 -- 10:32:43.026761
[BAR02A_S1_L001.1] SOAPdenovo2 k = 53 failed
[BAR02A_S1_L001.1] Running SOAPdenovo2 k = 57 -- 10:32:43.212315
[BAR02A_S1_L001.1] SOAPdenovo2 k = 57 failed
[BAR02A_S1_L001.1] Running SOAPdenovo2 k = 61 -- 10:32:43.401318
[BAR02A_S1_L001.1] SOAPdenovo2 k = 61 failed
[BAR02A_S1_L001.1] Running SOAPdenovo2 k = 65 -- 10:32:43.580864
[BAR02A_S1_L001.1] SOAPdenovo2 k = 65 failed
[BAR02A_S1_L001.1] Running SOAPdenovo2 k = 69 -- 10:32:43.760832
[BAR02A_S1_L001.1] SOAPdenovo2 k = 69 failed
[BAR02A_S1_L001.1] Running SOAPdenovo2 k = 73 -- 10:32:43.963374
[BAR02A_S1_L001.1] SOAPdenovo2 k = 73 failed
[BAR02A_S1_L001.1] Running SOAPdenovo2 k = 77 -- 10:32:44.130588
[BAR02A_S1_L001.1] SOAPdenovo2 k = 77 failed
[BAR02A_S1_L001.1] Running SOAPdenovo2 k = 81 -- 10:32:44.322826
[BAR02A_S1_L001.1] SOAPdenovo2 k = 81 failed
[BAR02A_S1_L001.1] Running SOAPdenovo2 k = 85 -- 10:32:44.507801
[BAR02A_S1_L001.1] SOAPdenovo2 k = 85 failed
[BAR02A_S1_L001.1] Running SOAPdenovo2 k = 89 -- 10:32:44.688810
[BAR02A_S1_L001.1] SOAPdenovo2 k = 89 failed
[BAR02A_S1_L001.1] Running SOAPdenovo2 k = 93 -- 10:32:44.867438
[BAR02A_S1_L001.1] SOAPdenovo2 k = 93 failed
[BAR02A_S1_L001.1] Running SOAPdenovo2 k = 97 -- 10:32:45.051394
[BAR02A_S1_L001.1] SOAPdenovo2 k = 97 failed
Traceback (most recent call last):
  File "/home/michael_science/git/IOGA//IOGA.py", line 432, in <module>
    main(args.reference,args.name,args.forward,args.reverse,args.threads,args.insertsize,args.maxrounds,args.verbose)
  File "/home/michael_science/git/IOGA//IOGA.py", line 377, in main
    source,FP,RP,final_iteration = IOGA_loop(name,ref,forward,reverse,insertsize,threads,maxrounds)
  File "/home/michael_science/git/IOGA//IOGA.py", line 359, in IOGA_loop
    best,k = run_soapdenovo(folder,prefix,FP,RP,insertsize,threads)
  File "/home/michael_science/git/IOGA//IOGA.py", line 254, in run_soapdenovo
    best=sorted(n50,key = lambda x: int(x[1]),reverse = True)[0][0]
IndexError: list index out of range

I strongly suspect that a recent Python update (or an update of one of the Python dependency packages) has broken the IOGA code such that the wrapper function for SOAPdenovo2 (i.e., run_soapdenovo()) no longer interacts with the assembler properly. Why else would SOAPdenovo2 fail under every k-mer level on a dataset where an assembly worked nicely before?

Edit 1:
I found that each of the SOAPdenovo2 subfolders generated by IOGA during the assembly process (i.e., BAR02A_S1_L001.1.soap_33, BAR02A_S1_L001.1.soap_37, ..., BAR02A_S1_L001.1.soap_97) contains merely two files: an empty log file and an error file with the following contents:

Version 2.04: released on July 13th, 2012
Compile Jul  9 2013	11:57:30

Evidently, SOAPdenovo2 is not called properly.

Can you please check the code?!
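The IndexError in the traceback above is a secondary symptom: when every SOAPdenovo2 run fails, the list being sorted is empty and indexing its first element fails. A sketch of a more explicit guard follows. It assumes n50 is a list of (k, N50) pairs, as the sorted(...)[0][0] expression suggests, and pick_best_k is a hypothetical name that does not exist in IOGA.

```python
def pick_best_k(n50):
    """Return the k whose assembly has the highest N50.

    `n50` is assumed to be a list of (k, N50) pairs.
    """
    if not n50:
        # Every assembler run failed, so report that instead of an IndexError.
        raise RuntimeError(
            "SOAPdenovo2 produced no assembly for any k; "
            "inspect the error file in each soap_* folder"
        )
    return max(n50, key=lambda pair: int(pair[1]))[0]
```

This does not fix the underlying assembler failure. It only replaces the opaque IndexError with a message that points at the error files mentioned in the edit above.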

installation problem

Hi there,

I am interested in trying IOGA for chloroplast assembly, but I am running into problems installing the program.
I could not get either setup_IOGA.py or IOGA.py to run.
Both scripts gave me error messages:

setup_IOGA.py:

Traceback (most recent call last):
  File "./setup_IOGA.py", line 5, in <module>
    import wget
ImportError: No module named wget

IOGA.py:

File "./IOGA.py", line 305
def IOGA_loop(name=,ref,forward,reverse,insertsize,threads,maxrounds):
^
SyntaxError: invalid syntax

Could you help me with this?
Thanks
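The first error above is a missing Python module rather than a problem in setup_IOGA.py itself. A small pre-flight check along these lines can list all missing modules at once. The module list in the usage note is an assumption assembled from this page (wget from the traceback, Bio as the import name of BioPython, and matplotlib from a later issue), and missing_modules is a hypothetical helper.

```python
import importlib


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

For example, missing_modules(["wget", "Bio", "matplotlib"]) returns the modules that still need to be installed, for instance with pip.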

Release tag missing

Hello,

I would request that you provide a release tag.

It is of great importance to science that research is reproducible and for this it must be clear which version of which software was used to obtain a particular result.

More specifically, for software management systems, such as EasyBuild, which are used by HPC sites, installation of packages can only be automated if programs have well-defined versions. If such automation is in place, it means that users can be provided with software much more quickly (some overstretched administrators could even be unwilling to install software which is hard to integrate into their provisioning mechanism).

Different installation issue

Hi Rens,

I'm trying to use IOGA on my mac. When I run setup_IOGA.py, it looks like there is an issue with installing samtools 0.1.19 and I'm not sure how to fix it.

Any ideas?
Angela

The output from running the script is below. (It doesn't make the .json file.)

picard.jar
100% [......................................................] 7353972 / 7353972
chmod +x picard.jar
trying /Applications/IOGA/exe/picard-tools-1.124/picard.jar
succes
samtools
100% [........................................................] 514507 / 514507
x samtools-0.1.19/
x samtools-0.1.19/.gitignore
x samtools-0.1.19/AUTHORS
x samtools-0.1.19/COPYING
x samtools-0.1.19/ChangeLog.old
x samtools-0.1.19/INSTALL
x samtools-0.1.19/Makefile
x samtools-0.1.19/Makefile.mingw
x samtools-0.1.19/NEWS
x samtools-0.1.19/bam.c
x samtools-0.1.19/bam.h
x samtools-0.1.19/bam2bcf.c
x samtools-0.1.19/bam2bcf.h
x samtools-0.1.19/bam2bcf_indel.c
x samtools-0.1.19/bam2depth.c
x samtools-0.1.19/bam_aux.c
x samtools-0.1.19/bam_cat.c
x samtools-0.1.19/bam_color.c
x samtools-0.1.19/bam_endian.h
x samtools-0.1.19/bam_import.c
x samtools-0.1.19/bam_index.c
x samtools-0.1.19/bam_lpileup.c
x samtools-0.1.19/bam_mate.c
x samtools-0.1.19/bam_md.c
x samtools-0.1.19/bam_pileup.c
x samtools-0.1.19/bam_plcmd.c
x samtools-0.1.19/bam_reheader.c
x samtools-0.1.19/bam_rmdup.c
x samtools-0.1.19/bam_rmdupse.c
x samtools-0.1.19/bam_sort.c
x samtools-0.1.19/bam_stat.c
x samtools-0.1.19/bam_tview.c
x samtools-0.1.19/bam_tview.h
x samtools-0.1.19/bam_tview_curses.c
x samtools-0.1.19/bam_tview_html.c
x samtools-0.1.19/bamshuf.c
x samtools-0.1.19/bamtk.c
x samtools-0.1.19/bcftools/
x samtools-0.1.19/bcftools/Makefile
x samtools-0.1.19/bcftools/README
x samtools-0.1.19/bcftools/bcf.c
x samtools-0.1.19/bcftools/bcf.h
x samtools-0.1.19/bcftools/bcf.tex
x samtools-0.1.19/bcftools/bcf2qcall.c
x samtools-0.1.19/bcftools/bcfutils.c
x samtools-0.1.19/bcftools/call1.c
x samtools-0.1.19/bcftools/em.c
x samtools-0.1.19/bcftools/fet.c
x samtools-0.1.19/bcftools/index.c
x samtools-0.1.19/bcftools/kfunc.c
x samtools-0.1.19/bcftools/kmin.c
x samtools-0.1.19/bcftools/kmin.h
x samtools-0.1.19/bcftools/main.c
x samtools-0.1.19/bcftools/mut.c
x samtools-0.1.19/bcftools/prob1.c
x samtools-0.1.19/bcftools/prob1.h
x samtools-0.1.19/bcftools/vcf.c
x samtools-0.1.19/bcftools/vcfutils.pl
x samtools-0.1.19/bcftools/bcf.h~
x samtools-0.1.19/bedcov.c
x samtools-0.1.19/bedidx.c
x samtools-0.1.19/bgzf.c
x samtools-0.1.19/bgzf.h
x samtools-0.1.19/bgzip.c
x samtools-0.1.19/cut_target.c
x samtools-0.1.19/errmod.c
x samtools-0.1.19/errmod.h
x samtools-0.1.19/examples/
x samtools-0.1.19/examples/00README.txt
x samtools-0.1.19/examples/Makefile
x samtools-0.1.19/examples/bam2bed.c
x samtools-0.1.19/examples/calDepth.c
x samtools-0.1.19/examples/chk_indel.c
x samtools-0.1.19/examples/ex1.fa
x samtools-0.1.19/examples/ex1.sam.gz
x samtools-0.1.19/examples/toy.fa
x samtools-0.1.19/examples/toy.sam
x samtools-0.1.19/faidx.c
x samtools-0.1.19/faidx.h
x samtools-0.1.19/kaln.c
x samtools-0.1.19/kaln.h
x samtools-0.1.19/khash.h
x samtools-0.1.19/klist.h
x samtools-0.1.19/knetfile.c
x samtools-0.1.19/knetfile.h
x samtools-0.1.19/kprobaln.c
x samtools-0.1.19/kprobaln.h
x samtools-0.1.19/kseq.h
x samtools-0.1.19/ksort.h
x samtools-0.1.19/kstring.c
x samtools-0.1.19/kstring.h
x samtools-0.1.19/misc/
x samtools-0.1.19/misc/HmmGlocal.java
x samtools-0.1.19/misc/Makefile
x samtools-0.1.19/misc/ace2sam.c
x samtools-0.1.19/misc/bamcheck.c
x samtools-0.1.19/misc/blast2sam.pl
x samtools-0.1.19/misc/bowtie2sam.pl
x samtools-0.1.19/misc/export2sam.pl
x samtools-0.1.19/misc/interpolate_sam.pl
x samtools-0.1.19/misc/maq2sam.c
x samtools-0.1.19/misc/md5.c
x samtools-0.1.19/misc/md5.h
x samtools-0.1.19/misc/md5fa.c
x samtools-0.1.19/misc/novo2sam.pl
x samtools-0.1.19/misc/plot-bamcheck
x samtools-0.1.19/misc/psl2sam.pl
x samtools-0.1.19/misc/r2plot.lua
x samtools-0.1.19/misc/sam2vcf.pl
x samtools-0.1.19/misc/samtools.pl
x samtools-0.1.19/misc/soap2sam.pl
x samtools-0.1.19/misc/varfilter.py
x samtools-0.1.19/misc/vcfutils.lua
x samtools-0.1.19/misc/wgsim.c
x samtools-0.1.19/misc/wgsim_eval.pl
x samtools-0.1.19/misc/zoom2sam.pl
x samtools-0.1.19/padding.c
x samtools-0.1.19/phase.c
x samtools-0.1.19/razf.c
x samtools-0.1.19/razf.h
x samtools-0.1.19/razip.c
x samtools-0.1.19/sam.c
x samtools-0.1.19/sam.h
x samtools-0.1.19/sam_header.c
x samtools-0.1.19/sam_header.h
x samtools-0.1.19/sam_view.c
x samtools-0.1.19/sample.c
x samtools-0.1.19/sample.h
x samtools-0.1.19/samtools.1
x samtools-0.1.19/win32/
x samtools-0.1.19/win32/libcurses.a
x samtools-0.1.19/win32/libz.a
x samtools-0.1.19/win32/xcurses.h
x samtools-0.1.19/win32/zconf.h
x samtools-0.1.19/win32/zlib.h
x samtools-0.1.19/bam.h~

Machine name

Hi Rens,

Is it correct that you still have to specify the machine name? In any case, I keep passing -m FCC in the terminal. That seemed to help.

Sara

Error running IOGA.py

Hi, I encountered an error with IOGA.py.

sudo nohup python /opt/IOGA/IOGA.py --reference /opt/IOGA/plant_mitochondria.reference.fasta --forward ../PE1.fastq --reverse ../PE2.fastq --insertsize 250 --threads 40 > mesta-IOGA.log 2> mesta-IOGA.err &

Traceback (most recent call last):
  File "/opt/IOGA/IOGA.py", line 432, in <module>
    main(args.reference,args.name,args.forward,args.reverse,args.threads,args.insertsize,args.maxrounds,args.verbose)
  File "/opt/IOGA/IOGA.py", line 377, in main
    source,FP,RP,final_iteration = IOGA_loop(name,ref,forward,reverse,insertsize,threads,maxrounds)
  File "/opt/IOGA/IOGA.py", line 323, in IOGA_loop
    if len(name.split('/')) == 1:
AttributeError: 'NoneType' object has no attribute 'split'
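The command above does not pass a name, so name arrives in IOGA_loop as None and name.split fails. Passing a name on the command line (the first issue on this page uses -n name) avoids the error. Alternatively, a fallback could be derived from the forward reads file, as in the sketch below. The parser is a hypothetical stand-in built from the option names visible in the command and traceback, not IOGA's actual argument definitions.

```python
import argparse
import os


def build_parser():
    """Hypothetical parser using option names seen in the issue above."""
    parser = argparse.ArgumentParser(description="IOGA-style argument parsing (sketch)")
    parser.add_argument("--reference", required=True)
    parser.add_argument("--name", default=None)
    parser.add_argument("--forward", required=True)
    parser.add_argument("--reverse", required=True)
    return parser


def resolve_name(args):
    """Use --name when given, otherwise derive one from the forward reads file."""
    if args.name:
        return args.name
    return os.path.splitext(os.path.basename(args.forward))[0]
```

With the command from the issue, resolve_name would yield PE1, so later string operations on the name no longer receive None.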

Read trimming/filtering is not finding the necessary adapter file

Continued from #3


Hi Rens,

I have a problem! I'm using IOGA.py, but I keep seeing this error: OSError: [Errno 2]. Would you please help me solve this problem?

[teste] Quality trimming with BBduk

Traceback (most recent call last):
  File "./IOGA.py", line 408, in <module>
    main(args.reference,args.name,args.forward,args.reverse,args.threads,args.insertsize,args.maxrounds,args.verbose)
  File "./IOGA.py", line 352, in main
    source,FP,RP,final_iteration = IOGA_loop(name,ref,forward,reverse,insertsize,threads,maxrounds)
  File "./IOGA.py", line 288, in IOGA_loop
    forward,reverse = run_bbduk(name,forward,reverse,threads)
  File "./IOGA.py", line 187, in run_bbduk
    subprocess.call(['bbduk.sh','ref='+adapterdir,'in='+forward,'in2='+reverse,'out='+FP,'out2='+RP,'threads='+threads,'k=25','ktrim=rl','qtrim=t','minlength=32','-Xmx10G'],stderr=fnull,stdout=fnull)
  File "/usr/lib/python2.7/subprocess.py", line 522, in call
    return Popen(*popenargs, **kwargs).wait()
  File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1327, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

Best regards,

Lilian
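An OSError with Errno 2 raised from _execute_child generally means the program itself could not be started. Here bbduk.sh is called by bare name, so it has to be found on PATH. The sketch below checks for that before launching anything. It uses shutil.which, which requires Python 3.3 or later, whereas IOGA targets Python 2, and require_on_path is a hypothetical helper.

```python
import shutil


def require_on_path(program):
    """Return the full path of `program`, or raise if it is not on PATH."""
    path = shutil.which(program)
    if path is None:
        raise RuntimeError(
            "%s was not found on PATH; add its directory to PATH "
            "or call it by its full path" % program
        )
    return path
```

Calling require_on_path("bbduk.sh") before the trimming step turns the generic Errno 2 into a message that names the missing program.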

New different installation issue

Hi Rens,

I've got IOGA to complete on my laptop and am trying to install it on one of our lab macs but am coming across a different issue. I've installed the dependencies and the paths seem to be set up correctly in the .json file. When I run the help menu, everything looks great. However, on the first iteration, the script stops at/around bbmap (error below). Any ideas on what to do? I don't have any nearby python experts. The computer has python 2.7.10 installed, which is the same as what works for my other computer. Could this be some kind of permissions error? Let me know what you think, and thanks.

Angela

/Applications/IOGA/IOGA_config.json
[MSAG] Quality trimming with BBduk
Iteration 1
[MSAG.1] BBmap
Traceback (most recent call last):
  File "IOGA.py", line 432, in <module>
    main(args.reference, args.name, args.forward, args.reverse, args.threads, args.insertsize, args.maxrounds, args.verbose)
  File "IOGA.py", line 377, in main
    source,FP,RP,final_iteration = IOGA_loop(name,ref,forward,reverse,insertsize,threads,maxrounds)
  File "IOGA.py", line 344, in IOGA_loop
    samfile = run_bbmap(folder,prefix,ref,forward,reverse,threads)
  File "IOGA.py", line 114, in run_bbmap
    plot_coverage(basecov)
  File "IOGA.py", line 121, in plot_coverage
    with open(BBmap_coverage,'rU') as infile:
IOError: [Errno 2] No such file or directory: '/Applications/IOGA/MSAG.1/MSAG.1.basecov.txt'
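In the traceback above, plot_coverage fails because the coverage file that the mapping step was expected to write does not exist, which suggests the mapper itself failed silently beforehand. One way to surface the real cause is to check the exit code and the expected output right after the external call. The helper below is a hypothetical sketch (run_and_require is not an IOGA function) and is written for Python 3.

```python
import os
import subprocess


def run_and_require(cmd, expected_output):
    """Run `cmd`; raise if it exits non-zero or does not create `expected_output`."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        # Include the tool's own stderr so the real error is visible.
        raise RuntimeError(
            "%s exited with code %d:\n%s" % (cmd[0], result.returncode, result.stderr)
        )
    if not os.path.isfile(expected_output):
        raise RuntimeError(
            "%s finished but did not write %s" % (cmd[0], expected_output)
        )
    return expected_output
```

With a check like this, a Java, memory, or permissions problem in the mapper would be reported where it happens instead of as a missing file in the plotting step.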

Minor comments on the installation instructions

Hi @holmrenser

We are currently benchmarking chloroplast assembly tools (see https://github.com/chloroExtractorTeam/benchmark). In this context, I installed IOGA on a fresh installation of Ubuntu 18.04.2 and had to install a few additional packages (see below). You might want to add a note about these to your readme.

via apt:

  • python
  • python-pip
  • wget
  • build-essential
  • default-jre
  • libz-dev
  • libncurses5-dev

via pip:

  • matplotlib
  • biopython

Best,

Niklas

License

Hi @holmrenser,

I noticed you don't have a license assigned to IOGA. Would you mind adding an OSI-License to the repo?

Best,

Niklas

test data or tutorial

Hi @holmrenser

It would be nice to have some test data so that I can check whether my installation was successful. More detailed instructions or a tutorial would also be nice.

Best,

Niklas
