abra's People

Contributors

jlost, lmose, mozack

abra's Issues

java.lang.UnsatisfiedLinkError

Hi,
I get this error on some of our computation nodes but can't figure out what is going wrong or what is different on those nodes.

/share/java/jdk1.7.0/bin/java -Xmx4g -jar abra-0.92-SNAPSHOT-jar-with-dependencies.jar --in da82e52f.bam --out da82e52f.realigned.unsorted.bam --ref human_g1k_v37.fasta --bwa-ref human_g1k_v37.fasta --threads 4 --targets final_baitcapture_moderate_1_Regions.bed --working /tmp/tmp_5d98 --mad 250 --mapq 10 --mer 0.1 --mur 500000 --rcf 0.01 

Exception in thread "main" java.lang.UnsatisfiedLinkError: > /tmp/tmp_5d98/libAbra.so: /tmp/tmp_5d98/libAbra.so: undefined symbol: _ZZN6google11sparsegroupIPKcLt48ENS_27libc_allocator_with_reallocIS2_EEE12bits_in_charEhE7bits_in
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1928)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1825)
at java.lang.Runtime.load0(Runtime.java:792)
at java.lang.System.load(System.java:1059)
at abra.NativeLibraryLoader.load(NativeLibraryLoader.java:45)
at abra.ReAligner.init(ReAligner.java:1093)
at abra.ReAligner.reAlign(ReAligner.java:116)
at abra.ReAligner.run(ReAligner.java:1214)
at abra.Abra.main(Abra.java:12)

I tried Java 1.7.0 and 1.8.0 with the same outcome.

Thanks for your help

Abra processes unique reads?

I have a question regarding Abra 0.92: does it realign all reads, or only reads whose duplicate flag has not been set by Picard MarkDuplicates?

Thank you

Abra error

Hi,
I am trying this tool to do realignment in a few regions, but got the following error:
java.lang.NegativeArraySizeException
at java.lang.AbstractStringBuilder.<init>(AbstractStringBuilder.java:64)
at java.lang.StringBuffer.<init>(StringBuffer.java:108)
at abra.CompareToReference2.getSequence(CompareToReference2.java:400)
at abra.KmerSizeEvaluator.getBases(KmerSizeEvaluator.java:44)
at abra.KmerSizeEvaluator.identifyMinKmer(KmerSizeEvaluator.java:98)
at abra.NativeAssembler.assembleContigs(NativeAssembler.java:273)
at abra.ReAligner.processRegion(ReAligner.java:592)
at abra.ReAlignerRunnable.go(ReAlignerRunnable.java:21)
at abra.AbraRunnable.run(AbraRunnable.java:20)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)

My command line is: java -Xmx8G -jar $ABRA --in ${inbam} --out ${outbam} --ref ${hg19} --targets ${bed} --working ~/abra/ > abra.log
Screenshots of the errors and the target BED file are attached. I would really appreciate it if you could point me to possible reasons.
Thanks,
Yuhong

polyA/polyT/lowcomplexity region realignments for single ended reads

Hi,

I would like to ask about ABRA realignment of single-fragment (non-paired) reads around poly-A, poly-T, or other low-complexity regions. In such regions bowtie2 and bwa mem cannot properly align the reads one by one, but the alignment could potentially be improved by looking at the pileup across the region.

Would ABRA be able to work on regions like that even when the reads are single-fragment?

Bug in v0.77 - Input regions not split properly

Larger regions passed in via the --targets param are not split properly in the v0.77 release. This may negatively impact the local assembly.

This issue only applies if your bed file was not generated using KmerSizeEvaluator. This will be corrected shortly.

Requested array size exceeds VM limit

Hi,

While running ABRA, I got the following error message. Would specifying a larger amount of memory (e.g. 32 GB) resolve this problem?

[main] CMD: bwa samse -n 1000 /scratch/TCGA-DD-A1EA-01A-11D-A12Z-10_Illumina_1410534136/clean_contigs.fasta /scratch/TCGA-DD-A1EA-01A-11D-A12Z-10_Illumina_1410534136/temp1/align_to_contig.sam.sai /scratch/TCGA-DD-A1EA-01A-11D-A12Z-10_Illumina_1410534136/temp1/original_reads.fastq.gz
[main] Real time: 3956.359 sec; CPU: 3197.467 sec
Stream thread done.
Stream thread done.
BWA time: 3956 seconds.
Clock time in Align to contigs: 19797
Sat Sep 13 00:24:30 EDT 2014 : Adjust reads
Sat Sep 13 00:24:30 EDT 2014 : Adjusting reads.
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
at java.util.Arrays.copyOf(Arrays.java:2367)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)
at java.lang.StringBuilder.append(StringBuilder.java:132)
at net.sf.samtools.SAMTextHeaderCodec.advanceLine(SAMTextHeaderCodec.java:128)
at net.sf.samtools.SAMTextHeaderCodec.decode(SAMTextHeaderCodec.java:83)
at net.sf.samtools.SAMTextReader.readHeader(SAMTextReader.java:185)
at net.sf.samtools.SAMTextReader.<init>(SAMTextReader.java:62)
at net.sf.samtools.SAMTextReader.<init>(SAMTextReader.java:71)
at net.sf.samtools.SAMFileReader.init(SAMFileReader.java:556)
at net.sf.samtools.SAMFileReader.<init>(SAMFileReader.java:167)
at net.sf.samtools.SAMFileReader.<init>(SAMFileReader.java:122)
at abra.ReadAdjuster.adjustReads(ReadAdjuster.java:55)
at abra.AdjustReadsRunnable.go(AdjustReadsRunnable.java:37)
at abra.AbraRunnable.run(AbraRunnable.java:19)
at java.lang.Thread.run(Thread.java:745)

Error in Cadabra

Hi
I realigned my reads using abra and want to do somatic calling with Cadabra, but I keep running into a problem that I do not understand.

Here is my command:
java -Xmx24G -cp /home/gwenneg/bin/abra-0.95-SNAPSHOT-jar-with-dependencies.jar abra.cadabra.Cadabra $REF $INNORM $INTUM > $OUTPUT

And my error:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 2
at abra.cadabra.Cadabra.main(Cadabra.java:508)
Command exited with non-zero status 1

I used the bam files output from abra and the same hg19.fa file that was used with abra. I also tried with bam and sorted bam but with the same result.

Thanks for helping

Regards

Gwenneg Kerdivel

Abra will realign duplicates?

I just had a question regarding release 0.92

Will Abra realign duplicates or ignore these reads? Thank you for the help

xlC compatibility - C++ STL referenced in .c files

Hi, I was compiling ABRA for use on an IBM POWER8 system (ppc64le) with the xlC compiler and noticed that compilation fails because iostream cannot be found:

[u0017592@sys-83519 abra]$ sudo make standalone JAVA_HOME=$JAVA_HOME
xlc++ -Isrc/main/c -I/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.91-2.6.2.1.ael7b_1.ppc64le/include -I/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.91-2.6.2.1.ael7b_1.ppc64le/include/linux src/main/c/assembler.c -o abra
src/main/c/assembler.c:5:10: fatal error: 'iostream' file not found
#include <iostream>
         ^
1 error generated.
Error while processing src/main/c/assembler.c.
make: *** [standalone] Error 1

I poked around a bit and discovered that this is because xlC doesn't import the C++ STL by default for .c files. Since the .c files are using the C++ STL, wouldn't it make more sense to rename the .c files to .cpp and retest it on both compilers to make sure no functionality changed (although none should) and update the Makefile? This would fix ABRA for the xlC compiler.

Would you be willing to accept a pull request that does this?

abra2 error java.lang.NumberFormatException

When I try to run ABRA2, I get the following error. Would you please look into it? Thanks.

cmd=java -Xmx16G -jar target/abra2-2.19-jar-with-dependencies.jar --in /path_to/filesample_dedupSort.bam --out /path_to/filesample_abraRealign.bam --ref /path_to/hg19.fasta --threads 8 --targets /path_to/uploads_3091241_Covered-2.bed --tmpdir /path_to/tmpDir

INFO Fri Jan 18 10:34:36 CET 2019 Abra version: 2.19
INFO Fri Jan 18 10:34:36 CET 2019 Abra params: [/usr/local/bioinfo/Tools/abra2/target/abra2-2.19-jar-with-dependencies.jar --in /mnt/analyses/KDM-PROD/IOV_baseline/ResVarCall/IOV-KDM_N12enz/alignment/ALN_IOV_IOV_OKDM_IOV_N12enz_dedupSort.bam --out /mnt/analyses/KDM-PROD/IOV_baseline/ResVarCall/IOV-KDM_N12enz/alignment/ALN_IOV_IOV_OKDM_IOV_N12enz_20181012091021_abraRealign.bam --ref /mnt/analyses/Results/Reference/hg19.fasta --threads 8 --targets /mnt/analyses/PROJECT/kdm_IOV/uploads_3091241_Covered-ModifForAbraTest.bed --tmpdir /usr/local/bioinfo/Tools/abra2/tmpRim]
INFO Fri Jan 18 10:34:36 CET 2019 ABRA version: 2.19
INFO Fri Jan 18 10:34:36 CET 2019 input0: /mnt/analyses/KDM-PROD/IOV_baseline/ResVarCall/IOV-KDM_N12enz/alignment/ALN_IOV_IOV_OKDM_IOV_N12enz_dedupSort.bam
INFO Fri Jan 18 10:34:36 CET 2019 output0: /mnt/analyses/KDM-PROD/IOV_baseline/ResVarCall/IOV-KDM_N12enz/alignment/ALN_IOV_IOV_OKDM_IOV_N12enz_20181012091021_abraRealign.bam
INFO Fri Jan 18 10:34:36 CET 2019 regions: /mnt/analyses/PROJECT/kdm_IOV/uploads_3091241_Covered-ModifForAbraTest.bed
INFO Fri Jan 18 10:34:36 CET 2019 reference: /mnt/analyses/Results/Reference/hg19.fasta
INFO Fri Jan 18 10:34:36 CET 2019 num threads: 8
INFO Fri Jan 18 10:34:36 CET 2019 minEdgeFrequency: 0
minNodeFrequncy: 1
minContigLength: -1
minBaseQuality: 20
minReadCandidateFraction: 0.01
maxAverageRegionDepth: 1000
minEdgeRatio: 0.01

INFO Fri Jan 18 10:34:36 CET 2019 paired end: true
INFO Fri Jan 18 10:34:36 CET 2019 isSkipAssembly: false
INFO Fri Jan 18 10:34:36 CET 2019 useSoftClippedReads: true
INFO Fri Jan 18 10:34:36 CET 2019 SW scoring: [8, 32, 48, 1]
INFO Fri Jan 18 10:34:36 CET 2019 Soft clip params: [16, 13, 80, 15]
INFO Fri Jan 18 10:34:36 CET 2019 Java version: 1.8.0_191
INFO Fri Jan 18 10:34:36 CET 2019 hostname: bioit-dev
INFO Fri Jan 18 10:34:36 CET 2019 SG match,mismatch,gap_open_penalty,gap_extend_penalty: 8,-32,-48,-1
INFO Fri Jan 18 10:34:36 CET 2019 Using temp directory: /usr/local/bioinfo/Tools/abra2/tmpRim/abra2_358def45-c903-44a8-a674-5bc2c317367f1665716855282894176
INFO Fri Jan 18 10:34:36 CET 2019 Loading native library from: /usr/local/bioinfo/Tools/abra2/tmpRim/abra2_358def45-c903-44a8-a674-5bc2c317367f1665716855282894176/libAbra.so
INFO Fri Jan 18 10:34:36 CET 2019 Loading reference map: /mnt/analyses/Results/Reference/hg19.fasta
INFO Fri Jan 18 10:36:10 CET 2019 Done loading ref map. Elapsed secs: 93
INFO Fri Jan 18 10:36:10 CET 2019 Reading Input SAM Header and identifying read length
INFO Fri Jan 18 10:36:10 CET 2019 Identifying header and determining read length
INFO Fri Jan 18 10:36:13 CET 2019 Min insert length: 0
INFO Fri Jan 18 10:36:13 CET 2019 Max insert length: 230110226
INFO Fri Jan 18 10:36:13 CET 2019 Max read length is: 150
INFO Fri Jan 18 10:36:13 CET 2019 Min contig length: 151
INFO Fri Jan 18 10:36:13 CET 2019 Read length: 150
INFO Fri Jan 18 10:36:13 CET 2019 Loading target regions
INFO Fri Jan 18 10:36:13 CET 2019 Loading target regions from : /mnt/analyses/PROJECT/kdm_IOV/uploads_3091241_Covered-ModifForAbraTest.bed
INFO Fri Jan 18 10:36:13 CET 2019 Collapsed regions from 1160 to 1029
INFO Fri Jan 18 10:36:13 CET 2019 Num regions: 1300
INFO Fri Jan 18 10:36:13 CET 2019 Total junctions input: 0
INFO Fri Jan 18 10:36:13 CET 2019 Final Junctions: 0, Variant Junctions: 0
INFO Fri Jan 18 10:36:13 CET 2019 Intel deflater disabled
INFO Fri Jan 18 10:36:13 CET 2019 Processing chromosome chunk: chr1_1_25000000
INFO Fri Jan 18 10:36:13 CET 2019 Processing chromosome chunk: chr1_25000001_50000000
INFO Fri Jan 18 10:36:13 CET 2019 Processing chromosome chunk: chr1_50000001_75000000
INFO Fri Jan 18 10:36:13 CET 2019 Processing chromosome chunk: chr1_75000001_100000000
INFO Fri Jan 18 10:36:13 CET 2019 Processing chromosome chunk: chr1_100000001_125000000
INFO Fri Jan 18 10:36:13 CET 2019 Processing chromosome chunk: chr1_125000001_150000000
INFO Fri Jan 18 10:36:13 CET 2019 Processing chromosome chunk: chr1_175000001_200000000
INFO Fri Jan 18 10:36:13 CET 2019 Processing chromosome chunk: chr1_150000001_175000000
java.lang.NumberFormatException: For input string: "-3,265707"
at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
at sun.misc.FloatingDecimal.parseDouble(FloatingDecimal.java:110)
at java.lang.Double.parseDouble(Double.java:538)
at abra.ScoredContig.convertAndFilter(ScoredContig.java:53)
at abra.ReAligner.assemble(ReAligner.java:1114)
at abra.ReAligner.processRegion(ReAligner.java:1293)
at abra.ReAligner.processChromosomeChunk(ReAligner.java:361)
at abra.ReAlignerRunnable.go(ReAlignerRunnable.java:21)
at abra.AbraRunnable.run(AbraRunnable.java:20)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
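
The string in the exception, "-3,265707", uses a comma as the decimal separator. That suggests (an assumption, not confirmed anywhere in the report) that a number was formatted under a locale with decimal commas and then handed to a parser that expects a decimal point. The Python sketch below only illustrates that mismatch; `parse_score` is a hypothetical helper and not ABRA code.

```python
def parse_score(text):
    """Parse a float, tolerating a decimal comma (hypothetical helper)."""
    try:
        return float(text)
    except ValueError:
        # A strict parser rejects "-3,265707"; retry with the comma
        # swapped for a point. Only safe when no thousands separators occur.
        return float(text.replace(",", "."))


if __name__ == "__main__":
    print(parse_score("-3,265707"))
    print(parse_score("-3.265707"))
```

If the locale is indeed the cause, launching Java under a locale that uses a decimal point (for example with `LC_ALL=C` set in the shell) may be worth trying.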

[Request] Replace System.out.print* statements with System.err.print* for stream compatibility

The output of ABRA is, unfortunately, not necessarily coordinate-sorted, so I would prefer to pipe it directly into samtools sort. Being able to pipe the output would also help performance.

By setting --out /dev/stdout, I've been able to have it write the BAM to stdout. However, all of the logging statements currently go to stdout rather than stderr, which makes streaming impossible. Because the parsed BED file coordinates are emitted as well, filtering the log lines out by keyword is not an option either.

I would have fixed it myself, but I ended up messing up some of your tab vs spaces formatting, and I didn't want to assume that I could replace all of your spaces with tabs or vice versa.

java.lang.NegativeArraySizeException

Greetings! I am encountering a java error when running abra. I thought the problem was that my regions are too small (smaller than kmer). I have a de novo genome, and I am interested in coding indels, so my regions file is a CDS.bed file and my gene prediction software produced some impossibly small genes. However, a larger CDS also produces the same error.

Here is my command:

java -Xmx34g -jar abra-0.82.jar --in SC_049.srt.RG.bam --ref genome_v3.fasta --out SC_049.abra.bam --working abra/ --targets CDS.bed

It crashes on the first scaffold following a particular region, scaffold_252_213187_213194:

Assembling: -> abra//scaffold_252_68589_69038_contigs.fasta_k13
Done assembling(0): abra//scaffold_252_68389_68789_contigs.fasta_k13, 24
Elapsed_msecs_in_NativeAssembler    Region: scaffold_252_68389_68789    Length: 400 ReadCount:  334 Elapsed 73  Assembled   true
Mon Sep 29 11:15:44 PDT 2014 : Processing region: scaffold_252_213187_213194
java.lang.NegativeArraySizeException
    at java.lang.AbstractStringBuilder.<init>(AbstractStringBuilder.java:64)
    at java.lang.StringBuffer.<init>(StringBuffer.java:108)
    at abra.CompareToReference2.getSequence(CompareToReference2.java:392)
    at abra.KmerSizeEvaluator.getBases(KmerSizeEvaluator.java:44)
    at abra.KmerSizeEvaluator.identifyMinKmer(KmerSizeEvaluator.java:97)
    at abra.NativeAssembler.assembleContigs(NativeAssembler.java:140)
    at abra.ReAligner.processRegion(ReAligner.java:691)
    at abra.ReAlignerRunnable.go(ReAlignerRunnable.java:21)
    at abra.AbraRunnable.run(AbraRunnable.java:19)
    at java.lang.Thread.run(Thread.java:745)
java.lang.NegativeArraySizeException
    at java.lang.AbstractStringBuilder.<init>(AbstractStringBuilder.java:64)
    at java.lang.StringBuffer.<init>(StringBuffer.java:108)
    at abra.CompareToReference2.getSequence(CompareToReference2.java:392)
    at abra.KmerSizeEvaluator.getBases(KmerSizeEvaluator.java:44)
    at abra.KmerSizeEvaluator.identifyMinKmer(KmerSizeEvaluator.java:97)
    at abra.NativeAssembler.assembleContigs(NativeAssembler.java:140)
    at abra.ReAligner.processRegion(ReAligner.java:691)
    at abra.ReAlignerRunnable.go(ReAlignerRunnable.java:21)
    at abra.AbraRunnable.run(AbraRunnable.java:19)
    at java.lang.Thread.run(Thread.java:745)
Num reads: 360
Num nodes: 1039
Remaining nodes after pruning step 1: 756
Remaining nodes after pruning step 2: 754
num root nodes: 2
Done assembling(0): abra//scaffold_252_60836_61423_contigs.fasta_k15, 144
Done assembling(0): abra//scaffold_252_68589_69038_contigs.fasta_k13, 289

Having a look at this region, it is smaller than the kmer and likely not real:
scaffold_252_213187_213194

However, I get the same error with a larger region (although still very small): scaffold_252_221019_221342
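
As a rough pre-check, regions shorter than the k-mer size could be listed before running ABRA. The sketch below is hypothetical helper code, not part of ABRA. It assumes a whitespace-delimited BED file with chrom, start and end in the first three columns, and `min_len` is an assumed threshold. It does not explain the failure on the larger region.

```python
def short_regions(bed_lines, min_len):
    """Return (chrom, start, end) for BED records shorter than min_len bases."""
    flagged = []
    for line in bed_lines:
        fields = line.split()
        # Skip blank lines and anything without numeric start/end columns.
        if len(fields) < 3 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if end - start < min_len:
            flagged.append((chrom, start, end))
    return flagged


if __name__ == "__main__":
    # Coordinates taken from the region names in the log above;
    # min_len=13 is an assumed value matching the smallest k seen there (k13).
    example = [
        "scaffold_252\t213187\t213194",
        "scaffold_252\t221019\t221342",
        "scaffold_252\t68389\t68789",
    ]
    for region in short_regions(example, min_len=13):
        print(region)
```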

'align_to_contig.sam.sai' : No such file or directory

I get this error with our own BAMs using v0.79 built on Mountain Lion. However, the demo BAM and BED file included with abra run perfectly.

Command:
java -Xmx16G -jar $JAR --in $inputBam --out ${inputBam/bam/realigned.bam} --ref /seq/data/GATK_Bundles/hg19/ucsc.hg19.fasta --targets $targets --threads 8 --working /Volumes/fastdata/tmp/abraTest > abra_test.log

Error message:
Exception in thread "main" java.lang.RuntimeException: BWA exited with non-zero return code : [1] for command: [bwa samse /Volumes/fastdata/tmp/abraTest/clean_contigs.fasta /Volumes/fastdata/tmp/abraTest/temp1/align_to_contig.sam.sai /Volumes/fastdata/tmp/abraTest/temp1/original_reads.fastq.gz -n 1000 > /Volumes/fastdata/tmp/abraTest/temp1/align_to_contig.sam]
at abra.Aligner.runCommand(Aligner.java:66)
at abra.Aligner.shortAlign(Aligner.java:86)
at abra.ReAligner.alignToContigs(ReAligner.java:1006)
at abra.ReAligner.alignReads(ReAligner.java:525)
at abra.ReAligner.alignReads(ReAligner.java:385)
at abra.ReAligner.reAlign(ReAligner.java:193)
at abra.ReAligner.run(ReAligner.java:1282)
at abra.Abra.main(Abra.java:12)

bwa error message:
[bwa_sai2sam_se_core] fail to open file '/Volumes/fastdata/tmp/abraTest/temp1/align_to_contig.sam.sai' : No such file or directory

bwa version used with Abra: 0.7.9a-r786
bwa version used for original alignment: 0.7.6a-r433
javac: 1.7.0_67

does abra realign unaligned reads

Very cool project!

Does the code currently try to (a) salvage unaligned reads and (b) realign distant reads with potential alternate alignments? This would help with the events I'm looking for, i.e. long indels which initially don't align and variants in highly homologous ion channels.

Error while running ABRA

Hello,

I am trying to run ABRA on panel data that we have. The command that I am using is:

java -Xmx4G -jar abra-0.94-SNAPSHOT-jar-with-dependencies.jar --in ifile.bam --out ofile.bam --ref human_g1k_v37_decoy.fasta --targets b37.bed --threads 8 --working tmp1 > abra.log 2>&1

The version of bwa is : 0.7.8-r455

This is the error message I am getting:

Done assembling(0): tmp1/15_89394994_89395294_contigs.fasta_k15, 1
Elapsed_msecs_in_NativeAssembler Region: 15_89394994_89395294 Length: 300 ReadCount: 188 Elapsed 68 Assembled true 15
Mon Sep 28 19:03:43 CDT 2015 : Processing region: 15_89398655_89399055
Done assembling(0): tmp1/15_89392635_89392995_contigs.fasta_k13, 7
Elapsed_msecs_in_NativeAssembler Region: 15_89392635_89392995 Length: 360 ReadCount: 213 Elapsed 76 Assembled true 13
Mon Sep 28 19:03:43 CDT 2015 : Processing region: 15_89398855_89399255
STOPPED_ON_REPEAT: 15_88690509_88690689
Done assembling(0): tmp1/15_88690509_88690689_contigs.fasta_k17, 0
Abra JNI entry point v0.94, prefix: 15_88690509_88690689, read_length: 300, kmer_size: 19, min_node_freq: 2, min_base_qual: 60, min_edge_ratio 0.020000, debug: 1
Assembling: -> tmp1/15_88690509_88690689_contigs.fasta_k19
STOPPED_ON_REPEAT: 15_70976574_70976814
Done assembling(1): tmp1/15_70976574_70976814_contigs.fasta_k43, 0
Elapsed_msecs_in_NativeAssembler Region: 15_89398855_89399255 Length: 400 ReadCount: 81 Elapsed 11 Assembled true 301
Mon Sep 28 19:03:43 CDT 2015 : Processing region: 15_89399055_89399455
Abra JNI entry point v0.94, prefix: 15_70976574_70976814, read_length: 300, kmer_size: 45, min_node_freq: 2, min_base_qual: 60, min_edge_ratio 0.020000, debug: 1
Assembling: -> tmp1/15_70976574_70976814_contigs.fasta_k45
STOPPED_ON_REPEAT: 15_85666237_85666477
Done assembling(0): tmp1/15_85666237_85666477_contigs.fasta_k31, 0
Abra JNI entry point v0.94, prefix: 15_85666237_85666477, read_length: 300, kmer_size: 33, min_node_freq: 2, min_base_qual: 60, min_edge_ratio 0.020000, debug: 1
Assembling: -> tmp1/15_85666237_85666477_contigs.fasta_k33
Abra JNI entry point v0.94, prefix: 15_89398655_89399055, read_length: 300, kmer_size: 265, min_node_freq: 2, min_base_qual: 60, min_edge_ratio 0.020000, debug: 1
Elapsed_msecs_in_NativeAssembler Region: 15_89399055_89399455 Length: 400 ReadCount: 11 Elapsed 9 Assembled true 301
Mon Sep 28 19:03:43 CDT 2015 : Processing region: 15_89399255_89399655
Assembling: -> tmp1/15_89398655_89399055_contigs.fasta_k265

A fatal error has been detected by the Java Runtime Environment:

SIGSEGV (0xb) at pc=0x00007f8a7f6bd46c, pid=18058, tid=140232817727232

JRE version: Java(TM) SE Runtime Environment (8.0_20-b26) (build 1.8.0_20-b26)

Java VM: Java HotSpot(TM) 64-Bit Server VM (25.20-b23 mixed mode linux-amd64 compressed oops)

Problematic frame:

C [libAbra.so+0x2546c] is_node_in_list(node_, linked_node_)+0x25

Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again

An error report file with more information is saved as:

hs_err_pid18058.log

Elapsed_msecs_in_NativeAssembler Region: 15_89399255_89399655 Length: 400 ReadCount: 11 Elapsed 11 Assembled true 301
Mon Sep 28 19:03:43 CDT 2015 : Processing region: 15_89399455_89399855

If you would like to submit a bug report, please visit:

http://bugreport.sun.com/bugreport/crash.jsp

The crash happened outside the Java Virtual Machine in native code.

See problematic frame for where to report the bug.

Aborted

BWA error: fail to locate the index files

I am getting the following error with abra v0.77:

Thu Jun 12 19:48:43 CEST 2014 : Aligning contigs
Running: [bwa mem -t 5 /data/broad/human_g1k_v37.fasta abra_temp_dir/all_contigs.fasta > abra_temp_dir/all_contigs.sam]
[E::bwa_idx_load] fail to locate the index files
Stream thread done.
BWA time: 0 seconds.
Exception in thread "main" java.lang.RuntimeException: BWA exited with non-zero return code : [1] for command: [bwa mem -t 5 /data/broad/human_g1k_v37.fasta abra_temp_dir/all_contigs.fasta > abra_temp_dir/all_contigs.sam]
at abra.Aligner.runCommand(Aligner.java:66)
at abra.Aligner.align(Aligner.java:36)
at abra.ReAligner.alignAndCleanContigs(ReAligner.java:473)
at abra.ReAligner.reAlign(ReAligner.java:181)
at abra.ReAligner.run(ReAligner.java:1258)
at abra.Abra.main(Abra.java:12)
Stream thread done.


The FASTA index file for "/data/broad/human_g1k_v37.fasta" exists, but not for "abra_temp_dir/all_contigs.fasta". I am using BWA version 0.7.9.

hidden dependency on bwa mem

When I run ABRA, I see that there is a hidden dependency on bwa mem. It would be nice to be able to specify the path to bwa, since I might be using different versions.

java.lang.IllegalStateException

Exception in thread "main" java.lang.IllegalStateException: Unable to delete: abra_temp_dir
at abra.ReAligner.init(ReAligner.java:1102)
at abra.ReAligner.reAlign(ReAligner.java:121)
at abra.ReAligner.run(ReAligner.java:1240)
at abra.Abra.main(Abra.java:12)
please help me!!!

No contigs assembled: no space left on device

I'm getting the below error when running ABRA1. The command is also below. Has anyone seen this before? There is sufficient disk space in the directory so I'm not sure why I'm getting this java error.

java -Xmx16G -jar abra-0.97.jar --in normal.bam,tumor.bam --out normal_ABRA.bam,tumor_ABRA.bam --ref hg38.fa --bwa-ref BWAIndex/hg38 --targets Regions.hg38.bed --threads 8 --working abra_temp_dir --sv abra.sv.txt

Mon Nov 26 14:17:42 EST 2018 : WARNING! No contigs assembled. Just making a copy of input converting to/from SAM/BAM as appropriate.
Exception in thread "main" htsjdk.samtools.util.RuntimeIOException: java.io.IOException: No space left on device
at htsjdk.samtools.util.SortingCollection.spillToDisk(SortingCollection.java:245)
at htsjdk.samtools.util.SortingCollection.add(SortingCollection.java:165)
at htsjdk.samtools.SAMFileWriterImpl.addAlignment(SAMFileWriterImpl.java:180)
at abra.ReAligner.copySam(ReAligner.java:384)
at abra.ReAligner.reAlign(ReAligner.java:229)
at abra.ReAligner.run(ReAligner.java:1240)
at abra.Abra.main(Abra.java:12)
Caused by: java.io.IOException: No space left on device
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:326)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
at htsjdk.samtools.util.SortingCollection.spillToDisk(SortingCollection.java:236)
... 6 more
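
The stack trace shows the failure inside `SortingCollection.spillToDisk`, so the device that filled up may be wherever temporary sort files are written rather than the output or working directory. This is an assumption; the report does not say which path ran out of space. A minimal Python sketch for comparing free space across candidate directories follows. The directory list is a placeholder to adapt.

```python
import shutil
import tempfile


def free_gib(path):
    """Return free space on the filesystem holding `path`, in GiB."""
    return shutil.disk_usage(path).free / 2**30


if __name__ == "__main__":
    # Placeholder candidates: the current directory and the system temp dir.
    for path in (".", tempfile.gettempdir()):
        print(f"{path}: {free_gib(path):.1f} GiB free")
```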

Problem finding bwa index when running ABRA

I am trying to run ABRA after sorting and marking duplicates with Picard. According to the .log file, ABRA worked well until it needed to call bwa, at which point I get the following error:

Mon Jan 26 17:23:18 GMT 2015 : Aligning contigs
Running: [bwa mem -t 8 /nethome/mfield/ref_files/hg19.fa abra_temp_dir/all_contigs.fasta > abra_temp_dir/all_contigs.sam]
[E::bwa_idx_load_from_disk] fail to locate the index files
BWA time: 0 seconds.
Exception in thread "main" Stream thread done.
Stream thread done.
java.lang.RuntimeException: BWA exited with non-zero return code : [1] for command: [bwa mem -t 8 /nethome/mfield/ref_files/hg19.fa abra_temp_dir/all_contigs.fasta > abr$
at abra.Aligner.runCommand(Aligner.java:66)
at abra.Aligner.align(Aligner.java:36)
at abra.ReAligner.alignAndCleanContigs(ReAligner.java:485)
at abra.ReAligner.reAlign(ReAligner.java:188)
at abra.ReAligner.run(ReAligner.java:1306)
at abra.Abra.main(Abra.java:12)

I don't understand how to deal with this, as the bwa index files are in the /nethome/mfield/ref_files/ directory, and I am calling the hg19.fa file. I have the hg19.fa file in my ref_files directory with hg19.amb, hg19.ann, hg19.bwt, hg19.pac, hg19.sa all in that directory as well. When I call bwa mem separately I am not supposed to use the .fa extension, and I am only supposed to use the hg19 prefix. But, I think I need to use the .fa extension for the abra calling. Any ideas what is causing the problem?
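
`bwa index` writes its files by appending `.amb`, `.ann`, `.bwt`, `.pac` and `.sa` to the index prefix, which by default is the FASTA file name itself. A reference passed as `hg19.fa` would therefore be looked up as `hg19.fa.amb` and so on, not `hg19.amb`. The sketch below (a hypothetical helper, not part of ABRA or BWA) reports which of those files are missing for a given prefix.

```python
import os

BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_bwa_index_files(prefix):
    """Return the BWA index files that do not exist for this prefix."""
    return [prefix + s for s in BWA_INDEX_SUFFIXES if not os.path.exists(prefix + s)]


if __name__ == "__main__":
    # Path taken from the log above; adjust to the local reference.
    for path in missing_bwa_index_files("/nethome/mfield/ref_files/hg19.fa"):
        print("missing:", path)
```

If files are reported missing, re-running `bwa index` on `hg19.fa` or linking the existing files under the `hg19.fa.*` names are two possible ways to reconcile the naming.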

Read name length exceeded

I ran into the following problem on my data set:

Warning! Max SAM Read name length exceeded for: HWI-ST815_0101:6:2306:5037:35967#108C 83 1 31245039 60 100M = 31244887 -252 * * BD:Z:LMNMMOQPNPNMOPOOOMONONNLLJMLCKMJNNMLLKMJIMMMLKJJKKIKKLHLMGLLLMNKMLIKKLKMKLLLLNLALOMDMQPMPMNQRTRRKKJJ RG:Z:108C.BSF_0050_D2BYKACXX.6

Can this warning be safely ignored or is this a problem that needs to be fixed? If it can be ignored, can I somehow suppress this warning such that it does not fill up my log file?

Bug in BED file parsing

Hi Lisle,

I found a little bug in your BED file parsing:
Header lines without tab are treated as data lines.
Thus, they lead to a crash because the second element is accessed but not present.
I guess you should only consider lines that have three or more elements after splitting.

Here the command and output:

java -cp abra.jar abra.KmerSizeEvaluator 100 hg19.fa /tmp/test 1 test.bed
Loading reference map: /tmp/local_ngs_data/hg19.fa
Chromosome: chrM length: 16571
Chromosome: chr1 length: 249250621
...
Done loading ref map. Elapsed secs: 179
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
at abra.RegionLoader.load(RegionLoader.java:45)
at abra.ReAligner.getRegions(ReAligner.java:702)
at abra.KmerSizeEvaluator.run(KmerSizeEvaluator.java:50)
at abra.KmerSizeEvaluator.main(KmerSizeEvaluator.java:240)

And this is the BED file:

cat test.bed
browser position chr7:127471196-127495720
browser hide all
track name="ItemRGBDemo" description="Item RGB demonstration" visibility=2
chr7 127471196 127472363 Pos1 0 + 127471196 127472363 255,0,0
chr7 127472363 127473530 Pos2 0 + 127472363 127473530 255,0,0
chr7 127473530 127474697 Pos3 0 + 127473530 127474697 255,0,0
chr7 127474697 127475864 Pos4 0 + 127474697 127475864 255,0,0
chr7 127475864 127477031 Neg1 0 - 127475864 127477031 0,0,255
chr7 127477031 127478198 Neg2 0 - 127477031 127478198 0,0,255
chr7 127478198 127479365 Neg3 0 - 127478198 127479365 0,0,255
chr7 127479365 127480532 Pos5 0 + 127479365 127480532 255,0,0
chr7 127480532 127481699 Neg4 0 - 127480532 127481699 0,0,255

Best regards,
Marc
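
The check suggested above could look roughly like the following Python sketch. ABRA itself is Java, `parse_bed` is a hypothetical name, and this is not the project's `RegionLoader`. It skips `browser`, `track` and comment lines, and any line with fewer than three tab-separated fields.

```python
def parse_bed(lines):
    """Yield (chrom, start, end) from BED lines, ignoring header lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        first = line.split(None, 1)[0]
        if first in ("browser", "track"):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            # Not a data line: too few columns to hold chrom/start/end.
            continue
        yield fields[0], int(fields[1]), int(fields[2])


if __name__ == "__main__":
    bed = [
        "browser position chr7:127471196-127495720",
        "browser hide all",
        'track name="ItemRGBDemo" description="Item RGB demonstration" visibility=2',
        "chr7\t127471196\t127472363\tPos1\t0\t+\t127471196\t127472363\t255,0,0",
    ]
    print(list(parse_bed(bed)))
```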

Error running abra [0.96]

Hello,

I was able to run abra for a number of BAM files. But a few of them are giving me this error:

--
Fri Nov 13 03:44:57 CST 2015 : Indexing contigs
Running: [bwa index abra_temp_dir/clean_contigs.fasta]
[bwa_index] Pack FASTA... 0.21 sec
[bwa_index] Construct BWT for the packed sequence...
[bwa_index] 7.06 seconds elapse.
[bwa_index] Update BWT... 0.13 sec
[bwa_index] Pack forward-only FASTA... 0.15 sec
[bwa_index] Construct SA from BWT and Occ... 2.10 sec
[main] Version: 0.7.8-r455
[main] CMD: bwa index abra_temp_dir/clean_contigs.fasta
[main] Real time: 10.393 sec; CPU: 9.660 sec
Stream thread done.
Stream thread done.
BWA time: 10 seconds.
Fri Nov 13 03:45:08 CST 2015 : Contig indexing done
Fri Nov 13 03:45:08 CST 2015 : Aligning original reads to contigs
Running: [bwa aln abra_temp_dir/clean_contigs.fasta abra_temp_dir/temp1/original_reads.fastq.gz -t 12 -o 0 | bwa samse abra_temp_dir/clean_contigs.fasta - abra_temp_dir/temp1/original_reads.fastq.gz -n 1000]

A fatal error has been detected by the Java Runtime Environment:

SIGSEGV (0xb) at pc=0x00007fdc8fd785fe, pid=6272, tid=140585289746176

JRE version: Java(TM) SE Runtime Environment (8.0_20-b26) (build 1.8.0_20-b26)

Java VM: Java HotSpot(TM) 64-Bit Server VM (25.20-b23 mixed mode linux-amd64 compressed oops)

Problematic frame:

V [libjvm.so+0x8e35fe] SR_handler(int, siginfo_, ucontext_)+0x3e

Core dump written. Default location:

An error report file with more information is saved as:

--

I am using Version: 0.7.8-r455 of BWA

Can you please help me with this? Thanks a lot for updating the software!

regards,
Rahul

Amplicon data

Hi

Looks like a great tool. I just wondered whether it will work with amplicon data. I seem to recall HaplotypeCaller struggling with the lack of sequence diversity.

Thanks
Matt

Testing practices

Thank you for your work on this project,

I was wondering if you had any suggestions for testing Abra in a pipeline. For example, with a very small pair of fastqs, say only a few reads, are there any parameters or suggestions that would make an Abra run only take a few seconds? Thank you

SAM Read name length exceeded

Hello, I am working with version 0.69 from the download page released about a month ago. I am running up against a lot of these warnings:

Warning!  Max SAM Read name length exceeded for: HWI-D00291:44:H8934ADXX:1:1113:19955:94195     177     chr1    91448912        16      67S23M11S       chr14   70252830        0       *       *       SA:Z:chr2,65392795,+,52S19M30S,0,0;chr3,180631113,+,38S19M44S,0,0;chr11,49594709,+,19M82S,0,0;chr2,177387370,-,54S19M28S,0,0;   PG:Z:MarkDuplicates     RG:Z:413.14     NM:i:0  YR:i:1  AS:i:23 XS:i:0  YX:i:58

Digging into the code, I see this happens around line 183 of main/java/abra/Sam2Fastq.java, both in the latest checkout and in my version 0.69.

It seems like there could be weirdness going on, because that variable is labeled as readName on line 195, yet it is printed out as what looks like a sam record.

Is all of this expected/ok/alpha version stuff, or is there a bug here?

Thanks for your time!

Excessive run time

A 150x whole exome (64 Mb) had not completed after 24 hours with 18 cores; it was still building all_contigs.fasta (which continues to increase in size). This is a reproducible, sample-specific issue: other samples at similar depth take 1.5 to 3 hours. It occurs with default settings.

Change header in bam

Dear all,
I have a problem with the BAM header after realignment with the abra tool. Does anyone else have the same problem? FixMateInformation doesn't work.
Thank you
Filip

Is YA tag deterministic?

Thank you for your work developing this tool,

I'm trying to validate two pipelines that are being used at our institution. I was wondering whether the YA tag should always result in the same value between runs on the same input bam, or whether the assembly process could potentially result in different contigs between pipeline runs. Any information would be appreciated.

Thank you,
Ian
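One way to check this empirically is to run the tool twice on the same input and compare the YA tag per read. The following is a minimal sketch, assuming both outputs have been dumped to text SAM (for example with samtools view) and that a read is identified well enough by its name plus FLAG; the function names are hypothetical and nothing here reflects how ABRA itself assigns YA.

```python
def ya_tags(sam_lines):
    """Map (read name, FLAG) to the raw YA field, or None when the tag is absent."""
    tags = {}
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        key = (fields[0], int(fields[1]))
        value = None
        # Optional fields start at column 12 and look like TAG:TYPE:VALUE.
        for field in fields[11:]:
            if field.startswith("YA:"):
                value = field
        tags[key] = value
    return tags


def differing_reads(run_a, run_b):
    """Return the sorted keys whose YA field differs between the two runs."""
    a, b = ya_tags(run_a), ya_tags(run_b)
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))
```

An empty result from differing_reads on two runs is consistent with deterministic output for that input, though it does not prove it in general.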

New java.lang.ArrayIndexOutOfBoundsException: 44800 Issue

I originally received the ArrayIndexOutOfBoundsException error in all my samples, and then updated to the 0.94 version, which solved most of my problems. However, in one of my sample sets, I am still getting the below error. In fact, one of the two matched samples used in this analysis was successfully analyzed in a different matched set... so maybe it is an issue specific to the second pair? Any suggestions about what might be going on? Is this another contig problem, but instead of mapping near the ends of MT, it is due to mapping near the end of GL000219?

Also, this contig is listed in the bwa fa.ann index file, and other GL000219 reads map properly:
0 GL000219.1 dna:supercontig supercontig:GRCh37:GL000219.1:1:179198:1 REF
3097937706 179198 0

Index error for read: HW-ST997_0199:1:1311:7476:93735#0 147 GL000219.1 179140 28 38H9S54M = 179140 -54 GAGAGATTGTTCTGGAACCCTATGTTACAGACAAACATTGAGACCATCGTTGCAGTGTTCTGG @@@@@@??@??@?@@@?@@@>??@?>?@@@@@@??@@??@@??@@?>??<>>???>?>>?>=> PG:Z:MarkDuplicates AM:i:4 NM:i:5 SM:i:4 PQ:i:492 UQ:i:240 AS:i:240

java.lang.ArrayIndexOutOfBoundsException: 44800
at abra.CompareToReference2.getBaseAsChar(CompareToReference2.java:368)
at abra.CompareToReference2.getRefBase(CompareToReference2.java:362)
at abra.CompareToReference2.numDifferences(CompareToReference2.java:198)
at abra.CompareToReference2.numMismatches(CompareToReference2.java:70)
at abra.SAMRecordUtils.getEditDistance(SAMRecordUtils.java:166)
at abra.Sam2Fastq.convert(Sam2Fastq.java:100)
at abra.ReAligner.sam2Fastq(ReAligner.java:795)
at abra.PreprocessReadsRunnable.go(PreprocessReadsRunnable.java:32)
at abra.AbraRunnable.run(AbraRunnable.java:20)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Abra JNI entry point v0.94, prefix: 7_99687202_99687369__5_66718093_66718396_1, read_length: 101, kmer_size: 13, min_node_freq: 2, min_$
Assembling: -> abra_tmp_MM134/7_99687202_99687369_contigs.fasta_k13
STOPPED_ON_REPEAT: 7_99688005_99688296
Done assembling(0): abra_tmp_MM134/7_99688005_99688296_contigs.fasta_k11, 0
STOPPED_ON_REPEAT: 7_99669418_99669836
Done assembling(0): abra_tmp_MM134/7_99669418_99669836_contigs.fasta_k23, 0
Done assembling(0): abra_tmp_MM134/7_99687202_99687369_contigs.fasta_k13, 1
STOPPED_ON_REPEAT: 7_99686862_99687038__19_7694492_7694795_1
Done assembling(0): abra_tmp_MM134/7_99686862_99687038_contigs.fasta_k13, 0
Done assembling(0): abra_tmp_MM134/7_99688653_99689143_contigs.fasta_k13, 2

And then it just ends... and FixMate says that the file is truncated. Please help.
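To test the "mapping near the end of the contig" idea, the last reference position covered by a read can be computed from POS and the CIGAR string and compared with the contig length. This is a minimal sketch using the SAM rule that only M, D, N, = and X consume reference bases; the function names are hypothetical, and the numbers come from the read and the fa.ann entry quoted above.

```python
import re

# CIGAR operations that consume reference bases, per the SAM specification.
REF_CONSUMING = set("MDN=X")
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_end(pos, cigar):
    """Return the 1-based position of the last reference base the alignment covers."""
    span = sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in REF_CONSUMING)
    return pos + span - 1


def overruns_contig(pos, cigar, contig_length):
    """True when the aligned span extends past the end of the contig."""
    return reference_end(pos, cigar) > contig_length


# Read at GL000219.1:179140 with CIGAR 38H9S54M; contig length 179198.
print(reference_end(179140, "38H9S54M"))            # 179193
print(overruns_contig(179140, "38H9S54M", 179198))  # False
```

For this read the aligned span ends at 179193, inside the 179198-base contig, so the aligned bases alone do not run off the end. Whether the clipped bases or some other lookup are involved is not something this check can tell.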

[fwrite] Remote I/O error

I am running Abra-0.96 using abra-0.96-SNAPSHOT-jar-with-dependencies.jar

It runs for over 24 hours and gives me the following error:

Clock time in Align and clean contigs: 49461
Sat Dec 19 21:10:43 CST 2015 : Indexing contigs
Running: [bwa index -a bwtsw /qbrc/home/bcantarel/scratch/abra_temp_S2/clean_contigs.fasta]
[bwa_index] Pack FASTA... [fwrite] Remote I/O error
Stream thread done.
Stream thread done.
BWA time: 1313 seconds.
Exception in thread "main" java.lang.RuntimeException: BWA exited with non-zero return code : [1] for command: [bwa index -a bwtsw /qbrc/home/bcantarel/scratch/abra_temp_S2/clean_contigs.fasta]
at abra.Aligner.runCommand(Aligner.java:76)
at abra.Aligner.runCommand(Aligner.java:33)
at abra.Aligner.index(Aligner.java:103)
at abra.ReAligner.alignReads(ReAligner.java:396)
at abra.ReAligner.reAlign(ReAligner.java:214)
at abra.ReAligner.run(ReAligner.java:1240)
at abra.Abra.main(Abra.java:12)

When I try to run this command directly, I get the error:
[bwa_index] unknown algorithm: 'bwtsq'.

Is there a way for me to run the indexing and not have abra start over at the beginning? Also looks like this indexing command is from a previous bwa version, is it possible to update? I couldn't compile on my system, hence I am using the precompiled version.

Errors without BED file

Is there a way to perform realignment without the use of a BED file, for example, in WGS cases?
Thank you.

Trouble with GATK due to Cigar representation

Hi,

I just found out that for some reads right after a deletion, the CIGAR string is changed (e.g. 149M > 16D149M).

In aligned bam
M00288:85:000000000-A96N9:1:1103:13423:15386 163 chr17 7579660 70 149M = 7579755 244

In re-aligned bam
M00288:85:000000000-A96N9:1:1103:13423:15386 163 chr17 7579644 60 16D149M = 7579755

This causes some problems in GATK and MuTect. Would you please look into it?

Thanks,
Joon
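As a possible post-processing workaround, a CIGAR that starts with a deletion can be rewritten by dropping the deletion and moving POS forward by its length. This is a minimal sketch, not ABRA's or GATK's own behaviour; the function name is hypothetical, and mate fields (PNEXT, TLEN) of the paired read would need separate updating.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def strip_leading_deletion(pos, cigar):
    """Drop a deletion that precedes the first aligned base and shift POS past it."""
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    # Skip any leading soft or hard clips; they do not affect POS.
    i = 0
    while i < len(ops) and ops[i][1] in "SH":
        i += 1
    if i < len(ops) and ops[i][1] == "D":
        pos += ops[i][0]  # a deletion consumes reference, so POS moves forward
        del ops[i]
    return pos, "".join(f"{n}{op}" for n, op in ops)


# The realigned record quoted above: POS 7579644, CIGAR 16D149M.
print(strip_leading_deletion(7579644, "16D149M"))  # (7579660, '149M')
```

Applied to the realigned record, this gives back the position and CIGAR of the originally aligned read (7579660, 149M).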

Error running on custom bed file

Hi, I am trying to run ABRA on tumor/normal pairs with a custom BED file created by running GATK's FindCoveredIntervals tool. The reason for doing this is that we also want to run ABRA on off-target regions, for better off-target variant calling. While doing this I got this error:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 4143
at abra.CompareToReference2.getBaseAsChar(CompareToReference2.java:368)
at abra.CompareToReference2.getSequence(CompareToReference2.java:401)
at abra.ReAligner.cleanAndOutputContigs(ReAligner.java:952)
at abra.ReAligner.alignAndCleanContigs(ReAligner.java:523)
at abra.ReAligner.reAlign(ReAligner.java:188)
at abra.ReAligner.run(ReAligner.java:1306)
at abra.Abra.main(Abra.java:12)

I would appreciate your insights on this error and how to avoid it.

build issue

Hi,
I tested building on the following two systems and get the same dependency resolution errors (shown at the bottom) on both.

  1. Centos 7
    Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
    Maven home: /export/home/bwiley4/tools/apache-maven-3.6.3
    Java version: 11.0.10, vendor: Oracle Corporation, runtime: /export/home/bwiley4/tools/jdk-11.0.10
    Default locale: en_US, platform encoding: UTF-8
    OS name: "linux", version: "3.10.0-1127.8.2.el7.x86_64", arch: "amd64", family: "unix"

  2. Ubuntu 20.04
    Apache Maven 3.6.3
    Maven home: /usr/share/maven
    Java version: 1.8.0_152-release, vendor: JetBrains s.r.o, runtime: /home/coyote/anaconda3/jre
    Default locale: en_US, platform encoding: UTF-8
    OS name: "linux", version: "5.9.3-050903-generic", arch: "amd64", family: "unix"

Error:

[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  29.503 s
[INFO] Finished at: 2021-01-26T06:12:26-05:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project abra: Could not resolve dependencies for project abra:abra:jar:0.97-SNAPSHOT: The following artifacts could not be resolved: samtools:sam:jar:1.129, picard:picard:jar:1.129: Could not find artifact samtools:sam:jar:1.129 in UBU repository (http://www.unc.edu/~lmose/maven-repo) -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException

Weird error ...

Whenever I run ABRA, I get the following error. Would you please look into it? Thanks.
Loading native library from: /scratch/BREIGR0124_VAxg_01_1408120760/libAbra.so
Loading reference map: /site/ne/home/wings/ref_data/reference_genome/hg19/chrUn_included/ucsc.hg19.fasta
Done loading ref map. Elapsed secs: 112
Fri Aug 15 12:41:14 EDT 2014 : Reading Input SAM Header and identifying read length
Fri Aug 15 12:41:14 EDT 2014 : Identifying header and determining read length
Min insert length: 0
Max insert length: 240721460
Fri Aug 15 12:42:47 EDT 2014 : Max read length is: 100
Fri Aug 15 12:42:47 EDT 2014 : Min contig length: 101
Fri Aug 15 12:42:47 EDT 2014 : Read length: 100
Fri Aug 15 12:42:47 EDT 2014 : Loading target regions
Exception in thread "main" java.lang.NumberFormatException: For input string: "+"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:484)
at java.lang.Integer.parseInt(Integer.java:527)
at abra.RegionLoader.load(RegionLoader.java:42)
at abra.ReAligner.getRegions(ReAligner.java:784)
at abra.ReAligner.loadRegions(ReAligner.java:794)
at abra.ReAligner.reAlign(ReAligner.java:122)
at abra.ReAligner.run(ReAligner.java:1282)
at abra.Abra.main(Abra.java:12)

My Command:
java -Xmx16g -jar ${ABRA_JAR} \
--in ${sample_id}.all.sorted.dedup.bam \
--out ${sample_id}.all.sorted.dedup.realigned.bam \
--ref ${reference_genome} \
--targets ${!target_bed_file_path} \
--threads 4 --mad 20000 --mbq 27 \
--working ${temp_dir}
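The trace shows RegionLoader failing to parse "+" as an integer, which suggests (this is an assumption, not confirmed from the ABRA source) that the BED file carries a strand column where a number is expected. Another report on this page feeds ABRA only the first three columns via cut -f 1,2,3. A minimal sketch of the same trimming, with a hypothetical function name:

```python
def first_three_columns(lines):
    """Keep chrom, start and end from tab-delimited BED lines; drop headers and blanks."""
    trimmed = []
    for line in lines:
        line = line.rstrip("\r\n")
        # Skip empty lines and the usual BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        trimmed.append("\t".join(fields[:3]))
    return trimmed


example = [
    "track name=targets\n",
    "chr1\t100\t200\tregionA\t0\t+\n",
    "chr2\t300\t400\tregionB\t0\t-\n",
]
for row in first_three_columns(example):
    print(row)
```

Writing the returned rows to a new file and passing that file to --targets would show whether the extra columns are what trips the parser.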

Exception in thread "Thread-212177" java.lang.OutOfMemoryError: PermGen space

Got the following error message:

[main] Version: 0.7.9a-r786
[main] CMD: bwa samse -n 1000 abra_temp_dir/clean_contigs.fasta abra_temp_dir/temp3/align_to_contig.sam.sai abra_temp_dir/temp3/original_reads.fastq.gz
[main] Real time: 1067.556 sec; CPU: 888.432 sec
Stream thread done.
Stream thread done.
BWA time: 1069 seconds.
Clock time in Align to contigs: 12671
Sun Jun 15 19:51:16 CEST 2014 : Adjust reads
Sun Jun 15 19:51:16 CEST 2014 : Adjusting reads.
Sun Jun 15 19:51:16 CEST 2014 : Adjusting reads.
Sun Jun 15 19:51:16 CEST 2014 : Adjusting reads.
Exception in thread "Thread-212177" java.lang.OutOfMemoryError: PermGen space
at java.lang.String.intern(Native Method)
at net.sf.samtools.SAMSequenceRecord.<init>(SAMSequenceRecord.java:85)
at net.sf.samtools.SAMTextHeaderCodec.parseSQLine(SAMTextHeaderCodec.java:209)
at net.sf.samtools.SAMTextHeaderCodec.decode(SAMTextHeaderCodec.java:100)
at net.sf.samtools.SAMTextReader.readHeader(SAMTextReader.java:185)
at net.sf.samtools.SAMTextReader.<init>(SAMTextReader.java:62)
at net.sf.samtools.SAMTextReader.<init>(SAMTextReader.java:71)
at net.sf.samtools.SAMFileReader.init(SAMFileReader.java:556)
at net.sf.samtools.SAMFileReader.<init>(SAMFileReader.java:167)
at net.sf.samtools.SAMFileReader.<init>(SAMFileReader.java:122)
at abra.ReadAdjuster.adjustReads(ReadAdjuster.java:55)
at abra.AdjustReadsRunnable.go(AdjustReadsRunnable.java:37)
at abra.AbraRunnable.run(AbraRunnable.java:19)
at java.lang.Thread.run(Thread.java:679)

Here is how I ran abra:

PATH=$PATH:~/tools/bwa-0.7.9 java -Xmx32G -jar ~/tools/abra-0.77/abra-0.77-SNAPSHOT-jar-with-dependencies.jar --in /data/current/bam/108C.duplicate_marked.realigned.recalibrated.bam,/data/current/bam/108D.duplicate_marked.realigned.recalibrated.bam,/data/current/bam/108R.duplicate_marked.realigned.recalibrated.bam --kmer 43,53,63,73,83 --out /data/current/bam/108C.duplicate_marked.realigned.recalibrated.abra.bam,/data/current/bam/108D.duplicate_marked.realigned.recalibrated.abra.bam,/data/current/bam/108R.duplicate_marked.realigned.recalibrated.abra.bam --ref ~/generic/data/broad/human_g1k_v37.fasta --targets <(cut -f 1,2,3 /generic/data/illumina/nexterarapidcapture_exome_targetedregions.nochr.bed) --threads 5 --working abra_temp_dir 2>&1 | grep -v "Max SAM Read name length exceeded" | tee abra.log

How much RAM is required to run abra? I allocated 32Gb for the Java VM. In the example above, I ran abra with three exomes of the same patient, each sequenced at average coverage ~50x.

Running ABRA on BAM files generated from the GATK pipeline

I am trying to run ABRA for somatic indel detection. I have BAM files which were aligned using BBMap and were then realigned and quality recalibrated using GATK.
Will the GATK realignment step have any positive or negative effect on the result of ABRA?

somatic mode

As suggested in the README, I was trying to realign the tumor and normal BAMs together. I found that ABRA generated a realigned BAM file only for the tumor BAM. Does ABRA not generate a realigned BAM file for the normal BAM, or did I miss something?

My command:
java -Xmx16g -jar ${ABRA_JAR} \
--in ${tumour_id}.all.sorted.dedup.bam ${normal_id}.all.sorted.dedup.bam \
--out ${tumour_id}.all.sorted.dedup.realigned.bam ${normal_id}.all.sorted.dedup.realigned.bam \
--ref ${reference_genome} \
--targets ${!target_bed_file_path} \
--threads 4 --mad ${mad} --mbq ${mbq} \
--working ${temp_dir}

Thank you,
Joon
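Another report on this page passes several BAMs to --in and --out as comma-separated lists, whereas the command above separates them with spaces. Assuming the comma-separated form is what ABRA expects (not verified here), a small sketch that assembles the arguments that way; the function name and file names are hypothetical, and only flags already seen on this page are used.

```python
def build_abra_args(jar, inputs, outputs, ref, targets, working, threads=4, heap="16g"):
    """Assemble an ABRA command line with comma-joined --in and --out lists."""
    if len(inputs) != len(outputs):
        raise ValueError("need exactly one output BAM per input BAM")
    return [
        "java", f"-Xmx{heap}", "-jar", jar,
        "--in", ",".join(inputs),
        "--out", ",".join(outputs),
        "--ref", ref,
        "--targets", targets,
        "--threads", str(threads),
        "--working", working,
    ]


args = build_abra_args(
    "abra.jar",
    ["tumor.bam", "normal.bam"],
    ["tumor.realigned.bam", "normal.realigned.bam"],
    "ref.fasta", "targets.bed", "abra_tmp",
)
print(" ".join(args))
```

The resulting list can be handed to subprocess.run as-is, which avoids shell quoting problems with paths that contain spaces.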

License?

This looks like interesting, useful, awesome code! Would you be willing to put an explicit copyright license statement on the code in this repository? Personally, I'd vote for an OSI license (http://creativecommons.org/software) but anything explicit would probably enable use, re-use, and integration by the wider community!

Exception in thread Error

ABRA version - 0.96
Platform - CentOS
command used to run ABRA
java -Xmx4G -jar abra-0.96-SNAPSHOT-jar-with-dependencies.jar --in C26.bam --out C26.ABRA.bam --ref hs37d5.fa --targets Agilent_Nimblegen.bed --threads 4 --working temp_dir >abra.log 2> abra.error

abra.error

Loading reference map:  hs37d5.fa
        Chromosome: 1 length: 249250621
        Chromosome: 2 length: 243199373
        Chromosome: 3 length: 198022430
        Chromosome: 4 length: 191154276
        Chromosome: 5 length: 180915260
        Chromosome: 6 length: 171115067
        Chromosome: 7 length: 159138663
        Chromosome: 8 length: 146364022
        Chromosome: 9 length: 141213431
        .....
        Chromosome: GL000224.1 length: 179693
        Chromosome: GL000223.1 length: 180455
        Chromosome: GL000195.1 length: 182896
        Chromosome: GL000212.1 length: 186858
        Chromosome: GL000222.1 length: 186861
        Chromosome: GL000200.1 length: 187035
        Chromosome: GL000193.1 length: 189789
        Chromosome: GL000194.1 length: 191469
        Chromosome: GL000225.1 length: 211173
        Chromosome: GL000192.1 length: 547496
        Chromosome: NC_007605 length: 171823
        Chromosome: hs37d5 length: 35477943
Done loading ref map.  Elapsed secs: 169
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
        at abra.RegionLoader.load(RegionLoader.java:45)
        at abra.ReAligner.getRegions(ReAligner.java:702)
        at abra.ReAligner.loadRegions(ReAligner.java:712)
        at abra.ReAligner.reAlign(ReAligner.java:132)
        at abra.ReAligner.run(ReAligner.java:1240)
        at abra.Abra.main(Abra.java:12)

abra.log

Starting 0.96 ...
input0: C26.bam
output0: C26.ABRA.bam
regions: Agilent_Nimblegen.bed
reference: hs37d5.fa
bwa index: hs37d5.fa
working dir: temp_dir
num threads: 4
max unaligned reads: 50000000
minEdgeFrequency: 0
minNodeFrequncy: 2
minContigLength: -1
maxPotentialContigs: 5000
minBaseQuality: 60
minReadCandidateFraction: 0.01
maxAverageRegionDepth: 250
minEdgeRatio: 0.02

rna: null
rna output: null
paired end: true
use intermediate bam: false
Java version: 1.7.0_85
hostname: nski0244
Loading native library from: /opt/ngstools/aligners/temp_dir/libAbra.so
Fri May 13 15:29:40 EDT 2016 : Reading Input SAM Header and identifying read length
Fri May 13 15:29:40 EDT 2016 : Identifying header and determining read length
Min insert length: 0
Max insert length: 246703503
Fri May 13 15:31:41 EDT 2016 : Max read length is: 93
Fri May 13 15:31:41 EDT 2016 : Min contig length: 94
Fri May 13 15:31:41 EDT 2016 : Read length: 93
Fri May 13 15:31:41 EDT 2016 : Loading target regions

ls temp_dir/
libAbra.so unaligned

ls temp_dir/unaligned/ (empty)

Would appreciate help,

Thank you
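An ArrayIndexOutOfBoundsException: 1 raised while loading target regions is what one would expect if a line of the BED file splits into a single field, for instance a header line, a blank line, or space-delimited columns. That is a guess from the trace, not from the ABRA source. A minimal sketch that flags such lines before a run, with a hypothetical function name:

```python
def find_bad_bed_lines(lines):
    """Return (line number, reason) for lines that are not chrom<TAB>start<TAB>end[...]."""
    problems = []
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 3:
            problems.append((number, "fewer than 3 tab-separated fields"))
        elif not (fields[1].isdigit() and fields[2].isdigit()):
            problems.append((number, "start/end are not non-negative integers"))
    return problems


example = [
    "1\t100\t200\n",
    "1 300 400\n",        # spaces instead of tabs
    "\n",                 # blank line
    "track name=x\n",     # header line
    "1\tabc\t500\n",      # non-numeric start
]
for number, reason in find_bad_bed_lines(example):
    print(number, reason)
```

Running this over the lines of Agilent_Nimblegen.bed would show whether any line falls into one of these cases.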
