bella's People

Contributors

aydinbuluc, giuliaguidi, isratnisa, kodingkoning, mellis-github, qizhou1512, richardlett

bella's Issues

Problem with running evaluation test

Hello,

I'm attempting to use BELLA but have encountered a problem in the alignment step.
The installation is on CentOS 3.10, using gcc 6.3.0, with a virtualenv pip install of simplesam. When I try the test files using ./bella -i test.txt -o bella_output -d 30, the produced output file 'bella_output.out' is empty.

Looking over the printout, I find the line 'Average length of successful alignment -nan bps'. Everything else seems fine.


paeruginosa30x_0001_5reads.fastq: 0 MB
K-mer counting: BELLA
Output filename: bella_output.out
K-mer length: 17
X-drop: 7
Depth: 30X
Compute alignment: true
Seeding: two-kmer

Running with up to 4 threads
Reading FASTQ file paeruginosa30x_0001_5reads.fastq
Initial parsing, error estimation, and k-mer loading took: 0.0749692s

Cardinality estimate is 27251
Table size is: 186907 bits, 0.0222811 MB
Optimal number of hash functions is: 5
First pass of k-mer counting took: 0.0391129s
Second pass of k-mer counting took: 0.0153589s

Entries within reliable range: 231
Error rate estimate is -nan
Reliable lower bound: 2
Reliable upper bound: 30
Deviation from expected alignment score: 0.2
Constant of adaptive threshold: -nan

Running with up to 4 threads
Reading FASTQ file paeruginosa30x_0001_5reads.fastq
Fastq(s) parsing fastq took: 0.0317728s
Total number of reads: 5

Old number of nonzeros before merging: 474
New number of nonzeros after merging: 474
Old number of nonzeros before merging: 474
New number of nonzeros after merging: 474
Sparse matrix construction took: 0.00283812s

Available RAM is assumed to be: 8000 MB
FLOPS is 255
nnz(output): 10 | free memory: 8.38861e+09 | required memory: 288
Stages: 1 | max nnz per stage: 291271111

Columns [0 - 5] overlap time: 0.00734909s
Creating or appending to output file with 0 MB

Columns [0 - 5] alignment time: 0.0329758s | alignment rate: 937294 bases/s | average read length: 5556.6 | read pairs aligned this stage: 10
Average length of successful alignment -nan bps
Average length of failed alignment 5788.6 bps

Outputted 0 lines in 0.0115177s
Total running time: 0.346414s
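
One plausible reading of this log, not confirmed against BELLA's source: "Outputted 0 lines" together with a "-nan" average for successful alignments suggests that no read pair passed, so the average was computed over a count of zero. The sketch below only illustrates how such a value arises in floating-point arithmetic; the function names are hypothetical and are not taken from BELLA.

```cpp
#include <cmath>
#include <cstddef>

// Naive mean: with zero successful alignments this evaluates 0.0 / 0.0,
// which is NaN under IEEE 754 floating-point arithmetic.
double naiveMean(double totalLength, std::size_t count)
{
    return totalLength / static_cast<double>(count);
}

// Guarded mean: reports 0.0 when there is nothing to average.
double guardedMean(double totalLength, std::size_t count)
{
    return count == 0 ? 0.0 : totalLength / static_cast<double>(count);
}

// True when a computed mean is NaN, i.e. it was taken over an empty set.
bool isUndefined(double mean)
{
    return std::isnan(mean);
}
```

If this reading is right, the "-nan" lines are a symptom of the empty result rather than its cause; the question is still why none of the 10 aligned pairs passed the threshold.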

build error bella-gpu, issue with large input

I'm getting the same error as issue #32 with a copy of the repository cloned today, on the same system configuration described in issue #32: Ubuntu 20.04, nvcc 11.6, gcc 9.4. The comment in that issue that LOGAN may have issues with "large-ish" input is a concern, because I'd like to use bella-gpu on a file containing about 600 gigabases of long-read sequences.
Two questions: (1) Is there a fix for the compile failure other than to install different versions of nvcc and gcc?
(2) If I'm successful in compiling, will bella-gpu process 600 gigabases of input?
Thanks!

I can't type the letter b

I was doing some testing on my computer and I couldn't type the letter 'b' in the control centre

error when installing the sof

gcc -O3 -fopenmp -c -o bound.o kmercode/bound.cpp
In file included from /usr/include/c++/4.9/array:35:0,
from kmercode/bound.cpp:11:
/usr/include/c++/4.9/bits/c++0x_warning.h:32:2: error: #error This file requires compiler and library support for the ISO C++ 2011 standard. This support is currently experimental, and must be enabled with the -std=c++11 or -std=gnu++11 compiler options.
#error This file requires compiler and library support for the
^
make: *** [bound.o] Error 1

How can I solve these errors? Thanks!
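
The compiler message above names its own remedy: enable C++11 with -std=c++11 or -std=gnu++11. A minimal sketch of a file that depends on the same header is given here; the helper sumArray is hypothetical and exists only to exercise <array>.

```cpp
#include <array>
#include <cstddef>

// <array> is a C++11 header, so this file only compiles when the compiler
// runs in C++11 mode or newer (for example with -std=c++11 on GCC 4.9).
static_assert(__cplusplus >= 201103L, "compile with -std=c++11 or newer");

// Sum a fixed-size std::array; hypothetical helper used only to exercise <array>.
template <std::size_t N>
int sumArray(const std::array<int, N>& values)
{
    int total = 0;
    for (std::size_t i = 0; i < N; ++i)
    {
        total += values[i];
    }
    return total;
}
```

Assuming the build invokes the compiler as shown in the log, the failing command with the flag added would read gcc -std=c++11 -O3 -fopenmp -c -o bound.o kmercode/bound.cpp. Where that flag belongs in the project's makefile is not shown in this report.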

Too much memory requested

Hi,

I've been trying to use bella b150701 to assemble a small portion of the human genome. Whenever I try to use it, though, I run into the issue that bella apparently sometimes wants insane amounts of memory.

./bella -k 13 -i input.fastq -o test-bella -d 3

sometimes gives

@id1 : 133599645 MB
ACGT .... : 133599645 MB
: 133599645 MB

When I let this run, the memory fills up until I have to restart my computer.
Only sometimes do I get something more feasible (with the exact same command, mind you):

@id1 : 3 MB
ACGT .... : 3 MB
: 3 MB

I've tried this with different depth settings and with different k-mer lengths, but the problem seems to always occur.

Freeing invalid pointer when using non-zero window size for minimizer selection

Hi Giulia,

I've run into a problem when I try to run Bella using minimizers. Specifically, when I run the command

"bella -f input.txt -o output --paf -k 31 -w 17 -e 0.005 -l 2 -u 100",

I get the following output log:

INFO: src/../include/kmercount.hpp(108) InputFile = metagenome-anonymous.fastq
INFO: src/../include/kmercount.hpp(109) InputSize = 67.693069 MB
INFO: src/main.cpp(200) OutputFile = output.out
INFO: src/main.cpp(203) kmerSize = 31
INFO: src/main.cpp(206) GPUs = DISABLED
INFO: src/main.cpp(209) UserDefinedMemory = 8000.000000 MB
INFO: src/main.cpp(212) OutputPAF = 1
INFO: src/main.cpp(215) BinSize = 500
INFO: src/main.cpp(218) DeltaChernoff = 0.100000
INFO: src/main.cpp(221) RunPairwiseAlignment = 1
INFO: src/main.cpp(231) HOPC = DISABLED
INFO: src/main.cpp(235) xDrop = 7
INFO: src/main.cpp(238) KmerSplitCount = 1
INFO: src/main.cpp(243) useMinimizer = ENABLED
INFO: src/main.cpp(245) minimizerWindow = 17
INFO: src/main.cpp(276) numThreads = 64
INFO: src/../include/kmercount.hpp(711) ReadingFASTQ = metagenome-anonymous.fastq
*** Error in `bella': free(): invalid pointer: 0x00002aaacc076480 ***
Aborted

When I run without the "-w 17" option, it runs perfectly.

Thanks,
Gabe

Segfault in matrix transpose

Issue I neglected earlier

Data set:

/project/projectdirs/mp309/bella-spgemm/ecoli_hifi_29x.fastq

Parameters:

-k 31 -l 20 -u 23

Fix:

The issue appears to be in include/common/transpose.h, ~ line 35.

for (IT i=0; i <= n; i++)
{
    cscColPtr[i+1] = atomicColPtr[i] + cscColPtr[i];
}

cscColPtr is of length n+1, I believe (n rows; the extra entry stores nnz).

However, with i <= n the last iteration writes to index n+1 (i.e. the (n+2)nd element), which is out of bounds.

I believe the correct code would be

for (IT i=0; i < n; i++)

cscColPtr[i+1] already handles the +1 size correctly for the last index (n).
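
The bounds argument above can be checked with a small standalone sketch of the same prefix sum. This is not the code from include/common/transpose.h: the function name, the use of std::vector, and the parameter colCounts (standing in for atomicColPtr) are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Build column pointers from per-column nonzero counts with an exclusive
// prefix sum. `colCounts` stands in for atomicColPtr in the report above.
std::vector<std::size_t> buildColPtr(const std::vector<std::size_t>& colCounts)
{
    const std::size_t n = colCounts.size();
    std::vector<std::size_t> colPtr(n + 1, 0);  // n + 1 entries; colPtr[n] ends up as nnz
    for (std::size_t i = 0; i < n; ++i)         // i < n: the last write lands on colPtr[n]
    {
        colPtr[i + 1] = colPtr[i] + colCounts[i];
    }
    return colPtr;
}
```

With counts {2, 0, 3} this yields {0, 2, 2, 5}: four entries for three columns, with the final entry equal to the total number of nonzeros. Running the loop with i <= n would additionally read colCounts[n] and write colPtr[n + 1], both past the end.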

build error bella-gpu

Hi,
When building bella-gpu (make bella-gpu), I get an error:

nvcc -arch=sm_70 -O3 -maxrregcount=32 -std=c++14 -Xcompiler -fopenmp -w -Iinclude/common/GTgraph/sprng2.0-lite/include -IloganGPU -Iseqan -o bella hash_funcs.o Kmer.o Buffer.o fq_reader.o optlist.o src/main.cu -L/home/boelle/Documents/bella/libbloom/build -lbloom -lpthread -lbz2 -lz -D__NVCC__

seqan/seqan/score/score_matrix_dyn.h:110:489: error: template argument 1 is invalid
110 | enum class AminoAcidScoreMatrixID : std::underlying_type_t<decltype(Find<impl::score::MatrixTags, ScoreSpecBlosum30>::VALUE)>


It seems like some template library is missing. Do you have any idea?
Building without GPU works (make bella), and building LOGAN runs fine.

platform : Ubuntu 20.04
gcc : 9.4.0
nvcc : 11.6 (quadro P1000 - architecture=sm_62)

Thanks a lot.

What do MECAT's indices mean?

Hi,

I was working with the evaluation module of BELLA and wanted to test the output from MECAT. For that, I need to provide two files: MECAT's output file and MECAT's indices file. I am not sure what the "MECAT's indices" file means. Where can I find it?
