assembly-stats

Get assembly statistics from FASTA and FASTQ files.

Installation

If you encounter an issue when installing assembly-stats, please contact your local system administrator. If you encounter a bug, please log it on the issues page.

Dependencies

  • zlib

Compiling from source

Run the following commands to install the program assembly-stats to /usr/local/bin/.

mkdir build
cd build
cmake ..
make
make test
make install

If you do not have root access, you can install to a directory of your choice by changing the call to cmake. For example:

cmake -DINSTALL_DIR:PATH=/foo/bar/ ..

would mean you finish up with a copy of assembly-stats in the directory /foo/bar/.

Usage

Get statistics from a list of files:

assembly-stats file.fasta another_file.fastq

Detection of FASTA or FASTQ format of each file is automatic from the file contents, so file names and extensions are irrelevant.
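
Content-based detection of this kind can be sketched in a few lines of Python. This is only an illustration, not the code assembly-stats uses: the function name sniff_format is hypothetical, and it assumes the usual convention that FASTA records start with ">" and FASTQ records start with "@".

```python
import io


def sniff_format(handle):
    """Guess 'fasta' or 'fastq' from the first non-blank line of a text handle.

    Assumption: FASTA headers start with '>' and FASTQ headers with '@'.
    """
    for line in handle:
        stripped = line.strip()
        if not stripped:
            continue  # skip leading blank lines
        if stripped.startswith(">"):
            return "fasta"
        if stripped.startswith("@"):
            return "fastq"
        break  # first real line is neither kind of header
    raise ValueError("input does not look like FASTA or FASTQ")


# The file name never enters the decision, only the contents do.
print(sniff_format(io.StringIO(">contig1\nACGT\n")))
print(sniff_format(io.StringIO("@read1\nACGT\n+\nIIII\n")))
```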

The files can be supplied in compressed format (.gz, .bz2 or .xz). Compression support depends on what libraries are available when assembly-stats is compiled. Compression type is detected automatically and does not depend on the file name extensions.
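
Detecting the compression type without looking at the extension is usually done by checking the first few bytes of the file (the "magic number"). The sketch below shows the idea in Python using the well-known signatures for gzip, bzip2 and xz. It is an assumption that assembly-stats works along these lines; the names MAGIC and sniff_compression are hypothetical.

```python
import bz2
import gzip
import lzma

# Leading bytes that identify each container format.
MAGIC = {
    b"\x1f\x8b": "gz",
    b"BZh": "bz2",
    b"\xfd\x37\x7a\x58\x5a\x00": "xz",
}


def sniff_compression(data: bytes) -> str:
    """Return 'gz', 'bz2', 'xz' or 'none' based on the leading bytes of data."""
    for magic, name in MAGIC.items():
        if data.startswith(magic):
            return name
    return "none"


record = b">contig1\nACGT\n"
print(sniff_compression(gzip.compress(record)))
print(sniff_compression(bz2.compress(record)))
print(sniff_compression(lzma.compress(record)))
print(sniff_compression(record))
```

In practice only the first six bytes of the file need to be read before deciding how to open it.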

The default output format is human readable. You can change the output format and ignore sequences shorter than a given length. Get the full usage by running with no files listed:

$ assembly-stats
usage: stats [options] <list of fasta/q files>

Reports sequence length statistics from fasta and/or fastq files

options:
-l <int>
    Minimum length cutoff for each sequence.
    Sequences shorter than the cutoff will be ignored [1]
-s
    Print 'grep friendly' output
-t
    Print tab-delimited output
-u
    Print tab-delimited output with no header line

Example

Here is an example on the Plasmodium falciparum reference genome:

$ assembly-stats Pf3D7_v3.fasta
stats for Pf3D7_v3.fasta
sum = 23328019, n = 16, ave = 1458001.19, largest = 3291936
N50 = 1687656, n = 5
N60 = 1472805, n = 7
N70 = 1445207, n = 8
N80 = 1343557, n = 10
N90 = 1067971, n = 12
N100 = 5967, n = 16
N_count = 0
Gaps = 0

Most of the numbers are self-explanatory, except perhaps lines like N50 = 1687656, n = 5. This means the N50 is 1687656, with at least 50% of the assembly contained in the 5 longest sequences. A "gap" is any consecutive run of Ns (undetermined nucleotide bases) of any length; matching is case-insensitive, so a lowercase "n" counts as well. N_count is the total number of Ns across the entire assembly.
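
To make these definitions concrete, here is a minimal Python sketch computed on toy data. It is not taken from the assembly-stats source: the function names nx_stats and n_count_and_gaps are hypothetical, and the choice of ">=" when comparing the running total against the threshold is an assumption about how ties are handled.

```python
import re


def nx_stats(lengths, x):
    """Return (Nx, n) for a list of sequence lengths.

    Sequences are visited from longest to shortest. Nx is the length of the
    sequence at which the running total first reaches x% of the total length,
    and n is how many sequences were needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running * 100 >= total * x:  # integer arithmetic avoids rounding issues
            return length, count
    raise ValueError("no sequences given")


def n_count_and_gaps(sequence):
    """Return (N_count, Gaps): total Ns, and number of consecutive runs of Ns."""
    runs = re.findall(r"[Nn]+", sequence)  # case-insensitive runs of N
    return sum(len(run) for run in runs), len(runs)


toy_lengths = [80, 70, 50, 40, 30, 20, 10]  # total length 300
print(nx_stats(toy_lengths, 50))    # (70, 2)
print(nx_stats(toy_lengths, 90))    # (30, 5)
print(nx_stats(toy_lengths, 100))   # (10, 7)
print(n_count_and_gaps("ACGTNNNNacgtnACGT"))  # (5, 2)
```

With this definition N100 is always the length of the shortest sequence and its n is the total number of sequences, which matches the N100 line in the example output above.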

License

assembly-stats is free software, licensed under GPLv3.

Feedback/Issues

We currently do not have the resources to provide support for assembly-stats. However, the community might be able to help you out if you report any issues about usage of the software to the issues page.

assembly-stats's People

Contributors

aslett1, dkj, flass, martinghunt, ssjunnebo, tmaklin, tnguyensanger


assembly-stats's Issues

N50 and L50

Just one question: in the assembly-stats example on your main page, for N50 = 1687656, n = 5, does n correspond to L50?
Thanks

$ assembly-stats Pf3D7_v3.fasta
stats for Pf3D7_v3.fasta
sum = 23328019, n = 16, ave = 1458001.19, largest = 3291936
N50 = 1687656, n = 5
N60 = 1472805, n = 7
N70 = 1445207, n = 8
N80 = 1343557, n = 10
N90 = 1067971, n = 12
N100 = 5967, n = 16
N_count = 0
Gaps = 0

Different output in single line vs multi line fasta

The software gives different output for single-line and multi-line FASTA files.
It seems that "#lines = bytes" is being added to the contig size, i.e. one extra byte is counted per line.

This needs urgent fixing to avoid confusion, or the software should at least warn people when it runs.
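
For reference, the expected behaviour can be stated as a small Python sketch: line terminators are stripped before counting, so the same sequence has the same length however it is wrapped. This is an independent illustration for checking results by hand, not the parser used by assembly-stats; the function name fasta_lengths is hypothetical.

```python
import io


def fasta_lengths(handle):
    """Map each FASTA record name to its sequence length, ignoring line breaks."""
    lengths = {}
    name = None
    for line in handle:
        line = line.rstrip("\r\n")  # drop Unix or Windows line endings
        if line.startswith(">"):
            name = line[1:].strip()
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


single_line = io.StringIO(">contig1\nACGTACGTACGT\n")
multi_line = io.StringIO(">contig1\nACGT\nACGT\nACGT\n")

print(fasta_lengths(single_line))  # {'contig1': 12}
print(fasta_lengths(multi_line))   # {'contig1': 12}
```

If a file was written with Windows line endings and only "\n" is stripped, each line would contribute one extra byte, which is one possible way to end up with a discrepancy that grows with the number of lines.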

Handle gzipped input

Hello,

It would be nice if assembly-stats could handle gzipped input files.

Best,

Phil

Installation issue Assembly-Stats on Mac

Hi, I am trying to install assembly-stats manually on my machine.
I installed zlib via Homebrew before starting.
I downloaded the .gz file, placed it in my Applications folder and then followed the installation steps on your page.

I get multiple errors.
First, when running "make test", I get this problem:


Running tests...
Test project /Applications/assembly-stats-master/build
    Start 1: runUnitTests
Could not find executable runUnitTests
Looked in the following places:
runUnitTests
runUnitTests
Release/runUnitTests
Release/runUnitTests
Debug/runUnitTests
Debug/runUnitTests
MinSizeRel/runUnitTests
MinSizeRel/runUnitTests
RelWithDebInfo/runUnitTests
RelWithDebInfo/runUnitTests
Deployment/runUnitTests
Deployment/runUnitTests
Development/runUnitTests
Development/runUnitTests
Unable to find executable: runUnitTests
1/1 Test #1: runUnitTests .....................***Not Run   0.00 sec

0% tests passed, 1 tests failed out of 1

Total Test time (real) =   0.00 sec

The following tests FAILED:
	  1 - runUnitTests (Not Run)
Errors while running CTest
Output from these tests are in: /Applications/assembly-stats-master/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.
make: *** [test] Error 8

Secondly, when I run "make install":

[  5%] Building CXX object gtest-1.7.0/CMakeFiles/gtest.dir/src/gtest-all.cc.o
[ 10%] Linking CXX static library libgtest.a
[ 10%] Built target gtest
[ 15%] Building CXX object gtest-1.7.0/CMakeFiles/gtest_main.dir/src/gtest_main.cc.o
[ 21%] Linking CXX static library libgtest_main.a
[ 21%] Built target gtest_main
[ 26%] Building CXX object CMakeFiles/filetype.dir/filetype.cpp.o
In file included from /Applications/assembly-stats-master/filetype.cpp:1:
In file included from /Applications/assembly-stats-master/filetype.h:8:
In file included from /Applications/assembly-stats-master/build/external/bxzstr/include/bxzstr.hpp:18:
/Applications/assembly-stats-master/build/external/bxzstr/include/stream_wrapper.hpp:17:33: warning: defaulted function definitions are a C++11 extension [-Wc++11-extensions]
    virtual ~stream_wrapper() = default;
                                ^
In file included from /Applications/assembly-stats-master/filetype.cpp:1:
In file included from /Applications/assembly-stats-master/filetype.h:8:
In file included from /Applications/assembly-stats-master/build/external/bxzstr/include/bxzstr.hpp:19:
/Applications/assembly-stats-master/build/external/bxzstr/include/strict_fstream.hpp:62:18: error: exception specification of overriding function is more lax than base version
    const char * what() const noexcept { return _msg.c_str(); }
                 ^
/Library/Developer/CommandLineTools/SDKs/MacOSX13.3.sdk/usr/include/c++/v1/exception:107:25: note: overridden virtual function is here
    virtual const char* what() const _NOEXCEPT;
                        ^
In file included from /Applications/assembly-stats-master/filetype.cpp:1:
In file included from /Applications/assembly-stats-master/filetype.h:8:
In file included from /Applications/assembly-stats-master/build/external/bxzstr/include/bxzstr.hpp:19:
/Applications/assembly-stats-master/build/external/bxzstr/include/strict_fstream.hpp:62:30: error: expected ';' at end of declaration list
    const char * what() const noexcept { return _msg.c_str(); }
                             ^
                             ;
/Applications/assembly-stats-master/build/external/bxzstr/include/strict_fstream.hpp:61:41: error: member initializer '_msg' does not name a non-static data member or base class
    Exception(const std::string& msg) : _msg(msg) {}
                                        ^~~~~~~~~
/Applications/assembly-stats-master/build/external/bxzstr/include/strict_fstream.hpp:152:18: warning: defaulted function definitions are a C++11 extension [-Wc++11-extensions]
    ifstream() = default;
                 ^
/Applications/assembly-stats-master/build/external/bxzstr/include/strict_fstream.hpp:172:18: warning: defaulted function definitions are a C++11 extension [-Wc++11-extensions]
    ofstream() = default;
                 ^
/Applications/assembly-stats-master/build/external/bxzstr/include/strict_fstream.hpp:191:17: warning: defaulted function definitions are a C++11 extension [-Wc++11-extensions]
    fstream() = default;
                ^
In file included from /Applications/assembly-stats-master/filetype.cpp:1:
In file included from /Applications/assembly-stats-master/filetype.h:8:
In file included from /Applications/assembly-stats-master/build/external/bxzstr/include/bxzstr.hpp:20:
In file included from /Applications/assembly-stats-master/build/external/bxzstr/include/compression_types.hpp:14:
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:64:18: error: exception specification of overriding function is more lax than base version
    const char * what() const noexcept { return _msg.c_str(); }
                 ^
/Library/Developer/CommandLineTools/SDKs/MacOSX13.3.sdk/usr/include/c++/v1/exception:107:25: note: overridden virtual function is here
    virtual const char* what() const _NOEXCEPT;
                        ^
In file included from /Applications/assembly-stats-master/filetype.cpp:1:
In file included from /Applications/assembly-stats-master/filetype.h:8:
In file included from /Applications/assembly-stats-master/build/external/bxzstr/include/bxzstr.hpp:20:
In file included from /Applications/assembly-stats-master/build/external/bxzstr/include/compression_types.hpp:14:
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:64:30: error: expected ';' at end of declaration list
    const char * what() const noexcept { return _msg.c_str(); }
                             ^
                             ;
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:25:34: error: member initializer '_msg' does not name a non-static data member or base class
    bzException(const int ret) : _msg("bzlib: ") {
                                 ^~~~~~~~~~~~~~~
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:28:17: error: use of undeclared identifier '_msg'
                _msg += "BZ_CONFIG_ERROR: ";
                ^
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:31:3: error: use of undeclared identifier '_msg'
                _msg += "BZ_SEQUENCE_ERROR: ";
                ^
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:34:3: error: use of undeclared identifier '_msg'
                _msg += "BZ_PARAM_ERROR: ";
                ^
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:37:3: error: use of undeclared identifier '_msg'
                _msg += "BZ_MEM_ERROR: ";
                ^
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:40:3: error: use of undeclared identifier '_msg'
                _msg += "BZ_DATA_ERROR: ";
                ^
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:43:3: error: use of undeclared identifier '_msg'
                _msg += "BZ_DATA_ERROR_MAGIC: ";
                ^
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:46:3: error: use of undeclared identifier '_msg'
                _msg += "BZ_IO_ERROR: ";
                ^
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:49:3: error: use of undeclared identifier '_msg'
                _msg += "BZ_UNEXPECTED_EOF: ";
                ^
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:52:3: error: use of undeclared identifier '_msg'
                _msg += "BZ_OUTBUFF_FULL: ";
                ^
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:57:3: error: use of undeclared identifier '_msg'
                _msg += "[" + oss.str() + "]: ";
                ^
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:60:9: error: use of undeclared identifier '_msg'
        _msg += ret;
        ^
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:62:42: error: member initializer '_msg' does not name a non-static data member or base class
    bzException(const std::string msg) : _msg(msg) {}
                                         ^~~~~~~~~
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:87:5: error: exception specification of overriding function is more lax than base version
    ~bz_stream_wrapper() {
    ^
/Applications/assembly-stats-master/build/external/bxzstr/include/stream_wrapper.hpp:17:13: note: overridden virtual function is here
    virtual ~stream_wrapper() = default;
            ^
In file included from /Applications/assembly-stats-master/filetype.cpp:1:
In file included from /Applications/assembly-stats-master/filetype.h:8:
In file included from /Applications/assembly-stats-master/build/external/bxzstr/include/bxzstr.hpp:20:
In file included from /Applications/assembly-stats-master/build/external/bxzstr/include/compression_types.hpp:14:
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:95:31: warning: 'override' keyword is a C++11 extension [-Wc++11-extensions]
    int decompress(const int) override {
                              ^
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:100:45: warning: 'override' keyword is a C++11 extension [-Wc++11-extensions]
    int compress(const int _flags = BZ_RUN) override {
                                            ^
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:105:29: warning: 'override' keyword is a C++11 extension [-Wc++11-extensions]
    bool stream_end() const override { return this->ret == BZ_STREAM_END; }
                            ^
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:106:23: warning: 'override' keyword is a C++11 extension [-Wc++11-extensions]
    bool done() const override { return this->stream_end(); }
                      ^
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:108:36: warning: 'override' keyword is a C++11 extension [-Wc++11-extensions]
    const uint8_t* next_in() const override { return (uint8_t*)bz_stream::next_in; }
                                   ^
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:109:27: warning: 'override' keyword is a C++11 extension [-Wc++11-extensions]
    long avail_in() const override { return bz_stream::avail_in; }
                          ^
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:110:31: warning: 'override' keyword is a C++11 extension [-Wc++11-extensions]
    uint8_t* next_out() const override { return (uint8_t*)bz_stream::next_out; }
                              ^
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:111:28: warning: 'override' keyword is a C++11 extension [-Wc++11-extensions]
    long avail_out() const override { return bz_stream::avail_out; }
                           ^
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:113:47: warning: 'override' keyword is a C++11 extension [-Wc++11-extensions]
    void set_next_in(const unsigned char* in) override { bz_stream::next_in = (char*)in; }
                                              ^
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:114:38: warning: 'override' keyword is a C++11 extension [-Wc++11-extensions]
    void set_avail_in(const long in) override { bz_stream::avail_in = in; }
                                     ^
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:115:42: warning: 'override' keyword is a C++11 extension [-Wc++11-extensions]
    void set_next_out(const uint8_t* in) override { bz_stream::next_out = (char*)in; }
                                         ^
/Applications/assembly-stats-master/build/external/bxzstr/include/bz_stream_wrapper.hpp:116:39: warning: 'override' keyword is a C++11 extension [-Wc++11-extensions]
    void set_avail_out(const long in) override { bz_stream::avail_out = in; }
                                      ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
16 warnings and 20 errors generated.
make[2]: *** [CMakeFiles/filetype.dir/filetype.cpp.o] Error 1
make[1]: *** [CMakeFiles/filetype.dir/all] Error 2
make: *** [all] Error 2

I have consulted multiple web resources but I could not find a solution.
How can I fix this?

EasyBuild installation recipe and a patch file

I would like to share an easyconfig file that I have just written to install this tool on our Tier-2 machines. You can find the files in this PR: easybuilders/easybuild-easyconfigs#20281.

  • I succeeded in installing the tool and creating the assembly-stats/1.0.1-GCCcore-11.3.0 module on our Intel Skylake, Cascadelake and Icelake nodes, running Rocky 8.9.
  • The build was impossible without a small patch file (see the PR linked above).
