racon's Introduction

Racon

Latest GitHub release Build status for gcc/clang Published in Genome Research

Consensus module for raw de novo DNA assembly of long uncorrected reads.

Description

Racon is intended as a standalone consensus module to correct raw contigs generated by rapid assembly methods which do not include a consensus step. The goal of Racon is to generate genomic consensus which is of similar or better quality compared to the output generated by assembly methods which employ both error correction and consensus steps, while providing a speedup of several times compared to those methods. It supports data produced by both Pacific Biosciences and Oxford Nanopore Technologies.

Racon can be used as a polishing tool after assembly, with either short accurate data or data produced by third-generation sequencing. The type of input data is detected automatically. Note that Racon expects single-end short reads: paired-end reads should be renamed so that each read has a unique name up to the first whitespace, and joined into a single file before mapping (this can be done with misc/racon_preprocess.py).
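The renaming step for paired-end reads can be sketched as follows. This is a minimal illustration, not the logic of misc/racon_preprocess.py: the function name `rename_and_join`, the `_1`/`_2` suffixes and the assumption of plain four-line FASTQ records are choices made for this example.

```python
import io


def rename_and_join(mate1, mate2, out):
    """Write both mate streams to `out`, giving each read a unique name.

    Assumes plain four-line FASTQ records. The `_1`/`_2` suffix scheme is an
    illustrative choice, not necessarily what racon_preprocess.py does.
    """
    for suffix, handle in (("1", mate1), ("2", mate2)):
        while True:
            header = handle.readline().rstrip("\n")
            if not header:
                break  # end of this mate stream
            sequence = handle.readline().rstrip("\n")
            handle.readline()  # '+' separator line, rewritten below
            quality = handle.readline().rstrip("\n")
            # Keep only the name up to the first whitespace, then make it unique.
            name = header[1:].split()[0]
            out.write("@{}_{}\n{}\n+\n{}\n".format(name, suffix, sequence, quality))


# Usage with in-memory data; real files opened with open() work the same way.
joined = io.StringIO()
rename_and_join(
    io.StringIO("@read1 1:N:0\nACGT\n+\nIIII\n"),
    io.StringIO("@read1 2:N:0\nTTGA\n+\nIIII\n"),
    joined,
)
print(joined.getvalue())
```

Both mates start with the same name (`read1`) up to the first whitespace, so without the suffix they would collide once joined into one file.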

Racon takes as input only three files: contigs in FASTA/FASTQ format, reads in FASTA/FASTQ format and overlaps/alignments between the reads and the contigs in MHAP/PAF/SAM format. Output is a set of polished contigs in FASTA format printed to stdout. All input files can be compressed with gzip (which will have impact on parsing time).

Racon can also be used as a read error-correction tool. In this scenario, the MHAP/PAF/SAM file needs to contain pairwise overlaps between reads including dual overlaps.
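The two modes differ only in the overlaps supplied and in the -f option. The sketch below builds both command lines from Python and uses only the -t and -f options listed under Usage. The helper names `racon_command` and `run_racon` and all file names are hypothetical, and `run_racon` assumes a racon executable is on the PATH.

```python
import subprocess


def racon_command(sequences, overlaps, targets, threads=1, fragment_correction=False):
    """Build the argument list for one racon run (options first, then the three inputs)."""
    command = ["racon", "-t", str(threads)]
    if fragment_correction:
        command.append("-f")  # read error correction instead of contig polishing
    command += [sequences, overlaps, targets]
    return command


def run_racon(command, output_path):
    """Run racon and redirect its stdout (the polished FASTA) to a file."""
    with open(output_path, "w") as output:
        subprocess.run(command, stdout=output, check=True)


# Contig polishing: reads, read-to-contig overlaps, contigs.
polish = racon_command("reads.fastq", "reads_to_contigs.paf", "contigs.fasta", threads=4)
# Read correction: the reads are both the sequences and the targets.
correct = racon_command("reads.fastq", "reads_to_reads.paf", "reads.fastq",
                        threads=4, fragment_correction=True)
print(" ".join(polish))
print(" ".join(correct))
```

Calling `run_racon(polish, "polished.fasta")` would then write the polished contigs to a file instead of the terminal.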

A wrapper script is also available to make racon easier to use on large datasets. It has the same interface as racon and adds two features on top of it. Sequences can be subsampled to decrease total execution time (accuracy might be lower), and target sequences can be split into smaller chunks that are run sequentially to decrease memory consumption. Both features can be used at the same time.

Dependencies

  1. gcc 4.8+ or clang 3.4+
  2. cmake 3.2+
  3. zlib

CUDA Support

  1. gcc 5.0+
  2. cmake 3.10+
  3. CUDA 9.0+

Installation

To install Racon run the following commands:

git clone https://github.com/lbcb-sci/racon && cd racon && mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release .. && make

After successful installation, an executable named racon will appear in build/bin (alongside unit tests racon_test).

Optionally, you can run sudo make install to install the racon executable on your machine.

To build the wrapper script add -Dracon_build_wrapper=ON while running cmake. After installation, an executable named racon_wrapper (python script) will be created in build/bin.

CUDA Support

Racon makes use of NVIDIA's GenomeWorks SDK for CUDA accelerated polishing and alignment.

To build racon with CUDA support, add -Dracon_enable_cuda=ON while running cmake. If CUDA support is unavailable, the cmake step will error out. Note that the CUDA support flag does not produce a new binary target. Instead it augments the existing racon binary itself.

cd build
cmake -DCMAKE_BUILD_TYPE=Release -Dracon_enable_cuda=ON ..
make

Note: Short read polishing with CUDA is still in development!

Packaging

To generate a Debian package for racon, run the following command from the build folder:

make package

Usage

Usage of racon is as follows:

racon [options ...] <sequences> <overlaps> <target sequences>

    # default output is stdout
    <sequences>
        input file in FASTA/FASTQ format (can be compressed with gzip)
        containing sequences used for correction
    <overlaps>
        input file in MHAP/PAF/SAM format (can be compressed with gzip)
        containing overlaps between sequences and target sequences
    <target sequences>
        input file in FASTA/FASTQ format (can be compressed with gzip)
        containing sequences which will be corrected

options:
    -u, --include-unpolished
        output unpolished target sequences
    -f, --fragment-correction
        perform fragment correction instead of contig polishing (overlaps
        file should contain dual/self overlaps!)
    -w, --window-length <int>
        default: 500
        size of window on which POA is performed
    -q, --quality-threshold <float>
        default: 10.0
        threshold for average base quality of windows used in POA
    -e, --error-threshold <float>
        default: 0.3
        maximum allowed error rate used for filtering overlaps
    --no-trimming
        disables consensus trimming at window ends
    -m, --match <int>
        default: 3
        score for matching bases
    -x, --mismatch <int>
        default: -5
        score for mismatching bases
    -g, --gap <int>
        default: -4
        gap penalty (must be negative)
    -t, --threads <int>
        default: 1
        number of threads
    --version
        prints the version number
    -h, --help
        prints the usage

only available when built with CUDA:
    -c, --cudapoa-batches <int>
        default: 0
        number of batches for CUDA accelerated polishing per GPU
    -b, --cuda-banded-alignment
        use banding approximation for polishing on GPU. Only applicable when -c is used.
    --cudaaligner-batches <int>
        default: 0
        number of batches for CUDA accelerated alignment per GPU
    --cudaaligner-band-width <int>
        default: 0
        Band width for cuda alignment. Must be >= 0. Non-zero allows user defined
        band width, whereas 0 implies auto band width determination.
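To make the scoring options -m, -x and -g concrete, the sketch below scores a small hand-made alignment with the default values. It assumes a linear gap model (one -g penalty per gap column), which is inferred from there being a single gap option. The function `alignment_score` is illustrative and not part of racon.

```python
def alignment_score(query, target, match=3, mismatch=-5, gap=-4):
    """Score two equal-length aligned strings in which '-' marks a gap."""
    if len(query) != len(target):
        raise ValueError("aligned strings must have the same length")
    score = 0
    for q, t in zip(query, target):
        if q == "-" or t == "-":
            score += gap       # assumed linear gap penalty, one per column
        elif q == t:
            score += match
        else:
            score += mismatch
    return score


# Four matches, one mismatch (G/C) and one gap.
print(alignment_score("ACGT-A", "ACCTTA"))                                # 3
print(alignment_score("ACGT-A", "ACCTTA", match=8, mismatch=-6, gap=-8))  # 18
```

With the defaults the alignment scores 4 * 3 - 5 - 4 = 3. Raising the match score relative to the penalties makes the same alignment score higher.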

racon_test is run without any parameters.

racon_wrapper is used in the same way as racon, with two additional parameters:

...
options:
    --split <int>
        split target sequences into chunks of desired size in bytes
    --subsample <int> <int>
        subsample sequences to desired coverage (2nd argument) given the
        reference length (1st argument)
    ...
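The arithmetic behind the two wrapper parameters can be sketched as below. Both helpers are hypothetical and only illustrate the numbers involved. The chunk estimate is a plain ceiling division that ignores sequence boundaries, so the number of splits the wrapper actually reports may differ.

```python
def subsample_budget(reference_length, coverage):
    """Number of bases to keep so that the reads cover the reference `coverage` times."""
    return reference_length * coverage


def estimated_chunks(total_target_bytes, split_size):
    """Rough chunk count for --split: ceiling division, ignoring sequence boundaries."""
    return -(-total_target_bytes // split_size)


print(subsample_budget(3000000000, 20))         # 60000000000
print(estimated_chunks(3700000000, 500000000))  # 8
```

For example, `--subsample 3000000000 20` asks for about 60 Gbp of reads, and splitting roughly 3.7 GB of targets with `--split 500000000` gives an estimate of 8 chunks.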

Contact information

For additional information, help and bug reports please send an email to one of the following: [email protected], [email protected], [email protected], [email protected]

Acknowledgment

This work has been supported in part by Croatian Science Foundation under the project UIP-11-2013-7353. IS is supported in part by the Croatian Academy of Sciences and Arts under the project "Methods for alignment and assembly of DNA sequences using nanopore sequencing data". NN is supported by funding from A*STAR, Singapore.

racon's Issues

Log Racon polish edits

Hi, I am following the polishing method of Mc Cartney et al., 2022, where Racon's liftover and master branches were used. However, I have no idea how to use these two and would like to ask.

The detailed method is as follows:

The filtered alignments produced are then used as input to Racon, here the Racon liftover branch is utilised. This is an extension of the master branch of Racon with two custom features:

  • BED selection of regions for polishing
  • logging the changes introduced to the draft sequences to produce the polished output (in VCF, PAF or optionally SAM format)

Racon is run with default options except for two new logging options -L out_prefix and -S, which store the liftover information between the input and output sequences.

Floating Point Exception on GPU

Hi,

Thanks for great software. Running latest version of racon (commit b591b12c22539948782704446989893bde826a29) and hitting a floating point exception on GPU, but CPU works. Thanks for your help!

I'm attaching what I hope is a reproducible example. racon_debug.zip

root@a5698f05c7c3:/data/racon_trouble# racon -m 8 -x -6 -g -8 -w 500 --include-unpolished -t 4 --cudapoa-batches 1 --cudaaligner-batches 4 --cuda-banded-alignment filtered.fastq output.paf polished-input.fa
Using 1 GPU(s) to perform polishing
Initialize device 0
[CUDAPolisher] Constructed.
[racon::Polisher::initialize] loaded target sequences 0.000033 s
[racon::Polisher::initialize] loaded sequences 0.006921 s
[racon::Polisher::initialize] loaded overlaps 0.001669 s
GPU 0: Aligning with band width 68
[racon::CUDAPolisher::initialize] allocated memory on GPUs for alignment 0.989071 s
Alignment skipped by GPU: 415 / 921
[racon::Polisher::initialize] aligning overlaps [====================] 0.035475 s
[racon::Polisher::initialize] transformed data into windows 0.001138 s
[racon::CUDAPolisher::polish] allocated memory on GPUs for polishing 1.352416 s
Floating point exception (core dumped)

SIGSEGV in test suite when using bioparser version 3.0.12, spoa version 4.0.6

Hi,
in Debian bioparser and spoa were upgraded to the latest versions. To accomplish this a patch was applied. When building racon with this patch and running the test suite in gdb I get:

Reading symbols from ../obj-x86_64-linux-gnu/bin/racon_test...
(gdb) run
Starting program: /build/racon-1.4.13/obj-x86_64-linux-gnu/bin/racon_test
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Running main() from /build/googletest-YnT0O3/googletest-1.10.0.20201025/googletest/src/gtest_main.cc
[==========] Running 15 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 5 tests from RaconInitializeTest
[ RUN      ] RaconInitializeTest.PolisherTypeError
[Detaching after fork from child process 4055554]
[       OK ] RaconInitializeTest.PolisherTypeError (4 ms)
[ RUN      ] RaconInitializeTest.WindowLengthError
[Detaching after fork from child process 4055557]
[       OK ] RaconInitializeTest.WindowLengthError (3 ms)
[ RUN      ] RaconInitializeTest.SequencesPathExtensionError
[Detaching after fork from child process 4055559]
[       OK ] RaconInitializeTest.SequencesPathExtensionError (3 ms)
[ RUN      ] RaconInitializeTest.OverlapsPathExtensionError
[Detaching after fork from child process 4055560]
[       OK ] RaconInitializeTest.OverlapsPathExtensionError (6 ms)
[ RUN      ] RaconInitializeTest.TargetPathExtensionError
[Detaching after fork from child process 4055561]
[       OK ] RaconInitializeTest.TargetPathExtensionError (6 ms)
[----------] 5 tests from RaconInitializeTest (22 ms total)

[----------] 10 tests from RaconPolishingTest
[ RUN      ] RaconPolishingTest.ConsensusWithQualities
[New Thread 0x7ffff71f7700 (LWP 4055562)]
[New Thread 0x7ffff69f6700 (LWP 4055563)]
[New Thread 0x7ffff61f5700 (LWP 4055564)]
[New Thread 0x7ffff59f4700 (LWP 4055565)]
[racon::Polisher::initialize] loaded target sequences 0.000784 s
[racon::Polisher::initialize] loaded sequences 0.040465 s
[racon::Polisher::initialize] loaded overlaps 0.000357 s
[racon::Polisher::initialize] aligning overlaps [====================] 0.570651 s
[racon::Polisher::initialize] transformed data into windows 0.002290 s
[racon::Polisher::polish] generating consensus [====================] 1.046491 s
./test/racon_test.cpp:104: Failure
Expected equality of these values:
  polished_sequences.size()
    Which is: 1
  2

Thread 1 "racon_test" received signal SIGSEGV, Segmentation fault.
0x00005555555890b1 in calculateEditDistance (query="", target=<error reading variable: Cannot access memory at address 0x28>) at /usr/include/c++/10/bits/basic_string.h:186
186           _M_data() const

Do you have any idea how to fix this SIGSEGV?

Kind regards, Andreas.

install error

When I run the "cmake -DCMAKE_BUILD_TYPE=Release .." command, I get an error like this:
CMake Error at CMakeLists.txt:46 (add_subdirectory):
The source directory

/ds3512/home/panyp/denovo/racon-1.4.3/vendor/bioparser

does not contain a CMakeLists.txt file.

CMake Error at CMakeLists.txt:49 (add_subdirectory):
The source directory

/ds3512/home/panyp/denovo/racon-1.4.3/vendor/spoa

does not contain a CMakeLists.txt file.

CMake Error at CMakeLists.txt:52 (add_subdirectory):
The source directory

/ds3512/home/panyp/denovo/racon-1.4.3/vendor/thread_pool

does not contain a CMakeLists.txt file.

CMake Error at CMakeLists.txt:55 (add_subdirectory):
The source directory

/ds3512/home/panyp/denovo/racon-1.4.3/vendor/edlib

does not contain a CMakeLists.txt file.
Can you help me fix this problem? Thank you.
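The errors above say that the vendor directories exist but contain no CMakeLists.txt. One way to confirm which ones are affected before running cmake is sketched below. The helper `missing_vendor_projects` is hypothetical, and the directory names are taken from the error messages. If these directories are git submodules (an assumption here), they are typically populated in a git checkout with `git submodule update --init --recursive`.

```python
import os
import tempfile


def missing_vendor_projects(source_dir, names=("bioparser", "spoa", "thread_pool", "edlib")):
    """Return the vendor subdirectories that lack a CMakeLists.txt file."""
    missing = []
    for name in names:
        cmake_file = os.path.join(source_dir, "vendor", name, "CMakeLists.txt")
        if not os.path.isfile(cmake_file):
            missing.append(name)
    return missing


# Self-contained demonstration on a temporary tree in which only spoa is populated.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "vendor", "spoa"))
    open(os.path.join(root, "vendor", "spoa", "CMakeLists.txt"), "w").close()
    demo_missing = missing_vendor_projects(root)
print(demo_missing)  # ['bioparser', 'thread_pool', 'edlib']
```

Run against the real source tree, an empty result would mean the cause lies elsewhere.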

report of the count of corrections

Hi
Is there a way to report the number of corrections made during a pass? Pilon does this, and it is a nice way to evaluate the effect of successive rounds and decide when to stop. I probably missed it here.
thanks

Racon does not build against latest edlib

Hi,
I upgraded the Debian package of edlib to the latest version 1.2.7. Unfortunately racon does not build against this as it is reported in a bug report:

/usr/bin/ld: CMakeFiles/racon_test.dir/test/racon_test.cpp.o: in function `calculateEditDistance(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)':
./obj-x86_64-linux-gnu/./test/racon_test.cpp:18: undefined reference to `edlibDefaultAlignConfig'
/usr/bin/ld: ./obj-x86_64-linux-gnu/./test/racon_test.cpp:18: undefined reference to `edlibAlign'
/usr/bin/ld: ./obj-x86_64-linux-gnu/./test/racon_test.cpp:22: undefined reference to `edlibFreeAlignResult'
/usr/bin/ld: CMakeFiles/racon_test.dir/src/overlap.cpp.o: in function `racon::Overlap::align_overlaps(char const*, unsigned int, char const*, unsigned int)':
./obj-x86_64-linux-gnu/./src/overlap.cpp:208: undefined reference to `edlibNewAlignConfig'
/usr/bin/ld: ./obj-x86_64-linux-gnu/./src/overlap.cpp:208: undefined reference to `edlibAlign'
/usr/bin/ld: ./obj-x86_64-linux-gnu/./src/overlap.cpp:213: undefined reference to `edlibAlignmentToCigar'
/usr/bin/ld: ./obj-x86_64-linux-gnu/./src/overlap.cpp:223: undefined reference to `edlibFreeAlignResult'
collect2: error: ld returned 1 exit status

I wonder whether you plan to adapt racon to the latest edlib (and in case it is just done in some not yet tagged commit whether you intend to tag this as release).
Kind regards, Andreas.

Does not build with thread-pool 4

Hi,
I checked issue 195 of the old repository of racon and thus was optimistic to try it in connection with thread-pool 4.0.0. Unfortunately I get

/build/racon-1.4.21/src/polisher.cpp: In lambda function:
/build/racon-1.4.21/src/polisher.cpp:494:41: error: 'using element_type = class thread_pool::ThreadPool' {aka 'class thread_pool::ThreadPool'} has no member named 'thread_ids'; did  you mean 'thread_map'?
  494 |                 auto it = thread_pool_->thread_ids().find(std::this_thread::get_id());  // NOLINT
      |                                         ^~~~~~~~~~
      |                                         thread_map
/build/racon-1.4.21/src/polisher.cpp: In lambda function:
/build/racon-1.4.21/src/polisher.cpp:494:41: error: 'using element_type = class thread_pool::ThreadPool' {aka 'class thread_pool::ThreadPool'} has no member named 'thread_ids'; did  you mean 'thread_map'?
  494 |                 auto it = thread_pool_->thread_ids().find(std::this_thread::get_id());  // NOLINT
      |                                         ^~~~~~~~~~
      |                                         thread_map
[ 64%] Building CXX object CMakeFiles/racon.dir/src/window.cpp.o
/usr/bin/c++ -DRACON_VERSION=\"v1.4.21\" -I/build/racon-1.4.21/src -I/build/racon-1.4.21/obj-x86_64-linux-gnu/config -g -O2 -ffile-prefix-map=/build/racon-1.4.21=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wall -Wextra -pedantic -pthread -std=c++11 -MD -MT CMakeFiles/racon.dir/src/window.cpp.o -MF CMakeFiles/racon.dir/src/window.cpp.o.d -o CMakeFiles/racon.dir/src/window.cpp.o -c /build/racon-1.4.21/src/window.cpp
make[3]: *** [CMakeFiles/racon_test.dir/build.make:107: CMakeFiles/racon_test.dir/src/polisher.cpp.o] Error 1

Any idea what might have gone wrong here?
Kind regards, Andreas.

Incompatible with latest spoa release

Hi,
when trying to build the latest release of racon (1.4.6) against the latest release of the spoa library (3.0.1) I get:

/build/racon-1.4.6/src/window.cpp:90:43: error: 'using element_type = class spoa::AlignmentEngine' {aka 'class spoa::AlignmentEngine'} has no member named 'align_sequence_with_graph'
   90 |             alignment = alignment_engine->align_sequence_with_graph(
      |                                           ^~~~~~~~~~~~~~~~~~~~~~~~~
/build/racon-1.4.6/src/window.cpp:96:43: error: 'using element_type = class spoa::AlignmentEngine' {aka 'class spoa::AlignmentEngine'} has no member named 'align_sequence_with_graph'
   96 |             alignment = alignment_engine->align_sequence_with_graph(
      |                                           ^~~~~~~~~~~~~~~~~~~~~~~~~
make[3]: *** [CMakeFiles/racon.dir/build.make:118: CMakeFiles/racon.dir/src/window.cpp.o] Error 1

I guess I might need some code that is not (yet) tagged as release. Any hint how to fix this?
Kind regards, Andreas.

recommendations for read correction

Hi,
I am considering whether to use Racon to error-correct raw ONT reads (>20 kb, QV>=8, about 70 Gb or 15x coverage of my repeat-rich genome) and I wonder if you have any settings to recommend, both in the minimap2 and in the Racon steps. I replaced the headers with a short string of unique numbers (up to 9 characters).
Examples:
minimap2 -t 90 -x ava-ont -o alignment.paf reads.fa reads.fa
should I then use any of the -c -I -N parameters?
racon -f -t 90 reads.fa alignment.paf reads.fa
should I also consider -q and -e for example?
my concerns are the space required for the error correction (I have 15-17 TB on that machine), memory (1 TB max), and parallelization (I have 96 CPUs but does Racon scale linearly up to there or should I run jobs in parallel?)
Thanks!

Why are there two `aligning overlaps` (per split/chunk)?

Hi Robert,

While running racon on my draft assembly using GPUs, I observed that there are two aligning overlaps steps: the first uses the GPU and is relatively quick, and the second uses the CPU and takes relatively longer.

Am I setting parameters in a wrong way, or is this expected?

My biggest contig is about 90M,
NG50 ~ 11M,
LG50 ~70,
out of 3000 ~ 4000 (un-scaffolded) contigs
on a primate genome.

# 10000000000 == 10,000,000,000, aka 10GB
./racon_wrapper -u -t 32 -c 4 --cudaaligner-batches 50 --split 10000000000 ...
[RaconWrapper::run] preparing data with rampler
[RaconWrapper::run] total number of splits: 2
[RaconWrapper::run] processing data with racon
Using 2 GPU(s) to perform polishing
Initialize device 0
Initialize device 1
[CUDAPolisher] Constructed.
[racon::Polisher::initialize] loaded target sequences 10.777237 s
[racon::Polisher::initialize] loaded sequences 2096.867626 s
[racon::Polisher::initialize] loaded overlaps 41.415534 s
[racon::CUDAPolisher::initialize] allocated memory on GPUs for alignment 0.603412 s
[racon::CUDAPolisher::initialize] aligning overlaps [====================] 850.375795 s
[racon::Polisher::initialize] aligning overlaps [====================] 3125.817517 s
[racon::Polisher::initialize] transformed data into windows 57.191808 s
[racon::CUDAPolisher::polish] allocated memory on GPUs for polishing 62.591201 s
[racon::CUDAPolisher::polish] generating consensus [====================] 2028.678508 s
[racon::CUDAPolisher::polish] polished windows on GPU 2246.814400 s
[racon::CUDAPolisher::polish] polished remaining windows on CPU 10.493641 s
[racon::CUDAPolisher::polish] generated consensus 7.312445 s
[racon::Polisher::] total = 8684.385253 s

Thanks,
Steve
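To see how much of the run each step accounts for, the timed lines of such a log can be parsed as sketched below. The regular expression is written for the line format shown in the log above. `stage_timings` is a hypothetical helper, not part of racon, and it only measures the steps without explaining why the second alignment step exists.

```python
import re

# Matches lines such as "[racon::Polisher::initialize] loaded overlaps 41.415534 s",
# with an optional progress bar "[====]" before the time.
TIMING = re.compile(r"^\[([^\]]+)\] (.+?)(?: \[=+\])? (\d+(?:\.\d+)?) s$")


def stage_timings(log_text):
    """Return (stage, message, seconds) for every timed line in a racon log."""
    timings = []
    for line in log_text.splitlines():
        found = TIMING.match(line.strip())
        if found:
            timings.append((found.group(1), found.group(2), float(found.group(3))))
    return timings


log = """\
[racon::CUDAPolisher::initialize] aligning overlaps [====================] 850.375795 s
[racon::Polisher::initialize] aligning overlaps [====================] 3125.817517 s
[racon::Polisher::] total = 8684.385253 s
"""
timings = stage_timings(log)
cpu_alignment = timings[1][2]
total = timings[2][2]
print("CPU alignment share: {:.0%}".format(cpu_alignment / total))  # CPU alignment share: 36%
```

In this log the CPU alignment step takes roughly 36% of the total wall time, compared with about 10% for the GPU alignment step.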

install error - logger cannot be defaulted

Hi Robert,
I have tried to install racon (gcc v.8.3.0 and cmake v.3.15.3), but make failed with the below error message. Any suggestion on how to overcome this?
Thanks,
Lel

In file included from /nfs_netapp2/leory2/src/racon/vendor/logger/src/logger.cpp:9:
/nfs_netapp2/leory2/src/racon/vendor/logger/include/logger/logger.hpp:24: error: ‘logger::Logger::Logger(logger::Logger&&)’ cannot be defaulted
/nfs_netapp2/leory2/src/racon/vendor/logger/include/logger/logger.hpp:25: error: ‘logger::Logger& logger::Logger::operator=(logger::Logger&&)’ cannot be defaulted
/nfs_netapp2/leory2/src/racon/vendor/logger/include/logger/logger.hpp:53: error: ‘steady_clock’ is not a member of ‘std::chrono’
/nfs_netapp2/leory2/src/racon/vendor/logger/include/logger/logger.hpp:53: error: ‘steady_clock’ is not a member of ‘std::chrono’
/nfs_netapp2/leory2/src/racon/vendor/logger/include/logger/logger.hpp:53: error: template argument 1 is invalid
/nfs_netapp2/leory2/src/racon/vendor/logger/include/logger/logger.hpp:53: error: template argument 2 is invalid
/nfs_netapp2/leory2/src/racon/vendor/logger/src/logger.cpp: In member function ‘void logger::Logger::log()’:
/nfs_netapp2/leory2/src/racon/vendor/logger/src/logger.cpp:21: error: ‘std::chrono::steady_clock’ has not been declared
/nfs_netapp2/leory2/src/racon/vendor/logger/src/logger.cpp:21: error: unable to deduce ‘auto’ from ‘’
/nfs_netapp2/leory2/src/racon/vendor/logger/src/logger.cpp:22: error: ‘steady_clock’ is not a member of ‘std::chrono’
/nfs_netapp2/leory2/src/racon/vendor/logger/src/logger.cpp:22: error: ‘steady_clock’ is not a member of ‘std::chrono’
/nfs_netapp2/leory2/src/racon/vendor/logger/src/logger.cpp:22: error: template argument 1 is invalid
/nfs_netapp2/leory2/src/racon/vendor/logger/src/logger.cpp:22: error: template argument 2 is invalid
/nfs_netapp2/leory2/src/racon/vendor/logger/src/logger.cpp: In member function ‘void logger::Logger::log(const std::string&) const’:
/nfs_netapp2/leory2/src/racon/vendor/logger/src/logger.cpp:30: error: ‘std::chrono::steady_clock’ has not been declared
/nfs_netapp2/leory2/src/racon/vendor/logger/src/logger.cpp: In member function ‘void logger::Logger::bar(const std::string&)’:
/nfs_netapp2/leory2/src/racon/vendor/logger/src/logger.cpp:39: error: ‘std::chrono::steady_clock’ has not been declared
/nfs_netapp2/leory2/src/racon/vendor/logger/src/logger.cpp: In member function ‘void logger::Logger::total(const std::string&) const’:
/nfs_netapp2/leory2/src/racon/vendor/logger/src/logger.cpp:52: error: ‘std::chrono::steady_clock’ has not been declared
make[2]: *** [vendor/logger/CMakeFiles/logger.dir/src/logger.cpp.o] Error 1
make[1]: *** [vendor/logger/CMakeFiles/logger.dir/all] Error 2
make: *** [all] Error 2

Error compiling racon

Hi!
I have spent about two days trying to compile racon with GPU support enabled.
GCC = 9.3.0
cmake = 3.18.20200908-g1d74a64
make = 4.3
nvcc = 11.0
cuDNN = 8.0.2
nvidia driver = 450.51.06

CMakeCache.txt was manually edited to include the string CMAKE_CXX_FLAGS:STRING=-DFMT_USE_USER_DEFINED_LITERALS=0.
Compilation aborted with this error:
[ 81%] Linking CXX static library ../../lib/libcudaaligner.a
Reaping winning child 0x55c2e3d8eaf0 PID 14870
Live child 0x55c2e3d8eaf0 (lib/libcudaaligner.a) PID 14872
Reaping winning child 0x55c2e3d8eaf0 PID 14872
Live child 0x55c2e3d8eaf0 (lib/libcudaaligner.a) PID 14874
Reaping winning child 0x55c2e3d8eaf0 PID 14874
Removing child 0x55c2e3d8eaf0 PID 14874 from chain.
Considering target file 'GenomeWorks/cudaaligner/CMakeFiles/cudaaligner.dir/build'.
File 'GenomeWorks/cudaaligner/CMakeFiles/cudaaligner.dir/build' does not exist.
Considering target file 'lib/libcudaaligner.a'.
File 'lib/libcudaaligner.a' was considered already.
Finished prerequisites of target file 'GenomeWorks/cudaaligner/CMakeFiles/cudaaligner.dir/build'.
Must remake target 'GenomeWorks/cudaaligner/CMakeFiles/cudaaligner.dir/build'.
Successfully remade target file 'GenomeWorks/cudaaligner/CMakeFiles/cudaaligner.dir/build'.
Reaping winning child 0x558a0914daf0 PID 14802
Live child 0x558a0914daf0 (GenomeWorks/cudaaligner/CMakeFiles/cudaaligner.dir/all) PID 14879
[ 81%] Built target cudaaligner
Reaping winning child 0x558a0914daf0 PID 14879
Removing child 0x558a0914daf0 PID 14879 from chain.
Reaping losing child 0x560a19aabcb0 PID 14245
make: *** [Makefile:171: all] Error 2
Removing child 0x560a19aabcb0 PID 14245 from chain.

I believe all prerequisites are met, and I have no idea how to configure cmake and make for compiling.
Thank you.

racon_wrapper: Overflow and corrupting overlaps?

Hi Robert,

I am having some interesting issues with both split and subsample in the wrapper. I am using Racon v1.4.21.

1st: When I use subsample, it seems that my genome size could be causing a 32-bit overflow in the subsampler.

2nd: Using split results in "[racon::Overlap::find_breaking_points] error: overlap is not transmuted!" error.

Below is an example with both parameters set:

racon_wrapper -m 8 -x -6 -g -8 -w 500 -t 2 --subsample 4400000000 70 --split 500000000 --cudapoa-batches 50 --cudaaligner-batches 10 raw_readID_integer_whitespaceRemoved.fastq default_intRead_int_reads2asm_SecNo.paf default.pol2.fasta
[RaconWrapper::run] preparing data with rampler
[RaconWrapper::run] total number of splits: 8
[RaconWrapper::run] processing data with racon
[racon::Polisher::initialize] loaded target sequences 1.991097 s
[racon::Polisher::initialize] loaded sequences 12.319651 s
[racon::Polisher::initialize] loaded overlaps 71.611163 s
[racon::Overlap::find_breaking_points] error: overlap is not transmuted!

Fastq is 600 GB
paf ~100 GB
Assembly ~ 3.7 Gbp

We could get Racon running without the wrapper, with the same datasets, same compilation and on the same hardware. (Currently still running... )

Any idea what is happening in the wrapper?

Best regards,
Einar
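The size argument behind the suspected overflow can be checked with a few lines. The sketch below only shows that the reference length passed to --subsample does not fit in an unsigned 32-bit integer and what it would wrap to. Whether the subsampler actually stores the value in 32 bits is the reporter's hypothesis and is not confirmed here. The helper names are illustrative.

```python
UINT32_MAX = 2**32 - 1


def fits_in_uint32(value):
    """True if `value` can be stored in an unsigned 32-bit integer."""
    return 0 <= value <= UINT32_MAX


def wrapped_uint32(value):
    """Value that remains after truncation to 32 bits (modulo 2**32)."""
    return value % 2**32


reference_length = 4400000000  # first --subsample argument in the report
print(fits_in_uint32(reference_length))  # False
print(wrapped_uint32(reference_length))  # 105032704
print(fits_in_uint32(3700000000))        # True: the ~3.7 Gbp assembly size would fit
```

If truncation did occur, the subsampler would see a reference of about 105 Mbp instead of 4.4 Gbp and would keep far fewer reads than requested.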

Issue with using Racon in pipeline-pinfish-analysis

I am attempting to use Racon within the pipeline-pinfish-analysis Snakefile, but the workflow fails at the point of polishing the transcripts. The stdout file suggests that the issue is with Racon.

The full error message is:

polish_clusters: 16:17:30 Polishing cluster be37aa9e-8202-4afd-8c18-ebde5ef7c6fa of size 10
polish_clusters: 16:17:30 Failed running command: racon -t 20 -q -1 /tmp/pinfish_be37aa9e-8202-4afd-8c18-ebde5ef7c6fa_713003653/reads.fq /tmp/pinfish_be37aa9e-8202-4afd-8c18-ebde5ef7c6fa_713003653/alignments.s$
[Fri Dec 6 16:17:31 2019]
Error in rule polish_clusters:
jobid: 7
output: results/polished_transcripts.fas
shell:

/rds-d4/project/cj107/rds-cj107-jiggins-rds/projects/eratoCortexMapping/Nanopore/20191024_HW_Larva/pipeline-pinfish-analysis-fast-v4/pinfish/polish_clusters/polish_clusters -t 20 -a results/cluster_membersh$

    (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Removing output files of failed job polish_clusters since they might be corrupted:
results/polished_transcripts.fas

Other people in my group who have previously used racon within this pipeline are also having the same problem with racon, suggesting that this is a bug with racon rather than my installation. I am using racon v1.4.10

Has anything changed with racon to account for why we are now having issues?

Thank you in advance for your help.

Empty consensus file error

The issue is fixed by @rvaser's suggestion to replace qualities outputted by Chiron with "!"s.

I am using the standard minimap2-miniasm-racon pipeline, but when I use basecalling results from Chiron, racon outputs empty consensus files.

I checked and confirmed that minimap/miniasm successfully generates contigs by overlapping the reads and finds the mapping between the contigs and the raw reads. Below is my pipeline:

minimap2/minimap2 -x ava-ont -k15 -w5  ${out_file}merge_${i}_par.fastq ${out_file}merge_${i}_par.fastq > ${out_file}reads_${i}.paf
miniasm/miniasm  -e2 -n1 -f   ${out_file}merge_${i}_par.fastq ${out_file}reads_${i}.paf > ${out_file}raw_contigs_${i}.gfa
awk '$1 ~/S/ {print ">"$2"\n"$3}' ${out_file}raw_contigs_${i}.gfa > ${out_file}raw_contigs_${i}.fasta
echo "Running minimap with raw_contigs and merge_1_par.fastq"
minimap2/minimap2   ${out_file}raw_contigs_${i}.fasta ${out_file}merge_${i}_par.fastq > ${out_file}mapping_${i}.paf
echo "Racon mapping"
${RACON}   ${out_file}merge_${i}_par.fastq ${out_file}mapping_${i}.paf ${out_file}raw_contigs_${i}.fasta >  ${out_file}consensus_${i}_0.fasta

I verified that the pipeline works for other basecaller outputs, i.e. Guppy and URnano, so I think this is due to the output generated by Chiron. Is there a way to tackle this issue?
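The fix reported at the top of this issue, replacing the Chiron qualities with "!" characters, can be sketched as below. In the Phred+33 encoding "!" corresponds to quality 0. The function `mask_qualities` is a hypothetical helper that assumes plain four-line FASTQ records. It is not code from racon or Chiron.

```python
import io


def mask_qualities(fastq_in, fastq_out):
    """Copy four-line FASTQ records, replacing each quality string with '!' characters."""
    while True:
        header = fastq_in.readline().rstrip("\n")
        if not header:
            break  # end of input
        sequence = fastq_in.readline().rstrip("\n")
        fastq_in.readline()  # '+' separator
        fastq_in.readline()  # original quality string, discarded
        fastq_out.write("{}\n{}\n+\n{}\n".format(header, sequence, "!" * len(sequence)))


masked = io.StringIO()
mask_qualities(io.StringIO("@read1\nACGTA\n+\n#$%&'\n"), masked)
print(masked.getvalue())
```

The masked file keeps read names and sequences unchanged, so it can be passed through the same minimap2 and racon steps as the original.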

error: unable to find subsampled sequences!

Hi,

I've found a small bug in racon_wrapper.py.

When I used the racon_wrapper as below:
racon_wrapper --subsample 3000000000 20 ${NAME}.fq.gz mapped.paf racon_0.fa > racon_1.fa

it caused

[RaconWrapper::run] preparing data with rampler
[RaconWrapper::run] error: unable to find subsampled sequences!

This was because my FASTQ file's extension was "fq.gz" rather than "fastq.gz":
in racon's work_directory, "${NAME}_20x.fq" was generated, not "${NAME}_20x.fastq".

Specifically, this bug was mainly caused by lines 73-77 in racon_wrapper.py.

Best,

Zicong.
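One way to avoid this kind of mismatch is to derive the expected file name from the input's own extension instead of assuming "fastq". The sketch below is a hypothetical helper, not the code in racon_wrapper.py. The `<name>_<coverage>x.<ext>` pattern is taken from the file names quoted in this report.

```python
def subsampled_name(path, coverage):
    """Derive '<name>_<coverage>x.<ext>' from a FASTA/FASTQ path, dropping any '.gz'."""
    name = path[:-3] if path.endswith(".gz") else path
    for extension in (".fastq", ".fq", ".fasta", ".fa"):
        if name.endswith(extension):
            stem = name[: -len(extension)]
            return "{}_{}x{}".format(stem, coverage, extension)
    raise ValueError("unsupported file extension: " + path)


print(subsampled_name("sample.fq.gz", 20))  # sample_20x.fq
print(subsampled_name("reads.fastq", 30))   # reads_30x.fastq
```

Keeping the original extension means "sample.fq.gz" maps to "sample_20x.fq", the file that was actually generated in the report.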

Illegal instruction (core dumped)

Hi,
I installed racon 1.4.10 using bioconda. Although I intend on using it on a cluster, I first tested it on the machine on which I ran the conda install command. Even though I ran it on the same machine I used for installing it, I still got the "Illegal instruction (core dumped)" error.
The compiler on the machine is: gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11)
Will I have to compile racon from source on each of the cluster machines once I get it to work on the current one?
I appreciate your help in getting racon to work on my system.
Thanks,
Ilya.

Trouble installing

Hi,

I'm installing racon, and when I get to the make command I get these errors at the end:

ld: library not found for -ledlib_static
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [bin/racon] Error 1
make[1]: *** [CMakeFiles/racon.dir/all] Error 2
make: *** [all] Error 2

I'm not sure how to solve this, any ideas?

Cheers

What will happen if draft contains stretches of Ns?

What will happen if I polish a draft genome which contains stretches of Ns? As for example generated by Flye when scaffolding contigs. It is adding 100 Ns in between if I remember correctly.

I would add parameter "-u" but I assume this has influence on my question.

Polishing with racon using multiple input files?

Hi,
I have got a reference genome which I want to polish with more than just one sequence file.
Basically I've got a reference genome which was generated using PacBio and then I've got 50 whole genome sequences which were generated using Illumina.
Now I want to polish the PacBio reference genome using the "Illumina data". I managed to polish the reference genome once, with just one Illumina data set. But when I go ahead and try to polish the resulting file with the next Illumina data set I get the following error:
[racon::Window::add_layer] error: layer begin and end positions are invalid!

Is there a way to do what I want or is it simply not possible using racon?

I also thought about combining all of the Illumina sequences into one file, but that doesn't seem sensible, given that I am working on snail genomes, each about 1 Gb in size.

Thanks in advance!
Laura

Correcting fused regions

Hello,

I have an ONT genome assembly made with Ra, where I have good reasons to suspect that some multi-copy regions were fused at the assembly step. Is Racon able to "correct" that, i.e. expand those regions?
If yes, should I use the "correct fragments instead of polishing" parameter? It is not really clear to me what this does.

Thanks a lot

compilation with CUDA fails

Hi there,

I've been trying to compile racon with CUDA support and have by now tried many versions/branches, including the racon-gpu from Nvidia/Clarabricks.

While I could fix some cmake problems and missing dependencies, in the end compilation fails during the make step. The non-CUDA version compiles successfully. I am running Ubuntu 20.04 with an NVIDIA RTX 3090, CUDA 11.3 and Python 3.8, and already have GenomeWorks installed separately.

These are the commands and errors I get (any ideas how to fix it?):

sudo wget https://github.com/lbcb-sci/racon/releases/download/1.4.21/racon-v1.4.21.tar.gz
sudo tar -xvf racon-v1.4.21.tar.gz
cd racon-v1.4.21
sudo mkdir build
cd build
sudo cmake -DCMAKE_BUILD_TYPE=Release -Dracon_enable_cuda=ON ..
sudo make -j 20

[ 87%] Building CXX object GenomeWorks/cudapoa/CMakeFiles/cudapoa.dir/src/cudapoa.cpp.o
[ 87%] Building CXX object GenomeWorks/cudapoa/CMakeFiles/cudapoa.dir/version.cpp.o
In file included from /opt/ont/racon/racon-v1.4.21/vendor/GenomeWorks/3rdparty/spdlog/include/spdlog/fmt/fmt.h:21,
from /opt/ont/racon/racon-v1.4.21/vendor/GenomeWorks/3rdparty/spdlog/include/spdlog/common.h:28,
from /opt/ont/racon/racon-v1.4.21/vendor/GenomeWorks/3rdparty/spdlog/include/spdlog/spdlog.h:12,
from /opt/ont/racon/racon-v1.4.21/vendor/GenomeWorks/common/base/include/claraparabricks/genomeworks/logging/logging.hpp:99,
from /opt/ont/racon/racon-v1.4.21/vendor/GenomeWorks/cudapoa/src/cudapoa.cpp:18:
/opt/ont/racon/racon-v1.4.21/vendor/GenomeWorks/3rdparty/spdlog/include/spdlog/fmt/bundled/format.h:3475:55: error: ISO C++ did not adopt string literal operator templates taking an argument pack of characters [-Werror=pedantic]
3475 | FMT_CONSTEXPR internal::udl_formatter<Char, CHARS...> operator""_format() {
| ^~~~~~~~
cc1plus: all warnings being treated as errors
make[2]: *** [GenomeWorks/cudapoa/CMakeFiles/cudapoa.dir/build.make:77: GenomeWorks/cudapoa/CMakeFiles/cudapoa.dir/src/cudapoa.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:4682: GenomeWorks/cudapoa/CMakeFiles/cudapoa.dir/all] Error 2
make: *** [Makefile:152: all] Error 2

Segmentation fault (core dumped)

Here's the command line I use:
racon -m 8 -x 6 -g -8 -w 500 -t 4 ../all_reads.fastq Sample_ref.sam ~/REFERENCES/reference.fasta > RACON/racon_ref_Sample.fasta

This is the error:

[racon::Polisher::initialize] loaded target sequences 10.810765 s
[racon::Polisher::initialize] loaded sequences 2.468961 s
[racon::Polisher::initialize] loaded overlaps 1.142807 s
[racon::Polisher::initialize] aligning overlaps [====================] 1.910425 s
Segmentation fault (core dumped)

I have 28 GB of free memory, and this is my SAM file size:
618033983 Jan 10 12:48 Sample_ref.sam

What is wrong and what should I do?

install error

In file included from /mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/src/sisd_alignment_engine.hpp:14,
from /mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/src/alignment_engine.cpp:11:
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/include/spoa/alignment_engine.hpp:30: error: expected nested-name-specifier before 'Alignment'
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/include/spoa/alignment_engine.hpp:30: error: 'Alignment' has not been declared
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/include/spoa/alignment_engine.hpp:30: error: expected ';' before '=' token
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/include/spoa/alignment_engine.hpp:30: error: expected unqualified-id before '=' token
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/include/spoa/alignment_engine.hpp:50: error: 'Alignment' does not name a type
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/include/spoa/alignment_engine.hpp:53: error: 'Alignment' does not name a type
In file included from /mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/src/alignment_engine.cpp:11:
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/src/sisd_alignment_engine.hpp:30: error: expected ';' before 'override'
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/src/sisd_alignment_engine.hpp:32: error: 'Alignment' does not name a type
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/src/sisd_alignment_engine.hpp:47: error: 'Alignment' does not name a type
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/src/sisd_alignment_engine.hpp:50: error: 'Alignment' does not name a type
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/src/sisd_alignment_engine.hpp:53: error: 'Alignment' does not name a type
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/src/sisd_alignment_engine.hpp:60: error: expected ';' before 'noexcept'
In file included from /mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/src/alignment_engine.cpp:12:
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/src/simd_alignment_engine.hpp:30: error: expected ';' before 'override'
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/src/simd_alignment_engine.hpp:32: error: 'Alignment' does not name a type
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/src/simd_alignment_engine.hpp:48: error: expected constructor, destructor, or type conversion before 'linear'
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/src/simd_alignment_engine.hpp:52: error: expected constructor, destructor, or type conversion before 'affine'
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/src/simd_alignment_engine.hpp:56: error: expected constructor, destructor, or type conversion before 'convex'
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/src/simd_alignment_engine.hpp:65: error: expected initializer before 'noexcept'
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/src/alignment_engine.cpp: In function 'std::unique_ptr<spoa::AlignmentEngine, std::default_delete<spoa::AlignmentEngine> > spoa::createAlignmentEngine(spoa::AlignmentType, int8_t, int8_t, int8_t, int8_t, int8_t, int8_t)':
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/src/alignment_engine.cpp:37: error: 'invalid_argument' is not a member of 'std'
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/src/alignment_engine.cpp:41: error: 'invalid_argument' is not a member of 'std'
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/src/alignment_engine.cpp:45: error: 'invalid_argument' is not a member of 'std'
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/src/alignment_engine.cpp:63: error: 'nullptr' was not declared in this scope
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/src/alignment_engine.cpp: At global scope:
/mnt/ilustre/users/sanger-dev/home/gaohao/bacgenome_v2/racon/vendor/spoa/src/alignment_engine.cpp:76: error: 'Alignment' does not name a type
make[2]: *** [vendor/spoa/CMakeFiles/spoa.dir/src/alignment_engine.cpp.o] Error 1
make[1]: *** [vendor/spoa/CMakeFiles/spoa.dir/all] Error 2
make: *** [all] Error 2

Can you help me?

Segmentation fault (core dumped)

Hi @rvaser ,

I used racon (v1.4.13) to polish assemblies for 20+ samples, and only 2 of them show the following (different) problems:
(1)
racon -m 8 -x -6 -g -8 -w 500 -t 8 sample1.porechopQC.filtered_reads.fq.gz sample1.sam sample1.flye.assembly.fasta > sample1.racon.fasta
[racon::Polisher::initialize] loaded target sequences 0.225069 s
[racon::Polisher::initialize] loaded sequences 70.247542 s
[racon::Polisher::initialize] loaded overlaps 18.639637 s
[racon::Polisher::initialize] aligning overlaps [====================] 11.493455 s
Segmentation fault (core dumped)
(2)
racon -m 8 -x -6 -g -8 -w 500 -t 8 sample2.porechopQC.filtered_reads.fq.gz sample2.sam sample2.flye.assembly.fasta > sample2.racon.fasta
[racon::Polisher::initialize] loaded target sequences 0.068931 s
[racon::Polisher::initialize] loaded sequences 64.135422 s
[racon::Polisher::initialize] loaded overlaps 16.719191 s
[racon::Overlap::find_breaking_points] error: overlap is not transmuted!

Could you give some advice? Thanks in advance!
Hailong

Decrease in accuracy from racon-v1.4.5 to racon-v1.4.11

Hello,

I am doing long-read + short-read 'hybrid' assemblies where I do a flye assembly followed by two rounds of racon long-read polishing, medaka, and then two rounds of short-read polishing. This is a protocol adapted from the one Ryan Wick uses in the long-read polishing paper he released last year.

What I have found is that there seems to be a large difference in aberrant insertions going from version 1.4.5 to 1.4.11. We are using snippy to check for SNPs and INDELs, and I have found that v1.4.11 has an increase in obvious insertion errors, which can be seen when looking at the long- or short-read pileup with tablet/IGV. I re-built racon from the source code release and got the same results. Additionally, I found that with an older version of racon, v1.3.2, the INDELs/SNPs decreased and matched my results with v1.4.5. Here is an example of results following two short-read polishes of a Kpn genome:

racon-v1.4.11
[Screenshot: Screen Shot 2020-03-10 at 5 32 04 PM]

racon-v1.4.5
[Screenshot: Screen Shot 2020-03-10 at 5 32 49 PM]

The racon-v1.3.2 results mirrored the v1.4.5 results.

I haven't systematically checked this across multiple isolates, but I did notice this with another particular genome today, which is what motivated me to look at the differences in results across these different versions of racon. For the time being we are just going to drop down to v1.4.5 in our pipeline; however, I thought this may be of interest to y'all.

Best,

Will

Turning on `racon_enable_cuda` tricks `make install` to install to the wrong path

Hi,
I want to report an issue that should be trivial to fix.

The background is that I am trying to build a docker image of Racon with CUDA.
So I followed the instructions in the README.

I cloned the repo to /tmp/, and the following runs fine:

cmake \
        -DCMAKE_BUILD_TYPE=Release \
        -Dracon_build_tests=ON \
        -Dracon_build_wrapper=ON \
        .. && \
make && make install && \
racon -h

But if I turn on -Dracon_enable_cuda=ON, racon is not installed to the expected path via make install, i.e.

cmake \
        -DCMAKE_BUILD_TYPE=Release \
        -Dracon_enable_cuda=ON \
        -Dracon_build_tests=ON \
        -Dracon_build_wrapper=ON \
        .. && \
make && make install && \
racon -h

instead make install installs the binaries to

[100%] Linking CXX executable bin/racon
[100%] Built target racon
Install the project...
-- Install configuration: "Release"
-- Installing: /tmp/racon/build/install/bin/racon

as opposed to the expected

[100%] Linking CXX executable bin/rampler
[100%] Built target rampler
[  4%] Built target edlib_static
[ 38%] Built target zlibstatic
[ 48%] Built target spoa
[ 53%] Built target thread_pool
[ 68%] Built target racon
[ 72%] Built target gtest
[ 76%] Built target gtest_main
[ 91%] Built target racon_test
[100%] Built target rampler
Install the project...
-- Install configuration: "Release"
-- Installing: /usr/local/bin/racon
-- Installing: /usr/local/bin/rampler

I hope this is trivial to fix.

Thanks.
Steve

Seems GPU was not used?

I've built racon (as a Docker image) with GPU support and successfully tested the image with the provided racon_test.
For example, when running racon_test, one GPU-specific test outputs:

[ RUN      ] RaconPolishingTest.FragmentCorrectionWithQualitiesFullMhapCUDA
Using 1 GPU(s) to perform polishing
Initialize device 0
[CUDAPolisher] Constructed.
[racon::Polisher::initialize] loaded target sequences 0.041138 s
[racon::Polisher::initialize] loaded sequences 0.039576 s
[racon::Polisher::initialize] loaded overlaps 0.009344 s
[racon::Polisher::initialize] aligning overlaps [====================] 4.996705 s
[racon::Polisher::initialize] transformed data into windows 0.053175 s
[racon::CUDAPolisher::polish] allocated memory on GPUs for polishing 1.388905 s
[racon::CUDAPolisher::polish] polished windows on GPU 1.949515 s======> ] 1.273181 s
[racon::CUDAPolisher::polish] polished remaining windows on CPU 0.007528 s
[racon::CUDAPolisher::polish] generated consensus 0.003166 s
[racon::Polisher::] total = 8.896212 s
[       OK ] RaconPolishingTest.FragmentCorrectionWithQualitiesFullMhapCUDA (8898 ms)

However, when running with actual data, this is what I get, and the GPU doesn't seem to be used?

[racon::Polisher::initialize] loaded target sequences 0.198220 s
[racon::Polisher::initialize] loaded sequences 247.850106 s
[racon::Polisher::initialize] loaded overlaps 5.450893 s
[racon::Polisher::initialize] aligning overlaps [====================] 823.672149 s
[racon::Polisher::initialize] transformed data into windows 14.626365 s
[racon::Polisher::polish] generating consensus [====================] 5420.680712 s
[racon::Polisher::] total = 6514.470406 s

Thanks!
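
Judging only from the two logs quoted in this issue, the GPU run prints lines tagged `CUDAPolisher`, while the CPU-only run prints only `racon::Polisher` lines. A minimal sketch for checking a captured log for those markers could look like this. The marker strings are taken from the logs above and are an assumption about other racon versions; `log_shows_gpu` is a hypothetical helper name.

```python
# Markers copied from the GPU test log quoted above; other racon
# versions may word these lines differently (assumption).
GPU_MARKERS = ("[CUDAPolisher]", "[racon::CUDAPolisher::")


def log_shows_gpu(log_text: str) -> bool:
    """Return True if any line of the log carries a CUDA polisher marker."""
    return any(
        marker in line
        for line in log_text.splitlines()
        for marker in GPU_MARKERS
    )
```

A pipeline could capture racon's log output to a file and stop if `log_shows_gpu` returns False, rather than waiting for a long CPU run to finish.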

GPU option

Hi

It would be great to have a "--GPU" option and have racon fail to run, rather than go on using CPUs, if it cannot detect the GPU(s).

Best regards
Rasmus

Can racon only fill gaps?

Hi, I have a question about whether I can use racon for gap closing.

I have a scaffold that was built from polished contigs ordered along the reference. The scaffold still has around 3000 Ns in it. I would like to align the Nanopore reads to my scaffold again to see whether some contigs can be connected by reads. Can I use racon for another round of polishing, but only for the gap regions?

Also, I have a question about racon polishing when there are Ns in the contigs. Will the Ns simply be trimmed off because of low coverage? Or will they be replaced by the consensus of the reads aligned to that region?

Thanks!

N50 decrease issue

Hi all, I found an interesting result after running Racon on some data. It seems that after running Racon for 3 or more rounds, the N50 decreased dramatically, sometimes to below half.
Do you have any suggestions as to why this happens?

Thank you
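
To compare rounds on equal terms, the N50 can be recomputed directly from each polished FASTA. The sketch below uses the usual definition: sort contig lengths in descending order and report the length at which the running total first reaches half of the total assembly length. The function names are hypothetical helpers, and the sketch does not explain why the N50 drops.

```python
from typing import Iterable, List


def contig_lengths(fasta_lines: Iterable[str]) -> List[int]:
    """Return the length of every record in FASTA-formatted lines."""
    lengths: List[int] = []
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            lengths.append(0)  # start a new record
        elif line and lengths:
            lengths[-1] += len(line)
    return lengths


def n50(lengths: Iterable[int]) -> int:
    """Length of the contig at which the cumulative sum reaches half the total."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty input
```

Printing both the N50 and the number of contigs after every round would show whether contigs are being dropped or shortened between rounds.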

Recommended parameters for CLR vs ONT

Hi,

I'm wondering if you have any recommendations on parameters for

  • CLR-based assemblies vs
  • ONT-based (say R9.4.1) assemblies.

I'm asking because I've observed great improvements from Racon on ONT drafts, but when CLR drafts were polished with the same default parameters, the results weren't much better.

Of course it could be that the parameters used for the overlap generation weren't optimal, but I'd like to see if you already have some recommendations.

Thanks!

Steve

error: unequal lengths in sequence and overlap file for sequence

Hello,

I'm receiving the following error message when trying to use racon with paired-end reads and minimap2-generated sam files with a mixed Illumina/nanopore metagenome:

error: unequal lengths in sequence and overlap file for sequence K00271:557:HCJ2HBBXY:1:1101:6167:998

I've made sure my reads do not have the same identifier using the advice in this thread: isovic/racon#68. I've also tried running racon with just the forward reads of the paired reads, and still receive the same error. Can anyone offer any advice on this?

Thanks!
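
The README states that paired-end reads should have unique names up to the first whitespace. A minimal sketch for verifying that on a FASTQ file is shown below. It assumes plain four-line FASTQ records (no wrapped sequence lines), keeps all names in memory, and uses hypothetical helper names.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator


def fastq_names(lines: Iterable[str]) -> Iterator[str]:
    """Yield read names (up to the first whitespace) from 4-line FASTQ records."""
    for index, line in enumerate(lines):
        if index % 4 != 0:
            continue  # only header lines carry the name
        header = line.strip()
        if not header:
            continue  # tolerate trailing blank lines
        if not header.startswith("@"):
            raise ValueError(f"line {index + 1} is not a FASTQ header: {header!r}")
        parts = header[1:].split()
        yield parts[0] if parts else ""


def duplicated_names(lines: Iterable[str]) -> Dict[str, int]:
    """Return every name that occurs more than once, with its count."""
    counts = Counter(fastq_names(lines))
    return {name: n for name, n in counts.items() if n > 1}
```

An empty result means the names are already unique; any entry it returns is a read name shared by more than one record.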

Metagenome polishing

Hello, I'm interested in polishing a canu assembly of a metagenome. Is it possible to input multiple read files and overlap files, e.g. multiple fastq and .sam files?
Thanks!

Feature request: is it possible to divide the work into stages with appropriate CLI

This is related to #24 .

My hands are tied in the following sense when polishing assemblies of large genomes with deep-coverage data:

  1. I want to make use of the GPU acceleration
  2. Using GPU limits my memory allocation for my VM (cloud vendor restriction)
  3. racon tends to load all sequences into memory for preprocessing, potentially demanding a lot of memory (depending on genome size and coverage)

Hence I am wondering if it is possible for racon to expose CLI parameters that permit jobs to be run in stages.
This way, users can then configure VMs of different specifications for different stages and resume work.

I know this might be a big request, but it would make our lives easier.

Thanks!

Steve

docker support

Hi,
Can you provide the docker image for Racon?

Best
Neng Huang

Problem while installing racon on arm platform

I tried to install racon on an ARM platform, but got some errors:
make VERBOSE=1
/usr/local/src/cmake-3.6.2/bin/cmake -H/project/software/racon/racon-v1.4.13 -B/project/software/racon/racon-v1.4.13/build --check-build-system CMakeFiles/Makefile.cmake 0
/usr/local/src/cmake-3.6.2/bin/cmake -E cmake_progress_start /project/software/racon/racon-v1.4.13/build/CMakeFiles /project/software/racon/racon-v1.4.13/build/CMakeFiles/progress.marks
make -f CMakeFiles/Makefile2 all
make[1]: Entering directory `/project/software/racon/racon-v1.4.13/build'
make -f vendor/edlib/CMakeFiles/edlib_static.dir/build.make vendor/edlib/CMakeFiles/edlib_static.dir/depend
make[2]: Entering directory `/project/software/racon/racon-v1.4.13/build'
cd /project/software/racon/racon-v1.4.13/build && /usr/local/src/cmake-3.6.2/bin/cmake -E cmake_depends "Unix Makefiles" /project/software/racon/racon-v1.4.13 /project/software/racon/racon-v1.4.13/vendor/edlib /project/software/racon/racon-v1.4.13/build /project/software/racon/racon-v1.4.13/build/vendor/edlib /project/software/racon/racon-v1.4.13/build/vendor/edlib/CMakeFiles/edlib_static.dir/DependInfo.cmake --color=
make[2]: Leaving directory `/project/software/racon/racon-v1.4.13/build'
make -f vendor/edlib/CMakeFiles/edlib_static.dir/build.make vendor/edlib/CMakeFiles/edlib_static.dir/build
make[2]: Entering directory `/project/software/racon/racon-v1.4.13/build'
make[2]: Nothing to be done for `vendor/edlib/CMakeFiles/edlib_static.dir/build'.
make[2]: Leaving directory `/project/software/racon/racon-v1.4.13/build'
[ 6%] Built target edlib_static
make -f vendor/bioparser/vendor/zlib/CMakeFiles/zlibstatic.dir/build.make vendor/bioparser/vendor/zlib/CMakeFiles/zlibstatic.dir/depend
make[2]: Entering directory `/project/software/racon/racon-v1.4.13/build'
cd /project/software/racon/racon-v1.4.13/build && /usr/local/src/cmake-3.6.2/bin/cmake -E cmake_depends "Unix Makefiles" /project/software/racon/racon-v1.4.13 /project/software/racon/racon-v1.4.13/vendor/bioparser/vendor/zlib /project/software/racon/racon-v1.4.13/build /project/software/racon/racon-v1.4.13/build/vendor/bioparser/vendor/zlib /project/software/racon/racon-v1.4.13/build/vendor/bioparser/vendor/zlib/CMakeFiles/zlibstatic.dir/DependInfo.cmake --color=
make[2]: Leaving directory `/project/software/racon/racon-v1.4.13/build'
make -f vendor/bioparser/vendor/zlib/CMakeFiles/zlibstatic.dir/build.make vendor/bioparser/vendor/zlib/CMakeFiles/zlibstatic.dir/build
make[2]: Entering directory `/project/software/racon/racon-v1.4.13/build'
make[2]: Nothing to be done for `vendor/bioparser/vendor/zlib/CMakeFiles/zlibstatic.dir/build'.
make[2]: Leaving directory `/project/software/racon/racon-v1.4.13/build'
[ 56%] Built target zlibstatic
make -f vendor/spoa/CMakeFiles/spoa.dir/build.make vendor/spoa/CMakeFiles/spoa.dir/depend
make[2]: Entering directory `/project/software/racon/racon-v1.4.13/build'
cd /project/software/racon/racon-v1.4.13/build && /usr/local/src/cmake-3.6.2/bin/cmake -E cmake_depends "Unix Makefiles" /project/software/racon/racon-v1.4.13 /project/software/racon/racon-v1.4.13/vendor/spoa /project/software/racon/racon-v1.4.13/build /project/software/racon/racon-v1.4.13/build/vendor/spoa /project/software/racon/racon-v1.4.13/build/vendor/spoa/CMakeFiles/spoa.dir/DependInfo.cmake --color=
make[2]: Leaving directory `/project/software/racon/racon-v1.4.13/build'
make -f vendor/spoa/CMakeFiles/spoa.dir/build.make vendor/spoa/CMakeFiles/spoa.dir/build
make[2]: Entering directory `/project/software/racon/racon-v1.4.13/build'
[ 59%] Building CXX object vendor/spoa/CMakeFiles/spoa.dir/src/alignment_engine.cpp.o
cd /project/software/racon/racon-v1.4.13/build/vendor/spoa && /usr/bin/c++ -I/project/software/racon/racon-v1.4.13/src -I/project/software/racon/racon-v1.4.13/vendor/spoa/include -Wall -Wextra -pedantic -Wall -Wextra -pedantic -march=native -O3 -DNDEBUG -std=c++11 -o CMakeFiles/spoa.dir/src/alignment_engine.cpp.o -c /project/software/racon/racon-v1.4.13/vendor/spoa/src/alignment_engine.cpp
/project/software/racon/racon-v1.4.13/vendor/spoa/src/alignment_engine.cpp:1:0: error: unknown value ‘native’ for -march

My gcc version and cmake info are:
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39)
Copyright (C) 2015 Free Software Foundation, Inc.
cmake version 3.6.2

Best regards

Racon fails to build on 1.5.0

Hi Robert,

I saw you just released a new racon and I tried to build but am now running into issues. I'm pasting a reproducible Dockerfile below. This Dockerfile worked on commit b591b12 and substituting ubuntu 20.04 for ubuntu 18.04 below. Any ideas? Thanks!!

Sam

Dockerfile:

FROM nvidia/cuda:11.4.2-devel-ubuntu20.04 AS builder

# Solve cmake asking for timezone: https://dev.to/setevoy/docker-configure-tzdata-and-timezone-during-build-20bk
ENV TZ=America/Vancouver
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone

ARG RACON_GIT_HASH=a2cfcac281d312a73912a97d6d960404f516c389
# Checkout git version: https://stackoverflow.com/q/3555107
RUN apt-get update && \
    apt-get install -y git cmake zlib1g-dev && \
    git clone --recursive https://github.com/lbcb-sci/racon.git racon && \
    cd racon && \
    git reset --hard ${RACON_GIT_HASH} && \
    git submodule update && \
    mkdir build && \
    cd build && \
    cmake -DCMAKE_BUILD_TYPE=Release -Dracon_enable_cuda=ON .. && \
    make && \
    apt-get remove -y git cmake && \
    rm -rf /var/lib/apt/lists/*

Log:

Scanning dependencies of target gwbase
[  1%] Building CXX object _deps/genomeworks-build/common/base/CMakeFiles/gwbase.dir/src/cudautils.cpp.o
In file included from /racon/build/_deps/genomeworks-src/3rdparty/spdlog/include/spdlog/fmt/fmt.h:21,
                 from /racon/build/_deps/genomeworks-src/3rdparty/spdlog/include/spdlog/common.h:28,
                 from /racon/build/_deps/genomeworks-src/3rdparty/spdlog/include/spdlog/spdlog.h:12,
                 from /racon/build/_deps/genomeworks-src/common/base/include/claraparabricks/genomeworks/logging/logging.hpp:99,
                 from /racon/build/_deps/genomeworks-src/common/base/include/claraparabricks/genomeworks/utils/cudautils.hpp:22,
                 from /racon/build/_deps/genomeworks-src/common/base/src/cudautils.cpp:17:
/racon/build/_deps/genomeworks-src/3rdparty/spdlog/include/spdlog/fmt/bundled/format.h:3475:55: warning: ISO C++ did not adopt string literal operator templates taking an argument pack of characters [-Wpedantic]
 3475 | FMT_CONSTEXPR internal::udl_formatter<Char, CHARS...> operator""_format() {
      |                                                       ^~~~~~~~
[  3%] Building CXX object _deps/genomeworks-build/common/base/CMakeFiles/gwbase.dir/src/logging.cpp.o
In file included from /racon/build/_deps/genomeworks-src/3rdparty/spdlog/include/spdlog/fmt/fmt.h:21,
                 from /racon/build/_deps/genomeworks-src/3rdparty/spdlog/include/spdlog/common.h:28,
                 from /racon/build/_deps/genomeworks-src/3rdparty/spdlog/include/spdlog/spdlog.h:12,
                 from /racon/build/_deps/genomeworks-src/common/base/include/claraparabricks/genomeworks/logging/logging.hpp:99,
                 from /racon/build/_deps/genomeworks-src/common/base/src/logging.cpp:17:
/racon/build/_deps/genomeworks-src/3rdparty/spdlog/include/spdlog/fmt/bundled/format.h:3475:55: warning: ISO C++ did not adopt string literal operator templates taking an argument pack of characters [-Wpedantic]
 3475 | FMT_CONSTEXPR internal::udl_formatter<Char, CHARS...> operator""_format() {
      |                                                       ^~~~~~~~
[  5%] Building CXX object _deps/genomeworks-build/common/base/CMakeFiles/gwbase.dir/src/graph.cpp.o
[  7%] Linking CXX static library ../../../../lib/libgwbase.a
[  7%] Built target gwbase
[  9%] Building NVCC (Device) object _deps/genomeworks-build/cudaaligner/CMakeFiles/cudaaligner.dir/src/cudaaligner_generated_hirschberg_myers_gpu.cu.o
[ 11%] Building NVCC (Device) object _deps/genomeworks-build/cudaaligner/CMakeFiles/cudaaligner.dir/src/cudaaligner_generated_ukkonen_gpu.cu.o
[ 13%] Building NVCC (Device) object _deps/genomeworks-build/cudaaligner/CMakeFiles/cudaaligner.dir/src/cudaaligner_generated_myers_gpu.cu.o
Scanning dependencies of target cudaaligner
[ 15%] Building CXX object _deps/genomeworks-build/cudaaligner/CMakeFiles/cudaaligner.dir/src/cudaaligner.cpp.o
In file included from /racon/build/_deps/genomeworks-src/3rdparty/spdlog/include/spdlog/fmt/fmt.h:21,
                 from /racon/build/_deps/genomeworks-src/3rdparty/spdlog/include/spdlog/common.h:28,
                 from /racon/build/_deps/genomeworks-src/3rdparty/spdlog/include/spdlog/spdlog.h:12,
                 from /racon/build/_deps/genomeworks-src/common/base/include/claraparabricks/genomeworks/logging/logging.hpp:99,
                 from /racon/build/_deps/genomeworks-src/cudaaligner/src/cudaaligner.cpp:18:
/racon/build/_deps/genomeworks-src/3rdparty/spdlog/include/spdlog/fmt/bundled/format.h:3475:55: error: ISO C++ did not adopt string literal operator templates taking an argument pack of characters [-Werror=pedantic]
 3475 | FMT_CONSTEXPR internal::udl_formatter<Char, CHARS...> operator""_format() {
      |                                                       ^~~~~~~~
cc1plus: all warnings being treated as errors
make[2]: *** [_deps/genomeworks-build/cudaaligner/CMakeFiles/cudaaligner.dir/build.make:84: _deps/genomeworks-build/cudaaligner/CMakeFiles/cudaaligner.dir/src/cudaaligner.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:1976: _deps/genomeworks-build/cudaaligner/CMakeFiles/cudaaligner.dir/all] Error 2
make: *** [Makefile:152: all] Error 2
The command '/bin/sh -c apt-get update &&     apt-get install -y git cmake zlib1g-dev &&     git clone --recursive https://github.com/lbcb-sci/racon.git racon &&     cd racon &&     git reset --hard ${RACON_GIT_HASH} &&     git submodule update &&     mkdir build &&     cd build &&     cmake -DCMAKE_BUILD_TYPE=Release -Dracon_enable_cuda=ON .. &&     make &&     apt-get remove -y git cmake &&     rm -rf /var/lib/apt/lists/*' returned a non-zero code: 2

racon_wrapper does not take arguments for GPU mode

I ran into a dilemma: the VM I rent enforces an upper limit on CPU and memory when I select a GPU.

But the data set I'm using needs more memory than the upper limit.

So I tried out the racon_wrapper to split the reads. Unfortunately, I got the following errors:

racon_wrapper: error: unrecognized arguments: -c --cudaaligner-batches

warning: contig XXX might be chimeric in window XX!

Hello, I am new to using Racon for polishing. In my log file, I found some warnings like this:
[racon::Window::generate_consensus] warning: contig 670 might be chimeric in window 48!

I wonder why it was generated and what it means. Thank you!
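
To see how widespread these warnings are, the log can be tallied per contig. The sketch below only counts them and does not explain their cause. The regular expression is built from the single warning line quoted above, so the exact wording in other racon versions is an assumption, and `chimeric_windows` is a hypothetical helper name.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Pattern derived from the one warning line quoted in this issue (assumption).
WARNING_RE = re.compile(
    r"\[racon::Window::generate_consensus\] warning: "
    r"contig (\d+) might be chimeric in window (\d+)!"
)


def chimeric_windows(log_text: str) -> Dict[int, List[int]]:
    """Map each contig number to the list of windows flagged in the log."""
    flagged: Dict[int, List[int]] = defaultdict(list)
    for match in WARNING_RE.finditer(log_text):
        flagged[int(match.group(1))].append(int(match.group(2)))
    return dict(flagged)
```

Contigs with many flagged windows might be worth inspecting first in a read pileup.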

segmentation fault

Hi,

I ran into a segmentation fault when running racon with the following parameters:

 tools/racon/build/bin/racon -t 20 -m 8 -x -6 -g -8 -w 500 -u $FASTQ reads_2_assembly.sam.gz assembly.fasta > racon_polish.fa 
[racon::Polisher::initialize] loaded target sequences 2.781627 s
[racon::Polisher::initialize] loaded sequences 426.328405 s
Segmentation fault

There should be plenty of memory on the machine.

                total        used        free      shared  buff/cache   available 
Mem:           3.0T        118G        2.8T        4.1G        9.6G        2.8T  

I am using gzipped SAM as the input format and racon v1.4.7.
Any help would be appreciated.

Thanks

Unequal lengths in sequence

I am trying to polish ONT contigs with Illumina reads, but I am getting an error when using the Illumina paired-end reads.

bwa mem -t 40 -x ont2d ./ont_assembly.fasta /home/Ill_paired.fastq >ont_mapping.sam

racon -m 8 -x -6 -g -8 -w 500 -t 14 /home/Ill_paired.fastq ./ont_mapping.sam ./ont_assembly.fasta > ./ont_polish_racon.fasta

[racon::Overlap::transmute] error: unequal lengths in sequence and overlap file for sequence ST-E00181:968:HGM7HCCX2:1:1101:20496:2610!

I am looking forward to hearing from you.
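
One way to narrow this down is to compare, for every SAM record, the read length implied by its CIGAR string with the length of the read of the same name in the reads file. This is a diagnostic sketch only. It assumes the error means those two lengths disagree, which is an interpretation of the message and not something confirmed from racon's source. Per the SAM specification, the operations M, I, S, = and X consume read bases, and H accounts for hard-clipped bases of the original read. The helper names are hypothetical.

```python
import re
from typing import Dict, Iterable, List

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that account for bases of the original read (H = hard-clipped).
READ_OPS = set("MIS=XH")


def read_length_from_cigar(cigar: str) -> int:
    """Original read length implied by a CIGAR string, clips included."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in READ_OPS)


def mismatched_reads(sam_lines: Iterable[str], read_lengths: Dict[str, int]) -> List[str]:
    """Names of SAM records whose CIGAR length differs from read_lengths[name]."""
    bad: List[str] = []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6 or fields[5] == "*":
            continue  # malformed or unmapped record
        name, cigar = fields[0], fields[5]
        if name in read_lengths and read_length_from_cigar(cigar) != read_lengths[name]:
            bad.append(name)
    return bad
```

If both mates of a pair share one name but differ in length, that name would be expected to show up in the returned list.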
