meryl's People

Contributors

arangrhie, brianwalenz, mphschmitt, skoren, snurk

meryl's Issues

meryl count assertion failed with large K

I tried to run with a relatively large k (101) and got this assertion:

meryl count k=101 compress hifi.fasta.gz output 101.hpc.meryl
<...>
Start counting with THREADED method.                                                                                                                                                            
meryl: meryl/merylCountArray.C:512: uint64 merylCountArray::add(kmdata): Assertion `wordEnd <= 192' failed.                                                                                     

k=71 seems to work

It would be nice to have the maximum allowed k stated somewhere in the usage message or docs.

use meryl

Dear author

Does meryl have parameters that can retrieve the reads in which a given k-mer is located?

meryl terminates with 'std::bad_alloc' and failed assertion

Hi! I am encountering an error when running meryl with 20 CPU threads and 150 GB of memory. It seems to fail on the assertion _blockPosition <= merylutil::ftell(F), and before that assertion failure it throws a 'std::bad_alloc' exception. That usually means there isn't enough memory to allocate, but I requested 150 GB in my job submission script, which I believe should be sufficient. Here is the relevant information:

terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
meryl: utility/src/kmers-v1/kmers-files.H:137: void merylutil::kmers::v1::merylFileIndex::set(merylutil::kmers::v1::kmpref, FILE*, uint64): Assertion `_blockPosition <= merylutil::ftell(F)' failed.
meryl: utility/src/kmers-v1/kmers-files.H:137: void merylutil::kmers::v1::merylFileIndex::set(merylutil::kmers::v1::kmpref, FILE*, uint64): Assertion `_blockPosition <= merylutil::ftell(F)' failed.

Failed with 'Aborted'; backtrace (libbacktrace):
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()

hap_blob.sh

When I run merqury, a problem arises in the second step, hap_blob.sh.
This is the command I ran:
/home/duhuipeng/DHP/merqury-1.1/merqury.sh /home/duhuipeng/DHP/DRC/DRC.SON.k21.meryl/ ../../DRC.father.k21.meryl/ ../../DRC.mother.k21.meryl/ ../father_purged.fasta ../purged_mo.fasta test-2
The errors in hap_blob.sh are as follows:
[five images]

At first I thought the core dump was caused by running out of memory, but the run still failed when I used 3 TB of memory. How can I solve this problem?
I want to plot its phase separation, but because this step fails, the plot is not drawn. I hope to get your advice.

Looking forward to your reply!
Best

k-mer size support >32?

Hi Brian,

Thanks for the rewrite! Is there a way to run meryl with a k-mer size larger than 32? I ask because running meryl count and meryl print on the following example

>test
ATGAAAATCAAAACTCGCTTCGCGCCAAGCCCAACAGGCTATCTGCA

with k=32 results in:

AAAACTCGCTTCGCGCCAAGCCCAACAGGCTA	1
AAAATCAAAACTCGCTTCGCGCCAAGCCCAAC	1
AAACTCGCTTCGCGCCAAGCCCAACAGGCTAT	1
AAATCAAAACTCGCTTCGCGCCAAGCCCAACA	1
AACTCGCTTCGCGCCAAGCCCAACAGGCTATC	1
AATCAAAACTCGCTTCGCGCCAAGCCCAACAG	1
ACTCGCTTCGCGCCAAGCCCAACAGGCTATCT	1
ATCAAAACTCGCTTCGCGCCAAGCCCAACAGG	1
ATGAAAATCAAAACTCGCTTCGCGCCAAGCCC	1
AGCCTGTTGGGCTTGGCGCGAAGCGAGTTTTG	1
CAGATAGCCTGTTGGGCTTGGCGCGAAGCGAG	1
CGCTTCGCGCCAAGCCCAACAGGCTATCTGCA	1
TCAAAACTCGCTTCGCGCCAAGCCCAACAGGC	1
TCGCTTCGCGCCAAGCCCAACAGGCTATCTGC	1
TTGGGCTTGGCGCGAAGCGAGTTTTGATTTTC	1
TGAAAATCAAAACTCGCTTCGCGCCAAGCCCA	1

but with k=33 results in:

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA	8
CAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAC	7

which does not seem right.

The full commands are here

meryl count  k=32 test.fna output test.k32.out.merylCount &> test.k32.out.merylCount.std
meryl count  k=33 test.fna output test.k33.out.merylCount &> test.k33.out.merylCount.std

meryl -VV print test.k32.out.merylCount > test.k32.out.histogram 2> test.k32.out.histogram.std
meryl -VV print test.k33.out.merylCount > test.k33.out.histogram 2> test.k33.out.histogram.std

I have attached a tarball with the stderr output etc.: kmertest.tar.gz

Thanks!

Eric
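
The k=32 listing above is consistent with counting canonical k-mers: the 47-base test sequence has 47 - 32 + 1 = 16 windows, and each window is stored either as itself or as its reverse complement. A minimal C++ sketch of that idea follows. It is not meryl's implementation; the function names are hypothetical, and it picks the lexicographically smaller string as the canonical form, whereas the listing suggests meryl orders bases differently (T appears to sort before G), so individual strings may differ even though the number of distinct k-mers matches.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Reverse-complement a DNA string (A<->T, C<->G); other characters are kept as-is.
std::string reverseComplement(const std::string &s) {
  std::string rc(s.rbegin(), s.rend());
  for (char &c : rc) {
    switch (c) {
      case 'A': c = 'T'; break;
      case 'C': c = 'G'; break;
      case 'G': c = 'C'; break;
      case 'T': c = 'A'; break;
      default: break;
    }
  }
  return rc;
}

// Count canonical k-mers. ASSUMPTION: "canonical" here means the
// lexicographically smaller of a k-mer and its reverse complement.
std::map<std::string, unsigned> countCanonicalKmers(const std::string &seq, std::size_t k) {
  assert(k > 0);
  std::map<std::string, unsigned> counts;
  for (std::size_t i = 0; i + k <= seq.size(); ++i) {
    const std::string fwd = seq.substr(i, k);
    const std::string rev = reverseComplement(fwd);
    ++counts[std::min(fwd, rev)];
  }
  return counts;
}
```

Calling countCanonicalKmers on the test sequence with k=32 visits 16 windows, matching the number of lines in the k=32 output.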

Compile error

Hi,

I have the same error as in #37. I tried to fix it by compiling from source, but I get the error below when compiling.

The gcc version is 8.3.0, git version is 2.34.1.

Fetching submodule 'utility'
 - Submodule 'src/utility' (https://github.com/marbl/meryl-utility) registered for path 'utility'
 - Cloning into '/home/hcaoad/Software/meryl/src/utility'...
 - Submodule path 'utility': checked out '159a2d48eca5f208ed4473cc0139a5242f6ebbe3'

Building snapshot v1.4-development +61 changes (r1001 a2e291954d452f3e1b2772cf35a902181b32b4b4) (sync'd with github)
  with utility v1.0-244-g159a2d4  159a2d48eca5f208ed4473cc0139a5242f6ebbe3
For 'Linux' '3.10.0-1062.el7.x86_64)' as 'amd64' into '/home/hcaoad/Software/meryl/build/{bin,obj}'.
Using GNU '/opt/ohpc/pub/compiler/gcc/8.3.0/bin/g++' version '8.3.0'.

g++ -o /home/hcaoad/Software/meryl/build/obj/lib/libmeryl.a/utility/src/align/align-ksw2-driver.o -c -MD -g3 -O4 -funroll-loops -fexpensive-optimizations -finline-functions -fomit-frame-pointer -DLIBBACKTRACE -mxsave -Wall -Wextra -Wformat -Wno-char-subscripts -Wno-sign-compare -Wno-unused-function -Wno-unused-parameter -Wno-unused-variable -Wno-deprecated-declarations -Wno-format-truncation -std=c++2a -pthread -fopenmp -fPIC -iquote/home/hcaoad/Software/meryl/src -iquotemeryl -iquoteutility/src utility/src/align/align-ksw2-driver.C
In file included from utility/src/system.H:26,
                 from utility/src/align/align-ksw2-driver.H:20,
                 from utility/src/align/align-ksw2-driver.C:20:
utility/src/system/time-v1.H:29:22: error: namespace name required before ‘inline’
 namespace merylutil::inline system::inline v1 {
                      ^~~~~~
utility/src/system/time-v1.H:29:22: error: expected ‘{’ before ‘inline’
utility/src/system/time-v1.H:29:29: error: ‘system’ does not name a type
 namespace merylutil::inline system::inline v1 {
                             ^~~~~~
In file included from utility/src/strings.H:23,
                 from utility/src/system/cpuIdent-v1.H:24,
                 from utility/src/system.H:28,
                 from utility/src/align/align-ksw2-driver.H:20,
                 from utility/src/align/align-ksw2-driver.C:20:
utility/src/datastructures/strings-v1.H:28:22: error: namespace name required before ‘inline’
 namespace merylutil::inline strings::inline v1 {
                      ^~~~~~
utility/src/datastructures/strings-v1.H:28:22: error: expected ‘{’ before ‘inline’
utility/src/datastructures/strings-v1.H:28:29: error: ‘strings’ does not name a type; did you mean ‘sprintf’?
 namespace merylutil::inline strings::inline v1 {
                             ^~~~~~~
                             sprintf
In file included from utility/src/strings.H:25,
                 from utility/src/system/cpuIdent-v1.H:24,
                 from utility/src/system.H:28,
                 from utility/src/align/align-ksw2-driver.H:20,
                 from utility/src/align/align-ksw2-driver.C:20:
utility/src/datastructures/keyAndValue-v1.H:25:22: error: namespace name required before ‘inline’
 namespace merylutil::inline strings::inline v1 {
                      ^~~~~~
utility/src/datastructures/keyAndValue-v1.H:25:22: error: expected ‘{’ before ‘inline’
utility/src/datastructures/keyAndValue-v1.H:25:29: error: ‘strings’ does not name a type; did you mean ‘sprintf’?
 namespace merylutil::inline strings::inline v1 {
                             ^~~~~~~
                             sprintf
In file included from utility/src/strings.H:26,
                 from utility/src/system/cpuIdent-v1.H:24,
                 from utility/src/system.H:28,
                 from utility/src/align/align-ksw2-driver.H:20,
                 from utility/src/align/align-ksw2-driver.C:20:
utility/src/datastructures/splitToWords-v1.H:28:22: error: namespace name required before ‘inline’
 namespace merylutil::inline strings::inline v1 {
                      ^~~~~~
utility/src/datastructures/splitToWords-v1.H:28:22: error: expected ‘{’ before ‘inline’
utility/src/datastructures/splitToWords-v1.H:28:29: error: ‘strings’ does not name a type; did you mean ‘sprintf’?
 namespace merylutil::inline strings::inline v1 {
                             ^~~~~~~
                             sprintf
In file included from utility/src/strings.H:27,
                 from utility/src/system/cpuIdent-v1.H:24,
                 from utility/src/system.H:28,
                 from utility/src/align/align-ksw2-driver.H:20,
                 from utility/src/align/align-ksw2-driver.C:20:
utility/src/datastructures/stringList-v1.H:35:22: error: namespace name required before ‘inline’
 namespace merylutil::inline strings::inline v1 {
                      ^~~~~~
utility/src/datastructures/stringList-v1.H:35:22: error: expected ‘{’ before ‘inline’
utility/src/datastructures/stringList-v1.H:35:29: error: ‘strings’ does not name a type; did you mean ‘sprintf’?
 namespace merylutil::inline strings::inline v1 {
                             ^~~~~~~
                             sprintf
In file included from utility/src/system.H:28,
                 from utility/src/align/align-ksw2-driver.H:20,
                 from utility/src/align/align-ksw2-driver.C:20:
utility/src/system/cpuIdent-v1.H:26:22: error: namespace name required before ‘inline’
 namespace merylutil::inline system::inline v1 {
                      ^~~~~~
utility/src/system/cpuIdent-v1.H:26:22: error: expected ‘{’ before ‘inline’
utility/src/system/cpuIdent-v1.H:26:29: error: ‘system’ does not name a type
 namespace merylutil::inline system::inline v1 {
                             ^~~~~~
In file included from utility/src/files.H:28,
                 from utility/src/system/logging-v1.H:24,
                 from utility/src/system.H:30,
                 from utility/src/align/align-ksw2-driver.H:20,
                 from utility/src/align/align-ksw2-driver.C:20:
utility/src/files/accessing-v1.H:34:22: error: namespace name required before ‘inline’
 namespace merylutil::inline files::inline v1 {
                      ^~~~~~
utility/src/files/accessing-v1.H:34:22: error: expected ‘{’ before ‘inline’
utility/src/files/accessing-v1.H:34:29: error: ‘files’ does not name a type; did you mean ‘fileno’?
 namespace merylutil::inline files::inline v1 {
                             ^~~~~
                             fileno
In file included from utility/src/files.H:29,
                 from utility/src/system/logging-v1.H:24,
                 from utility/src/system.H:30,
                 from utility/src/align/align-ksw2-driver.H:20,
                 from utility/src/align/align-ksw2-driver.C:20:
utility/src/files/reading-v1.H:45:22: error: namespace name required before ‘inline’
 namespace merylutil::inline files::v0 {
                      ^~~~~~
utility/src/files/reading-v1.H:45:22: error: expected ‘{’ before ‘inline’
utility/src/files/reading-v1.H:45:29: error: ‘files’ does not name a type; did you mean ‘fileno’?
 namespace merylutil::inline files::v0 {
                             ^~~~~
                             fileno
utility/src/files/reading-v1.H:50:22: error: namespace name required before ‘inline’
 namespace merylutil::inline files::inline v1 {
                      ^~~~~~
utility/src/files/reading-v1.H:50:22: error: expected ‘{’ before ‘inline’
utility/src/files/reading-v1.H:50:29: error: ‘files’ does not name a type; did you mean ‘fileno’?
 namespace merylutil::inline files::inline v1 {
                             ^~~~~
                             fileno
utility/src/files/reading-v1.H:78:22: error: namespace name required before ‘inline’
 namespace merylutil::inline files::inline v1 {
                      ^~~~~~
utility/src/files/reading-v1.H:78:22: error: expected ‘{’ before ‘inline’
utility/src/files/reading-v1.H:78:29: error: ‘files’ does not name a type; did you mean ‘fileno’?
 namespace merylutil::inline files::inline v1 {
                             ^~~~~
                             fileno
utility/src/files/reading-v1.H:116:22: error: namespace name required before ‘inline’
 namespace merylutil::inline files::inline v1 {
                      ^~~~~~
utility/src/files/reading-v1.H:116:22: error: expected ‘{’ before ‘inline’
utility/src/files/reading-v1.H:116:29: error: ‘files’ does not name a type; did you mean ‘fileno’?
 namespace merylutil::inline files::inline v1 {
                             ^~~~~
                             fileno
In file included from utility/src/files.H:30,
                 from utility/src/system/logging-v1.H:24,
                 from utility/src/system.H:30,
                 from utility/src/align/align-ksw2-driver.H:20,
                 from utility/src/align/align-ksw2-driver.C:20:
utility/src/files/writing-v1.H:48:22: error: namespace name required before ‘inline’
 namespace merylutil::inline files::inline v1 {
                      ^~~~~~
utility/src/files/writing-v1.H:48:22: error: expected ‘{’ before ‘inline’
utility/src/files/writing-v1.H:48:29: error: ‘files’ does not name a type; did you mean ‘fileno’?
 namespace merylutil::inline files::inline v1 {
                             ^~~~~
                             fileno
utility/src/files/writing-v1.H:85:22: error: namespace name required before ‘inline’
 namespace merylutil::inline files::inline v1 {
                      ^~~~~~
utility/src/files/writing-v1.H:85:22: error: expected ‘{’ before ‘inline’
utility/src/files/writing-v1.H:85:29: error: ‘files’ does not name a type; did you mean ‘fileno’?
 namespace merylutil::inline files::inline v1 {
                             ^~~~~
                             fileno
In file included from utility/src/files.H:32,
                 from utility/src/system/logging-v1.H:24,
                 from utility/src/system.H:30,
                 from utility/src/align/align-ksw2-driver.H:20,
                 from utility/src/align/align-ksw2-driver.C:20:
utility/src/files/buffered-v1-reading.H:29:22: error: namespace name required before ‘inline’
 namespace merylutil::inline files::inline v1 {
                      ^~~~~~
utility/src/files/buffered-v1-reading.H:29:22: error: expected ‘{’ before ‘inline’
utility/src/files/buffered-v1-reading.H:29:29: error: ‘files’ does not name a type; did you mean ‘fileno’?
 namespace merylutil::inline files::inline v1 {
                             ^~~~~
                             fileno
In file included from utility/src/files.H:33,
                 from utility/src/system/logging-v1.H:24,
                 from utility/src/system.H:30,
                 from utility/src/align/align-ksw2-driver.H:20,
                 from utility/src/align/align-ksw2-driver.C:20:
utility/src/files/buffered-v1-writing.H:25:22: error: namespace name required before ‘inline’
 namespace merylutil::inline files::inline v1 {
                      ^~~~~~
utility/src/files/buffered-v1-writing.H:25:22: error: expected ‘{’ before ‘inline’
utility/src/files/buffered-v1-writing.H:25:29: error: ‘files’ does not name a type; did you mean ‘fileno’?
 namespace merylutil::inline files::inline v1 {
                             ^~~~~
                             fileno
In file included from utility/src/files.H:35,
                 from utility/src/system/logging-v1.H:24,
                 from utility/src/system.H:30,
                 from utility/src/align/align-ksw2-driver.H:20,
                 from utility/src/align/align-ksw2-driver.C:20:
utility/src/files/compressed-v1.H:25:22: error: namespace name required before ‘inline’
 namespace merylutil::inline files::inline v1 {
                      ^~~~~~
utility/src/files/compressed-v1.H:25:22: error: expected ‘{’ before ‘inline’
utility/src/files/compressed-v1.H:25:29: error: ‘files’ does not name a type; did you mean ‘fileno’?
 namespace merylutil::inline files::inline v1 {
                             ^~~~~
                             fileno
In file included from utility/src/files.H:36,
                 from utility/src/system/logging-v1.H:24,
                 from utility/src/system.H:30,
                 from utility/src/align/align-ksw2-driver.H:20,
                 from utility/src/align/align-ksw2-driver.C:20:
utility/src/files/compressed-v1-reading.H:26:22: error: namespace name required before ‘inline’
 namespace merylutil::inline files::inline v1 {
                      ^~~~~~
utility/src/files/compressed-v1-reading.H:26:22: error: expected ‘{’ before ‘inline’
utility/src/files/compressed-v1-reading.H:26:29: error: ‘files’ does not name a type; did you mean ‘fileno’?
 namespace merylutil::inline files::inline v1 {
                             ^~~~~
                             fileno
In file included from utility/src/files.H:37,
                 from utility/src/system/logging-v1.H:24,
                 from utility/src/system.H:30,
                 from utility/src/align/align-ksw2-driver.H:20,
                 from utility/src/align/align-ksw2-driver.C:20:
utility/src/files/compressed-v1-writing.H:26:22: error: namespace name required before ‘inline’
 namespace merylutil::inline files::inline v1 {
                      ^~~~~~
utility/src/files/compressed-v1-writing.H:26:22: error: expected ‘{’ before ‘inline’
utility/src/files/compressed-v1-writing.H:26:29: error: ‘files’ does not name a type; did you mean ‘fileno’?
 namespace merylutil::inline files::inline v1 {
                             ^~~~~
                             fileno
In file included from utility/src/files.H:39,
                 from utility/src/system/logging-v1.H:24,
                 from utility/src/system.H:30,
                 from utility/src/align/align-ksw2-driver.H:20,
                 from utility/src/align/align-ksw2-driver.C:20:
utility/src/files/memoryMapped-v1.H:48:22: error: namespace name required before ‘inline’
 namespace merylutil::inline files::inline v1 {
                      ^~~~~~
utility/src/files/memoryMapped-v1.H:48:22: error: expected ‘{’ before ‘inline’
utility/src/files/memoryMapped-v1.H:48:29: error: ‘files’ does not name a type; did you mean ‘fileno’?
 namespace merylutil::inline files::inline v1 {
                             ^~~~~
                             fileno
In file included from utility/src/files.H:41,
                 from utility/src/system/logging-v1.H:24,
                 from utility/src/system.H:30,
                 from utility/src/align/align-ksw2-driver.H:20,
                 from utility/src/align/align-ksw2-driver.C:20:
utility/src/files/fasta-fastq-v1.H:23:22: error: namespace name required before ‘inline’
 namespace merylutil::inline files::inline v1 {
                      ^~~~~~
utility/src/files/fasta-fastq-v1.H:23:22: error: expected ‘{’ before ‘inline’
utility/src/files/fasta-fastq-v1.H:23:29: error: ‘files’ does not name a type; did you mean ‘fileno’?
 namespace merylutil::inline files::inline v1 {
                             ^~~~~
                             fileno
In file included from utility/src/align/align-ksw2-driver.C:20:
utility/src/align/align-ksw2-driver.H:25:22: error: namespace name required before ‘inline’
 namespace merylutil::inline align::inline ksw2::inline v1 {
                      ^~~~~~
utility/src/align/align-ksw2-driver.H:25:22: error: expected ‘{’ before ‘inline’
utility/src/align/align-ksw2-driver.H:25:29: error: ‘align’ does not name a type; did you mean ‘asin’?
 namespace merylutil::inline align::inline ksw2::inline v1 {
                             ^~~~~
                             asin
In file included from utility/src/align/align-ksw2-driver.C:21:
utility/src/align/align-ksw2.H:16:22: error: namespace name required before ‘inline’
 namespace merylutil::inline align::inline ksw2::inline v1 {
                      ^~~~~~
utility/src/align/align-ksw2.H:16:22: error: expected ‘{’ before ‘inline’
utility/src/align/align-ksw2.H:16:29: error: ‘align’ does not name a type; did you mean ‘asin’?
 namespace merylutil::inline align::inline ksw2::inline v1 {
                             ^~~~~
                             asin
In file included from utility/src/arrays.H:23,
                 from utility/src/align/align-ksw2-driver.C:23:
utility/src/datastructures/arrays-v1.H:25:22: error: namespace name required before ‘inline’
 namespace merylutil::inline arrays::inline v1 {
                      ^~~~~~
utility/src/datastructures/arrays-v1.H:25:22: error: expected ‘{’ before ‘inline’
utility/src/datastructures/arrays-v1.H:25:29: error: ‘arrays’ does not name a type
 namespace merylutil::inline arrays::inline v1 {
                             ^~~~~~
In file included from utility/src/sequence.H:23,
                 from utility/src/align/align-ksw2-driver.C:25:
utility/src/sequence/sequence-v1.H:26:22: error: namespace name required before ‘inline’
 namespace merylutil::inline sequence::inline v1 {
                      ^~~~~~
utility/src/sequence/sequence-v1.H:26:22: error: expected ‘{’ before ‘inline’
utility/src/sequence/sequence-v1.H:26:29: error: ‘sequence’ does not name a type; did you mean ‘sigqueue’?
 namespace merylutil::inline sequence::inline v1 {
                             ^~~~~~~~
                             sigqueue
In file included from utility/src/sequence.H:25,
                 from utility/src/align/align-ksw2-driver.C:25:
utility/src/sequence/dnaSeq-v1.H:47:22: error: namespace name required before ‘inline’
 namespace merylutil::inline sequence::inline v1 {
                      ^~~~~~
utility/src/sequence/dnaSeq-v1.H:47:22: error: expected ‘{’ before ‘inline’
utility/src/sequence/dnaSeq-v1.H:47:29: error: ‘sequence’ does not name a type; did you mean ‘sigqueue’?
 namespace merylutil::inline sequence::inline v1 {
                             ^~~~~~~~
                             sigqueue
In file included from utility/src/sequence.H:26,
                 from utility/src/align/align-ksw2-driver.C:25:
utility/src/sequence/dnaSeqFile-v1.H:70:22: error: namespace name required before ‘inline’
 namespace merylutil::inline sequence::inline v1 {
                      ^~~~~~
utility/src/sequence/dnaSeqFile-v1.H:70:22: error: expected ‘{’ before ‘inline’
utility/src/sequence/dnaSeqFile-v1.H:70:29: error: ‘sequence’ does not name a type; did you mean ‘sigqueue’?
 namespace merylutil::inline sequence::inline v1 {
                             ^~~~~~~~
                             sigqueue
utility/src/align/align-ksw2-driver.C:27:22: error: namespace name required before ‘inline’
 namespace merylutil::inline align::inline ksw2::inline v1 {
                      ^~~~~~
utility/src/align/align-ksw2-driver.C:27:22: error: expected ‘{’ before ‘inline’
utility/src/align/align-ksw2-driver.C:27:29: error: ‘align’ does not name a type; did you mean ‘asin’?
 namespace merylutil::inline align::inline ksw2::inline v1 {
                             ^~~~~
                             asin
make: *** [/home/hcaoad/Software/meryl/build/obj/lib/libmeryl.a/utility/src/align/align-ksw2-driver.o] Error 1
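
Every error in this log points at a declaration of the form namespace merylutil::inline files::inline v1 {. That nested inline namespace shorthand is a C++20 feature; to my knowledge GCC accepts it only from version 9 onward, which would explain why GCC 8.3.0 rejects it even with -std=c++2a. The sketch below shows the equivalent pre-C++20 spelling using hypothetical names (demo, files, v1, answer). It illustrates the language feature only and is not a suggested patch to meryl.

```cpp
#include <cassert>

// C++20 shorthand (rejected by compilers without nested inline namespace support):
//   namespace demo::inline files::inline v1 { ... }
//
// Equivalent spelling that older standards accept:
namespace demo {
inline namespace files {
inline namespace v1 {

inline int answer() { return 42; }

}  // namespace v1
}  // namespace files
}  // namespace demo

// Because files and v1 are inline namespaces, the short name and the fully
// qualified name refer to the same function.
inline void checkLookup() {
  assert(demo::answer() == demo::files::v1::answer());
}
```

In practice the simpler route is probably a newer compiler rather than rewriting the headers.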

'output' at end of command fails

@arangrhie reports:

% meryl output read-$hap.1cp.meryl [ less-than $HI [ greater-than $LO read-$hap.meryl ] ]

works, but

meryl [ less-than $HI [ greater-than $LO read-$hap.meryl ] ] output read-$hap.1cp.meryl

does not. Seems like the placement of output is important.

Logging of reads kept and total reads wrong in lookup -exclude

Just noticed when looking at some results, the order of values is swapped here for the total reads analysed and the number of reads kept.

fprintf(stderr, "\nIncluding %lu reads (or read pairs) out of %lu.\n", g->nReadsTotal, g->nReadsFound);

Should of course be

fprintf(stderr, "\nIncluding %lu reads (or read pairs) out of %lu.\n", g->nReadsFound, g->nReadsTotal);

How to cite meryl?

Sorry, hardly a technical question, but I didn't know where else to ask it.

If I have used meryl (and I have), how should I cite it in a manuscript?

Is this output correct?

Hey, thanks for your work.
I'm kinda lost. I'm using meryl to count k-mers from my reads on a cluster, and I'm not sure the run is correct: it wrote a log entry that only occurs when there is an error, but the log ends this way:
[image]
It does seem to have worked, but the output is this:
[image]
That doesn't seem right. Should it all be binary, or should it give me another file that I can read? I'm not sure and would like some help.
This is the script I used:
[image]
So, is that it? What am I seeing?

Thanks for your attention.

Feature request

The tool is great, it would be even better if it was possible to have another function:

function X: return kmers that occur in ANY input, set the count to the count in the first input.

or even better:

function X: return kmers that occur in ANY input, count them in any input.
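
The two requested behaviours can be sketched over in-memory maps as follows. This is only an illustration of the semantics, not how meryl stores or merges databases. The names KmerCounts, unionFirst and unionSum are hypothetical, and two readings are assumed: "the count in the first input" is taken as the count from the first input that contains the k-mer, and "count them in any input" is taken as the sum of counts over all inputs.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

using KmerCounts = std::map<std::string, std::uint64_t>;

// K-mers present in ANY input; the count comes from the first input that has the k-mer.
KmerCounts unionFirst(const std::vector<KmerCounts> &inputs) {
  KmerCounts out;
  for (const KmerCounts &in : inputs)
    for (const auto &kv : in)
      out.insert(kv);  // insert() leaves an already-present entry untouched
  return out;
}

// K-mers present in ANY input; the count is the sum over all inputs.
KmerCounts unionSum(const std::vector<KmerCounts> &inputs) {
  KmerCounts out;
  for (const KmerCounts &in : inputs)
    for (const auto &kv : in)
      out[kv.first] += kv.second;  // operator[] starts missing entries at 0
  return out;
}
```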

compile problem

Dear Author:
Compilation fails with this error:
meryl2/merylOpTemplate.C:159:44: error: cast from ‘const char*’ to ‘char’ loses precision [-fpermissive]
159 | for (char *suf = strchr(prName, '#'); ((suf) && (suf == '#')); suf++)
I don't understand C++. How do I change the ‘const char*’ from strchr(prName, '#') into ‘char’?
Thank you!
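
One const-correct way to write a loop like the one in the error message is to keep the pointer as const char * and compare the dereferenced character rather than the pointer itself. The sketch below assumes the loop is meant to walk the run of '#' characters starting at the first one in the name; that reading, and the function name countHashRun, are assumptions rather than meryl's actual code.

```cpp
#include <cassert>
#include <cstring>

// Count the consecutive '#' characters starting at the first '#' in name.
// Returns 0 when name contains no '#'.
int countHashRun(const char *name) {
  assert(name != nullptr);
  int n = 0;
  // std::strchr on a const char * returns a const char *, so the loop
  // variable is const as well; *suf (not suf) is compared with '#'.
  for (const char *suf = std::strchr(name, '#'); suf != nullptr && *suf == '#'; ++suf)
    ++n;
  return n;
}
```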

Q: Use this repo or Canu 1.8 for read-partitioning?

Hi,

I have a (hopefully) simple question: where should I get meryl for partitioning reads for trio assembly? This repo or the Canu repo?

Background (if that matters): we are interested in not only Canu, but in other recently released long read assemblers as well.

Thank you!

Support for printing top x % of most frequent k-mers

There is already an option greater-than N for printing k-mers that occur more than N times in the input. It would also be nice to have a similar option most-frequent x, where x is in the range (0,1]. This option would return the top x fraction of the most repetitive k-mers. Similarly, least-frequent x may also be useful.
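
A rough sketch of what such a most-frequent selection could compute, over an in-memory map rather than a meryl database. The function name mostFrequent is hypothetical, and "top x fraction" is assumed to mean a fraction of the distinct k-mers, rounded up, with ties broken by k-mer string.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

using KmerCount = std::pair<std::string, std::uint64_t>;

// Return the ceil(x * N) k-mers with the highest counts, most frequent first,
// where N is the number of distinct k-mers and 0 < x <= 1.
std::vector<KmerCount> mostFrequent(const std::map<std::string, std::uint64_t> &counts, double x) {
  assert(x > 0.0 && x <= 1.0);
  std::vector<KmerCount> sorted(counts.begin(), counts.end());
  std::sort(sorted.begin(), sorted.end(), [](const KmerCount &a, const KmerCount &b) {
    if (a.second != b.second) return a.second > b.second;  // higher count first
    return a.first < b.first;                              // deterministic tie-break
  });
  const std::size_t keep =
      static_cast<std::size_t>(std::ceil(x * static_cast<double>(sorted.size())));
  sorted.resize(std::min(keep, sorted.size()));
  return sorted;
}
```

A least-frequent variant would only need the comparison on counts reversed.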

bitArray-v1.H:86 problem when running the pipeline

Hi everyone,

I have some issues running the pipeline on a 600 Mb genome and I would like to know if someone could help me. Here is my log with the error that I ran into.

Opening outputs:
'-'
setBit()-- ERROR: position=305 > maximum available=0
meryl-lookup: utility/src/bits/bitArray-v1.H:86: void merylutil::bits::v1::bitArray::setBit(uint64, bool): Assertion `position < _maxBitAvail' failed.

Failed with 'Aborted'; backtrace (libbacktrace):
utility/src/system/system-stackTrace-v1.C::82 in _Z17AS_UTL_catchCrashiP9siginfo_tPv()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
utility/src/bits/bitArray-v1.H::86 in _ZN9merylutil4bits2v18bitArray6setBitEmb()
meryl-lookup/dump.C::128 in processSequence()
utility/src/system/sweatShop-v1.C::308 in _ZN9sweatShop6workerEP15sweatShopWorker()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
/mnt/bin/merqury/merqury_v1.3//eval/spectra-cn.sh: line 71: 37401 Aborted (core dumped) meryl-lookup -bed -sequence $asm_fa -mers ${asm}.0.meryl > ${asm}_only.bed

Any feedback would be very appreciated.

Cheers,
Mike.

Segfault with k >= 38

Hello,

I am unable to run meryl with kmer sizes greater than or equal to 38. Whenever I do I get an error that looks like:

meryl: utility/bits.C:711: uint32 stuffedBits::setBinary(uint32, uint64): Assertion `width < 65' failed.
meryl: utility/bits.C:711: uint32 stuffedBits::setBinary(uint32, uint64): Assertion `width < 65' failed.
meryl: utility/bits.C:711: uint32 stuffedBits::setBinary(uint32, uint64): Assertion `width < 65' failed.
meryl: utility/bits.C:711: uint32 stuffedBits::setBinary(uint32, uint64): Assertion `width < 65' failed.
....
Failed with 'Segmentation fault'; backtrace (libbacktrace):
utility/system-stackTrace.C::89 in _Z17AS_UTL_catchCrashiP9siginfo_tPv()
(null)::0 in (null)()
(null)::0 in (null)()
Segmentation fault (core dumped)

Below I have included how I install meryl along with my example commands and fasta input.

Any help would be greatly appreciated!

Thanks!
Mitchell
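
The two run logs in this report show prefixSize 10 with suffixSize 64 for k=37, and prefixSize 10 with suffixSize 66 for k=38. Those numbers are consistent with 2 bits per base minus the prefix bits, and 66 exceeds the width < 65 limit in the failing assertion. The sketch below only reproduces that arithmetic; the function names are hypothetical, and the relation is inferred from the logs rather than taken from meryl's source.

```cpp
#include <cassert>

// Bits needed to store a k-mer at 2 bits per base.
constexpr unsigned kmerBits(unsigned k) { return 2 * k; }

// Bits left for the suffix once prefixBits are used as the bucket prefix.
unsigned suffixBits(unsigned k, unsigned prefixBits) {
  assert(kmerBits(k) >= prefixBits);
  return kmerBits(k) - prefixBits;
}

// True when the suffix fits in a single 64-bit word.
// ASSUMPTION: 64 is the limit implied by the `width < 65` assertion.
bool suffixFitsInWord(unsigned k, unsigned prefixBits) {
  return suffixBits(k, prefixBits) <= 64;
}
```

With the logged prefix of 10 bits, k=37 gives a 64-bit suffix and k=38 gives 66 bits, which lines up with k=37 working and k=38 failing in this build.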

Install script:

rm -rf meryl/
module load gcc/8.1.0
git clone https://github.com/marbl/meryl.git
cd meryl/src
make -j 24
cd ../../

My test fasta sequence:

>1
AAAAAAAAAAAAAAAAAAAAAAAAAAACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTGGGGGGGGGGGGGGGGGGGGGGGGGGGG

My test run with k=37.

meryl/Linux-amd64/bin/meryl count k=37 threads=32 output test test.fasta
Enabling 32 threads.

Counting 127 (estimated) canonical 37-mers from 1 input file:
    sequence-file: test.fasta


SIMPLE MODE
-----------

  Disabled for mers larger than 20.


COMPLEX MODE
------------

prefix     # of   struct   kmers/    segs/      min     data    total
  bits   prefix   memory   prefix   prefix   memory   memory   memory
------  -------  -------  -------  -------  -------  -------  -------
     1     2  P   240  B    64  M     1  S   128 kB   128 kB   128 kB
     2     4  P   480  B    32  M     1  S   256 kB   256 kB   256 kB
     3     8  P   960  B    16  M     1  S   512 kB   512 kB   512 kB
     4    16  P  1920  B     8  M     1  S  1024 kB  1024 kB  1025 kB
     5    32  P  3840  B     4  M     1  S  2048 kB  2048 kB  2051 kB
     6    64  P  7680  B     2  M     1  S  4096 kB  4096 kB  4103 kB
     7   128  P    15 kB     1  M     1  S  8192 kB  8192 kB  8207 kB
     8   256  P    30 kB     1  M     1  S    16 MB    16 MB    16 MB
     9   512  P    60 kB     1  M     1  S    32 MB    32 MB    32 MB
    10  1024  P   120 kB     1  M     1  S    64 MB    64 MB    64 MB  Best Value!
    11  2048  P   240 kB     1  M     1  S   128 MB   128 MB   128 MB
    12  4096  P   480 kB     1  M     1  S   256 MB   256 MB   256 MB
    13  8192  P   960 kB     1  M     1  S   512 MB   512 MB   512 MB
    14    16 kP  1920 kB     1  M     1  S  1024 MB  1024 MB  1025 MB
    15    32 kP  3840 kB     1  M     1  S  2048 MB  2048 MB  2051 MB


FINAL CONFIGURATION
-------------------

Configured complex mode for 0.063 GB memory per batch, and up to 1 batch.

kmerCountFileWriter()-- Creating 'test' for 37-mers, with prefixSize 10 suffixSize 64 numFiles 64
Loading kmers from 'test.fasta' into buckets.
Used 0.277 GB out of 2015.055 GB to store           87 kmers.

Writing results to 'test', using 32 threads.
finishIteration()--

Finished counting.
Bye.

My test run with k=38.

meryl/Linux-amd64/bin/meryl count k=38 threads=32 output test test.fasta
Enabling 32 threads.

Counting 127 (estimated) canonical 38-mers from 1 input file:
    sequence-file: test.fasta


SIMPLE MODE
-----------

  Disabled for mers larger than 20.


COMPLEX MODE
------------

prefix     # of   struct   kmers/    segs/      min     data    total
  bits   prefix   memory   prefix   prefix   memory   memory   memory
------  -------  -------  -------  -------  -------  -------  -------
     1     2  P   240  B    64  M     1  S   128 kB   128 kB   128 kB
     2     4  P   480  B    32  M     1  S   256 kB   256 kB   256 kB
     3     8  P   960  B    16  M     1  S   512 kB   512 kB   512 kB
     4    16  P  1920  B     8  M     1  S  1024 kB  1024 kB  1025 kB
     5    32  P  3840  B     4  M     1  S  2048 kB  2048 kB  2051 kB
     6    64  P  7680  B     2  M     1  S  4096 kB  4096 kB  4103 kB
     7   128  P    15 kB     1  M     1  S  8192 kB  8192 kB  8207 kB
     8   256  P    30 kB     1  M     1  S    16 MB    16 MB    16 MB
     9   512  P    60 kB     1  M     1  S    32 MB    32 MB    32 MB
    10  1024  P   120 kB     1  M     1  S    64 MB    64 MB    64 MB  Best Value!
    11  2048  P   240 kB     1  M     1  S   128 MB   128 MB   128 MB
    12  4096  P   480 kB     1  M     1  S   256 MB   256 MB   256 MB
    13  8192  P   960 kB     1  M     1  S   512 MB   512 MB   512 MB
    14    16 kP  1920 kB     1  M     1  S  1024 MB  1024 MB  1025 MB
    15    32 kP  3840 kB     1  M     1  S  2048 MB  2048 MB  2051 MB


FINAL CONFIGURATION
-------------------

Configured complex mode for 0.063 GB memory per batch, and up to 1 batch.

kmerCountFileWriter()-- Creating 'test' for 38-mers, with prefixSize 10 suffixSize 66 numFiles 64
Loading kmers from 'test.fasta' into buckets.
Used 0.277 GB out of 2015.055 GB to store           86 kmers.

Writing results to 'test', using 32 threads.
meryl: utility/bits.C:711: uint32 stuffedBits::setBinary(uint32, uint64): Assertion `width < 65' failed.
meryl: utility/bits.C:711: uint32 stuffedBits::setBinary(uint32, uint64): Assertion `width < 65' failed.
meryl: utility/bits.C:711: uint32 stuffedBits::setBinary(uint32, uint64): Assertion `width < 65' failed.
meryl: utility/bits.C:711: uint32 stuffedBits::setBinary(uint32, uint64): Assertion `width < 65' failed.

Failed with '
Failed with '
Failed with 'Aborted'; backtrace (libbacktrace):

Failed with 'AbortedAborted'; backtrace (libbacktrace):
'; backtrace (libbacktrace):
Aborted'; backtrace (libbacktrace):
utility/system-stackTrace.C::89 in _Z17AS_UTL_catchCrashiP9siginfo_tPv()

Failed with 'Segmentation fault'; backtrace (libbacktrace):
meryl: utility/bits.C:711: uint32 stuffedBits::setBinary(uint32, uint64): Assertion `width < 65' failed.
meryl: utility/kmers.H:526: void kmerCountFileIndex::set(uint64, FILE*, uint64): Assertion `_blockPosition <= AS_UTL_ftell(F)' failed.

Failed with 'Aborted'; backtrace (libbacktrace):

Failed with 'Aborted'; backtrace (libbacktrace):
meryl: utility/bits.C:711: uint32 stuffedBits::setBinary(uint32, uint64): Assertion `width < 65' failed.

Failed with 'Aborted'; backtrace (libbacktrace):
utility/system-stackTrace.C::89 in _Z17AS_UTL_catchCrashiP9siginfo_tPv()

Failed with 'Segmentation fault'; backtrace (libbacktrace):
utility/system-stackTrace.C::89 in _Z17AS_UTL_catchCrashiP9siginfo_tPv()

Failed with 'Segmentation fault'; backtrace (libbacktrace):
utility/system-stackTrace.C::89 in _Z17AS_UTL_catchCrashiP9siginfo_tPv()
(null)::0 in (null)()
(null)::0 in (null)()
Segmentation fault (core dumped)
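
The two logs above differ in one reported value: for k=37 the writer line shows prefixSize 10 and suffixSize 64, while for k=38 it shows suffixSize 66, and the failing assertion is `width < 65`. A minimal Python sketch of that arithmetic follows, assuming two bits per base and that the suffix is whatever remains after the prefix bits. This is an inference from the logged values, not a statement about meryl's source, and both function names are hypothetical.

```python
def suffix_bits(k, prefix_bits):
    """Bits left for the suffix, assuming 2 bits per base (an assumption)."""
    return 2 * k - prefix_bits


def fits_assertion(k, prefix_bits):
    """Mirror the `width < 65` condition quoted in the assertion message."""
    return suffix_bits(k, prefix_bits) < 65


# prefixSize 10 is the value reported in both logs.
for k in (37, 38):
    print(k, suffix_bits(k, 10), fits_assertion(k, 10))
```

Under these assumptions k=37 is the largest k that passes with a 10-bit prefix, which matches the observed behaviour.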

simple-dump or analogous tool

Hi there -

Thanks for putting together this easy-to-use tool! I am working on the hap_kmer blob plot for my own unzip and canu assemblies, but am running into difficulty.

Namely I am having trouble generating the input for the script hap_kmer_plot.R. The input seems to come from scripts/meryl_count/meryl2_hapmers.sh, however I can't seem to find simple-dump in this repo of meryl.

Nevertheless I created .mcdat input following the triobinningScripts repo directions (using the meryl version there) and simple-dump does not execute as expected.

e.g. when given:
simple-dump -s hap1.k21.filt.nohap2k21.filt -e hap1.k21.filt.only -m 21
it prints the usage:
usage: simple-dump -m mersize -mers mers [-exist existDB] -seq fasta > output

The same issue happens with meryl included in my canu 1.7 install.

Does the current meryl here offer a similar methodology for inputting a k-mer database and an assembly fasta and outputting the counts per scaffold/contig? Otherwise do you have any suggestions for which older branch I can leverage for this purpose?

I realize this is likely all under development which I understand! Thanks for any suggestions.

Erik

Conda version issues

I've noticed that conda wants to default to installing an ancient meryl version labeled "2013" unless a user specifies a version. This seems undesirable and likely to cause confusion for users. They will likely experience "Unknown option" errors when attempting to follow modern instructions/usage because the 2013 version has a significantly reduced feature set and a very different interface. The fix is simply installing the actual most recent version with something like the following...

conda install -c bioconda meryl=1.4.1

...I'm not 100% sure whether the maintainers of this git repo are also responsible for the conda meryl package; your readme doesn't list conda as an installation method. If you are, it would be good to remove or rename the 2013 version so that 1.4.1 appears as the latest version and is installed by default. Alternatively, if there are fears that removing or renaming the 2013 version would break a project somewhere, I'd suggest that meryl adopt a versioning scheme that simply prefixes the year onto the version. This would mean creating a new package version named 2023.1.4.1, which conda should prefer over the 2013 version.

If the maintainers of this git repo have nothing to do with the conda package, or simply don't want to or aren't able to change it, I'd recommend leaving this issue open to help people who run into this problem.

the parental specific kmers were intersected in progeny genome

Hi, I used meryl to identify individual-specific kmers with the difference subcommand, as below:
meryl difference paternal.meryl/ maternal.meryl/ output paternal-specific.meryl
meryl difference maternal.meryl/ paternal.meryl/ output maternal-specific.meryl
Following the description of the difference subcommand, I put the paternal kmers first and the maternal kmers second to identify paternal-specific kmers, then the maternal kmers first and the paternal kmers second to identify maternal-specific kmers.

I then used meryl-lookup to locate the paternal-specific and maternal-specific kmers in the progeny, as below:
meryl-lookup -sequence progeny.genome.fa -mers maternal-specific.meryl -bed-runs > progeny-maternal.bed
meryl-lookup -sequence progeny.genome.fa -mers paternal-specific.meryl -bed-runs > progeny-paternal.bed
I compared the two BED files with bedtools intersect -a progeny-maternal.bed -b progeny-paternal.bed -wa -wb | wc -l and found over 50,000 overlapping intervals.

My confusion is this: if the paternal-specific and maternal-specific kmers are individual-specific, why do they overlap in the progeny genome?

I would be grateful for any suggestions.

How to use meryl

Hi,

I have a fasta file which I wish to turn into kmers using meryl. I have meryl version 1.0 installed on our HPC.
I followed this guide http://kmer.sourceforge.net/wiki/index.php/Getting_Started_with_Meryl

The following is the command I used.

module load meryl
meryl -P -m 21 -s file_name.fasta

I get the following message:
Don't know what to do with '-m'.
Don't know what to do with '21'.
Don't know what to do with '-s'.
Don't know what to do with 'file_name.fasta'

I would appreciate it if you could guide me on how to run meryl.

Thank you

segmentation fault for k<6

Hi, I am trying to create a meryl database with a kmer size of 4 or 5, but I keep getting the error below. It works for k greater than or equal to 6, though.

meryl count k=5 output s_chlorontus_5mer /home/jon/Working_Files/sea_cuke_species_data/stichopus_chloronotus/SRR8499559_1.fastq

Found 1 command tree.

Counting 38 (estimated) billion canonical 5-mers from 1 input file:
    sequence-file: /home/jon/Working_Files/sea_cuke_species_data/stichopus_chloronotus/SRR8499559_1.fastq


SIMPLE MODE
-----------

  5-mers
    -> 1024 entries for counts up to 65535.
    -> 16 kbits memory used

  41845276868 input bases
    -> expected max count of 167381107, needing 13 extra bits.
    -> 13 kbits memory used

  3712  B memory needed

Failed with 'Floating point exception'; backtrace (libbacktrace):

Failed with 'Segmentation fault'; backtrace (libbacktrace):
Segmentation fault (core dumped)
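
The SIMPLE MODE figures in this log are internally consistent, which can be checked by hand. A small Python sketch follows, assuming one counter per possible 5-mer and taking the 16 base bits ("counts up to 65535") and the 13 extra bits straight from the log. How meryl arrives at 13 is not shown in the log and is not modelled here; `simple_mode_bytes` is a hypothetical helper.

```python
def simple_mode_bytes(k, base_bits, extra_bits):
    """Return (number of counters, total bytes) for a dense count table."""
    entries = 4 ** k  # one counter per possible k-mer over A, C, G, T
    total_bits = entries * (base_bits + extra_bits)
    return entries, total_bits // 8


entries, nbytes = simple_mode_bytes(k=5, base_bits=16, extra_bits=13)
print(entries, "entries,", nbytes, "bytes")
```

The result agrees with the "1024 entries" and "3712  B memory needed" lines, so the crash happens after the memory estimate has been computed as intended.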

Using meryl to prepare for Merqury

Hi,

I'd like to evaluate my assembly. I have several Illumina fastq files. For each library I have one fastq file for forward reads and one for reverse reads.
What is the best way to run meryl so that its output can be used with Merqury?
Is something like this okay?
meryl count k=21 lib1_1.fastq lib1_2.fastq lib2_1.fastq lib2_2.fastq ... output Illumina.meryl

Thanks

Seg fault in union-sum

Hi, I'm having trouble setting up my meryl dbs for a set of 10x data. I ran _submit_build_10x.sh from merqury and am getting a segmentation fault in the union-sum step. I am only using k=5, but implausibly large numbers are reported for suffixsize. The log file is below:

asm_bApuApu_10x.union_sum.27859520.log

Here is an example of one of the 8 count log files:

asm_bApuApu_10x.count.27859519_1.log

I am hoping that a k of 5 is not an issue, but am hoping you'll be able to point me to some parameters in the _submit_build.sh file that should be changed.

Meryl release: v1.0

Distinct parameter not documented

The distinct parameter, used for generating the high-frequency k-mer datasets required by winnowmap, is not documented anywhere, as far as I can tell.

  • Mapping ONT or PacBio-hifi WGS reads
meryl count k=15 output merylDB ref.fa
meryl print greater-than distinct=0.9998 merylDB > repetitive_k15.txt
winnowmap -W repetitive_k15.txt -ax map-ont ref.fa ont.fq.gz > output.sam 

Use sequence to query meryl db

I want to generate a db of all kmers and their counts for a reference genome using meryl count, then for thousands of small (~1-5 kbp) sequences I want to extract all kmers and find their counts in the genome kmer db.

Is there a way to provide a short sequence as an argument to meryl to query its kmers against an existing db?

It seems like it would not be efficient to run meryl count on all of the short seqs and have to clean up the .meryl files between each query.
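
To make the request concrete, here is a rough Python sketch of the lookup being asked for: slide a window over a short query and report each k-mer's count from a table of reference counts. The `reference_counts` dict is a toy stand-in for a k-mer database and `kmer_counts_for` is a hypothetical helper; neither is part of meryl, and canonicalisation is ignored for brevity.

```python
def kmers(seq, k):
    """Yield every k-length substring of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def kmer_counts_for(query, k, reference_counts):
    """Return (kmer, count) pairs; k-mers absent from the table count as 0."""
    return [(m, reference_counts.get(m, 0)) for m in kmers(query, k)]


reference_counts = {"ACG": 4, "CGT": 2}  # toy stand-in for a k-mer database
for kmer, count in kmer_counts_for("ACGTA", 3, reference_counts):
    print(kmer, count)
```

The point of the sketch is that the reference table is built once and each query only costs one pass over its own k-mers, with no per-query database to create or clean up.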

eval/qv.sh meryl command not working

Hi,

I am using meryl together with merqury via the script eval/qv.sh, which repeatedly passes .meryl files as inputs, but I keep getting this error:
Can't interpret '*.0.meryl': not a meryl command, option, or recognized input file.

Is there an error in the script provided, or is it an issue with meryl?
Even when I try meryl difference with two .meryl files, I get the same error. Running meryl-1.3/bin/meryl difference 1.meryl 2.meryl output all.meryl gives:
Can't interpret '2.meryl': not a meryl command, option, or recognized input file.

Best,

How do you plot?

Hello, I did this

meryl count k=17 m64072_200915_142348.Q20.fastq.gz output ref.meryl

there must be a script to plot it right? Could you tell me which one? Thanks a lot

EDIT: to be clear I would like to get the same result as with jellyfish when we passed it to genomescope. Is it possible?

Thanks
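
One possible route, sketched in Python under the assumption that the text produced by `meryl print` has the two-column `kmer count` layout seen in other reports on this page: tally how many distinct k-mers occur at each count, which gives the kind of two-column frequency table that histogram-based tools typically expect. `count_histogram` is a hypothetical helper, and whether GenomeScope accepts this output unchanged has not been verified here.

```python
from collections import Counter


def count_histogram(lines):
    """Map each k-mer count to the number of distinct k-mers with that count."""
    hist = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) != 2 or not fields[1].isdigit():
            continue  # skip blank lines and anything that is not "kmer count"
        hist[int(fields[1])] += 1
    return sorted(hist.items())


sample = [
    "ACACACACACACACACTACTA   1",
    "ACACACACACACTACTACTAC   2",
    "ACACACACACTACTACTACTA   4",
    "ACACACACACACACTACTACT   1",
]
for count, n_kmers in count_histogram(sample):
    print(count, n_kmers)
```

For a real read set the printed k-mer list is very large, so streaming the lines from a pipe rather than holding them in a list would be the practical choice.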

meryl-lookup -include

[image]
This is the parameter list provided by the author, but why is it not included in meryl-lookup?
[image]

Hope for some guidance

version question about meryl

Hi,

Thank you for this awesome software!

During the kmer calling process, I noticed that different versions of Meryl take significantly different amounts of time. Version 1.3.1 is almost three times faster than the version I installed in 2020 using conda, which reports itself as version 1.7 with --version. However, with the same dataset the database sizes differ, even though both generate the same kmer count (verified with to_hist_for_plotting.sh, which shows the same total count).

I'm curious: are there any significant differences between the versions, or are they generally compatible?

I also noticed that the Meryl databases created by different versions seem to have the same structure. Can I use these databases interchangeably, even if they were built with different versions of Meryl?

Homopolymer compression is not applied if the first read file is empty

Running count compress with multiple read files does not apply homopolymer compression when the first file is empty. The following command creates an index without homopolymer compression:

meryl count compress k=21 threads=4 memory=32g empty.fa reads.fa output kmers_withempty

But putting the empty file anywhere other than first correctly creates a homopolymer-compressed index:

meryl count compress k=21 threads=4 memory=32g reads.fa empty.fa output kmers_withempty2

meryl print shows that the first database is not homopolymer compressed but the second is:

$ meryl print kmers_withempty/ | head

Found 1 command tree.

PROCESSING TREE #1 using 1 thread.
  opLessThan
    kmers_withempty/
    print to (stdout)
AAAAAAAAAAAAAAAAATAAG   1
AAAAAAAAAAAAAAAACTACA   1
AAAAAAAAAAAAAAAATAAGG   1
AAAAAAAAAAAAAAACAATAC   1
AAAAAAAAAAAAAAACTACAG   1
AAAAAAAAAAAAAAATAAGGA   1
AAAAAAAAAAAAAACAATACT   1
AAAAAAAAAAAAAACTACAGA   1
AAAAAAAAAAAAAATAAGGAG   1
AAAAAAAAAAAAAAGTACTTT   1

$ meryl print kmers_withempty2 | head

Found 1 command tree.

PROCESSING TREE #1 using 1 thread.
  opLessThan
    kmers_withempty2/
    print to (stdout)
ACACACACACACACACTACTA   1
ACACACACACACACTACTACT   1
ACACACACACACATCATATAC   1
ACACACACACACTACAGACAT   1
ACACACACACACTACAGATCA   1
ACACACACACACTACTACTAC   2
ACACACACACATCATATACAG   1
ACACACACACTACAGACATCA   1
ACACACACACTACAGATCATC   1
ACACACACACTACTACTACTA   4

$ meryl --version
meryl snapshot v1.4-development +29 changes (r969 97d5923dd69ebc3efed67fc466c21ed8c5e6670b)
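
For readers unfamiliar with the term, homopolymer compression collapses each run of identical bases to a single base before k-mers are taken, which is why the first listing (long runs of A) looks uncompressed and the second does not. A minimal Python illustration of the transformation itself follows; `hpc` and `is_compressed` are hypothetical helpers and say nothing about how meryl implements `compress`.

```python
from itertools import groupby


def hpc(seq):
    """Collapse every run of identical bases to a single base."""
    return "".join(base for base, _run in groupby(seq))


def is_compressed(seq):
    """True if no two adjacent bases are identical."""
    return all(a != b for a, b in zip(seq, seq[1:]))


# First k-mer from each listing above.
for kmer in ("AAAAAAAAAAAAAAAAATAAG", "ACACACACACACACACTACTA"):
    print(kmer, is_compressed(kmer), hpc(kmer))
```

A k-mer drawn from compressed sequence can never contain two identical adjacent bases, so a quick scan of `meryl print` output is enough to tell the two cases apart.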

database file parameters used in count output are too large

The prefixSize used for writing count output is too large when the inputs are also large.

https://github.com/marbl/meryl/blob/master/src/meryl/merylOp-countThreads.C#L404

This line sets the output prefix based on the 'optimal' prefix used for counting. It works fine for moderate kmer sizes (e.g., 22), but for larger ones (e.g., 28) the database chunks are too big for merging.

Example:

prefix     # of   struct   kmers/    segs/      min     data    total
  bits   prefix   memory   prefix   prefix   memory   memory   memory
------  -------  -------  -------  -------  -------  -------  -------
    14    16 kP    66 MB    98 kM   130  S    64 MB  8320 MB  8386 MB
    15    32 kP   117 MB    49 kM    64  S   128 MB  8192 MB  8309 MB
    16    64 kP   217 MB    24 kM    31  S   256 MB  7936 MB  8153 MB  Best Value!
    17   128 kP   420 MB    12 kM    16  S   512 MB  8192 MB  8612 MB
    18   256 kP   824 MB  6314  M     8  S  1024 MB  8192 MB  9016 MB
> meryl dumpIndex 001.meryl
Opened '001.meryl'.
  magic          0x646e496c7972656d33302e765f5f7865 'merylIndex__v.03'
  prefixSize     16
  suffixSize     40
  numFilesBits   6 (64 files)
  numBlocksBits  10 (1024 blocks)

But after merging, the prefix is more reasonable (though this is, iirc, a fixed hardcoded size). Merging seems to want to use around 1 GB per input database; it is not clear why.

> meryl dumpIndex 00x.meryl/
Opened '00x.meryl/'.
  magic          0x646e496c7972656d33302e765f5f7865 'merylIndex__v.03'
  prefixSize     12
  suffixSize     44
  numFilesBits   6 (64 files)
  numBlocksBits  6 (64 blocks)

kmer-mask support?

Hi,

Good job on the rewrite of meryl. We found some issues with it in our tests too.

Would it be OK to add kmer-mask support too? As a first step, it would be enough to add build support for the kmer-mask source.

Best Regards

ERROR: operation 'opNothing' cannot use sequence files as inputs.

Hi,
I am trying to estimate read accuracy for an assembly using this pipeline: https://genome.cshlp.org/content/suppl/2020/09/02/gr.263566.120.DC1/Supplemental_Material_.pdf

I have the meryl database, but when I try to run the final step I get this error:
meryl -lookup -memory 40 -threads 16 -existence -sequence ../reads.fastq.gz -output k10.kmers -mers k10_filtered.meryl
ERROR: operation 'opNothing' cannot use sequence files as inputs.

Any idea what is happening?
Thanks,
Gonzalo

add conversion to 'usual' acgt ordering to get canonical mer correct

In 'print' output, it would be convenient to have the correct canonical kmer reported. Translate from ACTG to ACGT, recompute the canonical kmer, and output that. The output order will still be deterministic, but no longer alphabetical.

Option 'print-acgt'? Generalize it to 'print-****' to get any ordering?

If properly sorted kmers are required, data could be loaded into the standard lookup structure and output from there. Each bucket would need to be sorted first.
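
A sketch of the proposal in Python, assuming (from the ordering visible in `print` output elsewhere on this page, where a T sorts before a G) that meryl ranks bases as A < C < T < G. `canonical` is a hypothetical helper that picks the smaller of a k-mer and its reverse complement under a caller-supplied base order, so the same function shows both the current and the requested behaviour.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Complement every base, then reverse the string."""
    return kmer.translate(COMPLEMENT)[::-1]


def canonical(kmer, order):
    """Smaller of kmer and its reverse complement under the given base order."""
    rank = {base: i for i, base in enumerate(order)}

    def key(s):
        return [rank[b] for b in s]

    return min(kmer, reverse_complement(kmer), key=key)


kmer = "GAA"  # its reverse complement is "TTC"
print("ACTG order:", canonical(kmer, "ACTG"))
print("ACGT order:", canonical(kmer, "ACGT"))
```

The two orderings disagree exactly when the first differing position between a k-mer and its reverse complement pairs a G with a T, so a 'print-acgt' style option would need to recompute the canonical form per k-mer rather than just relabel the output.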

[Compile] ‘gettimeofday’ was not declared in this scope

While compiling on Ubuntu 22.04.2 using GCC 12.2.0, I get a number of errors like the following:

utility/src/system/time-v1.H:88:5: error: ‘gettimeofday’ was not declared in this scope; did you mean ‘SYS_gettimeofday’?
88 |     gettimeofday(&tp, nullptr);

As per this post on stackoverflow, adding...

#define _BSD_SOURCE

#include <sys/time.h>

...to utility/src/system/time-v1.H seems to resolve this issue.

feature request: filter reads with kmers from bam file?

Would it be possible to add a feature to run meryl-lookup exclude/include on a BAM instead of a fasta and output BAM? This would be very useful for filtering reads from PacBio or ONT data in their original BAM format without going through a FASTA intermediary. Failing that, could it at least output a list of read names from the fasta instead of generating the filtered fasta?

Thanks,
KF

API docs

Are there any draft docs available for the Meryl C++ API?

Or, very optimistically, has anyone written any python bindings for doing kmer set operations with Meryl?

How to install meryl2?

@treangen

Dear author, thank you very much for writing this software; it is very convenient. I would like to try meryl2 early. How do I install it?
