
egsa's People

Contributors

felipelouza, jiningsong

egsa's Issues

memory error

Hi @felipelouza ,

I would like to try out this package, so I ran:

git clone https://github.com/felipelouza/egsa.git
cd egsa
make
./egsa dataset/input-100.txt 2

I then get this error:

SIGMA = 255
DIR = dataset/
INPUT = input-100.txt
K = 2
MEMLIMIT = 2048.00 MB
CHECK = 0
COMPUTE_BWT = 0
WORKSPACE = 13.n bytes
malloc_count ### free(0x7fcd04fffff0) has no sentinel !!! memory corruption?
egsa(5619,0x7fffa48d5380) malloc: *** error for object 0x7fcd04fffff0: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug

What am I doing wrong?

I am using macOS 10.13.6.

Best,

segfault when writing the output file

With the input file testatestb\0blablabla, I get a segfault (tried with K = 0, 1, and 2):

DIR = ./
INPUT = lalouli.txt
K = 0
MEMLIMIT = 2048.00 MB
CHECK = 0
COMPUTE_BWT = 0
WORKSPACE = 13.n bytes

### PREPROCESSING ###
K = 0
PARTITIONS = 1
TOTAL = 20 bytes	0.00 MB
CLOCK = 0.000658 TIME = 0.000000
0.000658	0.000000

### PHASE 1 ###
CLOCK = 0.000184 TIME = 0.000000
0.000184	0.000000

### PHASE 2 ###

INDUCING:
alfa	TOTAL	INDUCED	%:
ALL)	19	0	0.00
Segmentation fault (core dumped)

The output file is created, but it is empty.

How do we read max-length LCP text?

Hi!

This is an excellent library, thanks for posting!

Is there a way to get the actual text of the maximum-length LCP? When I run the code I can see the maximum length, but not the substring itself.

I am afraid I do not have much experience with advanced C/C++, so any help is greatly appreciated! Thanks!
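
For readers with the same question, the general idea can be sketched independently of egsa. Given a suffix array SA and an LCP array for a text T, where LCP[i] is the length of the common prefix of the suffixes at SA[i-1] and SA[i], the maximum-length LCP text is T[SA[i] : SA[i] + LCP[i]] for the position i at which LCP is largest. The Python sketch below builds both arrays naively from a string; it does not read egsa's output file, and the function names are made up for illustration.

```python
def suffix_array(text):
    # Naive construction by sorting all suffixes; for small demonstrations only.
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_array(text, sa):
    # lcp[i] = length of the longest common prefix of the suffixes
    # starting at sa[i-1] and sa[i]; lcp[0] is 0 by convention.
    lcp = [0] * len(sa)
    for i in range(1, len(sa)):
        a, b = sa[i - 1], sa[i]
        n = 0
        while a + n < len(text) and b + n < len(text) and text[a + n] == text[b + n]:
            n += 1
        lcp[i] = n
    return lcp


def longest_repeat(text):
    # Returns the substring behind the largest LCP value ("" if there is none).
    if not text:
        return ""
    sa = suffix_array(text)
    lcp = lcp_array(text, sa)
    i = max(range(len(lcp)), key=lcp.__getitem__)
    return text[sa[i]:sa[i] + lcp[i]]


print(longest_repeat("banana"))  # prints: ana
```

With a generalized suffix array over several strings, the same lookup applies once each suffix position is mapped back to its source string and offset.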

Output interpretation

Hi @felipelouza ,

Finally, I have run the library on a Linux machine :)

I am not sure I am interpreting the normal output of this library correctly, because I get a larger LCS size with k=50 than with k=5. What does "size" mean in the output?

k=5

ubuntu@ip-172-31-32-99:~/egsa/egsa$ ./egsa  dataset/input-100.txt 5
SIGMA = 255
DIR = dataset/
INPUT = input-100.txt
K = 5
MEMLIMIT = 2048.00 MB
CHECK = 0
COMPUTE_BWT = 0
WORKSPACE = 13.n bytes
### PREPROCESSING ###
K = 5
PARTITIONS = 1
TOTAL = 286 bytes       0.00 MB
CLOCK = 0.000272 TIME = 0.000000
0.000272        0.000000
### PHASE 1 ###
CLOCK = 0.000125 TIME = 0.000000
0.000125        0.000000
### PHASE 2 ###
INDUCING:
alfa    TOTAL   INDUCED %:
ALL)    285     98      34.39
CLOCK = 0.004332 TIME = 0.000000
0.004332        0.000000
### TOTAL ###
CLOCK = 0.004495 TIME = 0.000000
0.004495        0.000000
milisecond per byte = 0.000000000
0.000000000
size = 285
malloc_count ### exiting, total: 1,158,870,124, peak: 1,158,641,041, current: 1,033

k=50

ubuntu@ip-172-31-32-99:~/egsa/egsa$ ./egsa  dataset/input-100.txt 50
SIGMA = 255
DIR = dataset/
INPUT = input-100.txt
K = 50
MEMLIMIT = 2048.00 MB
CHECK = 0
COMPUTE_BWT = 0
WORKSPACE = 13.n bytes
### PREPROCESSING ###
K = 50
PARTITIONS = 1
TOTAL = 2848 bytes      0.00 MB
CLOCK = 0.000360 TIME = 0.000000
0.000360        0.000000
### PHASE 1 ###
CLOCK = 0.000612 TIME = 0.000000
0.000612        0.000000
### PHASE 2 ###
INDUCING:
alfa    TOTAL   INDUCED %:
ALL)    2847    1403    49.28
CLOCK = 0.005419 TIME = 0.000000
0.005419        0.000000
### TOTAL ###
CLOCK = 0.006064 TIME = 0.000000
0.006064        0.000000
milisecond per byte = 0.000000000
0.000000000
size = 2847
malloc_count ### exiting, total: 1,159,007,790, peak: 1,158,692,569, current: 1,033

My problem is to find the k-LCS among n strings (with 2 <= k <= n). So the LCS value for k=5 should be >= the value for k=50.
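
The expectation above can be checked on a toy input. The sketch below assumes that "k-LCS" means the longest substring occurring in at least k of the n strings; under that reading the length can only shrink or stay the same as k grows. It is a brute-force illustration with a made-up function name, not egsa code. Note also that in both logs above, "size" is exactly TOTAL minus one (285 vs. 286 bytes, 2847 vs. 2848 bytes), which suggests that it tracks the amount of input processed rather than an LCS length.

```python
def k_lcs_length(strings, k):
    # Length of the longest substring that occurs in at least k of the strings.
    best = 0
    for s in strings:
        for i in range(len(s)):
            # Only candidates longer than the current best are interesting.
            for j in range(i + best + 1, len(s) + 1):
                sub = s[i:j]
                if sum(sub in t for t in strings) >= k:
                    best = j - i
                else:
                    # Any extension of sub occurs in no more strings than sub does.
                    break
    return best


strings = ["abcde", "xbcdy", "zzcdq", "cd"]
print([k_lcs_length(strings, k) for k in (1, 2, 3, 4)])  # prints: [5, 3, 2, 2]
```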

Add non-binary output format

I'd like to try this library, but I'm not familiar with C. An example of how to export the results to a text format would be very helpful, so that the output can easily be used from other tools.

How to read the output binary file?

Hi, what is the format of the output binary file, and how can I read it in C or C++?
Also, I am curious how you handle the read index in the generalized suffix array. When there are many reads, the read index may use a lot of memory.
