ACOMA is a DNA storage codec that provides high density and can adapt to different requirements for DNA synthesis and sequencing.
ACOMA depends on https://github.com/klauspost/reedsolomon
Please install it using
go get -u github.com/klauspost/reedsolomon
Lookup tables significantly speed up ACOMA. You can generate them with the tblgen tool (see below), or download them from GitHub (1.7 GB file):
https://github.com/lanl/acoma/releases/download/0.9/tables.zip
Unpack the zip into the tbl directory, where the tools and the unit tests expect to find the lookup tables.
To get ACOMA, clone this repository and build the packages and commands that you are interested in.
The specification of the codec is in the slides in the doc directory. More documentation on the implementation can be found in the source code.
Contains the basic abstraction of an oligo that is used by the rest of the packages.
An implementation of the basic oligo interface that stores an oligo in a 64-bit integer, and therefore can handle short oligos (up to 32 nts).
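The 32 nt limit follows from using 2 bits per nucleotide in a 64-bit word. The sketch below illustrates that idea only; the names `packOligo` and `unpackOligo` and the A=0, C=1, G=2, T=3 mapping are assumptions made for this example and are not taken from the ACOMA source.

```go
package main

import (
	"errors"
	"fmt"
)

// ntCode maps a nucleotide letter to a 2-bit code.
// The A, C, G, T ordering is an assumption of this sketch.
func ntCode(c byte) (uint64, bool) {
	switch c {
	case 'A':
		return 0, true
	case 'C':
		return 1, true
	case 'G':
		return 2, true
	case 'T':
		return 3, true
	}
	return 0, false
}

// packOligo (hypothetical helper) packs up to 32 nts into a uint64,
// 2 bits per nt, with the first nt in the most significant used bits.
func packOligo(seq string) (uint64, error) {
	if len(seq) > 32 {
		return 0, errors.New("oligo longer than 32 nts does not fit in 64 bits")
	}
	var v uint64
	for i := 0; i < len(seq); i++ {
		code, ok := ntCode(seq[i])
		if !ok {
			return 0, fmt.Errorf("invalid nucleotide %q at position %d", seq[i], i)
		}
		v = v<<2 | code
	}
	return v, nil
}

// unpackOligo reverses packOligo. The length n must be stored separately,
// because leading 'A' nucleotides are encoded as zero bits.
func unpackOligo(v uint64, n int) string {
	const letters = "ACGT"
	buf := make([]byte, n)
	for i := n - 1; i >= 0; i-- {
		buf[i] = letters[v&3]
		v >>= 2
	}
	return string(buf)
}

func main() {
	v, err := packOligo("ACGTTGCA")
	if err != nil {
		panic(err)
	}
	fmt.Println(unpackOligo(v, 8))
}
```

Because the packed value alone does not record the oligo length, a representation like this has to keep the length alongside the integer.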
An implementation of the basic oligo interface that can store an arbitrarily long oligo. It uses one byte per nt.
Abstract interface for oligo viability criteria. It is used by the Level 0 codec (l0) to check whether an oligo can be synthesized/sequenced. The package implements a single criterion, H4G2, which rejects oligos with homopolymers longer than 4 nts (for A, T, and C) or longer than 2 nts (for G).
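As an illustration of the H4G2 rule described above, here is a minimal sketch of a homopolymer check over a plain string. The function names `maxRun` and `viableH4G2` are hypothetical and do not reflect the actual interface in the ACOMA packages.

```go
package main

import "fmt"

// maxRun returns the longest homopolymer run allowed for a nucleotide
// under the H4G2 rule: 2 for G, 4 for A, T, and C.
func maxRun(nt byte) int {
	if nt == 'G' {
		return 2
	}
	return 4
}

// viableH4G2 (hypothetical helper) reports whether seq contains no
// homopolymer run longer than the H4G2 limits.
func viableH4G2(seq string) bool {
	run := 0
	for i := 0; i < len(seq); i++ {
		if i > 0 && seq[i] == seq[i-1] {
			run++ // same nucleotide as the previous one: extend the run
		} else {
			run = 1 // a new run starts here
		}
		if run > maxRun(seq[i]) {
			return false
		}
	}
	return true
}

func main() {
	for _, s := range []string{"AAAACGGT", "AAAAACGT", "ACGGGT"} {
		fmt.Printf("%s viable: %v\n", s, viableH4G2(s))
	}
}
```

In this sketch, "AAAACGGT" passes (runs of 4 A's and 2 G's are within the limits), while "AAAAACGT" (5 A's) and "ACGGGT" (3 G's) are rejected.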
Level 0 of the ACOMA codec (bit packing). Theoretically, it can pack any value up to 64 bits. In practice, packing large values is prohibitively slow, and lookup tables are required even for 17-bit values to achieve reasonable performance.
Level 1 of the ACOMA codec. Packs an address and array of bytes into a single oligo.
Level 2 of the ACOMA codec. Packs an arbitrary array of bytes into a collection of oligos. Provides erasure code oligos for recovery of the data in case of errors.
The tools in the repository use the packages to provide some convenient commands.
Generates encoding and decoding lookup tables that speed up Level 0 encoding and decoding.
For example, generating an encoding lookup table for 17 nts oligos that has 2^13 entries can be done by:
./tblgen -e encnt17b13.tbl -l 17 -b 13
Generating a decoding lookup table for 17 nts oligos that has 2^14 entries can be done by:
./tblgen -d decnt17b7.tbl -l 17 -b 7
Although the code is parallelized and uses all available cores, it can take a few hours to generate a table.
Encodes the specified file and outputs a list of oligos that represent it.
Decodes the specified list of oligos into a file. If not all data can be recovered, the output file might have holes.
The packages have some limited unit tests that can be run with the standard command:
go test
The unit tests will slowly be extended to cover all use cases.
There are multiple TODO and FIXME comments in the source code that describe missing features or implementation restrictions that should eventually be fixed.