mca

This repository contains a compression algorithm written by me (Michael Grigoryan). It can only compress and decompress text files; it is not guaranteed to work with other file types.

The algorithm works best with repetitive texts.

Explanation with an example

Suppose you have this repetitive text:

Nory was a Catholic because her mother was a Catholic,
and Nory’s mother was a Catholic because her father
was a Catholic, and her father was a Catholic because
his mother was a Catholic, or had been.

Taken from https://thejohnfox.com/2021/08/17-fantastic-repetition-examples-in-literature/

As you can see, the text is very repetitive. The compression algorithm loops through every line and every word in it.

If a word is not present in the shared index and is used multiple times throughout the text body, the algorithm will append that word to the shared index. If the word is only used once, then it is added directly to the compressed file, without being added to the shared index.
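The steps above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the repository's actual source: the function name `compress` is hypothetical, words are assumed to be split on whitespace, and "used multiple times" is assumed to mean that the word occurs more than once as a substring of the whole text (so "Nory" is also counted inside "Nory’s"). That last assumption is a guess, chosen because it is consistent with the example output in this section. The header is written as a JSON-style array without any escaping.

```rust
/// Hypothetical sketch of the compression step; not the repository's code.
fn compress(text: &str) -> String {
    let mut index: Vec<&str> = Vec::new(); // the shared index
    let mut body: Vec<String> = Vec::new(); // one entry per input line

    for line in text.lines() {
        let mut tokens: Vec<String> = Vec::new();
        for word in line.split_whitespace() {
            if let Some(pos) = index.iter().position(|w| *w == word) {
                // Already in the shared index: emit its position.
                tokens.push(pos.to_string());
            } else if text.matches(word).count() > 1 {
                // Assumption: repeats are counted as substring matches.
                index.push(word);
                tokens.push((index.len() - 1).to_string());
            } else {
                // Used only once: keep the word verbatim.
                tokens.push(word.to_string());
            }
        }
        body.push(tokens.join(" "));
    }

    // Header line: the shared index as a JSON-style array (no escaping here).
    let header = index
        .iter()
        .map(|w| format!("\"{}\"", w))
        .collect::<Vec<_>>()
        .join(",");
    format!("[{}]\n{}", header, body.join("\n"))
}
```

Calling `compress` on the passage above yields the same header and four body lines as the compressed.mca example in this section.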

After compression, the program produces a compressed.mca file with the following content:

["Nory","was","a","Catholic","because","her","mother","Catholic,","and","father","or"]
0 1 2 3 4 5 6 1 2 7
8 Nory’s 6 1 2 3 4 5 9
1 2 7 8 5 9 1 2 3 4
his 6 1 2 7 10 had been.

The first line of the output contains the shared index. If you attempt to decompress a file without this "header", the program will fail with a corruption error.
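Decompression reverses the mapping: read the header, then replace every numeric token with the word at that position in the shared index. The sketch below is an assumption about how this could look, not the program's actual code. The function name `decompress` and the error messages are hypothetical, and the header parser is deliberately naive (it assumes indexed words contain no `","` sequence and no escaped characters).

```rust
/// Hypothetical sketch of the decompression step; not the repository's code.
fn decompress(compressed: &str) -> Result<String, String> {
    let mut lines = compressed.lines();

    // The first line must be the shared index, e.g. ["a","b"].
    let header = lines.next().ok_or("corrupted file: empty input")?;
    let inner = header
        .strip_prefix('[')
        .and_then(|h| h.strip_suffix(']'))
        .ok_or("corrupted file: missing shared index header")?;

    // Naive parse: strip the outer quotes, then split on the `","` separators.
    let index: Vec<&str> = if inner.is_empty() {
        Vec::new()
    } else {
        inner
            .strip_prefix('"')
            .and_then(|s| s.strip_suffix('"'))
            .ok_or("corrupted file: malformed shared index header")?
            .split("\",\"")
            .collect()
    };

    let mut out: Vec<String> = Vec::new();
    for line in lines {
        let mut words: Vec<&str> = Vec::new();
        for token in line.split(' ') {
            match token.parse::<usize>() {
                // Numeric token: look the word up in the shared index.
                Ok(i) => words.push(*index.get(i).ok_or("corrupted file: index out of range")?),
                // Anything else was stored verbatim.
                Err(_) => words.push(token),
            }
        }
        out.push(words.join(" "));
    }
    Ok(out.join("\n"))
}
```

One limitation of this sketch: a word that is itself a number and was stored verbatim cannot be told apart from an index reference.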

Usage

You can start the program by running:

cargo run --release

after which you should get a prompt asking you to choose an action:

select an option: (c)ompress/(d)ecompress:

Both compression and decompression are supported. Compressing a file creates a file named compressed.mca, and decompressing one outputs a file named decompressed.txt.
