๐Ÿ›ค๏ธ Parallel Zip (pzip)

About

Parallel Zip (pzip) is a multi-threaded program that compresses the input files listed in its command line arguments using Run Length Encoding (RLE). It uses locks and semaphores so that multiple threads can safely access a shared unbounded buffer. Additional semaphores order the output so that results print in the same order as the input list.
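As a rough illustration of the encoding step, the sketch below run-length encodes a byte buffer into (count, byte) pairs. The `rle_run` struct and `rle_encode` function are hypothetical names, and the in-memory pair layout is an assumption made for this example rather than the project's actual output format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One run: `count` consecutive copies of `byte`. Hypothetical layout. */
typedef struct {
    uint32_t count;
    unsigned char byte;
} rle_run;

/*
 * Run-length encode `len` bytes from `in`.
 * Returns a malloc'd array of runs (caller frees) and stores the number
 * of runs in *n_runs. Returns NULL for empty input or allocation failure.
 */
rle_run *rle_encode(const unsigned char *in, size_t len, size_t *n_runs)
{
    assert(n_runs != NULL);
    *n_runs = 0;
    if (in == NULL || len == 0)
        return NULL;

    /* Worst case: every byte differs from its neighbor, giving len runs. */
    rle_run *runs = malloc(len * sizeof *runs);
    if (runs == NULL)
        return NULL;

    size_t n = 0;                 /* index of the run being extended */
    runs[0].byte = in[0];
    runs[0].count = 1;
    for (size_t i = 1; i < len; i++) {
        if (in[i] == runs[n].byte && runs[n].count < UINT32_MAX) {
            runs[n].count++;      /* same byte: extend the current run */
        } else {
            n++;                  /* different byte: start a new run */
            runs[n].byte = in[i];
            runs[n].count = 1;
        }
    }
    *n_runs = n + 1;
    return runs;
}
```

For the input `aaabbc`, this produces three runs: 3 × `a`, 2 × `b`, and 1 × `c`.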

More information can be found here.

Team members and contribution

Design Considerations

Parallelizing the compression

We used multiple threads to compress the files, which lets the compression algorithm run in parallel. Each thread also keeps its compressed data in memory until it is ready to print, which reduces the time spent in the critical section guarded by the ordering semaphores.
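One way to picture this is a chain of semaphores, one per input file: each thread finishes its work into a private buffer first and only then waits for its turn to emit. The sketch below is an assumption about how such ordering could look, not the project's code. `N_FILES`, `turn`, `output`, and `run_ordered` are hypothetical names, and a string buffer stands in for standard output.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define N_FILES 4

static sem_t turn[N_FILES];        /* turn[i] is posted when file i may emit */
static char output[N_FILES * 16];  /* stands in for stdout in this sketch */

static void *worker(void *arg)
{
    int idx = (int)(intptr_t)arg;

    /* 1. Do the expensive work into private memory, outside the
     *    ordered section. A formatted tag stands in for compression. */
    char result[16];
    snprintf(result, sizeof result, "[file%d]", idx);

    /* 2. Wait for this file's turn, emit, then release the next file. */
    sem_wait(&turn[idx]);
    strcat(output, result);
    if (idx + 1 < N_FILES)
        sem_post(&turn[idx + 1]);
    return NULL;
}

/* Runs one thread per file, started in reverse order to stress ordering. */
const char *run_ordered(void)
{
    pthread_t tid[N_FILES];
    output[0] = '\0';

    for (int i = 0; i < N_FILES; i++)
        sem_init(&turn[i], 0, i == 0 ? 1 : 0);  /* only file 0 starts open */

    for (int i = N_FILES - 1; i >= 0; i--) {
        int rc = pthread_create(&tid[i], NULL, worker, (void *)(intptr_t)i);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < N_FILES; i++)
        pthread_join(tid[i], NULL);
    for (int i = 0; i < N_FILES; i++)
        sem_destroy(&turn[i]);
    return output;
}
```

Because only the short append happens between `sem_wait` and `sem_post`, a slow file delays the files after it only by its compression time, not by compression plus printing.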

Determine the number of threads to create

Using get_nprocs(), we determine the number of processors available on the system and use it as the maximum thread count. If the system does not have multiple cores, the limit defaults to 5. The program does not create more threads than needed (except for the 5 default threads).
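The policy above might be sketched as follows. `choose_thread_count` and `thread_count_for` are hypothetical helpers, and treating "needed" as the number of input files is an assumption based on the one-thread-per-file design; `get_nprocs()` itself is a glibc extension declared in `<sys/sysinfo.h>`.

```c
#include <assert.h>
#include <sys/sysinfo.h>   /* get_nprocs(): glibc extension on Linux */

#define DEFAULT_THREADS 5  /* fallback count named in the text */

/* Pure helper so the policy can be checked without querying the system. */
int choose_thread_count(int nprocs, int n_files)
{
    assert(n_files >= 0);
    if (nprocs <= 1)
        return DEFAULT_THREADS;   /* no multiple cores: use the default */
    /* Assumption: "needed" means one thread per input file. */
    return n_files < nprocs ? n_files : nprocs;
}

/* Thread count for this machine, given the number of input files. */
int thread_count_for(int n_files)
{
    return choose_thread_count(get_nprocs(), n_files);
}
```

With 8 processors and 3 files this yields 3 threads; with 4 processors and 10 files it yields 4.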

Efficiency of each thread

By memory mapping input files, using a thread pool, and holding each file's compressed data in memory until it is that file's turn to print, we can efficiently perform each piece of work in parallel.

Access the input files efficiently

We access the input files through memory mapping, which makes access to their contents easier and quicker. The mapping happens in the worker threads, so input files are read and processed concurrently.
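A minimal sketch of mapping one input file read-only is shown below. The `mapped_file` struct and the `map_file` and `unmap_file` helpers are hypothetical names; only `open`, `fstat`, `mmap`, `munmap`, and `close` are real POSIX calls. A worker thread could call `map_file` on its own file so that mappings proceed concurrently.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

typedef struct {
    const unsigned char *data;  /* NULL when the file is empty */
    size_t len;
} mapped_file;

/* Map `path` read-only. Returns 0 on success, -1 on a system call error. */
int map_file(const char *path, mapped_file *out)
{
    assert(path != NULL && out != NULL);
    out->data = NULL;
    out->len = 0;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    if (st.st_size > 0) {       /* mmap rejects zero-length mappings */
        void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) {
            close(fd);
            return -1;
        }
        out->data = p;
        out->len = (size_t)st.st_size;
    }
    close(fd);                  /* the mapping outlives the descriptor */
    return 0;
}

/* Release a mapping created by map_file. Safe to call on an empty file. */
void unmap_file(mapped_file *f)
{
    if (f->data != NULL)
        munmap((void *)f->data, f->len);
    f->data = NULL;
    f->len = 0;
}
```

Once mapped, the file contents can be scanned as an ordinary byte array, so the compression loop needs no explicit read calls.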

Coordinating multiple threads

We used a lock to protect shared data (the job queue), a semaphore to keep worker threads from running while the queue is empty, and multiple semaphores to order the printed output.
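The lock and the "queue not empty" semaphore could fit together roughly as below. This is a sketch under assumptions: the `job` and `job_queue` types and the `queue_init`, `queue_push`, and `queue_pop` functions are hypothetical, and a linked list is used here only because it grows without a fixed bound.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdlib.h>

typedef struct job {
    int file_index;         /* which input file this job covers */
    struct job *next;
} job;

typedef struct {
    job *head, *tail;
    pthread_mutex_t lock;   /* protects head and tail */
    sem_t available;        /* counts queued jobs; workers sleep at zero */
} job_queue;

void queue_init(job_queue *q)
{
    q->head = q->tail = NULL;
    pthread_mutex_init(&q->lock, NULL);
    sem_init(&q->available, 0, 0);
}

/* Append a job, then wake one waiting worker. Returns -1 if out of memory. */
int queue_push(job_queue *q, int file_index)
{
    job *j = malloc(sizeof *j);
    if (j == NULL)
        return -1;
    j->file_index = file_index;
    j->next = NULL;

    pthread_mutex_lock(&q->lock);
    if (q->tail != NULL)
        q->tail->next = j;
    else
        q->head = j;
    q->tail = j;
    pthread_mutex_unlock(&q->lock);

    sem_post(&q->available);
    return 0;
}

/* Block until a job exists, then remove it and return its file index. */
int queue_pop(job_queue *q)
{
    while (sem_wait(&q->available) != 0)
        continue;           /* retry if the wait was interrupted */

    pthread_mutex_lock(&q->lock);
    job *j = q->head;
    assert(j != NULL);      /* the semaphore guarantees a queued job */
    q->head = j->next;
    if (q->head == NULL)
        q->tail = NULL;
    pthread_mutex_unlock(&q->lock);

    int idx = j->file_index;
    free(j);
    return idx;
}
```

The semaphore count always equals the number of queued jobs, so a worker that returns from `sem_wait` is guaranteed to find one after taking the lock.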

Terminating threads in the thread pool

We created a kill boolean in the job struct (this struct is added to the job queue). Whenever a worker thread receives a new job, it checks the kill boolean. If kill is true, the thread stops taking jobs and exits cleanly.
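The shutdown scheme can be sketched as follows, assuming one kill job is queued per worker after all real jobs. The names `run_pool`, `push`, `N_WORKERS`, and `MAX_JOBS` are hypothetical, and a fixed-size array replaces the unbounded queue purely to keep the example short.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>

#define MAX_JOBS 64
#define N_WORKERS 3

typedef struct {
    bool kill;        /* true: the worker should exit instead of working */
    int file_index;   /* ignored when kill is set */
} job;

/* Fixed-size FIFO for brevity; the text describes an unbounded queue. */
static job jobs[MAX_JOBS];
static int head, tail;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static sem_t available;
static int processed;   /* protected by lock */

static void push(job j)
{
    pthread_mutex_lock(&lock);
    assert(tail < MAX_JOBS);
    jobs[tail++] = j;
    pthread_mutex_unlock(&lock);
    sem_post(&available);
}

static void *worker(void *arg)
{
    (void)arg;
    for (;;) {
        while (sem_wait(&available) != 0)
            continue;             /* retry if the wait was interrupted */
        pthread_mutex_lock(&lock);
        job j = jobs[head++];
        pthread_mutex_unlock(&lock);

        if (j.kill)
            return NULL;          /* kill job: leave the loop and exit */

        pthread_mutex_lock(&lock);
        processed++;              /* stand-in for compressing j.file_index */
        pthread_mutex_unlock(&lock);
    }
}

/* Queue n_files jobs, then one kill job per worker; wait for all to exit. */
int run_pool(int n_files)
{
    pthread_t tid[N_WORKERS];
    assert(n_files >= 0 && n_files + N_WORKERS <= MAX_JOBS);
    head = tail = processed = 0;
    sem_init(&available, 0, 0);

    for (int i = 0; i < N_WORKERS; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < n_files; i++)
        push((job){ .kill = false, .file_index = i });
    for (int i = 0; i < N_WORKERS; i++)
        push((job){ .kill = true });
    for (int i = 0; i < N_WORKERS; i++)
        pthread_join(tid[i], NULL);

    sem_destroy(&available);
    return processed;
}
```

Because the queue is first-in, first-out, every real job is dequeued before any kill job, and each worker consumes exactly one kill job before exiting.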

Strengths and Weaknesses

Strengths:

  • Parallelizes the compression algorithm
  • Saves compressed data to memory before printing
    • Prevents computation and printing bottleneck
  • Faster than wzip
  • Handles potential system call errors

Weaknesses:

  • Only one thread per file
  • Uses Run Length Encoding (RLE)

Contributors

castelitoo, garyhtou, hankrud
