mass-image-processing

Simple, hassle-free image processing with OpenCV for large numbers of images, run concurrently in C++.

This program applies a list of filters to a group of images concurrently. For example, you can change red to green, then yellow to blue, and then apply a Gaussian blur to all images inside a directory. These operations are carried out as efficiently as possible: image groups are processed in parallel, and each filter on an individual image is also processed in parallel.

Built with modern C++20 features for concurrent and parallel programming, with threads managed through asynchronous execution, futures, and the TBB library.

No manual thread handling, no locks, no mutex, no semaphore.
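As a rough illustration of the futures-based style described above, the following minimal sketch launches one asynchronous task per image and collects the results without any locks. It is not the project's actual code: `Image` is a hypothetical stand-in for `cv::Mat`, and `invert` and `process_all` are illustrative names chosen for this example.

```cpp
#include <cstdint>
#include <future>
#include <vector>

// Hypothetical stand-in for cv::Mat: a flat buffer of 8-bit pixel values.
using Image = std::vector<std::uint8_t>;

// Hypothetical filter: invert every pixel value.
Image invert(Image img) {
    for (auto& px : img) px = static_cast<std::uint8_t>(255 - px);
    return img;
}

// Launch one asynchronous task per image and collect results in input order.
// Each task works on its own copy of the image, so no locks are needed.
std::vector<Image> process_all(const std::vector<Image>& inputs) {
    std::vector<std::future<Image>> tasks;
    tasks.reserve(inputs.size());
    for (const auto& img : inputs) {
        tasks.push_back(std::async(std::launch::async, invert, img));
    }

    std::vector<Image> outputs;
    outputs.reserve(tasks.size());
    for (auto& task : tasks) {
        outputs.push_back(task.get());  // blocks until that image is done
    }
    return outputs;
}
```

Because every task owns its data and results are handed back only through `std::future::get`, there is no shared mutable state to protect.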

Advantages

  • Parallel loading of images from directory
  • Parallel manipulation of images in OpenCV
  • Parallel manipulation of every single filter on individual images
  • Preserve order of filters when applying concurrently
  • Parallel saving of images to output directory
  • Ability to buffer IO
  • Asynchronous
  • Using futures
  • Using TBB
  • Concurrent and parallel
  • Sequential version for comparison
  • No lock, mutex or semaphore
  • No race conditions or deadlocks
  • Simple and understandable code
  • Documented code
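The README does not show how filter order is preserved while each filter runs in parallel. One possible approach, sketched here under that assumption, is to run the filters strictly in list order and put the parallelism inside each step by splitting the pixels into chunks. `Image`, `PixelFilter`, `apply_parallel`, and `apply_in_order` are hypothetical names for this example only.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <future>
#include <vector>

using Image = std::vector<std::uint8_t>;                        // hypothetical stand-in for cv::Mat
using PixelFilter = std::function<std::uint8_t(std::uint8_t)>;  // hypothetical per-pixel filter

// Apply one filter to the image, splitting the pixels into chunks that run concurrently.
// Chunks write to disjoint ranges, so no synchronization is required.
void apply_parallel(Image& img, const PixelFilter& filter, std::size_t chunks = 4) {
    chunks = std::max<std::size_t>(1, chunks);
    const std::size_t step = (img.size() + chunks - 1) / chunks;
    std::vector<std::future<void>> tasks;
    for (std::size_t begin = 0; begin < img.size(); begin += step) {
        const std::size_t end = std::min(begin + step, img.size());
        tasks.push_back(std::async(std::launch::async, [&img, &filter, begin, end] {
            for (std::size_t i = begin; i < end; ++i) img[i] = filter(img[i]);
        }));
    }
    for (auto& t : tasks) t.get();  // wait for every chunk before the next filter starts
}

// Filters run strictly in list order; the parallelism lives inside each step.
void apply_in_order(Image& img, const std::vector<PixelFilter>& filters) {
    for (const auto& f : filters) apply_parallel(img, f);
}
```

Waiting on all chunk futures before moving to the next filter acts as a barrier, which is what keeps "add, then multiply" from being observed as "multiply, then add".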

For comparison, sequential processing of the images takes a long time:

mass-image-processing-sequential.mp4

But when using parallel execution everything is blazing fast:

mass-image-processing-parallel.mp4

Quickstart

This project depends on the OpenCV and TBB libraries, so both must be installed and discoverable on your system in order to compile the project.

After cloning the repo, simply run:

cmake .
make

You can then find the mass_image_processing binary in your current directory.

Usage

A minimal usage would look like this:

mass_image_processing ./imgs-in ./imgs-out

Here imgs-in is a directory containing as many JPEG images as you want, and imgs-out is an empty directory where the manipulated images will be written.

There are also two optional arguments that you can pass to the binary. The first is the concurrency type and the second is the IO type.

Concurrency type:

  • -p (default) for parallel processing
  • -s for sequential processing

IO type:

  • -nb (default) to not use buffered IO
  • -b to use buffered IO

Parameter order matters:

Usage: mass_image_processing <input_dir> <output_dir> [-p,-s] [-nb,-b]
    -p: parallel processing (default)
    -s: sequential processing
    -nb: non-buffered I/O (default)
    -b: buffered I/O
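To make the positional rule concrete, here is a minimal sketch of a parser for the usage line above. It assumes the concurrency flag, when present, must come third and the IO flag fourth; the README only says that order matters, so that detail is an assumption. `Options` and `parse_args` are hypothetical names, not the project's own.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical option struct; field names are illustrative only.
struct Options {
    std::string input_dir;
    std::string output_dir;
    bool parallel = true;   // -p (default) / -s
    bool buffered = false;  // -nb (default) / -b
};

// Parse arguments positionally: <input_dir> <output_dir> [-p|-s] [-nb|-b].
// `args` excludes the program name. Returns std::nullopt on invalid input.
std::optional<Options> parse_args(const std::vector<std::string>& args) {
    if (args.size() < 2 || args.size() > 4) return std::nullopt;

    Options opts{args[0], args[1]};
    if (args.size() >= 3) {
        if (args[2] == "-p") opts.parallel = true;
        else if (args[2] == "-s") opts.parallel = false;
        else return std::nullopt;
    }
    if (args.size() == 4) {
        if (args[3] == "-nb") opts.buffered = false;
        else if (args[3] == "-b") opts.buffered = true;
        else return std::nullopt;
    }
    return opts;
}
```

Under this sketch, swapping the two flags (for example `-b -s`) is rejected rather than silently reinterpreted.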

Run parallel processing with non-buffered IO (default):

mass_image_processing ./imgs-in ./imgs-out

or

mass_image_processing ./imgs-in ./imgs-out -p -nb

Run parallel processing with buffered IO:

mass_image_processing ./imgs-in ./imgs-out -p -b

Run sequential processing with non-buffered IO:

mass_image_processing ./imgs-in ./imgs-out -s -nb

Run sequential processing with buffered IO:

mass_image_processing ./imgs-in ./imgs-out -s -b

Comparison

How you choose to process the images has a major effect on how your system resources are utilized and how long the job takes to run.

We compare all 4 combinations below with different numbers of images.

All images are different from each other and are high-quality 4K images, each around 10MB in size.

These benchmarks were run on an Arch Linux PC with 20 cores and 64GB of RAM.

10 Images

Results with 10 images:

    Mode                      Time
    Parallel non-buffered     1.38 secs
    Parallel buffered         1.46 secs
    Sequential non-buffered   10.38 secs
    Sequential buffered       10.76 secs

100 Images

Results with 100 images:

    Mode                      Time
    Parallel non-buffered     20.39 secs
    Parallel buffered         20.28 secs
    Sequential non-buffered   170.40 secs
    Sequential buffered       170.22 secs

1000 Images

Results with 1000 images:

    Mode                      Time
    Parallel non-buffered     131.16 secs
    Parallel buffered         129.61 secs
    Sequential non-buffered   16.99 mins (about 1019 secs)
    Sequential buffered       16.93 mins (about 1016 secs)

Conclusion

Runtime Comparison

Comparing the runtime results, the parallel version is roughly 7.4 to 8.4 times faster than the sequential version across the three batch sizes, or about 8 times faster overall.
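The speedup figure can be checked directly from the benchmark numbers. The short sketch below computes sequential time divided by parallel time for the non-buffered runs; the `Row` struct and the names `kNonBuffered` and `speedup` are introduced here purely for illustration.

```cpp
#include <array>

// One row per benchmark: wall-clock seconds taken from the tables above.
struct Row {
    int images;
    double parallel_secs;
    double sequential_secs;
};

// Non-buffered runs; the 1000-image sequential time is 16.99 min = 1019.4 s.
constexpr std::array<Row, 3> kNonBuffered{{
    {10, 1.38, 10.38},
    {100, 20.39, 170.40},
    {1000, 131.16, 16.99 * 60.0},
}};

// Speedup = sequential time / parallel time.
constexpr double speedup(const Row& r) {
    return r.sequential_secs / r.parallel_secs;
}
```

Evaluating `speedup` on the three rows gives approximately 7.5, 8.4, and 7.8, which is well below the 20 available cores and suggests that part of the workload (for example disk IO) does not scale with core count.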

Resource Usage

The sequential version used a fixed amount of CPU and RAM because every image was roughly the same size and there was no parallel execution. It used less than one percent of the available CPU and RAM, so most of the system's resources sat idle. The parallel version, on the other hand, used as much CPU and RAM as it could acquire without degrading the performance of other parts of the system. This is because the C++ asynchronous facilities and TBB manage the threads and spread the work across the available cores.

Buffered IO

Enabling or disabling buffered IO has little effect on performance overall, but its benefit grows with the number of images: with 10 images the buffered runs were slightly slower, while with 1000 images they were about 1% faster. So for a small number of images (e.g. fewer than 1000) the non-buffered version is the better choice, while for larger batches buffered IO is preferable.
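The README does not define what buffered IO means internally. One plausible reading, offered here only as an assumption, is that the encoded file is first read into a memory buffer and decoded afterwards (for example with OpenCV's `cv::imdecode`), instead of decoding straight from disk with `cv::imread`. The standard-library sketch below shows only the buffering step; `read_to_buffer` is a hypothetical helper name.

```cpp
#include <filesystem>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Read an entire file into a memory buffer in one pass.
// Assumption: "buffered IO" is interpreted here as loading the encoded bytes
// first, so that decoding can happen later without touching the disk.
std::vector<char> read_to_buffer(const std::filesystem::path& file) {
    std::ifstream in(file, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + file.string());
    return std::vector<char>(std::istreambuf_iterator<char>{in},
                             std::istreambuf_iterator<char>{});
}
```

Separating the read from the decode lets disk access and CPU-bound decoding overlap across tasks, which may explain why any benefit only shows up for larger batches.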
