
AutoCompressor

@author Oleg Borisovich Bobrov, group M4150

This repository contains the source code for a neural image compressor. The model is built on an autoencoder architecture: the encoder is a pretrained ResNet18, and the decoder is a mirrored version of the encoder with upsampling layers.

The latent vector produced by the encoder is quantized (hard mode with B = 2, soft mode with B = 10) and then compressed with adaptive arithmetic coding. The arithmetic encoder takes the quantized vector, whose values lie in the range [0; 2^B], as input and outputs a binary sequence. Encoding is performed with the arithmetic-compressor Python package; a SimpleAdaptiveModel is used to update the symbol probabilities, gradually forgetting old statistics via an exponential moving average.

The embedding size is 1 x 32768, which is only 0.125 of the initial number of pixels (512 x 512). Smaller embeddings led to visually worse results, while larger ones would strongly increase the final encoded size. Quantization during inference is performed with this formula:

*(equation image: inference-time quantization formula)*
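The formula itself is an image in the original README. As an illustration only, uniform B-bit quantization of this kind, assuming the latent values are first mapped into [0, 1] (an assumption, not confirmed by the source), can be sketched as:

```python
def quantize(z, B):
    """Uniformly quantize values in [0, 1] to integers in [0, 2^B]."""
    levels = 2 ** B
    return [round(v * levels) for v in z]

def dequantize(q, B):
    """Map quantized integers back to real values in [0, 1]."""
    levels = 2 ** B
    return [v / levels for v in q]
```

With B = 2 each latent entry becomes one of a handful of symbols, which is what makes the subsequent arithmetic coding effective.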

The problem is that this quantization is not differentiable, so it cannot be used during training. Instead, during the training phase we add noise whose amplitude is commensurate with the quantization step:

*(equation image: training-time noise approximation)*
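This formula is also an image in the original README. A common relaxation of this kind, additive uniform noise matched to the quantization step (a sketch of the general technique, not necessarily the author's exact formula), looks like:

```python
import random

def noisy_quantize(z, B):
    """Training-time stand-in for hard quantization: add uniform noise
    whose amplitude matches the quantization step, so the operation stays
    differentiable in an autograd framework."""
    step = 1.0 / (2 ** B)  # spacing between quantization levels in [0, 1]
    return [v + random.uniform(-step / 2, step / 2) for v in z]
```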

Below are charts comparing PSNR vs. BPP for JPG (low and high quality), AE (B = 2), and AE (B = 10) on three test images.

*(charts: PSNR vs. BPP comparison on the three test images)*

As the charts show, at nearly identical BPP JPG slightly outperforms the neural network in both the low- and high-quality settings. Better quality might be obtained with a more complex model or more training epochs.
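For reference, the two metrics plotted above can be computed as follows (a minimal sketch; the actual evaluation code in this repository may differ):

```python
import math

def psnr(original, reconstructed, max_val=255.0):
    """Peak signal-to-noise ratio between two equal-length pixel sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return float("inf")
    return 10 * math.log10(max_val ** 2 / mse)

def bpp(num_encoded_bits, width, height):
    """Bits per pixel of an encoded image."""
    return num_encoded_bits / (width * height)
```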

Below you can see usage examples for encoding and decoding. The three test images were encoded with these commands; you can run them all with test_all.sh.

Training

The autoencoders were trained on the 130k Images (512x512) - Universal Image Embeddings dataset. MSE loss was used in both cases. You can find the training notebook here.
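The MSE objective mentioned above is simply the mean of squared pixel differences between the reconstruction and the original; a minimal sketch:

```python
def mse_loss(prediction, target):
    """Mean squared error between a reconstruction and the original image,
    both given as flat sequences of pixel values."""
    return sum((p - t) ** 2 for p, t in zip(prediction, target)) / len(target)
```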

Weights

You can download trained weights from Google Drive

Encoding

Soft (B = 10)

python3 encode.py --input_path=resources/test_images/peppers.png --output_path=resources/encoded/B=10/peppers.encoded --encoder_path=models/model_B=10/encoder.model --quantize_mode=soft

python3 encode.py --input_path=resources/test_images/lena.png --output_path=resources/encoded/B=10/lena.encoded --encoder_path=models/model_B=10/encoder.model --quantize_mode=soft

python3 encode.py --input_path=resources/test_images/baboon.png --output_path=resources/encoded/B=10/baboon.encoded --encoder_path=models/model_B=10/encoder.model --quantize_mode=soft

Hard (B = 2)

python3 encode.py --input_path=resources/test_images/peppers.png --output_path=resources/encoded/B=2/peppers.encoded --encoder_path=models/model_B=2/encoder.model --quantize_mode=hard

python3 encode.py --input_path=resources/test_images/lena.png --output_path=resources/encoded/B=2/lena.encoded --encoder_path=models/model_B=2/encoder.model --quantize_mode=hard

python3 encode.py --input_path=resources/test_images/baboon.png --output_path=resources/encoded/B=2/baboon.encoded --encoder_path=models/model_B=2/encoder.model --quantize_mode=hard

Decoding

Soft (B = 10)

python3 decode.py --output_path=resources/results/B=10/peppers_reconstructed.png --input_path=resources/encoded/B=10/peppers.encoded --decoder_path=models/model_B=10/decoder.model --quantize_mode=soft

python3 decode.py --output_path=resources/results/B=10/lena_reconstructed.png --input_path=resources/encoded/B=10/lena.encoded --decoder_path=models/model_B=10/decoder.model --quantize_mode=soft

python3 decode.py --output_path=resources/results/B=10/baboon_reconstructed.png --input_path=resources/encoded/B=10/baboon.encoded --decoder_path=models/model_B=10/decoder.model --quantize_mode=soft

Hard (B = 2)

python3 decode.py --output_path=resources/results/B=2/peppers_reconstructed.png --input_path=resources/encoded/B=2/peppers.encoded --decoder_path=models/model_B=2/decoder.model --quantize_mode=hard

python3 decode.py --output_path=resources/results/B=2/lena_reconstructed.png --input_path=resources/encoded/B=2/lena.encoded --decoder_path=models/model_B=2/decoder.model --quantize_mode=hard

python3 decode.py --output_path=resources/results/B=2/baboon_reconstructed.png --input_path=resources/encoded/B=2/baboon.encoded --decoder_path=models/model_B=2/decoder.model --quantize_mode=hard
