
condensa's Introduction

A Programming System for Neural Network Compression

Note: the original version of Condensa (contained in this branch) is no longer actively maintained. Please check out the lite branch for the most up-to-date version.

Condensa is a framework for programmable model compression in Python. It comes with a set of built-in compression operators which may be used to compose complex compression schemes targeting specific combinations of DNN architecture, hardware platform, and optimization objective. To recover any accuracy lost during compression, Condensa uses a constrained optimization formulation of model compression and employs an Augmented Lagrangian-based algorithm as the optimizer.
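To illustrate the operator-composition idea, here is a minimal pure-Python sketch, not Condensa's actual API; `prune_magnitude`, `quantize_step`, and `compose` are hypothetical names operating on plain lists rather than tensors:

```python
def prune_magnitude(weights, density=0.5):
    """Zero out all but the largest-magnitude `density` fraction of weights."""
    k = max(1, int(len(weights) * density))
    threshold = sorted((abs(w) for w in weights), reverse=True)[k - 1]
    return [w if abs(w) >= threshold else 0.0 for w in weights]

def quantize_step(weights, step=0.25):
    """Round each weight to the nearest multiple of `step` (toy quantizer)."""
    return [round(w / step) * step for w in weights]

def compose(*operators):
    """Chain compression operators into a single compression scheme."""
    def scheme(weights):
        for op in operators:
            weights = op(weights)
        return weights
    return scheme

# A memory-focused scheme: prune first, then quantize.
mem_scheme = compose(prune_magnitude, quantize_step)
print(mem_scheme([0.9, -0.1, 0.4, -0.8]))  # [1.0, 0.0, 0.0, -0.75]
```

In Condensa itself, operators act on PyTorch modules and the composed scheme is handed to the Augmented Lagrangian optimizer, which recovers accuracy under the compression constraint.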

Status: Condensa is under active development, and bug reports, pull requests, and other feedback are all highly appreciated. See the contributions section below for more details on how to contribute.

Supported Operators and Schemes

Condensa provides a set of pre-built compression schemes. Each scheme is built from one or more compression operators, which may be combined in various ways to define your own custom schemes.

Please refer to the documentation for a detailed description of available operators and schemes.

Prerequisites

Condensa requires:

  • A working Linux installation (we use Ubuntu 18.04)
  • NVIDIA drivers and CUDA 10+ for GPU support
  • Python 3.5 or newer
  • PyTorch 1.0 or newer

Installation

The most straightforward way of installing Condensa is via pip:

pip install condensa

Installation from Source

Retrieve the latest source code from the Condensa repository:

git clone https://github.com/NVlabs/condensa.git

Navigate to the source code directory and run the following:

pip install -e .

Test out the Installation

To check the installation, run the unit test suite:

bash run_all_tests.sh -v

Getting Started

The AlexNet Notebook contains a simple step-by-step walkthrough of compressing a pre-trained model using Condensa. Check out the examples folder for additional, more complex examples of using Condensa (note: some examples require the torchvision package to be installed).

Documentation

Documentation is available here. Please also check out the Condensa paper for a detailed description of Condensa's motivation, features, and performance results.

Contributing

We appreciate all contributions, including bug fixes, new features and documentation, and additional tutorials. You can initiate contributions via GitHub pull requests. When making code contributions, please follow the PEP 8 Python coding standard and provide unit tests for new features. Finally, make sure to sign off your commits using the -s flag or by adding Signed-off-by: Name <Email> to the commit message.

Citing Condensa

If you use Condensa for research, please consider citing the following paper:

@article{condensa2020,
  title={A Programmable Approach to Neural Network Compression}, 
  author={V. {Joseph} and G. L. {Gopalakrishnan} and S. {Muralidharan} and M. {Garland} and A. {Garg}},
  journal={IEEE Micro}, 
  year={2020},
  volume={40},
  number={5},
  pages={17-25},
  doi={10.1109/MM.2020.3012391}
}

Disclaimer

Condensa is a research prototype and not an official NVIDIA product. Many features are still experimental and yet to be properly documented.

condensa's People

Contributors

nitro-tuner, srvm


condensa's Issues

How to Measure Performance?

I tried running the compression from the AlexNet notebook example, which produced AlexNet_MEM.pth and AlexNet_FLOP.pth as output.

I then loaded each model and compared memory usage with the 'nvidia-smi' command. Unfortunately, I don't see any memory improvement for either model.

I also tried comparing throughput by timing inference for each model. Again, I see no improvement: all of the models took almost the same time to run inference.

Could you advise on how you measured performance?
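For latency comparisons, a minimal measurement sketch in pure Python is shown below; `infer` is a hypothetical stand-in for a model's forward pass, and real GPU measurements would additionally need torch.cuda.synchronize() before each clock read:

```python
import time

def benchmark(fn, warmup=5, iters=50):
    """Return the median wall-clock latency of fn() after a warmup phase."""
    for _ in range(warmup):
        fn()
    times = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn()
        times.append(time.perf_counter() - t0)
    return sorted(times)[len(times) // 2]

# Hypothetical stand-in for model inference.
def infer():
    return sum(i * i for i in range(10_000))

latency = benchmark(infer)
print(f"median latency: {latency * 1e3:.3f} ms")
```

Note also that magnitude pruning alone leaves tensors stored densely, so memory reported by nvidia-smi will not shrink unless the zeroed structures are physically removed or stored in a sparse format.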

Filter and neuron pruning don't work in scheme composer (lite branch)

Filter and neuron pruning do not have add_mask_to_module() support so there is a NotImplementedError raised here or here. I was thinking I could catch and handle this exception since the masks aren't a necessity, but raising the exception breaks out of the for loop iterating over the layers in the scheme composer and leaves the remaining layers dense.

I am pretty sure saving the masks in the layer module isn't a necessity, so raising an exception seems like overkill because it stops the model from being pruned. You could also handle the exception inside the scheme composer's for loop so that it doesn't break out and is able to prune all the layers.

I can submit a PR myself if you want to describe which route you want to take for fixing this. Obviously supporting mask saving for filter and neuron pruning would also work, but I'm not sure what work needs to be done for that.
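A sketch of the second suggestion, catching the exception inside the composer's loop so the remaining layers are still pruned; `apply_op`, the layer names, and `demo_op` are hypothetical:

```python
def apply_scheme(layers, apply_op):
    """Apply a compression operator to each layer, skipping layers whose
    mask-saving step is unimplemented instead of aborting the whole loop."""
    pruned, skipped = [], []
    for layer in layers:
        try:
            apply_op(layer)
            pruned.append(layer)
        except NotImplementedError:
            skipped.append(layer)  # mask not saved, but pruning continues
    return pruned, skipped

# Hypothetical operator that lacks mask support for one layer.
def demo_op(layer):
    if layer == "fc1":
        raise NotImplementedError("add_mask_to_module not supported")

print(apply_scheme(["conv1", "fc1", "conv2"], demo_op))
# (['conv1', 'conv2'], ['fc1'])
```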

Maybe a minor error in the code snippet for "Setting up the Optimizer"?

Hello,
In the "Setting up the Optimizer" part from the LeNet5 tutorial, there is a text description as follows.

In our case, we run the L-C algorithm for 40 iterations using the hyper-parameter values shown above. LC hyper-parameter values for a number of common convolutional neural networks are also included in the /workspace/condensa/examples folder in the container.

However, the 'steps' argument to condensa.opt.LC is set to 2 in the given code snippet. Shouldn't it be set to 40 instead?

Network Thinning Support

Thanks for your work! I checked the library but could not find the network thinning method mentioned in the paper "A Programmable Approach to Model Compression".

Replace view with reshape?

Hi, we were running Condensa on the CIFAR-10 and AlexNet examples. Our versions are:

Python version: 3.8.5
PyTorch version: 1.12.1+cu102

We get the following error at condensa/util.py line 112:

correct_k = correct[:k].view(-1).float().sum(0, keepdim=True)
RuntimeError: view size is not compatible with input tensor's size and stride (at least one \
dimension spans across two contiguous subspaces). Use .reshape(...) instead.

We tried just replacing view with reshape and things seem to work okay. But I don't know Pytorch enough to know if this is the correct thing to do.
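For context, .view() requires the tensor's memory layout to be compatible with the requested shape, while .reshape() falls back to copying the data when it isn't, so the replacement is safe here. A minimal reproduction, assuming PyTorch is installed:

```python
import torch

x = torch.arange(6).reshape(2, 3).t()  # transpose makes x non-contiguous
try:
    x.view(-1)             # fails: view needs a compatible memory layout
except RuntimeError as e:
    print("view failed:", e)

y = x.reshape(-1)          # reshape copies the data when a view is impossible
print(y.tolist())          # [0, 3, 1, 4, 2, 5]
```

An alternative with identical behavior on contiguous inputs is `.contiguous().view(-1)`, which makes the copy explicit.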

Thanks!
