
CplxModule

A lightweight extension for torch.nn that adds layers and activations which respect the arithmetic of the field of complex numbers, and that implements real- and complex-valued Variational Dropout methods for weight sparsification. The complex-valued building blocks and the Variational Dropout layers of both kinds can be seamlessly integrated into pytorch-based training pipelines. The package provides the toolset necessary to train, sparsify and fine-tune both real- and complex-valued models.
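A minimal sketch of the intended workflow (assuming only the Cplx container and the CplxLinear layer, both of which appear in the examples further below; shapes are illustrative):

import torch
from cplxmodule import Cplx
from cplxmodule.nn import CplxLinear

# a batch of eight 64-dimensional complex vectors, kept as a re-im pair
z = Cplx(torch.randn(8, 64), torch.randn(8, 64))

# a complex-valued affine layer, used like any torch.nn module
fc = CplxLinear(64, 32)
out = fc(z)
print(out.real.shape, out.imag.shape)  # torch.Size([8, 32]) twice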

Documentation

For a high-level description of the implementation, functionality and useful code patterns, please refer to the README files of the individual subpackages.

Implementation

The core implementation of the complex-valued arithmetic and layers is based on careful tracking of transformations of real and imaginary parts of complex-valued tensors, and leverages differentiable computations of the real-valued pytorch backend.
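For example, the complex product is computed by tracking its real and imaginary parts separately, so only real-valued differentiable torch operations are ever executed (a sketch of the idea, not the package's actual code):

import torch

def cplx_mul(re1, im1, re2, im2):
    """(a + ib)(c + id) = (ac - bd) + i(ad + bc), via real-valued ops only."""
    return re1 * re2 - im1 * im2, re1 * im2 + im1 * re2

re, im = cplx_mul(torch.randn(3), torch.randn(3), torch.randn(3), torch.randn(3))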

The batch normalization and weight initialization layers are based on the ICLR 2018 paper on Deep Complex Networks by Chiheb Trabelsi et al. (2018) [1] and borrow ideas from their implementation (nn.init, nn.modules.batchnorm). The complex-valued magnitude-based max pooling follows the idea of Zhang et al. (2017) [6].
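The gist of magnitude-based pooling is to pick, within each window, the element with the largest modulus and carry its real and imaginary parts over together. A minimal sketch under that assumption (not the library's exact implementation):

import torch
import torch.nn.functional as F

def cplx_max_pool1d(re, im, kernel_size):
    """Pool an (N, C, L) re-im pair by the largest modulus per window."""
    modulus = torch.sqrt(re * re + im * im)
    _, idx = F.max_pool1d(modulus, kernel_size, return_indices=True)
    # gather the re/im entries at the positions of the winning moduli
    return re.gather(2, idx), im.gather(2, idx)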

The implementations of the real-valued Variational Dropout and Automatic Relevance Determination are based on the profound works by Diederik Kingma et al. (2015) [2], Dmitry Molchanov et al. (2017) [3], and Valery Kharitonov et al. (2018) [4].
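The common core of these methods is multiplicative Gaussian noise on the weights, w = theta * (1 + sqrt(alpha) * eps) with eps ~ N(0, 1), where log(alpha) is learned per weight and weights with large alpha are effectively pruned. An illustrative fragment (not the package's layers):

import torch

theta = torch.randn(128, 64, requires_grad=True)             # weight means
log_alpha = torch.full((128, 64), -3.0, requires_grad=True)  # per-weight noise level

eps = torch.randn_like(theta)
w = theta * (1 + torch.exp(0.5 * log_alpha) * eps)  # noisy weights for one forward pass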

Complex-valued Bayesian sparsification layers are based on the research by Nazarov and Burnaev (2020) [5].

Installation

The essential dependencies of cplxmodule are numpy, torch and scipy, which can be installed via

# essential dependencies
# conda update -n base -c defaults conda
conda create -n cplxmodule "python>=3.7" pip numpy scipy "pytorch::pytorch" \
  && conda activate cplxmodule

Extra dependencies, used in the tests and needed for development, can be added on top of the essentials. Check ONNX Runtime to see if your system is compatible.

conda activate cplxmodule

# extra deps for development
conda install -n cplxmodule matplotlib scikit-learn tqdm pytest "pytorch::torchvision" \
  && pip install black pre-commit

# ONNX (for compatible systems)
conda install -n cplxmodule onnx && pip install onnxruntime

The package itself can be installed with pip:

conda activate cplxmodule

pip install cplxmodule

or from the git repo to get the latest version:

conda activate cplxmodule

pip install --upgrade git+https://github.com/ivannz/cplxmodule.git

or from the root of a locally cloned repo, if you prefer an editable developer install:

conda activate cplxmodule

# enable basic checks (codestyle, stray whitespace, eof newline)
pre-commit install

# editable install
pip install -e .

# run tests to verify the installation
# XXX `test_batchnorm.py` depends on the precision of the outcome of SGD, hence
#  may occasionally fail
# XXX A user warning concerning non-writable numpy array is expected
pytest

Additionally, you may want to study the following examples and test Variational Dropout:

conda activate cplxmodule

# test real- and complex-valued Bayesian sparsification layers
python tests/test_relevance.py

# showcase the train-sparsify-fine-tune staged pipeline on a basic
#  real-valued CNN on MNIST
python tests/test_mnist.py

Citation

The proper citation for the real-valued Bayesian Sparsification layers from cplxmodule.nn.relevance.real is either [3] (VD) or [4] (ARD). If you find the complex-valued Bayesian Sparsification layers from cplxmodule.nn.relevance.complex useful in your research, please consider citing the following paper [5]:

@inproceedings{nazarov_bayesian_2020,
    title = {Bayesian {Sparsification} of {Deep} {C}-valued {Networks}},
    volume = {119},
    url = {http://proceedings.mlr.press/v119/nazarov20a.html},
    language = {en},
    urldate = {2021-08-02},
    booktitle = {International {Conference} on {Machine} {Learning}},
    publisher = {PMLR},
    author = {Nazarov, Ivan and Burnaev, Evgeny},
    month = nov,
    year = {2020},
    note = {ISSN: 2640-3498},
    pages = {7230--7242}
}

References

[1] Trabelsi, C., Bilaniuk, O., Zhang, Y., Serdyuk, D., Subramanian, S., Santos, J. F., Mehri, S., Rostamzadeh, N., Bengio, Y., & Pal, C. J. (2018). Deep complex networks. In International Conference on Learning Representations.

[2] Kingma, D. P., Salimans, T., & Welling, M. (2015). Variational dropout and the local reparameterization trick. In Advances in neural information processing systems (pp. 2575-2583).

[3] Molchanov, D., Ashukha, A., & Vetrov, D. (2017, August). Variational dropout sparsifies deep neural networks. In Proceedings of the 34th International Conference on Machine Learning-Volume 70 (pp. 2498-2507). JMLR.org

[4] Kharitonov, V., Molchanov, D., & Vetrov, D. (2018). Variational Dropout via Empirical Bayes. arXiv preprint arXiv:1811.00596.

[5] Nazarov, I., & Burnaev, E. (2020, November). Bayesian Sparsification of Deep C-valued Networks. In International Conference on Machine Learning (pp. 7230-7242). PMLR.

[6] Zhang, Z., Wang, H., Xu, F., & Jin, Y. Q. (2017). Complex-valued convolutional neural network and its application in polarimetric SAR image classification. IEEE Transactions on Geoscience and Remote Sensing, 55(12), 7177-7188.

cplxmodule's Issues

cplx_trabelsi_independent_ not working for CplxLinear layer

I was trying to use cplxmodule.nn.init.cplx_trabelsi_independent_ for the cplxmodule.nn.CplxLinear layer, as in the minimal snippet below.

from cplxmodule.nn import CplxLinear
from cplxmodule.nn.init import cplx_trabelsi_independent_ 

fc1 = CplxLinear(1250, 500)
cplx_trabelsi_independent_(fc1.weight)

However, it raises the error below. Am I using it incorrectly? Help is appreciated.

File "<path>/python3.7/site-packages/cplxmodule/nn/init.py", line 114, in cplx_trabelsi_independent_
    M = M.reshape(cplx.shape)
ValueError: cannot reshape array of size 250000 into shape (500,1250)

SpectralNorm or WeightNorm

Hi Ivan,

Thanks for this repo. It has been very helpful to me, and I have been using it extensively in my research. However, I was wondering if you have any plans to include nn.spectral_norm or nn.weight_norm for the complex modules, where these norms would run on the real and imaginary counterparts individually. This might be a good thing to have for designing discriminators in GANs. Please feel free to share your thoughts on this.

Happy to collaborate

Best,
Vinay Kothapally

weight initialization issue

In cplxmodule.nn.modules conv.py, line 43, in the reset_parameters function, the weight is initialized with init.cplx_kaiming_uniform_ rather than cplx_trabelsi_independent_, while init.cplx_uniform_independent_ is used for the bias.
Furthermore, in cplxmodule, cplx.py, line 628, in the convnd function, the bias should be added to the output rather than to the weight.
Is that right? And how do I initialize the conv1d weight with init.cplx_uniform_independent_, as in the paper?

Complex Backprop and Learning speed

Question: In the real domain, you only require differentiability for back-propagation to work. In the complex domain, you need holomorphism. Now, pytorch doesn't check this, because it doesn't natively support complex numbers. Do you think there could be a learning/training problem with back-propagation if some of the functions don't satisfy the Cauchy-Riemann equations?

Question: In complex analysis, a function f(z) has two derivatives: df/dz and df/dz*. If the forward passes are implemented correctly, as you have done, is back-propagation well defined? Specifically, do you get both derivatives being back-propagated?
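For reference, the two derivatives in question are the Wirtinger derivatives of f with respect to z = x + iy:

\frac{\partial f}{\partial z}
  = \frac{1}{2}\left(\frac{\partial f}{\partial x} - i\,\frac{\partial f}{\partial y}\right),
\qquad
\frac{\partial f}{\partial z^{*}}
  = \frac{1}{2}\left(\frac{\partial f}{\partial x} + i\,\frac{\partial f}{\partial y}\right)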

ModReLU : bug?

Currently the implementation is:

def modrelu(input, threshold=0.5):
    r"""Compute the modulus relu of the complex tensor in re-im pair."""
    # scale = (1 - \tfrac{b}{|z|})_+
    modulus = torch.clamp(abs(input), min=1e-5)
    return input * torch.relu(1. - threshold / modulus)

Shouldn't it be:

def modrelu(input, threshold=0.5):
    r"""Compute the modulus relu of the complex tensor in re-im pair."""
    # scale = (1 - \tfrac{b}{|z|})_+
    modulus = torch.clamp(abs(input), min=1e-5)
    return input * torch.relu(1. + threshold / modulus)

i.e. torch.relu(1. - threshold / modulus) -> torch.relu(1. + threshold / modulus)?

The paper states:

g(z) = ReLU (|z|+ b) exp {iφ(z)} 

not

g(z) = ReLU (|z|- b) exp {iφ(z)} 
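A small numeric check of what the current minus form actually computes (torch.complex is used here only for the check): the scaling (1 - threshold/|z|)_+ yields ReLU(|z| - threshold) * exp(i phi(z)), i.e. the paper's ReLU(|z| + b) form with b = -threshold.

import torch

z = torch.complex(torch.tensor([3.0]), torch.tensor([4.0]))  # |z| = 5
threshold = 0.5
scale = torch.relu(1.0 - threshold / z.abs())  # (1 - 0.5/5)_+ = 0.9
out = z * scale
print(out.abs())  # tensor([4.5000]) == ReLU(|z| - threshold)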

cplxmodule.nn.CplxBatchNorm1d is not ONNX exportable

Steps to reproduce

import torch
import torch.nn as nn
import cplxmodule
import cplxmodule.nn

def _cplxFrom2d_func(x):
    return cplxmodule.Cplx(x[..., 0], x[..., 1])

def _cplxTo2d_func(x):
    return torch.stack([x.real, x.imag], dim=-1)

class _cplxFrom2d(nn.Module):
    def forward(self, x):
        return _cplxFrom2d_func(x)

class _cplxTo2d(nn.Module):
    def forward(self, x):
        return _cplxTo2d_func(x)

model = nn.Sequential(_cplxFrom2d(),
                      cplxmodule.nn.CplxBatchNorm1d(1),
                      _cplxTo2d()).eval()

input = torch.randn(1, 1, 1024, 2)
torch.onnx.export(model,
                  (input,),
                  "file.onnx",
                  opset_version=12,
                  input_names=['in'],
                  output_names=['out'],
                  dynamic_axes={'in': [0, 2], 'out': [0]})

I get the following stack trace:

  File "/usr/local/lib/python3.6/dist-packages/torch/onnx/__init__.py", line 208, in export
    custom_opsets, enable_onnx_checker, use_external_data_format)
  File "/usr/local/lib/python3.6/dist-packages/torch/onnx/utils.py", line 92, in export
    use_external_data_format=use_external_data_format)
  File "/usr/local/lib/python3.6/dist-packages/torch/onnx/utils.py", line 530, in _export
    fixed_batch_size=fixed_batch_size)
  File "/usr/local/lib/python3.6/dist-packages/torch/onnx/utils.py", line 409, in _model_to_graph
    _export_onnx_opset_version)
IndexError: index out of range in self

Feature request : compatibility with einops

I would like the following to work:

import torch
from einops import rearrange
from cplxmodule.nn import CplxToCplx
from cplxmodule.nn.modules.casting import ConcatenatedRealToCplx
cplxrearrange = CplxToCplx[rearrange]

x = torch.randn(4, 1024, 2)
y = ConcatenatedRealToCplx()(x)
z = cplxrearrange()(y, 'b t c -> b c t')

I thought it would be possible by composing CplxToCplx and rearrange, but I get some long, complicated errors.
Am I misusing something, or is CplxToCplx not supposed to work with arbitrary functions?

deepcopy doesn't work with cplxmodule modules

from copy import deepcopy
import cplxmodule as cplx
import cplxmodule.nn

n1 = cplx.nn.CplxConv2d(1,1,1)
n2 = deepcopy(n1)

This fails with

n2 = deepcopy(n1)
  File "/usr/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/lib/python3.6/copy.py", line 306, in _reconstruct
    value = deepcopy(value, memo)
  File "/usr/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/lib/python3.6/copy.py", line 274, in _reconstruct
    y = func(*args)
  File "/usr/lib/python3.6/copyreg.py", line 88, in __newobj__
    return cls.__new__(cls, *args)
TypeError: __new__() missing 1 required positional argument: 'real'

Process finished with exit code 1

Torchscript support?

Hi Ivan,

Thanks so much for making this package; it's really well developed. As things stand, however, models created with it can't be converted to Torchscript. It may or may not be a hard problem to solve: please find attached an example of an error I got. Noticeably, Torchscript is reasonably happy overall; there are just some functions defined with kwargs, etc. Is there any chance you'd consider supporting this? It would be amazing to be able to use this in a compiled environment. I hope this isn't a bother.

Regards,

Stephen
cplxtest.txt - rename to cplxtest.ipynb

Pytorch Complex Autograd used in cplxmodule?

I recently read about PyTorch autograd supporting complex differentiation using Wirtinger (CR) calculus here for its torch.complex datatype. However, the same document also describes a different, split way of computing gradients. I have been using cplxmodule; however, I'm not sure how gradients are calculated in the framework. Does cplxmodule use PyTorch's complex autograd differentiation to compute gradients for Cplx tensors?

In summary, the complex autograd gradient is given by:

[equation image: torch-cplx-diff-1]

where:

[equation image: torch-cplx-diff-2]

Hence, the entire equation reduces to the following, since backward() seeds the gradient with 1 for a scalar output:

[equation image: torch-cplx-diff-3]

Workaround for nn.DataParallel bug

It seems cplxmodule doesn't work with nn.DataParallel.
The attached minimal example gives the following error:

RuntimeError: Expected tensor for argument #1 'input' to have the same device as tensor for argument #2 'weight'; but device 0 does not equal 1 (while checking arguments for cudnn_convolution)

cplxmodule-bug.py.txt

CplxConv1d can be exported to ONNX but cannot be inferred by ONNXRUNTIME

The following example code shows that cplxmodule.nn.CplxConv1d can be exported but cannot be run.

import numpy as np
import torch
import torch.nn as nn
import cplxmodule
import cplxmodule.nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.net = cplxmodule.nn.CplxConv1d(1, 1, 3, padding=1)

    def forward(self, x):
        x = cplxmodule.Cplx(x[..., 0], x[..., 1])
        x = self.net(x)
        x = torch.stack([x.real, x.imag], dim=-1)
        return x

model = Net().eval()

input = torch.randn(1, 1, 1024, 2)
out = model(input)

torch.onnx.export(model,
                  (input,),
                  "file.onnx",
                  opset_version=12,
                  input_names=['in'],
                  output_names=['out'])

print("exported")

import onnxruntime
ort_session = onnxruntime.InferenceSession("file.onnx")
ort_inputs = {ort_session.get_inputs()[0].name: input.numpy()}
ort_outs = ort_session.run(None, ort_inputs)
assert len(ort_outs) == 1, "bad number of outputs"
np.testing.assert_allclose(out.detach().cpu(), ort_outs[0], rtol=1e-04, atol=1e-05)
print("done")

Nit picks / Bugs

  1. The module should define cplxmodule.__version__.
  2. In your setup.py, set version=version and read it from the package's __init__.py, as below. This might look dirty, but believe me, it's the best way. We use this for gnuradio/sigmf and other projects:

import os
import re

with open(os.path.join('cplxmodule', '__init__.py')) as derp:
    version = re.search(r'__version__\s*=\s*[\'"]([^\'"]*)[\'"]', derp.read()).group(1)

  3. I'm fairly certain that this line should be changed to allow passing a tuple of padding parameters, like (5, 7), to padding:

self.stride[0], self.padding[0], self.dilation[0],  # from this
self.stride, self.padding, self.dilation,  # to this

  4. Due to the way you are (correctly) using relative imports, this project is compatible with Python 3.7+ and NOT with prior versions. You can add an indicator for this in the setup. This was actually a bug related to your pip package reporting the same version as the version installed from git, while they were different and had different submodules. More reason to version properly.

I probably have a PR for you for a different feature, but I have to get it approved for release first.
