
Quaternion PyTorch

⚠️ This is still heavily experimental. Use at your own discretion!

This repository contains code to extend PyTorch to applications defined in the quaternion domain (H). We provide quaternion-valued tensors, layers, and examples. The code is designed to be as interoperable as possible with plain real-valued PyTorch tensors and operations.

This library draws in large part on Titouan Parcollet's code, which inspired it.

Installation

After cloning the repository, install the requirements with:

pip install -r requirements.txt

If you want to run the unit tests, you will also need the pyquaternion library. Then install htorch itself by running:

pip install -e .

Using the library

The basic unit of the library is the QuaternionTensor, an extension of PyTorch's tensor class that handles quaternion-valued elements. You can initialize a quaternion tensor either by specifying the four components or by providing a (..., 4)-dimensional tensor:

# A vector with two quaternions
x = quaternion.QuaternionTensor(torch.rand(2, 4, requires_grad=True))
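The same (..., 4) layout can also be assembled explicitly from four real component tensors; a small sketch (the dedicated four-component constructor is not shown here, so this sticks to the stacked layout):

# Build the same two-quaternion vector from its four components (a + b*i + c*j + d*k)
a, b, c, d = [torch.rand(2) for _ in range(4)]
x = quaternion.QuaternionTensor(torch.stack([a, b, c, d], dim=-1))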

We provide a number of operations from quaternion algebra and interoperability with PyTorch:

x = x * torch.rand(2) # Multiply with real-valued scalars
x.norm().sum().backward() # Take the norm (absolute value), sum, and backpropagate
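For reference, the Hamilton product underlying these operations can be written directly on the (..., 4) component layout; the sketch below illustrates the algebra and is not the library's internal implementation:

import torch

def hamilton_product(p, q):
    # Hamilton product of quaternions stored as (..., 4) tensors of
    # (a, b, c, d) components, i.e. a + b*i + c*j + d*k.
    a1, b1, c1, d1 = p.unbind(-1)
    a2, b2, c2, d2 = q.unbind(-1)
    return torch.stack([
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    ], dim=-1)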

We also provide layers and utilities to work with PyTorch modules, e.g.:

model = torch.nn.Sequential(
    layers.QLinear(10, 20),
    torch.nn.ReLU(),
    layers.QLinear(20, 10),
    layers.QuaternionToReal(10), # Take the absolute value in output
)
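A usage sketch for this model, assuming a (batch, in_features, 4) input layout (the exact input shape QLinear expects is an assumption here):

x = quaternion.QuaternionTensor(torch.rand(8, 10, 4))  # batch of 8, 10 quaternion features
y = model(x)  # real-valued output of shape (8, 10)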

See the Basic notebook for an introduction to the core concepts of the library, and the Training notebook for an example of training a quaternion-valued CNN.

Code organization

  1. The QuaternionTensor class is defined in htorch/quaternion.py.
  2. Layers for building quaternion-valued neural networks are found in htorch/layers.py.
  3. A few utilities to load real-valued datasets or convert existing real-valued models can be found in htorch/utils.py.

Most operations are documented in the example notebooks.

Testing

To manually run the unit tests:

python -m unittest discover -s ./tests -p "*_test.py"

If you have coverage installed:

coverage run -m unittest discover -s ./tests -p "*_test.py"

To regenerate the coverage badge (not automated yet), install coverage-badge, then run:

coverage-badge -o coverage.svg -f

References

[1] C. Gaudet and A. Maida, "Deep Quaternion Networks," 2018 International Joint Conference on Neural Networks (IJCNN), 2018.

[2] C. Trabelsi et al., "Deep Complex Networks," 2017.

[3] T. Parcollet, M. Morchid, and G. Linarès, "A survey of quaternion neural networks," Artificial Intelligence Review, 2019.


htorch's Issues

No GPU utilization with QBatchNorm2d

Hi,

When using the QBatchNorm2d layer there is no GPU utilization, and consequently model training takes absurdly long. Without this layer, everything works as expected. Kindly check this and let me know how it can be fixed. You can recreate the issue by testing any convolutional model that uses QBatchNorm2d.
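As a generic first check (not a confirmed diagnosis), it is worth verifying that every parameter and buffer of the model actually lives on the GPU; a buffer left on the CPU inside a custom layer is a common cause of this symptom:

import torch.nn as nn

def report_devices(module: nn.Module):
    # Print the device of every parameter and buffer; a stray CPU tensor
    # inside a custom layer will silently pull computation off the GPU.
    for name, p in module.named_parameters():
        print(f"param  {name}: {p.device}")
    for name, b in module.named_buffers():
        print(f"buffer {name}: {b.device}")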

License file

Hi,

Could you add a license to this repo? I would like to cite your work.

QBatchNorm2d RuntimeError 'shape invalid'

Hi,

It's me again. I'm running code similar to what I used in issue #8, but this time with a ResNet model with batch normalization on the CIFAR-10 dataset (I used your custom collate_fn to add an additional channel to the images). The code for the model is given below:

import torch.nn as nn
import torch.nn.functional as F
from htorch import layers


class Quat_Block(nn.Module):
    """
    A quaternion ResNet block.
    """
    def __init__(self, in_channels: int, out_channels: int, downsample=False):
        super().__init__()

        stride = 2 if downsample else 1

        self.conv1 = layers.QConv2d(in_channels, out_channels, kernel_size=3,
                                    stride=stride, padding=1, bias=False)
        self.bn1 = layers.QBatchNorm2d(out_channels)

        self.conv2 = layers.QConv2d(out_channels, out_channels, kernel_size=3,
                                    stride=1, padding=1, bias=False)
        self.bn2 = layers.QBatchNorm2d(out_channels)

        # Shortcut connection
        if downsample or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                layers.QConv2d(in_channels, out_channels, kernel_size=1,
                               stride=2, bias=False),
                layers.QBatchNorm2d(out_channels)
            )
        else:
            self.shortcut = nn.Sequential()

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out += self.shortcut(x)
        return F.relu(out)


class Model(nn.Module):

    def __init__(self):
        super().__init__()
        num_segments = 3
        filters_per_segment = [4, 8, 16]
        architecture = [(num_filters, num_segments) for num_filters in
                        filters_per_segment]

        # Initial convolutional layer.
        current_filters = architecture[0][0]
        self.conv = layers.QConv2d(1, current_filters, kernel_size=3, stride=1,
                                   padding=1, bias=False)
        self.bn = layers.QBatchNorm2d(current_filters)

        # ResNet blocks
        blocks = []
        for segment_index, (filters, num_blocks) in enumerate(architecture):
            for block_index in range(num_blocks):
                downsample = segment_index > 0 and block_index == 0
                blocks.append(Quat_Block(current_filters, filters, downsample))
                current_filters = filters

        self.blocks = nn.Sequential(*blocks)

        # Final fc layer.
        self.fc = layers.QLinear(architecture[-1][0], 10)
        self.abs = layers.QuaternionToReal(10)

    def forward(self, x):
        out = F.relu(self.bn(self.conv(x)))
        out = self.blocks(out)
        out = F.avg_pool2d(out, out.size()[3])
        out = out.view(out.size(0), -1)
        out = self.fc(out)
        return self.abs(out)

I'm getting the following error:

/home/sahel/Documents/code/quaternion_lth/htorch/htorch/layers.py:577: UserWarning: torch.cholesky is deprecated in favor of torch.linalg.cholesky and will be removed in a future PyTorch release.
L = torch.cholesky(A)
should be replaced with
L = torch.linalg.cholesky(A)
and
U = torch.cholesky(A, upper=True)
should be replaced with
U = torch.linalg.cholesky(A.transpose(-2, -1).conj()).transpose(-2, -1).conj() (Triggered internally at  ../aten/src/ATen/native/BatchLinearAlgebra.cpp:1284.)
  ell = torch.cholesky(cov + self.eye, upper=True)
Traceback (most recent call last):
  File "train.py", line 47, in <module>
    H.train_model(
  File "/home/sahel/Documents/code/quaternion_lth/helper_methods.py", line 192, in train_model
    accuracy = test_model(model, testloader, device)
  File "/home/sahel/Documents/code/quaternion_lth/helper_methods.py", line 249, in test_model
    outputs = model(images)
  File "/home/sahel/Documents/code/env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/sahel/Documents/code/quaternion_lth/models/resnet.py", line 172, in forward
    out = F.relu(self.bn(self.conv(x)))
  File "/home/sahel/Documents/code/env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/sahel/Documents/code/quaternion_lth/htorch/htorch/layers.py", line 587, in forward
    weight = self.weight.view(4, 4, *shape)
RuntimeError: shape '[4, 4, 1, 4, 1, 1]' is invalid for input of size 16

I can't figure out if this is because of an error in my implementation of the model or because of an error in the QBatchNorm2d function.

Note: in layers.py, line 545, you're missing an argument to the method init.constant_(), which is probably the first error that shows up when running QBatchNorm2d.

Dimension Mismatch error for QConv1d

Hi,
I was looking at your great work in this repo on QNNs. I am facing an issue and would appreciate your help.

Each input data point has 4 different channels (input mechanisms), and each channel provides 1024 features (an array of size 1024). For training we thus have input data of size (64 x 4 x 1024) [batch_size = 64]. The goal is classification into num_class classes using a quaternion Conv1d. A snapshot of the code is below:

import torch
import torch.nn as nn
from htorch import layers

kernel = 1
out_chn = 4
out_conv = ((1024 - 1 * (kernel - 1) - 1)) + 1
out_pool = ((out_conv - 1 * (kernel - 1) - 1)) + 1
inp_lin = out_pool*out_chn
#
## Model Definition
model_QNN = torch.nn.Sequential(
        layers.QConv1d(in_channels=1, out_channels=out_chn, kernel_size=kernel, stride=1, bias=True),
        torch.nn.ReLU(),
        torch.nn.MaxPool1d(kernel_size=kernel, stride=1),
        torch.nn.Flatten(),
        layers.QLinear(inp_lin, num_class),
        layers.QuaternionToReal(num_class)
        )
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model_QNN.parameters())
model_QNN.cuda()
criterion.cuda()
#
## Training Loop
for epoch in range(5):
    for i, (inputs, targets) in enumerate(train_dl):
        yhat = model_QNN(inputs.cuda())
        loss = criterion(yhat.cuda(),targets.cuda())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

However, the above throws the following error during training:

Traceback (most recent call last):
  File "<stdin>", line 5, in <module>
  File "/QNN/hTorch/q_env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/QNN/hTorch/q_env/lib/python3.6/site-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/QNN/hTorch/q_env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/QNN/hTorch/htorch/layers.py", line 92, in forward
    self.padding, self.dilation, self.groups))
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [16, 4, 1, 1], but got 3-dimensional input of size [64, 4, 1024] instead

What am I doing wrong here? Thanks in advance.
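For context on the shapes in this error: quaternion layers are commonly realized with real tensors by expanding each quaternion weight into the 4x4 real matrix of left-multiplication, so a QConv1d(1, 4) stores 4 * 4 = 16 real output channels over 4 real input channels, which matches the [16, 4, 1, 1] weight in the traceback. A sketch of that expansion (an illustration of the usual construction, not necessarily htorch's exact code):

import torch

def left_mult_matrix(a, b, c, d):
    # Real 4x4 matrix representing left-multiplication by the
    # quaternion a + b*i + c*j + d*k.
    return torch.tensor([
        [a, -b, -c, -d],
        [b,  a, -d,  c],
        [c,  d,  a, -b],
        [d, -c,  b,  a],
    ])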

Adding a collate_fn to handle real-valued data

It would be helpful to have a custom collate_fn to convert real-valued images to quaternion-valued images. Something like:

cifar = torchvision.datasets.CIFAR10(root="./data", download=True, transform=torchvision.transforms.ToTensor())
loader = torch.utils.data.DataLoader(cifar, batch_size=4, shuffle=True, collate_fn=convert_to_quaternion)

In this way, all image datasets in torchvision could be used directly. I think this could go together with the model conversion tools in a new utils.py module.
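A minimal sketch of such a collate_fn, assuming RGB images are lifted to quaternions by prepending a zero scalar channel (both the name convert_to_quaternion and the zero-padding convention are illustrative):

import torch

def convert_to_quaternion(batch):
    # Stack (image, label) pairs, then prepend a zero channel so each
    # 3-channel RGB image becomes a 4-channel quaternion image.
    images = torch.stack([img for img, _ in batch])
    labels = torch.tensor([label for _, label in batch])
    zeros = torch.zeros_like(images[:, :1])
    return torch.cat([zeros, images], dim=1), labels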

QBatchNorm2d calculation

Hi, I was looking at your QBatchNorm2d code and attempting to fix some of the issues. I noticed you use the pseudo-covariance directly for normalization, rather than the variance, which is more typical; could you provide a link to the relevant literature on this method of normalization? I'm unfamiliar with it. Thanks!
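For context, the whitening-style normalization of Deep Complex Networks [2] generalizes to quaternions by estimating the 4x4 covariance of the components and multiplying by an inverse Cholesky factor; a minimal sketch of that idea (an illustration of the technique, not the library's QBatchNorm2d):

import torch

def whiten_quaternion(x, eps=1e-5):
    # x: (batch, 4, C, H, W) real components of a quaternion feature map.
    b, q, c, h, w = x.shape
    flat = x.permute(1, 0, 2, 3, 4).reshape(q, -1)   # (4, batch*C*H*W)
    flat = flat - flat.mean(dim=1, keepdim=True)
    cov = flat @ flat.t() / flat.shape[1]            # (4, 4) covariance
    ell = torch.linalg.cholesky(cov + eps * torch.eye(q, device=x.device))
    white = torch.linalg.solve_triangular(ell, flat, upper=False)
    return white.reshape(q, b, c, h, w).permute(1, 0, 2, 3, 4)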

RuntimeError: mat1 and mat2 shapes cannot be multiplied (8x4 and 16x80)

Hi,
I am very interested in your work, and thanks for what you have done. I encountered a problem when I tried to run the sample code from notebooks/basic.ipynb.

Input:

import torch
from torch import nn
from htorch import quaternion
from htorch.layers import QLinear
# Simple model with two quaternion-valued dense layers, and a split ReLU (ReLU applied on each component separately)
model = nn.Sequential(
    QLinear(4, 20, bias=True),
    nn.ReLU(),
    QLinear(20, 1)
)
x = quaternion.QuaternionTensor(torch.rand(2, 4, 4))
print(model(x))

Output:

  File "D:/code-study/hTorch-main/notebooks/vgew.py", line 13, in <module>
    print(model(x))
  File "C:\anaconda3\envs\pytorch_qua_env\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\anaconda3\envs\pytorch_qua_env\lib\site-packages\torch\nn\modules\container.py", line 141, in forward
    input = module(input)
  File "C:\anaconda3\envs\pytorch_qua_env\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\code-study\hTorch-main\htorch\layers.py", line 268, in forward
    return Q(F.linear(x, weight.t(), self.bias))
  File "D:\code-study\hTorch-main\htorch\quaternion.py", line 543, in __torch_function__
    return func(*args, **kwargs)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (8x4 and 16x80)


![image](https://user-images.githubusercontent.com/80167832/170857782-16bbc3d0-8b9e-4093-a89c-a99396ec0c4b.png)

![image](https://user-images.githubusercontent.com/80167832/170857792-b8d922b2-2073-4513-801b-ba2cedabc5b4.png)
I've installed the requirements from requirements.txt, but it still doesn't work. Could you help me with this? Thank you!

Accuracy drops during extended training

Hi,

I've built the following quaternion network using the layers provided.

import torch
import torch.nn as nn
import torch.nn.functional as F
from htorch import layers


class QLeNet_300_100(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = layers.QLinear(196, 75)
        self.fc2 = layers.QLinear(75, 25)
        self.fc3 = layers.QLinear(25, 10)
        self.abs = layers.QuaternionToReal(10)

    def forward(self, x):
        x = torch.flatten(x, 1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.abs(self.fc3(x))
        return x

When training the model for an extended duration on the MNIST dataset, the accuracy suddenly drops to nearly 10% (what we would expect from an untrained model) and does not improve any further. A plot of the accuracy values over the course of training is attached.

[attached: accuracy plot]

The same issue also persists when using the methods in Parcollet's original repo. I would appreciate some insight into why this might be happening. If you need additional info, I can provide the code to recreate this issue.

Thanks,
Sahel
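A common first step for sudden collapses like this (a generic suggestion, not a confirmed diagnosis) is to log the global gradient norm and clip it, since spikes right before the drop usually point to exploding gradients:

import torch

def global_grad_norm(model):
    # Total L2 norm over all parameter gradients; call after backward().
    norms = [p.grad.norm() for p in model.parameters() if p.grad is not None]
    return torch.stack(norms).norm() if norms else torch.tensor(0.0)

# In the training loop, after loss.backward():
#     print(float(global_grad_norm(model)))
#     torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)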
