ptrblck / pytorch_misc

Code snippets created for the PyTorch discussion board

Python 100.00%

pytorch_misc's Issues

how to realize a batchnorm6d layer

Hi ptrblck,
could you tell me how to implement a batchnorm layer for an arbitrary number of dimensions, such as batchnorm4d or batchnorm6d?
Thank you!
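
One possible approach, sketched here under assumptions rather than taken from the repository: since batch norm computes statistics per channel over all other dimensions, the trailing dimensions can be flattened into one and passed through `nn.BatchNorm1d`, which accepts `(N, C, L)` inputs. The class name `BatchNormNd` is hypothetical.

```python
import torch
import torch.nn as nn


class BatchNormNd(nn.Module):
    """Hypothetical helper: batch norm over channel dim 1 for any number of trailing dims."""

    def __init__(self, num_features, **kwargs):
        super().__init__()
        # nn.BatchNorm1d accepts (N, C, L) inputs and normalizes per channel C
        self.bn = nn.BatchNorm1d(num_features, **kwargs)

    def forward(self, x):
        shape = x.shape
        # Flatten all trailing dims: (N, C, d1, ..., dk) -> (N, C, d1 * ... * dk)
        out = self.bn(x.reshape(shape[0], shape[1], -1))
        # Restore the original layout
        return out.reshape(shape)


torch.manual_seed(0)
x = torch.randn(2, 3, 4, 4, 4, 4, 4, 4) * 5 + 2  # N, C, plus six trailing dims
bn6d = BatchNormNd(3)
out = bn6d(x)

# Per-channel statistics of the output in training mode
dims = [0] + list(range(2, x.dim()))
channel_mean = out.mean(dim=dims)
channel_var = out.var(dim=dims, unbiased=False)
print(out.shape, channel_mean, channel_var)
```

In training mode the per-channel mean and variance of the output should be close to 0 and 1, the same as for the built-in layers.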

about adaptive_batchnorm

Hello, I've read a lot of your discussions about PyTorch; they were really helpful.
I have a question about the implementation of adaptive batchnorm: why do you use a * x + b * bn(x)? Could you explain a bit?
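
One way to read the formula quoted above is as a learnable mix between the unnormalized input and its batch-normalized version: with a = 1, b = 0 the layer is the identity, and with a = 0, b = 1 it reduces to plain batch norm. The sketch below only illustrates that reading. The initial values of `a` and `b` are assumptions, and it is not claimed to match the repository's file.

```python
import torch
import torch.nn as nn


class AdaptiveBatchNorm2d(nn.Module):
    """Sketch of a * x + b * bn(x) with learnable scalars a and b."""

    def __init__(self, num_features):
        super().__init__()
        self.bn = nn.BatchNorm2d(num_features)
        # Assumed initialization: start as plain batch norm (a = 0, b = 1)
        self.a = nn.Parameter(torch.zeros(1))
        self.b = nn.Parameter(torch.ones(1))

    def forward(self, x):
        return self.a * x + self.b * self.bn(x)


torch.manual_seed(0)
x = torch.randn(4, 3, 8, 8)
layer = AdaptiveBatchNorm2d(3)

out_bn_like = layer(x)  # a = 0, b = 1: same as batch norm

with torch.no_grad():
    layer.a.fill_(1.0)
    layer.b.fill_(0.0)
out_identity = layer(x)  # a = 1, b = 0: input passes through unchanged
```

During training, gradient descent can move `a` and `b` anywhere between these two extremes, so the network decides how much normalization to apply.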

The locally connected conv2d runs too slow and costs lots of GPU memory, can you provide a CUDA version?

UNET + YOLOv6

I am trying to add a UNet as a preprocessing layer in front of the YOLOv6 architecture, and I get the error below. How do I combine a UNet architecture with the YOLO architecture? Any help would be appreciated.
Looking forward to your support. Thank you.

ERROR in training steps.
ERROR in training loop or eval/save model.

Training completed in 0.000 hours.
Traceback (most recent call last):
File "tools/train.py", line 112, in <module>
main(args)
File "tools/train.py", line 102, in main
trainer.train()
File "/workspace/YOLOv61/yolov6/core/engine.py", line 75, in train
self.train_in_loop()
File "/workspace/YOLOv61/yolov6/core/engine.py", line 88, in train_in_loop
self.train_in_steps()
File "/workspace/YOLOv61/yolov6/core/engine.py", line 104, in train_in_steps
preds = self.model(images)
File "/home/ubuntu/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/workspace/YOLOv61/yolov6/models/yolo.py", line 39, in forward
x = self.detect(x)
File "/home/ubuntu/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/workspace/YOLOv61/yolov6/models/effidehead.py", line 60, in forward
x[i] = self.stems[i](x[i])
File "/home/ubuntu/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/workspace/YOLOv61/yolov6/layers/common.py", line 102, in forward
return self.act(self.bn(self.conv(x)))
File "/home/ubuntu/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ubuntu/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 446, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/home/ubuntu/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 442, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [256, 256, 1, 1], expected input[8, 128, 20, 20] to have 256 channels, but got 128 channels instead
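
For reference, a conv weight has the layout [out_channels, in_channels, kH, kW], so the message says a layer built for 256 input channels received a tensor with 128. The snippet below reproduces that situation in isolation. The 1x1 `adapter` layer is purely hypothetical and only shows what a channel match looks like; the traceback alone does not tell which part of the model configuration produced the 128-channel tensor.

```python
import torch
import torch.nn as nn

# Same weight shape as in the error message: [256, 256, 1, 1]
conv = nn.Conv2d(in_channels=256, out_channels=256, kernel_size=1)
print(conv.weight.shape)

# Same input shape as in the error message
x = torch.randn(8, 128, 20, 20)

try:
    conv(x)
    error_message = None
except RuntimeError as e:
    error_message = str(e)
print(error_message)

# Hypothetical adapter, for illustration only: project 128 -> 256 channels
adapter = nn.Conv2d(128, 256, kernel_size=1)
out = conv(adapter(x))
print(out.shape)
```

Printing the shape of each intermediate tensor between the UNet output and the detection head is usually the quickest way to find where the channel counts diverge.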

Sizes of tensors must match except in dimension 1.

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-27-8eca1ccc60c7> in <module>
      1 inputs = torch.randn(1, 3, 222, 222).to(device)
      2 print(inputs.dtype)
----> 3 outputs = unet(inputs)
      4 print(outputs.shape)
      5 print(outputs.dtype)

~/miniconda3/envs/deep_mol/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    489             result = self._slow_forward(*input, **kwargs)
    490         else:
--> 491             result = self.forward(*input, **kwargs)
    492         for hook in self._forward_hooks.values():
    493             hook_result = hook(self, input, result)

<ipython-input-20-1b02871aaf17> in forward(self, x)
     43         x_up = self.up4(x4, x3)
     44         x_up = self.up3(x_up, x2)
---> 45         x_up = self.up2(x_up, x1)
     46         x_up = self.up1(x_up, x)
     47         x_out = F.log_softmax(self.out(x_up), 1)

~/miniconda3/envs/deep_mol/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    489             result = self._slow_forward(*input, **kwargs)
    490         else:
--> 491             result = self.forward(*input, **kwargs)
    492         for hook in self._forward_hooks.values():
    493             hook_result = hook(self, input, result)

<ipython-input-19-4c1c64cb0ec3> in forward(self, x, x_skip)
     15     def forward(self, x, x_skip):
     16         x = self.conv_trans1(x)
---> 17         x = torch.cat((x, x_skip), dim=1)
     18         x = self.conv_block(x)
     19         return x

RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 111 and 112 in dimension 2 at /opt/conda/conda-bld/pytorch_1525909934016/work/aten/src/THC/generic/THCTensorMath.cu:111

Hello, I tried your code and encountered the error above.
I added ceil_mode=True, but that doesn't fix it.
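
A likely cause, assuming four 2x downsampling stages (as the four `up` blocks in the traceback suggest): 222 is not divisible by 16, so after repeated halving and doubling the upsampled tensor and the skip tensor differ by one pixel. An input size divisible by 16, such as 224, avoids this. Alternatively, the upsampled tensor can be padded or cropped to the skip tensor's size before concatenation. The helper name `match_and_cat` below is hypothetical.

```python
import torch
import torch.nn.functional as F


def match_and_cat(x, x_skip):
    """Hypothetical helper: pad or crop x to x_skip's spatial size, then concat on channels."""
    diff_h = x_skip.size(2) - x.size(2)
    diff_w = x_skip.size(3) - x.size(3)
    # F.pad order for the last two dims is (left, right, top, bottom);
    # negative values crop instead of pad
    x = F.pad(x, [diff_w // 2, diff_w - diff_w // 2,
                  diff_h // 2, diff_h - diff_h // 2])
    return torch.cat((x, x_skip), dim=1)


# Shapes chosen to mirror the "Got 111 and 112 in dimension 2" message
x = torch.randn(1, 64, 112, 112)
x_skip = torch.randn(1, 64, 111, 111)
out = match_and_cat(x, x_skip)
print(out.shape)
```

Inside the `forward` shown in the traceback, the call would replace the direct `torch.cat((x, x_skip), dim=1)`.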

EOF error when using shared_dict in multiple workers.

When the shared data is very large, reading from or writing to this dict raises an EOF error.
Related discussion on Stack Overflow: https://stackoverflow.com/questions/4534687/python-sharing-huge-dictionaries-using-multiprocessing
Is there a better solution?

Whether `shared_array.py` works for ddp

Hi, ptrblck. Thanks for providing the example of sharing an array among different workers in `shared_array.py`. However, there's only one `Dataset` instance in the job. I'm wondering whether it still applies in a DDP scenario, where multiple instances of `Dataset` exist. I haven't tried it in DDP myself, but I want to make sure it works in DDP first. Thanks : )

dimension mismatch error while implementing triplet loss

I'm getting the error below. Is this due to the PyTorch version?
Please clarify.

"__main__", fname, loader, pkg_name)
File "/home/padmashree/anaconda3/envs/myenv/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/padmashree/project_dir/EANet2/package/optim/eanet_trainer.py", line 135, in <module>
trainer.train_phases()
File "/home/padmashree/project_dir/EANet2/package/optim/eanet_trainer.py", line 126, in train_phases
self.train()
File "package/optim/reid_trainer.py", line 338, in train
self.trainer.train_one_epoch(trial_run_steps=3 if cfg.trial_run else None)
File "package/optim/trainer.py", line 36, in train_one_epoch
self.train_one_step(batch)
File "package/optim/trainer.py", line 24, in train_one_step
pred = self.train_forward(batch)
File "/home/padmashree/project_dir/EANet2/package/optim/eanet_trainer.py", line 102, in train_forward
loss += self.loss_funcs[loss_cfg.name](reid_batch, pred, step=self.trainer.current_step)['loss']
File "package/loss/triplet_loss.py", line 124, in __call__
res3 = self.calculate(torch.stack(pred['feat_list']), batch['label'], hard_type=hard_type)
File "package/loss/triplet_loss.py", line 107, in calculate
dist_mat = compute_dist(feat, feat, dist_type=cfg.dist_type)
File "package/eval/torch_distance.py", line 49, in compute_dist
dist = euclidean_dist(array1, array2)
File "package/eval/torch_distance.py", line 25, in euclidean_dist
xx = torch.pow(x, 2).sum(1, keepdim=True).expand(m, n)
RuntimeError: expand(torch.cuda.FloatTensor{[9, 1, 256]}, size=[9, 9]): the number of sizes provided (2) must be greater or equal to the number of dimensions in the tensor (3)
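
The message indicates that `euclidean_dist` received a 3-D tensor: after `sum(1, keepdim=True)` the tensor has shape [9, 1, 256], which cannot be expanded to the 2-D size [9, 9]. That is a shape issue rather than something a PyTorch version alone would explain. The sketch below reconstructs a typical pairwise-distance function around the one line visible in the traceback (everything else in it is an assumption) and shows that `torch.stack` on a list of 2-D features yields a 3-D tensor, while `torch.cat` along dim 1 keeps it 2-D. Whether concatenation is what the training code intends cannot be confirmed from the traceback.

```python
import torch


def euclidean_dist(x, y):
    """Pairwise Euclidean distance between rows of x [m, d] and y [n, d]."""
    m, n = x.size(0), y.size(0)
    xx = torch.pow(x, 2).sum(1, keepdim=True).expand(m, n)
    yy = torch.pow(y, 2).sum(1, keepdim=True).expand(n, m).t()
    dist = xx + yy - 2 * x @ y.t()
    # Clamp before sqrt to avoid NaNs from tiny negative rounding errors
    return dist.clamp(min=1e-12).sqrt()


# Hypothetical data: 9 part features, each [batch=4, 256]
feat_list = [torch.randn(4, 256) for _ in range(9)]

stacked = torch.stack(feat_list)      # shape [9, 4, 256]: 3-D
concat = torch.cat(feat_list, dim=1)  # shape [4, 2304]: 2-D

try:
    euclidean_dist(stacked, stacked)
    failed = False
except RuntimeError:
    failed = True  # same kind of expand error as in the traceback

dist = euclidean_dist(concat, concat)
print(failed, dist.shape)
```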

about batch_norm_manual

This uses the external x, which doesn't seem right.

batch_norm_manual gradient inconsistent

Hi ptrblck,

I am playing with PyTorch's BatchNorm2d and your implementation. I tried your implementation in MobileNetV3 and the performance seems similar. However, I found that the gradient values are not the same, and I am not sure why. Below is my test code:

"""
Comparison of manual BatchNorm2d layer implementation in Python and
nn.BatchNorm2d
@author: ptrblck
"""

import torch
import torch.nn as nn


def compare_bn(bn1, bn2):
    err = False
    if not torch.allclose(bn1.running_mean, bn2.running_mean):
        print('Diff in running_mean: {} vs {}'.format(
            bn1.running_mean, bn2.running_mean))
        err = True

    if not torch.allclose(bn1.running_var, bn2.running_var):
        print('Diff in running_var: {} vs {}'.format(
            bn1.running_var, bn2.running_var))
        err = True

    if bn1.affine and bn2.affine:
        if not torch.allclose(bn1.weight, bn2.weight):
            print('Diff in weight: {} vs {}'.format(
                bn1.weight, bn2.weight))
            err = True
        # compare weight gradient here
        if not torch.allclose(bn1.weight.grad, bn2.weight.grad):
            print('Diff in weight gradient: {} vs {}'.format(
                bn1.weight.grad, bn2.weight.grad))
            err = True

        if not torch.allclose(bn1.bias, bn2.bias):
            print('Diff in bias: {} vs {}'.format(
                bn1.bias, bn2.bias))
            err = True
        # compare bias gradient here
        if not torch.allclose(bn1.bias.grad, bn2.bias.grad):
            print('Diff in bias gradient: {} vs {}'.format(
                bn1.bias.grad, bn2.bias.grad))
            err = True

    if not err:
        print('All parameters are equal!')


class MyBatchNorm2d(nn.BatchNorm2d):
    def __init__(self, num_features, eps=1e-5, momentum=0.1,
                 affine=True, track_running_stats=True):
        super(MyBatchNorm2d, self).__init__(
            num_features, eps, momentum, affine, track_running_stats)

    def forward(self, input):
        self._check_input_dim(input)

        exponential_average_factor = 0.0

        if self.training and self.track_running_stats:
            if self.num_batches_tracked is not None:
                self.num_batches_tracked += 1
                if self.momentum is None:  # use cumulative moving average
                    exponential_average_factor = 1.0 / float(self.num_batches_tracked)
                else:  # use exponential moving average
                    exponential_average_factor = self.momentum

        # calculate running estimates
        if self.training:
            mean = input.mean([0, 2, 3])
            # use biased var in train
            var = input.var([0, 2, 3], unbiased=False)
            n = input.numel() / input.size(1)
            with torch.no_grad():
                self.running_mean = exponential_average_factor * mean\
                    + (1 - exponential_average_factor) * self.running_mean
                # update running_var with unbiased var
                self.running_var = exponential_average_factor * var * n / (n - 1)\
                    + (1 - exponential_average_factor) * self.running_var
        else:
            mean = self.running_mean
            var = self.running_var

        input = (input - mean[None, :, None, None]) / (torch.sqrt(var[None, :, None, None] + self.eps))
        if self.affine:
            input = input * self.weight[None, :, None, None] + self.bias[None, :, None, None]

        return input


# Init BatchNorm layers
my_bn = MyBatchNorm2d(3, affine=True)
bn = nn.BatchNorm2d(3, affine=True)
# Load weight and bias
my_bn.load_state_dict(bn.state_dict())

# Run train
for _ in range(10):
    scale = torch.randint(1, 10, (1,)).float()
    bias = torch.randint(-10, 10, (1,)).float()
    x = torch.randn(10, 3, 100, 100) * scale + bias
    out1 = my_bn(x)
    out2 = bn(x)
    #  calculate gradient for leaf
    out1.sum().backward()
    out2.sum().backward()
    compare_bn(my_bn, bn)

    torch.allclose(out1, out2)
    print('Max diff: ', (out1 - out2).abs().max())

# Run eval
my_bn.eval()
bn.eval()
for _ in range(10):
    scale = torch.randint(1, 10, (1,)).float()
    bias = torch.randint(-10, 10, (1,)).float()
    x = torch.randn(10, 3, 100, 100) * scale + bias
    out1 = my_bn(x)
    out2 = bn(x)
    #  calculate gradient for leaf
    out1.sum().backward()
    out2.sum().backward()
    compare_bn(my_bn, bn)

    torch.allclose(out1, out2)
    print('Max diff: ', (out1 - out2).abs().max())

Thanks in advance.
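
A possible explanation, offered as a sketch rather than a confirmed diagnosis: with `out.sum()` as the loss in training mode, the gradient with respect to the weight is the per-channel sum of the normalized activations, which is analytically zero. Both implementations therefore return floating-point rounding noise around zero, and `torch.allclose` with its default `atol=1e-8` treats two different noise values as unequal. The test code above also never zeroes the gradients, so they accumulate across iterations. The snippet compares a manual normalization against `nn.BatchNorm2d` for a single step; the looser tolerance is chosen for illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm2d(3)
x = torch.randn(10, 3, 100, 100) * 4 + 2

# Manual normalization with batch statistics (training mode)
weight = torch.ones(3, requires_grad=True)
bias = torch.zeros(3, requires_grad=True)
mean = x.mean([0, 2, 3])
var = x.var([0, 2, 3], unbiased=False)
x_hat = (x - mean[None, :, None, None]) / torch.sqrt(var[None, :, None, None] + bn.eps)
out_manual = x_hat * weight[None, :, None, None] + bias[None, :, None, None]

out_ref = bn(x)
out_manual.sum().backward()
out_ref.sum().backward()

# d(sum(out)) / d(weight) = sum(x_hat) per channel: zero up to rounding
print('weight grads:', weight.grad, bn.weight.grad)
# d(sum(out)) / d(bias) = number of elements per channel
print('bias grads:', bias.grad, bn.bias.grad)

# Default tolerances (rtol=1e-5, atol=1e-8) vs. a looser absolute tolerance
print('strict:', torch.allclose(weight.grad, bn.weight.grad))
print('loose: ', torch.allclose(weight.grad, bn.weight.grad, atol=1e-1))
```

A loss whose weight gradient is not zero by construction, for example `(out * torch.randn_like(out)).sum()`, together with zeroing the gradients each iteration, gives a more meaningful comparison.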

target size

Hi, ptrblck, I don't understand why the final argument used to create your target tensor is a tuple. Could you explain?
y = torch.randint(0, nb_classes, (1, 96, 96))
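
For context, the third argument of `torch.randint(low, high, size)` is the shape of the tensor to create, so `(1, 96, 96)` yields a target of shape [N, H, W] holding one class index per pixel. That is the layout `nn.CrossEntropyLoss` expects for a model output of shape [N, C, H, W]. A minimal sketch, with `nb_classes = 5` and the random model output as assumptions:

```python
import torch
import torch.nn as nn

nb_classes = 5  # hypothetical value for illustration

# Model output for segmentation: [N, C, H, W] with one logit per class
output = torch.randn(1, nb_classes, 96, 96)

# Target: [N, H, W] with class indices in [0, nb_classes - 1]
y = torch.randint(0, nb_classes, (1, 96, 96))

criterion = nn.CrossEntropyLoss()
loss = criterion(output, y)
print(y.shape, y.dtype, loss.item())
```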
