pytorch-xnor-net's People

Contributors

cooooorn

pytorch-xnor-net's Issues

VGG model accuracy is very low

The classification accuracy of the model is very low, only 10.0%, when I execute the command "$ python3 main.py --arch VGG16". I would like to ask about the cause of the problem or a solution. To be precise: the VGG model on CIFAR-10 reaches only 10% accuracy before I have made any changes to the code, while my results on the MNIST dataset match the author's. Thank you.

Out of memory when saving binary weights

Hi,
Thank you for providing this amazing code. When I ran my code on a simple network, training crashed at binop.encode_rows(weight, bin_weight) with this error: torch.FatalError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1518244421288/work/torch/lib/THC/generic/THCStorage.cu:58
I am using the same PyTorch version as you, to be consistent with your work. Could you please guide me on how to resolve this issue?
Thank you

AttributeError: module 'models' has no attribute 'VGG'

When I ran the CIFAR-10 example, I hit the following problem. (Python 3.6, PyTorch 0.4.0)

Namespace(arch='VGG16', batch_size=128, cuda=True, epochs=300, evaluate=False, log_interval=100, lr=0.1, lr_epochs=100, momentum=0.9, no_cuda=False, pretrained=None, seed=1, test_batch_size=100, weight_decay=1e-05)
Files already downloaded and verified
Traceback (most recent call last):
  File "main.py", line 341, in <module>
    model_ori = models.VGG(name)
AttributeError: module 'models' has no attribute 'VGG'

I wrote an __init__.py for the models package to fix it:

from .VGG import VGG
from .Bin_VGG import Bin_VGG_test
from .Bin_VGG import Bin_VGG_train

No module named 'binop'

File "..\util\util.py", line 1, in <module>
import binop
ModuleNotFoundError: No module named 'binop'

XNOR acceleration

Thanks for the CUDA and PyTorch implementation of XNOR; it really helps me. I am now wondering whether the implementation can actually speed up training. In some experiments on MNIST, Bin_LeNet ran slower than LeNet, which seems unreasonable. Can you explain how to accelerate the training process? Thanks a lot.

BinConv2D for group convolution

Dear @cooooorn ,

Thanks for your helpful implementation. I have the following two questions about the BinConv2d class:

  • This line: self.weight = nn.Parameter(torch.IntTensor(out_channels, 1 + ( in_channels * self.kernel_size[0] * self.kernel_size[1] - 1) // 32)). Why do we divide by 32 in the testing process? I notice that the number of weights in testing is reduced by a factor of 32. Could you clarify that?

  • I want to use group convolution. How can I modify BinConv2d to support it?
    Thank you very much.

Thanks,
Hai
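
A note on the first question: torch.IntTensor holds 32-bit integers, so one plausible reading of that line is that each filter's binary weights are packed 32 to a word, and 1 + (n - 1) // 32 is the ceiling of n / 32. The sketch below illustrates that arithmetic in plain Python. The helper names and the bit order are assumptions made for illustration; the actual layout used by binop.encode_rows is not shown in this thread.

```python
def packed_width(n_bits, word_bits=32):
    """Number of word_bits-wide words needed to hold n_bits (ceiling division)."""
    return 1 + (n_bits - 1) // word_bits


def pack_signs(weights, word_bits=32):
    """Pack the signs of one filter's weights into unsigned integers.

    A weight >= 0 is stored as bit 1, a negative weight as bit 0.
    Least-significant-bit-first order is an arbitrary choice (assumption).
    """
    words = [0] * packed_width(len(weights), word_bits)
    for i, w in enumerate(weights):
        if w >= 0:
            words[i // word_bits] |= 1 << (i % word_bits)
    return words


# A 3x3 kernel over 96 input channels has 96 * 3 * 3 = 864 weights per filter.
n = 96 * 3 * 3
print(packed_width(n))                      # 27 words instead of 864 floats
print(pack_signs([0.5, -1.2, 0.3, -0.7]))   # [5]: bits 0 and 2 are set
```

Under this reading, the testing-time weight tensor has roughly 1/32 as many elements because each element carries 32 one-bit weights, not because any weights are dropped.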

Further optimize gemm

Thanks for your great work!

I also plan to work on BNN optimization for various applications (generative models and classifiers) on a powerful CPU.
I spent a few hours on preliminary work changing the "micro_kernel" to use AVX-512, and it showed a 4x speedup from a simple one-loop optimization (note that -O3 alone does not vectorize this loop). Do you plan to work on this further and boost the performance?
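
For readers unfamiliar with what such a kernel computes, here is a minimal plain-Python sketch of the XNOR-plus-popcount dot product that a binary GEMM inner loop is built on. It is not the repository's micro_kernel; the function names and the bit encoding (1 for +1, 0 for -1) are assumptions chosen for illustration.

```python
def to_bits(values):
    """Encode a {-1, +1} vector as an integer bit mask (bit i set means +1)."""
    bits = 0
    for i, v in enumerate(values):
        if v > 0:
            bits |= 1 << i
    return bits


def xnor_popcount_dot(a_bits, b_bits, n):
    """Dot product of two {-1, +1} vectors of length n stored as bit masks.

    XNOR marks the positions where the signs agree. Each agreement
    contributes +1 and each disagreement -1, so dot = 2 * agree - n.
    """
    mask = (1 << n) - 1
    agree = bin(~(a_bits ^ b_bits) & mask).count("1")
    return 2 * agree - n


a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]
reference = sum(x * y for x, y in zip(a, b))
fast = xnor_popcount_dot(to_bits(a), to_bits(b), len(a))
print(reference, fast)  # both values are equal
```

A vectorized kernel applies the same identity to wide registers, processing many packed words per instruction instead of one Python integer at a time.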

Query not an issue

Hi,
First, thank you very much for providing the code for XNOR-Net. Out of curiosity, I was visualizing the weight values of the conv2 and fc1 layers of the binary version of LeNet, but they do not have binary values. Could you kindly guide me on this?
I am visualizing the weights using model.conv2.weight.data
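
One possible explanation, stated as an assumption because the thread does not confirm it: XNOR-Net style training usually keeps real-valued latent weights and binarizes them on the fly, approximating a filter W by alpha * sign(W) with alpha equal to the mean of |W|. If the training model here works that way, the stored tensor would look real-valued. The sketch below shows that approximation on a plain list; binarize is a hypothetical helper, not a function from this repository.

```python
def binarize(weights):
    """Return (alpha, signs) with alpha = mean(|w|) and signs in {-1, +1}."""
    alpha = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return alpha, signs


latent = [0.5, -0.25, 0.75, -0.5]       # stand-in for one filter's real-valued weights
alpha, signs = binarize(latent)
approx = [alpha * s for s in signs]     # what the binary layer would effectively use
print(alpha, signs)
print(approx)
```

Applying the same idea per filter to the stored tensor would give a view that takes only two values per filter.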

No module named 'binop'

No matter what I try, I cannot run the training. I have tried compiling binop, and it compiles fine, but running doesn't work:

on Ubuntu LTS 18.04: (Python 3.6, Pytorch 4.0, no GPU)

python3 main.py --arch Bin_LeNet
Traceback (most recent call last):
  File "main.py", line 18, in <module>
    import models as models
  File "/home/aoreskovic/GitHub/Pytorch-XNOR-Net-master/MNIST/models/__init__.py", line 2, in <module>
    from .Bin_LeNet import Bin_LeNet_test
  File "/home/aoreskovic/GitHub/Pytorch-XNOR-Net-master/MNIST/models/Bin_LeNet.py", line 6, in <module>
    from util import BinLinear
  File "../util/__init__.py", line 1, in <module>
    from .util import *
  File "../util/util.py", line 1, in <module>
    import binop
ModuleNotFoundError: No module named 'binop'

on Win10: (Python 3.6, Pytorch 4.0, no GPU)

python main.py --arch Bin_LeNet
Traceback (most recent call last):
  File "main.py", line 18, in <module>
    import models as models
  File "H:\Dropbox\NeuralXNOR\Pytorch-XNOR-Net\MNIST\models\__init__.py", line 2, in <module>
    from .Bin_LeNet import Bin_LeNet_test
  File "H:\Dropbox\NeuralXNOR\Pytorch-XNOR-Net\MNIST\models\Bin_LeNet.py", line 6, in <module>
    from util import BinLinear
  File "..\util\__init__.py", line 1, in <module>
    from .util import *
  File "..\util\util.py", line 1, in <module>
    import binop
  File "..\binop\__init__.py", line 3, in <module>
    from ._binop import lib as _lib, ffi as _ffi
ModuleNotFoundError: No module named 'binop._binop'

It looks like there should be some module named binop.py that acts as a wrapper for _binop, but that isn't generated?
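
A quick way to narrow this down is to check, from the same working directory and interpreter used to launch main.py, which of the two module names Python can actually locate. The sketch below uses only the standard library; is_importable is a hypothetical helper name, and it only reports whether the modules are found, not why a build failed to produce them.

```python
import importlib.util


def is_importable(name):
    """Return True if `name` can be located on the current sys.path."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is missing or
        # when importing the parent package itself fails on a missing module.
        return False


for name in ("binop", "binop._binop"):
    print(name, "found" if is_importable(name) else "missing")
```

If "binop" is found but "binop._binop" is missing, Python is probably picking up the source directory as the package while the compiled extension was never built or installed where the interpreter looks.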

Use the code with AlexNet but fail when saving binary model

Your code is great! However, when I use it with an AlexNet model, an error occurs when saving the binary model after one epoch. Here is the log:

THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1523242347739/work/torch/csrc/generic/serialization.cpp line=38 error=77 : an illegal memory access was encountered
Traceback (most recent call last):
  File "main.py", line 396, in <module>
    train_bin(epoch)
  File "main.py", line 128, in train_bin
    bin_save_state(args, model_train)
  File "../util/util.py", line 36, in bin_save_state
    torch.save(state, 'models/' + args.arch + '.pth')
  File "/home/fjb/miniconda3/envs/pytorch0.3/lib/python3.5/site-packages/torch/serialization.py", line 135, in save
    return _with_file_like(f, "wb", lambda f: _save(obj, f, pickle_module, pickle_protocol))
  File "/home/fjb/miniconda3/envs/pytorch0.3/lib/python3.5/site-packages/torch/serialization.py", line 117, in _with_file_like
    return body(f)
  File "/home/fjb/miniconda3/envs/pytorch0.3/lib/python3.5/site-packages/torch/serialization.py", line 135, in <lambda>
    return _with_file_like(f, "wb", lambda f: _save(obj, f, pickle_module, pickle_protocol))
  File "/home/fjb/miniconda3/envs/pytorch0.3/lib/python3.5/site-packages/torch/serialization.py", line 204, in _save
    serialized_storages[key]._write_file(f)
RuntimeError: cuda runtime error (77) : an illegal memory access was encountered at /opt/conda/conda-bld/pytorch_1523242347739/work/torch/csrc/generic/serialization.cpp:38

My environment is the same as yours, and the other architectures you provide run successfully.

The binary AlexNet code is here:

import torch
import torch.nn as nn
import torch.nn.functional as F
import sys
sys.path.append("..")
from util import BinLinear
from util import BinConv2d

class Bin_AlexNet_train(nn.Module):

    def __init__(self):
        super(Bin_AlexNet_train, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4, padding=0),
            nn.BatchNorm2d(96, eps=1e-4, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            BinConv2d(96, 256, kernel_size=5, stride=1, padding=2, istrain=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            BinConv2d(256, 384, kernel_size=3, stride=1, padding=1, istrain=True),
            BinConv2d(384, 384, kernel_size=3, stride=1, padding=1, istrain=True),
            BinConv2d(384, 256, kernel_size=3, stride=1, padding=1, istrain=True),
            nn.MaxPool2d(kernel_size=3, stride=2)
        )
        self.classifier = nn.Sequential(
            BinLinear(256 * 6 * 6, 4096, istrain=True),
            BinLinear(4096, 4096, istrain=True),
            nn.BatchNorm1d(4096, eps=1e-3, momentum=0.1, affine=True),
            nn.Linear(4096, 10)
        )

    def forward(self, x):
        x = self.features(x)
        x = x.view(-1, 256 * 6 * 6)
        x = self.classifier(x)
        return x

class Bin_AlexNet_test(nn.Module):

    def __init__(self):
        super(Bin_AlexNet_test, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4, padding=0),
            nn.BatchNorm2d(96, eps=1e-4, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            BinConv2d(96, 256, kernel_size=5, stride=1, padding=2, istrain=False),
            nn.MaxPool2d(kernel_size=3, stride=2),
            BinConv2d(256, 384, kernel_size=3, stride=1, padding=1, istrain=False),
            BinConv2d(384, 384, kernel_size=3, stride=1, padding=1, istrain=False),
            BinConv2d(384, 256, kernel_size=3, stride=1, padding=1, istrain=False),
            nn.MaxPool2d(kernel_size=3, stride=2)
        )
        self.classifier = nn.Sequential(
            BinLinear(256 * 6 * 6, 4096, istrain=False),
            BinLinear(4096, 4096, istrain=False),
            nn.BatchNorm1d(4096, eps=1e-3, momentum=0.1, affine=True),
            nn.Linear(4096, 10)
        )

    def forward(self, x):
        x = self.features(x)
        x = x.view(-1, 256 * 6 * 6)
        x = self.classifier(x)
        return x

Also, the unbinarized AlexNet can run successfully.

Could you please tell me how to solve the problem? Thank you!
