
Comments (5)

zanshuxun commented on June 14, 2024

1. Cause

This happens because PyTorch 1.7 changed how Tensor iteration works. When weight_list is a Tensor or a Parameter, list(weight_list) now returns a list of UnbindBackward tensors, which then leads to the in-place operation error.
In previous versions (1.2-1.6), list(weight_list) returned a list of SelectBackward tensors, and it worked.
The related PyTorch issue is pytorch/pytorch#47899.

Location:

In DeepCTR-Torch, this error happens in:

self.add_regularization_weight(self.dnn_linear.weight, l2_reg_dnn)

def add_regularization_weight(self, weight_list, weight_decay, p=2):
    self.regularization_weight.append((list(weight_list), weight_decay, p))

2. Solution

We fixed this bug in a9402a4 by simply removing the list(weight_list) operation for <class 'torch.nn.parameter.Parameter'> input.

Taking DeepFM as an example, add_regularization_weight() is called 4 times:

file         | input (weight_list)                                                                             | type
basemodel.py | self.embedding_dict.parameters()                                                                | <class 'generator'>
basemodel.py | self.linear_model.parameters()                                                                  | <class 'generator'>
deepfm.py    | filter(lambda x: 'weight' in x[0] and 'bn' not in x[0], self.dnn.named_parameters())            | <class 'filter'>
deepfm.py    | self.dnn_linear.weight                                                                          | <class 'torch.nn.parameter.Parameter'>

list(weight_list) is used to convert a generator or a filter into a list of tensors. For a Parameter, we can compute its norm directly, so there is no need to convert it to a list.
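To make the idea concrete, here is a minimal sketch of that dispatch. It is not the code from a9402a4: the helper name as_weight_list and the small nn.Linear layer are made up for illustration. It only assumes that a single Parameter should be kept as-is (wrapped in a one-element list) while generators and filters are materialized with list().

```python
import torch
from torch import nn


def as_weight_list(weights):
    """Hypothetical helper: normalize the input into a list of tensors."""
    if isinstance(weights, nn.Parameter):
        # Wrap, do not iterate: list(param) would unbind it into row views.
        return [weights]
    # Generators, filters and other iterables of parameters are materialized.
    return list(weights)


layer = nn.Linear(4, 1, bias=False)       # stand-in for a model layer
single = as_weight_list(layer.weight)     # Parameter input
many = as_weight_list(layer.parameters()) # generator input

# Two training-style steps: the norm is taken on the leaf Parameter itself,
# so the in-place update between steps does not invalidate any view.
x = torch.tensor([[0.5, 0.2, 0.3, 0.8]])
for _ in range(2):
    l2_norm = sum(torch.norm(w, 2) for w in single)
    loss = (layer(x) - 1.0).pow(2).sum() + 0.5 * l2_norm.pow(2)
    layer.zero_grad()
    loss.backward()
    with torch.no_grad():
        layer.weight.add_(layer.weight.grad, alpha=-1e-3)
```

Because the list holds the Parameter object itself rather than views of it, the regularization term always sees the current weights after each update.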

3. Reproduction

If you are still interested, I simplified the problem and reproduced it in the code below (it works in previous versions but fails in 1.7):

import torch

x = torch.tensor([[0.5, 0.2, 0.3, 0.8]], requires_grad=True)
w = torch.tensor([[0.2, 0.5, 0.1, 0.5]], requires_grad=True)
y_true = torch.tensor([1])

# In 1.7, `list(w)` returns list of UnbindBackward tensors, which further leads to inplace operation error.
# In previous versions, `list(w)` returns list of SelectBackward tensors and it works.
weight_list = list(w)
print('weight_list',weight_list) 

for i in range(2):
    l2_norm = torch.norm(weight_list[0], 2)
    y = w.mm(x.T)

    # mse with l2 regularization
    loss = pow(y - y_true, 2) + 0.5 * pow(l2_norm, 2)
    loss.backward()

    # simulate optimizer.step()
    with torch.no_grad():
        w.add_(w.grad, alpha=1e-3)

output:
torch 1.7:

weight_list [tensor([0.2000, 0.5000, 0.1000, 0.5000], grad_fn=<UnbindBackward>)]
...
RuntimeError: Output 0 of UnbindBackward is a view and its base or another view of its base has been modified inplace. This view is the output of a function that returns multiple views. Such functions do not allow the output views to be modified inplace. You should replace the inplace operation by an out-of-place one.

Process finished with exit code 1

previous versions:

weight_list [tensor([0.2000, 0.5000, 0.1000, 0.5000], grad_fn=<SelectBackward>)]

Process finished with exit code 0

Besides the solution above, if such an operation is really necessary, using indices works:

# weight_list = list(w)
weight_list = [w[i] for i in range(len(w))]

Using indices returns SelectBackward tensors instead of UnbindBackward tensors.
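A quick way to check which kind of view you are holding is to inspect grad_fn. The snippet below is only a sketch: the variable names are made up, and it assumes PyTorch 1.7 or later, where the first name should start with "Unbind" and the second with "Select" (newer releases append a numeric suffix to these class names).

```python
import torch

w = torch.tensor([[0.2, 0.5, 0.1, 0.5]], requires_grad=True)

iterated = list(w)                        # goes through Tensor iteration
indexed = [w[i] for i in range(len(w))]   # explicit indexing, one row at a time

# The class name of grad_fn tells which autograd op produced each view.
iterated_fn = type(iterated[0].grad_fn).__name__
indexed_fn = type(indexed[0].grad_fn).__name__
print(iterated_fn, indexed_fn)
```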

from deepctr-torch.

zanshuxun commented on June 14, 2024

Details can be found in pytorch/pytorch#47899 and in the PyTorch 1.7 release notes.

Use torch.unbind instead of a for loop



LeeTsinghua commented on June 14, 2024

Thanks


zanshuxun commented on June 14, 2024

Hi, we have fixed this issue in v0.2.4. Please run pip install -U deepctr-torch to upgrade.


Arexh commented on June 14, 2024

Hi, we have solved this issue in v0.2.4, please use pip install -U deepctr-torch to upgrade.

Thanks, I will try it later!

