Comments (5)
1. Cause
This happens because PyTorch 1.7 changed Tensor iteration behavior. When weight_list
is a Tensor or a Parameter, list(weight_list)
returns a list of UnbindBackward tensors, which later leads to the in-place operation error.
In previous versions (1.2-1.6), list(weight_list)
returns a list of SelectBackward tensors, and it works.
The related PyTorch issue is pytorch/pytorch#47899.
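To check which behavior an installed PyTorch version has, a minimal sketch like the one below can help. It only inspects the autograd node attached to each row; the variable names are illustrative, and the exact node class names vary between versions (newer releases append a suffix such as `0`), so no particular output is assumed here.

```python
import torch

w = torch.tensor([[0.2, 0.5, 0.1, 0.5]], requires_grad=True)

# list(w) iterates over the first dimension and yields views of w, not copies.
row_from_list = list(w)[0]
# Plain indexing also yields a view of w.
row_from_index = w[0]

# Both rows alias the memory of w, so an in-place update of w affects them.
shares_storage = row_from_list.data_ptr() == w.data_ptr() == row_from_index.data_ptr()

# Name of the autograd node recorded for each view (version dependent).
list_node = type(row_from_list.grad_fn).__name__
index_node = type(row_from_index.grad_fn).__name__

print(torch.__version__, list_node, index_node, shares_storage)
```

If the two node names differ, iteration and indexing are taking different autograd paths, which is the situation described above.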
Location:
In DeepCTR-Torch, this error happens in:
DeepCTR-Torch/deepctr_torch/models/basemodel.py
Lines 371 to 372 in bc881dc
2. Solution
We fixed this bug by simply removing the list(weight_list)
operation for <class 'torch.nn.parameter.Parameter'> input in a9402a4.
Taking DeepFM as an example, add_regularization_weight()
is called 4 times:
file | input (weight_list) | type |
---|---|---|
basemodel.py | self.embedding_dict.parameters() | <class 'generator'> |
basemodel.py | self.linear_model.parameters() | <class 'generator'> |
deepfm.py | filter(lambda x: 'weight' in x[0] and 'bn' not in x[0], self.dnn.named_parameters()) | <class 'filter'> |
deepfm.py | self.dnn_linear.weight | <class 'torch.nn.parameter.Parameter'> |
list(weight_list)
is used to convert a generator or a filter into a list of tensors. For a Parameter, we can compute its norm directly, so there is no need to convert it to a list.
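The idea can be sketched roughly as follows. This is not the actual DeepCTR-Torch code: `as_weight_list` and `l2_penalty` are hypothetical helper names, and the small `nn.Linear` layer only stands in for the model parts listed in the table.

```python
import torch
from torch import nn


def as_weight_list(weights):
    """Hypothetical helper: return a list without iterating over a Parameter."""
    if isinstance(weights, nn.Parameter):
        # Wrap the Parameter itself; list(weights) would split it into row views.
        return [weights]
    # Generators and filters are simply materialized.
    return list(weights)


def l2_penalty(weight_list, l2):
    """Hypothetical helper: sum of l2 * ||p||^2 over all entries."""
    total = torch.zeros(())
    for item in weight_list:
        # named_parameters() yields (name, parameter) tuples.
        param = item[1] if isinstance(item, tuple) else item
        total = total + l2 * torch.sum(param ** 2)
    return total


linear = nn.Linear(4, 1, bias=False)
from_param = as_weight_list(linear.weight)
from_generator = as_weight_list(linear.parameters())
from_filter = as_weight_list(
    filter(lambda x: 'weight' in x[0], linear.named_parameters()))

penalty = l2_penalty(from_param, 0.5)
penalty.backward()
```

Because the list holds the Parameter itself rather than views produced by iteration, a later in-place optimizer step does not invalidate anything stored in the list.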
3. Reproduction
If you are still interested, I simplified the problem and reproduced it in the code below (it works in previous versions but fails in 1.7):
import torch

x = torch.tensor([[0.5, 0.2, 0.3, 0.8]], requires_grad=True)
w = torch.tensor([[0.2, 0.5, 0.1, 0.5]], requires_grad=True)
y_true = torch.tensor([1])

# In 1.7, `list(w)` returns list of UnbindBackward tensors, which further leads to inplace operation error.
# In previous versions, `list(w)` returns list of SelectBackward tensors and it works.
weight_list = list(w)
print('weight_list', weight_list)

for i in range(2):
    l2_norm = torch.norm(weight_list[0], 2)
    y = w.mm(x.T)
    # mse with l2 regularization
    loss = pow(y - y_true, 2) + 0.5 * pow(l2_norm, 2)
    loss.backward()
    # simulate optimizer.step()
    with torch.no_grad():
        w.add_(w.grad, alpha=1e-3)
output:
torch 1.7:
weight_list [tensor([0.2000, 0.5000, 0.1000, 0.5000], grad_fn=<UnbindBackward>)]
...
RuntimeError: Output 0 of UnbindBackward is a view and its base or another view of its base has been modified inplace. This view is the output of a function that returns multiple views. Such functions do not allow the output views to be modified inplace. You should replace the inplace operation by an out-of-place one.
Process finished with exit code 1
previous versions:
weight_list [tensor([0.2000, 0.5000, 0.1000, 0.5000], grad_fn=<SelectBackward>)]
Process finished with exit code 0
Besides the solution above, if such an operation is really necessary, using indices works:
# weight_list = list(w)
weight_list = [w[i] for i in range(len(w))]
Using indices returns SelectBackward tensors instead of UnbindBackward tensors.
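For completeness, here is a sketch of the same two-step loop with the index-based list. It is an adaptation of the reproduction, not the original script: the simulated step uses `sub_` (gradient descent) and clears the gradient, and the `losses` list is added only for inspection.

```python
import torch

x = torch.tensor([[0.5, 0.2, 0.3, 0.8]])
w = torch.tensor([[0.2, 0.5, 0.1, 0.5]], requires_grad=True)
y_true = torch.tensor([1.0])

# Index-based views instead of list(w).
weight_list = [w[i] for i in range(len(w))]

losses = []
for step in range(2):
    l2_norm = torch.norm(weight_list[0], 2)
    y = w.mm(x.T)
    # MSE with L2 regularization, reduced to a scalar.
    loss = ((y - y_true) ** 2).sum() + 0.5 * l2_norm ** 2
    loss.backward()
    # Simulate optimizer.step() followed by optimizer.zero_grad().
    with torch.no_grad():
        w.sub_(w.grad, alpha=1e-3)
        w.grad.zero_()
    losses.append(loss.item())

print(losses)
```

The second iteration reuses views created before the in-place update, which is exactly the step that fails when the list comes from `list(w)` in 1.7.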
from deepctr-torch.
Details can be found in pytorch/pytorch#47899 and in the PyTorch 1.7 release notes:
Use torch.unbind instead of a for loop
Thanks
Hi, we have solved this issue in v0.2.4. Please run pip install -U deepctr-torch to upgrade.
Thanks, I will try it later!