Comments (14)
Hi, my friend, thank you for your work. I tried to debug and found a tiny bug:
in file 0_gen_sampled_data.py:
unique_cate_id = np.concatenate(
    (ad['cate_id'].unique(), log['cate'].unique()))
lbe.fit(unique_cate_id)
in file 2_gen_dsin_input.py:
data = pd.merge(sample_sub, user, how='left', on='userid')
data = pd.merge(data, ad, how='left', on='adgroup_id')
Here the merge loses some data (cate_id and brand), so in
sparse_feature_list = [SingleFeat(feat, data[feat].nunique() + 1)
                       for feat in sparse_features + ['cate_id', 'brand']]
data['brand'].nunique() is smaller than the largest encoded brand index in the input data.
I logged the number of unique input brands, updated the fd, and then the code ran without error.
from dsin.
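To illustrate the bug with a toy example (hypothetical values; `ad`, `log`, and `sample_sub` stand in for the frames built by the scripts): the encoder is fit on the union of ad and log categories, but the merged training frame only ever sees the ad side, so `nunique() + 1` undersizes the embedding vocabulary.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the frames built in 0_gen_sampled_data.py (hypothetical
# values). Categories 30 and 40 occur only in the behavior log, not in ad.
ad = pd.DataFrame({'adgroup_id': [1, 2], 'cate_id': [10, 20]})
log = pd.DataFrame({'cate': [10, 20, 30, 40]})
sample_sub = pd.DataFrame({'adgroup_id': [1, 2]})

# Fit on the union of both columns, mimicking lbe.fit(unique_cate_id).
# (A dict keyed by np.unique assigns the same sorted-order labels as
# sklearn's LabelEncoder.)
unique_cate_id = np.concatenate((ad['cate_id'].unique(), log['cate'].unique()))
labels = {v: i for i, v in enumerate(np.unique(unique_cate_id))}
ad['cate_id'] = ad['cate_id'].map(labels)
log['cate'] = log['cate'].map(labels)

# 2_gen_dsin_input.py merges only the ad table into the training frame,
# so the merged frame never sees labels 2 and 3.
data = pd.merge(sample_sub, ad, how='left', on='adgroup_id')

vocab_from_data = data['cate_id'].nunique() + 1    # 3: too small
vocab_needed = len(np.unique(unique_cate_id)) + 1  # 5: what the encoder emits
print(vocab_from_data, vocab_needed)  # prints: 3 5
```

The session features built from `log` still contain label 3, so an embedding sized from the merged frame raises the out-of-range index error reported below.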
Hi, I met the same problem, could you tell us how to fix the bug?
Sorry for the late reply, I am not sure whether it is ok or not.
- Log the dimensions in file 0_gen_...:
pd.to_pickle({
    'cate_id': SingleFeat('cate_id', len(np.unique(unique_cate_id)) + 1),
    'brand': SingleFeat('brand', len(np.unique(unique_brand)) + 1),
},
    '../model_input/dsin_fd_cate_brand_' + str(FRAC) + '.pkl')
- Update the input fd in train_dsin.py:
cate_brand_fd = pd.read_pickle('../model_input/dsin_fd_cate_brand_' +
                               str(FRAC) + '.pkl')
fd['sparse'][13] = cate_brand_fd['cate_id']
fd['sparse'][14] = cate_brand_fd['brand']
- Rerun the script.
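The steps above can be sketched end to end. This is a minimal, self-contained rehearsal of the workaround, with assumptions labeled: `SingleFeat` is replaced by a stand-in namedtuple with the same (name, dimension) shape as deepctr 0.4.x's, the vocabularies and the `fd['sparse']` list are hypothetical toy values, and the pickle is written to a temp directory instead of `../model_input/`.

```python
import os
import tempfile
from collections import namedtuple

import numpy as np
import pandas as pd

# Stand-in for deepctr 0.4.x's SingleFeat; only the (name, dimension)
# shape matters here (assumption -- install deepctr==0.4.1 for the real one).
SingleFeat = namedtuple('SingleFeat', ['name', 'dimension'])

FRAC = 0.25  # sampling fraction used by the scripts
unique_cate_id = np.array([10, 20, 30, 40])  # hypothetical full vocabularies
unique_brand = np.array([7, 8])

# Step 1 (end of 0_gen_sampled_data.py): persist the true dimensions
# (temp path here; the thread uses '../model_input/').
path = os.path.join(tempfile.gettempdir(),
                    'dsin_fd_cate_brand_' + str(FRAC) + '.pkl')
pd.to_pickle({
    'cate_id': SingleFeat('cate_id', len(np.unique(unique_cate_id)) + 1),
    'brand': SingleFeat('brand', len(np.unique(unique_brand)) + 1),
}, path)

# Step 2 (train_dsin.py): overwrite the undersized entries before building
# the model; 13 and 14 are where cate_id and brand sit in fd['sparse'].
fd = {'sparse': [None] * 15}  # placeholder for the loaded feature dict
cate_brand_fd = pd.read_pickle(path)
fd['sparse'][13] = cate_brand_fd['cate_id']
fd['sparse'][14] = cate_brand_fd['brand']
```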
Please update your code to the latest version and run it in the environment described at
https://github.com/shenweichen/DSIN#operating-environment
I ran this code on tf-cpu 1.4.0, because my CUDA is 10.0 and it cannot run on GPU.
Do you know what this error message means?
Have you run your code on Python 3.6?
right
Check that your code is up to date with the latest commit.
It is the latest commit with deepctr==0.4.1
Yes, I suggest you clone the whole repo and re-run.
I have also encountered this problem. Could you please tell me in detail how to fix this bug?
Thank you so much, please let me try.
Sorry for this mistake; we are planning to refactor our code in the future.
I think this error can be fixed by using
sparse_feature_list = [SingleFeat(feat, data[feat].max() + 1)
                       for feat in sparse_features + ['cate_id', 'brand']]
instead of Lines 141 to 142 in 3aed781.
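The difference between the two sizings can be seen on a tiny label-encoded column (hypothetical values) where the merge kept only a subset of the codes:

```python
import pandas as pd

# After label encoding, codes run 0..max. Suppose the left merge kept only
# codes 0 and 3 (codes 1 and 2 were attached to rows that dropped out).
data = pd.DataFrame({'brand': [0, 3, 3, 0]})

dim_nunique = data['brand'].nunique() + 1  # 3: an embedding this size rejects code 3
dim_max = int(data['brand'].max()) + 1     # 4: covers every code 0..3
print(dim_nunique, dim_max)  # prints: 3 4
```

Note that `max() + 1` sizes by the largest code present in `data`, so it avoids the out-of-range error as long as the merged frame contains the largest encoded index.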