
tensorflow-deepfm's Introduction

tensorflow-DeepFM

This project includes a TensorFlow implementation of DeepFM [1].

NEWS

Usage

Input Format

This implementation requires the input data in the following format:

  • Xi: [[ind1_1, ind1_2, ...], [ind2_1, ind2_2, ...], ..., [indi_1, indi_2, ..., indi_j, ...], ...]
    • indi_j is the feature index of feature field j of sample i in the dataset
  • Xv: [[val1_1, val1_2, ...], [val2_1, val2_2, ...], ..., [vali_1, vali_2, ..., vali_j, ...], ...]
    • vali_j is the feature value of feature field j of sample i in the dataset
    • vali_j can be either binary (1/0, for binary/categorical features) or float (e.g., 10.24, for numerical features)
  • y: target of each sample in the dataset (1/0 for classification, numeric value for regression)

Please see example/DataReader.py for an example of how to prepare the data in the required format for DeepFM.
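
As a concrete toy illustration (the indices below are hypothetical; in practice they come from a feature dictionary such as the one built in example/DataReader.py), consider three samples with two feature fields, where field 0 is categorical (the index selects the category, the value is 1.0) and field 1 is numeric (the index is fixed, the value is the raw number):

Xi = [[0, 3], [1, 3], [2, 3]]                 # feature indices, one per field
Xv = [[1.0, 0.5], [1.0, 10.24], [1.0, -2.3]]  # 1.0 for categorical, raw value for numeric
y = [1, 0, 1]                                 # binary targets (classification)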

Init and train a model

import tensorflow as tf
from sklearn.metrics import roc_auc_score

from DeepFM import DeepFM

# params
dfm_params = {
    "use_fm": True,
    "use_deep": True,
    "embedding_size": 8,
    "dropout_fm": [1.0, 1.0],
    "deep_layers": [32, 32],
    "dropout_deep": [0.5, 0.5, 0.5],
    "deep_layers_activation": tf.nn.relu,
    "epoch": 30,
    "batch_size": 1024,
    "learning_rate": 0.001,
    "optimizer_type": "adam",
    "batch_norm": 1,
    "batch_norm_decay": 0.995,
    "l2_reg": 0.01,
    "verbose": True,
    "eval_metric": roc_auc_score,
    "random_seed": 2017
}

# prepare training and validation data in the required format
Xi_train, Xv_train, y_train = prepare(...)
Xi_valid, Xv_valid, y_valid = prepare(...)

# init a DeepFM model
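# note: dfm_params must also contain "feature_size" (total number of
# distinct feature indices) and "field_size" (number of feature fields);
# see example/main.py, where they are derived from the parsed data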
dfm = DeepFM(**dfm_params)

# fit a DeepFM model
dfm.fit(Xi_train, Xv_train, y_train)

# make prediction
dfm.predict(Xi_valid, Xv_valid)

# evaluate a trained model
dfm.evaluate(Xi_valid, Xv_valid, y_valid)

You can use early stopping during training as follows:

dfm.fit(Xi_train, Xv_train, y_train, Xi_valid, Xv_valid, y_valid, early_stopping=True)

You can refit the model on the whole training and validation sets as follows:

dfm.fit(Xi_train, Xv_train, y_train, Xi_valid, Xv_valid, y_valid, early_stopping=True, refit=True)

You can use the FM or the deep component alone by setting the parameter use_fm or use_deep to False, for example:
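
dfm_params["use_deep"] = False  # FM component only
# or
dfm_params["use_fm"] = False    # deep (DNN) component only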

Regression

This implementation also supports regression tasks. To use DeepFM for regression, set loss_type to "mse". Accordingly, you should use a regression eval_metric, e.g., mean squared error or mean absolute error.
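
A minimal sketch of a regression setup (assuming sklearn's mean_absolute_error as the metric; greater_is_better is the constructor flag used for evaluation/early stopping and must be False for error metrics):

from sklearn.metrics import mean_absolute_error

dfm_params["loss_type"] = "mse"                  # squared-error training loss
dfm_params["eval_metric"] = mean_absolute_error
dfm_params["greater_is_better"] = False          # lower error is better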

Example

The example folder includes an example usage of the DeepFM/FM/DNN models for Porto Seguro's Safe Driver Prediction competition on Kaggle.

Please download the data from the competition website and put the files into the example/data folder.

To train a DeepFM model on this dataset, run

$ cd example
$ python main.py

Please see example/DataReader.py for how to parse the raw dataset into the required format for DeepFM.

Performance

DeepFM

[performance figure: dfm]

FM

[performance figure: fm]

DNN

[performance figure: dnn]

Some tips

  • You should tune the parameters of each model in order to get reasonable performance.
  • You can also try to ensemble these models, or ensemble them with other models (e.g., XGBoost or LightGBM); see the sketch below.
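
A minimal sketch of a simple blend (assuming dfm and fm are already-trained DeepFM instances, with fm constructed with use_deep=False, and Xi_test/Xv_test prepared as above; the equal weights are illustrative, not tuned):

p_dfm = dfm.predict(Xi_test, Xv_test)
p_fm = fm.predict(Xi_test, Xv_test)
p_blend = 0.5 * p_dfm + 0.5 * p_fm  # average the predicted probabilities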

Reference

[1] DeepFM: A Factorization-Machine based Neural Network for CTR Prediction, Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, Xiuqiang He.

Acknowledgments

This project draws inspiration from the following projects:

License

MIT

tensorflow-deepfm's People

Contributors

chenglongchen


tensorflow-deepfm's Issues

TypeError: Expected string passed to parameter 'tensor_names' of op 'SaveV2', got ['Variable', 'Variable/Adam', 'Variable/Adam_1', 'Variable_1', 'Variable_1/Adam', 'Variable_1/Adam_1', 'Variable_2', 'Variable_2/Adam', 'Variab

Hi everyone:
TypeError: Expected string passed to parameter 'tensor_names' of op 'SaveV2', got ['Variable', 'Variable/Adam', 'Variable/Adam_1', 'Variable_1', 'Variable_1/Adam', 'Variable_1/Adam_1', 'Variable_2', 'Variable_2/Adam', 'Variab
Why does this error occur?
The full error message is:
Traceback (most recent call last):
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 511, in _apply_op_helper
preferred_dtype=default_dtype)
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\ops.py", line 1175, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\constant_op.py", line 304, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\constant_op.py", line 245, in constant
allow_broadcast=True)
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\constant_op.py", line 283, in _constant_impl
allow_broadcast=allow_broadcast))
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\tensor_util.py", line 501, in make_tensor_proto
(dtype, nparray.dtype, values))
TypeError: Incompatible types: <dtype: 'string'> vs. object. Value is ['Variable', 'Variable/Adam', 'Variable/Adam_1', 'Variable_1', 'Variable_1/Adam', 'Variable_1/Adam_1', 'Variable_2', 'Variable_2/Adam', 'Variable_2/Adam_1', 'Variable_3', 'Variable_3/Adam', 'Variable_3/Adam_1', 'Variable_4', 'Variable_4/Adam', 'Variable_4/Adam_1', 'Variable_5', 'Variable_5/Adam', 'Variable_5/Adam_1', 'beta1_power', 'beta2_power', 'bn_0/beta', 'bn_0/beta/Adam', 'bn_0/beta/Adam_1', 'bn_0/gamma', 'bn_0/gamma/Adam', 'bn_0/gamma/Adam_1', 'bn_0/moving_mean', 'bn_0/moving_variance', 'bn_1/beta', 'bn_1/beta/Adam', 'bn_1/beta/Adam_1', 'bn_1/gamma', 'bn_1/gamma/Adam', 'bn_1/gamma/Adam_1', 'bn_1/moving_mean', 'bn_1/moving_variance', 'feature_bias', 'feature_bias/Adam', 'feature_bias/Adam_1', 'feature_embeddings', 'feature_embeddings/Adam', 'feature_embeddings/Adam_1']

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "E:/pycharm project/2/main.py", line 148, in
y_train_dfm, y_test_dfm = _run_base_model_dfm(dfTrain, dfTest, folds, dfm_params)
File "E:/pycharm project/2/main.py", line 68, in _run_base_model_dfm
dfm = DeepFM(**dfm_params)
File "E:\pycharm project\2\DeepFM.py", line 61, in init
self._init_graph()
File "E:\pycharm project\2\DeepFM.py", line 156, in _init_graph
self.saver = tf.train.Saver()
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\training\saver.py", line 832, in init
self.build()
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\training\saver.py", line 844, in build
self._build(self._filename, build_save=True, build_restore=True)
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\training\saver.py", line 881, in _build
build_save=build_save, build_restore=build_restore)
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\training\saver.py", line 510, in _build_internal
save_tensor = self._AddSaveOps(filename_tensor, saveables)
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\training\saver.py", line 210, in _AddSaveOps
save = self.save_op(filename_tensor, saveables)
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\training\saver.py", line 124, in save_op
tensors)
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\gen_io_ops.py", line 1920, in save_v2
name=name)
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 520, in _apply_op_helper
repr(values), type(values).name))
TypeError: Expected string passed to parameter 'tensor_names' of op 'SaveV2', got ['Variable', 'Variable/Adam', 'Variable/Adam_1', 'Variable_1', 'Variable_1/Adam', 'Variable_1/Adam_1', 'Variable_2', 'Variable_2/Adam', 'Variable_2/Adam_1', 'Variable_3', 'Variable_3/Adam', 'Variable_3/Adam_1', 'Variable_4', 'Variable_4/Adam', 'Variable_4/Adam_1', 'Variable_5', 'Variable_5/Adam', 'Variable_5/Adam_1', 'beta1_power', 'beta2_power', 'bn_0/beta', 'bn_0/beta/Adam', 'bn_0/beta/Adam_1', 'bn_0/gamma', 'bn_0/gamma/Adam', 'bn_0/gamma/Adam_1', 'bn_0/moving_mean', 'bn_0/moving_variance', 'bn_1/beta', 'bn_1/beta/Adam', 'bn_1/beta/Adam_1', 'bn_1/gamma', 'bn_1/gamma/Adam', 'bn_1/gamma/Adam_1', 'bn_1/moving_mean', 'bn_1/moving_variance', 'feature_bias', 'feature_bias/Adam', 'feature_bias/Adam_1', 'feature_embeddings', 'feature_embeddings/Adam', 'feature_embeddings/Adam_1'] of type 'list' instead.

Process finished with exit code 1

GPU utilization is very low

Hey, I am training the example on a V100 card, and GPU utilization tops out at about 12%. What could be the cause? Is the code written to utilize the GPU fully, or is modification required? Is it because of the small dataset? Please give suggestions.

A question about the vector representation of features

def _initialize_weights(self):
    weights = dict()
    # embeddings (FM component)
    weights["feature_embeddings"] = tf.Variable(
        tf.random_normal([self.feature_size, self.embedding_size], 0.0, 0.01),
        name="feature_embeddings")  # feature_size * K

The above is the weight-initialization method. But how can we directly initialize a matrix of feature representations, weights["feature_embeddings"]? Shouldn't the vector representing each feature be obtained from a corresponding embedding-layer network?
Also, why does weights["feature_embeddings"] change as training proceeds?
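
For context, this is roughly how that matrix is used later in the graph (a paraphrase of the lookup in DeepFM.py's _init_graph, with self. prefixes dropped): the matrix itself acts as the embedding layer, each row being one feature's vector; rows are selected by feature index and receive gradient updates like any other weight, which is why the matrix changes during training.

embeddings = tf.nn.embedding_lookup(weights["feature_embeddings"],
                                    feat_index)            # None * F * K
feat_value = tf.reshape(feat_value, shape=[-1, field_size, 1])
embeddings = tf.multiply(embeddings, feat_value)           # scale rows by feature values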

How to handle sequence features

Hello,
I have some sequence features, such as a user's purchase history and search history.
What is a good way to handle these features with DeepFM?

MAE loss doesn't decrease?

Not sure why, but when I use DeepFM for a regression problem, the MAE does not improve.
Example:

import numpy as np
from DeepFM import DeepFM
import tensorflow as tf
import pandas as pd
from sklearn.cross_validation import KFold
from sklearn.metrics import mean_absolute_error
from DataReader import FeatureDictionary, DataParser

train = pd.DataFrame(np.random.randn(100, 100).astype('float'))
y = pd.DataFrame(np.random.randn(100, 1).astype('float'))
train['loss'] = y
test = pd.DataFrame(np.random.randn(100, 100).astype('float'))
cols = [c for c in train.columns if 'loss' not in c]
fd = FeatureDictionary(dfTrain=train, dfTest=test, numeric_cols=cols, ignore_cols='loss')
data_parser = DataParser(feat_dict=fd)
Xi_train, Xv_train, y_train = data_parser.parse(df=train, has_label=True)
Xi_test, Xv_test = data_parser.parse(df=test)

dfm_params = {
    "use_fm": True,
    "use_deep": True,
    "embedding_size": 8,
    "dropout_fm": [1.0, 1.0],
    "deep_layers": [32, 32],
    "dropout_deep": [0.5, 0.5, 0.5],
    "deep_layers_activation": tf.nn.relu,
    "epoch": 30,
    "batch_size": 1024,
    "learning_rate": 0.001,
    "optimizer_type": "adam",
    "batch_norm": 1,
    "batch_norm_decay": 0.995,
    "l2_reg": 0.01,
    "verbose": True,
    "metric": mean_absolute_error,
    "random_seed": 42,
    "greater_is_better": False
}
dfm_params["feature_size"] = fd.feat_dim
dfm_params["field_size"] = len(Xi_train[0])

get = lambda x, l: [x[i] for i in l]
del train['loss']
train = train.values
y = y.values
y = y.reshape((-1))
folds = KFold(len(y), 8, shuffle=True, random_state=2016)
for i, (train_idx, valid_idx) in enumerate(folds):
    Xi_train_, Xv_train_, y_train_ = get(Xi_train, train_idx), get(Xv_train, train_idx), get(y_train, train_idx)
    Xi_valid_, Xv_valid_, y_valid_ = get(Xi_train, valid_idx), get(Xv_train, valid_idx), get(y_train, valid_idx)
    dfm = DeepFM(**dfm_params)
    dfm.fit(Xi_train_, Xv_train_, y_train_, Xi_valid_, Xv_valid_, y_valid_)

[1] train-result=0.9263, valid-result=0.9745 [0.0 s]
[2] train-result=0.9263, valid-result=0.9745 [0.0 s]
[3] train-result=0.9263, valid-result=0.9745 [0.0 s]
[4] train-result=0.9263, valid-result=0.9745 [0.0 s]
[5] train-result=0.9263, valid-result=0.9745 [0.0 s]
[6] train-result=0.9263, valid-result=0.9745 [0.0 s]
[7] train-result=0.9263, valid-result=0.9746 [0.0 s]
[8] train-result=0.9263, valid-result=0.9746 [0.0 s]
[9] train-result=0.9263, valid-result=0.9746 [0.0 s]
[10] train-result=0.9263, valid-result=0.9746 [0.0 s]
[11] train-result=0.9264, valid-result=0.9746 [0.0 s]
[12] train-result=0.9264, valid-result=0.9746 [0.0 s]
[13] train-result=0.9264, valid-result=0.9746 [0.0 s]
[14] train-result=0.9264, valid-result=0.9746 [0.0 s]
[15] train-result=0.9264, valid-result=0.9746 [0.0 s]
[16] train-result=0.9264, valid-result=0.9747 [0.0 s]
[17] train-result=0.9264, valid-result=0.9747 [0.0 s]
[18] train-result=0.9264, valid-result=0.9747 [0.0 s]
[19] train-result=0.9264, valid-result=0.9747 [0.0 s]
[20] train-result=0.9264, valid-result=0.9747 [0.0 s]
[21] train-result=0.9264, valid-result=0.9747 [0.0 s]
[22] train-result=0.9264, valid-result=0.9747 [0.0 s]
[23] train-result=0.9264, valid-result=0.9747 [0.0 s]
[24] train-result=0.9264, valid-result=0.9747 [0.0 s]
[25] train-result=0.9264, valid-result=0.9747 [0.0 s]
[26] train-result=0.9264, valid-result=0.9747 [0.0 s]
[27] train-result=0.9264, valid-result=0.9748 [0.0 s]
[28] train-result=0.9264, valid-result=0.9748 [0.0 s]
[29] train-result=0.9264, valid-result=0.9748 [0.0 s]
[30] train-result=0.9264, valid-result=0.9748 [0.0 s]

Model structure

Hi, my reading of your code: first_order is F-dimensional, second_order is K-dimensional, plus the output of the deep layers. These three are concatenated and then a weighted output is produced?
This seems different from the paper...

Should int/long features be put into NUMERIC_COLS?

Looking at the code, integer features seem to be handled the same way as categorical and binary features. Should integers be converted to float and then put into NUMERIC_COLS?
I tried putting integer features directly into NUMERIC_COLS, and the results looked wrong.

About line 89 of the code

self.y_first_order = tf.reduce_sum(tf.multiply(self.y_first_order, feat_value), 2) # None * F

Shouldn't the multiply here be add? In wx + b, feat_value is wx and y_first_order is b.

What do people think?

Why does predicting the same test set give different predict_prob?

After training a DeepFM model, I call predict() many times on the same test set, but the returned predict_prob differs slightly each time.
For example:

array([ 0.3498897 , 0.30687785, 0.34534037], dtype=float32) #first time call predict()

array([ 0.3498849 , 0.30688545, 0.34534937], dtype=float32) #second time call predict()

No module named yellowfin

(tensorflow) ➜  example git:(master) ✗ python main.py
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
Traceback (most recent call last):
  File "main.py", line 16, in <module>
    from DeepFM import DeepFM
  File "../DeepFM.py", line 15, in <module>
    from yellowfin import YFOptimizer

The input of the concat layer

Hello, a question.
Looking at the code, the concat layer concatenates the FM first-order vector, the FM second-order vector, and the last hidden layer of the DNN part, and then applies a fully connected layer and a sigmoid to produce the final output. But in the original paper it seems that the FM first-order part, the FM second-order part, and the DNN output are summed, and the sigmoid is applied to that sum.

Is there a performance advantage to implementing it this way?
Many thanks.

Reproducing the online score

Running this code I only get about 0.27 online. What adjustments should I make?

Field-aware?

Figure 4 in the DeepFM paper seems to indicate that the embeddings are "field-aware", but looking at the code it appears that field information is not used. Can you kindly confirm?

In DataParser's parse function, why dfv[col] = 1. ?

Hello, there is a piece of code I don't understand: for all non-NUMERIC_COLS features, why is dfv[col] = 1. ?

for col in dfi.columns:
    if col in self.feat_dict.ignore_cols:
        dfi.drop(col, axis=1, inplace=True)
        dfv.drop(col, axis=1, inplace=True)
        continue
    if col in self.feat_dict.numeric_cols:
        dfi[col] = self.feat_dict.feat_dict[col]
    else:
        dfi[col] = dfi[col].map(self.feat_dict.feat_dict[col])
        dfv[col] = 1.

Embedding of continuous features

Hi, looking at the implementation I have a question: a continuous feature is multiplied by a trainable K-dimensional embedding vector (one number decomposed into several). Doesn't this also introduce an information loss for continuous features?

Sorry, but I still have to complain

In DeepFM.py, feature_bias is actually the w of the first-order term wx + b, which explains line 89: "self.y_first_order = tf.reduce_sum(tf.multiply(self.y_first_order, feat_value), 2)". The b is concat_bias.

The name "feature_bias" is misleading.

Bug in the code

The code in the preprocessing is rather amusing:
cols = [c for c in dfTrain.columns if c not in ["id", "label"]]
cols = [c for c in cols if (not c in config.IGNORE_COLS)]
cols keeps the column names that are not in IGNORE_COLS, yet the later
cat_features_indices = [i for i,c in enumerate(cols) if c in config.CATEGORICAL_COLS]
tries to select from cols the columns that are in the ignore list, so it is bound to come out empty.

Is there a DeepFM implementation that supports sparse input?

sparse_index = tf.placeholder(tf.int64, [None, 2])
sparse_ids = tf.placeholder(tf.int64, [None])
sparse_values = tf.placeholder(tf.float32, [None])
sparse_shape = tf.placeholder(tf.int64, [2])
ids = tf.SparseTensor(sparse_index, sparse_ids, sparse_shape)
values = tf.SparseTensor(sparse_index, sparse_values, sparse_shape)

i.e., sparse input like this.

Error during training

I got the following error during training, and it is hard to tell where the problem is. Any advice is appreciated:

InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-187-236f2a6678c7> in <module>()
----> 1 dfm.fit(Xi,Xv,y_out)

/nfs/private/HM-PROJECT/DeepFM.py in fit(self, Xi_train, Xv_train, y_train, Xi_valid, Xv_valid, y_valid, early_stopping, refit)
    282             for i in range(total_batch):
    283                 Xi_batch, Xv_batch, y_batch = self.get_batch(Xi_train, Xv_train, y_train, self.batch_size, i)
--> 284                 self.fit_on_batch(Xi_batch, Xv_batch, y_batch)
    285 
    286             # evaluate training and validation datasets

/nfs/private/HM-PROJECT/DeepFM.py in fit_on_batch(self, Xi, Xv, y)
    254                      self.dropout_keep_deep: self.dropout_deep,
    255                      self.train_phase: True}
--> 256         loss, opt = self.sess.run((self.loss, self.optimizer), feed_dict=feed_dict)
    257         return loss
    258 

/home/luban/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
    898     try:
    899       result = self._run(None, fetches, feed_dict, options_ptr,
--> 900                          run_metadata_ptr)
    901       if run_metadata:
    902         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

/home/luban/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1133     if final_fetches or final_targets or (handle and feed_dict_tensor):
   1134       results = self._do_run(handle, final_targets, final_fetches,
-> 1135                              feed_dict_tensor, options, run_metadata)
   1136     else:
   1137       results = []

/home/luban/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
   1314     if handle is None:
   1315       return self._do_call(_run_fn, feeds, fetches, targets, options,
-> 1316                            run_metadata)
   1317     else:
   1318       return self._do_call(_prun_fn, handle, feeds, fetches)

/home/luban/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1333         except KeyError:
   1334           pass
-> 1335       raise type(e)(node_def, op, message)
   1336 
   1337   def _extend_graph(self):

InvalidArgumentError: slice index 3 of dimension 0 out of bounds.
	 [[Node: strided_slice_5 = StridedSlice[Index=DT_INT32, T=DT_FLOAT, begin_mask=0, ellipsis_mask=0, end_mask=0, new_axis_mask=0, shrink_axis_mask=1, _device="/job:localhost/replica:0/task:0/device:GPU:0"](_arg_dropout_keep_deep_0_0/_11, strided_slice_5/stack, strided_slice_5/stack_1, strided_slice/stack_1)]]

What is the difference between the three feature types, and how do I distinguish them?

Hi Chen,
First of all, thank you for open-sourcing this DeepFM code; it has helped me a great deal, and I plan to use it for model training and validation in my graduation project.
My idea is to apply the code to a new Kaggle competition.
In the code you provide there are the following three column categories. What is the difference between them, especially IGNORE_COLS: what kind of features should go there? Looking forward to your reply.

# types of columns of the dataset dataframe

CATEGORICAL_COLS = [
    # 'ps_ind_02_cat', 'ps_ind_04_cat', 'ps_ind_05_cat',
    # 'ps_car_01_cat', 'ps_car_02_cat', 'ps_car_03_cat',
    # 'ps_car_04_cat', 'ps_car_05_cat', 'ps_car_06_cat',
    # 'ps_car_07_cat', 'ps_car_08_cat', 'ps_car_09_cat',
    # 'ps_car_10_cat', 'ps_car_11_cat',
]

NUMERIC_COLS = [
    # # binary
    # "ps_ind_06_bin", "ps_ind_07_bin", "ps_ind_08_bin",
    # "ps_ind_09_bin", "ps_ind_10_bin", "ps_ind_11_bin",
    # "ps_ind_12_bin", "ps_ind_13_bin", "ps_ind_16_bin",
    # "ps_ind_17_bin", "ps_ind_18_bin",
    # "ps_calc_15_bin", "ps_calc_16_bin", "ps_calc_17_bin",
    # "ps_calc_18_bin", "ps_calc_19_bin", "ps_calc_20_bin",
    # numeric
    "ps_reg_01", "ps_reg_02", "ps_reg_03",
    "ps_car_12", "ps_car_13", "ps_car_14", "ps_car_15",

    # feature engineering
    "missing_feat", "ps_car_13_x_ps_reg_03",
]

IGNORE_COLS = [
    "id", "target",
    "ps_calc_01", "ps_calc_02", "ps_calc_03", "ps_calc_04",
    "ps_calc_05", "ps_calc_06", "ps_calc_07", "ps_calc_08",
    "ps_calc_09", "ps_calc_10", "ps_calc_11", "ps_calc_12",
    "ps_calc_13", "ps_calc_14",
    "ps_calc_15_bin", "ps_calc_16_bin", "ps_calc_17_bin",
    "ps_calc_18_bin", "ps_calc_19_bin", "ps_calc_20_bin"
]

How to handle pre-trained embeddings?

If I have pre-trained vectors that I want to use for training, but I don't want the model to fine-tune them end-to-end, what should I do?
It seems I need to modify the code myself, like this:

tf.get_variable(name="pre_trained_embedding",
                shape=(ids_num, embedding_dimension),
                initializer=tf.constant_initializer(preTrainedEmbeddingData),
                trainable=False)

Then concatenate this embedding result with the other inputs. Is this the right way to think about it?
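
For what it's worth, a minimal sketch of that idea applied to this model's weight dictionary (assuming preTrainedEmbeddingData is a numpy array of shape [feature_size, embedding_size]; this would replace the tf.Variable initialization in _initialize_weights, and is a sketch rather than the author's confirmed approach):

weights["feature_embeddings"] = tf.get_variable(
    name="feature_embeddings",
    shape=(self.feature_size, self.embedding_size),
    initializer=tf.constant_initializer(preTrainedEmbeddingData),
    trainable=False)  # trainable=False keeps the rows frozen during training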

How to export the model

I export the DeepFM model with the following code, but I find that freeze_model_dir/variables is empty. How should I export DeepFM as a pb model?

def freeze_model(self):
    freeze_model_dir = "freeze_model_dir"
    save_dir = 'checkpoints/'
    save_path = os.path.join(save_dir, 'best_validation')
    start_time = time()
    print(tf.trainable_variables())
    print("freeze model...")
    SIGNATURE_NAME = "serving_default"
    builder = tf.saved_model.builder.SavedModelBuilder(freeze_model_dir)
    inputs = {'feat_index': tf.saved_model.utils.build_tensor_info(self.feat_index),
              'feat_value': tf.saved_model.utils.build_tensor_info(self.feat_value),
              'dropout_keep_fm': tf.saved_model.utils.build_tensor_info(self.dropout_keep_fm),
              'dropput_keep_deep': tf.saved_model.utils.build_tensor_info(self.dropout_keep_deep),
              'train_phase': tf.saved_model.utils.build_tensor_info(self.train_phase)}
    outputs = {'y_pred': tf.saved_model.utils.build_tensor_info(self.y_pred)}
    builder.add_meta_graph_and_variables(
        self.sess, [tf.saved_model.tag_constants.SERVING],
        signature_def_map={
            tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
                tf.saved_model.signature_def_utils.build_signature_def(
                    inputs, outputs,
                    tf.saved_model.signature_constants.PREDICT_METHOD_NAME)
        },
        main_op=tf.tables_initializer(),
        strip_default_attrs=True)
    builder.save()

The AUC Result

May I know why the AUC score on the Kaggle dataset, and on my own dataset, is so strange that it never goes above 0.5?
Thanks a lot~

from sklearn.metrics import roc_auc_score

[1] train-result=0.0130, valid-result=0.0551 [33.9 s]
[2] train-result=0.2280, valid-result=0.2518 [35.6 s]
[3] train-result=0.2539, valid-result=0.2593 [33.7 s]
[4] train-result=0.2590, valid-result=0.2629 [34.0 s]
[5] train-result=0.2559, valid-result=0.2561 [36.0 s]
[6] train-result=0.2588, valid-result=0.2594 [34.4 s]
[7] train-result=0.2579, valid-result=0.2587 [33.2 s]
[8] train-result=0.2640, valid-result=0.2626 [33.6 s]
[9] train-result=0.2671, valid-result=0.2638 [34.8 s]
[10] train-result=0.2619, valid-result=0.2601 [35.0 s]
[11] train-result=0.2667, valid-result=0.2609 [36.3 s]
[12] train-result=0.2664, valid-result=0.2627 [37.6 s]
[13] train-result=0.2687, valid-result=0.2650 [34.1 s]
[14] train-result=0.2724, valid-result=0.2692 [33.9 s]
[15] train-result=0.2689, valid-result=0.2663 [34.0 s]
[16] train-result=0.2710, valid-result=0.2671 [33.2 s]
[17] train-result=0.2677, valid-result=0.2655 [33.1 s]
[18] train-result=0.2694, valid-result=0.2674 [34.5 s]

Can someone explain these parameters?

dfm_params = {
    "use_fm": True,                        # include the FM component
    "use_deep": True,                      # include the deep (DNN) component
    "embedding_size": 8,                   # size K of each feature embedding
    "dropout_fm": [1.0, 1.0],              # keep probabilities for the FM part
    "deep_layers": [32, 32],               # hidden-layer sizes of the deep part
    "dropout_deep": [0.5, 0.5, 0.5],       # keep probabilities (input layer + each hidden layer)
    "deep_layers_activation": tf.nn.relu,  # activation function of the deep layers
    "epoch": 30,                           # number of training epochs
    "batch_size": 1024,                    # minibatch size
    "learning_rate": 0.001,
    "optimizer_type": "adam",
    "batch_norm": 1,                       # whether to apply batch normalization
    "batch_norm_decay": 0.995,             # decay of the batch-norm moving averages
    "l2_reg": 0.01,                        # L2 regularization strength
    "verbose": True,                       # print per-epoch evaluation results
    "eval_metric": gini_norm,              # metric for the per-epoch train/valid results
    "random_seed": config.RANDOM_SEED
}

I'm a beginner; I'd appreciate an explanation that doesn't go too deep into the low-level details.

Input Dimension Error

Hello,

In your paper it is written that "the lengths of different input field vectors can be different", which means the input data Xi can contain lists of different lengths. I tried to run the code with such inputs, but it gave the following error: "ValueError: setting an array element with a sequence.". Can you please suggest what to do?

Thanks in advance

Shouldn't the initial input of the deep component be the raw embedding result?

self.y_deep = tf.reshape(self.embeddings, shape=[-1, self.field_size * self.embedding_size]) # None * (F*K)

This is the very first input of the deep component. But at this point the embeddings are no longer the raw result of the embedding lookup: they have already been multiplied by the input x, presumably to make it convenient to compute the FM second-order interaction part.

The paper does not say what the deep component's input is, but from the architecture figure it looks like it should be the raw embedding result.

Hoping for an answer, many thanks~

Convergence is extremely slow

About 40 features, 20+ of them numeric; 5 million training samples and 1 million test samples in total.
One epoch takes about 200 seconds. I stopped the run at epoch 126 because the speed was unbearable. As you can see it does keep converging, just very slowly. Any suggestions? I have already tried various optimizers, learning rates, and batch sizes.
feature_size: 533
field_size: 38
#params: 15692
[1] train.csv-result=0.2512, valid-result=0.2515 [202.4 s]
[2] train.csv-result=0.2572, valid-result=0.2561 [190.0 s]
[3] train.csv-result=0.2578, valid-result=0.2568 [200.4 s]
[4] train.csv-result=0.2598, valid-result=0.2583 [188.3 s]
[5] train.csv-result=0.2582, valid-result=0.2569 [191.0 s]
[6] train.csv-result=0.2611, valid-result=0.2598 [201.3 s]
[7] train.csv-result=0.2594, valid-result=0.2578 [190.5 s]
[8] train.csv-result=0.2662, valid-result=0.2648 [182.1 s]
[9] train.csv-result=0.2639, valid-result=0.2623 [192.6 s]
[10] train.csv-result=0.2655, valid-result=0.2639 [190.7 s]
[11] train.csv-result=0.2671, valid-result=0.2654 [201.4 s]
[12] train.csv-result=0.2662, valid-result=0.2644 [190.8 s]
[13] train.csv-result=0.2674, valid-result=0.2653 [191.0 s]
[14] train.csv-result=0.2686, valid-result=0.2665 [200.5 s]
[15] train.csv-result=0.2678, valid-result=0.2657 [189.6 s]
[16] train.csv-result=0.2691, valid-result=0.2664 [200.2 s]
[17] train.csv-result=0.2704, valid-result=0.2676 [183.4 s]
[18] train.csv-result=0.2717, valid-result=0.2692 [173.5 s]
[19] train.csv-result=0.2721, valid-result=0.2688 [183.9 s]
[20] train.csv-result=0.2744, valid-result=0.2709 [173.3 s]
[21] train.csv-result=0.2757, valid-result=0.2718 [182.0 s]
[22] train.csv-result=0.2778, valid-result=0.2734 [173.0 s]
[23] train.csv-result=0.2803, valid-result=0.2753 [173.0 s]
[24] train.csv-result=0.2806, valid-result=0.2747 [183.5 s]
[25] train.csv-result=0.2852, valid-result=0.2792 [173.4 s]
[26] train.csv-result=0.2882, valid-result=0.2813 [184.0 s]
[27] train.csv-result=0.2890, valid-result=0.2818 [174.0 s]
[28] train.csv-result=0.2909, valid-result=0.2834 [174.2 s]
[29] train.csv-result=0.2921, valid-result=0.2841 [184.5 s]
[30] train.csv-result=0.2940, valid-result=0.2860 [174.8 s]
[31] train.csv-result=0.2945, valid-result=0.2862 [186.5 s]
[32] train.csv-result=0.2946, valid-result=0.2859 [191.1 s]
[33] train.csv-result=0.2966, valid-result=0.2879 [190.2 s]
[34] train.csv-result=0.2976, valid-result=0.2888 [200.9 s]
[35] train.csv-result=0.2984, valid-result=0.2892 [190.7 s]
[36] train.csv-result=0.2979, valid-result=0.2885 [201.8 s]
[37] train.csv-result=0.2987, valid-result=0.2896 [191.1 s]
[38] train.csv-result=0.2997, valid-result=0.2899 [191.4 s]
[39] train.csv-result=0.3006, valid-result=0.2910 [201.6 s]
[40] train.csv-result=0.3011, valid-result=0.2914 [191.4 s]
[41] train.csv-result=0.3006, valid-result=0.2902 [201.5 s]
[42] train.csv-result=0.3018, valid-result=0.2919 [190.5 s]
[43] train.csv-result=0.3022, valid-result=0.2912 [189.5 s]
[44] train.csv-result=0.3029, valid-result=0.2926 [200.5 s]
[45] train.csv-result=0.3027, valid-result=0.2923 [190.9 s]
[46] train.csv-result=0.3031, valid-result=0.2922 [200.6 s]
[47] train.csv-result=0.3037, valid-result=0.2930 [190.0 s]
[48] train.csv-result=0.3037, valid-result=0.2929 [190.1 s]
[49] train.csv-result=0.3043, valid-result=0.2929 [200.6 s]
[50] train.csv-result=0.3039, valid-result=0.2932 [190.7 s]
[51] train.csv-result=0.3048, valid-result=0.2939 [200.9 s]
[52] train.csv-result=0.3046, valid-result=0.2926 [190.3 s]
[53] train.csv-result=0.3050, valid-result=0.2936 [189.6 s]
[54] train.csv-result=0.3057, valid-result=0.2940 [200.4 s]
[55] train.csv-result=0.3055, valid-result=0.2944 [190.5 s]
[56] train.csv-result=0.3058, valid-result=0.2943 [201.2 s]
[57] train.csv-result=0.3062, valid-result=0.2948 [190.5 s]
[58] train.csv-result=0.3057, valid-result=0.2948 [191.1 s]
[59] train.csv-result=0.3064, valid-result=0.2949 [201.3 s]
[60] train.csv-result=0.3065, valid-result=0.2944 [190.6 s]
[61] train.csv-result=0.3072, valid-result=0.2955 [201.2 s]
[62] train.csv-result=0.3072, valid-result=0.2954 [190.0 s]
[63] train.csv-result=0.3072, valid-result=0.2955 [189.9 s]
[64] train.csv-result=0.3067, valid-result=0.2944 [200.6 s]
[65] train.csv-result=0.3076, valid-result=0.2958 [190.4 s]
[66] train.csv-result=0.3080, valid-result=0.2964 [200.1 s]
[67] train.csv-result=0.3083, valid-result=0.2960 [190.9 s]
[68] train.csv-result=0.3082, valid-result=0.2963 [190.8 s]
[69] train.csv-result=0.3083, valid-result=0.2957 [201.4 s]
[70] train.csv-result=0.3087, valid-result=0.2964 [190.5 s]
[71] train.csv-result=0.3083, valid-result=0.2966 [201.2 s]
[72] train.csv-result=0.3082, valid-result=0.2963 [190.7 s]
[73] train.csv-result=0.3089, valid-result=0.2966 [191.1 s]
[74] train.csv-result=0.3085, valid-result=0.2957 [201.2 s]
[75] train.csv-result=0.3088, valid-result=0.2971 [190.2 s]
[76] train.csv-result=0.3093, valid-result=0.2974 [190.8 s]
[77] train.csv-result=0.3094, valid-result=0.2973 [201.0 s]
[78] train.csv-result=0.3096, valid-result=0.2971 [190.8 s]
[79] train.csv-result=0.3089, valid-result=0.2969 [201.3 s]
[80] train.csv-result=0.3099, valid-result=0.2972 [190.7 s]
[81] train.csv-result=0.3097, valid-result=0.2969 [190.8 s]
[82] train.csv-result=0.3100, valid-result=0.2974 [200.7 s]
[83] train.csv-result=0.3096, valid-result=0.2970 [190.8 s]
[84] train.csv-result=0.3104, valid-result=0.2979 [200.7 s]
[85] train.csv-result=0.3095, valid-result=0.2974 [190.5 s]
[86] train.csv-result=0.3091, valid-result=0.2967 [190.0 s]
[87] train.csv-result=0.3104, valid-result=0.2982 [201.1 s]
[88] train.csv-result=0.3101, valid-result=0.2967 [190.3 s]
[89] train.csv-result=0.3109, valid-result=0.2981 [201.1 s]
[90] train.csv-result=0.3102, valid-result=0.2977 [190.4 s]
[91] train.csv-result=0.3112, valid-result=0.2982 [190.7 s]
[92] train.csv-result=0.3110, valid-result=0.2980 [200.6 s]
[93] train.csv-result=0.3111, valid-result=0.2980 [191.0 s]
[94] train.csv-result=0.3117, valid-result=0.2985 [201.1 s]
[95] train.csv-result=0.3108, valid-result=0.2977 [191.0 s]
[96] train.csv-result=0.3117, valid-result=0.2989 [190.2 s]
[97] train.csv-result=0.3114, valid-result=0.2982 [201.1 s]
[98] train.csv-result=0.3111, valid-result=0.2985 [190.8 s]
[99] train.csv-result=0.3116, valid-result=0.2980 [201.1 s]
[100] train.csv-result=0.3120, valid-result=0.2984 [191.0 s]
[101] train.csv-result=0.3122, valid-result=0.2984 [191.1 s]
[102] train.csv-result=0.3120, valid-result=0.2985 [201.9 s]
[103] train.csv-result=0.3124, valid-result=0.2992 [191.0 s]
[104] train.csv-result=0.3120, valid-result=0.2989 [200.3 s]
[105] train.csv-result=0.3114, valid-result=0.2980 [191.7 s]
[106] train.csv-result=0.3109, valid-result=0.2974 [189.1 s]
[107] train.csv-result=0.3121, valid-result=0.2986 [195.3 s]
[108] train.csv-result=0.3123, valid-result=0.2991 [184.6 s]
[109] train.csv-result=0.3122, valid-result=0.2993 [194.1 s]
[110] train.csv-result=0.3127, valid-result=0.2995 [184.1 s]
[111] train.csv-result=0.3126, valid-result=0.2986 [184.3 s]
[112] train.csv-result=0.3128, valid-result=0.2995 [192.9 s]
[113] train.csv-result=0.3127, valid-result=0.2986 [183.8 s]
[114] train.csv-result=0.3129, valid-result=0.2990 [194.0 s]
[115] train.csv-result=0.3133, valid-result=0.2994 [184.0 s]
[116] train.csv-result=0.3127, valid-result=0.2989 [183.0 s]
[117] train.csv-result=0.3131, valid-result=0.2997 [193.4 s]
[118] train.csv-result=0.3128, valid-result=0.2991 [185.9 s]
[119] train.csv-result=0.3102, valid-result=0.2969 [194.4 s]
[120] train.csv-result=0.3131, valid-result=0.2997 [184.5 s]
[121] train.csv-result=0.3132, valid-result=0.2988 [177.8 s]
[122] train.csv-result=0.3136, valid-result=0.2995 [191.2 s]
[123] train.csv-result=0.3136, valid-result=0.2995 [180.3 s]
[124] train.csv-result=0.3139, valid-result=0.3001 [191.6 s]
[125] train.csv-result=0.3136, valid-result=0.3000 [182.3 s]
[126] train.csv-result=0.3139, valid-result=0.3001 [179.6 s]

Can the input be text features?

Hello, I would like to ask: can the input to the DeepFM model be user and item features represented by review text?

ImportError: No module named 'yellowfin'


/usr/local/lib/python3.5/dist-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
Traceback (most recent call last):
File "main.py", line 16, in
from DeepFM import DeepFM
File "../DeepFM.py", line 15, in
from yellowfin import YFOptimizer
ImportError: No module named 'yellowfin'
Hello, I get this error with TensorFlow 1.0, and even after upgrading TF to the latest version the problem persists. How can I solve it?

About line 107 of the code

Line 107 of DeepFM.py is: self.y_deep = tf.nn.dropout(self.y_deep, self.dropout_keep_deep[0]). In my experiments, deleting this line improved the results considerably. After consulting some references, I found that dropout is usually applied to the intermediate nodes of the deep part, whereas this line applies it to the input nodes.

You may want to experiment with this yourself to verify.

Best wishes~
