PaddlePaddle / PaddleRec

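Recommendation Algorithm: a large-scale recommendation algorithm library covering classic and recent recommender-system algorithms, including LR, Wide&Deep, DSSM, TDM, MIND, Word2Vec, Bert4Rec, DeepWalk, SSR, AITM, DSIN, SIGN, IPREC, GRU4Rec, Youtube_dnn, NCF, GNN, FM, FFM, DeepFM, DCN, DIN, DIEN, DLRM, MMOE, PLE, ESMM, ESCMM, MAML, xDeepFM, DeepFEFM, NFM, AFM, RALM, DMR, GateNet, NAML, DIFM, Deep Crossing, PNN, BST, AutoInt, FGCNN, FLEN, Fibinet, ListWise, DeepRec, ENSFM, TiSAS, and AutoFIS, along with classic recommender-system datasets such as Criteo and MovieLens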

Home Page: https://paddlerec.readthedocs.io/

License: Apache License 2.0

Python 84.93% Shell 4.28% C++ 9.85% Go 0.17% Java 0.77%

ple deepfm mmoe word2vec gru4rec tdm esmm widedeep lr

paddlerec's Introduction

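What is a recommender system?

In an era of explosive growth of information on the internet, recommender systems are the key to helping users efficiently find information that interests them.
They are also a silver bullet for helping products attract and retain users, increase user stickiness, and raise conversion rates.
Countless successful products have built a good reputation on recommender systems that users can perceive, and countless companies have secured a place in their industry with recommender systems that address users' pain points directly.

It is fair to say that whoever masters and makes good use of recommender systems gains a head start in the fierce competition over information distribution. At the same time, many problems trouble recommender-system developers: massive data volumes, complex model structures, inefficient distributed training environments, unstable online/offline consistency, demanding deployment requirements, and many more.

What is PaddleRec?

A one-stop, out-of-the-box toolkit for search and recommendation models, originating from the PaddlePaddle ecosystem
An end-to-end recommender-system solution for beginners, developers, and researchers
A complete recommendation and search algorithm library covering content understanding, matching, recall, ranking, multi-task learning, re-ranking, and other tasks. See the list of supported models

Quick start

Run online

AI Studio online example

Requirements

Python 2.7.15 / 3.5 / 3.6 / 3.7; Python 3.7 is recommended, and python in the examples refers to Python 3.7 by default
PaddlePaddle >=2.0
Operating system: Windows/Mac/Linux

On Windows, PaddleRec currently supports single-machine training only; a Linux environment is recommended for distributed training

Install Paddle

pip installation for GPU environments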

python -m pip install paddlepaddle-gpu==2.0.0

pip installation for CPU environments

python -m pip install paddlepaddle # gcc8

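For other versions, see the download and installation page on the official Paddle website

Download PaddleRec

Note: the officially maintained GitHub repository is at: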
https://github.com/PaddlePaddle/PaddleRec

git clone https://github.com/PaddlePaddle/PaddleRec/
cd PaddleRec

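Quick run

We use the dnn model from the ranking models as an example to show PaddleRec's one-command launch. The training data comes from the Criteo dataset, from which we took a sample of 100 records:

python -u tools/trainer.py -m models/rank/dnn/config.yaml # dynamic-graph training
python -u tools/static_trainer.py -m models/rank/dnn/config.yaml # static-graph training

Documentation

Project background

Getting-started tutorial

Advanced tutorial

FAQ

Frequently asked questions (FAQ)

Acknowledgements

List of external contributors

Supported models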

Category	Model	Online environment	Distributed CPU	Distributed GPU	Supported version	Paper
Content understanding	TextCnn(doc)	Python CPU/GPU	✓	x	>=2.1.0	[EMNLP 2014]Convolutional neural networks for sentence classification
Content understanding	TagSpace(doc)	Python CPU/GPU	✓	x	>=2.1.0	[EMNLP 2014]TagSpace: Semantic Embeddings from Hashtags
Matching	DSSM(doc)	Python CPU/GPU	✓	x	>=2.1.0	[CIKM 2013]Learning Deep Structured Semantic Models for Web Search using Clickthrough Data
Matching	Match-Pyramid(doc)	Python CPU/GPU	✓	x	>=2.1.0	[AAAI 2016]Text Matching as Image Recognition
Matching	MultiView-Simnet(doc)	Python CPU/GPU	✓	x	>=2.1.0	[WWW 2015]A Multi-View Deep Learning Approach for Cross Domain User Modeling in Recommendation Systems
Matching	KIM(doc)	-	x	x	>=2.1.0	[SIGIR 2021]Personalized News Recommendation with Knowledge-aware Interactive Matching
Recall	TDM	-	✓	>=1.8.0	1.8.5	[KDD 2018]Learning Tree-based Deep Model for Recommender Systems
Recall	FastText	-	x	x	1.8.5	[EACL 2017]Bag of Tricks for Efficient Text Classification
Recall	MIND(doc)	Python CPU/GPU	x	x	>=2.1.0	[2019]Multi-Interest Network with Dynamic Routing for Recommendation at Tmall
Recall	Word2Vec(doc)	Python CPU/GPU	✓	x	>=2.1.0	[NIPS 2013]Distributed Representations of Words and Phrases and their Compositionality
Recall	DeepWalk(doc)	Python CPU/GPU	x	x	>=2.1.0	[SIGKDD 2014]DeepWalk: Online Learning of Social Representations
Recall	SSR	-	✓	✓	1.8.5	[SIGIR 2016]Multi-Rate Deep Learning for Temporal Recommendation
Recall	Gru4Rec(doc)	-	✓	✓	1.8.5	[2015]Session-based Recommendations with Recurrent Neural Networks
Recall	Youtube_dnn	-	✓	✓	1.8.5	[RecSys 2016]Deep Neural Networks for YouTube Recommendations
Recall	NCF(doc)	Python CPU/GPU	✓	✓	>=2.1.0	[WWW 2017]Neural Collaborative Filtering
Recall	TiSAS	-	✓	✓	>=2.1.0	[WSDM 2020]Time Interval Aware Self-Attention for Sequential Recommendation
Recall	ENSFM	-	✓	✓	>=2.1.0	[IW3C2 2020]Efficient Non-Sampling Factorization Machines for Optimal Context-Aware Recommendation
Recall	MHCN	-	✓	✓	>=2.1.0	[WWW 2021]Self-Supervised Multi-Channel Hypergraph Convolutional Network for Social Recommendation
Recall	GNN	-	✓	✓	1.8.5	[AAAI 2019]Session-based Recommendation with Graph Neural Networks
Recall	RALM	-	✓	✓	1.8.5	[KDD 2019]Real-time Attention Based Look-alike Model for Recommender System
Ranking	Logistic Regression(doc)	Python CPU/GPU	✓	x	>=2.1.0	/
Ranking	Dnn(doc)	Python CPU/GPU	✓	✓	>=2.1.0	/
Ranking	FM(doc)	Python CPU/GPU	✓	x	>=2.1.0	[IEEE Data Mining 2010]Factorization machines
Ranking	BERT4REC	-	✓	x	>=2.1.0	[CIKM 2019]BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer
Ranking	FAT_DeepFFM	-	✓	x	>=2.1.0	[2019]FAT-DeepFFM: Field Attentive Deep Field-aware Factorization Machine
Ranking	FFM(doc)	Python CPU/GPU	✓	x	>=2.1.0	[RECSYS 2016]Field-aware Factorization Machines for CTR Prediction
Ranking	FNN	-	✓	x	1.8.5	[ECIR 2016]Deep Learning over Multi-field Categorical Data
Ranking	Deep Crossing	-	✓	x	1.8.5	[ACM 2016]Deep Crossing: Web-Scale Modeling without Manually Crafted Combinatorial Features
Ranking	Pnn	-	✓	x	1.8.5	[ICDM 2016]Product-based Neural Networks for User Response Prediction
Ranking	DCN(doc)	Python CPU/GPU	✓	x	>=2.1.0	[KDD 2017]Deep & Cross Network for Ad Click Predictions
Ranking	NFM	-	✓	x	1.8.5	[SIGIR 2017]Neural Factorization Machines for Sparse Predictive Analytics
Ranking	AFM	-	✓	x	1.8.5	[IJCAI 2017]Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks
Ranking	DMR(doc)	Python CPU/GPU	x	x	>=2.1.0	[AAAI 2020]Deep Match to Rank Model for Personalized Click-Through Rate Prediction
Ranking	DeepFM(doc)	Python CPU/GPU	✓	x	>=2.1.0	[IJCAI 2017]DeepFM: A Factorization-Machine based Neural Network for CTR Prediction
Ranking	xDeepFM(doc)	Python CPU/GPU	✓	x	>=2.1.0	[KDD 2018]xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems
Ranking	DIN(doc)	Python CPU/GPU	✓	x	>=2.1.0	[KDD 2018]Deep Interest Network for Click-Through Rate Prediction
Ranking	DIEN(doc)	Python CPU/GPU	✓	x	>=2.1.0	[AAAI 2019]Deep Interest Evolution Network for Click-Through Rate Prediction
Ranking	GateNet(doc)	Python CPU/GPU	✓	x	>=2.1.0	[SIGIR 2020]GateNet: Gating-Enhanced Deep Network for Click-Through Rate Prediction
Ranking	DLRM(doc)	Python CPU/GPU	✓	x	>=2.1.0	[CoRR 2019]Deep Learning Recommendation Model for Personalization and Recommendation Systems
Ranking	NAML(doc)	Python CPU/GPU	✓	x	>=2.1.0	[IJCAI 2019]Neural News Recommendation with Attentive Multi-View Learning
Ranking	DIFM(doc)	Python CPU/GPU	✓	x	>=2.1.0	[IJCAI 2020]A Dual Input-aware Factorization Machine for CTR Prediction
Ranking	DeepFEFM(doc)	Python CPU/GPU	✓	x	>=2.1.0	[arXiv 2020]Field-Embedded Factorization Machines for Click-through rate prediction
Ranking	BST(doc)	Python CPU/GPU	✓	x	>=2.1.0	[DLP_KDD 2019]Behavior Sequence Transformer for E-commerce Recommendation in Alibaba
Ranking	AutoInt	-	✓	x	>=2.1.0	[CIKM 2019]AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks
Ranking	Wide&Deep(doc)	Python CPU/GPU	✓	x	>=2.1.0	[DLRS 2016]Wide & Deep Learning for Recommender Systems
Ranking	Fibinet	-	✓	✓	1.8.5	[RecSys19]FiBiNET: Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction
Ranking	FLEN	-	✓	✓	>=2.1.0	[2019]FLEN: Leveraging Field for Scalable CTR Prediction
Ranking	DeepRec	-	✓	✓	>=2.1.0	[2017]Training Deep AutoEncoders for Collaborative Filtering
Ranking	AutoFIS	-	✓	✓	>=2.1.0	[KDD 2020]AutoFIS: Automatic Feature Interaction Selection in Factorization Models for Click-Through Rate Prediction
Ranking	DCN_V2	-	✓	✓	>=2.1.0	[WWW 2021]DCN V2: Improved Deep & Cross Network and Practical Lessons for Web-scale Learning to Rank Systems
Ranking	DSIN	-	✓	✓	>=2.1.0	[IJCAI 2019]Deep Session Interest Network for Click-Through Rate Prediction
Ranking	SIGN(doc)	Python CPU/GPU	✓	✓	>=2.1.0	[AAAI 2021]Detecting Beneficial Feature Interactions for Recommender Systems
Ranking	IPRec(doc)	-	✓	✓	>=2.1.0	[SIGIR 2021]Package Recommendation with Intra- and Inter-Package Attention Networks
Ranking	FGCNN	-	✓	✓	>=2.1.0	[WWW 2019]Feature Generation by Convolutional Neural Network for Click-Through Rate Prediction
Ranking	DPIN(doc)	Python CPU/GPU	✓	✓	>=2.1.0	[SIGIR 2021]Deep Position-wise Interaction Network for CTR Prediction
Multi-task	AITM	-	✓	✓	>=2.1.0	[KDD 2021]Modeling the Sequential Dependence among Audience Multi-step Conversions with Multi-task Learning in Targeted Display Advertising
Multi-task	PLE(doc)	Python CPU/GPU	✓	✓	>=2.1.0	[RecSys 2020]Progressive Layered Extraction (PLE): A Novel Multi-Task Learning (MTL) Model for Personalized Recommendations
Multi-task	ESMM(doc)	Python CPU/GPU	✓	✓	>=2.1.0	[SIGIR 2018]Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate
Multi-task	MMOE(doc)	Python CPU/GPU	✓	✓	>=2.1.0	[KDD 2018]Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts
Multi-task	ShareBottom(doc)	Python CPU/GPU	✓	✓	>=2.1.0	[1998]Multitask learning
Multi-task	Maml(doc)	Python CPU/GPU	x	x	>=2.1.0	[PMLR 2017]Model-agnostic meta-learning for fast adaptation of deep networks
Multi-task	DSelect_K(doc)	-	x	x	>=2.1.0	[NeurIPS 2021]DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning
Multi-task	ESCM2	-	x	x	>=2.1.0	[SIGIR 2022]ESCM2: Entire Space Counterfactual Multi-Task Model for Post-Click Conversion Rate Estimation
Multi-task	MetaHeac	-	x	x	>=2.1.0	[KDD 2021]Learning to Expand Audience via Meta Hybrid Experts and Critics for Recommendation and Advertising
Re-ranking	Listwise	-	✓	x	1.8.5	[2019]Sequential Evaluation and Generation Framework for Combinatorial Recommender System

Community

Release history

2022.06.20 - PaddleRec v2.3.0
2021.11.19 - PaddleRec v2.2.0
2021.05.19 - PaddleRec v2.1.0
2021.01.29 - PaddleRec v2.0.0
2020.10.12 - PaddleRec v1.8.5
2020.06.17 - PaddleRec v0.1.0
2020.06.03 - PaddleRec v0.0.2
2020.05.14 - PaddleRec v0.0.1

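License

This project is released under the Apache 2.0 license.

Contact us

For comments, suggestions, or bugs encountered in use, please submit a GitHub Issue

You can also reach us through the following channels:

QQ group number: 861717190
WeChat assistant ID: wxid_0xksppzk5p7f22
Add the note REC to be added to the group automatically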

paddlerec's Issues

In the models/recall recall tasks under Python 3.6, only w2v runs without errors; ssr, youtube_dnn, gru4rec, gnn, and ncf all fail with the error below

Code to reproduce: https://github.com/PaddlePaddle/PaddleRec/tree/master/models/recall

Run each of the following commands:
python -m paddlerec.run -m paddlerec.models.recall.ssr # ssr
python -m paddlerec.run -m paddlerec.models.recall.gru4rec # gru4rec
python -m paddlerec.run -m paddlerec.models.recall.gnn # gnn
python -m paddlerec.run -m paddlerec.models.recall.ncf # ncf
python -m paddlerec.run -m paddlerec.models.recall.youtube_dnn # youtube_dnn

local_cluster (locally simulated distributed training) is not compatible with py3

The server side starts normally; the worker side fails with the following error:

I0612 09:03:26.418155 3451 communicator.h:252] AsyncCommunicator Initialized
Traceback (most recent call last):
File "/opt/_internal/cpython-3.6.0/lib/python3.6/site-packages/paddle_rec-0.0.2-py3.6.egg/paddlerec/core/trainer.py", line 246, in run
self.context_process(self._context)
File "/opt/_internal/cpython-3.6.0/lib/python3.6/site-packages/paddle_rec-0.0.2-py3.6.egg/paddlerec/core/trainer.py", line 207, in context_process
self._status_processorcontext['status']
File "/opt/_internal/cpython-3.6.0/lib/python3.6/site-packages/paddle_rec-0.0.2-py3.6.egg/paddlerec/core/trainers/general_trainer.py", line 95, in network
network_class.build_network(context)
File "/opt/_internal/cpython-3.6.0/lib/python3.6/site-packages/paddle_rec-0.0.2-py3.6.egg/paddlerec/core/trainers/framework/network.py", line 165, in build_network
dataset["name"], context)
File "/opt/_internal/cpython-3.6.0/lib/python3.6/site-packages/paddle_rec-0.0.2-py3.6.egg/paddlerec/core/trainers/framework/dataset.py", line 90, in create_dataset
return self._get_dataset(dataset_name, context)
File "/opt/_internal/cpython-3.6.0/lib/python3.6/site-packages/paddle_rec-0.0.2-py3.6.egg/paddlerec/core/trainers/framework/dataset.py", line 126, in _get_dataset
file_list = context["fleet"].split_files(file_list)
File "/opt/_internal/cpython-3.6.0/lib/python3.6/site-packages/paddle/fluid/incubate/fleet/base/fleet_base.py", line 179, in split_files
trainer_files[i] = files[begin:begin + blocks[i]]
TypeError: slice indices must be integers or None or have an index method
Catch Exception:slice indices must be integers or None or have an index method

PaddleRec Error Message Summary:

Exit PaddleRec. catch exception in precoss status: [network_pass], except: slice indices must be integers or None or have an index method
TypeError
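
The traceback above ends in a slice whose bounds are not integers, which is what Python 3 produces when a block size is computed with `/` (true division) instead of `//`. The following is a minimal sketch of file sharding with integer division; the function name and signature are hypothetical and are not Paddle's actual `split_files` implementation.

```python
def split_files(files, trainer_id, trainers):
    """Return the shard of `files` assigned to `trainer_id` out of `trainers` workers."""
    remainder = len(files) % trainers
    # `//` keeps the block size an int on Python 3; `/` would give a float
    # and make the slice below raise TypeError.
    blocksize = len(files) // trainers
    # The first `remainder` workers each take one extra file.
    blocks = [blocksize + 1 if i < remainder else blocksize for i in range(trainers)]
    begin = sum(blocks[:trainer_id])
    return files[begin:begin + blocks[trainer_id]]


shard0 = split_files(["a", "b", "c", "d", "e"], trainer_id=0, trainers=2)
shard1 = split_files(["a", "b", "c", "d", "e"], trainer_id=1, trainers=2)
```

With five files and two workers, the first worker receives three files and the second receives two.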

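[User question] Saving by number of batches

Saving by number of batches
Only DataLoader can save by batch
PaddleRec does not yet support saving by batch

(Suggestion) When DataLoader is used, could a check be added to suppress the "QueueDataset can not support PY3, change to DataLoader" message?

Several problems encountered when submitting distributed jobs

1: The documentation only gives a submit demo for MPI_CPU; there is no submit demo for K8S_CPU or K8S_GPU.

2: pip install paddlepaddle-gpu==1.7.2 --index-url=http://pip.baidu.com/pypi/simple --trusted-host pip.baidu.com (mpi_cpu mode does not need gpu)

3: There is no explanation of how to run py3 on paddlecloud. There are two ways: (1) add a run.sh that sets PATH and runs python -m paddlerec.run -m config.yaml; (2) add use_python3=1 to config.ini

4: The flag FLAGS_communicator_max_merge_var_num: 5 needs an explanation: in sync and half_async modes it must equal cpu_num.

5: Personally, I think everything in backend.yaml could be moved into config.yaml. backend.yaml mainly holds a config section and a submit section for submitting jobs, so a separate backend.yaml is unnecessary. Keeping only config.yaml, with documentation that clearly explains how to configure the config and submit sections for the cloud, would let single-machine single-GPU, single-machine multi-GPU, local_cluster, and cluster all use config.yaml alone, which is simpler and clearer.

6: cluster mode currently supports train only; infer has not been added.
Request: add distributed inference to PaddleRec.

7: cpu_num in config.ini defaults to 1. How can cpu_num be controlled through backend.yaml? The documentation should explain this clearly.

ESMM: the python reader.py file does not exist

It is not clear how this reader.py is implemented
PaddleRec/models/multitask/esmm/data/run.sh

ESMM training sample format

Hello. After the raw training set is processed (reader.py), the sample format is index, click_label, conversion_label, field_index:feature_index... The actual feature values are not used here. See the test samples in PaddleRec/models/multitask/esmm/data/train/small.txt. Why are there no feature values? Are all features treated as discrete features here, so that only the index is recorded?

[User question] PaddleRec does not currently support multi-machine multi-GPU when submitting distributed jobs to PaddleCloud

As the title says: currently only multi-machine single-GPU is supported, or a pseudo multi-machine multi-GPU mode (which uses Parallel Executor and runs extremely slowly). Multi-machine multi-GPU training should be launched on each distributed node with fleet.run or paddle.distributed.launch.

Documentation: the custom reader link is broken

Has anyone tried replacing the FM side of the deepfm model with the afm model?

afm is not compatible with py3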

Traceback (most recent call last):
File "/opt/_internal/cpython-3.6.0/lib/python3.6/site-packages/paddle_rec-0.0.2-py3.6.egg/paddlerec/core/trainer.py", line 196, in context_process
self._status_processorcontext['status']
File "/opt/_internal/cpython-3.6.0/lib/python3.6/site-packages/paddle_rec-0.0.2-py3.6.egg/paddlerec/core/trainers/general_trainer.py", line 98, in network
network_class.build_network(context)
File "/opt/_internal/cpython-3.6.0/lib/python3.6/site-packages/paddle_rec-0.0.2-py3.6.egg/paddlerec/core/trainers/framework/network.py", line 87, in build_network
model.net(model._data_var, context["is_infer"])
File "/ssd2/liyang/paddlerec/PaddleRec/models/rank/afm/model.py", line 140, in net
1]) # batch_size * (num_field*(num_field-1)/2) * 1
File "/opt/_internal/cpython-3.6.0/lib/python3.6/site-packages/paddle/fluid/layers/nn.py", line 5674, in reshape
"XShape": x_shape})
File "/opt/_internal/cpython-3.6.0/lib/python3.6/site-packages/paddle/fluid/layer_helper.py", line 43, in append_op
return self.main_program.current_block().append_op(*args, **kwargs)
File "/opt/_internal/cpython-3.6.0/lib/python3.6/site-packages/paddle/fluid/framework.py", line 2525, in append_op
attrs=kwargs.get("attrs", None))
File "/opt/_internal/cpython-3.6.0/lib/python3.6/site-packages/paddle/fluid/framework.py", line 1877, in init
self.desc.check_attrs()
paddle.fluid.core_avx.EnforceNotMet:

C++ Call Stacks (More useful to developers):

0 std::string paddle::platform::GetTraceBackStringstd::string(std::string&&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(paddle::platform::ErrorSummary const&, char const*, int)
2 paddle::framework::ExtractAttribute<std::vector<int, std::allocator > >::operator()(boost::variant<boost::blank, int, float, std::string, std::vector<int, std::allocator >, std::vector<float, std::allocator >, std::vector<std::string, std::allocatorstd::string >, bool, std::vector<bool, std::allocator >, paddle::framework::BlockDesc*, long, std::vector<paddle::framework::BlockDesc*, std::allocatorpaddle::framework::BlockDesc* >, std::vector<long, std::allocator >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>&) const
3 std::Function_handler<void ()(std::unordered_map<std::string, boost::variant<boost::blank, int, float, std::string, std::vector<int, std::allocator >, std::vector<float, std::allocator >, std::vector<std::string, std::allocatorstd::string >, bool, std::vector<bool, std::allocator >, paddle::framework::BlockDesc*, long, std::vector<paddle::framework::BlockDesc*, std::allocatorpaddle::framework::BlockDesc* >, std::vector<long, std::allocator >, boost::detail::variant::void, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>, std::hashstd::string, std::equal_tostd::string, std::allocator<std::pair<std::string const, boost::variant<boost::blank, int, float, std::string, std::vector<int, std::allocator >, std::vector<float, std::allocator >, std::vector<std::string, std::allocatorstd::string >, bool, std::vector<bool, std::allocator >, paddle::framework::BlockDesc*, long, std::vector<paddle::framework::BlockDesc*, std::allocatorpaddle::framework::BlockDesc* >, std::vector<long, std::allocator >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> > > >, bool), paddle::framework::TypedAttrChecker<std::vector<int, std::allocator > > >::_M_invoke(std::_Any_data const&, std::unordered_map<std::string, boost::variant<boost::blank, int, float, std::string, std::vector<int, std::allocator >, std::vector<float, std::allocator >, std::vector<std::string, std::allocatorstd::string >, bool, std::vector<bool, std::allocator >, paddle::framework::BlockDesc, long, std::vector<paddle::framework::BlockDesc*, std::allocatorpaddle::framework::BlockDesc* >, std::vector<long, std::allocator >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, 
boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>, std::hashstd::string, std::equal_tostd::string, std::allocator<std::pair<std::string const, boost::variant<boost::blank, int, float, std::string, std::vector<int, std::allocator >, std::vector<float, std::allocator >, std::vector<std::string, std::allocatorstd::string >, bool, std::vector<bool, std::allocator >, paddle::framework::BlockDesc*, long, std::vector<paddle::framework::BlockDesc*, std::allocatorpaddle::framework::BlockDesc* >, std::vector<long, std::allocator >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> > > >*, bool)
4 paddle::framework::OpDesc::CheckAttrs()

Error Message Summary:

Error: Cannot get attribute shape by type std::vector<int, std::allocator >, its type is std::vector<float, std::allocator > at (/paddle/paddle/fluid/framework/attribute.h:42)

Catch Exception:

C++ Call Stacks (More useful to developers):

Error Message Summary:

Error: Cannot get attribute shape by type std::vector<int, std::allocator >, its type is std::vector<float, std::allocator > at (/paddle/paddle/fluid/framework/attribute.h:42)

Exit app. catch exception in precoss status:network_pass, except:

C++ Call Stacks (More useful to developers):

Error Message Summary:

Error: Cannot get attribute shape by type std::vector<int, std::allocator >, its type is std::vector<float, std::allocator > at (/paddle/paddle/fluid/framework/attribute.h:42)
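
The error says the `shape` attribute arrived as a list of floats, and the comment in the traceback shows the expression `num_field*(num_field-1)/2`. One plausible cause, stated here as an assumption rather than a confirmed diagnosis, is that `/` returns a float on Python 3. A minimal sketch of building an all-integer shape (the helper name `pair_count` is hypothetical):

```python
def pair_count(num_field):
    """Number of unordered field pairs, kept as an int on both Python 2 and 3."""
    return num_field * (num_field - 1) // 2


num_field = 10

# On Python 3 true division yields a float, which a reshape op that
# expects integer dimensions would reject.
float_dim = num_field * (num_field - 1) / 2

# Integer division keeps every dimension an int.
shape = [-1, pair_count(num_field), 1]
```

Passing `shape` built this way to a reshape call avoids mixing float and int dimensions.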

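[User question] GPU conflict caused by defaulting to GPU 0 in cluster training

In cluster-trainer mode, select_gpus defaults to 0, which is a problem on a cluster and causes GPU conflicts.

A problem with installation after downloading large datasets via script; noting it here

When installing with python setup.py install, the large datasets previously downloaded under the directory are also copied. This needs to be fixed; recording it here.

PaddleRec installation fails on AIStudio; two Huangpu trainees hit the error

While following the tutorial

Environment setup

Install PaddleRec

!cd PaddleRec/ && python setup.py install
The first step failed with the following error:
[Errno -3] Temporary failure in name resolution

Does the download address need to be changed?

contentunderstanding.tagspace fails on Mac but works on Linux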

python -m paddlerec.run -m paddlerec.models.contentunderstanding.tagspace

RuntimeError: Some of your fetched tensors hold LoD information.             They can not be completely cast to Python ndarray.             Please set the parameter 'return_numpy' as 'False' to             return LoDTensor itself directly.

Catch Exception:Some of your fetched tensors hold LoD information.             They can not be completely cast to Python ndarray.             Please set the parameter 'return_numpy' as 'False' to             return LoDTensor itself directly.



--------------------------------

PaddleRec Error Message Summary:

--------------------------------



Exit PaddleRec. catch exception in precoss status: [train_pass], except: Some of your fetched tensors hold LoD information.             They can not be completely cast to Python ndarray.             Please set the parameter 'return_numpy' as 'False' to             return LoDTensor itself directly.

RuntimeError

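No phase configured in runner; this needs a check

try except error in python3

Documentation needs updating

The dense parameter of the dnn model has no effect

https://github.com/PaddlePaddle/PaddleRec/blob/master/models/rank/dnn/model.py#L39

https://github.com/PaddlePaddle/PaddleRec/blob/master/models/rank/dnn/model.py#L56

Adding dense_input fixes it

rank/fgcnn training and inference are rather slow; hopefully this can be optimized later.

Even after changing the loop count in the fgcnn/model.py model file, it is still very slow compared with the other models under rank.

Most models under rank (dnn, dcn, deepfm, fnn, etc.) support py3.6 + QueueDataset training (single_train)

After pulling PR #71 (py3 compatibility), local testing shows that py3 + paddle 1.7.2 supports QueueDataset training.

single_infer error

infer with dataset type=DataLoader fails: save_path not found

single_infer fails when epochs is not configured under runner

dnn: data preprocessing issues, py3 compatibility issues, and documentation issues

Model: https://github.com/PaddlePaddle/PaddleRec/tree/master/models/rank/dnn

1: In dataset_generator.py, zip(feature_name, [dense_feature] + sparse_feature + [label]) is not compatible with py3 and needs to be changed to list(zip(feature_name, [dense_feature] + sparse_feature + [label]))

2: In get_slot_data.py, the print statement and the strip() call on line 64 are not compatible with py3 and need to be changed to print(s.strip(b''))
https://github.com/PaddlePaddle/PaddleRec/blob/master/models/rank/dnn/data/get_slot_data.py#L64

3: network_conf.py in the documentation needs to be changed to model.py, as in the code directory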
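
Point 1 above comes down to `zip` returning a one-shot iterator on Python 3 rather than a list. The sketch below illustrates the difference; the feature names and values are made-up placeholders, not the real contents of dataset_generator.py.

```python
# Hypothetical stand-ins for the names and values paired in the reader.
feature_name = ["dense_feature", "C1", "C2", "label"]
values = [[0.5, 0.1], [3], [7], [1]]

# On Python 3, zip() is lazy: it can be consumed only once.
pairs = zip(feature_name, values)
first_pass = list(pairs)
second_pass = list(pairs)  # the iterator is already exhausted here

# Wrapping in list() materializes the pairs so they can be reused or indexed.
pairs_list = list(zip(feature_name, values))
```

Code that returns the zipped pairs to a consumer expecting a list, or that iterates over them more than once, needs the `list(...)` wrapper.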

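dnn documentation fix

The debug option has no effect

Could a few Dockerfiles for building images be added under the tools directory (ubuntu, centos, windows, mac, py2, py3)?

Adding CI on Efficiency Cloud (效率云) requires a Dockerfile

Without a Dockerfile, the Efficiency Cloud CI image build fails.

Does listwise not support dataset?

Traceback (most recent call last):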
File "/opt/_internal/cpython-2.7.11-ucs4/lib/python2.7/site-packages/paddle_rec-0.1.0-py2.7.egg/paddlerec/core/trainer.py", line 246, in run
self.context_process(self._context)
File "/opt/_internal/cpython-2.7.11-ucs4/lib/python2.7/site-packages/paddle_rec-0.1.0-py2.7.egg/paddlerec/core/trainer.py", line 207, in context_process
self._status_processorcontext['status']
File "/opt/_internal/cpython-2.7.11-ucs4/lib/python2.7/site-packages/paddle_rec-0.1.0-py2.7.egg/paddlerec/core/trainers/general_trainer.py", line 90, in network
network_class.build_network(context)
File "/opt/_internal/cpython-2.7.11-ucs4/lib/python2.7/site-packages/paddle_rec-0.1.0-py2.7.egg/paddlerec/core/trainers/framework/network.py", line 106, in build_network
context)
File "/opt/_internal/cpython-2.7.11-ucs4/lib/python2.7/site-packages/paddle_rec-0.1.0-py2.7.egg/paddlerec/core/trainers/framework/dataset.py", line 90, in create_dataset
return self._get_dataset(dataset_name, context)
File "/opt/_internal/cpython-2.7.11-ucs4/lib/python2.7/site-packages/paddle_rec-0.1.0-py2.7.egg/paddlerec/core/trainers/framework/dataset.py", line 118, in _get_dataset
dataset.set_batch_size(batch_size)
File "/opt/_internal/cpython-2.7.11-ucs4/lib/python2.7/site-packages/paddle/fluid/dataset.py", line 155, in set_batch_size
self.proto_desc.batch_size = batch_size
TypeError: None has type NoneType, but expected one of: int, long
Catch Exception:None has type NoneType, but expected one of: int, long

PaddleRec Error Message Summary:

Exit PaddleRec. catch exception in precoss status: [network_pass], except: None has type NoneType, but expected one of: int, long
TypeError

Error while finding module specification for 'paddlerec.run' (ModuleNotFoundError: No module named 'paddlerec')

Hi,
This module does not seem to exist. I ran the following directly:

python -m paddlerec.run -m ./config.yaml
#or
python -m paddlerec.run -m paddlerec.models.recall.gnn

Error while finding module specification for 'paddlerec.run' (ModuleNotFoundError: No module named 'paddlerec')

How can I fix this?

Add evaluation metrics P, R, and F1 to PaddleRec

As the title says.
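
For reference, the requested metrics can be sketched in plain Python for binary labels. This is only an illustration of the definitions, assuming 0/1 labels and predictions; it is not a PaddleRec metric class.

```python
def precision_recall_f1(labels, preds):
    """Compute precision, recall, and F1 for binary 0/1 labels and predictions."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    # Guard against empty denominators by returning 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


p, r, f1 = precision_recall_f1([1, 0, 1, 1], [1, 1, 0, 1])
```

In this toy case there are two true positives, one false positive, and one false negative, so all three metrics equal 2/3.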

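word2vec fails on Windows with single-machine CPU; please take a look

Code style and runtime errors

Undefined parameter:
https://github.com/PaddlePaddle/PaddleRec/blob/master/core/utils/validation.py#L23

Do not use Python built-in function names as parameter names:
https://github.com/PaddlePaddle/PaddleRec/blob/master/core/utils/validation.py#L19

Runtime error:

Does new training data require generating a new feat_dict_10.pkl2 file?

I just tried this library on new data and got poor results. Do I need to generate feat_dict_10.pkl2 from the new training data, or can I use the downloaded one?

[User question] Suggestion: support custom batching

Add a config option so that batches assembled in the reader can be used directly for training. The dien model has such an option, but it does not seem to be general yet.

(Suggestion) Add a version attribute to paddlerec

print(paddlerec.__version__) fails because there is no version attribute; please add one. (Output: 0.02)

[User question] Data reading fails because of hidden files

As the title says: unknown or hidden files under the data_path recorded in config.yaml cause data reading to fail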
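
One way to guard against this, sketched under the assumption that training files sit directly under data_path, is to skip dotfiles and subdirectories when building the file list. The helper name `list_data_files` is hypothetical and is not part of PaddleRec.

```python
import os


def list_data_files(data_path):
    """Return sorted paths of regular, non-hidden files directly under data_path."""
    files = []
    for name in sorted(os.listdir(data_path)):
        full = os.path.join(data_path, name)
        # Skip hidden entries such as .DS_Store and anything that is not a regular file.
        if name.startswith(".") or not os.path.isfile(full):
            continue
        files.append(full)
    return files
```

The returned list can then be handed to the dataset in place of a raw directory listing.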

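[Documentation] The home page lacks onlinetraining documentation

[User question] SR-GNN training and inference speed are below expectations

The cudaplace setting in PaddleRec/core/trainers/single_trainer.py is inflexible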

    device = envs.get_global_env("device")
    if device == 'gpu':
        self._place = fluid.CUDAPlace(0)
    elif device == 'cpu':
        self._place = fluid.CPUPlace()
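
A possible way to make the device index configurable, sketched here as an assumption rather than the project's actual fix, is to read it from the FLAGS_selected_gpus environment variable and fall back to 0. The helper `resolve_gpu_id` is hypothetical.

```python
import os


def resolve_gpu_id(env=None, default=0):
    """Pick a GPU index from FLAGS_selected_gpus, falling back to `default`."""
    env = os.environ if env is None else env
    raw = env.get("FLAGS_selected_gpus", "")
    # Use the first entry when several ids are listed, e.g. "2,3".
    first = raw.split(",")[0].strip()
    return int(first) if first.isdigit() else default


# Intended use inside the trainer (assumes paddle.fluid is imported as fluid):
#     self._place = fluid.CUDAPlace(resolve_gpu_id())
```

This keeps the current behavior (GPU 0) when the variable is unset, while letting a launcher assign a different card to each process.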

Aborted at 1601178025 (unix time) try "date -d @1601178025" if you are using GNU date

Traceback (most recent call last):
File "/opt/conda/envs/python27-paddle120-env/lib/python2.7/site-packages/paddle_rec-0.1.0-py2.7.egg/paddlerec/core/trainers/framework/../../utils/dataset_instance.py", line 47, in
reader.run_from_stdin()
File "/opt/conda/envs/python27-paddle120-env/lib/python2.7/site-packages/paddle/fluid/incubate/data_generator/init.py", line 128, in run_from_stdin
for user_parsed_line in line_iter():
File "/opt/conda/envs/python27-paddle120-env/lib/python2.7/site-packages/paddle_rec-0.1.0-py2.7.egg/paddlerec/core/reader.py", line 83, in reader
feasign = int(slot_feasign[1])
IndexError: list index out of range
W0927 11:46:20.432853 1311 init.cc:209] Warning: PaddlePaddle catches a failure signal, it may not work properly
W0927 11:46:20.432900 1311 init.cc:211] You could check whether you killed PaddlePaddle thread/process accidentally or report the case to PaddlePaddle
W0927 11:46:20.432905 1311 init.cc:214] The detail failure signal is:

W0927 11:46:20.432910 1311 init.cc:217] *** Aborted at 1601178380 (unix time) try "date -d @1601178380" if you are using GNU date ***
W0927 11:46:20.434149 1311 init.cc:217] PC: @ 0x0 (unknown)
W0927 11:46:20.434366 1311 init.cc:217] *** SIGSEGV (@0x0) received by PID 1290 (TID 0x7f0ceeafd700) from PID 0; stack trace: ***
W0927 11:46:20.435386 1311 init.cc:217] @ 0x7f0d376ac390 (unknown)
W0927 11:46:20.435848 1311 init.cc:217] @ 0x7f0d134b3d16 _ZNSt19_Sp_counted_deleterIP8_IO_FILEZN6paddle9framework11shell_popenERKSsS5_PiEUlS1_E_SaIiELN9__gnu_cxx12_Lock_policyE2EE10_M_disposeEv
W0927 11:46:20.436623 1311 init.cc:217] @ 0x7f0d11807b59 std::_Sp_counted_base<>::_M_release()
Python error: is a directory, cannot continue
W0927 11:46:20.437424 1311 init.cc:217] @ 0x7f0d11b80efd paddle::framework::MultiSlotDataFeed::ReadThread()
W0927 11:46:20.438167 1311 init.cc:217] @ 0x7f0d29fa1c5c execute_native_thread_routine_compat
W0927 11:46:20.439100 1311 init.cc:217] @ 0x7f0d376a26ba start_thread
W0927 11:46:20.440027 1311 init.cc:217] @ 0x7f0d36cc841d clone
W0927 11:46:20.440940 1311 init.cc:217] @ 0x0 (unknown)
Segmentation fault (core dumped)

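[User question] fetch_var in save_inference_model is not user-friendly and needs to be fixed

As the title says: to save the full model structure with save_inference_model, users currently have to configure save_inference_path, save_inference_feed_varnames, and save_inference_fetch_varnames. For fetch_var_names, users must know the var name of the corresponding variable in the network rather than the variable name they assigned themselves. The currently recommended usage is: ctcvr_prop_one = fluid.layers.elementwise_mul(ctr_prop_one,cvr_prop_one,name="fetch_name_ctcvr")

The error message is as follows: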
Error Message Summary:

Error: Cannot open increment_recall\0\fc_1.w_0_moment1_0 to write at (D:\1.7.2\paddle\paddle/fluid/operators/save_op.h:82)
[operator < save > error]
EnforceNotMet
