
PaddleClas Introduction

简体中文 | English

PaddleClas

Introduction

PaddleClas is an image recognition and image classification toolkit built by PaddlePaddle for industry and academia, helping users train better vision models and put them into real-world applications.

Application scope of the PP-ShiTu image recognition system

Showcase of PULC practical image classification models

📣 Recent Updates

  • 🔥 2023.3.16: PaddleClas integrates FastDeploy, a high-performance, all-scenario model deployment solution. See the guide to try it out (note: use the develop branch).

  • 💥 Live replay: the PaddleClas team explains the PP-ShiTuV2 optimization strategies and real industrial applications. Scan the QR code below with WeChat, follow the official account, and fill in the questionnaire to join the official group and receive the live replay plus a 20 GB image classification learning package (20+ datasets, 4 vertical-domain models, and a collection of 70+ cutting-edge papers).

🌟 Features

PaddleClas supports a variety of cutting-edge image classification and recognition algorithms and releases industrial-grade backbone networks such as PP-HGNet, PP-LCNetV2, and PP-LCNet, as well as the SSLD semi-supervised knowledge distillation scheme. On this basis it builds the PULC ultra-lightweight image classification solution and the PP-ShiTu image recognition system.

To get started with the above, we recommend beginning with the Quick Start section of the documentation tutorials.

⚡ Quick Start

  • PULC ultra-lightweight image classification quick experience: click here
  • PP-ShiTu image recognition quick experience: click here
  • PP-ShiTuV2 Android demo app: scan the QR code below to download and try it

📖 Technical Exchange and Cooperation

  • PaddleX, PaddlePaddle's low-code development tool — a one-stop development tool for curated PaddlePaddle models targeting mainstream domestic and international AI hardware. Core advantages:

    • 【Industrial-grade high-accuracy model zoo】: 40+ curated models covering 10 mainstream AI tasks, rich and complete.
    • 【Distinctive model pipelines】: pipelines that fuse large and small models for higher accuracy and better results.
    • 【Low-code development mode】: a graphical interface supporting a unified development paradigm, convenient and efficient.
    • 【Private deployment with multi-hardware support】: adapted to mainstream domestic and international AI hardware, supports fully offline local use, and meets enterprise security and confidentiality needs.
  • PaddleX官网地址:https://aistudio.baidu.com/intro/paddlex

  • PaddleX官方交流频道:https://aistudio.baidu.com/community/channel/610

👫 Open Source Community

  • 📑 Project cooperation: if you are an enterprise developer with a concrete image classification need, fill in the questionnaire to start cooperation with the official team at various levels, free of charge.
  • 👫 Join the community: scan the QR code with WeChat and fill in the questionnaire to join the group and receive the 20 GB image classification learning package, which includes
    • 20+ scenario datasets, covering various products, animals and plants, aerial images, and more
    • A collection of scenario application models, including personnel access management, fresh-produce recognition, product recognition, etc.
    • 70+ cutting-edge image classification and recognition papers, videos and slides from past release courses, and high-quality community projects

🛠️ PP-Series Model List

Model | Application scenario | Download links
PULC ultra-lightweight image classification solution | Classification with a fixed set of image categories | 9 models for human, vehicle, and text scenarios: model zoo link
PP-ShiTuV2 lightweight image recognition system | For scenarios with frequently changing categories and large numbers of classes | Mainbody detection model: pretrained model / inference model; recognition model: pretrained model / inference model
PP-LCNet lightweight backbone | Tailored for Intel CPUs and the MKLDNN acceleration library | PPLCNet_x1_0: pretrained model / inference model
PP-LCNetV2 lightweight backbone | For Intel CPUs, adapted to OpenVINO | PPLCNetV2_base: pretrained model / inference model
PP-HGNet high-accuracy backbone | Higher accuracy at the same inference time on GPUs | PPHGNet_small: pretrained model / inference model

Download links for all models can be found in the per-model introductions in the documentation tutorials.

Industry Examples

📖 Documentation Tutorials

PP-ShiTuV2 Image Recognition System

PP-ShiTuV2 is a practical, lightweight, general-purpose image recognition system composed of three modules: mainbody detection, feature learning, and vector retrieval. The system optimizes the models of each module with multiple strategies, covering backbone selection and tuning, loss function choice, data augmentation, learning-rate schedules, regularization parameters, use of pretrained models, and model pruning and quantization. Compared with V1, PP-ShiTuV2 improves Recall@1 by nearly 8 points. See the detailed PP-ShiTuV2 introduction for more.
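To make the module composition concrete, here is a minimal sketch of the recognition flow (illustrative Python pseudocode with hypothetical callables, not the actual PaddleClas API):

def recognize(image, detector, extractor, index):
    """image: HxWx3 array; detector/extractor/index are hypothetical components."""
    results = []
    for (x1, y1, x2, y2) in detector(image):   # 1) mainbody detection proposes boxes
        crop = image[y1:y2, x1:x2]             # crop the detected object
        feat = extractor(crop)                 # 2) feature learning maps the crop to an embedding
        label, score = index.search(feat)      # 3) vector retrieval against the gallery index
        results.append(((x1, y1, x2, y2), label, score))
    return results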

Showcase of the PP-ShiTuV2 image recognition system

  • Bottled drink recognition
  • Product recognition
  • Anime character recognition
  • Logo recognition
  • Vehicle recognition

PULC Ultra-Lightweight Image Classification Solution

PULC combines multiple cutting-edge techniques such as backbone networks, data augmentation, and distillation, and can automatically train lightweight yet high-accuracy image classification models. PaddleClas provides classification models for nine common tasks covering human, vehicle, and OCR scenarios, with 3 ms CPU inference and accuracy on par with SwinTransformer.
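For reference, a minimal usage sketch with the paddleclas pip package (assuming pip install paddleclas and the PULC person_exists model; see the PULC quick-start docs for the exact model names and arguments):

from paddleclas import PaddleClas

model = PaddleClas(model_name="person_exists")          # a PULC task model
result = model.predict(input_data="path/to/image.jpg")  # hypothetical image path
print(next(result))                                     # predict() returns a generator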

Showcase of PULC practical image classification models

License

This project is released under the Apache 2.0 license.

Contributing

Contributions to PaddleClas are very welcome, and we greatly appreciate your feedback. To contribute, please refer to the contribution guide.

  • Many thanks to nblib for fixing the RandErasing data augmentation config file in PaddleClas.
  • Many thanks to chenpy228 for fixing some typos in the PaddleClas documentation.
  • Many thanks to jm12138 for adding the ViT, DeiT, and RepVGG model series to PaddleClas.

PaddleClas's People

Contributors

aurelius84, cuicheng01, dyning, evezerest, flytocc, fredhuang16, huangxu96, hydrogensulfate, hysunflower, intsigstephon, jiaxiao243, jm12138, larastustu, lilith-zy, littletomatodonkey, lvjian0706, lyuwenyu, qingshuchen, rainfrost1, shippingwang, sibo2rr, tingquangao, vslyu, weisy11, wqz960, wuhaobo, yanhuidua, zengshao0622, zhangbo9674, zhiboniu


PaddleClas's Issues

run PaddleClas infer.py ERROR

my infer.sh:
export PYTHONPATH=$PWD:$PYTHONPATH

python -m paddle.distributed.launch \
    --selected_gpus="0" \
    tools/infer/infer.py -i "dataset/FGVC2020_SSFGRC/test/26.jpg" \
    -m "SENet154_vd" \
    -p "output/expr20_SENet154_vd_train_bestv1_25971.txt_val2000_val2750_78.84"

ERROR:
Traceback (most recent call last):
File "tools/infer/infer.py", line 121, in
main()
File "tools/infer/infer.py", line 113, in main
return_numpy=False)
File "/home/daibing/software/anaconda2/lib/python2.7/site-packages/paddle/fluid/executor.py", line 790, in run
six.reraise(*sys.exc_info())
File "/home/daibing/software/anaconda2/lib/python2.7/site-packages/paddle/fluid/executor.py", line 785, in run
use_program_cache=use_program_cache)
File "/home/daibing/software/anaconda2/lib/python2.7/site-packages/paddle/fluid/executor.py", line 838, in _run_impl
use_program_cache=use_program_cache)
File "/home/daibing/software/anaconda2/lib/python2.7/site-packages/paddle/fluid/executor.py", line 909, in _run_program
self._feed_data(program, feed, feed_var_name, scope)
File "/home/daibing/software/anaconda2/lib/python2.7/site-packages/paddle/fluid/executor.py", line 591, in _feed_data
check_feed_shape_type(var, cur_feed)
File "/home/daibing/software/anaconda2/lib/python2.7/site-packages/paddle/fluid/executor.py", line 230, in check_feed_shape_type
(var.name, len(var.shape), var.shape, feed_shape))
ValueError: The fed Variable u'image' should have dimensions = 4, shape = (-1L, 3L, 224L, 224L), but received fed shape [3L, 224L, 224L] on each device
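The message says the executor expects a 4-D batched tensor of shape (-1, 3, 224, 224) but was fed a single 3-D image. A minimal sketch of the usual fix, assuming a NumPy preprocessing pipeline that currently yields a (3, 224, 224) array (variable names are hypothetical):

import numpy as np

img = np.random.rand(3, 224, 224).astype("float32")  # stand-in for the preprocessed image
batched = np.expand_dims(img, axis=0)                 # add the batch axis -> (1, 3, 224, 224)

Feeding batched instead of img should satisfy the shape check.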

Incorrect setting of `is_test` in EfficientNet

is_test is not correctly set in EfficientNet, which leaves drop_connect active at test time. This is easily reproduced by running inference repeatedly on the same image: the predicted probabilities differ between runs (screenshot omitted).

The likely cause (screenshot omitted): is_test defaults to False in EfficientNet and is never set to True in either infer.py or predict.py.

Moreover, the duplicated definition of is_test in both __init__ and net is confusing (screenshot omitted): in fact, _drop_connect uses self.is_test, and the is_test passed to the methods is never used.

It would be better to fix this.
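For context, a minimal NumPy sketch of how drop_connect is normally gated on is_test (illustrative only, not the PaddleClas code itself):

import numpy as np

def drop_connect(x, prob, is_test):
    # Identity at test time; randomly drop whole samples during training.
    if is_test:
        return x
    keep = 1.0 - prob
    mask = np.floor(keep + np.random.rand(x.shape[0], 1, 1, 1))
    return x / keep * mask

If is_test stays False at inference time, the random mask keeps firing, which explains the run-to-run differences reported above.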

resnet_vd训练出错,没有is_test字段

paddle版本:1.7.1
config: ResNet50_vd.yaml
执行训练后出错:
image
resnet50_vd init中确实没有is_test字段,但是program.create_model中会传入这个字段:
image
请问下这里是我的版本问题吗?

Mixed Precision Training

Mixed precision training is available in PaddleCV/image_classification but not in this repo. According to Release Notes of PaddlePaddle 1.7, AMP interfaces have been added.
Based on these, I think it would be convenient to implement it.

Mixed precision training is critical to fast training on V100. Please consider adding it. Thank you!
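For reference, a minimal sketch of enabling AMP with Paddle 1.7's static-graph API (based on the fluid.contrib interface referenced in the release notes; argument names may differ slightly across versions):

import paddle.fluid as fluid
from paddle.fluid.contrib.mixed_precision import decorate

optimizer = fluid.optimizer.Momentum(learning_rate=0.1, momentum=0.9)
# Wrap the optimizer so the forward/backward passes run in FP16 with loss scaling.
mp_optimizer = decorate(optimizer, init_loss_scaling=128.0,
                        use_dynamic_loss_scaling=True)
# mp_optimizer.minimize(avg_loss)  # call in place of optimizer.minimize(avg_loss)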

Hello, after infer the results for the same image differ

Hello, when running inference with the infer script I hit the following problem:
1st infer: class id: 1, probability: 0.9075
2nd infer: class id: 1, probability: 0.9048
3rd infer: class id: 1, probability: 0.9069

My run script:
export PYTHONPATH=$PWD:$PYTHONPATH
export CUDA_VISIBLE_DEVICES=0
#--model=EfficientNetB0 --pretrained_model=output/EfficientNetB0_val/best_model_in_epoch_124/ppcls --output_path=./convert
python tools/infer/infer.py \
    --image_file=./tools/img.jpg \
    --model=EfficientNetB0 \
    --pretrained_model=output/EfficientNetB0_val/best_model_in_epoch_124/ppcls

My only change is that I removed the resize_short mode during resizing and resize the image directly to 288.

Has anyone run into this and can help? Thanks!

Demo run error

Paddle 1.7.2, CUDA 9.0, cuDNN 7.5.
Running /home/vis/duyuting/app/anaconda3/bin/python -m paddle.distributed.launch --selected_gpus="0" tools/train.py -c ./configs/quick_start/ResNet50_vd.yaml reports:
Error: Failed to find dynamic library: libnccl.so ( /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /home/vis/duyuting/app/nccl_2.5.6-1+cuda10.0_x86_64/lib/libnccl.so) )
Please specify its path correctly using following ways:
Method. set environment variable LD_LIBRARY_PATH on Linux or DYLD_LIBRARY_PATH on Mac OS.
For instance, issue command: export LD_LIBRARY_PATH=...
Note: After Mac OS 10.11, using the DYLD_LIBRARY_PATH is impossible unless System Integrity Protection (SIP) is disabled. at (/paddle/paddle/fluid/platform/dynload/dynamic_loader.cc:177)
[operator < gen_nccl_id > error]
This looks like an NCCL problem. After downloading the CUDA 9 build of NCCL from the official site, it reports:
Error: An error occurred here. There is no accurate error hint for this error yet. We are continuously in the process of increasing hint for this kind of error check. It would be helpful if you could inform us of how this conversion went by opening a github issue. And we will resolve it with high priority.

  • New issue link: https://github.com/PaddlePaddle/Paddle/issues/new
  • Recommended issue content: all error stack information
    [unhandled system error] at (/paddle/paddle/fluid/operators/distributed_ops/gen_nccl_id_op.cc:162)
    [operator < gen_nccl_id > error]

Without the distributed launch command, /home/vis/duyuting/app/anaconda3/bin/python tools/train.py -c ./configs/quick_start/ResNet50_vd.yaml reports:
Traceback (most recent call last):
File "tools/train.py", line 133, in <module>
main(args)
File "tools/train.py", line 59, in main
fleet.init(role)
File "/home/vis/duyuting/app/anaconda3/lib/python3.7/site-packages/paddle/fluid/incubate/fleet/base/fleet_base.py", line 202, in init
self._role_maker.generate_role()
File "/home/vis/duyuting/app/anaconda3/lib/python3.7/site-packages/paddle/fluid/incubate/fleet/base/role_maker.py", line 500, in generate_role
assert self._worker_endpoints is not None, "can't find PADDLE_TRAINER_ENDPOINTS"
Can this library really not run on a single GPU????

Training loss spikes and then becomes NaN

Training a classification model with MobileNetV3_large_x1_0: at the second epoch the loss suddenly increases and then becomes NaN. Why would this happen? Does anyone have relevant experience? (screenshot failed to upload)

Dynamic-graph version support status

As the title says: hello developers, does the dynamic-graph version of this repo currently run correctly? In which areas is its development not yet aligned with the static-graph version?

SE + HRNet

I want to add an attention mechanism to HRNet, so I chose SE + HRNet. In another issue I was told that SE+HRNet needs pretrained weights that include SE, and that directly loading SE-free pretrained weights gives noticeably lower accuracy.
My questions:
1. Is there a pretrained SE+HRNet model?
2. If not, how should I train it to get a good result? Are there any practical suggestions?
3. Compared with SE+HRNet, is there another attention mechanism that is easier to train when no pretrained model is available?
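For context, a minimal NumPy sketch of the SE (squeeze-and-excitation) block being discussed (illustrative only, not the PaddleClas implementation):

import numpy as np

def se_block(x, w1, b1, w2, b2):
    """x: (N, C, H, W); w1/b1 and w2/b2 are the SE module's two FC layers."""
    squeeze = x.mean(axis=(2, 3))                      # global average pool -> (N, C)
    hidden = np.maximum(squeeze @ w1 + b1, 0.0)        # reduction FC + ReLU
    scale = 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))  # expansion FC + sigmoid
    return x * scale[:, :, None, None]                 # reweight channels

Because the two FC layers are new parameters, they start from random initialization; without SE-aware pretrained weights, the random channel scales disturb the pretrained backbone, which matches the advice the reporter received.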

ValueError: Operator "gen_nccl_id" has not been registered.

E:\projects\PaddleClas-master>python -m paddle.distributed.launch --selected_gpus='0' tools/train.py -c configs/quick_start/ResNet50_vd_finetune_my.yaml
----------- Configuration Arguments -----------
cluster_node_ips: 127.0.0.1
log_dir: None
node_ip: 127.0.0.1
print_config: True
selected_gpus: '0'
started_port: 6170
training_script: tools/train.py
training_script_args: ['-c', 'configs/quick_start/ResNet50_vd_finetune_my.yaml']
use_paddlecloud: False

trainers_endpoints: 127.0.0.1:6170 , node_id: 0 , current_node_ip: 127.0.0.1 , num_nodes: 1 , node_ips: ['127.0.0.1'] , nranks: 1
2020-05-13 23:57:14 INFO:

== PaddleClas is powered by PaddlePaddle ! ==

== ==
== For more info please go to the following website. ==
== ==
== https://github.com/PaddlePaddle/PaddleClas ==

2020-05-13 23:57:14 INFO: ARCHITECTURE :
2020-05-13 23:57:14 INFO: name : ResNet50_vd
2020-05-13 23:57:14 INFO: ------------------------------------------------------------
2020-05-13 23:57:14 INFO: LEARNING_RATE :
2020-05-13 23:57:14 INFO: function : Cosine
2020-05-13 23:57:14 INFO: params :
2020-05-13 23:57:14 INFO: lr : 0.00375
2020-05-13 23:57:14 INFO: ------------------------------------------------------------
2020-05-13 23:57:14 INFO: OPTIMIZER :
2020-05-13 23:57:14 INFO: function : Momentum
2020-05-13 23:57:14 INFO: params :
2020-05-13 23:57:14 INFO: momentum : 0.9
2020-05-13 23:57:14 INFO: regularizer :
2020-05-13 23:57:14 INFO: factor : 1e-06
2020-05-13 23:57:14 INFO: function : L2
2020-05-13 23:57:14 INFO: ------------------------------------------------------------
2020-05-13 23:57:14 INFO: TRAIN :
2020-05-13 23:57:14 INFO: batch_size : 32
2020-05-13 23:57:14 INFO: data_dir : G:/ai_data/paddle/0513/
2020-05-13 23:57:14 INFO: file_list : G:/ai_data/paddle/0513train.list
2020-05-13 23:57:14 INFO: num_workers : 4
2020-05-13 23:57:14 INFO: shuffle_seed : 0
2020-05-13 23:57:14 INFO: transforms :
2020-05-13 23:57:14 INFO: DecodeImage :
2020-05-13 23:57:14 INFO: channel_first : False
2020-05-13 23:57:14 INFO: to_np : False
2020-05-13 23:57:14 INFO: to_rgb : True
2020-05-13 23:57:14 INFO: RandCropImage :
2020-05-13 23:57:14 INFO: size : 224
2020-05-13 23:57:14 INFO: RandFlipImage :
2020-05-13 23:57:14 INFO: flip_code : 1
2020-05-13 23:57:14 INFO: NormalizeImage :
2020-05-13 23:57:14 INFO: mean : [0.485, 0.456, 0.406]
2020-05-13 23:57:14 INFO: order :
2020-05-13 23:57:14 INFO: scale : 1./255.
2020-05-13 23:57:14 INFO: std : [0.229, 0.224, 0.225]
2020-05-13 23:57:14 INFO: ToCHWImage : None
2020-05-13 23:57:14 INFO: ------------------------------------------------------------
2020-05-13 23:57:14 INFO: VALID :
2020-05-13 23:57:14 INFO: batch_size : 20
2020-05-13 23:57:14 INFO: data_dir : G:/ai_data/paddle/0513/
2020-05-13 23:57:14 INFO: file_list : G:/ai_data/paddle/0513test.list
2020-05-13 23:57:14 INFO: num_workers : 4
2020-05-13 23:57:14 INFO: shuffle_seed : 0
2020-05-13 23:57:14 INFO: transforms :
2020-05-13 23:57:14 INFO: DecodeImage :
2020-05-13 23:57:14 INFO: channel_first : False
2020-05-13 23:57:14 INFO: to_np : False
2020-05-13 23:57:14 INFO: to_rgb : True
2020-05-13 23:57:14 INFO: ResizeImage :
2020-05-13 23:57:14 INFO: resize_short : 256
2020-05-13 23:57:14 INFO: CropImage :
2020-05-13 23:57:14 INFO: size : 224
2020-05-13 23:57:14 INFO: NormalizeImage :
2020-05-13 23:57:14 INFO: mean : [0.485, 0.456, 0.406]
2020-05-13 23:57:14 INFO: order :
2020-05-13 23:57:14 INFO: scale : 1.0/255.0
2020-05-13 23:57:14 INFO: std : [0.229, 0.224, 0.225]
2020-05-13 23:57:14 INFO: ToCHWImage : None
2020-05-13 23:57:14 INFO: ------------------------------------------------------------
2020-05-13 23:57:14 INFO: classes_num : 3
2020-05-13 23:57:14 INFO: epochs : 20
2020-05-13 23:57:14 INFO: image_shape : [3, 224, 224]
2020-05-13 23:57:14 INFO: mode : train
2020-05-13 23:57:14 INFO: model_save_dir : E:/projects/PaddleClas-master/output/
2020-05-13 23:57:14 INFO: pretrained_model : E:/projects/PaddleClas-master/ResNet50_vd_pretrained
2020-05-13 23:57:14 INFO: save_interval : 1
2020-05-13 23:57:14 INFO: topk : 5
2020-05-13 23:57:14 INFO: total_images : 795
2020-05-13 23:57:14 INFO: valid_interval : 1
2020-05-13 23:57:14 INFO: validate : True

API is deprecated since 2.0.0 Please use FleetAPI instead.
WIKI: https://github.com/PaddlePaddle/Fleet/blob/develop/markdown_doc/transpiler

Traceback (most recent call last):
File "tools/train.py", line 124, in
main(args)
File "tools/train.py", line 69, in main
config, train_prog, startup_prog, is_train=True)
File "E:\projects\PaddleClas-master\tools\program.py", line 341, in build
optimizer.minimize(fetchs['loss'][0])
File "C:\python\tf\lib\site-packages\paddle\fluid\incubate\fleet\collective_init_.py", line 424, in minimize
fleet.main_program = self.try_to_compile(startup_program, main_program)
File "C:\python\tf\lib\site-packages\paddle\fluid\incubate\fleet\collective_init
.py", line 358, in _try_to_compile
self.transpile(startup_program, main_program)
File "C:\python\tf\lib\site-packages\paddle\fluid\incubate\fleet\collective_init
.py", line 285, in _transpile
current_endpoint=current_endpoint)
File "C:\python\tf\lib\site-packages\paddle\fluid\transpiler\distribute_transpiler.py", line 625, in transpile
wait_port=self.config.wait_port)
File "C:\python\tf\lib\site-packages\paddle\fluid\transpiler\distribute_transpiler.py", line 397, in _transpile_nccl2
self.config.hierarchical_allreduce_inter_nranks
File "C:\python\tf\lib\site-packages\paddle\fluid\framework.py", line 2525, in append_op
attrs=kwargs.get("attrs", None))
File "C:\python\tf\lib\site-packages\paddle\fluid\framework.py", line 1797, in init
proto = OpProtoHolder.instance().get_op_proto(type)
File "C:\python\tf\lib\site-packages\paddle\fluid\framework.py", line 1679, in get_op_proto
raise ValueError("Operator "%s" has not been registered." % type)
ValueError: Operator "gen_nccl_id" has not been registered.
2020-05-13 15:57:16,981-ERROR: ABORT!!! Out of all 1 trainers, the trainer process with rank=[0] was aborted. Please check its log.
ERROR 2020-05-13 15:57:16,981 launch.py:284] ABORT!!! Out of all 1 trainers, the trainer process with rank=[0] was aborted. Please check its log.

What is the problem here?

Inference with the 100,000-class pretrained model

UnavailableError: Load operator fail to open file pretrained/ResNet50_vd_10w_pretrained/fc_0.w_0, please check whether the model file is complete or damaged.
[Hint: Expected static_cast<bool>(fin) == true, but received static_cast<bool>(fin):0 != true:1.] at (/paddle/paddle/fluid/operators/load_op.h:41)
[operator < load > error]

received rank:2 != label_dims.size():3

报错: File "tools/train.py", line 124, in
main(args)


Error Message Summary:

InvalidArgumentError: If Attr(soft_label) == true, Input(X) and Input(Label) shall have the same dimensions. But received: the dimensions of Input(X) is [2], the shape of Input(X) is [-1, 2], the dimensions of Input(Label) is [3], the shape of Input(Label) is [-1, 1, 2]
[Hint: Expected rank == label_dims.size(), but received rank:2 != label_dims.size():3.] at (D:\1.8.1\paddle\paddle\fluid\operators\cross_entropy_op.cc:63)
[operator < cross_entropy > error]
INFO 2020-05-23 18:17:34,812 utils.py:272] terminate all the procs
ERROR 2020-05-23 18:17:34,812 utils.py:416] ABORT!!! Out of all 1 trainers, the trainer process with rank=[0] was aborted. Please check its log.
INFO 2020-05-23 18:17:34,813 utils.py:272] terminate all the procs

Images are 512*512 PNG with 8-bit depth, classes 1, 2, 3. What do rank and label_dims.size() in this error mean?
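To illustrate what the message means (rank is the number of tensor dimensions, and label_dims.size() is the rank of the label tensor), here is a minimal sketch of the shapes involved, assuming soft labels must match the logits' rank (an inference from the error text, not a confirmed diagnosis):

import numpy as np

logits = np.zeros((8, 2), dtype="float32")         # rank 2: [batch, classes]
bad_labels = np.zeros((8, 1, 2), dtype="float32")  # rank 3: triggers this error
good_labels = bad_labels.reshape(8, 2)             # squeeze the extra axis to rank 2

Here the label tensor carries one spurious extra dimension, which is exactly the rank:2 != label_dims.size():3 complaint.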

Training from scratch on my own dataset

Hi, I'm training from scratch on my own dataset (only one object class), but during training top1 and top2 are always 1.0000 (the same in eval), as shown (screenshot omitted).
The config file is resnet50_vd.yaml, where I changed classes_num to 2. How should I change the config for this case? A second question: how can I use a classification model trained with PaddleClas in PaddleDetection for object detection? Thanks!

HWC->CHW function redundancy

In operators.py
It seems that to_np, order, and channel_first are not necessary;
we already have a ToCHWImage function

Why Larger Batch Size Slows Training

I am training WRN-28-10 on CIFAR10 using PaddleClas. Beyond a batch size of 128, training gets slower as the batch size grows. A detailed comparison is shown below.

Batch Size Time (Per Epoch)
32 82.2s
64 72.8s
128 68.5s
256 74.1s
512 86.4s
1024 110.5s

The time of the 2nd epoch is reported, so warm-up time is not counted. Experiments showed that the results were consistent.

This behavior is strange and unexpected. Could you help me to find the reason?

Code to reproduce is here.

Thank you very much!

Model inference error

Hi, on AI Studio I converted my trained model into an inference model, but inference then fails:
!export PYTHONPATH=./:$PYTHONPATH && python tools/infer/predict.py \
    -m=./inference/ResNet50_vd/model \
    -p=./inference/ResNet50_vd/params \
    -i=./dataset/flowers102/jpg/image_02275.jpg \
    --use_gpu=1 \
    --use_tensorrt=True

The error message is as follows:

Traceback (most recent call last):
File "tools/infer/predict.py", line 156, in
main()
File "tools/infer/predict.py", line 110, in main
predictor = create_predictor(args)
File "tools/infer/predict.py", line 66, in create_predictor
predictor = create_paddle_predictor(config)
paddle.fluid.core_avx.EnforceNotMet:


C++ Call Stacks (More useful to developers):

0 std::string paddle::platform::GetTraceBackString<char const*>(char const*&&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int)
2 paddle::framework::ir::PassRegistry::Get(std::string const&) const
3 paddle::inference::analysis::IRPassManager::CreatePasses(paddle::inference::analysis::Argument*, std::vector<std::string, std::allocator<std::string> > const&)
4 paddle::inference::analysis::IRPassManager::IRPassManager(paddle::inference::analysis::Argument*)
5 paddle::inference::analysis::IrAnalysisPass::RunImpl(paddle::inference::analysis::Argument*)
6 paddle::inference::analysis::Analyzer::RunAnalysis(paddle::inference::analysis::Argument*)
7 paddle::AnalysisPredictor::OptimizeInferenceProgram()
8 paddle::AnalysisPredictor::PrepareProgram(std::shared_ptr<paddle::framework::ProgramDesc> const&)
9 paddle::AnalysisPredictor::Init(std::shared_ptr<paddle::framework::Scope> const&, std::shared_ptr<paddle::framework::ProgramDesc> const&)
10 std::unique_ptr<paddle::PaddlePredictor, std::default_delete<paddle::PaddlePredictor> > paddle::CreatePaddlePredictor<paddle::AnalysisConfig, (paddle::PaddleEngineKind)2>(paddle::AnalysisConfig const&)
11 std::unique_ptr<paddle::PaddlePredictor, std::default_delete<paddle::PaddlePredictor> > paddle::CreatePaddlePredictor<paddle::AnalysisConfig>(paddle::AnalysisConfig const&)


Error Message Summary:

Error: Pass tensorrt_subgraph_pass has not been registered at (/paddle/paddle/fluid/framework/ir/pass.h:201)

How can this be solved?

With --use_tensorrt=True:

Error: Pass tensorrt_subgraph_pass has not been registered at (/paddle/paddle/fluid/framework/ir/pass.h:170)
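The pass-registry error usually indicates that the installed PaddlePaddle wheel was built without TensorRT support (an assumption based on the message, not a confirmed diagnosis). A quick check is to rerun without the flag:

python tools/infer/predict.py -m=./inference/ResNet50_vd/model -p=./inference/ResNet50_vd/params -i=./dataset/flowers102/jpg/image_02275.jpg --use_gpu=1 --use_tensorrt=False

If that succeeds, TensorRT inference requires a PaddlePaddle build compiled with TensorRT.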

export_model model conversion error

export CUDA_VISIBLE_DEVICES=0
python -m paddle.distributed.launch \
    --selected_gpus="0" \
    tools/train.py \
    -c ./configs/quick_start/ResNet50_vd.yaml

After training the model with the command above, I converted it with export_model:
python tools/export_model.py --model=ResNet50_vd --pretrained_model=output/ResNet50_vd/19/ --output_path=inference/ResNet50_vd --class_dim=102

The error:
2020-05-09 14:36:17,701-WARNING: output/ResNet50_vd/19/.pdparams not found, try to load model file saved with [ save_params, save_persistables, save_vars ]
2020-05-09 14:36:17,701-WARNING: output/ResNet50_vd/19/.pdparams not found, try to load model file saved with [ save_params, save_persistables, save_vars ]
2020-05-09 14:36:17,703-WARNING: variable file [ output/ResNet50_vd/19/ppcls.pdopt output/ResNet50_vd/19/ppcls.pdparams output/ResNet50_vd/19/ppcls.pdmodel ] not used
2020-05-09 14:36:17,703-WARNING: variable file [ output/ResNet50_vd/19/ppcls.pdopt output/ResNet50_vd/19/ppcls.pdparams output/ResNet50_vd/19/ppcls.pdmodel ] not used
/home/lishi/anaconda3/lib/python3.7/site-packages/paddle/fluid/executor.py:804: UserWarning: There are no operators in the program to be executed. If you pass Program manually, please use fluid.program_guard to ensure the current Program is being used.
warnings.warn(error_info)
/home/lishi/anaconda3/lib/python3.7/site-packages/paddle/fluid/executor.py:782: UserWarning: The following exception is not an EOF exception.
"The following exception is not an EOF exception.")
Traceback (most recent call last):
File "tools/export_model.py", line 78, in
main()
File "tools/export_model.py", line 74, in main
params_filename='params')
File "/home/lishi/anaconda3/lib/python3.7/site-packages/paddle/fluid/io.py", line 1245, in save_inference_model
save_persistables(executor, save_dirname, main_program, params_filename)
File "/home/lishi/anaconda3/lib/python3.7/site-packages/paddle/fluid/io.py", line 640, in save_persistables
filename=filename)
File "/home/lishi/anaconda3/lib/python3.7/site-packages/paddle/fluid/io.py", line 295, in save_vars
filename=filename)
File "/home/lishi/anaconda3/lib/python3.7/site-packages/paddle/fluid/io.py", line 350, in save_vars
executor.run(save_program)
File "/home/lishi/anaconda3/lib/python3.7/site-packages/paddle/fluid/executor.py", line 783, in run
six.reraise(*sys.exc_info())
File "/home/lishi/anaconda3/lib/python3.7/site-packages/six.py", line 703, in reraise
raise value
File "/home/lishi/anaconda3/lib/python3.7/site-packages/paddle/fluid/executor.py", line 778, in run
use_program_cache=use_program_cache)
File "/home/lishi/anaconda3/lib/python3.7/site-packages/paddle/fluid/executor.py", line 831, in _run_impl
use_program_cache=use_program_cache)
File "/home/lishi/anaconda3/lib/python3.7/site-packages/paddle/fluid/executor.py", line 905, in _run_program
fetch_var_name)
paddle.fluid.core_avx.EnforceNotMet:


C++ Call Stacks (More useful to developers):

0 std::string paddle::platform::GetTraceBackString<std::string const&>(std::string const&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int)
2 paddle::framework::Tensor::type() const
3 paddle::operators::SaveCombineOpKernel<paddle::platform::CPUDeviceContext, float>::Compute(paddle::framework::ExecutionContext const&) const
4 std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CPUPlace, false, 0ul, paddle::operators::SaveCombineOpKernel<paddle::platform::CPUDeviceContext, float>, paddle::operators::SaveCombineOpKernel<paddle::platform::CPUDeviceContext, double>, paddle::operators::SaveCombineOpKernel<paddle::platform::CPUDeviceContext, int>, paddle::operators::SaveCombineOpKernel<paddle::platform::CPUDeviceContext, long> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)
5 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
6 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
7 paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
8 paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool)
9 paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator<std::string> > const&, bool, bool)


Python Call Stacks (More useful to users):

File "/home/lishi/anaconda3/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2525, in append_op
attrs=kwargs.get("attrs", None))
File "/home/lishi/anaconda3/lib/python3.7/site-packages/paddle/fluid/io.py", line 343, in save_vars
'save_to_memory': save_to_memory
File "/home/lishi/anaconda3/lib/python3.7/site-packages/paddle/fluid/io.py", line 295, in save_vars
filename=filename)
File "/home/lishi/anaconda3/lib/python3.7/site-packages/paddle/fluid/io.py", line 640, in save_persistables
filename=filename)
File "/home/lishi/anaconda3/lib/python3.7/site-packages/paddle/fluid/io.py", line 1245, in save_inference_model
save_persistables(executor, save_dirname, main_program, params_filename)
File "tools/export_model.py", line 74, in main
params_filename='params')
File "tools/export_model.py", line 78, in
main()


Error Message Summary:

Error: Tensor not initialized yet when Tensor::type() is called.
[Hint: holder_ should not be null.] at (/paddle/paddle/fluid/framework/tensor.h:140)
[operator < save_combine > error]
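Judging from the warnings above ("output/ResNet50_vd/19/.pdparams not found" and "variable file [ output/ResNet50_vd/19/ppcls.* ] not used"), the loader appears to expect a checkpoint prefix rather than a directory, so pointing --pretrained_model at the ppcls prefix may help (an assumption based on the warning text, not a verified fix):

python tools/export_model.py --model=ResNet50_vd --pretrained_model=output/ResNet50_vd/19/ppcls --output_path=inference/ResNet50_vd --class_dim=102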

Data list file delimiter

It seems the config cannot set the delimiter of the data list file. My dataset contains file names with spaces, so being able to use | would be very convenient. A parsing workaround is sketched below.
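As a workaround under the current space-delimited format, splitting each line from the right keeps spaces inside file names intact, assuming the label is the last field (a sketch, not a PaddleClas feature; train_list.txt is the usual list-file name):

with open("train_list.txt") as f:
    for line in f:
        path, label = line.rstrip("\n").rsplit(" ", 1)  # split only at the last space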

Model inference questions

Two questions, please:
1. The command below seems to infer only a single image. How can I infer a whole folder, like specifying an infer_dir in PaddleDetection?
python tools/infer/predict.py \
    -m <model file path> \
    -p <params file path> \
    -i <image path> \
    --use_gpu=1 \
    --use_tensorrt=True

2. On Windows, how do I set the environment variable? The command from AI Studio is not recognized by the Windows terminal:
export PYTHONPATH=$PWD:$PYTHONPATH
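For question 2, the Windows cmd equivalent of the export line is set PYTHONPATH=%cd%;%PYTHONPATH% (in PowerShell: $env:PYTHONPATH = "$PWD;$env:PYTHONPATH"). For question 1, until a built-in infer_dir option exists, a minimal workaround is to loop over the folder in Python (a sketch; predict_image is a hypothetical wrapper around the predictor built in tools/infer/predict.py):

import glob

def predict_dir(infer_dir, predict_image):
    for path in sorted(glob.glob(infer_dir + "/*.jpg")):
        print(path, predict_image(path))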

train from scratch

Hi, I want to use PaddleClas to train from scratch on my own data. In train_list.txt, besides the image path, should the position coordinates use the center point plus width/height, or the top-left and bottom-right corners? (screenshot omitted)

Imbalanced training data in PaddleClas

Hi, if the training data is imbalanced and skewed, does PaddleClas currently provide a corresponding solution? Thanks.

export model fails with "Tensor not initialized yet when Tensor::type() is called"

Exporting the model following the tutorial:
python tools/export_model.py \
    --model=MobileNetV3_large_x1_0 \
    --pretrained_model=./output/MobileNetV3_large_x1_0/best_model_in_epoch_7/ \
    --output_path=./convert/

The error is as follows; could anyone with experience take a look?

Python Call Stacks (More useful to users):

File "/root/anaconda3/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2525, in append_op
attrs=kwargs.get("attrs", None))
File "/root/anaconda3/lib/python3.7/site-packages/paddle/fluid/io.py", line 343, in save_vars
'save_to_memory': save_to_memory
File "/root/anaconda3/lib/python3.7/site-packages/paddle/fluid/io.py", line 295, in save_vars
filename=filename)
File "/root/anaconda3/lib/python3.7/site-packages/paddle/fluid/io.py", line 641, in save_persistables
filename=filename)
File "/root/anaconda3/lib/python3.7/site-packages/paddle/fluid/io.py", line 1246, in save_inference_model
save_persistables(executor, save_dirname, main_program, params_filename)
File "tools/export_model.py", line 74, in main
params_filename='params')
File "tools/export_model.py", line 78, in
main()


Error Message Summary:

Error: Tensor not initialized yet when Tensor::type() is called.
[Hint: holder_ should not be null.] at (/paddle/paddle/fluid/framework/tensor.h:140)
[operator < save_combine > error]

How to deploy on PaddleHub

Hi, I fine-tuned with SSLD based on the ResNet50_vd_ssld pretrained model and generated an inference model. I want to deploy it with PaddleHub. Is there a quick, shell-style deployment approach where I simply replace the inference model under an existing module and then start serving?

Fine-tuning command fails

The command, using Baidu's ResNet50_vd_10w pretrained model:
set CUDA_VISIBLE_DEVICES=0
python -m paddle.distributed.launch --selected_gpus="0" tools/train.py -c ./configs/quick_start/ResNet50_vd_10w_finetune.yaml

Error:

Traceback (most recent call last):
File "tools/train.py", line 150, in <module>
main(args)
File "tools/train.py", line 75, in main
config, train_prog, startup_prog, is_train=True)
File "F:\pythonproject\PaddleClas\PaddleClas\tools\program.py", line 363, in build
optimizer.minimize(fetchs['loss'][0])
File "F:\Anaconda3\lib\site-packages\paddle\fluid\incubate\fleet\collective\__init__.py", line 652, in minimize
fleet._main_program = self._try_to_compile(startup_program, main_program)
File "F:\Anaconda3\lib\site-packages\paddle\fluid\incubate\fleet\collective\__init__.py", line 562, in _try_to_compile
self._transpile(startup_program, main_program)
File "F:\Anaconda3\lib\site-packages\paddle\fluid\incubate\fleet\collective\__init__.py", line 489, in _transpile
current_endpoint=current_endpoint)
File "F:\Anaconda3\lib\site-packages\paddle\fluid\transpiler\distribute_transpiler.py", line 625, in transpile
wait_port=self.config.wait_port)
File "F:\Anaconda3\lib\site-packages\paddle\fluid\transpiler\distribute_transpiler.py", line 397, in _transpile_nccl2
self.config.hierarchical_allreduce_inter_nranks
File "F:\Anaconda3\lib\site-packages\paddle\fluid\framework.py", line 2610, in append_op
attrs=kwargs.get("attrs", None))
File "F:\Anaconda3\lib\site-packages\paddle\fluid\framework.py", line 1870, in __init__
proto = OpProtoHolder.instance().get_op_proto(type)
File "F:\Anaconda3\lib\site-packages\paddle\fluid\framework.py", line 1751, in get_op_proto
raise ValueError('Operator "%s" has not been registered.' % type)
ValueError: Operator "gen_nccl_id" has not been registered.
INFO 2020-06-22 11:29:30,706 utils.py:272] terminate all the procs
ERROR 2020-06-22 11:29:30,706 utils.py:416] ABORT!!! Out of all 1 trainers, the trainer process with rank=[0] was aborted. Please check its log.
INFO 2020-06-22 11:29:30,706 utils.py:272] terminate all the procs

ResNet50_vd_10w_finetune.yaml is configured as follows:
mode: 'train'
ARCHITECTURE:
    name: 'ResNet50_vd'
pretrained_model: "F:/pythonproject/PaddleClas/PaddleClas/ResNet50_vd_10w_pretrained/ResNet50_vd_10w_pretrained"
model_save_dir: "./output/"
classes_num: 5
total_images: 11745
save_interval: 1
validate: True
valid_interval: 1
epochs: 20
topk: 2
image_shape: [3, 224, 224]

LEARNING_RATE:
    function: 'Cosine'
    params:
        lr: 0.00375

OPTIMIZER:
    function: 'Momentum'
    params:
        momentum: 0.9
    regularizer:
        function: 'L2'
        factor: 0.000001

TRAIN:
    batch_size: 32
    num_workers: 4
    file_list: "F:/pythonproject\PaddleClas/PaddleClas/dataset/driver/train_list.txt"
    data_dir: "F:/pythonproject\PaddleClas/PaddleClas/dataset/driver/"
    shuffle_seed: 0
    transforms:
        - DecodeImage:
            to_rgb: True
            to_np: False
            channel_first: False
        - RandCropImage:
            size: 224
        - RandFlipImage:
            flip_code: 1
        - NormalizeImage:
            scale: 1./255.
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
        - ToCHWImage:

VALID:
    batch_size: 20
    num_workers: 4
    file_list: "F:/pythonproject\PaddleClas/PaddleClas/dataset/driver/val_list.txt"
    data_dir: "F:/pythonproject\PaddleClas/PaddleClas/dataset/driver/"
    shuffle_seed: 0
    transforms:
        - DecodeImage:
            to_rgb: True
            to_np: False
            channel_first: False
        - ResizeImage:
            resize_short: 256
        - CropImage:
            size: 224
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
        - ToCHWImage:

When inferring an image: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

aistudio@jupyter-305239-473669:~/work/PaddleClas$ python tools/infer/predict.py -m output_ca/ResNet50_vd/last/model -p output_ca/ResNet50_vd/last/params -i ./test0.jpg --use_gpu=1
Traceback (most recent call last):
File "tools/infer/predict.py", line 160, in
main()
File "tools/infer/predict.py", line 121, in main
inputs = preprocess(args.image_file, operators)
File "tools/infer/predict.py", line 88, in preprocess
data = open(fname).read()
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

What is the problem?
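The traceback shows predict.py opening the image in text mode, so Python tries to decode the JPEG bytes (which begin with 0xFF 0xD8) as UTF-8. A minimal sketch of the fix at the line cited in the traceback (tools/infer/predict.py:88) is to read raw bytes instead:

data = open(fname, 'rb').read()  # binary mode: image files are bytes, not UTF-8 text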

A problem with the multi_process reader

Process Process-1:
Traceback (most recent call last):
File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
self.run()
File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/usr/local/lib/python3.5/dist-packages/paddle/reader/decorator.py", line 549, in _read_into_queue
for sample in reader():
File "/usr/local/lib/python3.5/dist-packages/six.py", line 703, in reraise
raise value
File "/usr/local/lib/python3.5/dist-packages/paddle/reader/decorator.py", line 549, in _read_into_queue
for sample in reader():
File "/home/pd_source/cla/ppcls/data/reader.py", line 191, in reader
for line in full_lines:
File "/home/pd_source/cla/ppcls/data/reader.py", line 191, in reader
for line in full_lines:
File "/usr/lib/python3.5/bdb.py", line 48, in trace_dispatch
return self.dispatch_line(frame)
File "/usr/lib/python3.5/bdb.py", line 67, in dispatch_line
if self.quitting: raise BdbQuit
bdb.BdbQuit

/home/pd_source/cla/ppcls/data/reader.py(191)reader()
-> for line in full_lines:
(Pdb)

[the same BdbQuit traceback and pdb prompt are printed again for Process-2, Process-3, and Process-4]

2020-05-27 14:43:10 WARNING: Your reader has raised an exception!
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
self.run()
File "/usr/lib/python3.5/threading.py", line 862, in run
self._target(*self._args, **self._kwargs)
File "/usr/local/lib/python3.5/dist-packages/paddle/fluid/reader.py", line 1156, in thread_main
six.reraise(*sys.exc_info())
File "/usr/local/lib/python3.5/dist-packages/six.py", line 703, in reraise
raise value
File "/usr/local/lib/python3.5/dist-packages/paddle/fluid/reader.py", line 1136, in thread_main
for tensors in self._tensor_reader():
File "/usr/local/lib/python3.5/dist-packages/paddle/fluid/reader.py", line 1206, in tensor_reader_impl
for slots in paddle_reader():
File "/usr/local/lib/python3.5/dist-packages/paddle/fluid/data_feeder.py", line 506, in reader_creator
for item in reader():
File "/home/pd_source/cla/ppcls/data/reader.py", line 267, in wrapper
for idx, sample in enumerate(reader()):
File "/usr/local/lib/python3.5/dist-packages/paddle/reader/decorator.py", line 572, in queue_reader
raise ValueError("multiprocess reader raises an exception")
ValueError: multiprocess reader raises an exception

Traceback (most recent call last):
File "./jaits_utils/task_tools.py", line 494, in inner
func(jif, *args, **kwargs)
File "cla/jaits_train.py", line 215, in main
epoch_id, 'train')
File "/home/pd_source/cla/program.py", line 413, in run
for idx, batch in enumerate(dataloader()):
File "/usr/local/lib/python3.5/dist-packages/paddle/fluid/reader.py", line 1102, in __next__
return self._reader.read_next()
paddle.fluid.core_avx.EnforceNotMet:

C++ Call Stacks (More useful to developers):

0 std::string paddle::platform::GetTraceBackString<std::string const&>(std::string const&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int)
2 paddle::operators::reader::BlockingQueue<std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> > >::Receive(std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> >*)
3 paddle::operators::reader::PyReader::ReadNext(std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> >*)
4 std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<unsigned long>, std::__future_base::_Result_base::_Deleter>, unsigned long> >::_M_invoke(std::_Any_data const&)
5 std::__future_base::_State_base::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>&, bool&)
6 ThreadPool::ThreadPool(unsigned long)::{lambda()#1}::operator()() const

Error Message Summary:

Error: Blocking queue is killed because the data reader raises an exception
[Hint: Expected killed_ != true, but received killed_:1 == true:1.] at (/paddle/paddle/fluid/operators/reader/blocking_queue.h:141)

2020-05-27 14:43:10 INFO: SO:exception- [the same Python traceback and C++ stack are then logged once more]

Where are the log files?

The FAQ says: "after launching, the logs are written in real time to mylog/workerlog.*, where you can follow them."
But why can't I find the mylog folder after running? Also, how can I visualize the training process?
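On the visualization question: a minimal sketch using the VisualDL package (an assumption: VisualDL is the PaddlePaddle-ecosystem visualization tool and is not wired up automatically by this PaddleClas version; the API shown is VisualDL 2.x):

from visualdl import LogWriter

writer = LogWriter(logdir="./vdl_log")
for step, loss in enumerate([0.9, 0.7, 0.5]):  # stand-in for a real training loop
    writer.add_scalar(tag="train/loss", step=step, value=loss)

Then run visualdl --logdir ./vdl_log and open the printed URL in a browser.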

Error when downloading pretrained models, please take a look

!python tools/download.py -a ResNet50_vd -p ./pretrained -d True
!python tools/download.py -a ResNet50_vd_ssld -p ./pretrained -d True
!python tools/download.py -a MobileNetV3_large_x1_0 -p ./pretrained -d True

Traceback (most recent call last):
File "tools/download.py", line 17, in <module>
from ppcls import model_zoo
ModuleNotFoundError: No module named 'ppcls'
(the same traceback is printed for each of the three commands)
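The ModuleNotFoundError usually just means the repository root is not on Python's module search path (an assumption based on the traceback, consistent with the export line used elsewhere on this page). Running the commands from the PaddleClas root after

export PYTHONPATH=$PWD:$PYTHONPATH

should make ppcls importable.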

How to write the training command on Windows 10 x64

My laptop runs Windows 10 x64 with an NVIDIA GeForce GTX 1650 GPU.

Following the example, I wrote the training command as:
python -m paddle.distributed.launch \
    --selected_gpus="0" \
    tools/train.py \
    -c ./configs/quick_start/ResNet50_vd.yaml

The result is the error "gen_nccl_id" has not been registered. The QQ group says Windows does not support multi-GPU. Given my setup, how should I write the training command?

Python 2 naming error in the Res2Net-200 model

Creating the 200-layer Res2Net model under Python 2 raises:
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 4: invalid start byte

Because the number of blocks exceeds the 26 letters of the alphabet, the name generation in the code overflows:
conv_name = "res" + str(block + 2) + chr(97 + i)

The preceding branch in the code should also include res2net200:
if layers in [101, 152, 200] and block == 2:
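A sketch of the fix the reporter suggests: route the deep variants through the numeric naming branch instead of chr(), so the names stay within ASCII (variable names follow the snippet above; this mirrors the ResNet-style naming convention, not a verified patch):

if layers in [101, 152, 200] and block == 2:
    if i == 0:
        conv_name = "res" + str(block + 2) + "a"
    else:
        conv_name = "res" + str(block + 2) + "b" + str(i)
else:
    conv_name = "res" + str(block + 2) + chr(97 + i)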

Add unittest in PaddleClas

As the CI is already built,
the unit tests can be restructured, like:

|—— ppcls
|
|—— test
|————|———— test_reader.py
|————|———— test_imaug.py
|————|———— test_download.py
|————|———— test_compress.py
|————|———— test_model.py
|————|———— test_speed.py
|————|———— test_finetune.py
|————|———— test_eval.py
|————|———— test_train.py
|————|———— test_infer.py
|————|———— test_performance.py (IMPORTANT)
|————|———— test_export.py

I'd like to know the concrete deduplication steps

Thanks a lot for this great project!!! I have a question about how the dataset was deduplicated, because my dataset also needs deduplication. I only know how to find keypoints with SIFT, but different image pairs match different numbers of keypoints, so how do I judge the similarity percentage between two images and then set a threshold for deduplication?
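One common recipe (a sketch with OpenCV, assuming opencv-python >= 4.4 where SIFT is available; thresholds are data-dependent): match descriptors, keep distinctive matches with Lowe's ratio test, and normalize by the smaller keypoint count to get a percentage.

import cv2

def sift_similarity(path_a, path_b, ratio=0.75):
    sift = cv2.SIFT_create()
    img_a = cv2.imread(path_a, cv2.IMREAD_GRAYSCALE)
    img_b = cv2.imread(path_b, cv2.IMREAD_GRAYSCALE)
    kp_a, des_a = sift.detectAndCompute(img_a, None)
    kp_b, des_b = sift.detectAndCompute(img_b, None)
    if des_a is None or des_b is None:
        return 0.0
    matches = cv2.BFMatcher().knnMatch(des_a, des_b, k=2)
    # Lowe's ratio test keeps only distinctive matches.
    good = [p[0] for p in matches
            if len(p) == 2 and p[0].distance < ratio * p[1].distance]
    return len(good) / min(len(kp_a), len(kp_b))

Pairs scoring above a chosen threshold (say 0.3, tuned on a labeled sample) can then be treated as duplicates.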

make the concept `place` clear

The concept `place` is confusing when someone tries to select the available GPUs by setting CUDA_VISIBLE_DEVICES.

When using the Fleet interface, only FLAGS_selected_gpus works,

so we have to obtain the GPU count with

gpu_num = paddle.fluid.core.get_cuda_device_count() if (
        'PADDLE_TRAINERS_NUM') and (
            'PADDLE_TRAINER_ID'
    ) not in env else int(env.get('PADDLE_TRAINERS_NUM', 0))
  • remove this switch

ResNet50_vd latency

Hi, I measured ResNet50_vd at close to 24 ms on a V100. How did you measure it at under 5 ms?
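For comparison, a minimal latency-measurement sketch (assumptions: batch size 1, warmup excluded, and a predict(img) callable wrapping the inference engine; official numbers typically also rely on inference-side optimizations such as TensorRT/FP16 that this sketch does not enable):

import time
import numpy as np

def measure_latency(predict, shape=(1, 3, 224, 224), warmup=50, iters=200):
    img = np.random.rand(*shape).astype("float32")
    for _ in range(warmup):        # exclude one-time costs (graph build, cudnn autotune)
        predict(img)
    start = time.time()
    for _ in range(iters):
        predict(img)
    return (time.time() - start) / iters * 1000.0  # average ms per image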
