stevenwudi / kaggle_pku_baidu

Kaggle_PKU_Baidu

License: Apache License 2.0

Python 92.22% Dockerfile 0.02% C++ 2.27% Cuda 5.39% Shell 0.10%

kaggle-pku-baidu

kaggle_pku_baidu's Issues

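Problem testing with the weight file from Google Drive

Hello, I tested with the weight file you provided, Jan29-00-02_epoch_261_serialized_ssd-4094ffb2.pth.
After running python test_kaggle_pku.py in the terminal, it printed "Writing submission csv file to: /home/shi/data/Kaggle_pku/checkpoints/checkpoints_Jan29-00-02_epoch_261_serialized_ssd-4094ffb2__conf_0.9.csv". I then uploaded the resulting .csv file directly to the Kaggle competition site, but the score was very low:
0.013 / 0.013 on the private/public leaderboards, far below the score you reported. I would appreciate an explanation when you have time. Thank you.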

The problem of insufficient GPU memory during training

Thank you very much for open-sourcing this project. I ran into a problem while working with it and hope you can help. I tried distributed training on 4 NVIDIA 2080 Ti GPUs but ran out of GPU memory. The README mentions that training can be completed on an NVIDIA 1080. What might be causing this?

It seems that the file is missing or set incorrectly

checkpoint = load_checkpoint(model, '/pretrained_model/Jan29-00-02_epoch_261_serialized_ssd-4094ffb2.pth', map_location='cpu')
The model and loaded state dict do not match exactly

unexpected key in source state_dict: fc_car_cls_weight.weight, fc_rot_weight.weight, fc_translation_weight.weight
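
The message above lists keys that exist in the checkpoint but not in the model being built. A minimal sketch of that comparison, using plain lists of key names so it runs without PyTorch (the helper name `diff_state_dict_keys` and the key `backbone.conv1.weight` are made up for illustration; the other three keys are the ones from the log):

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, each sorted alphabetically.

    missing:    keys the model expects but the checkpoint lacks
    unexpected: keys the checkpoint holds but the model does not use
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# "backbone.conv1.weight" is a hypothetical key; the fc_* keys come from the log.
model_keys = ["backbone.conv1.weight"]
checkpoint_keys = [
    "backbone.conv1.weight",
    "fc_car_cls_weight.weight",
    "fc_rot_weight.weight",
    "fc_translation_weight.weight",
]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

With a real model, the same function could presumably be fed `model.state_dict().keys()` and the keys of the loaded checkpoint's state dict. Whether the three extra keys matter for inference is not something this sketch can answer.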

install mmdet: 1.0.rc0+d3ca926 error

Hi, I would like to know whether installing the wrong version will cause the code to run incorrectly.
I followed your instructions and ran python setup.py install. After installing,
I ran pip list, which shows:
mmdet 1.0rc0+unknown /home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg
This is not mmdet: 1.0.rc0+d3ca926.

Then I ran python train_kaggle_pku.py,

and the earlier problem occurred again:
File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/runner/hooks/logger/tensorboard.py", line 30, in before_run 'Please run "pip install future tensorboard" to install ' ImportError: Please run "pip install future tensorboard" to install the dependencies to use torch.utils.tensorboard (applicable to PyTorch 1.1 or higher)

So I want to know whether the wrong mmdet version is what causes the code to fail,
and how to install the correct version, mmdet: 1.0.rc0+d3ca926.
I hope you can answer when you have time. Thank you.
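
The two version strings can be compared part by part. A small sketch, assuming the text after `+` is a local build label in the PEP 440 sense (the helper name `split_local_version` is made up for illustration):

```python
def split_local_version(version):
    """Split 'X+Y' into (public version, local build label)."""
    public, _, local = version.partition("+")
    return public, local


installed = "1.0rc0+unknown"   # from the pip list output above
expected = "1.0rc0+d3ca926"    # the build named in the issue title

installed_public, installed_local = split_local_version(installed)
expected_public, expected_local = split_local_version(expected)

print("same release:", installed_public == expected_public)
print("build labels:", installed_local, "vs", expected_local)
```

Under that assumption the two installs are the same release, 1.0rc0, and differ only in the build label. The traceback in the single-GPU issue further down shows the same ImportError from an egg named mmdet-1.0rc0+d3ca926, which suggests the label alone may not explain the failure.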

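Error when training on a single GPU

Hello, sorry to trouble you again. I have run into two more problems.

I have set up both the Configurations and the Dataset setup (using the kaggle_apollo_combined_6691_origin.json you provided).
Since I only have a single GPU, I ran the single-GPU training code (python train_kaggle_pku.py), but it fails with this error: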

2020-03-14 23:07:57,375 - INFO - Distributed training: False
 16%|█▋        | 13/79 [00:00<00:00, 127.31it/s]Loading Car model files...
100%|██████████| 79/79 [00:00<00:00, 122.13it/s]
Totaly corrupted count is: 1240, clean count: 74029
Total images: 6691, car num sum: 74029, minmin: 1, max: 43, mean: 11
x min: -851, max: 4116, mean: 1535
y min: 1482, max: 3427, mean: 1821
x min: -79, max: 79, mean: -3, std: 13.622
y min: 1, max: 42, mean: 9, std: 4.747
z min: 3, max: 150, mean: 50, std: 29.950
Car model: max: 76, min: 2, total: 74029
Unique car models:
[ 2  6  7  8  9 12 14 16 18 19 20 23 25 27 28 31 32 35 37 40 43 46 47 48
 50 51 54 56 60 61 66 70 71 76]
Number of unique car models: 34
  0%|          | 0/79 [00:00<?, ?it/s]validation_images
Loading Car model files...
100%|██████████| 79/79 [00:00<00:00, 119.03it/s]
2020-03-14 23:08:03,864 - INFO - Start running, host: shi@shi-Lenovo-Legion-Y7000P-1060, work_dir: /home/shi/Kaggle/checkpoints/Mar14-23-07
2020-03-14 23:08:03,864 - INFO - workflow: [('train', 1)], max: 200 epochs
/home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
Traceback (most recent call last):
  File "<ipython-input-1-3c5e8e6cf921>", line 1, in <module>
    runfile('/home/shi/Kaggle/tools/train_kaggle_pku.py', wdir='/home/shi/Kaggle/tools')
  File "/home/shi/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 827, in runfile
    execfile(filename, namespace)
  File "/home/shi/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 110, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "/home/shi/Kaggle/tools/train_kaggle_pku.py", line 100, in <module>
    main()
  File "/home/shi/Kaggle/tools/train_kaggle_pku.py", line 96, in main
    logger=logger)
  File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+d3ca926-py3.7-linux-x86_64.egg/mmdet/apis/train.py", line 79, in train_detector
    _non_dist_train(model, dataset, cfg, validate=validate)
  File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+d3ca926-py3.7-linux-x86_64.egg/mmdet/apis/train.py", line 257, in _non_dist_train
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/runner/runner.py", line 351, in run
    self.call_hook('before_run')

  File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/runner/runner.py", line 238, in call_hook
    getattr(hook, fn_name)(self)

  File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/runner/dist_utils.py", line 74, in wrapper
    return func(*args, **kwargs)

  File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/runner/hooks/logger/tensorboard.py", line 28, in before_run
    'Please run "pip install future tensorboard" to install '

ImportError: Please run "pip install future tensorboard" to install the dependencies to use torch.utils.tensorboard (applicable to PyTorch 1.1 or higher)

I followed the hint and ran pip install future tensorboard, but the same error still appears, so I am out of options and have to ask for your help.
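
One thing worth checking in a situation like this is which interpreter is running and where it would resolve each package from, since the log shows paths under both ~/.local and anaconda3. A diagnostic sketch using only the standard library (the helper name `locate_module` is made up for illustration):

```python
import importlib.util
import sys


def locate_module(name):
    """Return where `name` would be imported from (usually a file path), or None."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised when a parent package is missing or cannot be imported.
        return None
    if spec is None:
        return None
    return spec.origin


print("interpreter:", sys.executable)
for name in ("future", "tensorboard", "torch.utils.tensorboard"):
    print(name, "->", locate_module(name))
```

This only reports where modules resolve. A module can be found and still fail to import, for example if the installed version is incompatible, so a non-None result here does not by itself rule out the ImportError above.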

I have another question. The single-GPU training code, python train_kaggle_pku.py, does not support validation/evaluation. If I want to use the multi-GPU training command
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5 python -m torch.distributed.launch --nproc_per_node=6 train_kaggle_pku.py --launcher pytorch
to train on a machine with only one GPU and also run validation, how should I change this command?

I hope you can answer when you have time. Thank you.
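
For what it is worth, torch.distributed.launch accepts a single process, so one possible adaptation is to list one GPU and set --nproc_per_node=1. The sketch below only builds the command strings. It assumes, without having tested it, that this repository's distributed code path works with a single process. The helper names `build_launch` and `as_shell` are made up for illustration.

```python
def build_launch(gpu_ids, script="train_kaggle_pku.py"):
    """Build env and argv for torch.distributed.launch, one process per listed GPU."""
    env = {"CUDA_VISIBLE_DEVICES": ",".join(str(i) for i in gpu_ids)}
    argv = [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={len(gpu_ids)}",
        script, "--launcher", "pytorch",
    ]
    return env, argv


def as_shell(env, argv):
    """Render env and argv as a single shell command line."""
    prefix = " ".join(f"{key}={value}" for key, value in env.items())
    return f"{prefix} {' '.join(argv)}"


print(as_shell(*build_launch([0, 1, 2, 3, 4, 5])))  # the six-GPU command quoted above
print(as_shell(*build_launch([0])))                 # single-GPU variant
```

The first line reproduces the six-GPU command from the question. The second differs only in the GPU list and the process count.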

train.csv error in configs.data.train

Hello. In configs/htc/htc_hrnetv2p_w48_20e_kaggle_pku_no_semantic_translation_wudi.py I wanted to use Kaggle's train.csv dataset, so I changed the config.data.train code as follows:
data = dict(
    imgs_per_gpu=1,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        #ann_file='/data/cyh/kaggle/kaggle_apollo_combine_6692.json',
        # ann_file=data_root + 'apollo_kaggle_combined_6725_wudi.json',
        ann_file='/data/Kaggle/pku-autonomous-driving/train.csv',  # 6691 means the final cleaned data
        img_prefix=data_root + 'train_images/',
        pipeline=train_pipeline,
        rotation_augmenation=True),

But this produces an error:
runfile('/home/shi/data/Kaggle_pku/tools/train_kaggle_pku.py', wdir='/home/shi/data/Kaggle_pku/tools')
2020-04-07 13:00:48,313 - INFO - Distributed training: False
14%|█▍ | 11/79 [00:00<00:00, 77.85it/s]Loading Car model files...
100%|██████████| 79/79 [00:00<00:00, 109.19it/s]
Traceback (most recent call last):
  File "", line 1, in
    runfile('/home/shi/data/Kaggle_pku/tools/train_kaggle_pku.py', wdir='/home/shi/data/Kaggle_pku/tools')
  File "/home/shi/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 827, in runfile
    execfile(filename, namespace)
  File "/home/shi/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 110, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "/home/shi/data/Kaggle_pku/tools/train_kaggle_pku.py", line 100, in <module>
    main()
  File "/home/shi/data/Kaggle_pku/tools/train_kaggle_pku.py", line 78, in main
    datasets = [build_dataset(cfg.data.train)]
  File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/datasets/builder.py", line 39, in build_dataset
    dataset = build_from_cfg(cfg, DATASETS, default_args)
  File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/utils/registry.py", line 76, in build_from_cfg
    return obj_cls(**args)
  File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/datasets/custom.py", line 66, in __init__
    self.img_infos = self.load_annotations(self.ann_file)
  File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/datasets/kaggle_pku.py", line 75, in load_annotations
    annotations = json.load(open(outfile, 'r'))
  File "/home/shi/anaconda3/lib/python3.7/json/__init__.py", line 296, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/home/shi/anaconda3/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/home/shi/anaconda3/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/shi/anaconda3/lib/python3.7/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
JSONDecodeError: Expecting value
How can I solve this problem?
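
The traceback shows load_annotations passing the annotation file to json.load, so a CSV file is rejected at the first character. A small reproduction on made-up data, assuming the competition layout of an ImageId column plus a PredictionString column holding seven space-separated values per car (the sample row is invented, not taken from the dataset):

```python
import csv
import io
import json

# Invented two-line sample in the assumed layout of train.csv.
sample_csv = "ImageId,PredictionString\nID_example,5 0.1 0.2 -3.1 1.5 2.5 30.0\n"

# Feeding CSV text to the JSON parser fails, as in the traceback above.
try:
    json.loads(sample_csv)
    parsed_as_json = True
except json.JSONDecodeError as err:
    parsed_as_json = False
    print("not JSON:", err.msg)

# The same text reads cleanly with the csv module.
rows = list(csv.DictReader(io.StringIO(sample_csv)))
values = rows[0]["PredictionString"].split()
cars = [values[i:i + 7] for i in range(0, len(values), 7)]  # seven values per car
print(len(cars), "car(s) in", rows[0]["ImageId"])
```

This only illustrates why the loader fails on train.csv. It does not show how to produce the JSON annotation file the dataset class expects, since that file's structure is not described in this thread.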

How to use the monocular image KITTI interface?

I found the monocular image interface based on KITTI 3D, but I could not find a network architecture suited to KITTI 3D. Can I run the code on the KITTI 3D dataset, and could you tell me how to set it up?

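The kaggle_apollo_combined_6691_origin.json file

Hello, I admire your work and have been following it since 6DVNet. I am currently trying to reproduce your results from the Kaggle PKU competition but have run into a problem. I would like to know what the file kaggle_apollo_combined_6691_origin.json
contains, where I can find it, or how I can build it myself from the dataset.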

more detailed docker file would help

I am very interested in your work. I used your Dockerfile to start working with your repository, but mmcv was not installed. It looks like a version problem, and I have tried many ways to get around it. Would you mind providing a fully working Dockerfile or uploading your Docker image?

How to load your pre-trained weights into the model?

Hi,
I'm trying to run your model for a university project, but I can't figure out how to configure your code to use your pre-trained weights. Do I use 'cfg.load_from' in your provided config file?

Any help would be appreciated. I'm running inside the docker-file you have provided.
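
In mmdetection 1.x style configs, `load_from` and `resume_from` are top-level fields that the training entry point reads. A hypothetical excerpt, assuming this repository's config follows that convention (the directory is a placeholder; the file name is the checkpoint mentioned in the issues above):

```python
# Hypothetical config excerpt, assuming the mmdetection 1.x convention.
# Replace the placeholder directory with the real checkpoint location.
load_from = '/path/to/Jan29-00-02_epoch_261_serialized_ssd-4094ffb2.pth'  # weights only
resume_from = None  # set this instead to also restore optimizer state and epoch
```

For inference, the test script quoted in an earlier issue passes the path directly to load_checkpoint(model, ..., map_location='cpu'), so the config field may matter only for training.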
