stevenwudi / kaggle_pku_baidu
License: Apache License 2.0
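Hello, I tested with the weight file you provided, Jan29-00-02_epoch_261_serialized_ssd-4094ffb2.pth.
I ran python test_kaggle_pku.py in the terminal,
which printed: Writing submission csv file to: /home/shi/data/Kaggle_pku/checkpoints/checkpoints_Jan29-00-02_epoch_261_serialized_ssd-4094ffb2__conf_0.9.csv
I then uploaded the output .csv file directly to the Kaggle competition site, but the score was very low:
0.013 / 0.013 on the private/public leaderboard, far below the score you reported. I would appreciate any help when you have time. Thank you.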
Thank you very much for open-sourcing this project. I ran into a problem while studying it and hope you can help. I tried distributed training on four NVIDIA 2080 Ti GPUs but ran out of memory. The README mentions that training can be completed on an NVIDIA 1080. What are the possible reasons?
checkpoint = load_checkpoint(model, '/pretrained_model/Jan29-00-02_epoch_261_serialized_ssd-4094ffb2.pth', map_location='cpu')
The model and loaded state dict do not match exactly
unexpected key in source state_dict: fc_car_cls_weight.weight, fc_rot_weight.weight, fc_translation_weight.weight
these keys have mismatched shape:
+------------------------------------+----------------------+-------------------------+
| key | expected shape | loaded shape |
+------------------------------------+----------------------+-------------------------+
| translation_head.trans_pred.weight | torch.Size([3, 200]) | torch.Size([1629, 200]) |
| translation_head.trans_pred.bias | torch.Size([3]) | torch.Size([1629]) |
+------------------------------------+----------------------+-------------------------+
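The report above lists two kinds of problems: keys the model does not have, and keys whose shapes differ (3 outputs expected, 1629 in the checkpoint). A minimal sketch of how a checkpoint's entries could be partitioned along those lines before loading is shown below. The helper name `split_by_shape`, the stand-in `fake` objects, and the `backbone.conv1.weight` entry are hypothetical and only illustrate the idea; the function needs nothing more than a `.shape` attribute on each value, so it would also accept real tensors.

```python
from types import SimpleNamespace


def split_by_shape(loaded, expected):
    """Partition checkpoint entries into loadable, shape-mismatched and unexpected keys."""
    keep, mismatched, unexpected = {}, [], []
    for key, value in loaded.items():
        if key not in expected:
            unexpected.append(key)  # the model has no parameter with this name
        elif tuple(value.shape) != tuple(expected[key].shape):
            mismatched.append(key)  # same name, different shape
        else:
            keep[key] = value
    return keep, mismatched, unexpected


def fake(*shape):
    """Hypothetical stand-in for a tensor: only the shape matters here."""
    return SimpleNamespace(shape=shape)


# Shapes for the translation head are copied from the report above;
# 'backbone.conv1.weight' is an invented example of a matching key.
expected = {
    "translation_head.trans_pred.weight": fake(3, 200),
    "translation_head.trans_pred.bias": fake(3),
    "backbone.conv1.weight": fake(64, 3, 7, 7),
}
loaded = {
    "translation_head.trans_pred.weight": fake(1629, 200),
    "translation_head.trans_pred.bias": fake(1629),
    "backbone.conv1.weight": fake(64, 3, 7, 7),
    "fc_rot_weight.weight": fake(1, 1),
}

keep, mismatched, unexpected = split_by_shape(loaded, expected)
print("loadable:", sorted(keep))
print("mismatched:", mismatched)
print("unexpected:", unexpected)
```

With PyTorch, the `keep` dictionary could then be passed to `model.load_state_dict(keep, strict=False)`. Note that dropping the mismatched entries leaves the translation head untrained, so this only helps inspection. The shape difference more likely means the config used to build the model differs from the one the checkpoint was trained with, though the thread does not confirm that.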
Hi, I want to know whether installing the wrong version will cause the code to run incorrectly.
I followed your instructions and ran python setup.py install.
After the install, I ran pip list, which shows:
mmdet 1.0rc0+unknown /home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg
This is not mmdet: 1.0.rc0+d3ca926.
When I then run python train_kaggle_pku.py, the earlier problem occurs:
File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/runner/hooks/logger/tensorboard.py", line 30, in before_run 'Please run "pip install future tensorboard" to install ' ImportError: Please run "pip install future tensorboard" to install the dependencies to use torch.utils.tensorboard (applicable to PyTorch 1.1 or higher)
So I want to know whether the wrong mmdet version is what causes the code to fail,
and how to install the correct version, mmdet: 1.0.rc0+d3ca926.
I hope you can answer when you have time. Thank you.
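Hello, sorry to bother you again. I have run into two more problems.
I have finished the Configurations and Dataset setup steps (using the kaggle_apollo_combined_6691_origin.json you provided).
Since I only have a single GPU, I ran the single-GPU training code (python train_kaggle_pku.py), but it fails with an error: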
2020-03-14 23:07:57,375 - INFO - Distributed training: False
16%|█▋ | 13/79 [00:00<00:00, 127.31it/s]Loading Car model files...
100%|██████████| 79/79 [00:00<00:00, 122.13it/s]
Totaly corrupted count is: 1240, clean count: 74029
Total images: 6691, car num sum: 74029, minmin: 1, max: 43, mean: 11
x min: -851, max: 4116, mean: 1535
y min: 1482, max: 3427, mean: 1821
x min: -79, max: 79, mean: -3, std: 13.622
y min: 1, max: 42, mean: 9, std: 4.747
z min: 3, max: 150, mean: 50, std: 29.950
Car model: max: 76, min: 2, total: 74029
Unique car models:
[ 2 6 7 8 9 12 14 16 18 19 20 23 25 27 28 31 32 35 37 40 43 46 47 48
50 51 54 56 60 61 66 70 71 76]
Number of unique car models: 34
0%| | 0/79 [00:00<?, ?it/s]validation_images
Loading Car model files...
100%|██████████| 79/79 [00:00<00:00, 119.03it/s]
2020-03-14 23:08:03,864 - INFO - Start running, host: shi@shi-Lenovo-Legion-Y7000P-1060, work_dir: /home/shi/Kaggle/checkpoints/Mar14-23-07
2020-03-14 23:08:03,864 - INFO - workflow: [('train', 1)], max: 200 epochs
/home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/shi/.local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
Traceback (most recent call last):
File "<ipython-input-1-3c5e8e6cf921>", line 1, in <module>
runfile('/home/shi/Kaggle/tools/train_kaggle_pku.py', wdir='/home/shi/Kaggle/tools')
File "/home/shi/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 827, in runfile
execfile(filename, namespace)
File "/home/shi/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "/home/shi/Kaggle/tools/train_kaggle_pku.py", line 100, in <module>
main()
File "/home/shi/Kaggle/tools/train_kaggle_pku.py", line 96, in main
logger=logger)
File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+d3ca926-py3.7-linux-x86_64.egg/mmdet/apis/train.py", line 79, in train_detector
_non_dist_train(model, dataset, cfg, validate=validate)
File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+d3ca926-py3.7-linux-x86_64.egg/mmdet/apis/train.py", line 257, in _non_dist_train
runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/runner/runner.py", line 351, in run
self.call_hook('before_run')
File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/runner/runner.py", line 238, in call_hook
getattr(hook, fn_name)(self)
File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/runner/dist_utils.py", line 74, in wrapper
return func(*args, **kwargs)
File "/home/shi/anaconda3/lib/python3.7/site-packages/mmcv/runner/hooks/logger/tensorboard.py", line 28, in before_run
'Please run "pip install future tensorboard" to install '
ImportError: Please run "pip install future tensorboard" to install the dependencies to use torch.utils.tensorboard (applicable to PyTorch 1.1 or higher)
I followed the hint and ran pip install future tensorboard, but the same error still appears, so I am out of options and have to ask you for help.
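In the log above, mmcv is imported from the anaconda3 site-packages directory while the tensorboard warnings come from ~/.local, so one thing worth checking is which interpreter is running the script and where each dependency resolves from. The following is a small diagnostic sketch using only the standard library; the function names `locate` and `report` and the default list of module names are assumptions chosen for illustration, and the sketch does not claim to identify the actual cause of the ImportError.

```python
import importlib.util
import sys


def locate(module_name):
    """Return the file a top-level module would be imported from, or None if not found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    return spec.origin or "(namespace package)"


def report(names=("future", "tensorboard", "torch", "mmcv", "mmdet")):
    """List the running interpreter and the resolved location of each module."""
    lines = ["interpreter: " + sys.executable]
    for name in names:
        lines.append("{}: {}".format(name, locate(name) or "NOT FOUND"))
    return lines


print("\n".join(report()))
```

Running this from the same environment that launches train_kaggle_pku.py (for example the same Spyder console) shows whether pip installed the packages into the interpreter that is actually used.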
I have another question. The single-GPU training code, python train_kaggle_pku.py,
does not support validation/evaluation. If I want to use the multi-GPU training code
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5 python -m torch.distributed.launch --nproc_per_node=6 train_kaggle_pku.py --launcher pytorch
to train and run validation on a machine with only one GPU, how should I change this command?
I would appreciate your help when you have time. Thank you.
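The command quoted above ties two numbers together: the device list in CUDA_VISIBLE_DEVICES and the process count in --nproc_per_node. A minimal sketch of how the same command could be generated for any GPU count, including one, is given below. The helper `build_launch` is hypothetical; the flags are the ones already present in the quoted command, and whether this repository's validation step works with a single process is an assumption the thread does not confirm.

```python
def build_launch(num_gpus, script="train_kaggle_pku.py"):
    """Build the environment and argument list for a torch.distributed.launch run."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    # One visible device per process: "0" for one GPU, "0,1,2,3,4,5" for six.
    env = {"CUDA_VISIBLE_DEVICES": ",".join(str(i) for i in range(num_gpus))}
    cmd = [
        "python", "-m", "torch.distributed.launch",
        "--nproc_per_node={}".format(num_gpus),
        script, "--launcher", "pytorch",
    ]
    return env, cmd


env, cmd = build_launch(1)
print("CUDA_VISIBLE_DEVICES={} {}".format(env["CUDA_VISIBLE_DEVICES"], " ".join(cmd)))
```

Calling `build_launch(6)` reproduces the six-GPU command from the post, which is a quick way to check that the single-GPU variant differs only in the device list and process count.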
Hello, in configs/htc/htc_hrnetv2p_w48_20e_kaggle_pku_no_semantic_translation_wudi.py I wanted to use the Kaggle dataset's train.csv, so I changed the config.data.train code as follows:
data = dict(
    imgs_per_gpu=1,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        #ann_file='/data/cyh/kaggle/kaggle_apollo_combine_6692.json',
        # ann_file=data_root + 'apollo_kaggle_combined_6725_wudi.json',
        ann_file='/data/Kaggle/pku-autonomous-driving/train.csv',  # 6691 means the final cleaned data
        img_prefix=data_root + 'train_images/',
        pipeline=train_pipeline,
        rotation_augmenation=True),
But this produces an error:
`runfile('/home/shi/data/Kaggle_pku/tools/train_kaggle_pku.py', wdir='/home/shi/data/Kaggle_pku/tools')
2020-04-07 13:00:48,313 - INFO - Distributed training: False
14%|█▍ | 11/79 [00:00<00:00, 77.85it/s]Loading Car model files...
100%|██████████| 79/79 [00:00<00:00, 109.19it/s]
Traceback (most recent call last):
File "", line 1, in
runfile('/home/shi/data/Kaggle_pku/tools/train_kaggle_pku.py', wdir='/home/shi/data/Kaggle_pku/tools')
File "/home/shi/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 827, in runfile
execfile(filename, namespace)
File "/home/shi/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "/home/shi/data/Kaggle_pku/tools/train_kaggle_pku.py", line 100, in
main()
File "/home/shi/data/Kaggle_pku/tools/train_kaggle_pku.py", line 78, in main
datasets = [build_dataset(cfg.data.train)]
File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/datasets/builder.py", line 39, in build_dataset
dataset = build_from_cfg(cfg, DATASETS, default_args)
File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/utils/registry.py", line 76, in build_from_cfg
return obj_cls(**args)
File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/datasets/custom.py", line 66, in __init__
self.img_infos = self.load_annotations(self.ann_file)
File "/home/shi/anaconda3/lib/python3.7/site-packages/mmdet-1.0rc0+unknown-py3.7-linux-x86_64.egg/mmdet/datasets/kaggle_pku.py", line 75, in load_annotations
annotations = json.load(open(outfile, 'r'))
File "/home/shi/anaconda3/lib/python3.7/json/__init__.py", line 296, in load
parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
File "/home/shi/anaconda3/lib/python3.7/json/__init__.py", line 348, in loads
return _default_decoder.decode(s)
File "/home/shi/anaconda3/lib/python3.7/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/home/shi/anaconda3/lib/python3.7/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
JSONDecodeError: Expecting value`
How can I solve this problem?
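The traceback shows load_annotations calling json.load on the annotation file, so a CSV passed as ann_file fails at the first character with "Expecting value". As a rough sketch of what reading train.csv directly would involve, the snippet below parses the competition's PredictionString column, which, as I understand the Kaggle data description, stores seven values per car: model type, yaw, pitch, roll, x, y, z. The function names and the sample row are made up for illustration, and the resulting dictionaries are not the schema of kaggle_apollo_combined_6691_origin.json, which the thread does not show.

```python
import csv
import io

POSE_FIELDS = ("yaw", "pitch", "roll", "x", "y", "z")


def parse_prediction_string(text):
    """Split a PredictionString into one dict per car (7 values each)."""
    values = text.split()
    if len(values) % 7 != 0:
        raise ValueError("expected a multiple of 7 values, got {}".format(len(values)))
    cars = []
    for start in range(0, len(values), 7):
        chunk = values[start:start + 7]
        car = {"model_type": int(chunk[0])}
        car.update(zip(POSE_FIELDS, (float(v) for v in chunk[1:])))
        cars.append(car)
    return cars


def read_train_csv(handle):
    """Map each ImageId to its list of parsed cars."""
    reader = csv.DictReader(handle)
    return {row["ImageId"]: parse_prediction_string(row["PredictionString"])
            for row in reader}


# Invented sample with two cars; the numbers are placeholders, not real data.
sample = io.StringIO(
    "ImageId,PredictionString\n"
    "ID_example,16 0.25 0.0 -3.0 -2.5 3.5 11.0 56 0.5 1.5 -3.0 6.0 5.0 30.0\n"
)
annotations = read_train_csv(sample)
print(annotations["ID_example"][0])
```

Converting these records into the JSON layout the dataset class expects would still require knowing that layout, so pointing ann_file back at the provided JSON file is the safer route until the format is documented.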
I found the monocular image interface based on KITTI 3D, but I could not find a network architecture suited to KITTI 3D. Could I run the code on the KITTI 3D dataset, and could you tell me what would need to change?
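Hello, I admire your work and have been following it since 6DVNet. I am currently trying to reproduce your results from the Kaggle PKU competition but have run into a problem. I would like to know what the file kaggle_apollo_combined_6691_origin.json contains, where I can find it, or how I can build it myself from the dataset.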
I am very interested in your work. I used your Dockerfile to get started with your repository, but mmcv was not installed. It seems to be a version problem, and I have tried many ways to get around it. Would you mind providing a fully working Dockerfile or uploading your Docker image?
Hi,
I'm trying to run your model for a university project, but I can't figure out how to configure your code to use your pre-trained weights. Do I use 'cfg.load_from' in your provided config file?
Any help would be appreciated. I'm running inside the docker-file you have provided.