Repository for the Single Shot MultiBox Detector and its variants, implemented with PyTorch and Python 3.
Currently, it contains these features:
- Multiple SSD Variants: ssd, rfb, fssd, ssd-lite, rfb-lite, fssd-lite
- Multiple Base Networks: VGG, Mobilenet V1/V2
- Free Image Size
- Visualization with tensorboard-pytorch: training loss, eval loss/mAP, example anchor boxes.
This repo builds on the work of ssd.pytorch, faster-rcnn.pytorch, RFBNet, Detectron and Tensorflow Object Detection API. Thanks for their work.
- install pytorch
- install requirements by
pip install -r ./requirements.txt
To train, test, or demo a specific model, run the corresponding script in the repository folder with the model's configuration file, for example:
python train.py --cfg=./experiments/cfgs/rfb_lite_mobilenetv2_train_voc.yml
Adjust the configuration file following the notes in config_parse.py.
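As a rough illustration of how a script can accept the `--cfg` flag shown above, here is a minimal sketch using `argparse`. It is not the repository's actual entry point: the function name `parse_args` and the attribute name `config_file` are assumptions made for this example.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line of a hypothetical train/test/demo script."""
    parser = argparse.ArgumentParser(description="Run an SSD-style detector")
    # --cfg mirrors the flag used in the command above; how the real
    # scripts consume it is an assumption of this sketch.
    parser.add_argument("--cfg", dest="config_file", required=True,
                        help="path to the model configuration YAML file")
    return parser.parse_args(argv)


# Passing the arguments explicitly keeps the example independent of sys.argv.
args = parse_args(["--cfg=./experiments/cfgs/rfb_lite_mobilenetv2_train_voc.yml"])
print("Using configuration:", args.config_file)
```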
VOC2007 | YOLO_v2 | YOLO_v3 | SSD | RFB | FSSD |
---|---|---|---|---|---|
Darknet53 | 79.3% | 77.3% | 79.5% | 81.0% | |
Darknet19 | 78.4% | 76.1% | 78.4% | 81.0% | |
Resnet50 | 79.7% | 81.2% | |||
VGG16 | 76.0% | 80.5% | 77.8% | ||
MobilenetV1 | 74.7% | 78.2% | 72.7% | 73.7% | 78.4% |
MobilenetV2 | 72.0% | 75.8% | 73.2% | 73.4% | 76.7% |
COCO2017 | YOLO_v2 | YOLO_v3 | SSD | RFB | FSSD |
---|---|---|---|---|---|
Darknet53 | 27.3% | 21.1% | 22.8% | 25.8% | |
Darknet19 | 21.6% | 20.6% | 22.4% | 26.4% | |
Resnet50 | 25.1% | 26.5% | 27.2% | ||
VGG16 | 25.4% | 25.5% | 27.2% | ||
MobilenetV1 | 21.5% | 25.7% | 18.8% | 19.1% | 24.2% |
MobilenetV2 | 20.4% | 24.0% | 18.5% | 18.5% | 22.2% |
Net InferTime* (fp32) | YOLO_v2 | YOLO_v3 | SSD | RFB | FSSD |
---|---|---|---|---|---|
Darknet53 | 5.11ms | 4.32ms | 6.63ms | 4.41ms | |
Darknet19 | 1.64ms | 2.21ms | 4.57ms | 2.29ms | |
Resnet50 | 3.60ms | 6.04ms | 3.85ms | ||
VGG16 | 1.75ms | 4.20ms | 1.98ms | ||
MobilenetV1 | 2.02ms | 3.31ms | 2.80ms | 3.84ms | 2.62ms |
MobilenetV2 | 3.35ms | 4.69ms | 4.05ms | 5.26ms | 4.00ms |
(* Only the network inference time is measured, excluding pre-processing and post-processing. The speed of VGG is surprisingly impressive. This may be because MobilenetV1 and MobilenetV2 use the -lite structure, which uses separable convolutions in the base and extra layers.)
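For reference, network-only timing of this kind can be sketched as below. This is a generic stdlib example, not the script used to produce the tables; `mean_inference_ms` and `dummy_forward` are hypothetical names, and the warm-up and run counts are arbitrary.

```python
import time


def mean_inference_ms(forward, batch, warmup=10, runs=100):
    """Return the mean wall-clock time of forward(batch) in milliseconds."""
    # Warm-up iterations are excluded so one-off setup costs do not skew the mean.
    for _ in range(warmup):
        forward(batch)
    start = time.perf_counter()
    for _ in range(runs):
        forward(batch)
    # On a GPU, synchronize the device here (e.g. torch.cuda.synchronize())
    # before reading the clock, otherwise queued kernels are not counted.
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / runs


# Stand-in for a network: any callable that takes an already pre-processed batch.
def dummy_forward(batch):
    return [x * 2.0 for x in batch]


print(f"{mean_inference_ms(dummy_forward, [0.1] * 1000):.3f} ms per forward pass")
```

Because the batch is prepared before the timer starts and the outputs are not decoded, pre-processing and post-processing stay out of the measurement.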
Transfer learning takes two steps: preparing the dataset (including setting its path in the configuration file) and converting the weights before training.
First, set cfg.DATASET.DATASET to homemade and set DATACLASSES for this dataset in the configuration file. See homemade_train.yml for an example.
Second, place the dataset in the right location. The annotation XML files (following the VOC standard) must be placed in ./data/HOMEMADE/Annotations. The images must be placed in ./data/HOMEMADE/JPEGImages. HOMEMADE may be changed; it follows the name of DATASET_DIR.
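Before going further, it can help to check the layout described above. The sketch below is not part of the repository: `check_voc_layout` is a hypothetical helper, and it assumes images use the `.jpg` extension and that annotations follow the VOC convention of one `<name>` per `<object>`. It reports unpaired files and collects the class names found, which can be compared with DATACLASSES.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def check_voc_layout(dataset_dir):
    """Return (images without XML, XML without image, class names) for a VOC-style folder."""
    root = Path(dataset_dir)
    xml_stems = {p.stem for p in (root / "Annotations").glob("*.xml")}
    # Assumption: images are stored as .jpg files.
    img_stems = {p.stem for p in (root / "JPEGImages").glob("*.jpg")}
    classes = set()
    for stem in xml_stems:
        tree = ET.parse(str(root / "Annotations" / f"{stem}.xml"))
        # In the VOC format each <object> element carries its label in <name>.
        for obj in tree.getroot().iter("object"):
            name = obj.findtext("name")
            if name:
                classes.add(name.strip())
    return sorted(img_stems - xml_stems), sorted(xml_stems - img_stems), sorted(classes)


# Only runs when the dataset folder from the text is present.
if Path("./data/HOMEMADE").is_dir():
    no_xml, no_image, class_names = check_voc_layout("./data/HOMEMADE")
    print("images without annotation:", no_xml)
    print("annotations without image:", no_image)
    print("classes found:", class_names)
```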
Third, execute python3 make_file_lists.py --cfg=./experiments/cfgs/homemade_train.yml
homemade_train.yml can be replaced by your own configuration file.
Once the dataset is prepared, convert the weights.
Use weight_convert.py to convert the pretrained weights so they can initialize the new model.
Run python3 weight_converter.py --cfg=./homemade_train.yml, where homemade_train.yml may be replaced by your own configuration yml.
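The internals of the conversion script are not described here. One common approach to this kind of weight transfer, shown purely as an assumption, is to keep only the pretrained entries whose name and shape match the new model, so that class-dependent layers are re-initialized. The sketch uses plain dictionaries; `select_transferable` and the layer names are hypothetical.

```python
def select_transferable(pretrained, target_shapes):
    """Keep pretrained entries whose name and shape both match the new model.

    pretrained:    dict mapping name -> (shape tuple, weights)
    target_shapes: dict mapping name -> shape tuple expected by the new model
    """
    kept, skipped = {}, []
    for name, (shape, weights) in pretrained.items():
        if target_shapes.get(name) == shape:
            kept[name] = weights
        else:
            # Missing in the new model or shaped differently (e.g. class count changed).
            skipped.append(name)
    return kept, skipped


# Hypothetical layer names; "w0"/"w1" stand in for the actual weight arrays.
pretrained = {
    "base.conv1.weight": ((32, 3, 3, 3), "w0"),
    "head.conf.weight": ((126, 512, 3, 3), "w1"),
}
target_shapes = {
    "base.conv1.weight": (32, 3, 3, 3),
    "head.conf.weight": (18, 512, 3, 3),
}
kept, skipped = select_transferable(pretrained, target_shapes)
print("kept:", list(kept), "skipped:", skipped)
```

In PyTorch the same filtering is typically applied to a `state_dict()`, followed by `load_state_dict(..., strict=False)`.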
Once dataset preparation and weight conversion are complete, train the model as before.
e.g. python train.py --cfg=./experiments/cfgs/rfb_lite_mobilenetv2_train_voc.yml
tensorboardX records the necessary information for tensorboard. To view it, type tensorboard --logdir LOG_DIR in the terminal.
e.g.
tensorboard --logdir ./experiments/models/ssd_mobilenet_v2_voc.yml
Once tensorboard has launched successfully, you should see the logged visualizations.
If you are unable to launch tensorboard, running export LC_ALL=C in the terminal may help.
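The launch command and the locale workaround can also be scripted. The following is a small sketch, not part of the repository; `tensorboard_command` and `launch_tensorboard` are hypothetical helper names. It builds the same `tensorboard --logdir` invocation and optionally sets `LC_ALL=C` in the child environment.

```python
import os
import subprocess


def tensorboard_command(log_dir, force_c_locale=False):
    """Build the argument list and environment for launching tensorboard."""
    env = dict(os.environ)
    if force_c_locale:
        # Same effect as running `export LC_ALL=C` in the shell beforehand.
        env["LC_ALL"] = "C"
    return ["tensorboard", "--logdir", str(log_dir)], env


def launch_tensorboard(log_dir, force_c_locale=False):
    """Start tensorboard; blocks until the server is interrupted."""
    cmd, env = tensorboard_command(log_dir, force_c_locale)
    return subprocess.run(cmd, env=env, check=True)


cmd, env = tensorboard_command("./experiments/models/ssd_mobilenet_v2_voc.yml",
                               force_c_locale=True)
print(" ".join(cmd))
```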
- visualize the network graph (terminal) - tensorboard has bugs.
- visualize the loss during training and the mean AP during evaluation (terminal & tensorboard)
- visualize anchor boxes for each feature extractor (tensorboard)
- visualize the PR curve for different classes and the anchor matching strategy for different properties during evaluation (tensorboard) (*guess the dataset in the figure, coco or voc?)
- visualize feature maps and gradients (not satisfying yet: they do not give much information. Any suggestions?)
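To make the anchor box item above more concrete, here is a generic sketch of how SSD-style default boxes are laid out on one square feature map. It is not the repository's prior-box code: `default_boxes`, the scale value and the aspect ratios are assumptions, and the extra intermediate-scale box used by the original SSD is omitted.

```python
import math


def default_boxes(feature_size, scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Generate default boxes (cx, cy, w, h), normalized to the image size."""
    boxes = []
    for row in range(feature_size):
        for col in range(feature_size):
            # Center of the cell, expressed as a fraction of the image size.
            cx = (col + 0.5) / feature_size
            cy = (row + 0.5) / feature_size
            for ratio in aspect_ratios:
                # Area stays scale**2 while width / height equals the ratio.
                boxes.append((cx, cy, scale * math.sqrt(ratio), scale / math.sqrt(ratio)))
    return boxes


boxes = default_boxes(feature_size=3, scale=0.2)
print(len(boxes))  # 3 * 3 cells * 3 aspect ratios = 27
```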
- prepare update to pytorch 0.4.0
- add DSSDs: DSSD FPN TDM
- test multi-resolution training
- add rotation to preprocessing
- test focal loss
- add resnet, xception, inception
- figure out the problem with graph visualization
- speed up the preprocessing part (any suggestions?)
- speed up the postprocessing part (any suggestions?) huge bug!!!
- add network visualization based on pytorch-cnn-visualizations
- object detection