ssds.pytorch

Repository for Single Shot MultiBox Detector and its variants, implemented with PyTorch and Python 3.

Currently, it contains these features:

Multiple SSD Variants: ssd, rfb, fssd, ssd-lite, rfb-lite, fssd-lite
Multiple Base Networks: VGG, Mobilenet V1/V2
Free Image Size
Visualization with tensorboard-pytorch: training loss, eval loss/mAP, example anchor boxes.

This repo depends on the work of ssd.pytorch, faster-rcnn.pytorch, RFBNet, Detectron and Tensorflow Object Detection API. Thanks for their work.

Installation
Usage
Performance and Model Zoo
Visualization
Future Work
Reference

Installation

install pytorch
install the requirements with pip install -r ./requirements.txt

Usage

To train, test, or demo a specific model, run the relevant script in the repository folder with the model's configuration file, for example:

python train.py --cfg=./experiments/cfgs/rfb_lite_mobilenetv2_train_voc.yml

Adjust the configuration file according to the notes in config_parse.py.

Performance

VOC2007	YOLO_v2	YOLO_v3	SSD	RFB	FSSD
Darknet53		79.3%	77.3%	79.5%	81.0%
Darknet19	78.4%		76.1%	78.4%	81.0%
Resnet50			79.7%	81.2%
VGG16			76.0%	80.5%	77.8%
MobilenetV1	74.7%	78.2%	72.7%	73.7%	78.4%
MobilenetV2	72.0%	75.8%	73.2%	73.4%	76.7%

COCO2017	YOLO_v2	YOLO_v3	SSD	RFB	FSSD
Darknet53		27.3%	21.1%	22.8%	25.8%
Darknet19	21.6%		20.6%	22.4%	26.4%
Resnet50			25.1%	26.5%	27.2%
VGG16			25.4%	25.5%	27.2%
MobilenetV1	21.5%	25.7%	18.8%	19.1%	24.2%
MobilenetV2	20.4%	24.0%	18.5%	18.5%	22.2%

Net InferTime* (fp32)	YOLO_v2	YOLO_v3	SSD	RFB	FSSD
Darknet53		5.11ms	4.32ms	6.63ms	4.41ms
Darknet19	1.64ms		2.21ms	4.57ms	2.29ms
Resnet50			3.60ms	6.04ms	3.85ms
VGG16			1.75ms	4.20ms	1.98ms
MobilenetV1	2.02ms	3.31ms	2.80ms	3.84ms	2.62ms
MobilenetV2	3.35ms	4.69ms	4.05ms	5.26ms	4.00ms

(* Only the network inference time is measured, without pre-processing and post-processing. The speed of VGG is surprisingly impressive. This may be because MobilenetV1 and MobilenetV2 use the -lite structure, which uses separable convolutions in the base and extra layers.)
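As a rough illustration of timing only the network call, the sketch below measures an arbitrary callable after a few untimed warm-up runs. It is not the script used to produce the table above: `time_forward`, `warmup`, `runs`, and `sync` are hypothetical names, and `dummy_forward` merely stands in for a model. With a PyTorch model on a GPU, one would typically pass `torch.cuda.synchronize` as `sync`, because CUDA kernels run asynchronously.

```python
import time


def time_forward(forward, inputs, warmup=3, runs=10, sync=None):
    """Return the mean wall-clock time in milliseconds of forward(inputs).

    Pre-processing and post-processing stay outside the timed region.
    `sync` is an optional callable (e.g. torch.cuda.synchronize) that is
    run before each clock read so pending GPU work is included.
    """
    def _sync():
        if sync is not None:
            sync()

    # Warm-up runs are not timed; they absorb one-off setup costs.
    for _ in range(warmup):
        forward(inputs)

    _sync()
    start = time.perf_counter()
    for _ in range(runs):
        forward(inputs)
    _sync()
    elapsed = time.perf_counter() - start
    return elapsed / runs * 1000.0


if __name__ == "__main__":
    # Stand-in for a network: any callable taking one argument works.
    def dummy_forward(x):
        return [v * 2 for v in x]

    ms = time_forward(dummy_forward, list(range(1000)))
    print(f"mean forward time: {ms:.3f} ms")
```

Averaging over several runs after a warm-up reduces the influence of first-call overhead on the reported number.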

Transfer learning / training on your own data

Transfer learning takes two steps before training can start: set up the dataset path in the configuration file, and convert the weights.

Datasets

First, set cfg.DATASET.DATASET to homemade, and set the DATACLASSES for this dataset in the configuration file. See homemade_train.yml for an example.

Second, place the dataset in the right location. The annotation xml files (following the VOC standard) must be in ./data/HOMEMADE/Annotations. The images must be in ./data/HOMEMADE/JPEGImages. HOMEMADE may be changed; it follows the name set in DATASET_DIR.
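Before generating file lists, it can help to sanity-check the layout described above. The following is a minimal sketch, not part of this repository: `read_voc_annotation` and `find_missing_images` are hypothetical helpers, and the sketch assumes each image is a .jpg file sharing its base name with the annotation xml.

```python
import os
import xml.etree.ElementTree as ET


def read_voc_annotation(xml_path):
    """Parse one VOC-style xml file into (name, xmin, ymin, xmax, ymax) tuples."""
    root = ET.parse(xml_path).getroot()
    boxes = []
    for obj in root.findall("object"):
        name = obj.findtext("name")
        bnd = obj.find("bndbox")
        coords = [int(float(bnd.findtext(tag)))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((name, *coords))
    return boxes


def find_missing_images(dataset_dir):
    """Return annotation ids that have no matching .jpg in JPEGImages.

    Assumption: images are named <id>.jpg, where <id>.xml is the annotation.
    """
    ann_dir = os.path.join(dataset_dir, "Annotations")
    img_dir = os.path.join(dataset_dir, "JPEGImages")
    missing = []
    for fname in sorted(os.listdir(ann_dir)):
        if not fname.endswith(".xml"):
            continue
        image_id = os.path.splitext(fname)[0]
        if not os.path.isfile(os.path.join(img_dir, image_id + ".jpg")):
            missing.append(image_id)
    return missing
```

For example, `find_missing_images("./data/HOMEMADE")` would list the annotations without an image, and `read_voc_annotation` can be used to confirm that the class names in the xml files match the DATACLASSES set in the configuration.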

Third, execute python3 make_file_lists.py --cfg=./experiments/cfgs/homemade_train.yml. homemade_train.yml can be replaced by your own configuration file.

Once the dataset is prepared, convert the weights.

Weight converting

Use weight_convert.py to convert the weights used to initialize the new model. Run python3 weight_converter.py --cfg=./homemade_train.yml; homemade_train.yml may be replaced by your own configuration yml.
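The converter's internals are not described here. A common approach to this kind of conversion, and only an assumption about what the script does, is to keep the pretrained parameters whose name and shape match the new model and drop the rest (typically the class-dependent prediction layers). The sketch below illustrates that idea on plain mappings; `select_transferable_weights`, `FakeTensor`, and the parameter names are hypothetical.

```python
from collections import namedtuple

# Tiny stand-in for a tensor so the sketch runs without PyTorch.
FakeTensor = namedtuple("FakeTensor", ["shape"])


def select_transferable_weights(pretrained, target):
    """Keep pretrained entries whose name and shape both match the target.

    Both arguments map parameter names to objects with a `.shape`
    attribute (for PyTorch, the result of `state_dict()`).
    Returns (kept, skipped): a dict of reusable entries and a list of
    the names that were dropped.
    """
    kept, skipped = {}, []
    for name, value in pretrained.items():
        if name in target and tuple(value.shape) == tuple(target[name].shape):
            kept[name] = value
        else:
            skipped.append(name)
    return kept, skipped


if __name__ == "__main__":
    pretrained = {
        "base.0.weight": FakeTensor((64, 3, 3, 3)),
        "conf.0.weight": FakeTensor((84, 512, 3, 3)),  # e.g. 21 classes * 4 anchors
    }
    target = {
        "base.0.weight": FakeTensor((64, 3, 3, 3)),
        "conf.0.weight": FakeTensor((12, 512, 3, 3)),  # e.g. 3 classes * 4 anchors
    }
    kept, skipped = select_transferable_weights(pretrained, target)
    print("kept:", sorted(kept))
    print("skipped:", skipped)
```

With PyTorch, the kept entries could then be loaded with `model.load_state_dict(kept, strict=False)`, leaving the skipped layers at their fresh initialization.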

Training

Once dataset preparation and weight conversion are complete, train as before, e.g. python train.py --cfg=./experiments/cfgs/rfb_lite_mobilenetv2_train_voc.yml

Tensorboard Monitoring

tensorboardX records the necessary information for tensorboard. To view it, type tensorboard --logdir LOG_DIR in the terminal.

e.g.

tensorboard --logdir ./experiments/models/ssd_mobilenet_v2_voc.yml

Once tensorboard has launched successfully, you should see the recorded training information.

Troubleshooting

If you are unable to launch tensorboard, running export LC_ALL=C in the terminal may help.

Visualization

visualize the network graph (terminal only; the tensorboard version has bugs)
visualize the loss during training and the mean AP during evaluation (terminal & tensorboard)
visualize the anchor boxes for each feature extractor (tensorboard)
visualize the preprocessing steps for training (tensorboard)
visualize the PR curve for different classes and the anchor matching strategy for different properties during evaluation (tensorboard) (*can you guess the dataset in the figure, coco or voc?)
visualize feature maps and gradients (the current result is not satisfying and gives little information; suggestions are welcome)
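To make the anchor box visualization more concrete, the sketch below generates SSD-style anchors for a single square feature map. It is a generic illustration, not this repository's code: `make_anchors`, `to_corners`, and the `scale` and `aspect_ratios` values are assumptions chosen for the example, whereas the real settings come from the model configuration.

```python
from itertools import product
from math import sqrt


def make_anchors(feature_size, scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Generate (cx, cy, w, h) anchors, normalized to [0, 1], for one
    square feature map with feature_size x feature_size cells."""
    anchors = []
    for row, col in product(range(feature_size), repeat=2):
        # Each anchor is centered on its feature-map cell.
        cx = (col + 0.5) / feature_size
        cy = (row + 0.5) / feature_size
        for ratio in aspect_ratios:
            # Width/height follow the aspect ratio; area stays scale**2.
            w = scale * sqrt(ratio)
            h = scale / sqrt(ratio)
            anchors.append((cx, cy, w, h))
    return anchors


def to_corners(anchor):
    """Convert (cx, cy, w, h) to (xmin, ymin, xmax, ymax) for drawing."""
    cx, cy, w, h = anchor
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


if __name__ == "__main__":
    anchors = make_anchors(feature_size=3, scale=0.2)
    print(len(anchors), "anchors; first box corners:", to_corners(anchors[0]))
```

Multiplying the corner coordinates by the image width and height gives pixel boxes that can be drawn on an example image, one set per feature map.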