
train_ssd_mobilenet's People

Contributors

hongrui16, naisy

train_ssd_mobilenet's Issues

mAP of mobilenet_ssd

Hello, I would like to know what precision this model can achieve. I am training it on 2,500 photos for 300k iterations. When I evaluate the model, the mAP is 36% to 38%, and it does not seem to improve any further. Is there any way to improve it, or is this the best precision the model can reach? Thanks for answering my questions.

Training with background images

Hi @naisy ,

I have trained a mobilenet-ssd-v2 model in TensorFlow (a version earlier than 2.0) to detect a "mobile phone in hand". I trained it with images of mobile phones held in a hand. The threshold is set to 7. If I hold a mobile phone in my hand, it is detected. This scenario works, but when I hold other objects (such as a cup or a pen) in my hand, they are detected too. In other words, I am getting false detections. How can I reduce them? Would training with other images, for example images of a hand holding a cup, a book, and so on, without specifying a label in the XML, help to avoid these false detections?

I have gone through some GitHub issues about this, but I am still not clear on it, which is why I am asking you. I hope you can help me understand this.
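One way to express "an image without a label" in Pascal VOC style is an annotation file that has the image size but no `<object>` entries. The following is a minimal sketch of that idea using only the Python standard library. The file name is hypothetical, and whether the TFRecord conversion script used with this repository accepts annotations with no objects is an assumption that should be checked before relying on it.

```python
import xml.etree.ElementTree as ET


def make_background_annotation(filename, width, height, depth=3):
    """Build a Pascal VOC style annotation tree that contains no <object> entries."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = str(depth)
    return root


# Hypothetical background image: a hand holding a cup, with nothing labeled.
annotation = make_background_annotation("hand_with_cup_0001.jpg", 640, 480)
xml_text = ET.tostring(annotation, encoding="unicode")
print(xml_text)
```

An image annotated this way contributes only negative (background) anchors during training, which is the usual intent behind adding "background" images.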

nms version for roadsign example

Hi, I'm experimenting with your repository and started by following the tutorial.
I've successfully trained an SSD MobileNet model with the steps in readme.md and exported the frozen graph, and now I want to test it on a video. To test the model I'm using your linked repository for realtime object detection. I'm confused about the nms_version. Which version should I use? I'm copying the .yml file for clarity:

---
image_input: 'images'       # input image dir
movie_input: '/home/DATI/insulators/SSD_mobilenet/test_videos/route226_argentina.mp4'    # mp4 or avi. Movie file.
#camera_input: 0            # USB Webcam on PC
camera_input: 1             # USB Webcam on TX2
## Input Must be OpenCV readable
## Onboard camera on Xavier (with TX2 onboard camera)
#camera_input: "nvarguscamerasrc ! video/x-raw(memory:NVMM), width=1280, height=720,format=NV12, framerate=120/1 ! nvvidconv ! video/x-raw,format=I420 ! videoflip method=rotate-180 ! appsink"
## Onboard camera on TX2 ### (need: apt-get install libxine2)
#camera_input: "nvcamerasrc ! video/x-raw(memory:NVMM), width=(int)1280, height=(int)720, format=(string)I420, framerate=(fraction)30/1 ! nvvidconv flip-method=0 ! video/x-raw, format=(string)BGRx ! videoconvert ! video/x-raw, format=(string)BGR ! appsink"
## Onboard or external RTSP feed
#camera_input: "rtspsrc location=rtsp://127.0.0.1:8554/test latency=500 ! rtph264depay ! h264parse ! omxh264dec ! nvvidconv ! video/x-raw, width=(int)1280, height=(int)720, format=(string)BGRx ! videoconvert ! appsink"

force_gpu_compatible: False # If True with visualize False, speed up. Forces all CPU tensors to be allocated with Cuda pinned memory.
save_to_file: True         # movie or camera: ./output_movie/output_unixtime.avi. Save it in avi format to prevent compression degradation. Requires a lot of disk space.
                            # image: ./output_image/PATH_TO_FILE. Save it in image file.
visualize: True             # True: Show result image. False: Without image show.
vis_worker: False           # True: Visualization run on process. (With visuzalize:True)
max_vis_fps: 0              # >=1: Limit of show fps. 0: No limit - means try to spend full machine power for visualization. (With visualize:True.)
vis_text: True              # Display fps on result image. (With visualize:True.)
max_frames: 5000            # >=1: Quit when frames done. 0: no exit. (With visualize:False)
width: 600                  # Camera width.
height: 600                 # Camera height.
fps_interval: 5             # FPS console out interval and FPS stream length.
det_interval: 100           # interval [frames] to print detections to console
det_th: 0.5                 # detection threshold for det_intervall
worker_threads: 4           # parallel detection for Mask R-CNN.
split_model: True           # Splits Model into a GPU and CPU session for SSD/Faster R-CNN.
log_device: False           # Logs GPU / CPU device placement
allow_memory_growth: True   # limits memory allocation to the actual needs
debug_mode: False           # Show FPS spike value
split_shape: 1917           # 1917, 3000, 3309, 5118, 7326, 51150. ExpandDims_1's shape.

model_type: 'nms_v2'
model_path: '/home/DATI/insulators/SSD_mobilenet/output_models/frozen_inference_graph.pb'
label_path: '/home/davidecremona/PycharmProjects/train_ssd_mobilenet/roadsign_data/roadsign_label_map.pbtxt'
num_classes: 4

Thank you in advance, Davide.

Problems training a custom dataset

Hello,

Thanks for this detailed set of instructions! I tried to follow them for training a custom dataset of mine, but I was running into some problems, so I was hoping you might have suggestions.

My dataset has 12 classes and was originally used with YOLO, so I used this converter to create the train and validation TFRecord files. (Please note that the bounding boxes were provided with the dataset, so I did not label them by hand.) Once that was done, I followed the instructions in this repo to start the training. I ensured that my label text was correct, starting from class ID 1, etc. As in this repo, I am using ssd-mobilenet-v1, and I changed the config file as needed. I was working with TensorFlow 1.4.1 and v1.5 of the Object Detection API.

Nothing seems obviously wrong in the training window, but the network doesn't really learn the objects correctly at all. It is evident from these pictures from Tensorboard.

screenshot from 2018-04-28 08-57-20

screenshot from 2018-04-28 08-58-05

It only seems to be detecting 2/12 classes somewhat decently, and even then gets confused on the classification a lot even after ~50k steps. The loss also seems to be fluctuating and not really decreasing. Would you happen to have any suggestions for fixes or improvement? Thank you!
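Since the data was converted from YOLO annotations, one thing that can be checked independently of the training is the box and class conversion itself. The sketch below assumes the usual YOLO text format (0-based class index, then normalized center x, center y, width, height) and converts it to normalized corner coordinates and 1-based label IDs, which is what the post describes for the label map. The function names are hypothetical and this is not the converter mentioned above.

```python
def yolo_to_corners(cx, cy, w, h):
    """Convert a normalized YOLO box (center, size) to normalized corners."""
    xmin = cx - w / 2.0
    ymin = cy - h / 2.0
    xmax = cx + w / 2.0
    ymax = cy + h / 2.0

    # Clamp to the image so rounding in the source labels cannot leave [0, 1].
    def clamp(v):
        return min(max(v, 0.0), 1.0)

    return clamp(xmin), clamp(ymin), clamp(xmax), clamp(ymax)


def yolo_class_to_label_id(class_index):
    """YOLO class indices start at 0; the label map here starts at 1."""
    return class_index + 1


box = yolo_to_corners(0.5, 0.5, 0.2, 0.4)
print(box, yolo_class_to_label_id(0))
```

Spot-checking a few converted records against the original images is a cheap way to rule out shifted boxes or off-by-one class IDs before looking at the training settings.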

faster_rcnn frozen graph issue

I have a problem loading the frozen graph of a faster_rcnn v2 model. I trained it myself with just one class. The training process appeared to be successful, and I was even able to export the frozen graph.

  1. I changed the config file to point to my graph:

model_type: 'faster_v2'
model_path: '/home/nvidia/Desktop/Real_Time/realtime_object_detection/models/faster_rcnn_v1/frozen_inference_graph.pb'

but when I run run_video.py, I get an error in load_graph_faster_v2.py. The assertion that fails is:

assert d in name_to_node_map, "%s is not in graph" % d
AssertionError: Squeeze_2 is not in graph

where the content of dest_nodes looks like:

['SecondStagePostprocessor/stack_1', 'SecondStagePostprocessor/BatchMultiClassNonMaxSuppression/map/strided_slice', 'BatchMultiClassNonMaxSuppression/map/TensorArrayStack_4/TensorArrayGatherV3', 'Squeeze_2', 'Squeeze_3', 'SecondStagePostprocessor/Reshape_4']

InvalidArgumentError (see above for traceback): assertion failed: [Incorrect scores field length: ...

I am getting the following error. I am pretty sure it has to do with a mismatch between my dataset and the config settings, but I can't figure out where. I say this because I am able to train and freeze a graph that works correctly with the road sign data, but not with my own data.

I found these lines in utils/np_box_mask_list_ops.py in the object detection folder:

if num_boxes != num_scores:
    raise ValueError('Incorrect scores field length: actual vs expected.')

I just don't know what num_boxes and num_scores are or what that might mean for me, and a quick Google search didn't help.

The following is the error.

InvalidArgumentError (see above for traceback): assertion failed: [Incorrect scores field length: actual vs expected.] [7668] [1917]
    [[node Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/Assert/Assert (defined at /home/nvidia/Downloads/realtime_object_detection-master/lib/load_graph_nms_v1.py:197)  = Assert[T=[DT_STRING, DT_INT32, DT_INT32], summarize=3, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/Equal, Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/Assert/Assert/data_0, Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/strided_slice_1, Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/strided_slice)]]

And it keeps going, but I am typing it out by hand since I am away from my Xavier.
Any ideas on what to look at or change?
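The two numbers in the assertion, 7668 and 1917, can be related with a little arithmetic. The sketch below assumes the stock 300x300 ssd_mobilenet_v1 layout: six square feature maps of side 19, 10, 5, 3, 2 and 1, with 3 anchors per cell on the first map and 6 on the others. Those layout values are assumptions about the default configuration, not something stated in this post.

```python
# Assumed default SSD MobileNet v1 layout for a 300x300 input.
feature_map_sizes = [19, 10, 5, 3, 2, 1]
anchors_per_cell = [3, 6, 6, 6, 6, 6]

# Total number of anchor boxes the detector predicts per image.
total_anchors = sum(
    size * size * anchors
    for size, anchors in zip(feature_map_sizes, anchors_per_cell)
)
print(total_anchors)

# Ratio between the two values reported in the failed assertion.
reported_scores, reported_boxes = 7668, 1917
print(reported_scores / reported_boxes)
```

Under these assumptions the anchor count comes out to 1917, which matches both the second number in the assertion and the first `split_shape` value listed in the config quoted in an earlier issue. The first number is exactly four times that. This only shows that the scores tensor is four times longer than the box count at that point in the graph. It does not by itself say which setting causes the mismatch.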

Hi naisy, I am here to thank you for what you have done for me!

Hi, this is your first follower. I am particularly grateful to you for your code. I am not familiar with tensorflow/model, and I was confused about how to train and test with SSD-mobilenet_v2 (I am too lazy, I think). I have to say that your code saved me, and I have learned many things from your help. I hope you know that what you have written really helps other people.

Your address required!

I really want to send you a unique high-% gift straight from Poland for this code, so I demand your shipping address!

ImportError: No module named object_detection

I seem to have a problem at the very end. I assume I did something wrong, but this is what happens when I run the final command to create the frozen graph:

Traceback (most recent call last):
  File "/home/r/github/models/research/object_detection/export_inference_graph.py", line 71, in <module>
    from object_detection import exporter
ImportError: No module named object_detection

I then went through and tried to change the various imports, and it worked up to this point:

Traceback (most recent call last):
  File "/home/r/github/models/research/object_detection/export_inference_graph.py", line 71, in <module>
    import exporter
  File "/home/r/github/models/research/object_detection/exporter.py", line 28, in <module>
    from builders import model_builder
  File "/home/r/github/models/research/object_detection/builders/model_builder.py", line 25
    builders import region_similarity_calculator_builder as sim_calc
                  ^
SyntaxError: invalid syntax

I saw this suggested somewhere:
protoc object_detection/protos/*.proto --python_out=.

But that didn't seem to do anything...

It's possible I installed something like TensorFlow incorrectly, so I was thinking I should go back and reinstall everything, although everything seemed to train just fine. Any ideas?
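An `ImportError` for `object_detection` commonly means that the `models/research` directory (and its `slim` subdirectory) is not on the Python module search path, rather than that the imports inside the API need editing. The sketch below is a small Python 3 check under that assumption. The helper name `missing_paths` is hypothetical, and the directory is taken from the traceback above.

```python
import importlib.util
import os
import sys


def missing_paths(research_dir, search_path):
    """Return the directories the Object Detection API expects on the path
    (research_dir and research_dir/slim) that are not in search_path."""
    required = [research_dir, os.path.join(research_dir, "slim")]
    present = {os.path.normpath(p) for p in search_path if p}
    return [p for p in required if os.path.normpath(p) not in present]


# Directory taken from the traceback in this post.
research_dir = "/home/r/github/models/research"

found = importlib.util.find_spec("object_detection") is not None
print("object_detection importable:", found)
for path in missing_paths(research_dir, sys.path):
    print("not on sys.path:", path)
```

If directories are reported as missing, adding them to the `PYTHONPATH` environment variable before running `export_inference_graph.py` is the usual remedy, and the original import lines can then stay unmodified.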

overlapping train.record and val.record

Hi, I'm experimenting with my own dataset and your code. I have 800 images with labels in PascalVOC format.

The training process finishes without problems, and so does the evaluation step. But when I explored the data (val.record and train.record), I found that they can contain the same images (the train and val sets intersect).

Is it because val.record is not the test set and should not be used for evaluating the produced model?

Here is my pipeline config file:

# SSD with Mobilenet v1 configuration for MSCOCO Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model {
  ssd {
    num_classes: 1
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 1
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
          batch_norm {
            train: true,
            scale: true,
            center: true,
            decay: 0.9997,
            epsilon: 0.001,
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_mobilenet_v1'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid {
          anchorwise_output: true
        }
      }
      localization_loss {
        weighted_smooth_l1 {
          anchorwise_output: true
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
  batch_size: 24
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  fine_tune_checkpoint: "/home/davidecremona/PycharmProjects/train_ssd_mobilenet/ssd_mobilenet_v1_coco_2017_11_17/model.ckpt"
  from_detection_checkpoint: true
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 200000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "/home/DATI/insulators/datasets/object-detection_0/tfrecords/train.record"
  }
  label_map_path: "/home/DATI/insulators/datasets/object-detection_0/label_map.pbtxt"
}

eval_config: {
  num_examples: 8000
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "/home/DATI/insulators/datasets/object-detection_0/tfrecords/val.record"
  }
  label_map_path: "/home/DATI/insulators/datasets/object-detection_0/label_map.pbtxt"
  shuffle: false
  num_readers: 1
  num_epochs: 1
}
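On the overlap question above: if the validation set is meant to measure how well the model generalizes, the two record files are normally built from disjoint image lists. The following is a generic sketch of such a split, not the logic of this repository's scripts, which is not shown in this post. The image IDs, the 20% validation fraction and the seed are all hypothetical.

```python
import random


def split_train_val(image_ids, val_fraction=0.2, seed=0):
    """Shuffle the unique image IDs once and cut them into two disjoint lists."""
    ids = sorted(set(image_ids))  # deduplicate and fix the starting order
    rng = random.Random(seed)     # local generator so the split is repeatable
    rng.shuffle(ids)
    n_val = int(len(ids) * val_fraction)
    return ids[n_val:], ids[:n_val]


# Hypothetical IDs standing in for the 800 labeled images.
image_ids = ["img_%04d" % i for i in range(800)]
train_ids, val_ids = split_train_val(image_ids)
print(len(train_ids), len(val_ids))
print(set(train_ids) & set(val_ids))
```

Writing train.record only from `train_ids` and val.record only from `val_ids` keeps the evaluation images unseen during training.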
