CNTK-FastRCNNDetector

A Python implementation of a CNTK Fast-RCNN evaluation client.

Call a Fast-RCNN model from your Python code, or run the detector as a script directly from the command line.

For more information regarding the CNTK Fast-RCNN implementation, please check out this tutorial.

A detailed notebook containing a walkthrough for evaluating a single image using a Fast-RCNN model is available here.

There is also a node.js wrapper for this code that lets you call it from node.js or Electron: https://github.com/nadavbar/node-cntk-fastrcnn.

Preliminaries

Since the FRCNN detector uses bits of the CNTK Fast-RCNN implementation, it has the same requirements as the CNTK Fast-RCNN training pipeline.

Before running the code in this repository, please make sure to install the required python packages as described in the Fast-RCNN CNTK tutorial.

Using directly from your Python code

To use the detector directly from your Python code, import frcnn_detector.py and initialize a new FRCNNDetector object with the path to your model file and the path to your CNTK installation. Then use the detect method to run the model on a given image.

For example, the following code snippet runs detection on a single image and prints the resulting bounding boxes and the corresponding labels:

import cv2
from os import path
from frcnn_detector import FRCNNDetector

cntk_scripts_path = r'C:/local/cntk/Examples/Image/Detection/FastRCNN'
model_file_path = path.join(cntk_scripts_path, r'proc/grocery_2000/cntkFiles/Output/Fast-RCNN.model')

# initialize the detector and load the model
detector = FRCNNDetector(model_file_path, cntk_scripts_path=cntk_scripts_path)

img = cv2.imread(path.join(cntk_scripts_path, r'../../DataSets/Grocery/testImages/WIN_20160803_11_28_42_Pro.jpg'))
rects, labels = detector.detect(img)

# print detections
for rect, label in zip(rects, labels):
    print("Bounding box: %s, label %s"%(rect, label))

API Documentation:

The FRCNNDetector constructor accepts the following input parameters (a full constructor call is sketched after the list):

model_path (string) - Path to the Fast-RCNN model file.
pad_value (integer) - The value used to pad the resized image. Default value: 114.
cntk_scripts_path (string) - Path to the CNTK Fast-RCNN scripts folder. Default value: r"c:\local\cntk\Examples\Image\Detection\FastRCNN".
use_selective_search_rois (boolean) - Indicates whether the selective search method should be used when preparing the input ROIs. Default value: True.
use_grid_rois (boolean) - Indicates whether the grid method should be used when preparing the input ROIs. Default value: True.
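
As a sketch, a constructor call that sets every documented parameter explicitly (all paths below are placeholders) might look like:

from frcnn_detector import FRCNNDetector

detector = FRCNNDetector(
    model_path=r'C:/models/Fast-RCNN.model',    # required: path to the trained model file
    pad_value=114,                              # padding value used when resizing images
    cntk_scripts_path=r'c:\local\cntk\Examples\Image\Detection\FastRCNN',
    use_selective_search_rois=True,             # generate ROIs with selective search
    use_grid_rois=True)                         # also add a regular grid of ROIs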

The FRCNN detector exposes the following method for object detection:

detect(img) - Accepts an image in OpenCV format and returns a tuple of bounding boxes and corresponding labels, as detected by the Fast-RCNN model.

Note that calling the detect method is all you need in order to run detection using the model.

The following helper methods are also available in case you need to do anything extra (a short usage sketch follows the list):

  • load_model() - Loads the model. Note that the detect() method will make sure that the model is loaded in case the load_model method wasn't called yet.
  • warm_up() - Runs a "dummy" detection through the network. Can be used to make sure that all of the CNTK libraries are loaded before the actual detection is called.
  • resize_and_pad(img) - Accepts an image in an OpenCV format and resizes (and pads) the image according to the input format that the network accepts. Returns a tuple of the resized image in an OpenCV readable format, and in the format expected by the network (BGR).
  • get_rois_for_image(img) - Accepts an image in an OpenCV format and calculates a list of ROIs according to the input format that the network accepts. As an optimization, the grid ROIs are calculated only once and then cached and reused. The method returns a tuple, where the first item is a list of ROIs that correspond to the internal network format (in relative image coordinates), and the second item is a list of corresponding ROIs in the format of the original image.
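
For instance, a warm-up pass before the first real detection could look like the following sketch (the model and image paths are placeholders):

import cv2
from frcnn_detector import FRCNNDetector

detector = FRCNNDetector(r'C:/models/Fast-RCNN.model')

# load the model and push a dummy detection through the network so that
# the first real detect() call does not pay the one-time loading cost
detector.load_model()
detector.warm_up()

img = cv2.imread(r'C:/images/test.jpg')
rects, labels = detector.detect(img)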

Run as a script

The script accepts either a single image or a directory of images, and outputs either corresponding images with highlighted bounding boxes or a JSON file with a textual description of the detection results. (The JSON format is described below.)

In script mode, the script supports the following command line options:

    usage: frcnn_detector.py [-h] --input <path> [--output <directory path>]
                             --model <file path> [--cntk-path <dir path>]
                             [--json-output <file path>]
    
    FRCNN Detector
    
    optional arguments:
      -h, --help            show this help message and exit
      --input <path>        Path to an image file or to a directory containing
                            images in jpg format
      --output <directory path>
                            Path to output directory
      --model <file path>   Path to model file
      --cntk-path <dir path>
                            Path to the directory in which CNTK is installed, e.g.
                            c:\local\cntk
      --json-output <file path>
                            Path to output JSON file
    
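
For example, a typical invocation that processes a directory of images and writes both annotated images and a JSON summary might look like this (all paths below are placeholders):

    python frcnn_detector.py --input c:\images --model c:\models\Fast-RCNN.model --output c:\annotated --json-output c:\annotated\detections.json --cntk-path c:\local\cntk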

Here is an example of the result object for a directory that contains 2 images (named '1.jpg' and '2.jpg'):

    {
        "frames": {
            "1.jpg": {
                "regions": [
                    {
                        "class": 1,
                        "x1": 418,
                        "y1": 59,
                        "x2": 538,
                        "y2": 179
                    }
                ]
            },
            "2.jpg": {
                "regions": [
                    {
                        "class": 2,
                        "x1": 478,
                        "y1": 59,
                        "x2": 597,
                        "y2": 298
                    }
                ]
            }
        },
        "classes": {
            "background": 0,
            "human": 1,
            "cat": 2,
            "dog": 3
        }
    }
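
As a sketch of how this output could be consumed, the following snippet resolves the numeric class IDs through the "classes" map (the file name detections.json is just an assumption for the --json-output path):

    import json

    # load the JSON file produced via --json-output (placeholder file name)
    with open('detections.json') as f:
        result = json.load(f)

    # invert the "classes" map so numeric class IDs can be turned into names
    id_to_name = {v: k for k, v in result['classes'].items()}

    for frame_name, frame in result['frames'].items():
        for region in frame['regions']:
            label = id_to_name.get(region['class'], 'unknown')
            print('%s: %s at (%d,%d)-(%d,%d)' % (frame_name, label,
                  region['x1'], region['y1'], region['x2'], region['y2']))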

Adding descriptive class names

Since CNTK does not embed the names of the classes in the model, by default the module returns non-descriptive names for the classes, e.g. "class_1", "class_2".

If you want the module to return more descriptive names, you can place a JSON file named "model.json" in the same directory as the Fast-RCNN model file, and put the descriptions of the classes in that JSON file under the "classes" key.

For example, the following JSON describes the classes for the above example:

    {
        "classes": {
            "background": 0,
            "human": 1,
            "cat": 2,
            "dog": 3
        }
    }
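
If you prefer to generate model.json from code, a minimal sketch (using the class list from the example above; the model directory path is a placeholder) could be:

    import json
    from os import path

    model_dir = r'C:/models'  # placeholder: directory containing Fast-RCNN.model

    # class IDs must match the label indices the model was trained with;
    # index 0 is conventionally the background class
    classes = {"background": 0, "human": 1, "cat": 2, "dog": 3}

    with open(path.join(model_dir, 'model.json'), 'w') as f:
        json.dump({"classes": classes}, f, indent=4)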


Issues

Problem with other model than grocery

Hello,
Thanks for the great code, but I have a problem with it when testing models other than the grocery model. It is the same problem as with the similar code published here: https://github.com/Microsoft/CNTK/blob/master/Examples/Image/Detection/FastRCNN/CNTK_FastRCNN_Eval.ipynb

If I load the Pascal model from the original tutorial (pkranen) and evaluate a Pascal image, or use a self-trained model from the Pascal dataset, your code does not detect anything. In the detector, no rois_labels_predictions are set; I guess that this is the problem. The pretrained model I tested against is:
https://www.cntk.ai/Models/FRCN_Pascal/Fast-RCNN.model
Do you have any suggestions on this?

Kind regards,

Dirk

Error while running - ValueError: axis(=1) out of bounds

I'm having a problem using the script from the command line. Here is the command I ran:

(C:\local\Anaconda3-4.1.1-Windows-x86_64\envs\cntk-py35) C:\local\CNTK-2-0-rc3\cntk\Examples\Image\Detection\FastRCNN>python "C:\Users\Ale\Desktop\CNTK-FastRCNNDetector-master\frcnn_detector.py" --input C:\imagetest\1.jpg --model C:\local\CNTK-2-0-rc3\cntk\Examples\Image\Detection\FastRCNN\proc\Grocery_100\cntkFiles\Output\Fast-RCNN.model --output C:\output

and the resulting output:

Selected CPU as the process wide default device.
Number of images to process: 1
Read file in path: C:\imagetest\1.jpg
C:\local\Anaconda3-4.1.1-Windows-x86_64\envs\cntk-py35\lib\site-packages\cntk\core.py:349: UserWarning: your data is of type "float64", but your input variable (uid "Input224") expects "<class 'numpy.float32'>". Please convert your data beforehand to speed up training.
(sample.dtype, var.uid, str(var.dtype)))
C:\local\Anaconda3-4.1.1-Windows-x86_64\envs\cntk-py35\lib\site-packages\cntk\core.py:349: UserWarning: your data is of type "float64", but your input variable (uid "Input225") expects "<class 'numpy.float32'>". Please convert your data beforehand to speed up training.
(sample.dtype, var.uid, str(var.dtype)))
Traceback (most recent call last):
  File "C:\Users\Ale\Desktop\CNTK-FastRCNNDetector-master\frcnn_detector.py", line 361, in <module>
    rects, labels = detector.detect(img)
  File "C:\Users\Ale\Desktop\CNTK-FastRCNNDetector-master\frcnn_detector.py", line 277, in detect
    rois_labels_predictions = np.argmax(rois_values, axis=1)
  File "C:\local\Anaconda3-4.1.1-Windows-x86_64\envs\cntk-py35\lib\site-packages\numpy\core\fromnumeric.py", line 963, in argmax
    return _wrapfunc(a, 'argmax', axis=axis, out=out)
  File "C:\local\Anaconda3-4.1.1-Windows-x86_64\envs\cntk-py35\lib\site-packages\numpy\core\fromnumeric.py", line 57, in _wrapfunc
    return getattr(obj, method)(*args, **kwds)
ValueError: axis(=1) out of bounds
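
As an aside, the two UserWarning lines in the log come from passing float64 data into the network; converting the image to float32 beforehand should avoid them (note this addresses only the warning, not the ValueError itself):

import numpy as np

# CNTK expects float32 inputs; OpenCV images are uint8 and intermediate
# processing can promote them to float64, which triggers the warning
img = img.astype(np.float32)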

Prepare self trained model for evaluation

Hey everybody,

I'm currently following this tutorial (https://github.com/Microsoft/CNTK/blob/master/Examples/Image/Detection/FastRCNN/BrainScript/CNTK_FastRCNN_Eval.ipynb) to implement a Fast-RCNN evaluator, and for the grocery model everything worked fine. Now I want to run the script with my own model, which I trained on my own data. The problem is that the structure of my model is different from the pretrained grocery model, and as I'm still a beginner with CNTK I don't know how to convert it to the right structure.

The model structures below are simply the output of print(model).

print(grocery-model):

Before preparation:
Composite(features: SequenceOver[][Tensor[3,1000,1000]], rois: SequenceOver[][Tensor[100,4]], roiLabels: SequenceOver[][Tensor[100,17]]) -> Tuple[SequenceOver[][Tensor[100,1]], SequenceOver[][Tensor[1,1]], SequenceOver[][Tensor[100,17]]]

After preparation:
Composite(features: Sequence[Tensor[3,1000,1000]], rois: Sequence[Tensor[100,4]]) -> Sequence[Tensor[100,17]]

print(my-model):

Before preparation:
Composite(data: Tensor[3,850,850], roi_proposals: Tensor[200,4]) -> Tuple[Tensor[200,7], Tensor[200,28]]

Any ideas or suggestions are appreciated.

Update
I think I got part of it; this is my code to convert the input:

    # imports assumed for this snippet (CNTK 2.x)
    from cntk import load_model, placeholder, input_variable, combine, CloneMethod
    from cntk.logging.graph import find_by_name

    # load trained model
    trained_frcnn_model = load_model(modelPath)

    # find the original features and rois input nodes
    features_node = find_by_name(trained_frcnn_model, "data")
    rois_node = find_by_name(trained_frcnn_model, "roi_proposals")  # node name as shown by print(model)

    # find the output "z" node
    z_node = find_by_name(trained_frcnn_model, 'drop7')

    # define new input nodes for the features (image) and rois
    image_input = input_variable(shape=(3, 850, 850), name='features')
    roi_input = input_variable(shape=(200, 4), name='rois')

    # clone the desired layers with fixed weights and placeholders for the new input nodes
    cloned_nodes = combine([z_node.owner]).clone(
        CloneMethod.freeze,
        {features_node: placeholder(name='features'), rois_node: placeholder(name='rois')})

    # apply the cloned nodes to the input nodes
    self.model = cloned_nodes(image_input, roi_input)

    print("Model loaded successfully!")

But I'm still not sure about the z output node: I'm pretty sure drop7 is the wrong one, but find_by_name can't find a node named z.
