CNTK-FastRCNNDetector

A Python implementation of a CNTK Fast-RCNN evaluation client.

Call a Fast-RCNN model from your Python code, or run the detector as a script directly from the command line.

For more information regarding the CNTK Fast-RCNN implementation, please check out this tutorial.

A detailed notebook containing a walkthrough for evaluating a single image using a Fast-RCNN model is available here.

There is also a node.js wrapper for this code that lets you call it from node.js or Electron: https://github.com/nadavbar/node-cntk-fastrcnn.

Preliminaries

Since the FRCNN detector uses bits of the CNTK Fast-RCNN implementation, it has the same requirements as the CNTK Fast-RCNN training pipeline.

Before running the code in this repository, please make sure to install the required python packages as described in the Fast-RCNN CNTK tutorial.

Using directly from your Python code

To use the detector directly from your Python code, import frcnn_detector.py and initialize a new FRCNNDetector object with the path to your model file and the path to your CNTK installation. Then use the detect method to run the model on a given image.

For example, the following code snippet runs detection on a single image and prints the resulting bounding boxes and the corresponding labels:

import cv2
from os import path
from frcnn_detector import FRCNNDetector

cntk_scripts_path = r'C:/local/cntk/Examples/Image/Detection/FastRCNN'
model_file_path = path.join(cntk_scripts_path, r'proc/grocery_2000/cntkFiles/Output/Fast-RCNN.model')

# initialize the detector and load the model
detector = FRCNNDetector(model_file_path, cntk_scripts_path=cntk_scripts_path)

img = cv2.imread(path.join(cntk_scripts_path, r'../../DataSets/Grocery/testImages/WIN_20160803_11_28_42_Pro.jpg'))
rects, labels = detector.detect(img)

# print detections
for rect, label in zip(rects, labels):
    print("Bounding box: %s, label %s"%(rect, label))

API Documentation:

The FRCNNDetector constructor accepts the following input parameters (a full constructor call is sketched after the list):

model_path (string) - Path to the Fast-RCNN model file.
pad_value (integer) - The value used to pad the resized image. Default value: 114.
cntk_scripts_path (string) - Path to the CNTK Fast-RCNN scripts folder. Default value: r"c:\local\cntk\Examples\Image\Detection\FastRCNN".
use_selective_search_rois (boolean) - Indicates whether the selective search method should be used when preparing the input ROIs. Default value: True.
use_grid_rois (boolean) - Indicates whether the grid method should be used when preparing the input ROIs. Default value: True.
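
As a sketch, a constructor call that sets every documented parameter explicitly (all paths below are placeholders) might look like:

from frcnn_detector import FRCNNDetector

detector = FRCNNDetector(
    model_path=r'C:/models/Fast-RCNN.model',    # required: path to the trained model file
    pad_value=114,                              # padding value used when resizing images
    cntk_scripts_path=r'c:\local\cntk\Examples\Image\Detection\FastRCNN',
    use_selective_search_rois=True,             # generate ROIs with selective search
    use_grid_rois=True)                         # also add a regular grid of ROIs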

The FRCNN detector exposes the following method for object detection:

detect(img) - Accepts an image in OpenCV format and returns a tuple of bounding boxes and corresponding labels, as detected by the Fast-RCNN model.

Note that calling the detect method is all you need in order to run detection using the model.

The following helper methods are also available in case you need to do anything extra (a short usage sketch follows the list):

  • load_model() - Loads the model. Note that the detect() method will make sure that the model is loaded in case the load_model method wasn't called yet.
  • warm_up() - Runs a "dummy" detection through the network. Can be used to make sure that all of the CNTK libraries are loaded before the actual detection is called.
  • resize_and_pad(img) - Accepts an image in an OpenCV format and resizes (and pads) the image according to the input format that the network accepts. Returns a tuple of the resized image in an OpenCV readable format, and in the format expected by the network (BGR).
  • get_rois_for_image(img) - Accepts an image in an OpenCV format and calculates a list of ROIs according to the input format that the network accepts. As an optimization, the grid ROIs are calculated only once and then cached and reused. The method returns a tuple, where the first item is a list of ROIs that correspond to the internal network format (in relative image coordinates), and the second item is a list of corresponding ROIs in the format of the original image.
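
For instance, a warm-up pass before the first real detection could look like the following sketch (the model and image paths are placeholders):

import cv2
from frcnn_detector import FRCNNDetector

detector = FRCNNDetector(r'C:/models/Fast-RCNN.model')

# load the model and push a dummy detection through the network so that
# the first real detect() call does not pay the one-time loading cost
detector.load_model()
detector.warm_up()

img = cv2.imread(r'C:/images/test.jpg')
rects, labels = detector.detect(img)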

Run as a script

The script accepts either a single image or a directory of images, and outputs either corresponding images with highlighted bounding boxes or a JSON file with a textual description of the detection results. (The JSON format is described below.)

In script mode, the script supports the following command line options:

    usage: frcnn_detector.py [-h] --input <path> [--output <directory path>]
                             --model <file path> [--cntk-path <dir path>]
                             [--json-output <file path>]
    
    FRCNN Detector
    
    optional arguments:
      -h, --help            show this help message and exit
      --input <path>        Path to an image file or to a directory containing
                            images in jpg format
      --output <directory path>
                            Path to output directory
      --model <file path>   Path to model file
      --cntk-path <dir path>
                            Path to the directory in which CNTK is installed, e.g.
                            c:\local\cntk
      --json-output <file path>
                            Path to output JSON file
    
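
For example, a typical invocation that processes a directory of images and writes both annotated images and a JSON summary might look like this (all paths below are placeholders):

    python frcnn_detector.py --input c:\images --model c:\models\Fast-RCNN.model --output c:\annotated --json-output c:\annotated\detections.json --cntk-path c:\local\cntk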

Here is an example of the result object for a directory that contains 2 images (named '1.jpg' and '2.jpg'):

    {
        "frames": {
            "1.jpg": {
                "regions": [
                    {
                        "class": 1,
                        "x1": 418,
                        "y1": 59,
                        "x2": 538,
                        "y2": 179
                    }
                ]
            },
            "2.jpg": {
                "regions": [
                    {
                        "class": 2,
                        "x1": 478,
                        "y1": 59,
                        "x2": 597,
                        "y2": 298
                    }
                ]
            }
        },
        "classes": {
            "background": 0,
            "human": 1,
            "cat": 2,
            "dog": 3
        }
    }
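
As a sketch of how this output could be consumed, the following snippet resolves the numeric class IDs through the "classes" map (the file name detections.json is just an assumption for the --json-output path):

    import json

    # load the JSON file produced via --json-output (placeholder file name)
    with open('detections.json') as f:
        result = json.load(f)

    # invert the "classes" map so numeric class IDs can be turned into names
    id_to_name = {v: k for k, v in result['classes'].items()}

    for frame_name, frame in result['frames'].items():
        for region in frame['regions']:
            label = id_to_name.get(region['class'], 'unknown')
            print('%s: %s at (%d,%d)-(%d,%d)' % (frame_name, label,
                  region['x1'], region['y1'], region['x2'], region['y2']))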

Adding descriptive class names

Since CNTK does not embed the names of the classes in the model, by default the module returns non-descriptive names for the classes, e.g. "class_1", "class_2".

If you want the module to return more descriptive names, you can place a JSON file named "model.json" in the same directory as the Fast-RCNN model file, and put the descriptions of the classes in that JSON file under the "classes" key.

For example, the following JSON describes the classes for the above example:

    {
        "classes": {
            "background": 0,
            "human": 1,
            "cat": 2,
            "dog": 3
        }
    }
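
If you prefer to generate model.json from code, a minimal sketch (using the class list from the example above; the model directory path is a placeholder) could be:

    import json
    from os import path

    model_dir = r'C:/models'  # placeholder: directory containing Fast-RCNN.model

    # class IDs must match the label indices the model was trained with;
    # index 0 is conventionally the background class
    classes = {"background": 0, "human": 1, "cat": 2, "dog": 3}

    with open(path.join(model_dir, 'model.json'), 'w') as f:
        json.dump({"classes": classes}, f, indent=4)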


Issues

Problem with other model than grocery

Hello,
Thanks for the great code, but I have a problem with it when testing models other than the grocery model. It is the same problem as with the similar code published here: https://github.com/Microsoft/CNTK/blob/master/Examples/Image/Detection/FastRCNN/CNTK_FastRCNN_Eval.ipynb

If I load the Pascal model from the original tutorial (pkranen) and evaluate a Pascal image, or use a self-trained model from the Pascal dataset, your code does not detect anything. In the detector, no rois_labels_predictions are set; I guess that this is the problem. The pretrained model I tested against is:
https://www.cntk.ai/Models/FRCN_Pascal/Fast-RCNN.model
Do you have any suggestions on this?

Kind regards,

Dirk

Error while running - ValueError: axis(=1) out of bounds

I'm having a problem using the script from the command line. Here is the command I ran:

(C:\local\Anaconda3-4.1.1-Windows-x86_64\envs\cntk-py35) C:\local\CNTK-2-0-rc3\cntk\Examples\Image\Detection\FastRCNN>python "C:\Users\Ale\Desktop\CNTK-FastRCNNDetector-master\frcnn_detector.py" --input C:\imagetest\1.jpg --model C:\local\CNTK-2-0-rc3\cntk\Examples\Image\Detection\FastRCNN\proc\Grocery_100\cntkFiles\Output\Fast-RCNN.model --output C:\output

and the resulting output:

Selected CPU as the process wide default device.
Number of images to process: 1
Read file in path: C:\imagetest\1.jpg
C:\local\Anaconda3-4.1.1-Windows-x86_64\envs\cntk-py35\lib\site-packages\cntk\core.py:349: UserWarning: your data is of type "float64", but your input variable (uid "Input224") expects "<class 'numpy.float32'>". Please convert your data beforehand to speed up training.
(sample.dtype, var.uid, str(var.dtype)))
C:\local\Anaconda3-4.1.1-Windows-x86_64\envs\cntk-py35\lib\site-packages\cntk\core.py:349: UserWarning: your data is of type "float64", but your input variable (uid "Input225") expects "<class 'numpy.float32'>". Please convert your data beforehand to speed up training.
(sample.dtype, var.uid, str(var.dtype)))
Traceback (most recent call last):
  File "C:\Users\Ale\Desktop\CNTK-FastRCNNDetector-master\frcnn_detector.py", line 361, in <module>
    rects, labels = detector.detect(img)
  File "C:\Users\Ale\Desktop\CNTK-FastRCNNDetector-master\frcnn_detector.py", line 277, in detect
    rois_labels_predictions = np.argmax(rois_values, axis=1)
  File "C:\local\Anaconda3-4.1.1-Windows-x86_64\envs\cntk-py35\lib\site-packages\numpy\core\fromnumeric.py", line 963, in argmax
    return _wrapfunc(a, 'argmax', axis=axis, out=out)
  File "C:\local\Anaconda3-4.1.1-Windows-x86_64\envs\cntk-py35\lib\site-packages\numpy\core\fromnumeric.py", line 57, in _wrapfunc
    return getattr(obj, method)(*args, **kwds)
ValueError: axis(=1) out of bounds
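
As an aside, the two UserWarning lines in the log come from passing float64 data into the network; converting the image to float32 beforehand should avoid them (note this addresses only the warning, not the ValueError itself):

import numpy as np

# CNTK expects float32 inputs; OpenCV images are uint8 and intermediate
# processing can promote them to float64, which triggers the warning
img = img.astype(np.float32)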

Prepare self trained model for evaluation

Hey everybody,

I'm currently following this tutorial (https://github.com/Microsoft/CNTK/blob/master/Examples/Image/Detection/FastRCNN/BrainScript/CNTK_FastRCNN_Eval.ipynb) to implement a Fast-RCNN evaluator, and for the grocery model everything worked fine. Now I want to run the script with my own model, which I trained on my own data. The problem is that the structure of my model is different from the pretrained grocery model, and as I'm still a beginner with CNTK I don't know how to convert it to the right structure.

The model structures below are simply the output of print(model).

print(grocery-model):

Before preparation:
Composite(features: SequenceOver[][Tensor[3,1000,1000]], rois: SequenceOver[][Tensor[100,4]], roiLabels: SequenceOver[][Tensor[100,17]]) -> Tuple[SequenceOver[][Tensor[100,1]], SequenceOver[][Tensor[1,1]], SequenceOver[][Tensor[100,17]]]

After preparation:
Composite(features: Sequence[Tensor[3,1000,1000]], rois: Sequence[Tensor[100,4]]) -> Sequence[Tensor[100,17]]

print(my-model):

Before preparation:
Composite(data: Tensor[3,850,850], roi_proposals: Tensor[200,4]) -> Tuple[Tensor[200,7], Tensor[200,28]]

Any ideas or suggestions are appreciated.

Update
I think I got part of it; this is my code to convert the input:

    # imports assumed for this snippet (CNTK 2.x)
    from cntk import load_model, placeholder, input_variable, combine, CloneMethod
    from cntk.logging.graph import find_by_name

    # load trained model
    trained_frcnn_model = load_model(modelPath)

    # find the original features and rois input nodes
    features_node = find_by_name(trained_frcnn_model, "data")
    rois_node = find_by_name(trained_frcnn_model, "roi_proposals")  # node name as shown by print(model)

    # find the output "z" node
    z_node = find_by_name(trained_frcnn_model, 'drop7')

    # define new input nodes for the features (image) and rois
    image_input = input_variable(shape=(3, 850, 850), name='features')
    roi_input = input_variable(shape=(200, 4), name='rois')

    # clone the desired layers with fixed weights and placeholders for the new input nodes
    cloned_nodes = combine([z_node.owner]).clone(
        CloneMethod.freeze,
        {features_node: placeholder(name='features'), rois_node: placeholder(name='rois')})

    # apply the cloned nodes to the input nodes
    self.model = cloned_nodes(image_input, roi_input)

    print("Model loaded successfully!")

But I'm still not sure about the z output node: I'm pretty sure drop7 is the wrong one, but find_by_name can't find a node named z.
