karolzak / cntk-hotel-pictures-classificator

This POC uses CNTK 2.1 to train a model for multiclass classification of images. The model recognizes specific objects (e.g. toilet, tap, sink, bed, lamp, pillow) associated with the picture types we are looking for. It plays a key role in a process that classifies pictures from different hotels and determines whether a picture shows a bathroom, bedroom, hotel front, swimming pool, bar, etc.

License: MIT License

Python 100.00%
ai alexnet-model classification cntk cntk-model cognitive-toolkit deep-learning deep-neural-networks dsvm faster-rcnn machine-learning ml object-classification object-detection python transfer-learning

cntk-hotel-pictures-classificator's People

Contributors

karolzak

cntk-hotel-pictures-classificator's Issues

Hard coding concerning the model architecture

The configuration of the network architecture is hard-coded in several places in the main program.

They are:

  1. The input dimension of Fast-RCNN, which is hard-coded as [4096, ]. Other values should be possible, e.g. if I want to use a ResNet that ends with dimension [1000, ].
  2. The spatial_scale of the RoiPooling layer, which is hard-coded as 1/16. As I understand it, this ratio depends on the actual total stride of the convolutional layers.
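
To illustrate the second point, here is a minimal sketch of deriving the scale from the strides instead of hard-coding it. The helper name roi_spatial_scale and the idea of passing the strides as a list are assumptions made for this example; they are not part of the project's code.

```python
from fractions import Fraction


def roi_spatial_scale(strides):
    """Return 1 / (product of all strides) as an exact fraction.

    `strides` lists the stride of every downsampling layer (convolution or
    pooling) that sits between the input image and the RoI pooling layer.
    Hypothetical helper, not taken from the project.
    """
    total_stride = 1
    for stride in strides:
        if stride < 1:
            raise ValueError("strides must be positive integers")
        total_stride *= stride
    return Fraction(1, total_stride)


# Four stride-2 downsampling steps give a total stride of 16.
scale = roi_spatial_scale([2, 2, 2, 2])
print(float(scale))  # 0.0625, i.e. 1/16
```

With a base network whose layers downsample by a different total factor, the same call would give a different scale, which is why a fixed 1/16 only fits some architectures.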

Font 'Arial' is not default for Linux

Hello sir,

In your file 'Detection/FasterRCNN/plot_helpers.py', you use the font 'Arial' to draw text, but it is not a default font on my Linux machine (Azure DSVM for Linux with Ubuntu 16.04).

I am not sure whether the fix should be included in the Readme 'Setup' section, telling Linux users to install the 'Arial' font before training the model. If anyone else encounters this problem, the link below may help: https://askubuntu.com/questions/651441/how-to-install-arial-font-in-ubuntu

Insufficient number of colors in the plot_helper

In plot_helpers.py, the colors for drawing boxes currently support only up to 15 classes, defined by five base colors and three variants of each.

This is not enough when I train and evaluate on my own custom dataset, which contains about 50 classes.
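
One possible way around a fixed palette is to generate as many colors as there are classes. The sketch below spreads hues evenly around the color wheel using Python's standard colorsys module. The function name distinct_colors and the saturation and brightness values are assumptions chosen for illustration, and the channel order expected by the drawing code would need to be checked.

```python
import colorsys


def distinct_colors(num_classes, saturation=0.8, value=0.9):
    """Return `num_classes` RGB tuples (0-255) with evenly spaced hues.

    Hypothetical helper, not taken from plot_helpers.py.
    """
    colors = []
    for index in range(num_classes):
        hue = index / num_classes  # evenly spaced in [0, 1)
        red, green, blue = colorsys.hsv_to_rgb(hue, saturation, value)
        colors.append((int(red * 255), int(green * 255), int(blue * 255)))
    return colors


palette = distinct_colors(50)
print(len(palette))  # 50
```

Neighbouring hues become harder to tell apart as the class count grows, so for very large label sets it may help to also vary saturation or brightness.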

Name mismatching for the base model VGG16

Hello,

There is a name mismatch error when I switch the base model to VGG16. The problem is that the model name is defined in the config file, e.g. VGG16, while the actual name of the downloaded VGG model is VGG16_ImageNet_Caffe. After the download finishes, the program saves the model as VGG16, but another part of the main program loads it under the name VGG16_ImageNet_Caffe.

The same problem will occur with any other base model whose program-defined name differs from the name of the downloaded model.

Best wishes!
Lin.
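
A minimal sketch of one way to avoid this kind of mismatch: resolve the file name in a single place and use that same function for both saving and loading. The mapping, the function name base_model_path, the directory name and the ".model" extension are assumptions made for this example, not the project's actual code.

```python
import os

# Hypothetical mapping from the short name used in the config file to the
# file name of the downloaded model.
MODEL_FILE_NAMES = {
    "VGG16": "VGG16_ImageNet_Caffe.model",
}


def base_model_path(model_dir, config_name):
    """Return the single path used for both saving and loading a base model.

    Falls back to "<config_name>.model" when no mapping entry exists.
    """
    file_name = MODEL_FILE_NAMES.get(config_name, config_name + ".model")
    return os.path.join(model_dir, file_name)


# Both the download step and the load step would call the same function.
print(base_model_path("PretrainedModels", "VGG16"))
```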

ImportError: No module named 'utils.cython_modules.cython_bbox'

I know this error has been widely discussed. Here are some details I found for this program.

  1. The default cython binaries in the directory 'Detection/utils/cython_modules/' only support particular Python versions: 3.4 (Linux) and 3.5 (Windows). Anyone who gets the error in the title should check whether their Python version matches.

  2. The latest cython module binaries are available in the CNTK repo. However, after copying the binaries for Python 3.5 (Linux), i.e. cpu_nms.cpython-35m.so and cython_bbox.cpython-35m.so, into the directory 'Detection/utils/cython_modules/' of this project, the same error as in the title is still reported during compilation. The program runs fine if I switch to Python 3.4 through an Anaconda virtual environment and run the same project, i.e. same directory and same program.

  3. Another solution is to recompile the cython modules ourselves, following the instructions in the CNTK Guide. The program then passes compilation without any error, but unfortunately it prints some unknown error messages during training.

All of my tests were run on the latest DSVM for Linux.
Default Python: Python 3.5 with CNTK 2.2
Python 3.4 Virtual Environment: Python 3.4 with CNTK 2.1
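
As a quick diagnostic for the version check in point 1, the sketch below asks the running interpreter which extension-module suffixes it accepts and looks for a matching binary. importlib.machinery.EXTENSION_SUFFIXES is part of the standard library. The helper name loadable_binary is an assumption for this example, and the suffix lists in the usage lines are illustrative values rather than output captured from a DSVM.

```python
from importlib.machinery import EXTENSION_SUFFIXES


def loadable_binary(module_name, filenames, suffixes=EXTENSION_SUFFIXES):
    """Return the first file this interpreter could import as `module_name`.

    A compiled module is only found when its file name equals the module
    name plus one of the interpreter's extension suffixes. Returns None when
    no file matches. Hypothetical helper, not part of the project.
    """
    available = set(filenames)
    for suffix in suffixes:
        candidate = module_name + suffix
        if candidate in available:
            return candidate
    return None


# Illustrative file names and suffix lists (assumed values).
files = ["cython_bbox.cpython-34m.so", "cpu_nms.cpython-34m.so"]
print(loadable_binary("cython_bbox", files, [".cpython-34m.so", ".so"]))
print(loadable_binary("cython_bbox", files, [".cp35-win_amd64.pyd", ".pyd"]))
```

In practice the file list could come from os.listdir('Detection/utils/cython_modules/'), with the default suffixes of the interpreter that runs the training script.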

list index out of range

I was training with 17 images and testing with 3 images, but I am getting the error below. Is it because the objects are identified incorrectly, given that the mean AP is zero?

warnings.warn(WARNING_MSG_GPU_ONLY % ('GPU-Specific', 'https://docs.microsoft.com/en-us/cognitive-toolkit/Setup-Windows-Python#optional-gpu-specific-packages'))
Selected CPU as the process wide default device.
Using the following parameters:
Flip image : True
Train conv layers: True
Random seed : 3
Momentum per MB : 0.9
E2E epochs : 20
Loading existing model from D:\Dropbox (eClerx Services Ltd.)\Navdeep.Singh05\Documents\navdeep\coe work\Hilton\cntk custom hotels\Detection\FasterRCNN\Output\faster_rcnn_eval_AlexNet_e2e.model
D:\Dropbox (eClerx Services Ltd.)\Navdeep.Singh05\Documents\navdeep\coe work\Hilton\cntk custom hotels\Detection\FasterRCNN..\utils\rpn\proposal_layer.py:33: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
layer_params = yaml.load(self.param_str_)
Evaluating Faster R-CNN model for 3 images.
Number of rois before non-maximum suppression: 0
Number of rois after non-maximum suppression: 0
AP for shacks = 0.0000
AP for pool = 0.0000
Mean AP = 0.0000
D:\Dropbox (eClerx Services Ltd.)\Navdeep.Singh05\Documents\navdeep\coe work\Hilton\cntk custom hotels\Detection\FasterRCNN..\utils\rpn\proposal_layer.py:33: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
layer_params = yaml.load(self.param_str_)
Plotting results from Faster R-CNN model for 3 images.
roiScores min: 0.46174436807632446, max: 1.0, threshold: 0.1
reset decision threshold to: 0.23087218403816223
Traceback (most recent call last):
File "FasterRCNN.py", line 750, in
bgrPlotThreshold=cfg["CNTK"].RESULTS_BGR_PLOT_THRESHOLD)
File "D:\Dropbox (eClerx Services Ltd.)\Navdeep.Singh05\Documents\navdeep\coe work\Hilton\cntk custom hotels\Detection\FasterRCNN\plot_helpers.py", line 186, in eval_and_plot_faster_rcnn
decisionThreshold=bgrPlotThreshold)
File "D:\Dropbox (eClerx Services Ltd.)\Navdeep.Singh05\Documents\navdeep\coe work\Hilton\cntk custom hotels\Detection\FasterRCNN\plot_helpers.py", line 95, in visualizeResultsFaster
text = classes[label]
IndexError: list index out of range
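
The traceback ends at classes[label], so the predicted label index lies outside the class list passed to the plotting code. One possible cause, stated here as an assumption, is that the class list used for plotting has fewer entries than the number of classes the loaded model outputs. A minimal defensive sketch, with the hypothetical helper name class_name_for, would make such a mismatch visible instead of crashing:

```python
def class_name_for(label, classes):
    """Return the class name for `label`, or a placeholder if out of range.

    Hypothetical helper, not taken from plot_helpers.py.
    """
    if 0 <= label < len(classes):
        return classes[label]
    return "unknown({})".format(label)


# Illustrative class list; the real one comes from the dataset's class map.
classes = ["__background__", "shacks", "pool"]
print(class_name_for(2, classes))  # pool
print(class_name_for(5, classes))  # unknown(5)
```

A placeholder only hides the symptom. If it shows up, the class list and the model's output dimension should be compared.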

Insufficient Memory: bad_alloc

Hello,
when I tried to train the model with the VGG16 base model, I got a bad_alloc error caused by insufficient memory on my DSVM with 4GB of memory.

This is not a programming error. But I think stating the minimum resource requirements, such as memory, would spare a lot of pain for people whose training aborts after running for a long time because of this problem.

Finally, here are the memory requirements I observed in experiments on the Linux DSVM:

  1. VGG16 + e2e: above 4GB but below 8GB
  2. VGG16 + 4-stage: around 17GB

Best wishes!

Default font may still be missing

I saw your new commit. You now fall back to FreeMono if Arial is not found. But the font-not-found problem still occurs when I run the program on the Azure DSVM or DLVM.

So I think the better way is to mention this problem in the Readme, so that people know everything they need before the program can run successfully.
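
Besides documenting the requirement, the fallback could be extended so that a missing font never aborts the run. The sketch below tries a list of candidate fonts and ends with a built-in default. The font loader is passed in as a parameter so the example stays self-contained. The assumption that the project draws text with Pillow, and the helper name load_first_available_font, are mine.

```python
def load_first_available_font(candidates, size, loader, fallback):
    """Return the first font from `candidates` that `loader` can open.

    `loader(name, size)` is expected to raise OSError for a missing font,
    and `fallback()` must always succeed. Hypothetical helper, not taken
    from the project.
    """
    for name in candidates:
        try:
            return loader(name, size)
        except OSError:
            continue  # font not installed, try the next candidate
    return fallback()


# With Pillow installed, the call could look like this:
#   from PIL import ImageFont
#   font = load_first_available_font(
#       ["arial.ttf", "FreeMono.ttf"], 16,
#       ImageFont.truetype, ImageFont.load_default)
```

Pillow's ImageFont.truetype raises OSError when the font file cannot be read, and ImageFont.load_default returns a built-in bitmap font, so the chain ends with a font that is always present.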
