luxonis / depthai-ml-training

Some Example Neural Models that we've trained along with the training scripts

License: MIT License

Jupyter Notebook 99.99% Python 0.01%

depthai-ml-training's Issues

Google Colab MobilenetSSD training does not use GPU

I attempted to use this notebook, https://github.com/luxonis/depthai-ml-training/blob/master/colab-notebooks/Easy_Object_Detection_With_Custom_Data_Demo_Training.ipynb, to train a model and received the following error.

I cleared all outputs, disconnected and reset my session, and tried again, but got the same result when I reached the step that starts the training. Is there a library that should be loaded but is being skipped, or is there a workaround?

2022-11-23 20:13:47.476326: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2022-11-23 20:13:47.476531: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2022-11-23 20:13:47.476670: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2022-11-23 20:13:47.476810: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcurand.so.10.0'; dlerror: libcurand.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2022-11-23 20:13:47.476941: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusolver.so.10.0'; dlerror: libcusolver.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2022-11-23 20:13:47.477072: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusparse.so.10.0'; dlerror: libcusparse.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2022-11-23 20:13:47.477231: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2022-11-23 20:13:47.477251: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1641] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform. Skipping registering GPU devices...
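The warnings above all name shared libraries with a `.so.10.0` or `.so.7` suffix, which suggests the installed TensorFlow build expects CUDA 10.0 and cuDNN 7 while the runtime provides something else. As a rough aid for reading logs like this, here is a minimal sketch that pulls the missing library names out of the log text. The function name `missing_libraries` and the sample string are illustrative, not part of the notebook.

```python
import re

# Matches the library name in TensorFlow's "Could not load dynamic library" warnings.
MISSING_LIB = re.compile(r"Could not load dynamic library '([^']+)'")


def missing_libraries(log_text):
    """Return the missing shared libraries named in the log, in order, without repeats."""
    seen = []
    for name in MISSING_LIB.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


# Two warnings copied from the log above (shortened to the relevant part).
sample = (
    "W tensorflow/stream_executor/platform/default/dso_loader.cc:55] "
    "Could not load dynamic library 'libcudart.so.10.0'; "
    "dlerror: libcudart.so.10.0: cannot open shared object file\n"
    "W tensorflow/stream_executor/platform/default/dso_loader.cc:55] "
    "Could not load dynamic library 'libcudnn.so.7'; "
    "dlerror: libcudnn.so.7: cannot open shared object file\n"
)

print(missing_libraries(sample))
```

The version suffixes in the result indicate which CUDA and cuDNN releases the TensorFlow build was linked against.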

YoloV6 pt to blob converter tool not working

I'm using this colab notebook to train a YoloV6N model.

As shown in the notebook, I train my model and then download my pytorch weights. I then use the online tool to convert it to blob format, which is failing (as shown below). Pt weights here

I've also tried "640 352" in the field "Input Shape", but to no avail.

Several Dependency Conflicts on YOLOv6n training notebook.

Hi, I ran into these dependency conflicts last week, along with the "numpy has no attribute 'int'" error, and worked through them. I don't know if it's worth updating the Colab notebook, since these things continue to change, but this install string resolved the issue. Most of it came down to specifying the highest versions that do not break the routine.

!pip install -U -q torch==1.13.0 "torchvision>=0.9.0" protobuf==3.19.5 numpy==1.23.1 "opencv-python>=4.1.2" "PyYAML>=5.3.1" "scipy>=1.4.1" "tqdm>=4.41.0" "addict>=2.4.0" tensorboard==2.9.0 "pycocotools>=2.0" "onnx>=1.10.0" "onnx-simplifier>=0.3.6" thop
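One thing to watch with install strings like this: in a shell, an unquoted `>` starts an output redirection, so a requirement such as `torchvision>=0.9.0` needs quoting to reach pip with its version constraint intact. A minimal sketch of building such a command safely with the standard library follows. The helper name `pip_install_command` is hypothetical and the list is a subset of the requirements above.

```python
import shlex

# A subset of the requirement specifiers from the install string above.
requirements = [
    "torch==1.13.0",
    "torchvision>=0.9.0",
    "protobuf==3.19.5",
    "numpy==1.23.1",
    "onnx>=1.10.0",
]


def pip_install_command(specs):
    """Build a shell-safe `pip install` command line from requirement specifiers."""
    # shlex.quote adds single quotes only where the shell would otherwise
    # interpret a character, such as the '>' in 'torchvision>=0.9.0'.
    return "pip install -U -q " + " ".join(shlex.quote(spec) for spec in specs)


command = pip_install_command(requirements)
print(command)
```

Specifiers that use only `==` come through unchanged, while those containing `>` are wrapped in single quotes.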

Thank you guys for all of the hard work you've put into this, though. 👍 It must be a mess keeping all of these code iterations together.

mobilenetssd for OAK-1

I would like to know how to train a MobileNetSSD model for the OAK-1 Lite. I noticed that the existing notebooks for training MobileNetSSD are deprecated. Could you please provide any methods or guidance on how to proceed with training? Thank you for your help.

Trained model not usable: RuntimeError: Device booted with different OpenVINO version that pipeline requires

I followed this Deeplab V3 Plus Mobile Net V3 notebook.

I changed nothing and went through the whole procedure to see whether it would yield a retrained model.
It does generate a model and also converts it to a .blob file.

I then used the generated model with the depthai project to test it on my OAK-D camera. It returned an error:

$  python depthai_demo.py -cnn deeplabv3pmnv2 -vid /media/Workspace/Work/MyBuddy/Data/videos/boxhill_trail_back_ns.mp4 -sh 8 
Using depthai module from:  /media/Workspace/Learning/Github/depthai/myvenv/lib/python3.8/site-packages/depthai.cpython-38-x86_64-linux-gnu.so
Depthai version installed:  2.7.0.0
Available devices:
[0] 14442C10013762D700 [X_LINK_UNBOOTED]
Enabling low-bandwidth mode due to low USB speed... (speed: UsbSpeed.HIGH)
Traceback (most recent call last):
  File "/home/anaconda3/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/anaconda3/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/.vscode/extensions/ms-python.python-2021.8.1147840270/pythonFiles/lib/python/debugpy/__main__.py", line 45, in <module>
    cli.main()
  File "/home/.vscode/extensions/ms-python.python-2021.8.1147840270/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 444, in main
    run()
  File "/home/.vscode/extensions/ms-python.python-2021.8.1147840270/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 285, in run_file
    runpy.run_path(target_as_str, run_name=compat.force_str("__main__"))
  File "/home/anaconda3/lib/python3.8/runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/anaconda3/lib/python3.8/runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/anaconda3/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/media/Workspace/Learning/Github/depthai/depthai_demo.py", line 184, in <module>
    device.startPipeline(pm.p)
RuntimeError: Device booted with different OpenVINO version that pipeline requires

I then downgraded my depthai to 2.7.0.0 and it throws another error:

I then updated the requirements.txt to have
depthai==2.7.0.0

I then re-ran the command, and the same error occurred again:
RuntimeError: Device booted with different OpenVINO version that pipeline requires

Converting a SavedModel Tensorflow Format to Luxonis Blob format

This is a custom trained model on edgeimpulse.com which provides a Tensorflow SavedModel as well as a Tensorflow Lite Model at the end of the training. The issue is converting a Tensorflow SavedModel to a Luxonis Blob model by first Freezing the Tensorflow Saved Model and then trying to use https://blobconverter.luxonis.com/.

These are my files:
SavedModel file:
SavedModel.zip

Tflite file:
Tflite_file.zip

Sorry for opening this issue, since it might have come up several times, but none of the existing solutions solved my problem. I've attached the files in case they work for you. Sometimes the model does convert, but it doesn't work when running on the OAK-D. For reference, the model has 5 labels and is built on MobileNet.

Freezing a Tensorflow Saved Model. The method successfully freezes the SavedModel, however it fails while using https://blobconverter.luxonis.com/ to convert the Frozen format to a blob format.

Code to freeze the SavedModel:

import tensorflow as tf

# Load the saved model
loaded = tf.saved_model.load("/content/saved_model/")

# Extract the graph
graph = tf.function(loaded.signatures[tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY]).get_concrete_function(tf.TensorSpec(shape=[1, 320, 320, 3], dtype=tf.float32))
frozen_graph = graph.graph.as_graph_def()

# Save the frozen graph
with tf.io.gfile.GFile("/content/frozen_model.pb", "wb") as f:
    f.write(frozen_graph.SerializeToString())

Error I get with blobconverter:

[ ERROR ]  Cannot infer shapes or values for node "StatefulPartitionedCall".
[ ERROR ]  Expected DataType for argument 'dtype' not None.
[ ERROR ]  
[ ERROR ]  It can happen due to bug in custom shape infer function <function tf_native_tf_node_infer at 0x7f99c67e1af0>.
[ ERROR ]  Or because the node inputs have incorrect values/shapes.
[ ERROR ]  Or because input shapes are incorrect (embedded to the model or passed via --input_shape).
[ ERROR ]  Run Model Optimizer with --log_level=DEBUG for more information.
[ ERROR ]  Exception occurred during running replacer "REPLACEMENT_ID" (<class 'openvino.tools.mo.middle.PartialInfer.PartialInfer'>): Stopped shape/value propagation at "StatefulPartitionedCall" node. 
 For more information please refer to Model Optimizer FAQ, question #38. (https://docs.openvino.ai/latest/openvino_docs_MO_DG_prepare_model_Model_Optimizer_FAQ.html?question=38#question-38)
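A possible reason for the `StatefulPartitionedCall` error is that the snippet above serializes the traced graph without folding the variables into constants, so the result is not really frozen. A commonly used pattern in TensorFlow 2.x is `convert_variables_to_constants_v2`. The sketch below assumes TensorFlow 2.x is installed and that the model exposes the default serving signature. The function name `freeze_saved_model` and its arguments are illustrative, and whether the result then passes through blobconverter is untested here.

```python
def freeze_saved_model(saved_model_dir, output_dir, output_name="frozen_model.pb"):
    """Write a frozen GraphDef with the SavedModel's variables folded into constants."""
    # Imported inside the function so the module can be loaded without TensorFlow.
    import tensorflow as tf
    from tensorflow.python.framework.convert_to_constants import (
        convert_variables_to_constants_v2,
    )

    # Load the SavedModel and pick its default serving signature.
    loaded = tf.saved_model.load(saved_model_dir)
    concrete = loaded.signatures[tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY]

    # Replace variable reads with constants to obtain a self-contained graph.
    frozen = convert_variables_to_constants_v2(concrete)

    # Serialize the frozen graph as a binary .pb file.
    tf.io.write_graph(
        frozen.graph.as_graph_def(), output_dir, output_name, as_text=False
    )
    return frozen
```

With the paths from the issue, a call would look like `freeze_saved_model("/content/saved_model/", "/content")`.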

Using a SavedModel -> IR Representation (OpenVINO) -> Blob conversion
I used the standard OpenVINO instructions on the site to convert the SavedModel into a .xml and a .bin file.

This is what I used. It converted the model to .xml and .bin files without errors, but the .xml file was just 2 KB and the .bin file was 0 KB.
Output while conversion:

[ INFO ] The model was converted to IR v11, the latest model format that corresponds to the source DL framework input/output format. While IR v11 is backwards compatible with OpenVINO Inference Engine API v1.0, please use API v2.0 (as of 2022.1) to take advantage of the latest improvements in IR v11.
Find more information about API v2.0 and IR v11 at https://docs.openvino.ai/latest/openvino_2_0_transition_guide.html
[ SUCCESS ] Generated IR version 11 model.
[ SUCCESS ] XML file: /content/saved_model.xml
[ SUCCESS ] BIN file: /content/saved_model.bin

Nevertheless, I tried to convert it to a blob file, which came out as a 1 KB file. When I tried to run it, the NN size was displayed as 3x320, which is absurd, since the model was trained on 320x320 images in [1,320,320,3] format.

I reran the conversion again using:

!mo --input_shape [1,320,320,3] --saved_model_dir /content/saved_model/ --layout "ncwh->nhwc"

However, this time when I converted it to blob and used it while running, it gave me an error stating the bounding boxes contain x=1,y=0,w=0,h=0.

Next I tried a .tflite to .onnx approach using https://github.com/zhenhuaw-me/tflite2onnx, which failed to convert the model to ONNX in the very first stage. (A .tflite model is also included on the EdgeImpulse dashboard, so I thought I would try this out.)

Code:

import tflite2onnx

tflite_path = '/content/trained.tflite'
onnx_path = '/content/model.onnx'

tflite2onnx.convert(tflite_path, onnx_path)

Error:

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
[<ipython-input-37-c193f27d3323>](https://localhost:8080/#) in <module>
      4 onnx_path = '/content/model.onnx'
      5 
----> 6 tflite2onnx.convert(tflite_path, onnx_path)

4 frames
[/usr/local/lib/python3.8/dist-packages/tflite2onnx/op/common.py](https://localhost:8080/#) in create(self, index)
    152             if opcode in tflite.BUILTIN_OPCODE2NAME:
    153                 name = tflite.opcode2name(opcode)
--> 154                 raise NotImplementedError("Unsupported TFLite OP: {} {}!".format(opcode, name))
    155             else:
    156                 raise ValueError("Opcode {} is not a TFLite builtin operator!".format(opcode))

NotImplementedError: Unsupported TFLite OP: 83 PACK!

Next, I tried using tf2onnx (https://github.com/onnx/tensorflow-onnx) using a SavedModel file
Input:

python -m tf2onnx.convert --saved-model /content/saved_model/ --output /content/model.onnx

Output:

2023-01-19 17:18:58,166 - WARNING - '--tag' not specified for saved_model. Using --tag serve
2023-01-19 17:19:12,548 - INFO - Signatures found in model: [serving_default].
2023-01-19 17:19:12,548 - WARNING - '--signature_def' not specified, using first signature: serving_default
2023-01-19 17:19:12,550 - INFO - Output names: ['output_0', 'output_1', 'output_2', 'output_3']
2023-01-19 17:19:15,690 - INFO - Using tensorflow=2.9.2, onnx=1.13.0, tf2onnx=1.13.0/2c1db5
2023-01-19 17:19:15,690 - INFO - Using opset <onnx, 13>
2023-01-19 17:19:15,694 - INFO - Computed 0 values for constant folding
2023-01-19 17:19:15,700 - INFO - Optimizing ONNX model
2023-01-19 17:19:15,715 - INFO - After optimization: Const -3 (4->1), Identity -1 (4->3)
2023-01-19 17:19:15,716 - INFO - 
2023-01-19 17:19:15,716 - INFO - Successfully converted TensorFlow model /content/saved_model/ to ONNX
2023-01-19 17:19:15,716 - INFO - Model inputs: ['input']
2023-01-19 17:19:15,716 - INFO - Model outputs: ['output_0', 'output_1', 'output_2', 'output_3']
2023-01-19 17:19:15,716 - INFO - ONNX model is saved at /content/model.onnx

But after that, while converting it to blob using blobconverter, this is the issue:

[ ERROR ]  Numbers of inputs and mean/scale values do not match. 
 For more information please refer to Model Optimizer FAQ, question #61. (https://docs.openvino.ai/latest/openvino_docs_MO_DG_prepare_model_Model_Optimizer_FAQ.html?question=61#question-61)

I'm completely stuck, have tried everything I could, and am hoping there is some workaround. Any workaround for converting this to a blob would be really helpful, since I'm completing the project under a tight time constraint.

Thanks a lot. Your help is really appreciated!
Dhruv Sheth

YOLOv4-tiny custom - no detection

After custom training I tried to run the blob file on an OAK-D WiFi, but there are no detections.
Could you explain what I've done wrong? I placed the converted blob file in the nn folder and changed all labels in the Python script according to the documentation. I also tried reducing the IoU and threshold to 0.1, but there is still no detection. In the Colab test, yolov4-tiny works fine. I couldn't verify the OpenVINO conversion (I'm a newbie).
In the tutorials I've found different versions of OpenVINO (2021_1 and 2021_2). For my tests I use 2021_1 because I couldn't find out how to download the version from February or March with wget.
I've tried two different example scripts (from the Colab notebook and from the documentation).
I have:

2 classes
608x608 Yolov4-tiny
placed only the .blob file, without any other files from the conversion

Could you help me solve my issue?
Thank you!
It would also be great if you could provide a link to current documentation, because many hyperlinks are dead.

Converted YOLO model doesn't show any outputs

I converted a yolo-v3-tiny model to OpenVINO and then to a .blob file as shown in Easy_TinyYolov3_Object_Detector_Training_on_Custom_Data.ipynb. I don't receive any error messages when I try to use the model as described in the "To run the .blob in DepthAI" section, but it doesn't show any output.

My notebook: https://colab.research.google.com/drive/1Kt-ESQ-0wpemo2Bg7AefI6qL3Xt374jp?usp=sharing

YoloV7 Training Colab Error: AttributeError: module 'numpy' has no attribute 'int'

Attempted to run the YoloV7 training colab without any modification:

https://colab.research.google.com/github/luxonis/depthai-ml-training/blob/master/colab-notebooks/YoloV7_training.ipynb#scrollTo=Z7rEpSboRibz

The first error occurred after running

!python train.py --epochs 2 --workers 8 --device 0 --batch-size 32 --data data/voc.yaml --img 640 640 --cfg cfg/training/yolov7_voc-tiny.yaml --weights 'yolov7-tiny.pt' --hyp data/hyp.scratch.tiny.yaml

/usr/local/lib/python3.8/dist-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3190.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
Model Summary: 263 layers, 6066402 parameters, 6066402 gradients, 13.3 GFLOPS

Transferred 330/344 items from yolov7-tiny.pt
Scaled weight_decay = 0.0005
Optimizer groups: 58 .bias, 58 conv.weight, 61 other
train: Scanning 'VOCdevkit/voc_07_12/labels/train.cache' images and labels... 16551 found, 0 missing, 0 empty, 0 corrupted: 100% 16551/16551 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 616, in <module>
    train(hyp, opt, device, tb_writer)
  File "train.py", line 245, in train
    dataloader, dataset = create_dataloader(train_path, imgsz, batch_size, gs, opt,
  File "/content/yolov7/utils/datasets.py", line 69, in create_dataloader
    dataset = LoadImagesAndLabels(path, imgsz, batch_size,
  File "/content/yolov7/utils/datasets.py", line 418, in __init__
    bi = np.floor(np.arange(n) / batch_size).astype(np.int)  # batch index
  File "/usr/local/lib/python3.8/dist-packages/numpy/__init__.py", line 284, in __getattr__
    raise AttributeError("module {!r} has no attribute "
AttributeError: module 'numpy' has no attribute 'int'
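The `np.int` alias was deprecated in NumPy 1.20 and removed in NumPy 1.24, so code such as `.astype(np.int)` fails on newer NumPy. The usual fixes are to pin an older NumPy or to replace `np.int` with the builtin `int`. Here is a minimal sketch of the second option as a text rewrite. The helper name `replace_np_int` is hypothetical, and applying it to `utils/datasets.py` is left to the reader.

```python
import re

# Matches `np.int` only as a whole name, so np.int32, np.int64 and np.int_
# are left untouched (the trailing \b fails before a digit or underscore).
NP_INT = re.compile(r"\bnp\.int\b")


def replace_np_int(source):
    """Replace the removed `np.int` alias with the builtin `int` in source text."""
    return NP_INT.sub("int", source)


# The failing line from the traceback above.
line = "bi = np.floor(np.arange(n) / batch_size).astype(np.int)  # batch index"
print(replace_np_int(line))
```

Because `np.int` was only an alias for the builtin `int`, the rewritten line behaves the same on older NumPy versions.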

Your help would be much appreciated!

Happy new year and holidays!

YOLOv6 notebook not working due to new release

Hello,

I've been trying to use YOLOv6 notebook and everything was okay until I got to training.
I get Can't get attribute 'SimConv' on <module 'yolov6.layers.common'

When I checked v6 repo, I found the following issue: meituan/YOLOv6#799
Turns out that new version was released v0.4.0 (yesterday), so the weights downloaded in notebook are not correct.

I did try to change the link to download 0.4.0 weights. Model then trained, but output from that can't be converted in blobconverter:

So I figured that something added in 0.4.0 is not supported and I'm writing this here. In the meantime, if anyone else is facing the same issue, just checkout tag 0.3.0 in notebook, and everything should work the same until release 0.4.0 is supported.

Yolov7 issue

Hello
When I try to execute the tensorboard cell or start training in the Yolov7 notebook, I get the following error:

tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA

If you cannot immediately regenerate your protos, some other possible workarounds are:

Downgrade the protobuf package to 3.20.x or lower.
Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
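The second workaround can be applied from Python itself, as long as it runs before anything imports protobuf-generated modules. A minimal sketch, assuming it is placed in the first notebook cell:

```python
import os

# Select the pure-Python protobuf implementation. This must be set before
# the first import of any package that loads protobuf-generated code.
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"

print(os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"])
```

Child processes started afterwards from the same session inherit the variable. As the message above notes, pure-Python parsing is much slower, so downgrading protobuf may be preferable.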

Yolov7 issue with people count

Conversion issue

https://colab.research.google.com/drive/1oNxfvx5jOfcmk1Nx0qavjLN8KtWcLRn6?usp=sharing#scrollTo=b5Z1QFooGIW8

Hi
I am working on this project and want to test it. The code works fine until the conversion starts, at which point I receive multiple errors.
Could you please help with this and check the code?

Thank you very much

PyTorch Model Support

Hi team,

I was wondering whether you have any plans to support PyTorch models in the converter.

Given how popular PyTorch is, it seems odd not to support PyTorch models.

I was okay with going with TF until I actually tried to train a custom model and ran into many hurdles:

This is what I have experienced:

Hurdle 1
I followed the official tutorials for training a custom segmentation model here and here
and found they are all DEPRECATED (copied and pasted, capitalization unchanged).

Hurdle 2
I then followed the Roboflow Colab notebook and discovered that we need to apply for a token in order to use its script, which I cannot find.

Hurdle 3
I then created an account on their website, found some API keys, and tried each of them (e.g. the Publishable API Key), but none of them worked.

After hours of searching, I gave up. It's just too much hassle to follow the tutorials to train our custom models with TensorFlow.

I then went to the blob converter website and found that it does not even support PyTorch.

My suggestions:

Customers have paid for the hardware, so please provide them with an open-source project (not a deprecated one) for training their custom models for free. Try to minimize the difficulty of using your products.
My expectation was that once I had paid for the hardware, I would not need to pay for anything else in order to use it without difficulty. In reality, Roboflow does provide a free "public version" (which needs a token that is nowhere to be found), but we have to give up our dataset in order to use the service. Could you clean up the old and deprecated code and provide your customers with clean, up-to-date, functional code so that they can happily use your products?
Hopefully, PyTorch models can be supported soon.

Error when running main.py device-decoding

Hello!

I was following the steps in the YOLOv8 Colab and hit an error in the final steps, when trying to run the camera with the resulting model.

The same happened when I tried to run the Yolov7 object detection Colab. Here is the error trace.

/home/venv/lib/python3.9/site-packages/depthai_sdk/oak_camera.py:237: UsbWarning: Device connected in USB2 mode! This might cause some issues. In such case, please try using a (different) USB3 cable, or force USB2 mode 'with OakCamera(usbSpeed=depthai.UsbSpeed.HIGH)'
warnings.warn("Device connected in USB2 mode! This might cause some issues. "

Downloading /home/user/.cache/blobconverter/bestyolo7_openvino_2021.4_6shave.blob...
{
"exit_code": 1,
"message": "Command failed with exit code 1, command: /opt/intel/openvino2021_4/deployment_tools/inference_engine/lib/intel64/myriad_compile -m /tmp/blobconverter/0894b1b8dea94e2f8efe44cf49f13736/bestyolo7/FP16/bestyolo7.xml -o /tmp/blobconverter/0894b1b8dea94e2f8efe44cf49f13736/bestyolo7/FP16/bestyolo7.blob -c /tmp/blobconverter/0894b1b8dea94e2f8efe44cf49f13736/myriad_compile_config.txt -ip U8",
"stderr": "Unknown model format! Cannot find reader for model format: xml and read the model: /tmp/blobconverter/0894b1b8dea94e2f8efe44cf49f13736/bestyolo7/FP16/bestyolo7.xml. Please check that reader library exists in your PATH.\n",
"stdout": "Inference Engine: \n\tIE version ......... 2021.4.2\n\tBuild ........... 2021.4.2-3974-e2a469a3450-releases/2021/4\n"
}
Closing OAK camera

Any ideas what might be causing this issue?
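One possible cause of "Cannot find reader for model format: xml" is an IR version mismatch: OpenVINO 2022.1 and later emit IR v11 files, while the `myriad_compile` in the trace is from 2021.4.2, which predates that format. This is an assumption, since the .xml itself is not shown. A minimal sketch for checking the version recorded in an IR file follows. The helper name `ir_version` and the sample XML are illustrative.

```python
import xml.etree.ElementTree as ET


def ir_version(xml_text):
    """Return the `version` attribute of the IR's root <net> element as an int."""
    root = ET.fromstring(xml_text)
    if root.tag != "net":
        raise ValueError("not an OpenVINO IR file: root element is <%s>" % root.tag)
    return int(root.attrib["version"])


# A stripped-down stand-in for a real IR .xml file (hypothetical content).
sample_ir = '<net name="bestyolo7" version="11"><layers/></net>'

print(ir_version(sample_ir))
```

For a real file, read its contents first, for example with `open(path).read()`, and pass the text to `ir_version`.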

tensorflow 1.x not working anymore

Please update the notebook, as Colab has removed support for TensorFlow 1.x.
This no longer works: %tensorflow_version 1.x

Yolov7 issue with people count

The custom model does not convert to a blob file with the Luxonis tool.
It reports an error and says to contact the developer.
I have attached the .pt file below:
yolov8ntrained.zip
I used the COCO dataset restricted to the person class, with annotations from this link:
https://universe.roboflow.com/shreks-swamp/coco-dataset-limited--person-only

Numpy mismatch with Easy_Object_Detection_With_Custom_Data_Demo_Training_Git.ipynb

When running the Colab there is an error in stage "Install Tensorflow Object Detection API":

RuntimeError: module compiled against API version 0xe but this version of numpy is 0xd

I tried different solutions. When I install numpy 1.20.0 this stage works, but it fails later in the training.

This still worked about 3 weeks ago.

Can you fix it again?

Bad results with YoloV6N blob model

I'm using this colab notebook to train a YoloV6N model.

The pt model I get worked perfectly, like this

As shown in the notebook, I train my model and then download my pytorch weights. I use the online tool to convert it to blob format. I then use this file to check the inference of this blob model on the same image.
Unfortunately, I get extremely bad results, as shown here

This doesn't look like a problem to be solved with IoU threshold change (since I see no correct bounding boxes around objects at all)

I use "640" and "640 640" both in the online tool for converting pt to blob model. It didn't fix the issue (in the colab training notebook, I used "--img-size 640" for training).

Can someone please help with this?

Add postprocessing in OAK-d-PoE

Hello!

I am working with the OAK-D-PoE camera. Currently, I have successfully integrated the YOLO v7 neural network into this camera using the tutorial provided on GitHub (https://github.com/luxonis/depthai-experiments/tree/769029ea4e215d03f741bcf085d1bb6c94009856/gen2-yolo/device-decoding).

Could you please advise if it's possible to add post-processing directly to the camera, so that neural network processing -> post-processing -> data transmission to the computer occurs within the camera itself?

I would greatly appreciate your response and any advice on how to accomplish this.

Thank you in advance for your assistance.

YoloV3_V4_tiny_training.ipynb - convert_weights_pb.py error: Cannot convert a symbolic Tensor (detector/yolo-v4-tiny/meshgrid/Size_1:0) to a numpy array.

The following cell:

!python3 convert_weights_pb.py \
--yolo $yolo_version \
--weights_file $weights_best \
--class_names /content/obj.names \
--output $output_name_pb \
--tiny \
-h 320 \
-w 512

Fails w/the output below:

!python3 convert_weights_pb.py \
--yolo $yolo_version \
--weights_file $weights_best \
--class_names /content/obj.names \
--output $output_name_pb \
--tiny \
-h 320 \
-w 512
WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

WARNING:tensorflow:From convert_weights_pb.py:94: The name tf.app.run is deprecated. Please use tf.compat.v1.app.run instead.

None
WARNING:tensorflow:From convert_weights_pb.py:68: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

W0323 14:01:30.138594 140253906458496 module_wrapper.py:139] From convert_weights_pb.py:68: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING:tensorflow:From convert_weights_pb.py:76: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

W0323 14:01:30.147910 140253906458496 module_wrapper.py:139] From convert_weights_pb.py:76: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

WARNING:tensorflow:From /tensorflow-1.15.2/python3.7/tensorflow_core/contrib/layers/python/layers/layers.py:1057: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
W0323 14:01:30.156273 140253906458496 deprecation.py:323] From /tensorflow-1.15.2/python3.7/tensorflow_core/contrib/layers/python/layers/layers.py:1057: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
Traceback (most recent call last):
  File "convert_weights_pb.py", line 94, in <module>
    tf.app.run()
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "convert_weights_pb.py", line 77, in main
    detections = model(inputs, len(classes), anchors, data_format=FLAGS.data_format)
  File "/content/yolo2openvino/models/yolo_v4_tiny.py", line 87, in yolo_v4_tiny
    net, num_classes, anchors[3:6], img_size, data_format)
  File "/content/yolo2openvino/models/common.py", line 79, in _detection_layer
    a, b = tf.meshgrid(grid_x, grid_y)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/ops/array_ops.py", line 2943, in meshgrid
    mult_fact = ones(shapes, output_dtype)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/ops/array_ops.py", line 2560, in ones
    output = _constant_if_small(one, shape, dtype, name)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/ops/array_ops.py", line 2295, in _constant_if_small
    if np.prod(shape) < 1000:
  File "<__array_function__ internals>", line 6, in prod
  File "/usr/local/lib/python3.7/dist-packages/numpy/core/fromnumeric.py", line 3052, in prod
    keepdims=keepdims, initial=initial, where=where)
  File "/usr/local/lib/python3.7/dist-packages/numpy/core/fromnumeric.py", line 86, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/ops.py", line 736, in __array__
    " array.".format(self.name))
NotImplementedError: Cannot convert a symbolic Tensor (detector/yolo-v4-tiny/meshgrid/Size_1:0) to a numpy array.

I'm running on Google Colab. Here is a gist of the notebook output.

tiny yolo and yolo not performing well on OAK-D stereocamera

I have custom trained a yolov7 and a yolov4-tiny model for detecting objects. I have made sure to follow all the recommended steps given in the DepthAI documentation regarding conversion of the model to blob format. Despite training well, my model makes very few detections when running on the stereocamera (most are wrong). However the model works fine when used with a simple webcam on the same live feed. This led me to believe that the problem must be with the conversion. I converted the yolov4-tiny model from .weights to .onnx using tensorrt_demos library, and later converted onnx to a 6 shave blob using https://blobconverter.luxonis.com/. For yolov7, I directly converted the weights from .pt to a 6 shave blob using https://tools.luxonis.com/. Has anyone else faced similar issues when training custom models? It would be great if anyone had any sort of insight.

tensorflow_version 2.x?

I can't seem to run TensorFlow 1.x with Google Colab, so I changed the code to tensorflow_version 2.x. Now I'm getting a "'contrib_training' is not defined" error when trying to train.

Could someone please help me with this issue?

converting ssd mobilenet v2 to tflite fails

Hi,

I trained the model above on a custom dataset. I then tried to use the Google API to convert it to tflite, but it fails.

ValueError: ssd_mobilenet_v2 is not supported. See `model_builder.py` for features extractors compatible with different versions of Tensorflow

Can someone please help me?

Notebook running on windows machine

Do you have any notebook for detection with custom data that works on Windows?

Also, some of the code in those notebooks requires the object_detection library, which works with numpy=1.18, while depthai requires numpy=1.21.

Thanks!

test

DeepLabV3plus_MNV2.ipynb - Error on step "Fixing XML"

Hi,
In the second to last step, the script to fix the XML file outputs an error due to the layer name change. Below is the change I used to resolve the error.

WAS:
data = root.find('.//layer[@name="strided_slice_10/extend_end_const1245431561"]/data')
IS:
data = root.find('.//layer[@name="strided_slice_10/extend_end_const1245431174"]/data')
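Since the numeric suffix of the layer name seems to change between runs, matching on the stable prefix may be more robust than hard-coding the full name. A minimal sketch with the standard library, assuming the prefix `strided_slice_10/extend_end_const` stays constant; the helper name `find_layer_data` and the sample XML are illustrative.

```python
import xml.etree.ElementTree as ET

# Stable part of the layer name; the trailing digits vary between conversions.
PREFIX = "strided_slice_10/extend_end_const"


def find_layer_data(root, prefix=PREFIX):
    """Return the <data> child of the first layer whose name starts with `prefix`."""
    for layer in root.iter("layer"):
        if layer.get("name", "").startswith(prefix):
            return layer.find("data")
    return None


# A tiny stand-in for the IR .xml file (hypothetical attributes).
sample = ET.fromstring(
    "<net><layers>"
    '<layer id="0" name="input"><data/></layer>'
    '<layer id="1" name="strided_slice_10/extend_end_const1245431174">'
    '<data element_type="i64"/></layer>'
    "</layers></net>"
)

data = find_layer_data(sample)
print(data.attrib)
```

If several layers share the prefix, this returns the first match, so it is worth checking that the match is the intended layer.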

Mike E

yolov7 custom tiny model: X_LINK_ERROR | side values? | poor detection with OAK-D

Hello,
I have converted a custom trained yolov7 tiny model (13 classes and mAP=0.75, 1024x1024) into a blob with http://tools.luxonis.com/
I have used the blob following the instruction of YoloV7_training.ipynb notebook from [depthai-ml-training] repo with a OAK-D camera (connected to USB3 with/or without additional power supply)
I have 2 issues:

1) After a few seconds I get an error. Why?
Traceback (most recent call last):
File "main.py", line 51, in
pv.prepareFrames()
File "/home/mz/Projects/ObjectDetection/depthai/depthai_sdk/src/depthai_sdk/managers/preview_manager.py", line 148, in prepareFrames
packet = queue.tryGet()
RuntimeError: Communication exception - possible device error/misconfiguration. Original message 'Couldn't read data from stream: 'color' (X_LINK_ERROR)'

2) Before the error I see very few detections, and they are poor, even though the trained model gave very good results on the test set of static images (mAP~0.75). Why? Please see my observation below: could that be the reason?

Obs: The JSON note in https://github.com/luxonis/depthai-experiments/tree/master/gen2-yolo/device-decoding is not clear to me:
I have not changed the JSON file of my model, which is 1024x1024, because I do not understand where I have to change the "side" entries: I have no side32 or side16, but many others, all concerning the anchor masks. My JSON file from the blob conversion is attached: should I change something? How and where exactly? best.zip

Note: Values must match the values set in the CFG during training. If you use a different input width, you should also change side32 to sideX and side16 to sideY, where X = width/16 and Y = width/32. If you are using a non-tiny model, those values are width/8, width/16, and width/32.

Thank you in advance
Marco
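
The arithmetic in the quoted note can be sketched as follows. This only reproduces the division described there for a 1024-pixel-wide input. The function name `side_keys` is hypothetical, and which set of divisors applies to a given model depends on how many output layers its JSON lists, which is not verified here:

```python
def side_keys(input_width, strides):
    """Map each stride to its 'sideN' key, where N = input_width / stride."""
    keys = {}
    for stride in strides:
        if input_width % stride != 0:
            raise ValueError(f"width {input_width} is not divisible by stride {stride}")
        keys[stride] = f"side{input_width // stride}"
    return keys

# Two-output case from the note: width/16 and width/32.
two_outputs = side_keys(1024, (16, 32))
# Three-output case from the note: width/8, width/16 and width/32.
three_outputs = side_keys(1024, (8, 16, 32))

print(two_outputs)
print(three_outputs)
```

For a 1024-wide input the three-output case gives side128, side64 and side32, which can be compared against the anchor mask keys in the JSON produced by the conversion tool.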

converting deeplabv3 graph

Hi, I'd like some help converting the deeplabv3 series of models.
I was trying to use the blob converter tool.
I'm trying to convert http://download.tensorflow.org/models/deeplabv3_mnv2_pascal_trainval_2018_01_29.tar.gz, but am getting errors.

I'm using the following mo params:

--data_type=FP16 --mean_values=[127.5,127.5,127.5] --scale_values=[255,255,255] --input_shape=[1,513,513,3]

with the default myriad compile params -ip U8

But get the following error:

Check 'element::Type::merge(element_type, element_type, node->get_input_element_type(i))' failed at core/src/op/util/elementwise_args.cpp:18:
While validating node 'v1::Multiply ImageTensor/scale/FusedMul (ImageTensor/Transpose([0 3 1 2])[0]:f16{1,3,513,513}, data_mul_1734830178[0]:u8{1,3,1,1}) -> (u8{1,3,513,513})' with friendly_name 'ImageTensor/scale/FusedMul':
Argument element types are inconsistent.
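
For reference, Model Optimizer subtracts the mean and then divides by the scale for each channel. A minimal sketch of that arithmetic for 8-bit pixel values shows the range the parameters above produce, with a scale of 127.5 alongside for comparison. The helper `preprocess` is hypothetical, and this is not claimed to be the cause of the type error:

```python
def preprocess(pixel, mean, scale):
    """Apply (pixel - mean) / scale, the per-channel mean/scale normalization."""
    return (pixel - mean) / scale

ranges = {}
for scale in (255.0, 127.5):
    low = preprocess(0, 127.5, scale)     # darkest 8-bit value
    high = preprocess(255, 127.5, scale)  # brightest 8-bit value
    ranges[scale] = (low, high)
    print(f"mean=127.5 scale={scale}: [{low}, {high}]")
```

With a scale of 255 the input lands in [-0.5, 0.5], while 127.5 gives [-1, 1], so it is worth confirming which range the model was trained with.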

Specific branch of models

Hi,

The branch of the models git repo that is used is not the main one. I tried calling a Python script that converts the checkpoints to TFLite format, but it doesn't exist in this branch.

What's the reason for choosing this branch? Is it just the only one that worked for you?

Thanks

loc("Power_3939"): error: SCALARS are not supported

Hi, I'm trying to compile an IR model which was converted from ONNX using openvino 2022.1 and I'm getting a compilation error. The device is running LuxonisOS 1.5. The model is FP16 and I have not run it through the POT. Model binary is below.

model onnx: https://github.com/commaai/openpilot/raw/1e49c54ffb274a7987626ebfaa8eb3e75ac6fe7c/selfdrive/modeld/models/supercombo.onnx

root@keembay:~# /opt/openvino/tools/compile_tool/compile_tool -d VPUX.3400 -m supercombo.xml -ip FP16 -ov_api_1_0
OpenVINO Runtime version ......... 2022.1.0
Build ........... 2022.1.0-7080-6582ec65d78-releases/2022/1
Network inputs:
    big_input_imgs : FP16 / NCHW
    desire : FP16 / CHW
    features_buffer : FP16 / CHW
    input_imgs : FP16 / NCHW
    traffic_convention : FP16 / NC
Network outputs:
    outputs : FP32 / NC

Callback signal handler installed

Opening XLink Device File
loc("Power_3939"): error: SCALARS are not supported
Compilation failed

DeepLabv3plus+Mobilenetv2-HighLevelDemo on semantic segmentation doesn't work

Hi,

Running the last step of the Google Colab example for semantic segmentation gives me the following error:

[setupvars.sh] OpenVINO environment initialized
Model Optimizer arguments:
Common parameters:
- Path to the Input Model: /content/l_openvino_toolkit_p_2020.1.023/models/research/deeplab/datasets/pascal_voc_seg/exp/train_on_trainval_set_mobilenetv2/export/frozen_inference_graph.pb
- Path for generated IR: /content/
- IR output name: deeplab_v3_plus_mnv2_decoder_256.pb
- Log level: ERROR
- Batch: Not specified, inherited from the model
- Input layers: Not specified, inherited from the model
- Output layers: Not specified, inherited from the model
- Input shapes: [1,256,256,3]
- Mean values: [127.5, 127.5, 127.5]
- Scale values: [127.5, 127.5, 127.5]
- Scale factor: Not specified
- Precision of IR: FP16
- Enable fusing: True
- Enable grouped convolutions fusing: True
- Move mean values to preprocess section: False
- Reverse input channels: True
TensorFlow specific parameters:
- Input model in text protobuf format: False
- Path to model dump for TensorBoard: None
- List of shared libraries with TensorFlow custom layers implementation: None
- Update the configuration file with input/output node names: None
- Use configuration file used to generate the model with Object Detection API: None
- Operations to offload: None
- Patterns to offload: None
- Use the config file: None
Model Optimizer version: 2020.1.0-61-gd349c3ba4a
[ ERROR ] Exception occurred during running replacer "REPLACEMENT_ID" (<class 'extensions.back.ConvolutionNormalizer.PullReshapeThroughFQ'>): After partial shape inference were found shape collision for node strided_slice_10/Cast_2 (old shape: [3], new shape: [4])
[ ]

I was promised a semantic segmentation example with depth perception when I backed the OAK-D, but now I can't find even one working example of any kind of semantic segmentation model. :(
