
tfodcourse's Introduction

Tensorflow Object Detection Walkthrough

These notebooks provide the complete code needed to train and use your own custom object detection model with the Tensorflow Object Detection API. They accompany the Tensorflow Object Detection course on my YouTube channel.

Steps


Step 1. Clone this repository: https://github.com/nicknochnack/TFODCourse

Step 2. Create a new virtual environment
python -m venv tfod

Step 3. Activate your virtual environment
source tfod/bin/activate # Linux
.\tfod\Scripts\activate # Windows 

Step 4. Install dependencies and add virtual environment to the Python Kernel
python -m pip install --upgrade pip
pip install ipykernel
python -m ipykernel install --user --name=tfod

Step 5. Collect images using the notebook 1. Image Collection.ipynb. Make sure you change the notebook kernel to the virtual environment you registered in Step 4.
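
If you are unsure whether the notebook is really using the virtual environment, a quick check like the one below can be run in the first cell. This is only a minimal sketch: the helper name in_virtualenv is made up for illustration, and it assumes a standard venv as created in Step 2.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("interpreter:", sys.executable)
print("inside a virtual environment:", in_virtualenv())
```

If the printed interpreter path does not sit inside your tfod folder, the kernel is probably still the default one.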


Step 6. Manually divide the collected images into two folders, train and test. All images and their annotations should now be split between the following two folders.
\TFODCourse\Tensorflow\workspace\images\train
\TFODCourse\Tensorflow\workspace\images\test
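
The split in Step 6 can also be scripted. The following is a rough sketch rather than part of the course code: the function name split_images, the 20% test fraction and the fixed seed are all assumptions, and it assumes each image has an annotation file with the same name and an .xml extension next to it. The source folder should not contain the train and test folders themselves.

```python
import random
import shutil
from pathlib import Path


def split_images(source_dir, train_dir, test_dir, test_fraction=0.2, seed=42):
    """Copy each image and its same-named .xml annotation into train/ or test/."""
    source, train, test = Path(source_dir), Path(train_dir), Path(test_dir)
    train.mkdir(parents=True, exist_ok=True)
    test.mkdir(parents=True, exist_ok=True)

    # Search subfolders as well, so one-folder-per-label layouts are covered.
    images = sorted(
        p for p in source.rglob("*")
        if p.suffix.lower() in {".jpg", ".jpeg", ".png"}
    )
    random.Random(seed).shuffle(images)  # fixed seed keeps the split reproducible

    n_test = int(len(images) * test_fraction)
    counts = {"train": 0, "test": 0}
    for index, image in enumerate(images):
        name, target = ("test", test) if index < n_test else ("train", train)
        shutil.copy2(image, target / image.name)
        annotation = image.with_suffix(".xml")
        if annotation.exists():  # keep the annotation next to its image
            shutil.copy2(annotation, target / annotation.name)
        counts[name] += 1
    return counts
```

The function copies rather than moves, so the collected images stay untouched if the split needs to be redone with a different fraction.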

Step 7. Begin the training process by opening 2. Training and Detection.ipynb. This notebook will walk you through installing Tensorflow Object Detection, making detections, and saving and exporting your model.

Step 8. During this process the notebook will install Tensorflow Object Detection. At this step you should ideally receive a notification indicating that the API has installed successfully, with the last line stating OK.

If not, resolve installation errors by referring to the Error Guide.md in this folder.

Step 9. Once you get to step 6. Train the model inside the notebook, you may choose to train the model from within the notebook. I have noticed, however, that if you train in a separate terminal on a Windows machine, you are able to see live loss metrics.


Step 10. You can optionally evaluate your model in Tensorboard. Once the model has been trained and you have run the evaluation command under Step 7, navigate to the evaluation folder for your trained model, e.g.
 cd Tensorflow/workspace/models/my_ssd_mobnet/eval
and open Tensorboard with the following command:
tensorboard --logdir=.
Tensorboard will be accessible through your browser, and you will be able to see metrics including mAP (mean Average Precision) and Recall.

tfodcourse's People

Contributors

nicknochnack


tfodcourse's Issues

Verification Script cannot find cusolver64_11.dll or cusparse64.dll (CUDA files) even though they are present

Verification script cannot find the following files even though they are intact and located in the same CUDA bin directory as other files which are found successfully.

2021-06-02 14:16:50.299737: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusolver64_11.dll'; dlerror: cusolver64_11.dll not found
2021-06-02 14:17:15.380081: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusparse64_11.dll'; dlerror: cusparse64_11.dll not found

I reinstalled CUDA v11.3 in the hopes that that would fix the problem, but it did not. Same error still occurs.

Other steps work fine, such as:
2021-06-02 14:15:46.449806: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:02:00.0 name: NVIDIA GeForce MX250 computeCapability: 6.1
coreClock: 1.582GHz coreCount: 3 deviceMemorySize: 4.00GiB deviceMemoryBandwidth: 44.76GiB/s
2021-06-02 14:15:46.471334: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudart64_110.dll

but the verification fails as indicated by:
2021-06-02 14:17:47.953824: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at cwise_ops_common.h:128 : Resource exhausted: OOM when allocating tensor with shape[1,1,1152,320] and type float on /job:localhost/replica:0/task:0/device:CPU:0 by allocator cpu
INFO:tensorflow:time(main.ModelBuilderTF2Test.test_create_ssd_models_from_config): 4.73s
I0602 14:17:48.105210 21272 test_util.py:2102] time(main.ModelBuilderTF2Test.test_create_ssd_models_from_config): 4.73s
[ FAILED ] ModelBuilderTF2Test.test_create_ssd_models_from_config


Ran 24 tests in 125.189s
FAILED (errors=1, skipped=1)

I assume these failures are because the dll files cannot be found.

Conversion to ONNX - Request for advice

Hi Nick,

I am trying to convert the saved model to ONNX
On inspection of the saved model in the Netron app (https://netron.app/) I am struggling to identify the required inputs and outputs for conversion.

Looking at exporter_main_v2.py and the Freeze script, the input tensor is listed as --input_type=image_tensor, though that is not the label I see in Netron: "serving_default_input_tensor".

I am unable to see any outputs from the saved_model.pb, although I have read it within another Jupyter notebook and used it.

Attached is the image I see from Netron for the saved model.

If you have any advice on converting the model to an ONNX format that could be used in Visual Studio I would appreciate it.

Netron_SavedModel

SOLVED :: undefined symbol: _ZSt28__throw_bad_array_new_lengthv

Hi Nick,

first of all THANK YOU for this tutorial. I hope you manage to keep on doing them. I enjoyed it a lot! :)

But I ran into a couple of issues that are specific to my setup. I wanted to post each one of them here, along with the solution I found, in case someone else runs into them. Maybe there is a candidate for your Error Guide.md among them.

The first ERROR Message that I received while importing visualization_utils was:

from object_detection.utils import visualization_utils as viz_utils
from object_detection.builders import model_builder
ImportError: /opt/Python/TFODCourse/tfod/lib/python3.9/site-packages/matplotlib/ft2font.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZSt28__throw_bad_array_new_lengthv

This could be solved by upgrading Matplotlib (inside the virtual environment) from 3.2.0 to version 3.2.2. I tried a couple of newer and older versions, but all of them led to other issues I did not find solutions for:

pip uninstall -y matplotlib
pip install matplotlib==3.2.2

Maximum box coordinate value is larger than 1.100000

SETTING
I'm running a Conda virtual env with Python 3.8.6 and Tensorflow 2.6.0 on a M1 Mac mini.
Model I picked: SSD MobileNet V2 FPNLite 320x320 (the one from the tutorial)
Code is almost identical to the one from the tutorial, except for the images (I'm using my own pre-existing .TIF files, converting them to .PNG right before the labelling process)

ISSUE
First training went fine with no errors, but when I run the evaluation script I get this error

Summarized data: b'maximum box coordinate value is larger than 1.100000: ' X<<

(with X ranging from 3 to thousands)

If I manage to get the script to run (by randomly re-running the command), I get impossible results: AP and AR are either 0.000 or -1.000

WHAT I'VE ALREADY CHECKED
The script I'm using to generate TF records is the one from the tutorial (https://github.com/nicknochnack/GenerateTFRecord/blob/main/generate_tfrecord.py)
and it should normalize boxes' xmin/xmax and ymin/ymax (I guess)
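
For anyone checking the same thing, this is roughly what normalization means here, written as a small sketch. It is not taken from generate_tfrecord.py: both function names are made up, and the 1.1 limit is simply the number quoted in the error message.

```python
def normalize_box(xmin, ymin, xmax, ymax, width, height):
    """Scale pixel coordinates into the [0, 1] range by the image size."""
    return (xmin / width, ymin / height, xmax / width, ymax / height)


def box_is_plausible(box, limit=1.1):
    """Check a normalized box against the 1.1 limit quoted in the error message."""
    xmin, ymin, xmax, ymax = box
    return 0.0 <= xmin <= xmax <= limit and 0.0 <= ymin <= ymax <= limit


good = normalize_box(64, 48, 320, 240, width=640, height=480)
# Hypothetical mismatch: the recorded size (64x48) is smaller than the real image.
bad = normalize_box(64, 48, 320, 240, width=64, height=48)
print(good, box_is_plausible(good))
print(bad, box_is_plausible(bad))
```

Values far above 1 after this step would suggest that the width and height stored with the annotation do not match the image the box was drawn on, which is one thing that may be worth checking after a format conversion.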

I also checked the sanity of the TF record by opening it with a viewer found here: https://github.com/sulc/tfrecord-viewer
Results can be seen here:
https://i.stack.imgur.com/7oWXL.jpg

Boxes are perfect in position and dimensions for both "training" and "testing" TF records, but, as I see from this viewer, they are entirely covered by the label name ("suspension").

QUESTION
Could this be the cause of the issue?
And if so, how could I make these "suspension tags" smaller/proportional to the small boxes?
More in general: what am I doing wrong?

Update the error guide

Error: ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
Solution: Reinstall pycocotools

For this error, the solution did not work for me. To get my code working I had to use the --no-cache-dir flag when I installed pycocotools. It would probably be worth updating the solution to this error to include this flag to prevent confusion in the future.

module 'tensorflow' has no attribute 'gfile'

Hi, I am in notebook 2, at part 5. Update Config For Transfer Learning.

However, the error module 'tensorflow' has no attribute 'gfile' shows up when I run the following line:

config = config_util.get_configs_from_pipeline_file(files['PIPELINE_CONFIG'])

Does anyone have a clue? I understand this comes from a change in Tensorflow, but switching to Tensorflow v1 does not solve it, because the problem is in the config_util module of the object detection package.

Colab VERIFICATION_SCRIPT ModuleNotFoundError: No module named 'tensorflow.python.keras.applications'

I am trying to run the code directly in Colab after uploading it from GitHub.
I am unable to get past the VERIFICATION_SCRIPT.

It reports the following error:

Traceback (most recent call last):
  File "Tensorflow/models/research/object_detection/builders/model_builder_tf2_test.py", line 25, in <module>
    from object_detection.builders import model_builder
  File "/usr/local/lib/python3.7/dist-packages/object_detection/builders/model_builder.py", line 37, in <module>
    from object_detection.meta_architectures import deepmac_meta_arch
  File "/usr/local/lib/python3.7/dist-packages/object_detection/meta_architectures/deepmac_meta_arch.py", line 19, in <module>
    from object_detection.models.keras_models import resnet_v1
  File "/usr/local/lib/python3.7/dist-packages/object_detection/models/keras_models/resnet_v1.py", line 22, in <module>
    from tensorflow.python.keras.applications import resnet
ModuleNotFoundError: No module named 'tensorflow.python.keras.applications'

Using the latest Colab versions:
tensorflow 2.8.0
tensorflow-io 0.23.1
keras 2.8.0
Keras-Preprocessing 1.1.2

I have tried using older versions ranging from 2.3.0 but these just lead to different errors

Please advise if you have a solution to this.

Capture Images, Linux (Ubuntu)

When I run Capture Images, after "Collecting images for thumbsdown" is shown, my camera freezes and this error occurs:

OpenCV(4.5.2) /tmp/pip-req-build-13uokl4r/opencv/modules/imgcodecs/src/loadsave.cpp:721: error: (-215:Assertion failed) !_img.empty() in function 'imwrite'
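
The assertion !_img.empty() means that imwrite was handed an empty image, so the capture most likely returned no frame. One way to guard against that is sketched below. The helper read_valid_frame, the retry count and the delay are assumptions made for illustration; it only relies on the reader returning an (ok, frame) pair, as cap.read() does.

```python
import time


def read_valid_frame(read, retries=5, delay=0.1):
    """Call read() until it reports success and returns a non-empty frame."""
    for _ in range(retries):
        ok, frame = read()
        if ok and frame is not None:
            return frame
        time.sleep(delay)  # give the camera a moment before retrying
    raise RuntimeError("camera returned no frame; nothing to write")
```

In the capture loop this would be called as frame = read_valid_frame(cap.read) before the frame is passed to cv2.imwrite, so a dead camera produces a clear error instead of the OpenCV assertion.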


Mask R-CNN

Does anybody know why I'm getting this error when I try to use Mask R-CNN?

ParseError Traceback (most recent call last)
/tmp/ipykernel_33/202684143.py in
----> 1 config = config_util.get_configs_from_pipeline_file(files['PIPELINE_CONFIG'])

/opt/conda/lib/python3.7/site-packages/object_detection/utils/config_util.py in get_configs_from_pipeline_file(pipeline_config_path, config_override)
137 with tf.gfile.GFile(pipeline_config_path, "r") as f:
138 proto_str = f.read()
--> 139 text_format.Merge(proto_str, pipeline_config)
140 if config_override:
141 text_format.Merge(config_override, pipeline_config)

/opt/conda/lib/python3.7/site-packages/google/protobuf/text_format.py in Merge(text, message, allow_unknown_extension, allow_field_number, descriptor_pool, allow_unknown_field)
723 allow_field_number,
724 descriptor_pool=descriptor_pool,
--> 725 allow_unknown_field=allow_unknown_field)
726
727

/opt/conda/lib/python3.7/site-packages/google/protobuf/text_format.py in MergeLines(lines, message, allow_unknown_extension, allow_field_number, descriptor_pool, allow_unknown_field)
791 descriptor_pool=descriptor_pool,
792 allow_unknown_field=allow_unknown_field)
--> 793 return parser.MergeLines(lines, message)
794
795

/opt/conda/lib/python3.7/site-packages/google/protobuf/text_format.py in MergeLines(self, lines, message)
816 """Merges a text representation of a protocol message into a message."""
817 self._allow_multiple_scalars = True
--> 818 self._ParseOrMerge(lines, message)
819 return message
820

/opt/conda/lib/python3.7/site-packages/google/protobuf/text_format.py in _ParseOrMerge(self, lines, message)
835 tokenizer = Tokenizer(str_lines)
836 while not tokenizer.AtEnd():
--> 837 self._MergeField(tokenizer, message)
838
839 def _MergeField(self, tokenizer, message):

/opt/conda/lib/python3.7/site-packages/google/protobuf/text_format.py in _MergeField(self, tokenizer, message)
965
966 else:
--> 967 merger(tokenizer, message, field)
968
969 else: # Proto field is unknown.

/opt/conda/lib/python3.7/site-packages/google/protobuf/text_format.py in _MergeMessageField(self, tokenizer, message, field)
1040 if tokenizer.AtEnd():
1041 raise tokenizer.ParseErrorPreviousToken('Expected "%s".' % (end_token,))
-> 1042 self._MergeField(tokenizer, sub_message)
1043
1044 if is_map_entry:

/opt/conda/lib/python3.7/site-packages/google/protobuf/text_format.py in _MergeField(self, tokenizer, message)
965
966 else:
--> 967 merger(tokenizer, message, field)
968
969 else: # Proto field is unknown.

/opt/conda/lib/python3.7/site-packages/google/protobuf/text_format.py in _MergeMessageField(self, tokenizer, message, field)
1040 if tokenizer.AtEnd():
1041 raise tokenizer.ParseErrorPreviousToken('Expected "%s".' % (end_token,))
-> 1042 self._MergeField(tokenizer, sub_message)
1043
1044 if is_map_entry:

/opt/conda/lib/python3.7/site-packages/google/protobuf/text_format.py in _MergeField(self, tokenizer, message)
932 raise tokenizer.ParseErrorPreviousToken(
933 'Message type "%s" has no field named "%s".' %
--> 934 (message_descriptor.full_name, name))
935
936 if field:

ParseError: 153:40 : Message type "object_detection.protos.TFRecordInputReader" has no field named "s".

ValueError: 'images' must have either 3 or 4 dimensions.

Hi. I have been following the tutorial, but I am unable to run this code:

IMAGE_PATH = os.path.join(paths['IMAGE_PATH'],'test','gnf_dna-18.png')
img = cv2.imread(IMAGE_PATH)
image_np = np.array(img)

input_tensor = tf.convert_to_tensor(np.expand_dims(image_np, 0), dtype=tf.float32)
detections = detect_fn(input_tensor)

num_detections = int(detections.pop('num_detections'))
detections = {key: value[0, :num_detections].numpy()
              for key, value in detections.items()}
detections['num_detections'] = num_detections

# detection_classes should be ints.
detections['detection_classes'] = detections['detection_classes'].astype(np.int64)

label_id_offset = 1
image_np_with_detections = image_np.copy()

viz_utils.visualize_boxes_and_labels_on_image_array(
            image_np_with_detections,
            detections['detection_boxes'],
            detections['detection_classes']+label_id_offset,
            detections['detection_scores'],
            category_index,
            use_normalized_coordinates=True,
            max_boxes_to_draw=5,
            min_score_thresh=.8,
            agnostic_mode=False)

plt.imshow(cv2.cvtColor(image_np_with_detections, cv2.COLOR_BGR2RGB))
plt.show()

SOLVED :: If you are on Ubuntu or Debian, install libgtk2.0-dev and pkg-config

When I try to use the OD on my live video the process crashes with the following error message:

`(-2:Unspecified error) Rebuild the library with Windows, GTK+ 2.x or Carbon support. If you are on Ubuntu or Debian, install libgtk2.0-dev and pkg-config, then re-run cmake or configure script in function 'cvShowImage'.`

This is a problem with OpenCV and missing libraries. I found that you can install the following package using pip inside your virtual environment:

pip install opencv-contrib-python

And the second step was to globally install the following packages using your distro's package manager - in my case this is Arch Linux & Pacman:

sudo pacman -Syu gtk4 pkg-config

I am not 100% sure that BOTH steps are necessary - but afterwards everything worked as expected.

Issue while creating `tflite` model

When I create the tflite file using the command-line method from the notebook, the file is created but it is only 2 KB in size. When I visualise it in Netron, the model is empty.

Error: maximum box coordinate value is too large

I ran into this error. Training on my data was successful, but when I try to evaluate the model I get: tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected 'tf.Tensor(False, shape=(), dtype=bool)' to be true. Summarized data: b'maximum box coordinate value is larger than 1.100000: '1.1818783

I tried changing the box_list_ops.py as detailed here: #1754 but that has not fixed it.

Log can be found here: https://gist.github.com/dqthang1323/667ca62d52a66c914781022c4a145f3a

Thanks

I get the following error during training using python tensorflow\models\research\object_detection\model_main_tf2.py --model_dir=tensorflow\workspace\models\my_ssd_mobilenet --pipeline_config_path = tensorflow\workspace\models\my_ssd_mobilenet\pipeline.config --num_train_steps=2000

2022-02-28 13:00:27.096384: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-02-28 13:00:27.642485: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 3991 MB memory: -> device: 0, name: NVIDIA GeForce GTX 1660 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
I0228 13:00:27.767138 1196 mirrored_strategy.py:374] Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
Traceback (most recent call last):
  File "E:\MACHINELEARNING\TENSORFLOW_P\OBJD\tensorflow\models\research\object_detection\model_main_tf2.py", line 115, in <module>
    tf.compat.v1.app.run()
  File "E:\MACHINELEARNING\TENSORFLOW_P\tensorflow\lib\site-packages\tensorflow\python\platform\app.py", line 36, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "E:\MACHINELEARNING\TENSORFLOW_P\tensorflow\lib\site-packages\absl\app.py", line 312, in run
    _run_main(main, args)
  File "E:\MACHINELEARNING\TENSORFLOW_P\tensorflow\lib\site-packages\absl\app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "E:\MACHINELEARNING\TENSORFLOW_P\OBJD\tensorflow\models\research\object_detection\model_main_tf2.py", line 106, in main
    model_lib_v2.train_loop(
  File "E:\MACHINELEARNING\TENSORFLOW_P\tensorflow\lib\site-packages\object_detection-0.1-py3.9.egg\object_detection\model_lib_v2.py", line 504, in train_loop
    configs = get_configs_from_pipeline_file(
  File "E:\MACHINELEARNING\TENSORFLOW_P\tensorflow\lib\site-packages\object_detection-0.1-py3.9.egg\object_detection\utils\config_util.py", line 138, in get_configs_from_pipeline_file
    proto_str = f.read()
  File "E:\MACHINELEARNING\TENSORFLOW_P\tensorflow\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 114, in read
    self._preread_check()
  File "E:\MACHINELEARNING\TENSORFLOW_P\tensorflow\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 76, in _preread_check
    self._read_buf = _pywrap_file_io.BufferedInputStream(
tensorflow.python.framework.errors_impl.NotFoundError: NewRandomAccessFile failed to Create/Open: = : The system cannot find the file specified. ; No such file or directory
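
The file name reported in the last line of the error is a bare "=", which suggests that the spaces around = in --pipeline_config_path = ... split the flag from its value. The sketch below reproduces that effect with Python's argparse. This is an assumption for illustration only: model_main_tf2.py defines its flags with absl, not argparse, and the shortened paths are made up.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--model_dir")
parser.add_argument("--pipeline_config_path")
parser.add_argument("--num_train_steps", type=int)


def parse(command_tail):
    """Split on whitespace as a shell would and parse the resulting tokens."""
    return parser.parse_known_args(command_tail.split())


# With spaces around "=", the flag's value becomes the "=" token itself.
spaced, leftover = parse(
    r"--model_dir=models\my_ssd --pipeline_config_path = models\my_ssd\pipeline.config"
)
print(repr(spaced.pipeline_config_path), leftover)

# Without spaces, the flag receives the intended path.
joined, _ = parse(
    r"--model_dir=models\my_ssd --pipeline_config_path=models\my_ssd\pipeline.config"
)
print(repr(joined.pipeline_config_path))
```

If the same thing is happening here, writing the flag as --pipeline_config_path=tensorflow\workspace\models\my_ssd_mobilenet\pipeline.config without any spaces would be the first thing to try.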

error shape=(1, 480, 640, 3), dtype=float32)) input_signature: ( TensorSpec(shape=(1, None, None, 3), dtype=tf.uint8, name=None))

Hello, when running the live detection using a webcam I get the following error:

Python inputs incompatible with input_signature:
inputs: (
tf.Tensor(
[[[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]
...
[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]

[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]
...
[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]

[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]
...
[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]

...

[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]
...
[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]

[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]
...
[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]

[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]
...
[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]]], shape=(1, 480, 640, 3), dtype=float32))
input_signature: (
TensorSpec(shape=(1, None, None, 3), dtype=tf.uint8, name=None))

code:
cap = cv2.VideoCapture(0)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

while cap.isOpened():
    ret, frame = cap.read()
    image_np = np.array(frame)

    input_tensor = tf.convert_to_tensor(np.expand_dims(image_np, 0), dtype=tf.float32)
    detections = detect_fn(input_tensor)

    num_detections = int(detections.pop('num_detections'))
    detections = {key: value[0, :num_detections].numpy()
                  for key, value in detections.items()}
    detections['num_detections'] = num_detections

    # detection_classes should be ints.
    detections['detection_classes'] = detections['detection_classes'].astype(np.int64)

    label_id_offset = 1
    image_np_with_detections = image_np.copy()

    viz_utils.visualize_boxes_and_labels_on_image_array(
                image_np_with_detections,
                detections['detection_boxes'],
                detections['detection_classes']+label_id_offset,
                detections['detection_scores'],
                category_index,
                use_normalized_coordinates=True,
                max_boxes_to_draw=5,
                min_score_thresh=.8,
                agnostic_mode=False)

    cv2.imshow('object detection',  cv2.resize(image_np_with_detections, (800, 600)))

    if cv2.waitKey(10) & 0xFF == ord('q'):
        cap.release()
        cv2.destroyAllWindows()
        break
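
To read the error message above: the input has shape (1, 480, 640, 3) and dtype float32, while the signature asks for shape (1, None, None, 3) and dtype uint8. The plain-Python sketch below mimics that comparison to show which part disagrees. It is not TensorFlow's own check, and the function name check_against_signature is made up.

```python
def check_against_signature(shape, dtype, spec_shape, spec_dtype):
    """Return a list of mismatches; None in spec_shape accepts any size."""
    problems = []
    if len(shape) != len(spec_shape):
        problems.append(f"rank {len(shape)} != {len(spec_shape)}")
    else:
        for axis, (got, want) in enumerate(zip(shape, spec_shape)):
            if want is not None and got != want:
                problems.append(f"axis {axis}: {got} != {want}")
    if dtype != spec_dtype:
        problems.append(f"dtype {dtype} != {spec_dtype}")
    return problems


# Values copied from the error message: only the dtype disagrees.
print(check_against_signature((1, 480, 640, 3), "float32", (1, None, None, 3), "uint8"))
# prints ['dtype float32 != uint8']
print(check_against_signature((1, 480, 640, 3), "uint8", (1, None, None, 3), "uint8"))
# prints []
```

The shape is accepted, so the dtype is what stands out. Since the code above builds the tensor with dtype=tf.float32, passing dtype=tf.uint8 there may be worth trying when the loaded model expects uint8 input.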

Tensorflow install error

I get the following error when I try to pip install tensorflow:
ERROR: Could not find a version that satisfies the requirement tensorflow (from versions: none)
ERROR: No matching distribution found for tensorflow

Tensorflow object detection installation error in Ubuntu 20.04

The following code in the 2. Training and Detection.ipynb file is giving the error shown at the bottom:

if os.name=='posix':  
    !sudo -S apt-get install protobuf-compiler -y  < password.txt 
    #!sudo apt-get install protobuf-compiler
    %cd Tensorflow/models/research && protoc object_detection/protos/*.proto --python_out=. && cp object_detection/packages/tf2/setup.py . && python -m pip install . 
    
if os.name=='nt':
    url="https://github.com/protocolbuffers/protobuf/releases/download/v3.15.6/protoc-3.15.6-win64.zip"
    wget.download(url)
    !move protoc-3.15.6-win64.zip {paths['PROTOC_PATH']}
    !cd {paths['PROTOC_PATH']} && tar -xf protoc-3.15.6-win64.zip
    os.environ['PATH'] += os.pathsep + os.path.abspath(os.path.join(paths['PROTOC_PATH'], 'bin'))   
    !cd Tensorflow/models/research && protoc object_detection/protos/*.proto --python_out=. && copy object_detection\\packages\\tf2\\setup.py setup.py && python setup.py build && python setup.py install
    !cd Tensorflow/models/research/slim && pip install -e .
    



[Errno 2] No such file or directory: 'Tensorflow/models/research && protoc object_detection/protos/*.proto --python_out=. && cp object_detection/packages/tf2/setup.py . && python -m pip install .'
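
The error text shows that %cd received the entire rest of the line, including the && chain, as the directory name. One way around this, sketched below, is to run the chained command through a shell with an explicit working directory. The helper name run_in_dir is an assumption and not part of the notebook. Another option would be to use !cd ... && ... as the Windows branch does.

```python
import subprocess


def run_in_dir(command, directory):
    """Run a shell command line with `directory` as its working directory."""
    # shell=True lets the shell interpret operators such as && in the command.
    return subprocess.run(
        command, cwd=directory, shell=True, check=True,
        capture_output=True, text=True,
    )
```

For the cell above, the call would look like run_in_dir("protoc object_detection/protos/*.proto --python_out=. && cp object_detection/packages/tf2/setup.py . && python -m pip install .", "Tensorflow/models/research"). With check=True a failing step raises CalledProcessError instead of passing silently.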

    
    

SOLVED :: libstdc++.so.6: version `GLIBCXX_3.4.29' not found

This error is also related to the following imports:

from object_detection.utils import visualization_utils as viz_utils
from object_detection.builders import model_builder
ImportError: /home/myuser/anaconda3/bin/../lib/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /opt/Python/TFODCourse/tfod/lib/python3.9/site-packages/matplotlib/_path.cpython-39-x86_64-linux-gnu.so)

I have Manjaro (Arch) Linux installed and am running the latest version of Anaconda3. The latter uses libstdc++.so.6.0.28, while the newest version of Manjaro comes with libstdc++.so.6.0.29. When I check the Anaconda version, I can see that it only goes up to GLIBCXX_3.4.28:

strings /home/myuser/anaconda3/lib/libstdc++.so.6 | grep GLIBCXX

GLIBCXX_3.4
GLIBCXX_3.4.1
GLIBCXX_3.4.2
GLIBCXX_3.4.3
GLIBCXX_3.4.4
GLIBCXX_3.4.5
GLIBCXX_3.4.6
GLIBCXX_3.4.7
GLIBCXX_3.4.8
GLIBCXX_3.4.9
GLIBCXX_3.4.10
GLIBCXX_3.4.11
GLIBCXX_3.4.12
GLIBCXX_3.4.13
GLIBCXX_3.4.14
GLIBCXX_3.4.15
GLIBCXX_3.4.16
GLIBCXX_3.4.17
GLIBCXX_3.4.18
GLIBCXX_3.4.19
GLIBCXX_3.4.20
GLIBCXX_3.4.21
GLIBCXX_3.4.22
GLIBCXX_3.4.23
GLIBCXX_3.4.24
GLIBCXX_3.4.25
GLIBCXX_3.4.26
GLIBCXX_3.4.27
GLIBCXX_3.4.28
GLIBCXX_DEBUG_MESSAGE_LENGTH
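
The same check can be done from Python if that is more convenient. This is just a sketch that scans the library file for the version tags, similar in spirit to grepping the output of strings. The function name glibcxx_versions is made up.

```python
import re
from pathlib import Path


def glibcxx_versions(library_path):
    """Return the GLIBCXX_x.y.z version tags embedded in a shared library."""
    data = Path(library_path).read_bytes()
    tags = set(re.findall(rb"GLIBCXX_(\d+(?:\.\d+)*)", data))
    # Sort numerically so that 3.4.10 comes after 3.4.9.
    return sorted(
        (tag.decode() for tag in tags),
        key=lambda v: tuple(int(part) for part in v.split(".")),
    )
```

With this, "3.4.29" in glibcxx_versions("/home/myuser/anaconda3/lib/libstdc++.so.6") tells you whether the library Anaconda loads provides the version the error message asks for.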

I created a backup of all libstdc++.so* files/symlinks and linked in the system library instead:

mv /home/myuser/anaconda3/lib/libstdc++.so /home/myuser/anaconda3/lib/libstdc++.so.bak
ln -s /usr/lib/libstdc++.so.6.0.29 /home/myuser/anaconda3/lib/libstdc++.so

mv /home/myuser/anaconda3/lib/libstdc++.so.6 /home/myuser/anaconda3/lib/libstdc++.so.6.bak
ln -s /usr/lib/libstdc++.so.6.0.29 /home/myuser/anaconda3/lib/libstdc++.so.6

mv /home/myuser/anaconda3/lib/libstdc++.so.6.0.28 /home/myuser/anaconda3/lib/libstdc++.so.6.0.28.bak
ln -s /usr/lib/libstdc++.so.6.0.29 /home/myuser/anaconda3/lib/libstdc++.so.6.0.28

The result looks like this:

ll /home/myuser/anaconda3/lib | grep libstdc++

lrwxrwxrwx  1 myuser myuser        28 Jan  2 16:17 libstdc++.so -> /usr/lib/libstdc++.so.6.0.29
lrwxrwxrwx  1 myuser myuser        28 Jan  2 16:17 libstdc++.so.6 -> /usr/lib/libstdc++.so.6.0.29
lrwxrwxrwx  1 myuser myuser        28 Jan  2 15:10 libstdc++.so.6.0.28 -> /usr/lib/libstdc++.so.6.0.29
-rwxrwxr-x  3 myuser myuser  13121976 Jun  3  2021 libstdc++.so.6.0.28.bak
lrwxrwxrwx  1 myuser myuser        19 Dec 25 17:01 libstdc++.so.6.bak -> libstdc++.so.6.0.28
lrwxrwxrwx  1 myuser myuser        19 Dec 25 17:01 libstdc++.so.bak -> libstdc++.so.6.0.28

When I recheck the file I now see the necessary reference to GLIBCXX_3.4.29:

strings /home/myuser/anaconda3/lib/libstdc++.so.6 | grep GLIBCXX

GLIBCXX_3.4
GLIBCXX_3.4.1
GLIBCXX_3.4.2
GLIBCXX_3.4.3
GLIBCXX_3.4.4
GLIBCXX_3.4.5
GLIBCXX_3.4.6
GLIBCXX_3.4.7
GLIBCXX_3.4.8
GLIBCXX_3.4.9
GLIBCXX_3.4.10
GLIBCXX_3.4.11
GLIBCXX_3.4.12
GLIBCXX_3.4.13
GLIBCXX_3.4.14
GLIBCXX_3.4.15
GLIBCXX_3.4.16
GLIBCXX_3.4.17
GLIBCXX_3.4.18
GLIBCXX_3.4.19
GLIBCXX_3.4.20
GLIBCXX_3.4.21
GLIBCXX_3.4.22
GLIBCXX_3.4.23
GLIBCXX_3.4.24
GLIBCXX_3.4.25
GLIBCXX_3.4.26
GLIBCXX_3.4.27
GLIBCXX_3.4.28
GLIBCXX_3.4.29
GLIBCXX_DEBUG_MESSAGE_LENGTH

Re-running the import now works as expected:

from object_detection.utils import visualization_utils as viz_utils
from object_detection.builders import model_builder

RuntimeError

Traceback (most recent call last):
File "D:\Tensoflow Object Detection\TFODCourse\tfod\lib\site-packages\tensorflow\python\training\py_checkpoint_reader.py", line 95, in NewCheckpointReader
return CheckpointReader(compat.as_bytes(filepattern))
RuntimeError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for Tensorflow\workspace\pre-trained-models\ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8\checkpoint\ckpt-0
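
The message says no files match the checkpoint prefix ckpt-0. A TF2 checkpoint prefix normally corresponds to files such as ckpt-0.index and ckpt-0.data-00000-of-00001, so a quick way to see what is actually on disk is sketched below. The helper name checkpoint_files is an assumption. Note that the path in the error is relative, so the result depends on the folder the notebook or script is started from.

```python
from pathlib import Path


def checkpoint_files(prefix):
    """List files on disk that belong to the checkpoint prefix, e.g. ckpt-0.*"""
    prefix = Path(prefix)
    if not prefix.parent.is_dir():
        return []  # the checkpoint folder itself is missing
    return sorted(p.name for p in prefix.parent.glob(prefix.name + ".*"))
```

An empty list for the pre-trained model's checkpoint folder would point to a missing or incomplete model download, or to a working directory that is not the repository root.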

How to show accuracy metrics in Tensorboard?

Hello,

I am using TFOD to train different architectures with 3 different datasets. Once I run the evaluation, I see that for one of the datasets the AP is much lower than for the others. However, in Tensorboard the loss curves are very similar for all of them. How can I work out what is going wrong there and how to improve it?
Is there any way to see how accuracy evolves over the different training runs?

Looking forward to your response,

Thanks!

Olatz

I have a problem when I run this: python Tensorflow\models\research\object_detection\model_main_tf2.py --model_dir=Tensorflow\workspace\models\my_ssd_mobnet --pipeline_config_path=Tensorflow\workspace\models\my_ssd_mobnet\pipeline.config --num_train_steps=2000

my result
2022-02-16 23:19:08.554267: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-02-16 23:19:08.997601: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 1251 MB memory: -> device: 0, name: GeForce MX450, pci bus id: 0000:01:00.0, compute capability: 7.5
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
I0216 23:19:09.045574 5088 mirrored_strategy.py:374] Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
INFO:tensorflow:Maybe overwriting train_steps: 2000
I0216 23:19:09.045574 5088 config_util.py:552] Maybe overwriting train_steps: 2000
INFO:tensorflow:Maybe overwriting use_bfloat16: False
I0216 23:19:09.045574 5088 config_util.py:552] Maybe overwriting use_bfloat16: False
WARNING:tensorflow:From C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.9.egg\object_detection\model_lib_v2.py:563: StrategyBase.experimental_distribute_datasets_from_function (from tensorflow.python.distribute.distribute_lib) is deprecated and will be removed in a future version.
Instructions for updating:
rename to distribute_datasets_from_function
W0216 23:19:09.061112 5088 deprecation.py:337] From C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.9.egg\object_detection\model_lib_v2.py:563: StrategyBase.experimental_distribute_datasets_from_function (from tensorflow.python.distribute.distribute_lib) is deprecated and will be removed in a future version.
Instructions for updating:
rename to distribute_datasets_from_function
INFO:tensorflow:Reading unweighted datasets: ['Tensorflow\workspace\annotations\train.record']
I0216 23:19:09.076739 5088 dataset_builder.py:163] Reading unweighted datasets: ['Tensorflow\workspace\annotations\train.record']
INFO:tensorflow:Reading record datasets for input file: ['Tensorflow\workspace\annotations\train.record']
I0216 23:19:09.076739 5088 dataset_builder.py:80] Reading record datasets for input file: ['Tensorflow\workspace\annotations\train.record']
INFO:tensorflow:Number of filenames to read: 1
I0216 23:19:09.076739 5088 dataset_builder.py:81] Number of filenames to read: 1
WARNING:tensorflow:num_readers has been reduced to 1 to match input file shards.
W0216 23:19:09.076739 5088 dataset_builder.py:87] num_readers has been reduced to 1 to match input file shards.
WARNING:tensorflow:From C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.9.egg\object_detection\builders\dataset_builder.py:101: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.AUTOTUNE) instead. If sloppy execution is desired, use tf.data.Options.deterministic.
W0216 23:19:09.076739 5088 deprecation.py:337] From C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.9.egg\object_detection\builders\dataset_builder.py:101: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.AUTOTUNE) instead. If sloppy execution is desired, use tf.data.Options.deterministic.
WARNING:tensorflow:From C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.9.egg\object_detection\builders\dataset_builder.py:236: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.map()
W0216 23:19:09.092370 5088 deprecation.py:337] From C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.9.egg\object_detection\builders\dataset_builder.py:236: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.map()
WARNING:tensorflow:From C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\tensorflow\python\util\dispatch.py:1082: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a tf.sparse.SparseTensor and use tf.sparse.to_dense instead.
W0216 23:19:13.918504 5088 deprecation.py:337] From C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\tensorflow\python\util\dispatch.py:1082: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a tf.sparse.SparseTensor and use tf.sparse.to_dense instead.
WARNING:tensorflow:From C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\tensorflow\python\util\dispatch.py:1082: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version.
Instructions for updating:
seed2 arg is deprecated.Use sample_distorted_bounding_box_v2 instead.
W0216 23:19:16.172085 5088 deprecation.py:337] From C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\tensorflow\python\util\dispatch.py:1082: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version.
Instructions for updating:
seed2 arg is deprecated.Use sample_distorted_bounding_box_v2 instead.
WARNING:tensorflow:From C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\tensorflow\python\util\dispatch.py:1082: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
W0216 23:19:17.406628 5088 deprecation.py:337] From C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\tensorflow\python\util\dispatch.py:1082: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\keras\backend.py:450: UserWarning: tf.keras.backend.set_learning_phase is deprecated and will be removed after 2020-10-11. To update it, simply pass a True/False value to the training argument of the __call__ method of your layer or model.
warnings.warn('tf.keras.backend.set_learning_phase is deprecated and '
2022-02-16 23:19:32.862831: I tensorflow/stream_executor/cuda/cuda_dnn.cc:368] Loaded cuDNN version 8101
2022-02-16 23:19:34.137917: E tensorflow/stream_executor/cuda/cuda_blas.cc:232] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2022-02-16 23:19:34.224891: E tensorflow/stream_executor/cuda/cuda_blas.cc:232] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2022-02-16 23:19:34.397404: E tensorflow/stream_executor/cuda/cuda_blas.cc:232] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2022-02-16 23:19:34.484402: E tensorflow/stream_executor/cuda/cuda_blas.cc:232] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2022-02-16 23:19:34.587516: E tensorflow/stream_executor/cuda/cuda_blas.cc:232] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2022-02-16 23:19:34.712935: E tensorflow/stream_executor/cuda/cuda_blas.cc:232] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2022-02-16 23:19:34.733970: E tensorflow/stream_executor/cuda/cuda_blas.cc:232] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2022-02-16 23:19:34.734360: W tensorflow/core/framework/op_kernel.cc:1745] OP_REQUIRES failed at matmul_op_impl.h:442 : INTERNAL: Attempting to perform BLAS operation using StreamExecutor without BLAS support
2022-02-16 23:19:34.736050: E tensorflow/stream_executor/cuda/cuda_blas.cc:232] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2022-02-16 23:19:34.736125: W tensorflow/core/framework/op_kernel.cc:1745] OP_REQUIRES failed at matmul_op_impl.h:442 : INTERNAL: Attempting to perform BLAS operation using StreamExecutor without BLAS support
2022-02-16 23:19:34.736274: E tensorflow/stream_executor/cuda/cuda_blas.cc:232] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2022-02-16 23:19:34.736298: W tensorflow/core/framework/op_kernel.cc:1745] OP_REQUIRES failed at matmul_op_impl.h:442 : INTERNAL: Attempting to perform BLAS operation using StreamExecutor without BLAS support
2022-02-16 23:19:34.736359: E tensorflow/stream_executor/cuda/cuda_blas.cc:232] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2022-02-16 23:19:34.736375: W tensorflow/core/framework/op_kernel.cc:1745] OP_REQUIRES failed at matmul_op_impl.h:442 : INTERNAL: Attempting to perform BLAS operation using StreamExecutor without BLAS support
2022-02-16 23:19:34.736442: E tensorflow/stream_executor/cuda/cuda_blas.cc:232] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2022-02-16 23:19:34.736458: W tensorflow/core/framework/op_kernel.cc:1745] OP_REQUIRES failed at matmul_op_impl.h:442 : INTERNAL: Attempting to perform BLAS operation using StreamExecutor without BLAS support
2022-02-16 23:19:34.736515: E tensorflow/stream_executor/cuda/cuda_blas.cc:232] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2022-02-16 23:19:34.736526: W tensorflow/stream_executor/stream.cc:1260] attempting to perform BLAS operation using StreamExecutor without BLAS support
2022-02-16 23:19:34.736580: E tensorflow/stream_executor/cuda/cuda_blas.cc:232] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2022-02-16 23:19:34.736603: W tensorflow/core/framework/op_kernel.cc:1745] OP_REQUIRES failed at matmul_op_impl.h:442 : INTERNAL: Attempting to perform BLAS operation using StreamExecutor without BLAS support
2022-02-16 23:19:34.736635: E tensorflow/stream_executor/cuda/cuda_blas.cc:232] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2022-02-16 23:19:34.736649: W tensorflow/core/framework/op_kernel.cc:1745] OP_REQUIRES failed at matmul_op_impl.h:442 : INTERNAL: Attempting to perform BLAS operation using StreamExecutor without BLAS support
2022-02-16 23:19:34.736693: E tensorflow/stream_executor/cuda/cuda_blas.cc:232] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2022-02-16 23:19:34.736709: W tensorflow/core/framework/op_kernel.cc:1745] OP_REQUIRES failed at matmul_op_impl.h:442 : INTERNAL: Attempting to perform BLAS operation using StreamExecutor without BLAS support
2022-02-16 23:19:34.736849: I tensorflow/stream_executor/stream.cc:4442] [stream=00000211EEDF6A60,impl=00000211F66EE2C0] INTERNAL: stream did not block host until done; was already in an error state
2022-02-16 23:19:34.736881: W tensorflow/core/kernels/gpu_utils.cc:70] Failed to check cudnn convolutions for out-of-bounds reads and writes with an error message: 'stream did not block host until done; was already in an error state'; skipping this check. This only means that we won't check cudnn for out-of-bounds reads and writes. This message will only be printed once.
2022-02-16 23:19:34.736946: I tensorflow/stream_executor/stream.cc:4442] [stream=00000211EEDF6A60,impl=00000211F66EE2C0] INTERNAL: stream did not block host until done; was already in an error state
2022-02-16 23:19:34.737568: I tensorflow/stream_executor/stream.cc:4442] [stream=00000211EEDF6A60,impl=00000211F66EE2C0] INTERNAL: stream did not block host until done; was already in an error state
2022-02-16 23:19:34.737629: I tensorflow/stream_executor/stream.cc:4442] [stream=00000211EEDF6A60,impl=00000211F66EE2C0] INTERNAL: stream did not block host until done; was already in an error state
2022-02-16 23:19:34.738228: I tensorflow/stream_executor/stream.cc:4442] [stream=00000211EEDF6A60,impl=00000211F66EE2C0] INTERNAL: stream did not block host until done; was already in an error state
2022-02-16 23:19:34.738286: I tensorflow/stream_executor/stream.cc:4442] [stream=00000211EEDF6A60,impl=00000211F66EE2C0] INTERNAL: stream did not block host until done; was already in an error state
2022-02-16 23:19:34.739022: I tensorflow/stream_executor/stream.cc:4442] [stream=00000211EEDF6A60,impl=00000211F66EE2C0] INTERNAL: stream did not block host until done; was already in an error state
2022-02-16 23:19:34.739079: I tensorflow/stream_executor/stream.cc:4442] [stream=00000211EEDF6A60,impl=00000211F66EE2C0] INTERNAL: stream did not block host until done; was already in an error state
2022-02-16 23:19:34.739837: I tensorflow/stream_executor/stream.cc:4442] [stream=00000211EEDF6A60,impl=00000211F66EE2C0] INTERNAL: stream did not block host until done; was already in an error state
2022-02-16 23:19:34.739895: I tensorflow/stream_executor/stream.cc:4442] [stream=00000211EEDF6A60,impl=00000211F66EE2C0] INTERNAL: stream did not block host until done; was already in an error state
2022-02-16 23:19:34.750895: I tensorflow/stream_executor/stream.cc:4442] [stream=00000211EEDF6A60,impl=00000211F66EE2C0] INTERNAL: stream did not block host until done; was already in an error state
2022-02-16 23:19:34.751451: I tensorflow/stream_executor/stream.cc:4442] [stream=00000211EEDF6A60,impl=00000211F66EE2C0] INTERNAL: stream did not block host until done; was already in an error state
2022-02-16 23:19:34.761552: I tensorflow/stream_executor/stream.cc:4442] [stream=00000211EEDF6A60,impl=00000211F66EE2C0] INTERNAL: stream did not block host until done; was already in an error state
2022-02-16 23:19:34.761640: I tensorflow/stream_executor/stream.cc:4442] [stream=00000211EEDF6A60,impl=00000211F66EE2C0] INTERNAL: stream did not block host until done; was already in an error state
2022-02-16 23:19:34.762796: I tensorflow/stream_executor/stream.cc:4442] [stream=00000211EEDF6A60,impl=00000211F66EE2C0] INTERNAL: stream did not block host until done; was already in an error state
2022-02-16 23:19:34.762893: I tensorflow/stream_executor/stream.cc:4442] [stream=00000211EEDF6A60,impl=00000211F66EE2C0] INTERNAL: stream did not block host until done; was already in an error state
2022-02-16 23:19:34.764247: I tensorflow/stream_executor/stream.cc:4442] [stream=00000211EEDF6A60,impl=00000211F66EE2C0] INTERNAL: stream did not block host until done; was already in an error state
2022-02-16 23:19:34.764356: I tensorflow/stream_executor/stream.cc:4442] [stream=00000211EEDF6A60,impl=00000211F66EE2C0] INTERNAL: stream did not block host until done; was already in an error state
2022-02-16 23:19:34.764458: E tensorflow/stream_executor/cuda/cuda_blas.cc:232] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
Traceback (most recent call last):
File "C:\Repos\tensorflow object detection\TFODCourse\Tensorflow\models\research\object_detection\model_main_tf2.py", line 115, in <module>
tf.compat.v1.app.run()
File "C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\tensorflow\python\platform\app.py", line 36, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\absl\app.py", line 312, in run
_run_main(main, args)
File "C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\absl\app.py", line 258, in _run_main
sys.exit(main(argv))
File "C:\Repos\tensorflow object detection\TFODCourse\Tensorflow\models\research\object_detection\model_main_tf2.py", line 106, in main
model_lib_v2.train_loop(
File "C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.9.egg\object_detection\model_lib_v2.py", line 605, in train_loop
load_fine_tune_checkpoint(
File "C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.9.egg\object_detection\model_lib_v2.py", line 400, in load_fine_tune_checkpoint
_ensure_model_is_built(model, input_dataset, unpad_groundtruth_tensors)
File "C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.9.egg\object_detection\model_lib_v2.py", line 175, in _ensure_model_is_built
strategy.run(
File "C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\tensorflow\python\distribute\distribute_lib.py", line 1312, in run
return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
File "C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\tensorflow\python\distribute\distribute_lib.py", line 2888, in call_for_each_replica
return self._call_for_each_replica(fn, args, kwargs)
File "C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\tensorflow\python\distribute\mirrored_strategy.py", line 676, in _call_for_each_replica
return mirrored_run.call_for_each_replica(
File "C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\tensorflow\python\distribute\mirrored_run.py", line 82, in call_for_each_replica
return wrapped(args, kwargs)
File "C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\tensorflow\python\util\traceback_utils.py", line 153, in error_handler
raise e.with_traceback(filtered_tb) from None
File "C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\tensorflow\python\eager\execute.py", line 54, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InternalError: Graph execution error:

Detected at node 'Loss/MatMulGather/MatMul' defined at (most recent call last):
File "C:\Users\User\anaconda3\lib\threading.py", line 930, in _bootstrap
self._bootstrap_inner()
File "C:\Users\User\anaconda3\lib\threading.py", line 973, in _bootstrap_inner
self.run()
File "C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.9.egg\object_detection\model_lib_v2.py", line 171, in _dummy_computation_fn
training_step=0)
File "C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.9.egg\object_detection\model_lib_v2.py", line 129, in _compute_losses_and_predictions_dicts
losses_dict = model.loss(
File "C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.9.egg\object_detection\meta_architectures\ssd_meta_arch.py", line 842, in loss
(batch_cls_targets, batch_cls_weights, batch_reg_targets,
File "C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.9.egg\object_detection\meta_architectures\ssd_meta_arch.py", line 1066, in _assign_targets
if train_using_confidences:
File "C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.9.egg\object_detection\meta_architectures\ssd_meta_arch.py", line 1083, in _assign_targets
groundtruth_weights_list)
File "C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.9.egg\object_detection\core\target_assigner.py", line 510, in batch_assign
for anchors, gt_boxes, gt_class_targets, gt_weights in zip(
File "C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.9.egg\object_detection\core\target_assigner.py", line 512, in batch_assign
(cls_targets, cls_weights,
File "C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.9.egg\object_detection\core\target_assigner.py", line 202, in assign
reg_targets = self._create_regression_targets(anchors,
File "C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.9.egg\object_detection\core\target_assigner.py", line 259, in _create_regression_targets
matched_gt_boxes = match.gather_based_on_match(
File "C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.9.egg\object_detection\core\matcher.py", line 214, in gather_based_on_match
gathered_tensor = self._gather_op(input_tensor, gather_indices)
File "C:\Repos\tensorflow object detection\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.9.egg\object_detection\utils\ops.py", line 1027, in matmul_gather_on_zeroth_axis
gathered_result_flattened = tf.matmul(indicator_matrix, params2d)
Node: 'Loss/MatMulGather/MatMul'
Attempting to perform BLAS operation using StreamExecutor without BLAS support
[[{{node Loss/MatMulGather/MatMul}}]] [Op:__inference__dummy_computation_fn_15081]
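
`CUBLAS_STATUS_ALLOC_FAILED` is commonly reported when cuBLAS cannot get the GPU memory it asks for, for example because another process already holds it. The log above does not prove that this is the cause here, so treat it as an assumption. Under that assumption, one thing to try is launching the training script with the `TF_FORCE_GPU_ALLOW_GROWTH` environment variable set, so TensorFlow allocates GPU memory on demand. A minimal sketch (the helper name `env_with_gpu_memory_growth` is hypothetical):

```python
import os


def env_with_gpu_memory_growth(base_env=None):
    """Return a copy of the environment that asks TensorFlow to grow GPU memory on demand."""
    env = dict(os.environ if base_env is None else base_env)
    # TensorFlow reads this variable at start-up, so it must be set
    # before the training process is launched.
    env["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"
    return env
```

The returned mapping can be passed as `env=` to `subprocess.run` when starting `model_main_tf2.py`; setting the variable by hand in the terminal before running the training command has the same effect.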

Path error on capture images step

I'm running the capture images step and hitting this error:
`
error Traceback (most recent call last)
in
7 ret, frame = cap.read()
8 imgname = os.path.join(IMAGES_PATH,label,label+'.'+'{}.jpg'.format(str(uuid.uuid1())))
----> 9 cv2.imwrite(imgname, frame)
10 cv2.imshow('frame', frame)
11 time.sleep(2)

error: OpenCV(4.5.1) /tmp/pip-req-build-7m_g9lbm/opencv/modules/imgcodecs/src/loadsave.cpp:753: error: (-215:Assertion failed) !_img.empty() in function 'imwrite'
`
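
The `!_img.empty()` assertion means `cv2.imwrite` was handed an empty image. In this loop that happens when `cap.read()` did not deliver a frame, in which case `ret` is `False` and `frame` is `None`. A minimal sketch of a guard to run before writing; the helper names `frame_is_usable` and `build_image_path` are hypothetical, and `cap`, `IMAGES_PATH` and `label` are the notebook's own variables:

```python
import os
import uuid


def frame_is_usable(ret, frame):
    """Return True only if the capture call succeeded and produced a non-empty frame."""
    if not ret or frame is None:
        return False
    # NumPy arrays expose .size; an empty array has size 0.
    return getattr(frame, "size", 1) != 0


def build_image_path(images_path, label):
    """Create the label folder if needed and return a unique .jpg path inside it."""
    folder = os.path.join(images_path, label)
    os.makedirs(folder, exist_ok=True)
    return os.path.join(folder, "{}.{}.jpg".format(label, uuid.uuid1()))


# Inside the capture loop from the notebook:
#     ret, frame = cap.read()
#     if not frame_is_usable(ret, frame):
#         print("No frame received from the camera")
#         break
#     cv2.imwrite(build_image_path(IMAGES_PATH, label), frame)
```

If the guard trips on every iteration, the camera itself (device index, permissions, another program holding it) is the more likely culprit than the image path.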

cannot import name 'string_int_label_map_pb2' from 'object_detection.protos'

File "C:\Users\james.c\Anaconda3\lib\site-packages\object_detection\utils\label_map_util.py", line 21, in <module>
from object_detection.protos import string_int_label_map_pb2
ImportError: cannot import name 'string_int_label_map_pb2' from 'object_detection.protos' (C:\Users\james.c\Anaconda3\lib\site-packages\object_detection\protos\__init__.py)
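
This import fails when `string_int_label_map_pb2.py` is not present in the `object_detection/protos` folder that Python is importing from. The `_pb2.py` files are generated from the `.proto` files by `protoc`. A rough sketch for spotting which ones are missing; `missing_pb2_modules` is a hypothetical helper, and the folder you pass in is up to you:

```python
from pathlib import Path


def missing_pb2_modules(protos_dir):
    """List .proto files in protos_dir that have no generated <name>_pb2.py beside them."""
    protos_dir = Path(protos_dir)
    missing = []
    for proto in sorted(protos_dir.glob("*.proto")):
        generated = protos_dir / (proto.stem + "_pb2.py")
        if not generated.exists():
            missing.append(proto.name)
    return missing
```

Pointing it at the `object_detection/protos` folder of the cloned models repository shows whether `protoc` actually produced output there. Note that an installed copy under `site-packages` may not ship the `.proto` files at all, in which case this check returns an empty list and tells you nothing.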

ParseError

ParseError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_23320/1400507109.py in <module>
2 with tf.io.gfile.GFile(files['PIPELINE_CONFIG'], "r") as f:
3 proto_str = f.read()
----> 4 text_format.Merge(proto_str, pipeline_config)

~\tfodf\lib\site-packages\google\protobuf\text_format.py in Merge(text, message, allow_unknown_extension, allow_field_number, descriptor_pool, allow_unknown_field)
717 ParseError: On text parsing problems.
718 """
--> 719 return MergeLines(
720 text.split(b'\n' if isinstance(text, bytes) else u'\n'),
721 message,

~\tfodf\lib\site-packages\google\protobuf\text_format.py in MergeLines(lines, message, allow_unknown_extension, allow_field_number, descriptor_pool, allow_unknown_field)
791 descriptor_pool=descriptor_pool,
792 allow_unknown_field=allow_unknown_field)
--> 793 return parser.MergeLines(lines, message)
794
795

~\tfodf\lib\site-packages\google\protobuf\text_format.py in MergeLines(self, lines, message)
816 """Merges a text representation of a protocol message into a message."""
817 self._allow_multiple_scalars = True
--> 818 self._ParseOrMerge(lines, message)
819 return message
820

~\tfodf\lib\site-packages\google\protobuf\text_format.py in _ParseOrMerge(self, lines, message)
835 tokenizer = Tokenizer(str_lines)
836 while not tokenizer.AtEnd():
--> 837 self._MergeField(tokenizer, message)
838
839 def _MergeField(self, tokenizer, message):

~\tfodf\lib\site-packages\google\protobuf\text_format.py in _MergeField(self, tokenizer, message)
965
966 else:
--> 967 merger(tokenizer, message, field)
968
969 else: # Proto field is unknown.

~\tfodf\lib\site-packages\google\protobuf\text_format.py in _MergeMessageField(self, tokenizer, message, field)
1040 if tokenizer.AtEnd():
1041 raise tokenizer.ParseErrorPreviousToken('Expected "%s".' % (end_token,))
-> 1042 self._MergeField(tokenizer, sub_message)
1043
1044 if is_map_entry:

~\tfodf\lib\site-packages\google\protobuf\text_format.py in _MergeField(self, tokenizer, message)
930
931 if not field and not self.allow_unknown_field:
--> 932 raise tokenizer.ParseErrorPreviousToken(
933 'Message type "%s" has no field named "%s".' %
934 (message_descriptor.full_name, name))

ParseError: 172:3 : Message type "object_detection.protos.TrainConfig" has no field named "fine_tune_checkpoint_version".
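
The `172:3` prefix in the message is the 1-based line and column in the text that was being parsed, here the contents of `pipeline.config`. A small sketch for pulling out the offending line so it can be inspected; `locate_parse_error` is a hypothetical helper, not part of protobuf or the Object Detection API:

```python
import re

# Matches the "LINE:COL : reason" shape used in the ParseError message above.
_LOCATION = re.compile(r"(\d+):(\d+)\s*:\s*(.*)")


def locate_parse_error(message, config_text):
    """Return (line_number, column, offending_line), or None if no location is found."""
    match = _LOCATION.search(message)
    if match is None:
        return None
    line_no, column = int(match.group(1)), int(match.group(2))
    lines = config_text.splitlines()
    # Line numbers in the message are 1-based.
    offending = lines[line_no - 1] if 1 <= line_no <= len(lines) else ""
    return line_no, column, offending
```

Calling it with `str(error)` and the text read from the config file shows exactly which field the installed `TrainConfig` definition does not recognise. A mismatch between the config and the installed version of the API is one possible reason, though the traceback alone does not confirm it.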

VERIFICATION_SCRIPT Error

I ran into the following error messages:

Traceback (most recent call last):
File "Tensorflow\models\research\object_detection\builders\model_builder_tf2_test.py", line 25, in <module>
from object_detection.builders import model_builder
File "C:\TFOD\TFODCourse\tfod\lib\site-packages\object_detection\builders\model_builder.py", line 20, in <module>
from object_detection.builders import anchor_generator_builder
File "C:\TFOD\TFODCourse\tfod\lib\site-packages\object_detection\builders\anchor_generator_builder.py", line 21, in <module>
from object_detection.protos import anchor_generator_pb2
ImportError: cannot import name 'anchor_generator_pb2' from 'object_detection.protos' (C:\TFOD\TFODCourse\tfod\lib\site-packages\object_detection\protos\__init__.py)

I have already searched for solutions and tried them without success. Please help, thanks.

Tensorflow training

for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.
Traceback (most recent call last):
File "E:\ANPR\anprsys\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 64, in <module>
from tensorflow.python._pywrap_tensorflow_internal import *
ImportError: DLL load failed while importing _pywrap_tensorflow_internal: A dynamic link library (DLL) initialization routine failed.

AttributeError: module 'pyparsing' has no attribute 'downcaseTokens'

When running the verification code in "2. Training and Detection" after installing CUDA and cuDNN, I get this error: "AttributeError: module 'pyparsing' has no attribute 'downcaseTokens'". I have tried a lot of things I found online, including installing an older version of pyparsing, but nothing works and I am not getting the "OK" at the end of the output. Any help would be appreciated.
Also, when running the section of code that installs Tensorflow Object Detection, I get a lot of output with one part in red that says "error: avro-python3 1.10.2 is installed but avro-python3!=1.9.2,<1.10.0,>=1.8.1 is required by {'apache-beam'}". I tried some fixes for this that I found online and none of them worked either. What could it be?
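
The avro-python3 message is a version constraint: the installed 1.10.2 falls outside `!=1.9.2,<1.10.0,>=1.8.1`. As a rough illustration of how such a constraint reads, here is a sketch that checks a version against a comma-separated spec. It only handles plain dotted numeric versions, and `satisfies` is a hypothetical helper; the `packaging` library is the robust way to do this.

```python
import operator

# Two-character operators come first so "<=" is not mistaken for "<".
_OPERATORS = {
    "!=": operator.ne,
    "==": operator.eq,
    "<=": operator.le,
    ">=": operator.ge,
    "<": operator.lt,
    ">": operator.gt,
}


def _as_tuple(version):
    """Turn '1.10.2' into (1, 10, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.strip().split("."))


def _split_clause(clause):
    """Split a clause such as '<1.10.0' into its comparison function and version tuple."""
    for symbol, compare in _OPERATORS.items():
        if clause.startswith(symbol):
            return compare, _as_tuple(clause[len(symbol):])
    raise ValueError("unsupported clause: " + clause)


def satisfies(version, spec):
    """Return True if version meets every comma-separated clause in spec."""
    installed = _as_tuple(version)
    for clause in spec.split(","):
        compare, wanted = _split_clause(clause.strip())
        if not compare(installed, wanted):
            return False
    return True
```

With the values from the error, `satisfies("1.10.2", "!=1.9.2,<1.10.0,>=1.8.1")` is `False` because of the `<1.10.0` clause, while a 1.9.x release other than 1.9.2 would pass.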

Error in Evaluate the model

I ran the steps, and at "Evaluate the model" I get this error:

TypeError: 'numpy.float64' object cannot be interpreted as an integer

I ignored that error and continued the process, but in the "detect from image" part it just shows the original image with no detections. Is that caused by the error in the evaluation step?
How can I fix that?
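
This `TypeError` is what Python raises when a float is used where an integer count is required. Which call triggers it is not visible from the post, so the following is only an illustration of the failure mode and of a defensive conversion; `as_count` is a hypothetical helper:

```python
import operator


def as_count(value):
    """Convert a whole-number value (int or float-like) to a plain int.

    Raises ValueError if the value has a fractional part.
    """
    try:
        # Succeeds for real integers; floats raise TypeError here.
        return operator.index(value)
    except TypeError:
        as_float = float(value)
        if not as_float.is_integer():
            raise ValueError("not a whole number: {!r}".format(value))
        return int(as_float)


# range(3.0) raises "TypeError: 'float' object cannot be interpreted as an integer",
# whereas range(as_count(3.0)) works.
```

If the full traceback points into a third-party package, an explicit integer conversion of this kind around the reported line, or a package version in which it is already fixed, is the usual direction to look in.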

Installing Tensorflow Object Detection takes a long time due to pip

This started happening recently when I tried to install Tensorflow Object Detection on Google Colab. During installation, pip has to search through multiple versions of chardet, idna and rsa, which takes a long time. This did not happen when I followed the tutorial before, so I was wondering whether something has been updated, or whether there is a way to avoid this long runtime. Thank you.

Value Error: Checkpoint version should be V2

Step 5 raises the Error:

Message type "object_detection.protos.TrainConfig" has no field named "fine_tune_checkpoint_version"

When I delete "fine_tune_checkpoint_version: V2" on line 172, I get a ValueError during training:

Value Error: Checkpoint version should be V2

So deleting the field doesn't work for me. Has anyone solved this?

ImportError: cannot import name 'string_int_label_map_pb2' from 'object_detection.protos'

Thanks for sharing this amazing tutorial Nicholas.

I ran:
!python {files['TF_RECORD_SCRIPT']} -x {os.path.join(paths['IMAGE_PATH'], 'train')} -l {files['LABELMAP']} -o {os.path.join(paths['ANNOTATION_PATH'], 'train.record')}
!python {files['TF_RECORD_SCRIPT']} -x {os.path.join(paths['IMAGE_PATH'], 'test')} -l {files['LABELMAP']} -o {os.path.join(paths['ANNOTATION_PATH'], 'test.record')}

and got:
Traceback (most recent call last):
File "C:\NRenotte\Tensorflow\scripts\generate_tfrecord.py", line 29, in <module>
from object_detection.utils import dataset_util, label_map_util
File "C:\Users\cyukiat\AppData\Local\Programs\Python\Python39\lib\site-packages\object_detection\utils\label_map_util.py", line 21, in <module>
from object_detection.protos import string_int_label_map_pb2
ImportError: cannot import name 'string_int_label_map_pb2' from 'object_detection.protos' (C:\Users\cyukiat\AppData\Local\Programs\Python\Python39\lib\site-packages\object_detection\protos\__init__.py)
Traceback (most recent call last):
File "C:\NRenotte\Tensorflow\scripts\generate_tfrecord.py", line 29, in <module>
from object_detection.utils import dataset_util, label_map_util
File "C:\Users\cyukiat\AppData\Local\Programs\Python\Python39\lib\site-packages\object_detection\utils\label_map_util.py", line 21, in <module>
from object_detection.protos import string_int_label_map_pb2
ImportError: cannot import name 'string_int_label_map_pb2' from 'object_detection.protos' (C:\Users\cyukiat\AppData\Local\Programs\Python\Python39\lib\site-packages\object_detection\protos\__init__.py)

I have tried the following, without success:
!from tensorflow/models/research/
protoc object_detection/protos/*.proto --python_out=.

export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim

Sorry I am new to this...
Error

cuDNN failed to initialize on Colab

2021-11-10 11:34:35.330749: E tensorflow/stream_executor/cuda/cuda_dnn.cc:362] Loaded runtime CuDNN library: 8.0.5 but source was compiled with: 8.1.0. CuDNN library needs to have matching major version and equal or higher minor version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2021-11-10 11:34:35.333269: E tensorflow/stream_executor/cuda/cuda_dnn.cc:362] Loaded runtime CuDNN library: 8.0.5 but source was compiled with: 8.1.0. CuDNN library needs to have matching major version and equal or higher minor version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.

tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node ssd_mobile_net_v2fpn_keras_feature_extractor/model/Conv1/Conv2D (defined at /usr/local/lib/python3.7/dist-packages/object_detection/models/ssd_mobilenet_v2_fpn_keras_feature_extractor.py:219) ]] [Op:__inference__dummy_computation_fn_15068]

How to Fix this?
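
The log line states the rule itself: the cuDNN library loaded at runtime must have the same major version as the one TensorFlow was compiled with, and an equal or higher minor version. A small sketch that extracts both versions from such a line and applies that rule; `parse_cudnn_versions` and `cudnn_compatible` are hypothetical helpers:

```python
import re

_MISMATCH = re.compile(
    r"Loaded runtime CuDNN library: (\d+)\.(\d+)\.(\d+) "
    r"but source was compiled with: (\d+)\.(\d+)\.(\d+)"
)


def parse_cudnn_versions(log_line):
    """Return ((runtime major, minor, patch), (compiled major, minor, patch)) or None."""
    match = _MISMATCH.search(log_line)
    if match is None:
        return None
    numbers = [int(group) for group in match.groups()]
    return tuple(numbers[:3]), tuple(numbers[3:])


def cudnn_compatible(runtime, compiled):
    """Apply the rule from the log: same major version, runtime minor >= compiled minor."""
    return runtime[0] == compiled[0] and runtime[1] >= compiled[1]
```

For the log above this gives runtime `(8, 0, 5)` against compiled `(8, 1, 0)`, which fails on the minor version. So either the runtime cuDNN has to be 8.1 or newer within the 8.x series, or the TensorFlow build has to be one compiled against the cuDNN that is actually present.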

generate_tfrecord.py failure

Hi, I saw your tutorial on YouTube; it was amazing, and I decided to do a similar detection in my project. I wrote all the code in VS Code instead of using the notebook. Everything was fine until I got to the step that generates the TFRecords. When I ran the script provided, it printed "Successfully created the TFRecord file: train.record." However, the generated files were all 0 KB. Before I noticed this, training always got stuck at the point where the terminal prints "Instructions for updating: Use tf.cast instead." I searched on Google and found that a 0 KB TFRecord is a potential cause. How can I resolve this? Thanks in advance!
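
A quick way to confirm the symptom, and to check one possible cause, is to look at the record file sizes and at how many annotation files sit in the image folders. The sketch below assumes the annotations are `.xml` files stored next to the images in the `train` and `test` folders; both helper names are hypothetical:

```python
import os


def empty_files(paths):
    """Return the paths that are missing or zero bytes long."""
    problems = []
    for path in paths:
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            problems.append(path)
    return problems


def count_annotations(folder):
    """Count .xml annotation files directly inside folder (assumed annotation format)."""
    if not os.path.isdir(folder):
        return 0
    return sum(1 for name in os.listdir(folder) if name.lower().endswith(".xml"))
```

If `count_annotations` returns 0 for the folder passed with `-x`, the script had nothing to write. When running outside the notebook, a relative path resolved against a different working directory could produce exactly that.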

TF2 waiting on checkpoint file

Upon running the evaluation command:

python Tensorflow\models\research\object_detection\model_main_tf2.py --model_dir=Tensorflow\workspace\models\my_ssd_mobnet --pipeline_config_path=Tensorflow\workspace\models\my_ssd_mobnet\pipeline.config --checkpoint_dir=Tensorflow\workspace\models\my_ssd_mobnet

the process hangs with the following message:
INFO:tensorflow:Waiting for new checkpoint at Tensorflow\workspace\models\my_ssd_mobnet
I0602 22:33:16.616526 1056 checkpoint_utils.py:140] Waiting for new checkpoint at Tensorflow\workspace\models\my_ssd_mobnet

pip list:
absl-py 0.12.0
astunparse 1.6.3
backcall 0.2.0
cachetools 4.2.2
certifi 2021.5.30
chardet 4.0.0
colorama 0.4.4
cycler 0.10.0
decorator 5.0.9
flatbuffers 1.12
gast 0.4.0
google-auth 1.30.1
google-auth-oauthlib 0.4.4
google-pasta 0.2.0
grpcio 1.34.1
h5py 3.1.0
idna 2.10
ipykernel 5.5.5
ipython 7.24.0
ipython-genutils 0.2.0
jedi 0.18.0
jupyter-client 6.1.12
jupyter-core 4.7.1
keras-nightly 2.5.0.dev2021032900
Keras-Preprocessing 1.1.2
kiwisolver 1.3.1
lvis 0.5.3
lxml 4.6.3
Markdown 3.3.4
matplotlib 3.4.2
matplotlib-inline 0.1.2
numpy 1.19.5
oauthlib 3.1.1
object-detection 0.1
opencv-python 4.5.2.52
opt-einsum 3.3.0
pandas 1.2.4
parso 0.8.2
pickleshare 0.7.5
Pillow 8.2.0
pip 21.1.2
prompt-toolkit 3.0.18
protobuf 3.17.1
pyasn1 0.4.8
pyasn1-modules 0.2.8
pycocotools 2.0.2
Pygments 2.9.0
pyparsing 2.4.7
PyQt5 5.15.4
PyQt5-Qt5 5.15.2
PyQt5-sip 12.9.0
python-dateutil 2.8.1
pywin32 301
PyYAML 5.4.1
pyzmq 22.1.0
requests 2.25.1
requests-oauthlib 1.3.0
rsa 4.7.2
scipy 1.6.3
setuptools 49.2.1
six 1.15.0
slim 0.1 c:\users\jonsc\onedrive\documents\python\machinelearning\tfodcourse-main\tensorflow\models\research\slim
tensorboard 2.5.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.0
tensorflow 2.5.0
tensorflow-estimator 2.5.0
termcolor 1.1.0
tf-models-official 2.5.0
tf-slim 1.1.0
tornado 6.1
traitlets 5.0.5
typing-extensions 3.7.4.3
urllib3 1.26.5
wcwidth 0.2.5
Werkzeug 2.0.1
wget 3.2
wheel 0.36.2
wrapt 1.12.1
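
The "Waiting for new checkpoint" message above comes from the evaluation job polling `--checkpoint_dir`. Before anything else, it is worth checking that the folder actually contains checkpoints. The sketch below assumes checkpoints are saved as `ckpt-N.index` (plus matching data files), which is an assumption about the file layout; `list_checkpoints` is a hypothetical helper:

```python
import re
from pathlib import Path

_CKPT_INDEX = re.compile(r"(ckpt-(\d+))\.index$")


def list_checkpoints(model_dir):
    """Return checkpoint prefixes such as 'ckpt-3' found in model_dir, ordered by number."""
    model_dir = Path(model_dir)
    if not model_dir.is_dir():
        return []
    found = []
    for entry in model_dir.iterdir():
        match = _CKPT_INDEX.match(entry.name)
        if match:
            found.append((int(match.group(2)), match.group(1)))
    return [prefix for _, prefix in sorted(found)]
```

An empty result for `Tensorflow\workspace\models\my_ssd_mobnet` would mean training never saved a checkpoint there, or saved it somewhere else. A non-empty result suggests the evaluation job has already seen what exists and is simply waiting for the next one.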

Verification Script: ModuleNotFoundError: No module named 'google.protobuf'

I have installed everything in a Python virtual environment. The TF2 object_detection API installation was successful with no errors, but the verification script throws this error: ModuleNotFoundError: No module named 'google.protobuf'

File "C:\Users\UPosia\PycharmProjects\EAU\NewENV\lib\site-packages\object_detection-0.1-py3.8.egg\object_detection\utils\config_util.py", line 23, in <module>
from google.protobuf import text_format
ModuleNotFoundError: No module named 'google.protobuf'
I have tried uninstalling google and google-cloud.
But I am afraid that uninstalling protobuf will break the object_detection API.
Can someone guide me further?

OS: Windows
Python: 3.8
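
One thing to rule out is that the interpreter running the verification script is not the one protobuf was installed into. A minimal sketch for checking, from inside the same environment, whether a dotted module name can be resolved at all; `module_available` is a hypothetical helper:

```python
import importlib.util


def module_available(name):
    """Return True if the dotted module name can be located by this interpreter."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing parent package (e.g. no 'google' package at all) raises
        # ModuleNotFoundError, which is a subclass of ImportError.
        return False
```

Running `module_available("google.protobuf")` with the virtual environment's own `python` (and printing `sys.executable` alongside it) shows whether the package is visible to that interpreter.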
