
Comments (11)

Abhijeet241093 commented on September 27, 2024

Hello @riacheruvu,

Greetings :)

Thank you for your kind reply.

Q. Could you confirm that when removing the bottle, the Yolov8 model is able to detect and overlay the bounding box on the output?

Answer: Yes, the YOLOv8 model is able to detect and overlay the bounding box on the output.

Q. I understand issues #1765 and #1766 are grouped under the same theme - would it be ok if we looped these issues under this issue #1754?

Answer : Yes please, thank you.


Abhijeet241093 commented on September 27, 2024

l-bat


Abhijeet241093 commented on September 27, 2024

catchygit


Abhijeet241093 commented on September 27, 2024

Any solution?


adrianboguszewski commented on September 27, 2024

@riacheruvu, please take a look at this.


riacheruvu commented on September 27, 2024

Hi @Abhijeet241093, thanks for raising these issues! I'm working on reproducing them on my end and will get back to you within the next week with a solution, which will most likely be a quick patch to the code snippet.

Please note that object addition/removal can be faulty if the YOLOv8 model is unable to clearly see and detect the object itself - occlusions, for example, can cause issues. Could you confirm that when removing the bottle, the YOLOv8 model is able to detect and overlay the bounding box on the output?

I understand issues #1765 and #1766 are grouped under the same theme - would it be OK if we tracked those issues under this issue, #1754? The patch I'll provide should resolve all three issues. Thank you for your patience and for the detailed error reports.


Abhijeet241093 commented on September 27, 2024

Hello @riacheruvu,

Suppose there are GPU and storage constraints on the server side. If we downgrade from the YOLOv8 model to the YOLOv5 model, will performance improve? Why do we choose YOLOv8 over YOLOv5? Is it for additional flexibility or features?


riacheruvu commented on September 27, 2024

Thank you, @Abhijeet241093, for the additional details - I'm working on it!

To your question: in this context, we are choosing YOLOv8 over YOLOv5 for the performance improvements and to be able to leverage the latest APIs. You could swap in YOLOv5 to meet GPU/storage requirements - I will say I haven't validated this particular kit with YOLOv5 yet, so you may see different results if you try this.
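
If you'd like to experiment, here is a minimal sketch of what the swap might look like with the Ultralytics API. The checkpoint names are illustrative (nano-sized weights assumed), and the YOLOv5 path is unvalidated for this kit:

from ultralytics import YOLO

# Current choice (YOLOv8, nano weights assumed for illustration)
model = YOLO("yolov8n.pt")

# Possible swap: Ultralytics also distributes YOLOv5 "u" checkpoints
# behind the same API, so model.track(...) and
# sv.Detections.from_ultralytics(...) should work unchanged.
model = YOLO("yolov5nu.pt")  # unvalidated with this kit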


Abhijeet241093 commented on September 27, 2024

Hello @riacheruvu,

Greetings :)

The logic is improved; it's fine now. Please find the code below. At this point, I need answers to the following questions:

  1. If we apply two separate YOLO models, one for person detection and one for object detection, should we:

A. Apply the zone to each model's results separately?

B. Or combine the results of both models, then apply the zone?

# Define empty lists and sets to keep track of labels and receipt items
original_labels = []
final_labels = []
person_bbox = []
p_items = []
purchased_items = set(p_items)
a_items = []
added_items = set(a_items)
hand_bbox = []
combined_detections = []

# Save the result as new_det_tracking_result.mp4
with sv.VideoSink("new_det_tracking_result.mp4", video_info) as sink:
    # Iterate through predictions and tracking results from both models
    for index, (result, result1) in enumerate(zip(
            model.track(source=VID_PATH, show=False, stream=True, verbose=True, persist=True),
            model1.track(source=VID_PATH, show=False, stream=True, verbose=True, persist=True))):
        # Define variables to store interactions; these are refreshed per frame
        interactions = []
        person_intersection_str = ""

        # Obtain predictions from model1 (person detection)
        frame1 = result1.orig_img
        detections_objects1 = sv.Detections.from_ultralytics(result1)
        detections_objects1 = detections_objects1[detections_objects1.class_id == 0]
        bboxes1 = result1.boxes
        #print(detections_objects1)

        # Obtain predictions from the YOLOv8 object detection model
        frame = result.orig_img
        detections = sv.Detections.from_ultralytics(result)
        detections = detections[detections.class_id < 10]
        bboxes = result.boxes

        # Apply the zone mask to each model's detections separately
        mask1, mask2 = zone.trigger(detections=detections_objects1), zone.trigger(detections=detections)
        detections_filtered1, detections_filtered2 = detections_objects1[mask1], detections[mask2]

        if len(detections_objects1) > 0:
            label1 = label_map1[detections_objects1.class_id[0]]  # Get the label for the class_id
            combined_detections.append((detections_objects1, label1))
            for detection, label in combined_detections:
                print("Detections:", detection)
                print("Label:", label)

        if bboxes1.id is not None:
            detections_objects1.tracker_id = bboxes1.id.cpu().numpy().astype(int)

        # Labels for model1 detections (kept under a separate name so they
        # aren't overwritten by the object-model labels built below)
        labels1 = [
            f'#{tracker_id} {label_map1[class_id]} {confidence:0.2f}'
            for _, _, confidence, class_id, tracker_id
            in detections_objects1
        ]

        # Print labels for detections from model1
        for _, _, confidence, class_id, _ in detections_objects1:
            print(f"Label: {label_map1[class_id]} with confidence: {confidence:.2f}")

 

        print(detections)
        # Apply the zone mask to the object-model detections
        mask = zone.trigger(detections=detections)
        detections_filtered = detections[mask]

        print("mask", mask)
        print("Detection", detections_filtered)

        if len(detections) > 0:
            label = label_map[detections.class_id[0]]  # Get the label for the class_id
            combined_detections.append((detections, label))

        if bboxes.id is not None:
            detections.tracker_id = bboxes.id.cpu().numpy().astype(int)

        labels = [
            f'#{tracker_id} {label_map[class_id]} {confidence:0.2f}'
            for _, _, confidence, class_id, tracker_id
            in detections
        ]

        frame = box_annotator.annotate(scene=frame, detections=detections_filtered, labels=labels)
        frame = zone_annotator.annotate(scene=frame)

        objects = [f'#{tracker_id} {label_map[class_id]}' for _, _, confidence, class_id, tracker_id in detections]

        # for _, _, confidence, class_id, _ in detections:
        #     print(f"Label: {label_map[class_id]} with confidence: {confidence:.2f}")

        # # Combine detections from both models
        # # combined_detections = np.concatenate((detections_objects1, detections))

        # print(combined_detections)

        # # Extract xyxy attributes from combined detections
        # combined_detections_xyxy = [detection[0].xyxy for detection in combined_detections]

        # print(combined_detections_xyxy)

        # # Check if combined_detections_xyxy is not empty and contains non-empty arrays
        # if combined_detections_xyxy and all(arr.size > 0 for arr in combined_detections_xyxy):
        #     # Concatenate xyxy arrays into a single array
        #     combined_xyxy_array = np.concatenate(combined_detections_xyxy, axis=0)
        # else:
        #     combined_xyxy_array = np.empty((0, 4))  # Create an empty array

        # # Create a Detections object with the concatenated xyxy array
        # combined_detections_detections = sv.Detections(xyxy=combined_xyxy_array)

        # # Apply mask over the combined detections
        # mask = zone.trigger(detections=combined_detections_detections)

        # # Filter combined detections based on the mask
        # combined_detections_filtered = [combined_detections[i] for i in range(len(combined_detections)) if mask[i]]

        # # Print the mask and filtered detections
        # # print("Combined Detections mask:", mask)
        # # print("Combined Detections filtered:", combined_detections_filtered)

        # # Iterate through combined detections to create labels
        # combined_labels = []
        # for detection in combined_detections_filtered:
        #     detections, label = detection
        #     for _, _, confidence, class_id, tracker_id in detections:
        #         combined_labels.append(f'#{tracker_id} {label_map1[class_id]} {confidence:.2f}')

        # # Print labels for combined detections
        # for label in combined_labels:
        #     print("combined_labels", label)

        # frame = box_annotator.annotate(scene=frame, detections=combined_detections_filtered, labels=combined_labels)
        # frame = zone_annotator.annotate(scene=frame)

        # objects = [f'#{tracker_id} {label_map[class_id]}' for _, _, confidence, class_id, tracker_id in combined_detections_filtered]

        # print("Combined Objects:", objects)

        # If this is the first time we run the application,
        # store the objects' labels as they are at the beginning
        if index == 0:
            original_labels = objects
            original_dets = len(detections_filtered)
        else:
            # To identify whether an object has been added or removed,
            # compare the original labels against the current ones
            final_labels = objects
            new_dets = len(detections_filtered)
            # Identify added/removed objects using Counters; subtraction keeps
            # only positive counts, so each direction isolates one kind of change
            removed_objects = Counter(original_labels) - Counter(final_labels)
            added_objects = Counter(final_labels) - Counter(original_labels)

            # Create two variables we can increment for drawing text
            draw_txt_ir = 1
            draw_txt_ia = 1

            # Check for objects being removed
            # if new_dets - original_dets != 0 and len(removed_objects) >= 1:
            if new_dets != original_dets or removed_objects:
                # An object has been removed
                for k, v in removed_objects.items():
                    # For each of the objects, check the IOU between a
                    # designated object and a person
                    if 'person' not in k:
                        removed_object_str = f"{v} {k} purchased"
                        removed_action_str = intersecting_bboxes(bboxes, bboxes1, person_bbox, removed_object_str)
                        print("Removed Action String:", removed_action_str)
                        if removed_action_str is not None:
                            log.info(removed_action_str)
                            # Add the purchased item to a "receipt" of sorts;
                            # match receipt entries by the object's label directly
                            # (re-parsing the action string was error-prone)
                            removed_label = k
                            if any(removed_label in entry for entry in purchased_items):
                                # Increment the count on the matching receipt entry
                                purchased_items = {
                                    f"{int(entry.split()[0]) + 1} {' '.join(entry.split()[1:])}"
                                    if removed_label in entry else entry
                                    for entry in purchased_items
                                }
                            else:
                                purchased_items.add(f"{v} {k}")
                                p_items.append(f" - {v} {k}")
                        print("New_Purchased_Items:", purchased_items)
                        print("Removed_Objects:")
                        # Draw the result on the screen
                        draw_text(frame, text=removed_action_str, point=(50, 50 + draw_txt_ir), color=(0, 0, 255))
                        draw_text(frame, "Receipt: " + str(purchased_items), point=(50, 800), color=(30, 144, 255))
                        draw_txt_ir += 80
                    
            if len(added_objects) >= 1:
                # An object has been added
                for k, v in added_objects.items():
                    # For each of the objects, check the IOU between a
                    # designated object and a person
                    if 'person' not in k:
                        added_object_str = f"{v} {k} returned"
                        added_action_str = intersecting_bboxes(bboxes, bboxes1, person_bbox, added_object_str)
                        print("Added Action String:", added_action_str)
                        if added_action_str is not None:
                            # If we have determined an interaction with a person,
                            # log the interaction
                            log.info(added_action_str)
                            # Use a distinct name for the returned object's label so it
                            # doesn't shadow the comprehension variables below
                            added_label = k
                            if any(added_label in entry for entry in purchased_items):
                                # Decrement the count on the matching receipt entry
                                purchased_items = {
                                    f"{int(entry.split()[0]) - 1} {' '.join(entry.split()[1:])}"
                                    if added_label in entry else entry
                                    for entry in purchased_items
                                }
                                # Drop receipt entries whose count has reached zero
                                if any(entry.startswith('0 ') for entry in purchased_items):
                                    purchased_items = {entry for entry in purchased_items if not entry.startswith('0 ')}
                            print("Updated_Purchased_Items:", purchased_items)
                            # p_items.remove(added_label)
                            added_items.add(added_object_str)
                            a_items.append(added_object_str)
                            print("Added_Objects:")
                        # Draw the result on the screen
                        draw_text(frame, text=added_action_str, point=(50, 300 + draw_txt_ia), color=(0, 128, 0))
                        draw_text(frame, "Receipt: " + str(purchased_items), point=(50, 800), color=(30, 144, 255))
                        draw_txt_ia += 80

        # Clear the combined_detections list before the next frame
        combined_detections.clear()
        draw_text(frame, "Receipt: " + str(purchased_items), point=(50, 800), color=(30, 144, 255))
        sink.write_frame(frame)
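
For reference, a quick worked example of the Counter arithmetic used above for change detection (with labels in the same '#<tracker_id> <class>' format as the objects list):

from collections import Counter

original_labels = ['#1 person', '#2 bottle']
final_labels = ['#1 person']

# Counter subtraction keeps only positive counts, so each direction
# isolates one kind of change between frames
removed_objects = Counter(original_labels) - Counter(final_labels)  # Counter({'#2 bottle': 1})
added_objects = Counter(final_labels) - Counter(original_labels)    # Counter()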


Abhijeet241093 commented on September 27, 2024


Have you done it yet, @riacheruvu?


riacheruvu commented on September 27, 2024

Hello @Abhijeet241093, I sincerely apologize for my delayed response; I needed additional time to validate my patch - it'll be merged into the repository in a day or two. There were a few new edge cases the patch also needed to address, which took time to incorporate.

To your question:

"If we apply two separately yolovs model for person detection, and object detection, in that case, should we need to A. Separately apply zone over it? B. Or Should we combine results of both, then apply zone over it?"

It would depend on the use case you are trying to achieve. I would highly recommend applying the zone only to the object detection model. If you are looking to use the intersection of zones, rather than the intersection of bounding boxes, for detecting changes in objects, then you could instead apply zone detection to each of your two YOLO models (person and object detection) and then take the intersection/combination of the results. To briefly summarize, I would recommend option A. I hope my explanation makes sense - happy to clarify further if not!
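
To illustrate option A, here is a minimal sketch using the supervision API, reusing the variable names from your snippet (zone, detections_objects1 for the person model, detections for the object model) - each model's detections are filtered by the zone independently, and only then are the two filtered sets compared:

import supervision as sv

def filter_by_zone(zone: sv.PolygonZone, detections: sv.Detections) -> sv.Detections:
    # zone.trigger returns a boolean mask marking detections inside the zone
    mask = zone.trigger(detections=detections)
    return detections[mask]

# Option A: apply the zone to each model's detections separately,
# then reason about interactions between the two filtered sets
persons_in_zone = filter_by_zone(zone, detections_objects1)  # person model
objects_in_zone = filter_by_zone(zone, detections)           # object model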

Thank you for your patience. Once the patch is merged, I will close this issue and convert it to a discussion.

