Comments (5)

glenn-jocher avatar glenn-jocher commented on May 17, 2024

Hello! It's great to see you making progress with your custom trained weights and exploring different methods for making predictions. Differences in prediction results between using detect.py and loading the model directly with torch.hub.load might occur due to a few reasons:

  1. Preprocessing differences: Ensure the image preprocessing steps are consistent. detect.py handles resizing and normalization in specific ways that you'll need to replicate if you're using the model directly.
  2. Model state: Double-check that you're loading the correct weights and the model is in evaluation mode by calling model.eval() after loading it.
  3. NMS settings: Non-maximum suppression thresholds and other inference settings in detect.py could be different from the defaults assumed when loading the model via torch.hub.load.

Here's a quick checklist:

  • Verify image preprocessing steps.
  • Ensure model.eval() is called.
  • Align NMS and other inference parameters.

Once these areas are consistent, prediction results should align more closely. If discrepancies continue, it might be helpful to revisit the training configuration or the dataset for potential issues. If you have further questions or need more assistance, feel free to reach out. Happy coding! 😊
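To illustrate why point 3 matters, here is a minimal pure-Python sketch of greedy non-maximum suppression. It is not YOLOv5's own implementation, and the function names, boxes, scores, and threshold values are made up for illustration. It only shows that the same raw detections can yield different final boxes when the confidence and IoU thresholds differ.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(detections, conf_thres, iou_thres):
    """Illustrative greedy NMS over a list of (box, score) pairs."""
    # Drop low-confidence detections, then visit the rest from best to worst.
    candidates = sorted(
        (d for d in detections if d[1] >= conf_thres),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for box, score in candidates:
        # Keep a box only if it does not overlap a kept box too strongly.
        if all(iou(box, k[0]) <= iou_thres for k in kept):
            kept.append((box, score))
    return kept


# Hypothetical raw detections: two overlapping boxes and one weak, separate box.
detections = [
    ((10, 10, 110, 110), 0.90),
    ((20, 20, 120, 120), 0.60),
    ((200, 200, 260, 260), 0.30),
]

setting_a = greedy_nms(detections, conf_thres=0.25, iou_thres=0.45)
setting_b = greedy_nms(detections, conf_thres=0.50, iou_thres=0.70)

print([score for _, score in setting_a])  # overlapping 0.60 box suppressed
print([score for _, score in setting_b])  # weak 0.30 box filtered out instead
```

With a model loaded through torch.hub.load, the thresholds are typically exposed as attributes such as model.conf and model.iou; it is worth checking them against the --conf-thres and --iou-thres values used with detect.py in your checkout.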

from yolov5.

KAKAROT12419 avatar KAKAROT12419 commented on May 17, 2024

Sir, can you clarify which coordinates the output txt file of YOLOv5 contains?

glenn-jocher avatar glenn-jocher commented on May 17, 2024

@KAKAROT12419 hello! Sure, I'd be happy to clarify that for you 😊.

The output text files generated by YOLOv5 after inference contain detections for each image, where each line in the text file corresponds to a detected object and is formatted as follows:

class x_center y_center width height confidence

  • class is the object class ID (integer).
  • x_center and y_center are the center coordinates of the bounding box, normalized by the image width and height respectively.
  • width and height are the dimensions of the bounding box, also normalized by the image width and height.
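  • confidence is the prediction confidence score for the detected object. detect.py writes this column only when run with --save-conf in addition to --save-txt.

The four box values are normalized to the range 0 to 1; class is an integer ID. Because the box values are normalized, multiplying them by the image width and height recovers pixel coordinates at any image size.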
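As a rough sketch of how one such line can be read back, the helper below parses a line and scales the box to pixels. The function name parse_label_line, the sample line, and the 640x480 image size are hypothetical, and the confidence column is treated as optional since it may not be present in every file.

```python
def parse_label_line(line, img_w, img_h):
    """Parse 'class x_center y_center width height [confidence]'.

    Returns (class_id, (x_center, y_center, width, height) in pixels, confidence).
    confidence is None when the line has only five columns.
    """
    parts = line.split()
    class_id = int(parts[0])
    xc, yc, w, h = (float(v) for v in parts[1:5])
    conf = float(parts[5]) if len(parts) > 5 else None
    # x values scale with image width, y values with image height.
    box_px = (xc * img_w, yc * img_h, w * img_w, h * img_h)
    return class_id, box_px, conf


# Hypothetical label line for a 640x480 image.
class_id, box_px, conf = parse_label_line("0 0.5 0.5 0.25 0.5 0.87", img_w=640, img_h=480)
print(class_id, box_px, conf)  # 0 (320.0, 240.0, 160.0, 240.0) 0.87
```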

Hope this helps! If you have further questions, just let us know. Happy detecting!

KAKAROT12419 avatar KAKAROT12419 commented on May 17, 2024

I have one doubt, can you please clarify it: what is the difference between x1, y1, x2, y2 and xmin, ymin, xmax, ymax and x_center, y_center, width, height?

glenn-jocher avatar glenn-jocher commented on May 17, 2024

Hello! I'd be glad to clarify those terms for you 😊.

  • x1, y1, x2, y2 typically represent the top-left (x1, y1) and bottom-right (x2, y2) corners of a bounding box.
  • xmin, ymin, xmax, ymax denote the same corner convention under different names: (xmin, ymin) is the top-left corner and (xmax, ymax) is the bottom-right corner.
  • x_center, y_center, width, height describe the bounding box by its center's coordinates (x_center, y_center), its width, and its height.

All these notations aim to uniquely identify a bounding box. The choice of notation often depends on the application or the algorithm's requirements. YOLO uses the center format (x_center, y_center, width, height) because it simplifies certain calculations, like loss functions, during training.
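The two representations carry the same information, so converting between them is simple arithmetic. Here is a small sketch; the function names xyxy_to_xywh and xywh_to_xyxy and the sample box are chosen for illustration and are not taken from the YOLOv5 code base.

```python
def xyxy_to_xywh(x1, y1, x2, y2):
    """Corner format -> center format (x_center, y_center, width, height)."""
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


def xywh_to_xyxy(xc, yc, w, h):
    """Center format -> corner format (x1, y1, x2, y2)."""
    return (xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)


# Hypothetical box in pixel coordinates: top-left (100, 50), bottom-right (300, 250).
corners = (100, 50, 300, 250)
center = xyxy_to_xywh(*corners)
round_trip = xywh_to_xyxy(*center)

print(center)      # (200.0, 150.0, 200, 200)
print(round_trip)  # (100.0, 50.0, 300.0, 250.0)
```

The same formulas apply to normalized values, as long as both formats use the same units.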

Hope this helps clarify things! Happy coding!
