
Comments (3)

ankandrew commented on August 24, 2024

Some questions:

  1. Did you try an input resolution other than 640, e.g. a lower one such as 416?
  2. How big (number of samples) is your training set?
  3. Which model are you using, and is it pre-trained on COCO (the weights provided by the repo)?

Also, double-check that mixup augmentation is not ruining your training. Check whether the augmented images look the way you expect. Below is a script I use to visualize the augmentation:

https://github.com/ankandrew/yolov9/blob/8fecc650bebf7348a6372f43b668b344de070129/visualize_augmentation.py
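
To make concrete what mixup does to a document image, here is a minimal sketch in plain Python. It is not the repository's implementation: the `mixup` function, the list-of-lists grayscale images, and the Beta(32, 32) blend ratio are all assumptions chosen for illustration. It blends two images pixel by pixel and keeps the boxes of both, which for text can overlay one page's glyphs on another's.

```python
import random


def mixup(im1, labels1, im2, labels2, alpha=32.0, rng=random):
    """Blend two same-sized grayscale images and merge their box labels.

    Images are lists of rows of 0-255 ints (a stand-in for image arrays).
    alpha=32.0 is an assumed value; it keeps the blend ratio close to 0.5.
    """
    r = rng.betavariate(alpha, alpha)  # blend ratio between 0 and 1
    mixed = [
        [round(r * p1 + (1 - r) * p2) for p1, p2 in zip(row1, row2)]
        for row1, row2 in zip(im1, im2)
    ]
    # Boxes from both images are kept, so every box now sits on a blended background.
    return mixed, labels1 + labels2, r


# Hypothetical 2x3 "pages": one black, one white, each with one box (cls, x1, y1, x2, y2).
page_a = [[0, 0, 0], [0, 0, 0]]
page_b = [[255, 255, 255], [255, 255, 255]]
mixed, labels, r = mixup(
    page_a, [(0, 0, 0, 2, 1)],
    page_b, [(0, 1, 0, 3, 2)],
    rng=random.Random(0),  # seeded only to make the sketch repeatable
)
print(f"ratio={r:.2f}, first pixel={mixed[0][0]}, boxes={len(labels)}")
```

With a ratio near 0.5, neither page stays crisp, which is why it is worth looking at a few augmented samples before training on text.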

from yolov9.

travisCxy commented on August 24, 2024

@ankandrew hello

  1. I am using a larger input size, 1024, to train my model, because the original document images are all high resolution.
  2. I have 44,000 training samples, which I think is enough to train the model.
  3. I am using yolov9-e and loading the COCO-pretrained weights.

I checked my augmentation and you are right: I had not turned off mixup augmentation. I checked the augmentation using your script, then turned off mosaic and copy_paste, and I will train once more with the current settings.

By the way, I have been reading the loss computation code. The bbox loss mainly focuses on IoU, and I doubt that the IoU loss helps with accurate bbox regression. So I changed the loss to an L1 loss, but I got a worse result. Do you have any idea why?
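
One property worth keeping in mind when comparing the two losses is scale: an L1 loss on absolute coordinates gives the same value for a given pixel offset regardless of box size, while IoU is relative to the box. The sketch below only illustrates that property with hypothetical boxes and helper names (`box_iou`, `l1_distance`); it is not the repository's loss code and does not by itself explain the worse result reported above.

```python
def box_iou(b1, b2, eps=1e-7):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(b1[2], b2[2]) - max(b1[0], b2[0]))
    inter_h = max(0.0, min(b1[3], b2[3]) - max(b1[1], b2[1]))
    inter = inter_w * inter_h
    area1 = (b1[2] - b1[0]) * (b1[3] - b1[1])
    area2 = (b2[2] - b2[0]) * (b2[3] - b2[1])
    return inter / (area1 + area2 - inter + eps)


def l1_distance(b1, b2):
    """Sum of absolute coordinate differences between two boxes."""
    return sum(abs(a - b) for a, b in zip(b1, b2))


# The same 2-pixel horizontal shift applied to a small and a large box.
small_gt, small_pred = (0, 0, 10, 10), (2, 0, 12, 10)
large_gt, large_pred = (0, 0, 200, 200), (2, 0, 202, 200)

for name, gt, pred in [("small", small_gt, small_pred), ("large", large_gt, large_pred)]:
    print(f"{name}: L1={l1_distance(gt, pred)}, IoU={box_iou(gt, pred):.3f}")
```

The L1 distance is identical in both cases, while the IoU of the small box drops much further than that of the large one.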


ankandrew commented on August 24, 2024

Hi @travisCxy! Sorry for the late response. I think your analysis in point (3) is accurate. The existing MPDIoU loss (the MDPIoU branch in the code below) could be used instead of the CIoU loss currently used. MPDIoU includes a penalty term based on the distances between the corresponding top-left and bottom-right corners of the two bounding boxes, which should make it more suitable for text detection, where corner alignment is critical to avoid cropping letters (as in your examples). Let me know if this helps on your dataset.

yolov9/utils/metrics.py

Lines 292 to 296 in 5b1ea9a

    elif MDPIoU:
        d1 = (b2_x1 - b1_x1) ** 2 + (b2_y1 - b1_y1) ** 2
        d2 = (b2_x2 - b1_x2) ** 2 + (b2_y2 - b1_y2) ** 2
        mpdiou_hw_pow = feat_h ** 2 + feat_w ** 2
        return iou - d1 / mpdiou_hw_pow - d2 / mpdiou_hw_pow  # MPDIoU
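
For experimenting outside the repository, the quoted branch can be sketched for a single pair of boxes in plain Python. This is an illustration under assumptions, not the repo's function: the names `box_iou` and `mpdiou` are hypothetical, boxes are (x1, y1, x2, y2) tuples, and `feat_w` / `feat_h` are taken to be the dimensions used for normalization, as the variable names in the snippet suggest.

```python
def box_iou(b1, b2, eps=1e-7):
    """Plain IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(b1[2], b2[2]) - max(b1[0], b2[0]))
    inter_h = max(0.0, min(b1[3], b2[3]) - max(b1[1], b2[1]))
    inter = inter_w * inter_h
    area1 = (b1[2] - b1[0]) * (b1[3] - b1[1])
    area2 = (b2[2] - b2[0]) * (b2[3] - b2[1])
    return inter / (area1 + area2 - inter + eps)


def mpdiou(b1, b2, feat_w, feat_h):
    """IoU minus normalized squared distances between matching corners."""
    iou = box_iou(b1, b2)
    d1 = (b2[0] - b1[0]) ** 2 + (b2[1] - b1[1]) ** 2  # top-left corners
    d2 = (b2[2] - b1[2]) ** 2 + (b2[3] - b1[3]) ** 2  # bottom-right corners
    hw_pow = feat_h ** 2 + feat_w ** 2  # normalization term
    return iou - d1 / hw_pow - d2 / hw_pow


# Hypothetical ground truth and a prediction shifted 2 px to the right.
gt = (0, 0, 10, 10)
pred = (2, 0, 12, 10)
print(f"IoU={box_iou(gt, pred):.4f}, MPDIoU={mpdiou(gt, pred, feat_w=100, feat_h=100):.4f}")
```

The score equals the plain IoU only when both corner pairs coincide and is lower otherwise; a training loss would typically be one minus this score.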

You can use my branch to select the bounding box loss function, or cherry-pick my commit ankandrew@9527269 to easily test loss functions other than the default one.

