jl749 / yolov3
yolov3 implementation in pytorch (https://arxiv.org/pdf/1804.02767.pdf)
BATCH_SIZE = 32
16551/32 ≈ 518 loops per epoch
4952/32 ≈ 155 loops per epoch
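The two loop counts above can be reproduced with a ceiling division. This is a minimal sketch that assumes the last, smaller batch is kept (the behaviour of a PyTorch DataLoader with drop_last=False); the helper name loops_per_epoch is hypothetical and not taken from the repository.

```python
import math

BATCH_SIZE = 32


def loops_per_epoch(num_samples: int, batch_size: int = BATCH_SIZE) -> int:
    """Number of batches per epoch when the final partial batch is kept."""
    return math.ceil(num_samples / batch_size)


print(loops_per_epoch(16551))  # 518
print(loops_per_epoch(4952))   # 155
```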
anchors = tensor([[0.2800, 0.2200],  # predefined
                  [0.3800, 0.4800],
                  [0.9000, 0.7800],
                  [0.0700, 0.1500],
                  [0.1500, 0.1100],
                  [0.1400, 0.2900],
                  [0.0200, 0.0300],
                  [0.0400, 0.0700],
                  [0.0800, 0.0600]])
# class, x, y, w, h
8 0.764 0.6069277108433735 0.23600000000000002 0.3042168674698795
8 0.594 0.6159638554216867 0.188 0.29819277108433734
14 0.229 0.6445783132530121 0.166 0.45180722891566266
14 0.39 0.6430722891566265 0.168 0.4307228915662651
14 0.5650000000000001 0.5918674698795181 0.154 0.41867469879518077
14 0.787 0.5963855421686747 0.166 0.3855421686746988
all x, y, w, h values are normalized to the 0~1 range (relative to image width and height)
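As an illustration of the label format above, the following sketch parses one line of a label file and checks that the box values are normalized. The names Label, parse_label_line and to_corners are hypothetical helpers, not part of the repository.

```python
from typing import NamedTuple, Tuple


class Label(NamedTuple):
    cls: int   # class index
    x: float   # box centre x, normalized
    y: float   # box centre y, normalized
    w: float   # box width, normalized
    h: float   # box height, normalized


def parse_label_line(line: str) -> Label:
    """Parse 'class x y w h' and verify the box values lie in [0, 1]."""
    cls, x, y, w, h = line.split()
    label = Label(int(cls), float(x), float(y), float(w), float(h))
    if not all(0.0 <= v <= 1.0 for v in label[1:]):
        raise ValueError(f"box values are not normalized: {line!r}")
    return label


def to_corners(label: Label) -> Tuple[float, float, float, float]:
    """Convert centre format (x, y, w, h) to corner format (x1, y1, x2, y2)."""
    return (label.x - label.w / 2, label.y - label.h / 2,
            label.x + label.w / 2, label.y + label.h / 2)


label = parse_label_line(
    "8 0.764 0.6069277108433735 0.23600000000000002 0.3042168674698795")
print(label.cls, to_corners(label))
```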
VOCDataset: subclass of torch.utils.data.Dataset
overrides the __getitem__ method to convert the x, y, w, h values from label.txt into cell-relative coordinates for the matching scale (e.g. [0.9320, 0.4223, 3.0680, 2.6239] are the coordinates relative to the first anchor box in scale 0)
RETURN --> img: (C, W, H) && expected_bbox_info: ((3, 13, 13, 6), (3, 26, 26, 6), (3, 52, 52, 6))
loop over the expected bboxes (txt file):
coor_from_txt = [0.764, 0.6069277108433735, 0.23600000000000002, 0.3042168674698795]
IoU_wh(coor_from_txt[2:4], anchors) # calculate IoU from width and height only
IoU_arg_sorted = [0, 5, 1, 4, 3, 2, 8, 7, 6] # anchor indices sorted by IoU, highest first; coor_from_txt best matches the first anchor box in scale0
anchor_indices = IoU_arg_sorted
highest IoU anchor box ratio = (0.28, 0.22) <-- from index 0
index 0 means --> the anchor belongs to the first prediction (3, 13, 13, 6) && is the first of its three anchor boxes
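The anchor ranking above can be checked with a width/height-only IoU, which treats the box and every anchor as if they shared the same centre. This is a sketch in plain Python rather than torch; iou_wh and rank_anchors are hypothetical names that stand in for the IoU_wh call in the notes.

```python
ANCHORS = [
    (0.28, 0.22), (0.38, 0.48), (0.90, 0.78),   # scale 0 (13x13)
    (0.07, 0.15), (0.15, 0.11), (0.14, 0.29),   # scale 1 (26x26)
    (0.02, 0.03), (0.04, 0.07), (0.08, 0.06),   # scale 2 (52x52)
]


def iou_wh(box_wh, anchor_wh):
    """IoU of two boxes that share a centre, using width and height only."""
    inter = min(box_wh[0], anchor_wh[0]) * min(box_wh[1], anchor_wh[1])
    union = box_wh[0] * box_wh[1] + anchor_wh[0] * anchor_wh[1] - inter
    return inter / union


def rank_anchors(box_wh, anchors=ANCHORS):
    """Anchor indices sorted by IoU with the box, highest first."""
    ious = [iou_wh(box_wh, a) for a in anchors]
    return sorted(range(len(anchors)), key=lambda i: ious[i], reverse=True)


coor_from_txt = [0.764, 0.6069277108433735,
                 0.23600000000000002, 0.3042168674698795]
print(rank_anchors(coor_from_txt[2:4]))  # [0, 5, 1, 4, 3, 2, 8, 7, 6]
```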
now RESCALE...
there are 3 scales S=(13, 26, 52), each with 3 anchor boxes and 6 values per box (obj_prob, x, y, w, h, class)
scale0 = first prediction (3, 13, 13, 6)
scale1 = second prediction (3, 26, 26, 6)
scale2 = third prediction (3, 52, 52, 6)
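The rescale step can be sketched as follows: a flat anchor index (0 to 8) is split into a scale index and an anchor-within-scale index, and the normalized box is then expressed relative to its grid cell. The function name locate is hypothetical, and the sketch works on the raw label values. A dataset that resizes or pads the image first would produce different y and h values, which may be why the example quoted earlier in these notes shares x and w with this output but not y and h.

```python
SCALES = (13, 26, 52)
ANCHORS_PER_SCALE = 3


def locate(anchor_idx, x, y, w, h):
    """Map a flat anchor index and a normalized box to its scale, cell and
    cell-relative coordinates."""
    scale_idx = anchor_idx // ANCHORS_PER_SCALE       # which prediction
    anchor_on_scale = anchor_idx % ANCHORS_PER_SCALE  # which of its 3 anchors
    S = SCALES[scale_idx]
    row, col = int(S * y), int(S * x)                 # cell holding the centre
    x_cell, y_cell = S * x - col, S * y - row         # offset inside the cell
    w_cell, h_cell = S * w, S * h                     # size in units of cells
    return scale_idx, anchor_on_scale, row, col, (x_cell, y_cell, w_cell, h_cell)


coor_from_txt = [0.764, 0.6069277108433735,
                 0.23600000000000002, 0.3042168674698795]
print(locate(0, *coor_from_txt))
```

For anchor index 0 this selects scale 0, anchor 0, and the cell at row 7, column 9 of the 13x13 grid.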
YOLOv3/yolov3/datasets/pascal_VOC.py
Lines 69 to 99 in a59f9f7
each grid cell predicts only one object.
YOLOv3/yolov3/datasets/pascal_VOC.py
Lines 111 to 113 in 4f9c176
if a grid cell (13x13, 26x26, 52x52) is already reserved for an object, a new object cannot be assigned to it.
this is one reason why YOLO predicts at 3 scales (even if objectA is missed in the first prediction, the second or third prediction can cover it)
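The reservation rule can be illustrated with a small bookkeeping sketch. It assumes a target slot is identified by (scale, anchor, row, col) and that each object proposes one slot per scale. The slot values and the name assign_targets are hypothetical, and the repository's own assignment logic may differ in detail.

```python
def assign_targets(candidates):
    """Assign objects to slots on a first-come basis.

    candidates: list of (object_id, (scale_idx, anchor, row, col)).
    Returns the reserved slots and the (object_id, slot) pairs that were
    skipped because the slot was already taken.
    """
    reserved = {}
    skipped = []
    for obj_id, slot in candidates:
        if slot in reserved:
            skipped.append((obj_id, slot))  # slot already holds an object
        else:
            reserved[slot] = obj_id
    return reserved, skipped


# objectA and objectB collide in the 13x13 grid but not in the 26x26 grid.
candidates = [
    ("objectA", (0, 0, 7, 9)),
    ("objectA", (1, 2, 15, 19)),
    ("objectB", (0, 0, 7, 9)),
    ("objectB", (1, 2, 15, 20)),
]
reserved, skipped = assign_targets(candidates)
print(skipped)                       # objectB loses the 13x13 slot
print(reserved[(1, 2, 15, 20)])      # but keeps its 26x26 slot
```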
YOLOv3 makes predictions across 3 different scales (13x13, 26x26, 52x52) for a 416x416 input.
The detection layer makes predictions on feature maps of three different sizes, with strides 32, 16 and 8:
416/32 = 13
416/16 = 26
416/8 = 52
In total it predicts ((52 x 52) + (26 x 26) + (13 x 13)) x 3 = 10647 bounding boxes
detection is done by applying a 1x1 kernel to the feature maps
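The grid sizes and the total box count follow directly from the strides. A short sketch, assuming a square input whose side is divisible by 32 and 3 anchor boxes per cell (the function names are hypothetical):

```python
STRIDES = (32, 16, 8)
ANCHORS_PER_CELL = 3


def grid_sizes(input_size: int = 416):
    """Side length of the feature map at each stride."""
    return [input_size // s for s in STRIDES]


def total_boxes(input_size: int = 416) -> int:
    """Total number of predicted boxes across all three scales."""
    return sum(g * g for g in grid_sizes(input_size)) * ANCHORS_PER_CELL


print(grid_sizes())   # [13, 26, 52]
print(total_boxes())  # 10647
```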
YOLOv3 uses independent logistic classifiers instead of a softmax to predict the classes of each bounding box, so a box can carry more than one label. Class predictions are trained with binary cross-entropy loss rather than mean squared error. In simpler terms, both the objectness score and the class predictions are produced by logistic regression.
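The difference between the two classifier heads can be seen on a toy example. The sketch below uses plain Python and made-up logits: softmax forces the class scores to compete and sum to 1, while independent sigmoids score each class on its own and are trained with one binary cross-entropy term per class.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def softmax(zs):
    m = max(zs)                              # subtract max for stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


def bce(p: float, target: float, eps: float = 1e-12) -> float:
    """Binary cross-entropy for one predicted probability and one 0/1 target."""
    p = min(max(p, eps), 1.0 - eps)          # avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


logits = [2.0, 1.5, -3.0]                    # made-up scores for 3 classes
targets = [1.0, 1.0, 0.0]                    # two labels are true at once

soft = softmax(logits)                       # sums to 1: classes compete
independent = [sigmoid(z) for z in logits]   # each class judged separately
loss = sum(bce(p, t) for p, t in zip(independent, targets))

print(sum(soft), sum(independent), loss)
```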
https://learnopencv.com/mean-average-precision-map-object-detection-model-evaluation-metric/
mAP = torch.mean(precision_curve)
https://github.com/cocodataset/cocoapi/blob/8c9bcc3cf640524c4c20a9c40e89cb6a2f2fa0e9/PythonAPI/pycocotools/cocoeval.py#L439
https://github.com/rafaelpadilla/Object-Detection-Metrics/blob/e3f29579afef10e8057bda1beb6154a3f354287c/lib/Evaluator.py#L127
https://github.com/ultralytics/yolov5/blob/e808f2267d0164edb7bc45588c4fcda68c3dd8cb/utils/metrics.py#L64
mAP = torch.trapz(precision_curve, recall_curve)
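To show why the two expressions differ, the sketch below compares the mean of the precision values with the area under the precision-recall curve on made-up points. It uses a hand-written trapezoidal rule in plain Python in place of torch.trapz, and it omits the precision interpolation that the linked reference implementations apply. The result is the average precision for a single class; mAP would then be the mean of that value over classes.

```python
def trapz(y, x):
    """Trapezoidal integral of y over x (x must be sorted ascending)."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2.0
               for i in range(len(x) - 1))


# Made-up curve with unevenly spaced recall values.
recall_curve = [0.0, 0.1, 1.0]
precision_curve = [1.0, 1.0, 0.5]

mean_precision = sum(precision_curve) / len(precision_curve)
area_under_curve = trapz(precision_curve, recall_curve)

print(mean_precision)     # ignores how far apart the recall points are
print(area_under_curve)   # weights each segment by its recall width
```

With unevenly spaced recall values the plain mean over-weights the densely sampled region, which is why the two numbers disagree here.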
Lines 66 to 128 in 496563b
line 91: labels should be filtered by class too