Lots of the for loops in the COCORecall update_state method could be reduced to …

Vectorize/Optimize the COCORecall metric about keras-cv HOT 10 CLOSED

keras-team commented on July 29, 2024

Vectorize/Optimize the COCORecall metric

from keras-cv.

Comments (10)

bhack commented on July 29, 2024

> Additionally, we should be able to vectorize area computation in the iou function

What version do you want to implement? We have a vectorized IOU in:
https://github.com/tensorflow/models/blob/master/official/vision/detection/utils/box_utils.py#L643

It is much the same as the older vectorized benchmarks in:
https://medium.com/@venuktan/vectorized-intersection-over-union-iou-in-numpy-and-tensor-flow-4fa16231b63d
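For readers who want to see the shape of the idea, here is a minimal NumPy sketch of a vectorized pairwise IoU. It is not the code from either link; the function name `pairwise_iou` and the `[x_min, y_min, x_max, y_max]` box layout are assumptions made for this illustration.

```python
import numpy as np


def pairwise_iou(boxes_a, boxes_b):
    """IoU between every box in boxes_a (N, 4) and every box in boxes_b (M, 4).

    Assumes boxes are [x_min, y_min, x_max, y_max]; that layout is an
    assumption of this sketch, not necessarily what keras-cv uses.
    """
    boxes_a = np.asarray(boxes_a, dtype=np.float64)
    boxes_b = np.asarray(boxes_b, dtype=np.float64)

    # Areas for all boxes at once, with no Python loop.
    area_a = (boxes_a[:, 2] - boxes_a[:, 0]) * (boxes_a[:, 3] - boxes_a[:, 1])
    area_b = (boxes_b[:, 2] - boxes_b[:, 0]) * (boxes_b[:, 3] - boxes_b[:, 1])

    # Broadcast (N, 1, 2) against (1, M, 2) to get the intersection corners.
    top_left = np.maximum(boxes_a[:, None, :2], boxes_b[None, :, :2])
    bottom_right = np.minimum(boxes_a[:, None, 2:], boxes_b[None, :, 2:])

    # Negative widths or heights mean the boxes do not overlap.
    wh = np.clip(bottom_right - top_left, 0.0, None)
    intersection = wh[..., 0] * wh[..., 1]

    union = area_a[:, None] + area_b[None, :] - intersection
    # Return 0 where the union is empty instead of dividing by zero.
    return np.where(union > 0, intersection / np.maximum(union, 1e-12), 0.0)
```

Calling `pairwise_iou(a, b)` returns an `(N, M)` matrix, so the area computation and the overlap test both happen in a handful of array operations rather than one pair at a time.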

bhack commented on July 29, 2024

Also, with the goal of having a reusable component library for CV, we also have one in TF Addons:

https://github.com/tensorflow/addons/blob/master/tensorflow_addons/losses/giou_loss.py#L107-L128

LukeWood commented on July 29, 2024

Wow, thank you for the vectorized iou. This will save me a lot of time.

LukeWood commented on July 29, 2024

> > Additionally, we should be able to vectorize area computation in the iou function
>
> What version do you want to implement? We have a vectorized IOU in: https://github.com/tensorflow/models/blob/master/official/vision/detection/utils/box_utils.py#L643
>
> It is much the same as the older vectorized benchmarks in: https://medium.com/@venuktan/vectorized-intersection-over-union-iou-in-numpy-and-tensor-flow-4fa16231b63d

moved to https://github.com/tensorflow/models/blob/master/official/legacy/detection/utils/box_utils.py

LukeWood commented on July 29, 2024

Profile trace of COCORecall:

It looks like the TensorScatterAdd ops account for 20% of compute cost. To fix this, I can use a TensorArray in COCOBase with num_categories * num_thresholds elements, then reshape it after stacking. This should be much more efficient.

Additionally, we should look into the cost of the sum operations.
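As a rough illustration of the TensorArray approach described above, the sketch below writes one value per (category, threshold) cell into a flat `tf.TensorArray`, stacks it, and reshapes the result. It is not the COCOBase implementation: the function name, its arguments, and the per-cell count are all placeholders assumed for this example.

```python
import tensorflow as tf


def counts_per_category_and_threshold(scores, labels, thresholds, num_categories):
    """Count, per (category, threshold) cell, the boxes scoring at or above the threshold.

    The per-cell computation is a stand-in chosen for this sketch; the real
    metric does more work in each cell.
    """
    num_thresholds = int(thresholds.shape[0])
    n = num_categories * num_thresholds

    # One slot per (category, threshold) pair, each written exactly once,
    # instead of scatter-adding into a 2-D result inside the loop.
    results = tf.TensorArray(tf.int32, size=n)
    for i in range(n):
        category, t = divmod(i, num_thresholds)
        hit = tf.logical_and(tf.equal(labels, category), scores >= thresholds[t])
        # write() returns a new TensorArray handle, so it must be reassigned.
        results = results.write(i, tf.reduce_sum(tf.cast(hit, tf.int32)))

    # Stack into a flat (n,) tensor, then recover the 2-D layout.
    return tf.reshape(results.stack(), (num_categories, num_thresholds))
```

Whether this actually beats the scatter-add version would still need to be measured on the real metric; the sketch only shows the write, stack, reshape pattern.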

bhack commented on July 29, 2024

Can you share a gist to reproduce this? We still don't have any CI infra for PR performance regression checks.

I suppose one of the problems is that TensorScatterAdd and Sum are in the inner loop.
Is StridedSlice also coming from the _match_boxes inner loop?

The top 3 occurrences in the inner loops take 48.5% of the device time.

LukeWood commented on July 29, 2024

> Can you share a gist to reproduce this? We still don't have any CI infra for PR performance regression checks.

Unfortunately no. This uses an internal profiling tool.

> I suppose one of the problems is that TensorScatterAdd and Sum are in the inner loop.

Yeah, I expect we can't improve the sum operation. The ScatterND add I would bet we can improve drastically. I will be switching the results to TensorArrays, which should help a lot.

Re: StridedSlice and matching boxes:
I also expect this to take a lot of compute time; it may just be inherent to the algorithm required to compute the metric.

bhack commented on July 29, 2024

I think we need to be on the same page and test this with the same (simulated/random?) input size. Can you share your benchmark gist?

Have you checked whether you are OK with the constraints imposed by:
https://www.tensorflow.org/api_docs/python/tf/vectorized_map
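To make the question concrete, here is a small sketch of `tf.vectorized_map` applied to per-image box areas. As I read the documentation, the mapped function needs to produce outputs of the same shape for every element and should stick to ops that can be vectorized, which is why the boxes here are padded to a fixed count per image. The helper `box_areas` and the sample data are assumptions for illustration only.

```python
import tensorflow as tf


def box_areas(boxes):
    """Areas for one image's boxes, shape (num_boxes, 4) -> (num_boxes,).

    Assumes [x_min, y_min, x_max, y_max] boxes; the layout is an assumption
    of this sketch.
    """
    return (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])


# Two images, each padded to two boxes so every element has the same shape.
# The all-zero box in the second image is padding and has zero area.
batch = tf.constant(
    [
        [[0.0, 0.0, 2.0, 2.0], [1.0, 1.0, 2.0, 2.0]],
        [[0.0, 0.0, 3.0, 2.0], [0.0, 0.0, 0.0, 0.0]],
    ]
)

# Maps box_areas over the leading (batch) dimension without a Python loop.
areas = tf.vectorized_map(box_areas, batch)
```

The result has shape `(2, 2)`, one area per box per image. A matching step with data-dependent shapes or heavy control flow may not fit this pattern as easily.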

bhack commented on July 29, 2024

> moved to https://github.com/tensorflow/models/blob/master/official/legacy/detection/utils/box_utils.py

The new box and iou ops are in:
https://github.com/tensorflow/models/tree/master/official/vision/beta/ops

LukeWood commented on July 29, 2024

> Unfortunately no. This uses an internal profiling tool.

:( Sorry
