
Comments (19)

dongfangduoshou123 commented on September 7, 2024

OK, I will try.
Thank you mogul!

from pytracking.

goutamgmb commented on September 7, 2024

Although not implemented currently, it should be possible to share the feature extraction for multiple targets. With the current implementation, you will have to track each target independently, i.e. one forward pass for each target.


dongfangduoshou123 commented on September 7, 2024

Thank you!
If that is the case, real-time tracking of multiple targets can also be achieved, which is essential for many practical scenarios. That's very good. Great work!


dongfangduoshou123 commented on September 7, 2024

I found that in ATOM, at the tracking stage, the extracted backbone feature covers only a target-specific area (5 times the estimated target size), not the whole image. So the shared feature extraction refers to sharing features between the IoU-Net target position estimation and the target classification for a single tracked target.
So if we track multiple targets, each target has to run the backbone feature extraction independently. Is that right?
@martin-danelljan @goutamgmb
Thank you!


dongfangduoshou123 commented on September 7, 2024

I am most concerned about the real-time performance after a multi-target extension.
Is it possible to first extract the backbone feature of the whole image, shared by all tracked targets, then take a sub-feature for each target, with each target's sub-feature shared between the IoU-Net and the online classifier? I think only in this way can real-time multi-target tracking be guaranteed.


martin-danelljan commented on September 7, 2024

This is one way it could be implemented.
First of all, everything could be done in the global image coordinate frame, instead of cropping an image patch as done now.

  1. Extract features over the whole image.
  2. To train the target classification component, construct label maps in the global image coordinates and train the model for each object individually. (Alternatively you could crop a part of the extracted feature map and do training that way.)
  3. To use the target classifier, apply the filters for all objects over the feature map for the entire frame.
  4. To do target estimation: first apply the common conv layers in the IoU predictor, then do the PrPool in the global coordinates, then predict the final IoU.

Of course, we cannot guarantee real-time performance for multi-object tracking, because we have not implemented or tried it. I think it can be done fairly efficiently however, depending on how many objects you want to track. You are very welcome to try something like this, and report on your findings.
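The steps above could be sketched roughly as follows. This is a minimal NumPy stand-in, not the actual pytracking code: extract_features, apply_filter, and track_all are hypothetical names, and a plain correlation replaces the deep backbone and the learned classifier filters.

```python
import numpy as np

def extract_features(frame):
    # Stand-in for the shared backbone pass (step 1): run once per frame
    # and reused by every target. A real backbone returns a deep feature map.
    return frame.astype(np.float32)

def apply_filter(feat, filt):
    # Step 3: correlate one target's classifier filter over the shared
    # feature map (valid correlation, single channel for simplicity).
    fh, fw = filt.shape
    H, W = feat.shape
    out = np.empty((H - fh + 1, W - fw + 1), dtype=np.float32)
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(feat[y:y + fh, x:x + fw] * filt)
    return out

def track_all(frame, target_filters):
    feat = extract_features(frame)            # one backbone pass, shared
    positions = {}
    for tid, filt in target_filters.items():  # per-target classification
        score = apply_filter(feat, filt)
        positions[tid] = np.unravel_index(np.argmax(score), score.shape)
    return positions
```

The key point is that extract_features runs once per frame regardless of how many entries target_filters contains; only the cheap per-target filter applications scale with the number of objects.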


dongfangduoshou123 commented on September 7, 2024


"""4. To do target estimation. First apply the common conv layers in the iou predictor. Then do the PrPool in the global coordinates. Then predict the final IoU""" means dose not need to get sub_feature for each tracked target?

I think the modulation vector should be computed per target, so a per-target sub-feature should be taken from the whole feature map to generate each target's modulation vector.


martin-danelljan commented on September 7, 2024

The modulation vector needs to be target specific. I'm not sure what you mean by the sub-feature. Anyway, the first step of the feature extraction in the IoU predictor can be shared across objects, as I mentioned earlier.


dongfangduoshou123 commented on September 7, 2024

The modulation vector needs to be target specific. I'm not sure what you mean by the sub-feature. Anyway, the first step of the feature extraction in the IoU predictor can be shared across objects, as I mentioned earlier.

After the first step of the feature extraction in the IoU predictor, does each target do the PrPool on the feature-map region (the sub-feature) corresponding to its bounding box in the original image to get its target-specific modulation vector, and then predict the final IoU independently?


martin-danelljan commented on September 7, 2024

Yes.
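The flow confirmed here could be sketched like this. It is a NumPy stand-in with hypothetical names: a simple mean/max pooling substitutes for PrPool, and the shared conv layers are reduced to an identity.

```python
import numpy as np

def shared_iou_features(feat_map):
    # The common conv layers of the IoU predictor: run once per frame,
    # shared by all targets (identity here, as a stand-in).
    return feat_map

def pooled_vector(feat, box):
    # Per-target stand-in for PrPool: crop the feature-map region under
    # the target's box (x, y, w, h) and pool it to a fixed-size vector.
    x, y, w, h = box
    region = feat[y:y + h, x:x + w]
    return np.array([region.mean(), region.max()])

def modulation_vectors(feat_map, boxes):
    feat = shared_iou_features(feat_map)  # shared step, computed once
    # Each target then gets its own modulation vector and, from it,
    # its own IoU prediction.
    return {tid: pooled_vector(feat, b) for tid, b in boxes.items()}
```

Only the pooling and the final IoU prediction are per-target; the feature map they pool from is computed once.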


momo1986 commented on September 7, 2024

Hello @martin-danelljan, Martin.

Thanks for sharing.

I have tried your open-source algorithm.

It works smoothly for single-object tracking.

However, I have some problems with multiple object tracking (e.g., 2 objects):

  1. Instantiating multiple trackers and adding them to a tracker list: during inference, each tracker is updated separately. This keeps the accuracy, but it slows down the speed.

  2. I am not sure whether your ATOM tracker can be added into cv2.MultiTracker_create(). I have tried this flow so that the multiple-object tracker runs inference only once. Here is the code:

     if optional_box is not None:
         assert isinstance(optional_box, (list, tuple))
         assert len(optional_box) == 4, "valid box format is [x, y, w, h]"
         for i in range(len(tracker_list)):
             tracker_list[i].initialize(frame_disp, optional_box)
             multi_tracker.add(tracker_list[i], frame_disp, optional_box)
     else:
         while True:
             # cv.waitKey()
             frame_disp = frame.copy()

             cv.putText(frame_disp, 'Select target ROI and press ENTER, two or more', (20, 30),
                        cv.FONT_HERSHEY_COMPLEX_SMALL, 1.5, (0, 0, 0), 1)
             cv.putText(frame_disp, str(len(tracker_list)), (40, 50),
                        cv.FONT_HERSHEY_COMPLEX_SMALL, 1.5, (0, 0, 0), 1)

             for i in range(len(tracker_list)):
                 x, y, w, h = cv.selectROI(display_name, frame_disp, fromCenter=False)
                 init_state = [x, y, w, h]
                 tracker_list[i].initialize(frame, init_state)
                 multi_tracker.add(tracker_list[i], frame_disp, init_state)
             break
     state = multi_tracker.track(frame)

It reports an error showing that this is not currently supported:

Traceback (most recent call last):
File "run_video_multiple_object.py", line 40, in <module>
main()
File "run_video_multiple_object.py", line 36, in main
run_video(args.tracker_name, args.tracker_param,args.videofile, args.optional_box, args.debug)
File "run_video_multiple_object.py", line 24, in run_video
trackerList.run_video(videofilepath=videofile, optional_box=optional_box, debug=debug)
File "/fast/junyan/Tracking/pytracking/pytracking/evaluation/TrackerList.py", line 81, in run_video
track_videofile_multiple(videofilepath, self.trackerList, optional_box)
File "/fast/junyan/Tracking/pytracking/pytracking/tracker/base/basetracker.py", line 83, in track_videofile_multiple
multi_tracker.add(tracker_list[i], frame_disp, init_state)
TypeError: Expected cv::Tracker for argument 'newTracker'

Thus, my question is: how can I build a multiple-object tracker with the ATOM algorithm without a decay in speed?

Thanks if you can answer the question.

Regards!


momo1986 commented on September 7, 2024

@martin-danelljan, thanks in advance for your reply.
Also, the multiple-object tracker with a tracker list triggers an error (see the attached screenshot).
I am not sure what the root cause of this failure is.
How can I resolve it?
Regards!


dongfangduoshou123 commented on September 7, 2024

With the current ATOM implementation, if you want to track 5 targets in a frame, I think you must run the backbone feature extraction 5 times independently, so the time cost grows linearly: the extracted backbone feature covers only a target-specific area, not the whole image, and cannot be shared between targets, so one inference corresponds to one target. This is my understanding of ATOM.

In addition, the post-processing of the online classification model's raw score (as opposed to the online model training, which is described in detail) does not seem to be explained in the paper, so the process in the localize_target function that goes from the 1×1×18×18 raw score to the final 1×1×288×288 score is somewhat hard to understand.


martin-danelljan commented on September 7, 2024

Hi @momo1986
Thanks for your interest. A naive multi-object tracker should be easy to implement using the functionality in the ATOM class. I have no idea about the OpenCV MultiTracker functionality. Just make a wrapper yourself that loops over all objects in each frame and calls the initialize() and track() functions accordingly. The only thing that might require some attention is sharing the network weights across all objects, so that you don't have one network in memory for each object. You should be able to do this by sharing the params.features for all objects.
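Such a wrapper could look roughly like this. DummyTracker, MultiObjectWrapper, and the constructor signature are illustrative assumptions, not the actual ATOM class; only the initialize()/track() interface and the idea of sharing a params object come from the discussion above.

```python
class DummyTracker:
    # Stand-in for a single-object tracker such as ATOM: it just
    # remembers its box and returns it each frame.
    def __init__(self, params):
        self.params = params

    def initialize(self, frame, box):
        self.box = box

    def track(self, frame):
        return self.box


class MultiObjectWrapper:
    # Naive multi-object wrapper: one single-object tracker per target,
    # with initialize()/track() called per object in each frame. The
    # shared params object is how network weights could be reused
    # instead of loading one network per object.
    def __init__(self, make_tracker, shared_params):
        self.make_tracker = make_tracker
        self.shared_params = shared_params
        self.trackers = {}

    def add_target(self, tid, frame, init_box):
        t = self.make_tracker(self.shared_params)
        t.initialize(frame, init_box)
        self.trackers[tid] = t

    def track(self, frame):
        return {tid: t.track(frame) for tid, t in self.trackers.items()}
```

The per-frame cost still grows linearly with the number of targets; only memory is shared here.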

@dongfangduoshou123 You are right that the upsampling of the scores is not described in the paper due to space limitations. It is not an important element, and the tracker does very well without it (if you, e.g., want to optimize it further).

We have no current plans of extending pytracking to multi-target. But we would welcome any contribution that neatly adds simple multi-target tracking functionality to the current structure.


dongfangduoshou123 commented on September 7, 2024

Thank you for your reply!
The tracker does very well without upsampling of the scores?
But localize_target and localize_advanced both use the upsampled result to get the translation vector. It would be good if you could add a function to the ATOM class showing how to get the classification model's final translation vector without upsampling score_raw (it could be named localize_target_without_score_upsampling). Thank you!

I mainly cannot understand this code (the paper does not mention it):

Convert to displacements in the base scale

disp = (max_disp + self.output_sz / 2) % self.output_sz - self.output_sz / 2

I thought disp = max_disp - self.output_sz / 2 should work, but after this edit the tracker no longer works.
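One way to see what the modulo does (this is my reading, not an official explanation): the score map is cyclic, with zero displacement at index 0, so a peak past half the window corresponds to a negative, wrapped-around shift. The formula folds max_disp into the range [-output_sz/2, output_sz/2), which plain max_disp - output_sz/2 does not.

```python
def wrap_disp(max_disp, output_sz):
    # Fold a peak index of a cyclic score map into a signed displacement
    # in [-output_sz/2, output_sz/2). Index 0 means zero displacement;
    # indices past the midpoint wrap around to negative shifts.
    return (max_disp + output_sz / 2) % output_sz - output_sz / 2
```

For example, a peak at index 287 of a 288-wide map gives wrap_disp(287, 288) == -1.0 (a shift of one pixel to the left), whereas max_disp - output_sz / 2 would wrongly give 143.0.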


martin-danelljan commented on September 7, 2024

Hi. If you plot the scores at that stage, you will understand why that is needed.


dongfangduoshou123 commented on September 7, 2024

I have an idea; it may not be feasible.
Kalman filtering + ATOM's online classification component: the online classification component provides a rough location, and a Kalman filter then corrects the position and the width/height. The classification component's input feature does not have to be a deep feature; a fast hand-crafted feature could be used.
Maybe with this plan real-time tracking of multiple targets can also be guaranteed.
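The Kalman half of that idea could be sketched as a standard constant-velocity filter over (x, y). This is a generic textbook filter, not part of ATOM; the noise values q and r and the interface are illustrative assumptions, and the width/height could be handled the same way.

```python
import numpy as np

class ConstantVelocityKF:
    # Minimal constant-velocity Kalman filter for one target's (x, y).
    # State: [x, y, vx, vy]. The rough location from the online
    # classifier is the measurement; predict/update smooth it.
    def __init__(self, x, y, q=1e-2, r=1.0):
        self.s = np.array([x, y, 0.0, 0.0])
        self.P = np.eye(4)
        self.F = np.eye(4)
        self.F[0, 2] = self.F[1, 3] = 1.0   # position += velocity
        self.H = np.eye(2, 4)               # we observe (x, y) only
        self.Q = q * np.eye(4)
        self.R = r * np.eye(2)

    def step(self, zx, zy):
        # Predict
        self.s = self.F @ self.s
        self.P = self.F @ self.P @ self.F.T + self.Q
        # Update with the classifier's rough location
        z = np.array([zx, zy])
        y = z - self.H @ self.s
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.s = self.s + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.s[:2]
```

One filter per target is cheap (4×4 matrices), so this part would not hurt real-time performance even for many targets.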


momo1986 commented on September 7, 2024

Hello @martin-danelljan , Martin.

My current workaround is to instantiate multiple ATOM trackers and run inference for each of them every frame.

My understanding is that this tracker is a state-of-the-art single-object tracker.

For other multiple-object training, maybe a specific dataset with video-frame sequences is needed.

I am not sure whether my description is correct.

Thanks for your guidance.
Regards!


martin-danelljan commented on September 7, 2024

Hi. Yes, that sounds like a good way of implementing it.
Regards,
Martin

