peizesun / onenet
[ICML2021] What Makes for End-to-End Object Detection
License: MIT License
Hi! Thanks for releasing such a wonderful work. From the code it looks like the training loss and matching cost are identical, i.e., the original losses used in FCOS, RetinaNet, and CenterNet are also replaced with it, which was a bit unclear from the paper. Have you tried using the matching cost only to find the one-to-one matching, while keeping the original losses for training? Are the results for the three detectors without one-to-one matching also obtained with the new training loss (same as the matching cost)?
Can this be used on the Windows platform? The README does not mention Windows.
Very interesting work! I have read your implementation on the CondInst with OneNet matching. I've noticed there is a significant drop in mask AP compared to the original CondInst. What could be the causes?
Is it because a single positive sample per instance is not enough to train the mask branch? (I see you have doubled mask loss weight.) Or is it because of the adamw optimizer? Any further digging on this issue?
There is a second question. The paper describes that all anchor boxes/points are used in the cost calculation. Have you tried first applying a hand-crafted assignment (e.g., the FCOS method) and then matching? In other words, do the assignment results from matching still satisfy those hand-crafted rules? Would love to hear from you. Thank you.
def _get_src_permutation_idx(self, indices):
# permute predictions following indices
batch_idx = torch.cat([torch.full_like(src, i) for i, (src, _) in enumerate(indices)])
src_idx = torch.cat([src for (src, _) in indices])
return batch_idx, src_idx
def _get_tgt_permutation_idx(self, indices):
# permute targets following indices
batch_idx = torch.cat([torch.full_like(tgt, i) for i, (_, tgt) in enumerate(indices)])
tgt_idx = torch.cat([tgt for (_, tgt) in indices])
return batch_idx, tgt_idx
If src or tgt is an empty tensor, it causes an error:
in _get_src_permutation_idx
src_idx = torch.cat([src for (src, _) in indices])
RuntimeError: All input tensors must be on the same device. Received cpu and cuda:0
i and src are shown below:
0 tensor([ 1943, 20287, 17785, 4497, 6055], device='cuda:0')
1 tensor([11828, 6450, 13916, 5106], device='cuda:0')
2 tensor([14112, 14121, 5737, 12050, 12050, 13194, 9346, 9346, 5709, 3606,
8618, 12050, 4386, 1322, 4027, 8988, 4348, 8220, 6483],
device='cuda:0')
3 tensor([], dtype=torch.int64)
4 tensor([12109, 14215, 12786, 3532, 9812, 7320, 11715, 13564, 12183, 11136,
7112, 9891, 6351, 11333, 11330, 15892, 12598, 12108, 11612, 8574,
9881, 9521, 10388, 10374], device='cuda:0')
5 tensor([ 5141, 17595, 4841, 10382], device='cuda:0')
6 tensor([19832, 19286, 12812, 12812, 10720], device='cuda:0')
The tensor at index 3 is empty, and unlike the others it was created on the CPU rather than cuda:0, which is what triggers the device mismatch in torch.cat.
Maybe it can be fixed by:
def _get_src_permutation_idx(self, indices):
# permute predictions following indices
batch_idx = torch.cat([torch.full_like(src, i) for i, (src, _) in enumerate(indices) if len(src)])
src_idx = torch.cat([src for (src, _) in indices if len(src)])
return batch_idx, src_idx
def _get_tgt_permutation_idx(self, indices):
# permute targets following indices
batch_idx = torch.cat([torch.full_like(tgt, i) for i, (_, tgt) in enumerate(indices) if len(tgt)])
tgt_idx = torch.cat([tgt for (_, tgt) in indices if len(tgt)])
return batch_idx, tgt_idx
Thank you for the great project. The paper URL in your README can't be opened.
A clear and concise description of the feature proposal.
Tell us why the feature is useful.
Describe what the feature would look like, if it is implemented.
Best demonstrated using code examples in addition to words.
We only consider adding new features if they are relevant to many users.
If you request implementation of research papers -- we only consider papers that have enough significance and prevalence in the object detection field.
We do not take requests for most projects in the projects/
directory, because they are research code release that is mainly for other researchers to reproduce results.
"Make X faster/accurate" is not a valid feature request. "Implement a concrete feature that can make X faster/accurate" can be a valid feature request.
Instead of adding features inside detectron2,
you can implement many features by extending detectron2.
The projects/ directory contains many of such examples.
As shown in the latest code at:
OneNet/projects/OneNet/onenet/detector.py
Line 24 in e63c0cb
The matcher is changed from the Minimum Cost matcher of OneNet to the Hungarian Matcher. What is the performance gap between these two matchers? Was this change made to keep the code consistent with the paper "What Makes for End-to-End Object Detection?"?
I did not see any details in the paper "What Makes for End-to-End Object Detection?" about which matcher is used in the experiments.
Thanks!
Describe what you want to do, including:
Please link to which API or documentation you're asking about from
https://detectron2.readthedocs.io/
For meaning of a config, please see
https://detectron2.readthedocs.io/modules/config.html#config-references
NOTE:
Only general answers are provided.
If you want to ask about "why X did not work" for something you did, please use the
Unexpected behaviors issue template.
About how to implement new models / new dataloader / new training logic, etc., check documentation first.
We do not answer machine learning / computer vision questions that are not specific to detectron2, such as how a model works, how to improve your training/make it converge, or what algorithm/methods can be used to achieve X.
Hi,
Thank you for your work.
I just have a question about the differences in focal loss used between OneNet and other previous works.
In CenterNet, the authors used per-channel logistic regression with focal loss for key-point detection, just like your code.
The only difference seems to be the key-point positions selected for labeling, without the Gaussian pre-processing.
Am I right? Thanks.
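For reference, a minimal sketch of per-channel logistic regression with a focal modulation, using hard 0/1 targets at the selected key-point positions and no Gaussian splatting (the alpha/gamma values are the common RetinaNet defaults, not necessarily OneNet's exact settings):

```python
import torch
import torch.nn.functional as F

def sigmoid_focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    # Per-channel logistic regression with focal modulation.
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = logits.sigmoid()
    p_t = p * targets + (1 - p) * (1 - targets)
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).sum()

# Hard 0/1 targets: only the matched location is positive,
# with no Gaussian neighborhood around it (unlike CenterNet heatmaps).
torch.manual_seed(0)
logits = torch.randn(4, 80)            # 4 locations, 80 classes
targets = torch.zeros(4, 80)
targets[1, 17] = 1.0                   # a single positive sample
loss = sigmoid_focal_loss(logits, targets)
```

CenterNet instead splats a Gaussian around each key point and down-weights the penalty for near-positive locations; with one-to-one matching every non-matched location is treated as a plain negative.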
Thank you for your great work :)
As this work is comparing with CenterNet, do you have training results of OneNet based on modified DLA34 backbone (CenterNet version)? If yes, could you share the pretrained weights and COCO detection performance?
Thank you for sharing this excellent E2E object detector. I think it's a quite simple and elegant detection framework!
Could you please provide config files for more backbones, like MobileNet, GhostNet, EfficientNet, and so on?
More backbones would allow a more detailed comparison of OneNet.
For FCOS, positive samples are chosen from pre-defined layers of the feature pyramid.
It looks pretty similar, used to replace the NMS post-processing.
Thanks for sharing interesting and stimulating work!
Are you planning to release the R50_nodcn and R50_dcn pretrained weights?
And in the paper you applied OneNet-style label assignment to DETR and Sparse R-CNN. Are you also planning to release these implementations?
If you do not know the root cause of the problem, and wish someone to help you, please
post according to this template:
Check https://stackoverflow.com/help/minimal-reproducible-example for how to ask good questions.
Simplify the steps to reproduce the issue using suggestions from the above link, and provide them below:
If making changes to the project itself, please use output of the following command:
git rev-parse HEAD; git diff
<put code or diff here>
<put logs here>
If there are no obvious error in "what you observed" provided above,
please tell us the expected behavior.
If you expect the model to converge / work better, note that we do not give suggestions
on how to train a new model.
Only in one of the two conditions we will help with it:
(1) You're unable to reproduce the results in detectron2 model zoo.
(2) It indicates a detectron2 bug.
Provide your environment information using the following command:
wget -nc -q https://github.com/facebookresearch/detectron2/raw/master/detectron2/utils/collect_env.py && python collect_env.py
If your issue looks like an installation issue / environment issue,
please first try to solve it with the instructions in
https://detectron2.readthedocs.io/tutorials/install.html#common-installation-issues
Why is the learning rate set so small? What happens if the learning rate is adjusted larger, such as 0.01 or 0.001?
Hello, could you please clarify something regarding the loss computation? I understand that targets are assigned to network outputs based on a cost function computed as focal_loss + giou_loss + l1_loss. But doesn't the cost value at these locations equal their loss value, since the loss is likewise computed as the sum of the focal, L1, and GIoU components?
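Roughly yes: the pairwise cost matrix is built from the same terms, and the training loss then evaluates those terms only at the matched pairs. A toy sketch of just the classification term (the real code also adds the L1 and GIoU terms, and uses the focal form rather than raw probabilities):

```python
import torch

# toy setup: 5 candidate locations, 3 classes, 2 ground-truth objects
torch.manual_seed(0)
logits = torch.randn(5, 3)
tgt_classes = torch.tensor([0, 2])

# pairwise classification cost: -p_c for every (location, target) pair
prob = logits.sigmoid()
cost_cls = -prob[:, tgt_classes]                  # shape (5, 2)

# after matching, the training loss evaluates the SAME term,
# but only at the matched (location, target) pairs
src_ind = cost_cls.argmin(dim=0)                  # cheapest location per target
matched_cost = cost_cls[src_ind, torch.arange(2)]
```

So the cost at a matched location coincides with that location's loss contribution; the difference is that the cost is computed for every (location, target) pair to pick the matching, while the loss is summed only over the selected pairs (plus the negatives for classification).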
Hello, I made a simple modification of the detectron2 code for my own training, and it works for face detection on WiderFace.
But I find the GIoU loss may become NaN; I have tried learning rates of [0.01, 0.001, 0.0001, 0.00001, 0.000001] with AdamW.
I want to know whether you have seen NaN occur, or whether you do gradient clipping for it?
thx
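One common guard against exploding or NaN losses is clipping the global gradient norm before each optimizer step. A minimal sketch with a toy model (the max_norm value is illustrative, not taken from the OneNet config; checking for degenerate zero-area boxes before the GIoU term is also worth doing):

```python
import torch

model = torch.nn.Linear(4, 2)
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

x = torch.randn(8, 4)
loss = model(x).pow(2).mean()

opt.zero_grad()
loss.backward()
# Clip the global gradient norm before stepping so a single bad
# GIoU gradient cannot blow up the update.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
opt.step()
```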
Hi! Thanks for your great works! I have a question about the index of positive, which I find in your onenet paper pseudocode as following:
# index of positive sample: N
_, src_ind = torch.min(cost_mat, dim=0)
# index of ground-truth: N
tgt_ind = torch.arange(N)
but in the code as following using Hungarian Algorithm:
# _, src_ind = torch.min(C, dim=0)
# tgt_ind = torch.arange(len(tgt_ids)).to(src_ind)
src_ind, tgt_ind = linear_sum_assignment(C.cpu())
indices.append((src_ind, tgt_ind))
Is the performance of the Hungarian algorithm better than that of the minimum-cost matching?
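The two matchers above can be compared on a tiny made-up cost matrix (3 candidate predictions, 2 ground truths):

```python
import torch
from scipy.optimize import linear_sum_assignment

# toy cost matrix: 3 candidate predictions (rows) x 2 ground truths (cols)
C = torch.tensor([[0.2, 0.9],
                  [0.1, 0.8],
                  [0.7, 0.3]])

# paper-style minimum cost: each ground truth takes its cheapest prediction
_, src_ind_min = torch.min(C, dim=0)       # tensor([1, 2])
tgt_ind_min = torch.arange(C.shape[1])     # tensor([0, 1])

# Hungarian: globally optimal, strictly one-to-one assignment
src_ind_h, tgt_ind_h = linear_sum_assignment(C.numpy())
```

They agree here, but with minimum cost each ground truth independently takes its cheapest prediction, so two ground truths can in principle select the same prediction; the Hungarian algorithm always enforces a strict one-to-one assignment at a slightly higher matching cost in such cases.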
Thanks for your awesome work!
command:
python3 caffe2_converter.py --format onnx --config-file $dir/onenet.res18.nodcn.yaml --output onnx MODEL.WEIGHTS model_0014999.pth
error:
KeyError: 'Non-existent config key: MODEL.OneNet'
env:
pytorch: 1.6.0
torchvision: 0.7.0
onnx: 1.7.0