
adelaidet's People

Contributors

aghand0ur, blueardour, chhshen, cshen, dennis-park-tri, dependabot[bot], encounter1997, eurus-holmes, jeckinchen, johnqczhang, lbin, ldoublev, lee1409, ppwwyyxx, stan-haochen, stanstarks, tianzhi0549, venquieu, wangg12, weianmao, wxinlong, xiaohu2015, xuyunqiu, youngwanlee, yuliang-liu, zchrissirhcz, zzzzzz0407


adelaidet's Issues

what does this mean?

In this code:

    # if self.thresh_with_ctr is True, we multiply the classification
    # scores with centerness scores before applying the threshold.
    if self.thresh_with_ctr:
        box_cls = box_cls * ctrness[:, :, None]
    candidate_inds = box_cls > self.pre_nms_thresh
    pre_nms_top_n = candidate_inds.view(N, -1).sum(1)
    pre_nms_top_n = pre_nms_top_n.clamp(max=self.pre_nms_top_n)

    if not self.thresh_with_ctr:
        box_cls = box_cls * ctrness[:, :, None]

So box_cls = box_cls * ctrness[:, :, None] is executed in either case? What is the point of the flag then?
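If it helps, my own reading (a toy illustration, not taken from the repo): the flag only controls whether centerness is folded in before or after the threshold; the multiplication itself happens exactly once on either path.

    # Toy numbers: one location, one class.
    cls_score, ctr, thresh = 0.06, 0.5, 0.05

    # thresh_with_ctr=True: the combined score is thresholded first.
    print(cls_score * ctr > thresh)  # False, 0.03 is filtered out

    # thresh_with_ctr=False: the raw class score is thresholded, and the
    # combined score is only used afterwards for ranking and NMS.
    print(cls_score > thresh)        # True, 0.06 survives and is ranked by 0.03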

Training efficiency: tricks to make training faster

I am training FCOS based on the FCOS_MS_X_101_64x4d_2x model. The estimated time required is approximately 4-5 days. Are there any tricks I can use to speed up training? My training runs on 8 V100 GPUs.

About training details of the "Attns" in BlendMask

Hi, I would like to ask some questions about the attentions (Attns) in the top layer of BlendMask.
In the paper, Section 3.1 only describes how the Attns are inferred along with the top-k proposals. I wonder how the target Attns are added to the N_proposal (original) Attns during training, because there is a function add_gt_proposals(boxlists, targets) used while training.
Maybe they are set as the resized target masks? But then what about the "K" dimension of the Attns?
I am wondering if I misunderstood...

SOLO release

Hi! SOLO is wonderful! I heard you would release it at the end of last month, but it wasn't released. I'm wondering when you will release the code. Is there any plan?

Attempt to Reproduce the Results of CondInst.

Hi~ @tianzhi0549
I want to confirm the shared head architecture of CondInst.
Design A

                 --- conv --- conv --- conv --- conv --- cls_pred 
                |       
                |                                        --- ctr_pred 
                |                                       |
FPN features --- --- conv --- conv --- conv --- conv --- --- reg_pred 
                |
                |
                |
                 --- conv --- conv --- conv --- conv --- controller_pred

Design B

                 --- conv --- conv --- conv --- conv --- cls_pred 
                |       
                |                                        --- ctr_pred 
                |                                       |
FPN features --- --- conv --- conv --- conv --- conv --- --- reg_pred 
                                                        |
                                                         --- controller_pred
                

Which one is right?
I found that Design B degrades the box AP, and its mask AP is also very low.
Here are my results for MS-R-50_1x.

Box AP

AP      AP50    AP75
38.269  57.210  55.405

Mask AP

AP      AP50    AP75
27.531  51.157  47.783

The box AP should be higher than 39.5 with MS training (~39.5) plus multi-task training (+~1.0), so I think Design B is wrong: it is hard for one branch to handle three predictions, and the gradients from controller_pred degrade reg_pred.
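For concreteness, here is a minimal PyTorch sketch of what I mean by Design A (three parallel towers; the module and argument names are mine, not the official implementation):

    import torch.nn as nn

    def make_tower(channels=256, num_convs=4):
        # A stack of 3x3 conv + GN + ReLU blocks, as in the FCOS head.
        layers = []
        for _ in range(num_convs):
            layers += [
                nn.Conv2d(channels, channels, 3, padding=1),
                nn.GroupNorm(32, channels),
                nn.ReLU(inplace=True),
            ]
        return nn.Sequential(*layers)

    class DesignAHead(nn.Module):
        # Design A: cls, box, and controller each get their own tower.
        def __init__(self, channels=256, num_classes=80, num_gen_params=169):
            super().__init__()
            self.cls_tower = make_tower(channels)
            self.bbox_tower = make_tower(channels)
            self.ctrl_tower = make_tower(channels)
            self.cls_pred = nn.Conv2d(channels, num_classes, 3, padding=1)
            self.reg_pred = nn.Conv2d(channels, 4, 3, padding=1)
            self.ctr_pred = nn.Conv2d(channels, 1, 3, padding=1)
            self.controller_pred = nn.Conv2d(channels, num_gen_params, 3, padding=1)

        def forward(self, feature):
            cls_feat = self.cls_tower(feature)
            box_feat = self.bbox_tower(feature)
            ctrl_feat = self.ctrl_tower(feature)
            return (self.cls_pred(cls_feat),
                    self.reg_pred(box_feat),
                    self.ctr_pred(box_feat),  # centerness shares the box tower
                    self.controller_pred(ctrl_feat))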

A question about FCOS postprocessing

        candidate_inds = box_cls > self.pre_nms_thresh
        pre_nms_top_n = candidate_inds.view(N, -1).sum(1)
        pre_nms_top_n = pre_nms_top_n.clamp(max=self.pre_nms_top_n)

        # multiply the classification scores with centerness scores
        box_cls = box_cls * centerness[:, :, None]

        results = []
        for i in range(N):
            per_box_cls = box_cls[i]
            per_candidate_inds = candidate_inds[i]
            per_box_cls = per_box_cls[per_candidate_inds]

            per_candidate_nonzeros = per_candidate_inds.nonzero()
            per_box_loc = per_candidate_nonzeros[:, 0]
            per_class = per_candidate_nonzeros[:, 1] + 1

            per_box_regression = box_regression[i]
            per_box_regression = per_box_regression[per_box_loc]
            per_locations = locations[per_box_loc]

Regarding the first line: why not candidate_inds = box_cls.max(dim=2)[0] > self.pre_nms_thresh, i.e., threshold only the class with maximum probability?
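My guess at the reason (a toy example, not an authoritative answer): per-(location, class) thresholding lets a single location emit candidate detections for several classes, which max-only thresholding would drop:

    import torch

    # Toy scores: one image, 2 locations x 3 classes.
    box_cls = torch.tensor([[0.60, 0.55, 0.10],
                            [0.20, 0.30, 0.40]])
    thresh = 0.5

    # Per-(location, class): location 0 yields candidates for classes 0 and 1.
    print((box_cls > thresh).nonzero())        # tensor([[0, 0], [0, 1]])

    # Max-only: at most one class per location.
    print(box_cls.max(dim=1).values > thresh)  # tensor([ True, False])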

ONNX-to-TensorRT conversion error

While parsing node number 243 [InstanceNormalization -> "615"]:
ERROR: /home/onnx-tensorrt/builtin_op_importers.cpp:1550 In function importInstanceNormalization:
[8] Assertion failed: !isDynamic(tensor_ptr->getDimensions()) && "InstanceNormalization does not support dynamic inputs!

support customized dataset

Hi, I believe AdelaiDet is an awesome project.
But I wonder how I can train the detection model with my own dataset.
I've converted my dataset to COCO format, but I encountered an assertion failure while loading the dataset, because my object categories differ from those of coco_2017_train.
Is there a way to change the default training dataset? Thank you for your help.
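For reference, one common approach with detectron2-based projects (a sketch; the dataset name, paths, and class count below are placeholders for your own data):

    from detectron2.data.datasets import register_coco_instances

    # Register a COCO-format dataset under a new name.
    register_coco_instances(
        "my_dataset_train", {},
        "datasets/my_data/annotations/train.json",
        "datasets/my_data/images/train",
    )

After registering, point DATASETS.TRAIN at "my_dataset_train" in the config and set the model's number of classes to match your own categories.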

train error

[screenshot of the error]
When I change mask_format to bitmask, I get an error, because my dataset contains both polygon and RLE annotations.
How can I fix this? Thanks.

About RealTime FCOS

Hi there!

The FCOS-RT is really amazing! Thanks for the work!

A quick question regarding its configuration, MS_DLA_34_4x_syncbn.yaml. The solver is set up like this:

    SOLVER:
      STEPS: (300000, 340000)
      MAX_ITER: 360000

I just noticed that while 360K is 4x the 90K of the vanilla setting, 300K and 340K are not 4x of 60K and 80K (which would be 240K and 320K).

Is there any particular reason for such two lr dropping points?

Looking forward to your reply!

Thanks!

BlendMask Train Auxiliary Loss Error

Hi,

I ran into a bug when training BlendMask.

The file is adet/modeling/blendmask/basis_module.py:

    seg_loss = F.cross_entropy(
        sem_out, gt_sem.squeeze().long())

The problem is a dimension mismatch between sem_out (torch.Size([1, 81, 88, 132])) and gt_sem (torch.Size([1, 1, 88, 132])).
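A minimal repro of what I suspect is happening (squeeze() dropping the batch dimension as well when N = 1):

    import torch
    import torch.nn.functional as F

    sem_out = torch.randn(1, 81, 88, 132)  # (N, C, H, W) logits
    gt_sem = torch.zeros(1, 1, 88, 132)    # (N, 1, H, W) labels

    # squeeze() drops every size-1 dim, so with batch size 1 the batch dim
    # vanishes too: (88, 132) instead of the (1, 88, 132) cross_entropy expects.
    print(gt_sem.squeeze().shape)   # torch.Size([88, 132])
    print(gt_sem.squeeze(1).shape)  # torch.Size([1, 88, 132])
    loss = F.cross_entropy(sem_out, gt_sem.squeeze(1).long())  # works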
Could you please help me to solve this problem?
Thanks.

Inference with ABCNet, a bit horrible?

Using the following config, I ran ABCNet on a random picture for a quick inference test, with no significant changes to the defaults.

!python demo/demo.py \
    --config-file configs/BAText/TotalText/attn_R_50.yaml \
    --input /content/sam.png \
    --output /content/output/ \
    --opts MODEL.WEIGHTS /content/AdelaiDet/attn_tt_6262.pth 

Output:

[output image]

The inference is quite disappointing: single words are detected fairly well, but it fails to capture sequences of words.

No such file or directory: '..\zengimg\\6122.npz'

When I use BlendMask to train on my COCO-format dataset, the error says an .npz file is required (zengimg is the image folder). Why?

File "train_net.py", line 104, in train_loop
self.run_step()
File "d:\cnn\detect2\detectron2\engine\train_loop.py", line 209, in run_step
data = next(self._data_loader_iter)
File "d:\cnn\detect2\detectron2\data\common.py", line 140, in iter
for d in self.dataset:
File "C:\Users\wks.conda\envs\AdelaiDet\lib\site-packages\torch\utils\data\dataloader.py", line 345, in next
data = self._next_data()
File "C:\Users\wks.conda\envs\AdelaiDet\lib\site-packages\torch\utils\data\dataloader.py", line 856, in _next_data
return self._process_data(data)
File "C:\Users\wks.conda\envs\AdelaiDet\lib\site-packages\torch\utils\data\dataloader.py", line 881, in _process_data
data.reraise()
File "C:\Users\wks.conda\envs\AdelaiDet\lib\site-packages\torch_utils.py", line 394, in reraise
raise self.exc_type(msg)
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "C:\Users\wks.conda\envs\AdelaiDet\lib\site-packages\torch\utils\data_utils\worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "C:\Users\wks.conda\envs\AdelaiDet\lib\site-packages\torch\utils\data_utils\fetch.py", line 44, in fetch

data = [self.dataset[idx] for idx in possibly_batched_index]

File "C:\Users\wks.conda\envs\AdelaiDet\lib\site-packages\torch\utils\data_utils\fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "d:\cnn\detect2\detectron2\data\common.py", line 41, in getitem
data = self._map_func(self._dataset[cur_idx])
File "D:\CNNW\AdelaiDet\adet\data\dataset_mapper.py", line 137, in call
basis_sem_gt = np.load(basis_sem_path)["mask"]
File "C:\Users\wks.conda\envs\AdelaiDet\lib\site-packages\numpy\lib\npyio.py", line 428, in load
fid = open(os_fspath(file), "rb")
FileNotFoundError: [Errno 2] No such file or directory: 'D:\labelme\zengimg\6122.npz'

Question about center sampling, lower AP with FCOS.CENTER_SAMPLE=True

On my own dataset (about 8000 training images and 700 validation images with 65 classes, both in COCO style), I found that performance with center sampling is always about 1 point of AP lower than without it. After trying different training schedules and reading the code very carefully, I can't find the reason.
I also tried mmdetection with center sampling enabled, and there center sampling is always better, which makes sense.
I found that without center sampling, the AP of some classes, like "Pig" or "Racing Cars", is higher.

Could you please give me some hints about this situation, or any suggestions for debugging? Thanks.

I have a big problem.

1. Why does your code change my CUDA environment variables? My company's server cannot use TensorFlow anymore.
2. Why can't I delete the AdelaiDet files on my company's server? It behaves like a computer virus.
3. Please tell me why. I was badly scolded by my leader!!!

MEInst release

Hello, MEInst is good work. The paper says the code is available at https://git.io/AdelaiDet, but I can't find it at that link. When will the code be released? Thank you!

Connection closed by peer

Hello!

This is a kind of tricky problem, and I'm not sure whether anyone else has encountered it, but I hope to get some help anyway.

So I made some modifications on top of FCOS, including some CPU-side data processing. I can train the network normally when using 2 V100 GPUs. Now I want to use R101, so I migrated to 4 TITAN GPUs. Every time I start training, it throws an error around iteration 2100:
[screenshot of the error]

Here are some related posts, but none of them solves the problem in my case:
facebookresearch/detectron2#817 (most related)

pytorch/pytorch#30439
pytorch/pytorch#16941

A question about ABCNet

Hello, I have a question about ABCNet.

Here are the sentences just before Section 3 (Experiments): "Note that during training, we directly use the generated Bezier curve GT to extract the RoI features. Therefore the detection branch does not affect the recognition branch. In the inference phase, the RoI region is replaced by the detecting Bezier curve described in Section 2.1."

Do you mean that the detection branch and the recognition branch are separate during training? Is ABCNet end-to-end? I don't know what the total loss of ABCNet is.

I would really appreciate it if you could answer. Thank you!

ABCNet Demo gives error:

[05/16 21:21:18 detectron2]: Arguments: Namespace(confidence_threshold=0.5, config_file='configs/BAText/CTW1500/attn_R_50.yaml', input=['input.jpg'], opts=['MODEL.WEIGHTS', 'ctw1500_attn_R_50.pth'], output=None, video_input=None, webcam=False)
WARNING [05/16 21:21:18 d2.config.compat]: Config 'configs/BAText/CTW1500/attn_R_50.yaml' has no VERSION. Assuming it to be compatible with latest v2.
Traceback (most recent call last):
  File "demo/demo.py", line 72, in <module>
    cfg = setup_cfg(args)
  File "demo/demo.py", line 23, in setup_cfg
    cfg.merge_from_file(args.config_file)
  File "/usr/local/lib/python3.6/dist-packages/detectron2/config/config.py", line 49, in merge_from_file
    self.merge_from_other_cfg(loaded_cfg)
  File "/usr/local/lib/python3.6/dist-packages/fvcore/common/config.py", line 118, in merge_from_other_cfg
    return super().merge_from_other_cfg(cfg_other)
  File "/usr/local/lib/python3.6/dist-packages/yacs/config.py", line 217, in merge_from_other_cfg
    _merge_a_into_b(cfg_other, self, self, [])
  File "/usr/local/lib/python3.6/dist-packages/yacs/config.py", line 464, in _merge_a_into_b
    _merge_a_into_b(v, b[k], root, key_list + [k])
  File "/usr/local/lib/python3.6/dist-packages/yacs/config.py", line 477, in _merge_a_into_b
    raise KeyError("Non-existent config key: {}".format(full_key))
KeyError: 'Non-existent config key: INPUT.HFLIP'
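For what it's worth, a guess at the cause (the module path below is an assumption based on the repo layout): INPUT.HFLIP is an AdelaiDet-specific key, so the config has to be built from AdelaiDet's extended defaults rather than vanilla detectron2's.

    from adet.config import get_cfg  # instead of detectron2.config.get_cfg

    cfg = get_cfg()
    cfg.merge_from_file("configs/BAText/CTW1500/attn_R_50.yaml")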

[CondInst] AP=31.3 when F-mask not available

Thanks for your great work! As shown in Table 3, using only the rel. coords achieves 31.3 AP, which is quite interesting! So have you printed out the masks and controller weights with this setting? Will the region around (x, y) be activated for the instance mask?

Box AP of CondInst

Since CondInst is built on the object detector FCOS, I am interested in the box AP of CondInst.

SOLO release

First, thank you for your work; it is really useful.
I would like to ask when you are planning to release the SOLO network with the corresponding model weights.

BlendMask RT trained with BN gets an AP of 0

Here are the train and eval commands I am using:

python3 tools/train_net.py \
    --config-file configs/BlendMask/RT_R_50_4x_bn.yaml \
    --num-gpus 3 --eval-only \
    MODEL.WEIGHTS output/blendmask/RT_R_50_4x/model_0294999.pth

python3 tools/train_net.py \
    --config-file configs/BlendMask/RT_R_50_4x_bn.yaml \
    --num-gpus 3

I am using 3 GPUs to train, and I have changed the learning rate along with the batch size:

cat configs/BlendMask/Base-BlendMask.yaml 
MODEL:
  META_ARCHITECTURE: "BlendMask"
  MASK_ON: True
  BACKBONE:
    NAME: "build_fcos_resnet_fpn_backbone"
  RESNETS:
    OUT_FEATURES: ["res3", "res4", "res5"]
  FPN:
    IN_FEATURES: ["res3", "res4", "res5"]
  PROPOSAL_GENERATOR:
    NAME: "FCOS"
  BASIS_MODULE:
    LOSS_ON: True
  PANOPTIC_FPN:
    COMBINE:
      ENABLED: False
  FCOS:
    THRESH_WITH_CTR: True
    USE_SCALE: False
DATASETS:
  TRAIN: ("coco_2017_train",)
  TEST: ("coco_2017_val",)
SOLVER:
  IMS_PER_BATCH: 6
  BASE_LR: 0.005  # Note that RetinaNet uses a different default learning rate
  STEPS: (60000, 80000)
  MAX_ITER: 90000
INPUT:
  MIN_SIZE_TRAIN: (640, 672, 704, 736, 768, 800)

The model config I changed:

_BASE_: "Base-550.yaml"
INPUT:
  MIN_SIZE_TRAIN: (256, 288, 320, 352, 384, 416, 448, 480, 512, 544, 576, 608)
  MAX_SIZE_TRAIN: 900
  MAX_SIZE_TEST: 736
  MIN_SIZE_TEST: 512
MODEL:
  WEIGHTS: "detectron2://ImageNetPretrained/MSRA/R-50.pkl"
  RESNETS:
    DEPTH: 50
    NORM: "BN"
  BACKBONE:
    FREEZE_AT: -1
SOLVER:
  STEPS: (300000, 340000)
  MAX_ITER: 360000
OUTPUT_DIR: "output/blendmask/RT_R_50_4x"

Does BlendMask RT really work?

About 169 parameters in CondInst

Conditional Convolutions for Instance Segmentation brings a new paradigm to instance segmentation, but I don't quite understand some of the details in the paper. The following three questions are my doubts; can you give me some advice?

  1. Are all masks computed from the positive samples generated by FCOS? (subsection 2.2, Network Outputs and Training Targets)

  2. Why are there 169 parameters (the generated vector) in the controller head, and how are the 169 parameters assigned to the three 1 × 1 convolutions with 8 channels in the mask FCN head? How many parameters are there in each of the three 1 × 1 convolution layers? (subsection 2.2; see the arithmetic sketch after this list)

  3. How are the three conditional convolution layers and the corresponding parameters (169 in total) computed in the mask FCN head? (subsection 2.4)

    Thank you very much for your time.
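For reference, the count works out if one assumes the paper's setting: an 8-channel mask-branch output plus 2 relative-coordinate channels as input, three 1 × 1 convs with 8 channels, and a single-channel mask output (my arithmetic, to be confirmed by the authors):

    in_ch, hidden, out_ch = 8 + 2, 8, 1

    conv1 = in_ch * hidden + hidden   # 88 (weights + biases)
    conv2 = hidden * hidden + hidden  # 72
    conv3 = hidden * out_ch + out_ch  # 9
    print(conv1 + conv2 + conv3)      # 169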

Trying to train BlendMask with a custom dataset

First, thanks for your great work!
I am using BlendMask on my custom dataset, which contains 10 classes. When training starts, it raises the following error:

[05/11 12:50:28 d2.engine.train_loop]: Starting training from iteration 0
ERROR [05/11 12:50:28 d2.engine.train_loop]: Exception during training:
Traceback (most recent call last):
  File "/content/detectron2/detectron2/engine/train_loop.py", line 132, in train
    self.run_step()
  File "/content/detectron2/detectron2/engine/train_loop.py", line 215, in run_step
    loss_dict = self.model(data)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/adet/modeling/blendmask/blendmask.py", line 107, in forward
    basis_out, basis_losses = self.basis_module(features, basis_sem)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/adet/modeling/blendmask/basis_module.py", line 96, in forward
    gt_sem = targets.unsqueeze(1).float()
AttributeError: 'NoneType' object has no attribute 'unsqueeze'

Could you give me some advice on how to get rid of this? I checked the code, and the comment says targets are resized to reduce memory. Since the default target is None, it seems basis_module does not handle None targets, I guess?
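One workaround I would try in the meantime (hedged; based only on the BASIS_MODULE.LOSS_ON flag visible in Base-BlendMask.yaml, and the adet.config module path is an assumption): disable the auxiliary semantic loss when no semantic targets are available.

    from adet.config import get_cfg  # module path assumed from the repo layout

    cfg = get_cfg()
    cfg.merge_from_file("configs/BlendMask/Base-BlendMask.yaml")
    # Skip the auxiliary semantic loss so no basis_sem targets are required.
    cfg.MODEL.BASIS_MODULE.LOSS_ON = False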

Question about decay factor in SOLOv2 Matrix NMS

Hi,

the new SOLOv2 is really impressive, especially the idea of the learnable mask kernel and Matrix NMS. After reading the paper, I'm a little confused about the decay factor in Matrix NMS. Did you compare the experimental results of directly using decay = 1 - iou (as in soft-NMS) versus decay = (1 - iou) / (1 - iou_cmax)? Thank you very much!
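For context, my understanding of the linear-decay variant (a sketch following the paper's formulas, not the released code):

    import torch

    def linear_matrix_nms(scores, ious):
        # scores: (N,) sorted in descending order; ious: (N, N) pairwise mask IoUs.
        n = scores.numel()
        iou = ious.triu(diagonal=1)            # keep IoU with higher-scoring masks only
        # iou_cmax[i]: the largest IoU that suppressor i itself suffers
        iou_cmax = iou.max(dim=0).values
        iou_cmax = iou_cmax.expand(n, n).transpose(0, 1)
        decay = (1 - iou) / (1 - iou_cmax)     # the (1 - iou) / (1 - iou_cmax) factor
        return scores * decay.min(dim=0).values

Dropping the denominator gives exactly the soft-NMS-style decay = 1 - iou, so the two variants in my question differ only in that compensation term.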

Best Regards,
notabigfish

On the rel. coord. of CondInst

Hi~ @tianzhi0549
I am trying to implement the rel. coord. in CondInst.

For a location of interest (x, y) on the input image:

    x_range = torch.arange(W_mask)
    y_range = torch.arange(H_mask)
    y_grid, x_grid = torch.meshgrid(y_range, x_range)
    y_rel_coord = normalize_to(y_grid - y / mask_stride, -1, 1)  # normalize_to is pseudo-code
    x_rel_coord = normalize_to(x_grid - x / mask_stride, -1, 1)
    rel_coord = torch.stack([x_rel_coord, y_rel_coord])

Am I right? Could you provide the official code snippet for the rel. coord.? Thanks!
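In the meantime, here is a runnable version of the sketch above (still my reading of the paper, not the official code; the final normalization is a crude placeholder):

    import torch

    def rel_coord_map(x, y, h_mask, w_mask, mask_stride):
        ys = torch.arange(h_mask, dtype=torch.float32)
        xs = torch.arange(w_mask, dtype=torch.float32)
        y_grid, x_grid = torch.meshgrid(ys, xs, indexing="ij")
        y_rel = y_grid - y / mask_stride
        x_rel = x_grid - x / mask_stride
        rel = torch.stack([x_rel, y_rel])  # (2, H_mask, W_mask)
        return rel / max(h_mask, w_mask)   # scale roughly into [-1, 1]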

Question about the semantic segmentation loss in CondInst

Thanks for your great work! I noticed that adding an auxiliary semantic segmentation loss to CondInst boosts the overall performance by about 1 mAP, which is quite a promising result! I want to ask:

  1. Where do you add the semantic segmentation loss? After P3 with a 1x1 conv (just like YOLACT), after F_{mask}, or somewhere else? (A sketch of the YOLACT-style variant follows below.)
  2. How do you calculate the semantic segmentation loss? Like YOLACT, with a sigmoid over 80 classes, or like HTC, with 183 classes?

Waiting for your reply!
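For concreteness, here is a hedged sketch of the first variant from question 1 (a YOLACT-style auxiliary head: one 1x1 conv on P3 with per-class sigmoid BCE; all names are mine, and this is not CondInst's confirmed design):

    import torch.nn as nn
    import torch.nn.functional as F

    class SemSegAux(nn.Module):
        def __init__(self, in_channels=256, num_classes=80):
            super().__init__()
            self.pred = nn.Conv2d(in_channels, num_classes, 1)

        def forward(self, p3, gt_masks):
            # gt_masks: (N, num_classes, H, W) binary union of instance masks per class
            logits = self.pred(p3)
            gt = F.interpolate(gt_masks, size=logits.shape[-2:], mode="nearest")
            return F.binary_cross_entropy_with_logits(logits, gt)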
