
Comments (13)

youngwanLEE avatar youngwanLEE commented on June 8, 2024 1

@wirstrom
That problem occurs because the pre-trained weight file includes the optimizer state that was used during training, so your program loaded that previous state.

I modified the centermask2-lite-V-39-eSE-FPN-ms-4x.pth (pretrained weight).

You have to re-download the weight file and try it again.
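
If re-downloading is not an option, roughly the same effect can probably be achieved locally by dropping the training state from the checkpoint. The following is a minimal sketch, assuming the checkpoint is a dict that keeps the weights under "model" and the training state under keys such as "optimizer", "scheduler" and "iteration" (check the keys in your own file). `strip_training_state` is a hypothetical helper, not part of centermask2.

```python
from typing import Any, Dict, Iterable

# Keys assumed to hold training state rather than model weights.
TRAINING_STATE_KEYS = ("optimizer", "scheduler", "iteration")


def strip_training_state(
    checkpoint: Dict[str, Any],
    drop_keys: Iterable[str] = TRAINING_STATE_KEYS,
) -> Dict[str, Any]:
    """Return a shallow copy of `checkpoint` without the training-state keys."""
    drop = set(drop_keys)
    return {key: value for key, value in checkpoint.items() if key not in drop}


if __name__ == "__main__":
    # Stand-in for the dict a real checkpoint file would contain.
    checkpoint = {
        "model": {"backbone.weight": [0.1, 0.2]},
        "optimizer": {"state": {}, "param_groups": []},
        "scheduler": {"last_epoch": 100},
        "iteration": 100,
    }
    cleaned = strip_training_state(checkpoint)
    print(sorted(cleaned))
```

With PyTorch, the dict would come from `torch.load(path, map_location="cpu")` and the cleaned dict would be written back with `torch.save(cleaned, new_path)`.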

from centermask2.

wirstrom avatar wirstrom commented on June 8, 2024

Update:
If I set cls_agnostic_mask = False as suggested in #1, I get the following error instead.

RuntimeError                              Traceback (most recent call last)
<ipython-input-2-771faee0e82e> in <module>()
     59 torch.save(pth, "centermask2-lite-V-39-eSE-FPN-ms-4x.pth")
     60 
---> 61 train()

5 frames
<ipython-input-2-771faee0e82e> in train()
     52       )
     53 
---> 54   trainer.train()
     55 
     56 # Make training start from iteration 0

/content/centermask2/train_net.py in train(self)
     95             OrderedDict of results, if evaluation is enabled. Otherwise None.
     96         """
---> 97         self.train_loop(self.start_iter, self.max_iter)
     98         if hasattr(self, "_last_eval_results") and comm.is_main_process():
     99             verify_results(self.cfg, self._last_eval_results)

/content/centermask2/train_net.py in train_loop(self, start_iter, max_iter)
     84             for self.iter in range(start_iter, max_iter):
     85                 self.before_step()
---> 86                 self.run_step()
     87                 self.after_step()
     88             self.after_train()

/usr/local/lib/python3.6/dist-packages/detectron2/engine/train_loop.py in run_step(self)
    232         wrap the optimizer with your custom `step()` method.
    233         """
--> 234         self.optimizer.step()
    235 
    236     def _detect_anomaly(self, losses, loss_dict):

/usr/local/lib/python3.6/dist-packages/torch/optim/lr_scheduler.py in wrapper(*args, **kwargs)
     64                 instance._step_count += 1
     65                 wrapped = func.__get__(instance, cls)
---> 66                 return wrapped(*args, **kwargs)
     67 
     68             # Note that the returned function here is no longer a bound method,

/usr/local/lib/python3.6/dist-packages/torch/optim/sgd.py in step(self, closure)
    104                         d_p = buf
    105 
--> 106                 p.data.add_(-group['lr'], d_p)
    107 
    108         return loss

RuntimeError: output with shape [1, 128, 1, 1] doesn't match the broadcast shape [80, 128, 1, 1]

I tried changing cfg.MODEL.FCOS.NUM_CLASSES and cfg.MODEL.RETINANET.NUM_CLASSES from 80 to 1, but I still get the same error.
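
The shapes in the message hint at a likely cause: the parameter being updated has 1 output channel, while the tensor being added to it has 80, which would match a momentum buffer restored from an 80-class optimizer state. An in-place addition cannot broadcast its output, so the update fails. Below is a minimal sketch of a check for this situation, assuming the parameter shapes and the corresponding buffer shapes have been collected as plain tuples. The function name, the input layout, and the parameter name are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_stale_buffers(
    param_shapes: Dict[str, Shape],
    buffer_shapes: Dict[str, Shape],
) -> List[str]:
    """Return names whose optimizer buffer shape differs from the parameter shape."""
    return [
        name
        for name, shape in param_shapes.items()
        if name in buffer_shapes and buffer_shapes[name] != shape
    ]


if __name__ == "__main__":
    # Shapes taken from the error message; the parameter name is made up.
    params = {"cls_score.weight": (1, 128, 1, 1)}
    buffers = {"cls_score.weight": (80, 128, 1, 1)}
    print(find_stale_buffers(params, buffers))
```

Any name this reports would point to optimizer state that no longer fits the model and should not be loaded.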


youngwanLEE avatar youngwanLEE commented on June 8, 2024

@wirstrom

I modified the code.

Now you can pull the updated code and try it again.


Nihar1989 avatar Nihar1989 commented on June 8, 2024

Hi,

It's not working. I tried it today.


youngwanLEE avatar youngwanLEE commented on June 8, 2024

@Nihar1989 What's your problem?

Did you pull the updated code?

I already tested the code on person detection.


wirstrom avatar wirstrom commented on June 8, 2024

Thank you.
Now the first issue (RuntimeError: There were no tensor arguments to this ...) seems to be solved.
I am, however, getting the following error:

RuntimeError                              Traceback (most recent call last)
<ipython-input-9-7fb5229ee163> in <module>()
     75 # pth = torch.load(pth_file)
     76 
---> 77 train()

<ipython-input-9-7fb5229ee163> in train()
     64   #     )
     65 
---> 66   trainer.train()
     67 
     68 # Make training start from iteration 0

/content/centermask2/train_net.py in train(self)
     95             OrderedDict of results, if evaluation is enabled. Otherwise None.
     96         """
---> 97         self.train_loop(self.start_iter, self.max_iter)
     98         if hasattr(self, "_last_eval_results") and comm.is_main_process():
     99             verify_results(self.cfg, self._last_eval_results)

/content/centermask2/train_net.py in train_loop(self, start_iter, max_iter)
     84             for self.iter in range(start_iter, max_iter):
     85                 self.before_step()
---> 86                 self.run_step()
     87                 self.after_step()
     88             self.after_train()

/usr/local/lib/python3.6/dist-packages/detectron2/engine/train_loop.py in run_step(self)
    232         wrap the optimizer with your custom `step()` method.
    233         """
--> 234         self.optimizer.step()
    235 
    236     def _detect_anomaly(self, losses, loss_dict):

/usr/local/lib/python3.6/dist-packages/torch/optim/lr_scheduler.py in wrapper(*args, **kwargs)
     64                 instance._step_count += 1
     65                 wrapped = func.__get__(instance, cls)
---> 66                 return wrapped(*args, **kwargs)
     67 
     68             # Note that the returned function here is no longer a bound method,

/usr/local/lib/python3.6/dist-packages/torch/optim/sgd.py in step(self, closure)
    104                         d_p = buf
    105 
--> 106                 p.data.add_(-group['lr'], d_p)
    107 
    108         return loss

RuntimeError: output with shape [1, 128, 3, 3] doesn't match the broadcast shape [80, 128, 3, 3]


youngwanLEE avatar youngwanLEE commented on June 8, 2024

Did you set cfg.MODEL.FCOS.NUM_CLASSES = 1?


wirstrom avatar wirstrom commented on June 8, 2024

Yes.


wirstrom avatar wirstrom commented on June 8, 2024

It works now :)
Thank you very much!


youngwanLEE avatar youngwanLEE commented on June 8, 2024

Have fun :)


rohanshingade avatar rohanshingade commented on June 8, 2024

@youngwanLEE can you update the centermask2-V-99-eSE-FPN-ms-3x.pth weight as well? I'm facing issues while fine-tuning the model with the person class only. The changes I made in the config file are as follows:

cfg.MODEL.FCOS.NUM_CLASSES = 1
cfg.MODEL.RETINANET.NUM_CLASSES = 1
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1

To register the custom dataset, I used:

register_coco_instances("human_dataset_val", {}, "/path/to/person.json", "/path/to/personimages/")
MetadataCatalog.get("human_dataset_val").set(thing_classes = ["person"])

I'm getting the following error:

[11/23 12:45:25 d2.evaluation.evaluator]: Inference done 159/176. 0.2748 s / img. ETA=0:00:04
[11/23 12:45:30 d2.evaluation.evaluator]: Total inference time: 0:00:47.439329 (0.277423 s / img per device, on 1 devices)
[11/23 12:45:30 d2.evaluation.evaluator]: Total inference pure compute time: 0:00:47 (0.275423 s / img per device, on 1 devices)
No predictions from the model!
No predictions from the model!
[11/23 12:45:30 d2.engine.defaults]: Evaluation results for human_dataset_val in csv format:
[11/23 12:45:30 d2.evaluation.testing]: copypaste: Task: bbox
[11/23 12:45:30 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[11/23 12:45:30 d2.evaluation.testing]: copypaste: nan,nan,nan,nan,nan,nan
[11/23 12:45:30 d2.evaluation.testing]: copypaste: Task: segm
[11/23 12:45:30 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[11/23 12:45:30 d2.evaluation.testing]: copypaste: nan,nan,nan,nan,nan,nan


Paragjain10 avatar Paragjain10 commented on June 8, 2024


Hello @rohanshingade,

Could you figure out what was causing the nan values to be generated?

I am facing a similar problem. It would be helpful if you could let me know what the solution was.


rohanshingade avatar rohanshingade commented on June 8, 2024

Does your dataset contain only labelled images?

If yes then,

def _eval_predictions(self, tasks, predictions):
    """
    Evaluate predictions on the given tasks.
    Fill self._results with the metrics of the tasks.
    """
    self._logger.info("Preparing results for COCO format ...")
    coco_results = list(itertools.chain(*[x["instances"] for x in predictions]))

    # Keep only detections with category_id 0 (person) and discard the rest.
    coco_results = [r for r in coco_results if r["category_id"] == 0]

Make the changes above in centermask2/centermask/evaluation/coco_evaluation.py.
This keeps only the person detections and discards the rest of the predicted output.

I did this a long time ago, so I can't recall exactly what the issue was.
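
The filtering step can also be written as a standalone function, which makes it easier to try outside the evaluator. This is a sketch under the same assumption as the snippet above, namely that each result is a dict with a `category_id` key and that the person class has id 0. `keep_category` is a hypothetical helper name, and the sample results are made up.

```python
from typing import Any, Dict, List


def keep_category(
    coco_results: List[Dict[str, Any]], category_id: int = 0
) -> List[Dict[str, Any]]:
    """Keep only the results whose `category_id` equals the given id."""
    return [r for r in coco_results if r["category_id"] == category_id]


if __name__ == "__main__":
    # Illustrative detections; only the two with category_id 0 should survive.
    results = [
        {"image_id": 1, "category_id": 0, "score": 0.9},
        {"image_id": 1, "category_id": 17, "score": 0.4},
        {"image_id": 2, "category_id": 0, "score": 0.7},
    ]
    print(len(keep_category(results)))
```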

