
mseg-semantic's Introduction

Try out our models in Google Colab on your own images!

This repo includes the semantic segmentation pre-trained models, training and inference code for the paper:

MSeg: A Composite Dataset for Multi-domain Semantic Segmentation (CVPR 2020, Official Repo) [CVPR PDF] [TPAMI Journal PDF]
John Lambert*, Zhuang Liu*, Ozan Sener, James Hays, Vladlen Koltun
Presented at CVPR 2020. Link to MSeg Video (3min)

NEWS:

  • [Dec. 2021]: An updated journal-length version of our work is now available on ArXiv here.

This repo is the second of 4 repos that introduce our work. It provides utilities to train semantic segmentation models, using an HRNet-W48 or PSPNet backbone, sufficient to train a winning entry on the WildDash benchmark.

  • mseg-api: utilities to download the MSeg dataset, prepare the data on disk in a unified taxonomy, and map labels to the unified taxonomy on the fly during training.
  • mseg-mturk: utilities to perform large-scale Mechanical Turk re-labeling

One additional repo will be introduced in January 2021:

  • mseg-panoptic: provides Panoptic-FPN and Mask-RCNN training, based on Detectron2

How fast can your models run?

Our 480p MSeg model that accepts 473x473 px crops can run at 24.04 fps on a Quadro P5000 GPU at single-scale inference.

| Model | Crop Size | Frame Rate (fps), Quadro P5000 | Frame Rate (fps), V100 |
| :-- | :-- | :-- | :-- |
| MSeg-3m-480p | 473 x 473 | 24.04 | 8.26 |
| MSeg-3m-720p | 593 x 593 | 16.85 | 8.18 |
| MSeg-3m-1080p | 713 x 713 | 12.37 | 7.85 |

Dependencies

Install the mseg module from mseg-api.

Install the MSeg-Semantic module:

  • mseg_semantic can be installed as a python package using

      pip install -e /path_to_root_directory_of_the_repo/
    

Make sure that python -c "import mseg_semantic; print('hello world')" runs without errors, and you are good to go!

MSeg Pre-trained Models

Each model is 528 MB in size. We provide download links and testing results (single-scale inference) below:

Abbreviated Dataset Names: VOC = PASCAL VOC, PC = PASCAL Context, WD = WildDash, SN = ScanNet

| Model | Training Set | Training Taxonomy | VOC mIoU | PC mIoU | CamVid mIoU | WD mIoU | KITTI mIoU | SN mIoU | h. mean | Download Link |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| MSeg (1M) | MSeg train | Universal | 70.7 | 42.7 | 83.3 | 62.0 | 67.0 | 48.2 | 59.2 | Google Drive |
| MSeg (3M)-480p | MSeg train | Universal | 76.4 | 45.9 | 81.2 | 62.7 | 68.2 | 49.5 | 61.2 | Google Drive |
| MSeg (3M)-720p | MSeg train | Universal | 74.7 | 44.0 | 83.5 | 60.4 | 67.9 | 47.7 | 59.8 | Google Drive |
| MSeg (3M)-1080p | MSeg train | Universal | 72.0 | 44.0 | 84.5 | 59.9 | 66.5 | 49.5 | 59.8 | Google Drive |

Inference: Using our pre-trained models

We show how to perform inference here in our Google Colab.

Multi-scale inference greatly improves the smoothness of predictions, so our demo scripts use a multi-scale config by default. While we train at 1080p, our predictions are often visually better when we feed in test images at 360p resolution.

If you have video input, and you would like to make predictions on each frame in the universal taxonomy, please set:

input_file=/path/to/my/video.mp4

If you have a set of images in a directory, and you would like to make a prediction in the universal taxonomy for each image, please set:

input_file=/path/to/my/directory

If you have as input a single image, and you would like to make a prediction in the universal taxonomy, please set:

input_file=/path/to/my/image

Now, run our demo script:

model_name=mseg-3m
model_path=/path/to/downloaded/model/from/google/drive
config=mseg_semantic/config/test/default_config_360_ms.yaml
python -u mseg_semantic/tool/universal_demo.py \
  --config=${config} model_name ${model_name} model_path ${model_path} input_file ${input_file}

Testing a Model Trained in the Universal Taxonomy

To compute mIoU scores on all train and test datasets, run the following (single-scale inference):

cd mseg_semantic/scripts
./eval_models.sh

This will launch several hundred SLURM jobs, each of the following form:

python mseg_semantic/tool/test_universal_tax.py --config=$config_path dataset $dataset_name model_path $model_path model_name $model_name

Please expect this to take many hours, depending upon your SLURM cluster capacity.

Testing a Model Trained in the Oracle Taxonomy

"Oracle" models are trained in a test dataset's taxonomy, on its train split. To compute mIoU scores on the test dataset's val or test set, please run the following:

This will launch 5 SLURM jobs, each of the following form:

python mseg_semantic/tool/test_oracle_tax.py

Citing MSeg

If you find this code useful for your research, please cite:

@InProceedings{MSeg_2020_CVPR,
author = {Lambert, John and Liu, Zhuang and Sener, Ozan and Hays, James and Koltun, Vladlen},
title = {{MSeg}: A Composite Dataset for Multi-domain Semantic Segmentation},
booktitle = {Computer Vision and Pattern Recognition (CVPR)},
year = {2020}
}

@article{Lambert23tpami_MSeg,
  author={Lambert, John and Liu, Zhuang and Sener, Ozan and Hays, James and Koltun, Vladlen},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
  title={MSeg: A Composite Dataset for Multi-Domain Semantic Segmentation}, 
  year={2023},
  volume={45},
  number={1},
  pages={796-810},
  doi={10.1109/TPAMI.2022.3151200}
}

Many thanks to Hengshuang Zhao for his semseg repo, on which much of this repository is based.

Other baseline models from our paper:

Below we report the performance of individually trained models that serve as baselines. Inference is performed at single scale.

You can obtain the following table by running

python mseg_semantic/scripts/collect_results.py --regime zero_shot --scale ss --output_format markdown
python mseg_semantic/scripts/collect_results.py --regime oracle --scale ss --output_format markdown

after ./mseg_semantic/scripts/eval_models.sh:

Abbreviated Dataset Names: VOC = PASCAL VOC, PC = PASCAL Context, WD = WildDash, SN = ScanNet

| Model | Training Set | Training Taxonomy | VOC mIoU | PC mIoU | CamVid mIoU | WD mIoU | KITTI mIoU | SN mIoU | h. mean | Download Link |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| ADE20K (1M) | ADE20K train | Universal | 35.4 | 23.9 | 52.6 | 38.6 | 41.6 | 42.9 | 36.9 | Google Drive |
| BDD (1M) | BDD train | Universal | 14.4 | 7.1 | 70.7 | 52.2 | 54.5 | 1.4 | 6.1 | Google Drive |
| Cityscapes (1M) | Cityscapes train | Universal | 13.3 | 6.8 | 76.1 | 30.1 | 57.6 | 1.7 | 6.8 | Google Drive |
| COCO (1M) | COCO train | Universal | 73.4 | 43.3 | 58.7 | 38.2 | 47.6 | 33.4 | 45.8 | Google Drive |
| IDD (1M) | IDD train | Universal | 14.6 | 6.5 | 72.1 | 41.2 | 51.0 | 1.6 | 6.5 | Google Drive |
| Mapillary (1M) | Mapillary train | Universal | 22.5 | 13.6 | 82.1 | 55.4 | 67.7 | 2.1 | 9.3 | Google Drive |
| SUN RGB-D (1M) | SUN RGB-D train | Universal | 10.0 | 4.3 | 0.1 | 1.9 | 1.1 | 42.6 | 0.3 | Google Drive |
| MSeg (1M) | MSeg train | Universal | 70.7 | 42.7 | 83.3 | 62.0 | 67.0 | 48.2 | 59.2 | Google Drive |
| MSeg Mix w/o relabeling (1M) | MSeg train | Universal | 70.2 | 42.7 | 82.0 | 62.7 | 65.5 | 43.2 | 57.6 | Google Drive |
| MGDA Baseline (1M) | MSeg train | Universal | 68.5 | 41.7 | 82.9 | 61.1 | 66.7 | 46.7 | 58.0 | Google Drive |
| MSeg (3M)-480p | MSeg train | Universal | 76.4 | 45.9 | 81.2 | 62.7 | 68.2 | 49.5 | 61.2 | Google Drive |
| MSeg (3M)-720p | MSeg train | Universal | 74.7 | 44.0 | 83.5 | 60.4 | 67.9 | 47.7 | 59.8 | Google Drive |
| MSeg (3M)-1080p | MSeg train | Universal | 72.0 | 44.0 | 84.5 | 59.9 | 66.5 | 49.5 | 59.8 | Google Drive |
| Naive Mix Baseline (1M) | MSeg train | Naive | | | | | | | | Google Drive |
| Oracle (1M) | | | 77.8 | 45.8 | 78.8 | -** | 58.4 | 62.3 | - | |

**WildDash has no training set, so an "oracle" model cannot be trained.

Oracle Model Download Links

Note that 7 of the models listed above have an identical number of output classes (194): the models that represent a single training dataset's performance -- ADE20K (1M), BDD (1M), Cityscapes (1M), COCO (1M), IDD (1M), Mapillary (1M), and SUN RGB-D (1M). When we train a baseline model on a single dataset, we train it in the universal taxonomy (with 194 classes). If we did not do so, we would need to specify 7*6=42 mappings (which would be unbelievably tedious and also fairly redundant), since we measure each model's performance by zero-shot cross-dataset generalization: 7 training datasets, each with its own taxonomy, would each need a mapping to each of the 6 test datasets.

By training each single-dataset baseline in the universal taxonomy, we only need to specify 7+6=13 mappings for this table (each training dataset's taxonomy to the universal taxonomy, and the universal taxonomy to each test dataset's taxonomy).

Results on the Training Datasets

You can obtain the following table by running

python collect_results.py --regime training_datasets --scale ss --output_format markdown

after ./eval_models.sh:

| Model Name | COCO | ADE20K | Mapillary | IDD | BDD | Cityscapes | SUN RGB-D | h. mean |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| COCO-1m | 52.7 | 19.1 | 28.4 | 31.1 | 44.9 | 46.9 | 29.6 | 32.4 |
| ADE20K-1m | 14.6 | 45.6 | 24.2 | 26.8 | 40.7 | 44.3 | 36.0 | 28.7 |
| Mapillary-1m | 7.0 | 6.2 | 53.0 | 50.6 | 59.3 | 71.9 | 0.3 | 1.7 |
| IDD-1m | 3.2 | 3.0 | 24.6 | 64.9 | 42.4 | 48.0 | 0.4 | 2.3 |
| BDD-1m | 3.8 | 4.2 | 23.2 | 32.3 | 63.4 | 58.1 | 0.3 | 1.6 |
| Cityscapes-1m | 3.4 | 3.1 | 22.1 | 30.1 | 44.1 | 77.5 | 0.2 | 1.2 |
| SUN RGBD-1m | 3.4 | 7.0 | 1.1 | 1.0 | 2.2 | 2.6 | 43.0 | 2.1 |
| MSeg-1m | 50.7 | 45.7 | 53.1 | 65.3 | 68.5 | 80.4 | 50.3 | 57.1 |
| MSeg-1m-w/o relabeling | 50.4 | 45.4 | 53.1 | 65.1 | 66.5 | 79.5 | 49.9 | 56.6 |
| MSeg-MGDA-1m | 48.1 | 43.7 | 51.6 | 64.1 | 67.2 | 78.2 | 49.9 | 55.4 |
| MSeg-3m-480p | 56.1 | 49.6 | 53.5 | 64.5 | 67.8 | 79.9 | 49.2 | 58.5 |
| MSeg-3m-720p | 53.3 | 48.2 | 53.5 | 64.8 | 68.6 | 79.8 | 49.3 | 57.8 |
| MSeg-3m-1080p | 53.6 | 49.2 | 54.9 | 66.3 | 69.1 | 81.5 | 50.1 | 58.8 |

Experiment Settings

We use the HRNetV2-W48 architecture. All images are resized to 1080p (shorter side = 1080) at training time before a crop is taken.

We run inference with the shorter side of each test image at three resolutions (360p, 720p, 1080p), and take the max among these 3 possible resolutions. Note that in the original semseg repo, the author specifies the longer side of an image, whereas we specify the shorter side. Batch size is set to 35.
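
As a rough illustration of this multi-resolution scheme, here is a minimal PyTorch sketch (not the repo's exact implementation; the model interface and tensor shapes are assumptions): resize the shorter side of the image to each test resolution, run the network, upsample the class scores back to the original size, and keep the per-pixel maximum over resolutions.

import torch
import torch.nn.functional as F

@torch.no_grad()
def multi_resolution_scores(model, image_bchw, short_sides=(360, 720, 1080)):
    """image_bchw: (1, 3, H, W) float tensor; returns (1, C, H, W) fused class scores."""
    _, _, H, W = image_bchw.shape
    fused = None
    for s in short_sides:
        scale = s / min(H, W)  # resize so that the shorter side equals s
        resized = F.interpolate(image_bchw, scale_factor=scale, mode="bilinear", align_corners=True)
        scores = model(resized)  # (1, C, h, w) class scores from the segmentation network
        scores = F.interpolate(scores, size=(H, W), mode="bilinear", align_corners=True)
        fused = scores if fused is None else torch.maximum(fused, scores)
    return fused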

We generally follow the recommendations of Zhao et al.: our data augmentation consists of random scaling in the range [0.5, 2.0] and random rotation in the range [-10, 10] degrees. We use SGD with momentum 0.9 and weight decay 1e-4, and a polynomial learning rate schedule with power 0.9; the base learning rate is set to 1e-2. An auxiliary cross-entropy (CE) loss on intermediate activations is added to the main loss as a linear combination with weight 0.4. In our data, we use 255 as an ignore/unlabeled flag for the CE loss. We use PyTorch's Distributed Data Parallel (DDP) package for multiprocessing, with the NCCL backend. We use apex opt_level 'O0' and a crop size of 713x713, with synchronized BN.
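
A minimal sketch of this optimization setup (assumed for illustration, not the repo's exact training code): SGD with momentum 0.9 and weight decay 1e-4, a per-iteration polynomial learning-rate decay with power 0.9, and a cross-entropy loss that ignores label 255. The stand-in model and iteration count below are placeholders.

import torch

def poly_lr(base_lr: float, cur_iter: int, max_iter: int, power: float = 0.9) -> float:
    # Polynomial decay: lr = base_lr * (1 - cur_iter / max_iter) ** power
    return base_lr * (1.0 - cur_iter / max_iter) ** power

model = torch.nn.Conv2d(3, 194, kernel_size=1)  # placeholder for the real HRNet-W48
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9, weight_decay=1e-4)
criterion = torch.nn.CrossEntropyLoss(ignore_index=255)  # 255 = ignore/unlabeled flag

max_iter = 100  # placeholder; the real runs train on 1M or 3M crops
for cur_iter in range(max_iter):
    for group in optimizer.param_groups:
        group["lr"] = poly_lr(base_lr=1e-2, cur_iter=cur_iter, max_iter=max_iter)
    # forward/backward would go here; total loss = main CE loss + 0.4 * auxiliary CE loss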

Training Instructions

Please refer to training.md for detailed instructions on how to train each of our models. As a frame of reference for the amount of compute required: we use 8 Quadro RTX 6000 cards, each with 24 GB of memory, for training. The 3-million-crop models took ~2-3 weeks to train on such hardware, and the 1-million-crop models took ~4-7 days.

Running unit tests and integration tests

To run the unit tests, execute

pytest tests

All should pass. To run the integration tests, follow the instructions in the following 3 files, then run:

python test_test_oracle_tax.py
python test_test_universal_tax.py
python test_universal_demo.py

All should also pass.

Frequently Asked Questions (FAQ) (identical to FAQ on mseg-api page)

Q: Do the weights include the model structure, or are they just the weights? If the latter, which model do these weights refer to? There are several model implementations under the models directory.

A: The pre-trained models follow the HRNet-W48 architecture. The model structure is defined in the code here. The saved weights provide a dictionary between keys (unique IDs for each weight identifying the corresponding layer/layer type) and values (the floating point weights).
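
If you just want to inspect a downloaded checkpoint, something like the following works (a sketch; the presence of a top-level 'state_dict' entry is an assumption based on common PyTorch checkpoint layouts):

import torch

checkpoint = torch.load("mseg-3m.pth", map_location="cpu")
# Fall back to the raw object if the weights are stored flat rather than under 'state_dict'.
state_dict = checkpoint.get("state_dict", checkpoint) if isinstance(checkpoint, dict) else checkpoint
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))  # layer-by-layer weight names and shapes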


Q: How is testing performed on the test datasets? In the paper you talk about "zero-shot transfer" -- how is this performed? Are the test dataset labels also mapped to or included in the unified taxonomy? If you remapped the test dataset labels to the unified taxonomy, are the reported results the performance on the unified label space, or on each test dataset's original label space? How did you obtain results on the WildDash dataset -- which is evaluated by a server -- when the MSeg taxonomy may differ from the WildDash taxonomy?

A: Regarding "zero-shot transfer", please refer to section "Using the MSeg taxonomy on a held-out dataset" on page 6 of our paper. This section describes how we hand-specify mappings from the unified taxonomy to each test dataset's taxonomy as a linear mapping (implemented here in mseg-api). All results are in the test dataset's original label space (i.e. if WildDash expects class indices in the range [0,18] per our names_list, our testing script uses the TaxonomyConverter transform_predictions_test() functionality to produce indices in that range, remapping probabilities.


Q: Why don't indices in MSeg_master.tsv match the training indices in individual datasets? For example, for the road class: in idd-39, road has index 0, but in idd-39-relabeled, road has index 19. It has index 7 in cityscapes-34 and index 11 in cityscapes-19-relabeled. As far as I can tell, the MSeg_master.tsv file ultimately provides the final mapping to the MSeg label space. But there, the road class seems to have an index of 98, which is neither 19 nor 11.

A: Indeed, unified taxonomy class index 98 represents "road". But we use the TaxonomyConverter to accomplish the mapping on the fly from idd-39-relabeled to the unified/universal taxonomy (we use the terms "unified" and "universal" interchangeably). This is done by adding a transform in the training loop that calls TaxonomyConverter.transform_label() on the fly. You can see how that transform is implemented here in mseg-semantic.
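
For intuition, here is a hedged sketch of an on-the-fly label remapping of this kind; the lookup-table values are placeholders rather than the real idd-39-relabeled-to-universal mapping, and the repo's TaxonomyConverter may differ in detail.

import torch

num_dataset_classes = 39
id_to_universal = torch.full((num_dataset_classes,), 255, dtype=torch.long)
id_to_universal[19] = 98  # e.g. idd-39-relabeled "road" (19) -> universal "road" (98)

def transform_label(label_hw: torch.Tensor) -> torch.Tensor:
    """Remap an (H, W) label map from the dataset taxonomy to the universal taxonomy."""
    out = label_hw.clone()
    valid = label_hw != 255  # leave the ignore flag untouched
    out[valid] = id_to_universal[label_hw[valid]]
    return out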


Q: When testing, if there are test classes that are not in the unified taxonomy (e.g. parking, rail track, bridge, etc. in WildDash), how do you produce predictions for those classes? I understand you map the predictions with a binary matrix. But what do you do when there is no one-to-one correspondence?

A: WildDash v1 uses the 19-class taxonomy for evaluation, just like Cityscapes, so we use the following script to remap the 34-class taxonomy to the 19-class taxonomy for WildDash test inference and submission. You can see how Cityscapes evaluates just 19 of the 34 classes here in the evaluation script and in the taxonomy definition. However, bridge and rail track are actually included in our unified taxonomy, as you'll see in MSeg_master.tsv.


Q: How are dataset images read in for training/inference? Should I use the dataset_apis from mseg-api?

A: The dataset_apis from mseg-api are not for training or inference. They are purely for generating the MSeg dataset labels on disk. We read in the datasets using mseg_semantic/utils/dataset.py and then remap them to the universal space on the fly.


Q: In the training configuration file, each dataset uses one GPU for multi-dataset training. I don't have enough hardware resources (I have at most four GPUs). Can I still train?

A: Sure, you can still train by setting the batch size to a smaller number, but the training will take longer. Another alternative is to train at a lower input resolution (smaller input crops; see the 480p or 720p configs instead of the 1080p config), or to train for fewer iterations.


Q: The purpose of using MGDA is unclear -- is it recommended for training?

A: Please refer to the section "Algorithms for learning from multiple domains" from our paper. In our ablation experiments, we found that training with MGDA does not lead to the best model, so we set it to false when training our best models.


Q: Does save_path refer to the path where the weights are saved after training?

A: save_path is the directory where the model checkpoints and results will be saved. See here.


Q: Does the auto_resume param refer to a checkpoint saved from interrupted training, or to the mseg-3m.pth provided by the authors?

A: We use the auto_resume config parameter to allow one to continue training if training is interrupted due to a scheduler compute time limit or hardware error. You could also use it to fine-tune a model.
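
As a rough illustration of resuming from an interrupted run, a minimal sketch is shown below; the checkpoint keys ('state_dict', 'optimizer', 'epoch') are assumptions based on common PyTorch conventions, not necessarily the exact keys this repo writes.

import os
import torch

def maybe_resume(model, optimizer, ckpt_path: str) -> int:
    """Load model/optimizer state if a checkpoint exists; return the epoch to resume from."""
    if not os.path.isfile(ckpt_path):
        return 0
    ckpt = torch.load(ckpt_path, map_location="cpu")
    model.load_state_dict(ckpt["state_dict"])
    if "optimizer" in ckpt:
        optimizer.load_state_dict(ckpt["optimizer"])
    return ckpt.get("epoch", 0)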


Q: Could I know how to map the predicted label ID to the ID in Cityscapes? Do you have any code/dictionary to achieve this?

A: There are two Cityscapes taxonomies (cityscapes-19 and cityscapes-34), of which cityscapes-19 is more commonly used for evaluation. The classes in these taxonomies are enumerated in mseg-api here and here.

We have released both unified models (trained on many datasets, list available here) and models trained on single datasets, listed here.

If you use a unified model for testing, our code maps class scores from the unified taxonomy to Cityscapes classes. We discuss this in a section of our paper (page 6, top-right, under "Using the MSeg taxonomy on a held-out dataset"). The mapping is available in MSeg_master.tsv, if you compare the universal and wilddash-19 columns (wilddash-19 shares the same classes as cityscapes-19).
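
As a hedged sketch of reading that mapping yourself, the snippet below builds a universal-to-wilddash-19 name dictionary from MSeg_master.tsv with pandas; the 'universal' and 'wilddash-19' column names are taken from the answer above, so adjust them if the file differs.

import pandas as pd

tsv = pd.read_csv("MSeg_master.tsv", sep="\t")
universal_to_wd19 = {
    row["universal"]: row["wilddash-19"]
    for _, row in tsv.iterrows()
    if isinstance(row["wilddash-19"], str) and row["wilddash-19"].strip()
}
print(universal_to_wd19.get("road"))  # which wilddash-19 / cityscapes-19 class "road" maps to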

If instead you used a model specifically trained on cityscapes, e.g. cityscapes-19-1m, which we call an "oracle model" since it is trained and tested on different splits of the same dataset, then the output classes are already immediately in the desired taxonomy.

Our inference code that dumps model results in any particular taxonomy is available here: https://github.com/mseg-dataset/mseg-semantic/blob/master/mseg_semantic/scripts/eval_models.sh


mseg-semantic's Issues

mseg-3m-720p.pth?

Hi! Thanks for your gorgeous work!
I was wondering whether the mseg-3m-720p.pth weights are missing from your release files?
Many thanks and congratulations

Training a semantic segmentation model using the MSeg dataset

I followed all the steps to download the datasets and generate the MSeg dataset. I want to use MSeg to train a semantic segmentation model. I looked at the training branch (training.md), but I haven't found any example showing how to train my model using MSeg.

Invitation of making PR on MMSegmentation.

Hey there!

I am a member of OpenMMLab. This dataset and its related code/method are very valuable for segmentation. We hope to introduce this method and dataset to more people.

Would you like to make a new PR on MMSegmentation with us? We could work together to support this model and dataset effectively!

Best,

detectron2 Panoptic-FPN

From the README:

One additional repo will be introduced in October 2020:
    mseg-panoptic: provides Panoptic-FPN and Mask-RCNN training, based on Detectron2

I wonder if this is still planned?

Regards,

What information do the gray pictures contain?

Your dataset is amazing. You have done a great job! Where can I find information about the architecture of your neural network? And my main question: are the gray pictures confidence maps, or the same segmentation but without color labels?

error running demo

Hi! Thank you for your amazing work!
I ran into a problem when trying to run the demo program on my own PC, following the instructions in the README.
Here is the error message.

[2020-10-31 19:49:01,033 INFO universal_demo.py line 63 10457] => creating model ...
[2020-10-31 19:49:03,787 INFO inference_task.py line 308 10457] => loading checkpoint 'mseg_semantic/model/mseg-1m.pth'
[2020-10-31 19:49:04,255 INFO inference_task.py line 314 10457] => loaded checkpoint 'mseg_semantic/model/mseg-1m.pth'
[2020-10-31 19:49:04,259 INFO inference_task.py line 327 10457] >>>>>>>>>>>>>> Start inference task >>>>>>>>>>>>>
[2020-10-31 19:49:04,262 INFO inference_task.py line 365 10457] Write image prediction to 000000_overlaid_classes.jpg
/home/kkycj/.local/lib/python3.6/site-packages/torch/nn/functional.py:3121: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode))
/home/kkycj/.local/lib/python3.6/site-packages/torch/nn/functional.py:2941: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")
Traceback (most recent call last):
File "mseg_semantic/tool/universal_demo.py", line 109, in
run_universal_demo(args, use_gpu)
File "mseg_semantic/tool/universal_demo.py", line 76, in run_universal_demo
itask.execute()
File "/home/kkycj/workspace/mseg-semantic/mseg_semantic/tool/inference_task.py", line 343, in execute
self.render_single_img_pred()
File "/home/kkycj/workspace/mseg-semantic/mseg_semantic/tool/inference_task.py", line 379, in render_single_img_pred
id_to_class_name_map=self.id_to_class_name_map
File "/home/kkycj/workspace/mseg-api/mseg/utils/mask_utils_detectron2.py", line 468, in overlay_instances
polygons, _ = mask_obj.mask_to_polygons(segment)
File "/home/kkycj/workspace/mseg-api/mseg/utils/mask_utils_detectron2.py", line 121, in mask_to_polygons
res, hierarchy = cv2.findContours(mask.astype("uint8"), cv2.RETR_CCOMP, cv2.CHAIN_APPROX_NONE)
ValueError: too many values to unpack (expected 2)

Could you please help me find out where the problem is?
Thank you so much.

How to train the model?

Hi! @johnwlambert
I checked out the training branch but cannot find any instructions for training. I wonder how to train the HRNet model in Table 2 of your paper.

Also, where is train-qvga-mix-copy.sh?
Where is train-qvga-mix-cd.sh?

It is very confusing.

Prediction on Ade20k and coco_stuff

First, thank you for the wonderful work.
I have a suggestion: you could add the Apex, PyYAML, and yacs packages to requirements.txt.

I also have a question: even when I change the model from MSeg-3m to coco-panoptic-133-1m or ade20k-150, the output is the same. How can I run prediction on COCO-Stuff with 182 output classes, and on ADE20K with 150 classes?

Best regards

Problem with creating model on demo notebook

Hi @johnwlambert
How are you?
I followed the demo notebook you put here and encountered a bug while running it:

[2020-08-23 10:43:55,760 INFO universal_demo.py line 60 724] => creating model ...
[2020-08-23 10:44:01,292 INFO inference_task.py line 277 724] => loading checkpoint '/content/mseg-3m.pth'
Traceback (most recent call last):
  File "mseg-semantic/mseg_semantic/tool/universal_demo.py", line 105, in <module>
    test_runner = UniversalDemoRunner(args, use_gpu)
  File "mseg-semantic/mseg_semantic/tool/universal_demo.py", line 70, in __init__
    scales = args.scales
  File "/content/mseg-semantic/mseg_semantic/tool/inference_task.py", line 219, in __init__
    self.model = self.load_model(args)
  File "/content/mseg-semantic/mseg_semantic/tool/inference_task.py", line 279, in load_model
    checkpoint = torch.load(args.model_path)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 585, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 755, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '<'.

It occurred on "Try out our model on an indoor scene (dining room):" cell.

Can you please look into it?

Performance different from reported in paper

Hi, I tried to run the training code with the 1m config but got significantly worse performance than reported in the paper. I notice that the paper mentions a batch size of 35, but in the 1m config the batch size is set to 14. Can you please explain what I should change to train a model that achieves the reported performance with the 1m config? Thanks!

Could not find a version that satisfies the requirement pandas>=1.2.0 (from mseg-semantic==1.0.0)

Hello, Thanks for your work. I was trying to recreate your work. However, when I tried to run pip3 install -e ~/mseg-semantic, it gave me the following error:

ERROR: Could not find a version that satisfies the requirement pandas>=1.2.0 (from mseg-semantic==1.0.0) (from versions: 0.1, 0.2b0, 0.2b1, 0.2, 0.3.0b0, 0.3.0b2, 0.3.0, 0.4.0, 0.4.1, 0.4.2, 0.4.3, 0.5.0, 0.6.0, 0.6.1, 0.7.0, 0.7.1, 0.7.2, 0.7.3, 0.8.0rc1, 0.8.0, 0.8.1, 0.9.0, 0.9.1, 0.10.0, 0.10.1, 0.11.0, 0.12.0, 0.13.0, 0.13.1, 0.14.0, 0.14.1, 0.15.0, 0.15.1, 0.15.2, 0.16.0, 0.16.1, 0.16.2, 0.17.0, 0.17.1, 0.18.0, 0.18.1, 0.19.0, 0.19.1, 0.19.2, 0.20.0, 0.20.1, 0.20.2, 0.20.3, 0.21.0, 0.21.1, 0.22.0, 0.23.0, 0.23.1, 0.23.2, 0.23.3, 0.23.4, 0.24.0, 0.24.1, 0.24.2, 0.25.0, 0.25.1, 0.25.2, 0.25.3, 1.0.0, 1.0.1, 1.0.2, 1.0.3, 1.0.4, 1.0.5, 1.1.0, 1.1.1, 1.1.2, 1.1.3, 1.1.4, 1.1.5)
ERROR: No matching distribution found for pandas>=1.2.0 (from mseg-semantic==1.0.0)

This issue is somehow related to this one: gboeing/osmnx#636
I am using Python 3.6.9. My question is: should I change the requirement to pandas>=1.1.0 in the requirements.txt file, and will it still work?

Is pretrained weight used?

Hi, in train.md you mentioned that we need to download the ImageNet-pretrained HRNet backbone model from the original authors' OneDrive. After downloading the file, I assumed the path to this weight should be specified as "weight" in the config yaml file to be used as the initial weights. However, the model keys don't seem to match, so I'm confused about where I should use the pretrained HRNet model.
Looking forward to your response

class

There's a small error with the class attribute.
image
I changed one line of code, replacing args.tc.classes with args.tc.num_uclasses, as shown below.
image

Inference on Taxonomy Subset

Thanks for this great repo! I have a question about running the universal demo on a subset of the universal taxonomy classes.

Is there a simple way to run the demo and inference task but only have it identify and segment a subset of classes from the universal taxonomy?

For example, only have it try identifying people in the input image and nothing else. I am curious if this would 1) speed up the run time since it is only trying to identify one class and 2) possibly improve the results if there are overlapping classes like a person behind a picket fence. With the default settings the fence tends to get identified in the segmentation and not the person behind it.

Thanks!

problem with running the demo ( No module named 'apex' , No such file img3_overlaid_classes.jpg)

I am trying to run the demo by following the instructions, but got an error saying "No module named 'apex'" (full output below), which was followed by the error "No such file img3_overlaid_classes.jpg" in the next cell.
I also tried downgrading PyTorch to match the apex CUDA version, as well as commenting out the bare-metal version check in setup.py, but nothing seemed to work.

Namespace(config='mseg-semantic/mseg_semantic/config/test/default_config_360.yaml', file_save='default', opts=['model_name', 'mseg-3m', 'model_path', '/mseg-3m.pth', 'input_file', '/kitchen1.jpg'])
arch: hrnet
base_size: 360
batch_size_val: 1
dataset: kitchen1
has_prediction: False
ignore_label: 255
img_name_unique: False
index_start: 0
index_step: 0
input_file: /kitchen1.jpg
layers: 50
model_name: mseg-3m
model_path: /mseg-3m.pth
network_name: None
save_folder: default
scales: [0.5, 0.75, 1.0, 1.25, 1.5, 1.75]
small: True
split: val
test_gpu: [0]
test_h: 713
test_w: 713
version: 4.0
vis_freq: 20
workers: 16
zoom_factor: 8
[2021-07-06 12:22:03,683 INFO universal_demo.py line 59 71646] arch: hrnet
base_size: 360
batch_size_val: 1
dataset: kitchen1
has_prediction: False
ignore_label: 255
img_name_unique: True
index_start: 0
index_step: 0
input_file: /kitchen1.jpg
layers: 50
model_name: mseg-3m
model_path: /mseg-3m.pth
network_name: None
print_freq: 10
save_folder: default
scales: [0.5, 0.75, 1.0, 1.25, 1.5, 1.75]
small: True
split: test
test_gpu: [0]
test_h: 713
test_w: 713
u_classes: ['backpack', 'umbrella', 'bag', 'tie', 'suitcase', 'case', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'animal_other', 'microwave', 'radiator', 'oven', 'toaster', 'storage_tank', 'conveyor_belt', 'sink', 'refrigerator', 'washer_dryer', 'fan', 'dishwasher', 'toilet', 'bathtub', 'shower', 'tunnel', 'bridge', 'pier_wharf', 'tent', 'building', 'ceiling', 'laptop', 'keyboard', 'mouse', 'remote', 'cell phone', 'television', 'floor', 'stage', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot_dog', 'pizza', 'donut', 'cake', 'fruit_other', 'food_other', 'chair_other', 'armchair', 'swivel_chair', 'stool', 'seat', 'couch', 'trash_can', 'potted_plant', 'nightstand', 'bed', 'table', 'pool_table', 'barrel', 'desk', 'ottoman', 'wardrobe', 'crib', 'basket', 'chest_of_drawers', 'bookshelf', 'counter_other', 'bathroom_counter', 'kitchen_island', 'door', 'light_other', 'lamp', 'sconce', 'chandelier', 'mirror', 'whiteboard', 'shelf', 'stairs', 'escalator', 'cabinet', 'fireplace', 'stove', 'arcade_machine', 'gravel', 'platform', 'playingfield', 'railroad', 'road', 'snow', 'sidewalk_pavement', 'runway', 'terrain', 'book', 'box', 'clock', 'vase', 'scissors', 'plaything_other', 'teddy_bear', 'hair_dryer', 'toothbrush', 'painting', 'poster', 'bulletin_board', 'bottle', 'cup', 'wine_glass', 'knife', 'fork', 'spoon', 'bowl', 'tray', 'range_hood', 'plate', 'person', 'rider_other', 'bicyclist', 'motorcyclist', 'paper', 'streetlight', 'road_barrier', 'mailbox', 'cctv_camera', 'junction_box', 'traffic_sign', 'traffic_light', 'fire_hydrant', 'parking_meter', 'bench', 'bike_rack', 'billboard', 'sky', 'pole', 'fence', 'railing_banister', 'guard_rail', 'mountain_hill', 'rock', 'frisbee', 'skis', 'snowboard', 'sports_ball', 'kite', 'baseball_bat', 'baseball_glove', 'skateboard', 'surfboard', 'tennis_racket', 'net', 'base', 'sculpture', 'column', 'fountain', 'awning', 'apparel', 'banner', 'flag', 'blanket', 'curtain_other', 'shower_curtain', 'pillow', 'towel', 'rug_floormat', 'vegetation', 'bicycle', 'car', 'autorickshaw', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'trailer', 'boat_ship', 'slow_wheeled_object', 'river_lake', 'sea', 'water_other', 'swimming_pool', 'waterfall', 'wall', 'window', 'window_blind']
version: 4.0
vis_freq: 20
workers: 16
zoom_factor: 8
[2021-07-06 12:22:03,683 INFO universal_demo.py line 60 71646] => creating model ...
Traceback (most recent call last):
File "mseg-semantic/mseg_semantic/tool/universal_demo.py", line 105, in <module>
test_runner = UniversalDemoRunner(args, use_gpu)
File "mseg-semantic/mseg_semantic/tool/universal_demo.py", line 70, in __init__
scales = args.scales
File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/emanuelml/code/Users/emanuel/mseg-semantic/mseg_semantic/tool/inference_task.py", line 219, in __init__
self.model = self.load_model(args)
File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/emanuelml/code/Users/emanuel/mseg-semantic/mseg_semantic/tool/inference_task.py", line 264, in load_model
from mseg_semantic.model.seg_hrnet import get_configured_hrnet
File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/emanuelml/code/Users/emanuel/mseg-semantic/mseg_semantic/model/seg_hrnet.py", line 17, in <module>
import apex
ModuleNotFoundError: No module named 'apex'

thank you.

Train code

Does this repository provide training code? I only see the test code

MSeg, universal demo takes 10 minute for 1 image, why so?

Hello pros,

I am currently working on implementing an image enhancement paper and algorithm. In the process, we need to use MSeg segmentation for real and rendered images/datasets; I have around 50-60k images.

The dependencies mseg-api and mseg_semantic were already installed. I tried the Google Colab first and then copied the commands so I could run the script on my Linux machine as well. The command is like this:
python -u mseg_semantic/tool/universal_demo.py
--config="default_config_360.yaml"
model_name mseg-3m
model_path mseg-3m.pth
input_file /home/luda1013/PfD/image/try_images

The weights I used were downloaded from the Google Colab, i.e. mseg-3m-1080.pth.
MSeg-log

But for me it took about 10 minutes per image, and all I get in temp_files is the grayscale image. Could someone help me figure out how to solve this problem? Thank you :)

Things and Stuff

Could you please tell me the number of stuff and thing classes in the MSeg dataset?
This information does not appear in your paper.
Example: the COCO dataset has 91 stuff and 80 thing classes.

Thank you

Testing time takes much longer than reported in the repo!!

It's taking 35-40s to process the segmentation of a single frame.
My test setup configuration:
i. Ubuntu 18.04 LTS, Core i7, 24 GB RAM
ii. Graphics Nvidia 1070M (Laptop version of 1070Ti)
iii. Cuda 10.2
iv. Pytorch version: 1.6.0 + cu101
v. CUDA_HOME = /usr/local/cuda-10.2
This is the output from my terminal:

arghya@arghya-Erazer-X7849-MD60379:~$ python3 -u ~/mseg-semantic/mseg_semantic/tool/universal_demo.py --config=/home/arghya/mseg-semantic/mseg_semantic/config/test/default_config_720_ms.yaml model_name mseg-3m-720p model_path ~/Downloads/mseg-3m-720p.pth input_file ~/Downloads/Urban_3_fps.mp4
Namespace(config='/home/arghya/mseg-semantic/mseg_semantic/config/test/default_config_720_ms.yaml', file_save='default', opts=['model_name', 'mseg-3m-720p', 'model_path', '/home/arghya/Downloads/mseg-3m-720p.pth', 'input_file', '/home/arghya/Downloads/SubT_Urban_3_fps.mp4'])
arch: hrnet
base_size: 720
batch_size_val: 1
dataset: Urban_3_fps
has_prediction: False
ignore_label: 255
img_name_unique: False
index_start: 0
index_step: 0
input_file: /home/arghya/Downloads/Urban_3_fps.mp4
layers: 50
model_name: mseg-3m-720p
model_path: /home/arghya/Downloads/mseg-3m-720p.pth
network_name: None
save_folder: default
scales: [0.5, 0.75, 1.0, 1.25, 1.5, 1.75]
small: True
split: val
test_gpu: [0]
test_h: 713
test_w: 713
version: 4.0
vis_freq: 20
workers: 16
zoom_factor: 8
[2021-07-29 02:50:46,742 INFO universal_demo.py line 59 11926] arch: hrnet
base_size: 720
batch_size_val: 1
dataset: Urban_3_fps
has_prediction: False
ignore_label: 255
img_name_unique: True
index_start: 0
index_step: 0
input_file: /home/arghya/Downloads/Urban_3_fps.mp4
layers: 50
model_name: mseg-3m-720p
model_path: /home/arghya/Downloads/mseg-3m-720p.pth
network_name: None
print_freq: 10
save_folder: default
scales: [0.5, 0.75, 1.0, 1.25, 1.5, 1.75]
small: True
split: test
test_gpu: [0]
test_h: 713
test_w: 713
u_classes: ['backpack', 'umbrella', 'bag', 'tie', 'suitcase', 'case', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'animal_other', 'microwave', 'radiator', 'oven', 'toaster', 'storage_tank', 'conveyor_belt', 'sink', 'refrigerator', 'washer_dryer', 'fan', 'dishwasher', 'toilet', 'bathtub', 'shower', 'tunnel', 'bridge', 'pier_wharf', 'tent', 'building', 'ceiling', 'laptop', 'keyboard', 'mouse', 'remote', 'cell phone', 'television', 'floor', 'stage', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot_dog', 'pizza', 'donut', 'cake', 'fruit_other', 'food_other', 'chair_other', 'armchair', 'swivel_chair', 'stool', 'seat', 'couch', 'trash_can', 'potted_plant', 'nightstand', 'bed', 'table', 'pool_table', 'barrel', 'desk', 'ottoman', 'wardrobe', 'crib', 'basket', 'chest_of_drawers', 'bookshelf', 'counter_other', 'bathroom_counter', 'kitchen_island', 'door', 'light_other', 'lamp', 'sconce', 'chandelier', 'mirror', 'whiteboard', 'shelf', 'stairs', 'escalator', 'cabinet', 'fireplace', 'stove', 'arcade_machine', 'gravel', 'platform', 'playingfield', 'railroad', 'road', 'snow', 'sidewalk_pavement', 'runway', 'terrain', 'book', 'box', 'clock', 'vase', 'scissors', 'plaything_other', 'teddy_bear', 'hair_dryer', 'toothbrush', 'painting', 'poster', 'bulletin_board', 'bottle', 'cup', 'wine_glass', 'knife', 'fork', 'spoon', 'bowl', 'tray', 'range_hood', 'plate', 'person', 'rider_other', 'bicyclist', 'motorcyclist', 'paper', 'streetlight', 'road_barrier', 'mailbox', 'cctv_camera', 'junction_box', 'traffic_sign', 'traffic_light', 'fire_hydrant', 'parking_meter', 'bench', 'bike_rack', 'billboard', 'sky', 'pole', 'fence', 'railing_banister', 'guard_rail', 'mountain_hill', 'rock', 'frisbee', 'skis', 'snowboard', 'sports_ball', 'kite', 'baseball_bat', 'baseball_glove', 'skateboard', 'surfboard', 'tennis_racket', 'net', 'base', 'sculpture', 'column', 'fountain', 'awning', 'apparel', 'banner', 'flag', 'blanket', 'curtain_other', 'shower_curtain', 'pillow', 'towel', 'rug_floormat', 'vegetation', 'bicycle', 'car', 'autorickshaw', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'trailer', 'boat_ship', 'slow_wheeled_object', 'river_lake', 'sea', 'water_other', 'swimming_pool', 'waterfall', 'wall', 'window', 'window_blind']
version: 4.0
vis_freq: 20
workers: 16
zoom_factor: 8
[2021-07-29 02:50:46,743 INFO universal_demo.py line 60 11926] => creating model ...
[2021-07-29 02:50:49,912 INFO inference_task.py line 307 11926] => loading checkpoint '/home/arghya/Downloads/mseg-3m-720p.pth'
[2021-07-29 02:50:50,433 INFO inference_task.py line 313 11926] => loaded checkpoint '/home/arghya/Downloads/mseg-3m-720p.pth'
[2021-07-29 02:50:50,437 INFO inference_task.py line 326 11926] >>>>>>>>>>>>>> Start inference task >>>>>>>>>>>>>
[2021-07-29 02:50:50,440 INFO inference_task.py line 437 11926] Write video to /home/arghya/mseg-semantic/temp_files/SubT_Urban_3_fps_mseg-3m-720p_universal_scales_ms_base_sz_720.mp4
Video fps: 3.00 @ 720x1280 resolution.
[2021-07-29 02:50:50,451 INFO inference_task.py line 442 11926] On image 0/1312
/home/arghya/.local/lib/python3.6/site-packages/torch/nn/functional.py:3121: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
  "See the documentation of nn.Upsample for details.".format(mode))
/home/arghya/.local/lib/python3.6/site-packages/torch/nn/functional.py:2941: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.
  warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")
[2021-07-29 02:51:15,910 INFO inference_task.py line 442 11926] On image 1/1312
[... the UserWarnings above repeat once more, then one "On image N/1312" log line appears every 35-40 seconds; intermediate lines omitted for brevity ...]
[2021-07-29 04:15:24,121 INFO inference_task.py line 442 11926] On image 143/1312

I think I am missing something. According to this repo, the inference speed should be around 16 fps on a Quadro P5000, and I would expect something roughly similar (certainly not 1/40 fps) on an Nvidia GTX 1070.

Can anybody help?

Format of weights

Hi, can you please tell me whether the weights include the model structure, or are they just the weights? If the latter, which model do these weights refer to, since there are several model implementations under the models directory?

got different output from sample when using pretrained model

Hi!
I am trying to use the pretrained models to process images from KITTI Odometry and changed nothing in the code, but I got some invalid segmentations. I then tested on the sample image 4 from here. The output is as follows:
dirtroad10_overlaid_classes

The config is:
python3 -u mseg_semantic/tool/universal_demo.py --config=mseg_semantic/config/test/default_config_360_ms.yaml model_name mseg-3m model_path mseg-3m.pth input_file dirtroad10.jpg

Could you please tell me where the problem is?
Thanks!

No imagenet normalization in universaldemo.py

Hi John Lambert,

Edit: nevermind, it just happens later than I expected. Ignore this post

It appears that ImageNet normalization is never applied when running the network via the universal_demo.py script. Is this intentional?

This happens for all cases: single image (render_single_img_pred in inference_task.py), video (execute_on_video in inference_task.py), and even on a folder of images using create_test_loader. A comment in create_test_loader suggests normalization should happen on the fly, but it appears this was never added.

Even without normalization the results look good though.

Thanks for the great repo!

The 'sklearn' PyPI package is deprecated, use 'scikit-learn'

Dear authors,

First of all, thank you very much for your great work! I am happy to use your models in a project of mine. A user of the project has reported an easily fixable problem regarding your requirements.txt here: #BonifazStuhr/feamgan#1 (comment).

It says that "The 'sklearn' PyPI package is deprecated, use 'scikit-learn'". You can therefore simply replace sklearn with scikit-learn in requirements.txt.

I recreated the issue as well. Down below you can see the more detailed error message when installing mseg-semantic with pip install.

3.539 Collecting sklearn
3.557 Downloading sklearn-0.0.post11.tar.gz (3.6 kB)
3.724 ERROR: Command errored out with exit status 1:
3.724 command: /opt/conda/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-bfuobfzn/sklearn/setup.py'"'"'; __file__='"'"'/tmp/pip-install-bfuobfzn/sklearn/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-install-bfuobfzn/sklearn/pip-egg-info
3.724 cwd: /tmp/pip-install-bfuobfzn/sklearn/
3.724 Complete output (18 lines):
3.724 The 'sklearn' PyPI package is deprecated, use 'scikit-learn'
3.724 rather than 'sklearn' for pip commands.
3.724
3.724 Here is how to fix this error in the main use cases:
3.724 - use 'pip install scikit-learn' rather than 'pip install sklearn'
3.724 - replace 'sklearn' by 'scikit-learn' in your pip requirements files
3.724 (requirements.txt, setup.py, setup.cfg, Pipfile, etc ...)
3.724 - if the 'sklearn' package is used by one of your dependencies,
3.724 it would be great if you take some time to track which package uses
3.724 'sklearn' instead of 'scikit-learn' and report it to their issue tracker
3.724 - as a last resort, set the environment variable
3.724 SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True to avoid this error
3.724
3.724 More information is available at
3.724 https://github.com/scikit-learn/sklearn-pypi-package
3.724
3.724 If the previous advice does not cover your use case, feel free to report it at
3.724 https://github.com/scikit-learn/sklearn-pypi-package/issues/new
3.724 ----------------------------------------
3.758 ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

Train

Author, is it correct that the output of the auxiliary loss (aux loss) is 0?
image
