erosinho13 / feddrive

Generalizing Federated Learning to Semantic Segmentation in Autonomous Driving

License: BSD 2-Clause "Simplified" License

Python 98.89% Shell 1.11%

feddrive's Introduction

Official repository of:

E. Fanì, M. Ciccone, B. Caputo. FedDrive v2: an Analysis of the Impact of Label Skewness in Federated Semantic Segmentation for Autonomous Driving. 5th Italian Conference on Robotics and Intelligent Machines (I-RIM), 2023.
L. Fantauzzo^*, E. Fanì^*, D. Caldarola, A. Tavera, F. Cermelli¹, M. Ciccone, B. Caputo. FedDrive: Generalizing Federated Learning to Semantic Segmentation in Autonomous Driving, IEEE/RSJ International Conference on Intelligent Robots and Systems, 2022.

Corresponding author: [email protected].

All the authors are supported by Politecnico di Torino, Turin, Italy.

^*Equal contribution. ¹Fabio Cermelli is with Italian Institute of Technology, Genoa, Italy.

Official website: https://feddrive.github.io/

Citation

If you find our work relevant to your research or use our code, please cite our papers:

@inproceedings{feddrive2023,
  title={FedDrive v2: an Analysis of the Impact of Label Skewness in Federated Semantic Segmentation for Autonomous Driving},
  author={Fanì, Eros and Ciccone, Marco and Caputo, Barbara},
  booktitle={5th Italian Conference on Robotics and Intelligent Machines (I-RIM)},
  year={2023}
}

@inproceedings{feddrive2022,
  title={FedDrive: Generalizing Federated Learning to Semantic Segmentation in Autonomous Driving},
  author={Fantauzzo, Lidia and Fanì, Eros and Caldarola, Debora and Tavera, Antonio and Cermelli, Fabio and Ciccone, Marco and Caputo, Barbara},
  booktitle={Proceedings of the 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems},
  year={2022}
}

Summary

FedDrive is a new benchmark for the Semantic Segmentation task in a Federated Learning scenario for autonomous driving.

It consists of 12 distinct scenarios, incorporating the real-world challenges of statistical heterogeneity and domain generalization. FedDrive incorporates algorithms and style transfer methods from Federated Learning, Domain Generalization, and Domain Adaptation literature. Its main goal is to enhance model generalization and robustness against statistical heterogeneity.

We show the importance of using the correct clients’ statistics when dealing with different domains and label skewness and how style transfer techniques can improve the performance on unseen domains, proving FedDrive to be a solid baseline for future research in federated semantic segmentation.

Summary of the FedDrive scenarios.

Dataset	Setting	Distribution	# Clients	# Images per client	Test clients
Cityscapes	-	Uniform, Heterogeneous, Class Imbalance	146	10-45	unseen cities
IDDA	Country	Uniform, Heterogeneous, Class Imbalance	90	48	seen + unseen (country) domains
IDDA	Rainy	Uniform, Heterogeneous, Class Imbalance	69	48	seen + unseen (rainy) domains
IDDA	Bus	Uniform, Heterogeneous, Class Imbalance	83	48	seen + unseen (bus) domains

Results

Please visit the FedDrive official website for the results.

Setup

Clone this repository
Move to the root path of your local copy of the repository
Create the new feddrive conda virtual environment and activate it:

conda env create -f environment.yml
conda activate feddrive

Download the Cityscapes dataset from here. You may need to create an account if you do not have one yet. Download the gtFine_trainvaltest.zip and leftImg8bit_trainvaltest.zip archives
Extract the archives and move the gtFine and leftImg8bit folders into [local_repo_path]/data/cityscapes/data/
Request the V3 version of IDDA, available here
Extract the archive and move the IDDAsmall folder into [local_repo_path]/data/idda/data/
Make a new wandb account if you do not have one yet, and create a new wandb project.
The configs folder contains example config files for some of the experiments, which can be used to replicate the results of the paper. Run one of the example configs or a custom one:

./run.sh [path/to/config]

N.B. set the wandb_entity argument to the entity name of your wandb project.

N.B. always leave a blank line at the end of the config (the file must end with a newline). Otherwise, your last argument will be ignored.
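
The setup steps above lend themselves to a quick pre-flight check before launching a run. The following is a minimal sketch, not part of the repository: the function names missing_data_dirs and ensure_trailing_newline are hypothetical, and the only things taken from this README are the three dataset folder paths and the requirement that a config file ends with a newline.

```python
from pathlib import Path

# Folder paths taken from the setup steps in this README, relative to the
# root of your local copy of the repository.
EXPECTED_DATA_DIRS = (
    "data/cityscapes/data/gtFine",
    "data/cityscapes/data/leftImg8bit",
    "data/idda/data/IDDAsmall",
)


def missing_data_dirs(repo_root):
    """Return the expected dataset folders that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DATA_DIRS if not (root / d).is_dir()]


def ensure_trailing_newline(config_path):
    """Append a newline to the config file if it does not already end with one.

    Returns True if the file was modified, False if it was left untouched.
    """
    path = Path(config_path)
    data = path.read_bytes()
    if data.endswith(b"\n"):
        return False
    path.write_bytes(data + b"\n")
    return True
```

For example, calling missing_data_dirs(".") from the repository root lists any dataset folder that still has to be extracted, and ensure_trailing_newline("path/to/config") can be run on a custom config before passing it to ./run.sh.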

How to visualize model predictions, LAB and CFSI images

The script plot_samples.py saves, and optionally visualizes, sets of (image, CFSI(image), LAB(image), target, model(image)) for samples in the test set(s) associated with a dataset, given a checkpoint and the indices of the images to show.

To use this script:

Download the checkpoint of the desired run from WandB
Copy the [run_args] from the info of the same run on wandb
Customize the load_path, indices, path_to_save_folder and plot variables options
Set the CUDA_VISIBLE_DEVICES environment variable to select a single GPU
Move to the root directory of this repository and run the following command:

python src/plot_samples.py [run_args]
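
The last two steps (selecting a single GPU and running the script from the repository root) can also be driven from Python. This is only a sketch under stated assumptions: build_plot_command and run_plot_samples are hypothetical helpers, run_args is assumed to be the argument string copied from the wandb run, and the customization of load_path, indices, path_to_save_folder and plot is not covered here.

```python
import os
import shlex
import subprocess


def build_plot_command(run_args, gpu_id):
    """Return (command, env) for running src/plot_samples.py on one GPU.

    run_args is the argument string copied from the wandb run info.
    """
    env = dict(os.environ)
    # Expose exactly one GPU to the process.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    command = ["python", "src/plot_samples.py", *shlex.split(run_args)]
    return command, env


def run_plot_samples(run_args, gpu_id=0, repo_root="."):
    """Run the script from repo_root and raise if it exits with an error."""
    command, env = build_plot_command(run_args, gpu_id)
    return subprocess.run(command, env=env, cwd=repo_root, check=True)
```

Splitting the command construction from the actual launch makes it possible to inspect the command and the environment before anything is executed.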

feddrive's Issues

For visualization

how to visualize the result? (for example, cityscapes: CLASSES and PALETTE)

how to perform debugging in pycharm?

Hi. I am trying to debug the code in order to understand it.
However, in PyCharm I cannot debug it because it is launched through a shell file.
Could you help?

The error message is:
File "/anaconda3/envs/feddrive/lib/python3.9/site-packages/torch/distributed/elastic/agent/server/api.py", line 87, in __post_init__
assert self.local_world_size > 0
AssertionError

environment variable:
CUDA_VISIBLE_DEVICES=0

interpreter option:
-W ignore -m torch.distributed.launch --nproc_per_node 0 --master_port=2509

parameters:
--name idda_heterogeneous_country --device_ids 1 --random_seed 42 --wandb_entity FedDrive --mixed_precision --ignore_warnings --save_samples --avg_last_100 --dataset idda --clients_type heterogeneous --setting_type country --remap --framework federated --algorithm FedAvg --num_rounds 1600 --clients_per_round 5 --num_epochs 2 --model bisenetv2 --output_aux --hnm --batch_size 16 --test_batch_size 1 --test_diff_dom --optimizer SGD --weight_decay 0 --momentum 0.9 --lr 0.1 --lr_policy poly --lr_power 0.9 --rrc_transform --use_test_resize --min_scale 0.5 --max_scale 2.0 --h_resize 512 --w_resize 928 --eval_interval 30000 --test_interval 50 --print_interval 50

QUESTION

When I tried to debug src/run.py to learn your code, I encountered this problem. How can I solve it? (The config used is cityscapes_heterogeneous_silobn.txt.)

Question about Non-IID settings

Hello author, I am currently testing your code and would like to know where the non-IID setting found in the ./data/cityscapes/data/ folder is implemented. Specifically, where are those .json files generated? I cannot find the relevant functions in your code.

I am getting the below error after connecting to wandb and using ./run.sh [path/to/config]

Running Centralized experiment with 828 epochs
Setting up distributed...
Let's use 1 GPUs!
Done
Initializing wandb...
wandb: Currently logged in as: swapnikvarala. Use `wandb login --relogin` to force relogin
Problem at: /usr/local/lib/python3.11/site-packages/wandb/sdk/wandb_init.py 837 getcaller
Traceback (most recent call last):
  File "/content/FedDrive/src/run.py", line 31, in <module>
    run_experiment()
  File "/content/FedDrive/src/run.py", line 15, in run_experiment
    main(args)
  File "/content/FedDrive/src/centr_setting/main.py", line 17, in main
    writer, device, rank, world_size, logger, label2color, denorm = setup_env(args)
                                                                    ^^^^^^^^^^^^^^^
  File "/content/FedDrive/src/utils/utils.py", line 36, in setup_env
    logger = CustomWandbLogger(name=get_job_name(args), project=get_project_name(args.framework, args.dataset),
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/content/FedDrive/src/utils/logger.py", line 7, in __init__
    super(CustomWandbLogger, self).__init__(name=name, project=project, group=group, entity=entity, offline=offline,
  File "/usr/local/lib/python3.11/site-packages/pytorch_lightning/loggers/wandb.py", line 358, in __init__
    _ = self.experiment
        ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/lightning_fabric/loggers/logger.py", line 114, in experiment
    return fn(self)
           ^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pytorch_lightning/loggers/wandb.py", line 406, in experiment
    self._experiment = wandb.init(**self._wandb_init)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/wandb/sdk/wandb_init.py", line 1173, in init
    raise e
  File "/usr/local/lib/python3.11/site-packages/wandb/sdk/wandb_init.py", line 1154, in init
    run = wi.init()
          ^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/wandb/sdk/wandb_init.py", line 770, in init
    raise error
wandb.errors.CommError: Run initialization has timed out after 60.0 sec. 
Please refer to the documentation for additional information: https://docs.wandb.ai/guides/track/tracking-faq#initstarterror-error-communicating-with-wandb-process-
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 24118) of binary: /usr/local/bin/python3
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/usr/local/lib/python3.11/site-packages/torch/distributed/run.py", line 798, in <module>
    main()
  File "/usr/local/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/torch/distributed/run.py", line 794, in main
    run(args)
  File "/usr/local/lib/python3.11/site-packages/torch/distributed/run.py", line 785, in run
    elastic_launch(
  File "/usr/local/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
run.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2023-07-18_17:50:12
  host      : f905eb7441a0
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 24118)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================