
feddrive's Introduction

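Official repository of:

FedDrive v2: an Analysis of the Impact of Label Skewness in Federated Semantic Segmentation for Autonomous Driving

FedDrive: Generalizing Federated Learning to Semantic Segmentation in Autonomous Driving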

Corresponding author: [email protected].

All the authors are supported by Politecnico di Torino, Turin, Italy.

* Equal contribution. ¹ Fabio Cermelli is with the Italian Institute of Technology, Genoa, Italy.

Official website: https://feddrive.github.io/

Citation

If you find our work relevant to your research or use our code, please cite our papers:

@inproceedings{feddrive2023,
  title={FedDrive v2: an Analysis of the Impact of Label Skewness in Federated Semantic Segmentation for Autonomous Driving},
  author={Fanì, Eros and Ciccone, Marco and Caputo, Barbara},
  booktitle={5th Italian Conference on Robotics and Intelligent Machines (I-RIM)},
  year={2023}
}

@inproceedings{feddrive2022,
  title={FedDrive: Generalizing Federated Learning to Semantic Segmentation in Autonomous Driving},
  author={Fantauzzo, Lidia and Fanì, Eros and Caldarola, Debora and Tavera, Antonio and Cermelli, Fabio and Ciccone, Marco and Caputo, Barbara},
  booktitle={Proceedings of the 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems},
  year={2022}
}

Summary

FedDrive is a new benchmark for the Semantic Segmentation task in a Federated Learning scenario for autonomous driving.

It consists of 12 distinct scenarios, incorporating the real-world challenges of statistical heterogeneity and domain generalization. FedDrive incorporates algorithms and style transfer methods from Federated Learning, Domain Generalization, and Domain Adaptation literature. Its main goal is to enhance model generalization and robustness against statistical heterogeneity.

We show the importance of using the correct clients’ statistics when dealing with different domains and label skewness, and how style transfer techniques can improve performance on unseen domains. These results make FedDrive a solid baseline for future research in federated semantic segmentation.

Summary of the FedDrive scenarios.

| Dataset | Setting | Distribution | # Clients | # img/cl | Test clients |
|---|---|---|---|---|---|
| Cityscapes | - | Uniform, Heterogeneous, Class Imbalance | 146 | 10-45 | unseen cities |
| IDDA | Country | Uniform, Heterogeneous, Class Imbalance | 90 | 48 | seen + unseen (country) domains |
| IDDA | Rainy | Uniform, Heterogeneous, Class Imbalance | 69 | 48 | seen + unseen (rainy) domains |
| IDDA | Bus | Uniform, Heterogeneous, Class Imbalance | 83 | 48 | seen + unseen (bus) domains |
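
The count of 12 scenarios follows from crossing the four dataset settings with the three client distributions listed in the summary above. A minimal Python sketch of that enumeration follows; the variable names and tuple layout are illustrative assumptions, not taken from the repository code.

```python
from itertools import product

# (dataset, setting) pairs from the summary above; Cityscapes has no setting.
DATASET_SETTINGS = [
    ("Cityscapes", None),
    ("IDDA", "Country"),
    ("IDDA", "Rainy"),
    ("IDDA", "Bus"),
]

# Client distributions available for every dataset setting.
DISTRIBUTIONS = ["Uniform", "Heterogeneous", "Class Imbalance"]

# One scenario per (dataset, setting, distribution) combination.
scenarios = [
    (dataset, setting, distribution)
    for (dataset, setting), distribution in product(DATASET_SETTINGS, DISTRIBUTIONS)
]

print(len(scenarios))  # 12
```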

Results

Please visit the FedDrive official website for the results.

Setup

  1. Clone this repository

  2. Move to the root path of your local copy of the repository

  3. Create the new feddrive conda environment and activate it:

conda env create -f environment.yml
conda activate feddrive
  4. Download the Cityscapes dataset from the official Cityscapes website. You may need to create an account if you do not have one yet. Download the gtFine_trainvaltest.zip and leftImg8bit_trainvaltest.zip archives

  5. Extract the archives and move the gtFine and leftImg8bit folders to [local_repo_path]/data/cityscapes/data/

  6. Request access to the IDDA V3 version of IDDA

  7. Extract the archive and move the IDDAsmall folder to [local_repo_path]/data/idda/data/

  8. Create a wandb account if you do not have one yet, and create a new wandb project.

  9. The configs folder contains example config files for some of the experiments, which replicate the results of the paper. Run one of the example configs or a custom one:

./run.sh [path/to/config]

N.B. set the wandb_entity argument to the entity name of your wandb project.

N.B. always leave a blank new line at the end of the config. Otherwise, your last argument will be ignored.
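
Before launching a run, the folder layout and the trailing-newline requirement described above can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the helper names are hypothetical and not part of the repository, and the expected folders are simply the ones named in the setup steps.

```python
from pathlib import Path

# Folders named in the setup steps, relative to the repository root.
EXPECTED_DIRS = [
    "data/cityscapes/data/gtFine",
    "data/cityscapes/data/leftImg8bit",
    "data/idda/data/IDDAsmall",
]


def missing_dataset_dirs(repo_root):
    """Return the expected dataset folders that are not present under repo_root."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


def ensure_trailing_newline(config_path):
    """Append a newline to the config if it lacks one; return True if the file changed."""
    path = Path(config_path)
    text = path.read_text(encoding="utf-8")
    if text.endswith("\n"):
        return False
    path.write_text(text + "\n", encoding="utf-8")
    return True
```

Calling `missing_dataset_dirs(".")` from the repository root lists any folder that still has to be created, and `ensure_trailing_newline` can be applied to a custom config before passing it to run.sh.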

How to visualize model predictions, LAB and CFSI images

The script plot_samples.py saves and optionally visualizes sets of (image, CFSI(image), LAB(image), target, model(image)) for samples in the test set(s) associated with a dataset, given a checkpoint and the indices of the images to show.

To use this script:

  1. Download the checkpoint of the desired run from WandB
  2. Copy the [run_args] from the info of the same run on wandb
  3. Customize the load_path, indices, path_to_save_folder and plot options
  4. Set the CUDA_VISIBLE_DEVICES environment variable to select a single GPU
  5. Move to the root directory of this repository and run the following command:
python src/plot_samples.py [run_args]
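
The last two steps can also be scripted. The sketch below builds the command and an environment that exposes a single GPU; the function names are hypothetical, and the argument string in the usage note is only a placeholder for the [run_args] copied from wandb.

```python
import os
import shlex
import subprocess
import sys


def build_plot_command(run_args, gpu_id=0):
    """Return (command, environment) for running src/plot_samples.py on one GPU."""
    # run_args is the argument string copied from the info of the wandb run.
    command = [sys.executable, "src/plot_samples.py", *shlex.split(run_args)]
    # Expose only the selected GPU to the child process.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
    return command, env


def run_plot(run_args, gpu_id=0):
    """Run the script; call this from the root directory of the repository."""
    command, env = build_plot_command(run_args, gpu_id)
    subprocess.run(command, env=env, check=True)
```

For example, `run_plot("--dataset idda --model bisenetv2", gpu_id=0)` would launch the script on GPU 0 once the placeholder string is replaced with the full arguments of the run.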

feddrive's Issues

For visualization

How can I visualize the results? (For example, for Cityscapes: CLASSES and PALETTE.)

How to perform debugging in PyCharm?

Hi. I am trying to debug the code in order to understand it.
However, in PyCharm I cannot debug because the experiments are launched through a shell script.
Could you help?

The error message is:
File "/anaconda3/envs/feddrive/lib/python3.9/site-packages/torch/distributed/elastic/agent/server/api.py", line 87, in __post_init__
assert self.local_world_size > 0
AssertionError

environment variable:
CUDA_VISIBLE_DEVICES=0

interpreter option:
-W ignore -m torch.distributed.launch --nproc_per_node 0 --master_port=2509

parameters:
--name idda_heterogeneous_country --device_ids 1 --random_seed 42 --wandb_entity FedDrive --mixed_precision --ignore_warnings --save_samples --avg_last_100 --dataset idda --clients_type heterogeneous --setting_type country --remap --framework federated --algorithm FedAvg --num_rounds 1600 --clients_per_round 5 --num_epochs 2 --model bisenetv2 --output_aux --hnm --batch_size 16 --test_batch_size 1 --test_diff_dom --optimizer SGD --weight_decay 0 --momentum 0.9 --lr 0.1 --lr_policy poly --lr_power 0.9 --rrc_transform --use_test_resize --min_scale 0.5 --max_scale 2.0 --h_resize 512 --w_resize 928 --eval_interval 30000 --test_interval 50 --print_interval 50
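
One plausible reading of the assertion reported above: the interpreter options pass --nproc_per_node 0, and the launcher requires a local world size greater than zero, so a value of at least 1 is probably needed. This is an interpretation, not a fix confirmed by the authors. The sketch below checks the interpreter options before launching; the helper name is hypothetical.

```python
import argparse
import shlex


def check_nproc_per_node(interpreter_options):
    """Return nproc_per_node from the launcher options, raising if it is below 1."""
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    # Only this option is inspected; every other option is ignored.
    parser.add_argument("--nproc_per_node", type=int, default=1)
    known, _unknown = parser.parse_known_args(shlex.split(interpreter_options))
    if known.nproc_per_node < 1:
        raise ValueError(
            f"--nproc_per_node is {known.nproc_per_node}; "
            "the launcher needs at least 1 process per node"
        )
    return known.nproc_per_node
```

Running it on the options quoted in the issue raises a ValueError, while the same string with `--nproc_per_node 1` returns 1.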

QUESTION

When I tried to debug src/run.py to learn your code, I encountered this problem. How can I solve it? (The config used is cityscapes_heterogeneous_silobn.txt.)
[image]

Question about Non-IID settings

Hello author, I am currently testing your code, and I would like to know where the non-IID setting stored in the ./data/cityscapes/data/ folder is implemented. Specifically, where are those .json files generated? I cannot find the relevant functions in your code.
[image]

I am getting the below error after connecting to wandb and using ./run.sh [path/to/config]

Running Centralized experiment with 828 epochs
Setting up distributed...
Let's use 1 GPUs!
Done
Initializing wandb...
wandb: Currently logged in as: swapnikvarala. Use `wandb login --relogin` to force relogin
Problem at: /usr/local/lib/python3.11/site-packages/wandb/sdk/wandb_init.py 837 getcaller
Traceback (most recent call last):
  File "/content/FedDrive/src/run.py", line 31, in <module>
    run_experiment()
  File "/content/FedDrive/src/run.py", line 15, in run_experiment
    main(args)
  File "/content/FedDrive/src/centr_setting/main.py", line 17, in main
    writer, device, rank, world_size, logger, label2color, denorm = setup_env(args)
                                                                    ^^^^^^^^^^^^^^^
  File "/content/FedDrive/src/utils/utils.py", line 36, in setup_env
    logger = CustomWandbLogger(name=get_job_name(args), project=get_project_name(args.framework, args.dataset),
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/content/FedDrive/src/utils/logger.py", line 7, in __init__
    super(CustomWandbLogger, self).__init__(name=name, project=project, group=group, entity=entity, offline=offline,
  File "/usr/local/lib/python3.11/site-packages/pytorch_lightning/loggers/wandb.py", line 358, in __init__
    _ = self.experiment
        ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/lightning_fabric/loggers/logger.py", line 114, in experiment
    return fn(self)
           ^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pytorch_lightning/loggers/wandb.py", line 406, in experiment
    self._experiment = wandb.init(**self._wandb_init)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/wandb/sdk/wandb_init.py", line 1173, in init
    raise e
  File "/usr/local/lib/python3.11/site-packages/wandb/sdk/wandb_init.py", line 1154, in init
    run = wi.init()
          ^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/wandb/sdk/wandb_init.py", line 770, in init
    raise error
wandb.errors.CommError: Run initialization has timed out after 60.0 sec. 
Please refer to the documentation for additional information: https://docs.wandb.ai/guides/track/tracking-faq#initstarterror-error-communicating-with-wandb-process-
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 24118) of binary: /usr/local/bin/python3
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/usr/local/lib/python3.11/site-packages/torch/distributed/run.py", line 798, in <module>
    main()
  File "/usr/local/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/torch/distributed/run.py", line 794, in main
    run(args)
  File "/usr/local/lib/python3.11/site-packages/torch/distributed/run.py", line 785, in run
    elastic_launch(
  File "/usr/local/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
run.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2023-07-18_17:50:12
  host      : f905eb7441a0
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 24118)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
