picai_baseline's Introduction

Baseline AI Models for Prostate Cancer Detection in MRI

This repository contains utilities to set up and train deep learning-based detection models for clinically significant prostate cancer (csPCa) in MRI. These models serve as the official baseline AI solutions for the PI-CAI challenge. Currently, three models are provided and supported: U-Net, nnU-Net and nnDetection.

All three solutions share the same starting point, with respect to their expected folder structure and data preparation pipeline.

Issues

Please feel free to raise any issues you encounter here.

Installation

picai_baseline can be pip-installed:

pip install picai_baseline

Alternatively, picai_baseline can be installed from source:

git clone https://github.com/DIAGNijmegen/picai_baseline
cd picai_baseline
pip install -e .

Installing from source places the provided Python scripts on your machine, so you can run them directly. In addition, the -e (editable) option lets you modify the baseline solutions without reinstalling the package.

General Setup

We define setup steps that are shared between the different baseline algorithms. To follow the model-specific baseline algorithm tutorials, these steps must be completed first.

Folder Structure

We define three main folders that must be prepared:

  • /input/ contains the PI-CAI dataset. In this tutorial we assume this is the PI-CAI Public Training and Development Dataset.
    • /input/images/ contains the imaging files. For the Public Training and Development Dataset, these can be retrieved here.
    • /input/picai_labels/ contains the annotations. For the Public Training and Development Dataset, these can be retrieved here.
  • /workdir/ stores intermediate results, such as preprocessed images and annotations.
    • /workdir/results/[model name]/ stores model checkpoints/weights during training (enables the ability to pause/resume training).
  • /output/ stores training output, such as trained model weights and preprocessing plan.
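The layout above can be created up front. The following is a minimal sketch, assuming the three folders live under one common root; `prepare_folders`, its `root` argument and the default model name are illustrative and not part of picai_baseline.

```python
import tempfile
from pathlib import Path


def prepare_folders(root: Path, model_name: str = "UNet") -> list:
    """Create the folder layout described above under `root` (hypothetical helper)."""
    folders = [
        root / "input" / "images",
        # /input/picai_labels/ is not created here: cloning the labels creates it
        root / "workdir" / "results" / model_name,
        root / "output",
    ]
    for folder in folders:
        folder.mkdir(parents=True, exist_ok=True)
    return folders


if __name__ == "__main__":
    # demo in a temporary directory; pass Path("/") to get /input, /workdir, /output
    with tempfile.TemporaryDirectory() as tmp:
        for folder in prepare_folders(Path(tmp)):
            print(folder.relative_to(tmp))
```

Using `exist_ok=True` makes the helper safe to re-run when some of the folders already exist.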

Data Preparation

Unless specified otherwise, this tutorial assumes that the PI-CAI: Public Training and Development Dataset will be downloaded and unpacked. Before downloading the dataset, read its documentation and dedicated forum post (for all updates/fixes, if any). To download and unpack the dataset, run the following commands:

# download all folds
curl -C - "https://zenodo.org/record/6624726/files/picai_public_images_fold0.zip?download=1" --output picai_public_images_fold0.zip
curl -C - "https://zenodo.org/record/6624726/files/picai_public_images_fold1.zip?download=1" --output picai_public_images_fold1.zip
curl -C - "https://zenodo.org/record/6624726/files/picai_public_images_fold2.zip?download=1" --output picai_public_images_fold2.zip
curl -C - "https://zenodo.org/record/6624726/files/picai_public_images_fold3.zip?download=1" --output picai_public_images_fold3.zip
curl -C - "https://zenodo.org/record/6624726/files/picai_public_images_fold4.zip?download=1" --output picai_public_images_fold4.zip

# unzip all folds
unzip picai_public_images_fold0.zip -d /input/images/
unzip picai_public_images_fold1.zip -d /input/images/
unzip picai_public_images_fold2.zip -d /input/images/
unzip picai_public_images_fold3.zip -d /input/images/
unzip picai_public_images_fold4.zip -d /input/images/

In case unzip is not installed, you can use Docker to unzip the files:

docker run --cpus=2 --memory=8gb --rm -v /path/to/input:/input joeranbosma/picai_nnunet:latest unzip /input/picai_public_images_fold0.zip -d /input/images/
docker run --cpus=2 --memory=8gb --rm -v /path/to/input:/input joeranbosma/picai_nnunet:latest unzip /input/picai_public_images_fold1.zip -d /input/images/
docker run --cpus=2 --memory=8gb --rm -v /path/to/input:/input joeranbosma/picai_nnunet:latest unzip /input/picai_public_images_fold2.zip -d /input/images/
docker run --cpus=2 --memory=8gb --rm -v /path/to/input:/input joeranbosma/picai_nnunet:latest unzip /input/picai_public_images_fold3.zip -d /input/images/
docker run --cpus=2 --memory=8gb --rm -v /path/to/input:/input joeranbosma/picai_nnunet:latest unzip /input/picai_public_images_fold4.zip -d /input/images/

Please follow the instructions here to set up the Docker container.
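If neither unzip nor Docker is available, the same download-and-unpack steps can be approximated with the Python standard library. This is a sketch under assumptions: the URLs are copied from the curl commands above, the helper names are illustrative, and interrupted downloads are not resumed the way `curl -C -` resumes them.

```python
import urllib.request
import zipfile
from pathlib import Path

# base URL and archive names copied from the curl commands above
BASE_URL = "https://zenodo.org/record/6624726/files"


def fold_archive_name(fold: int) -> str:
    return f"picai_public_images_fold{fold}.zip"


def fold_url(fold: int) -> str:
    return f"{BASE_URL}/{fold_archive_name(fold)}?download=1"


def extract_archive(archive: Path, target_dir: Path) -> None:
    """Unpack one zip archive into target_dir, roughly like `unzip -d`."""
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target_dir)


def download_and_extract(fold: int, download_dir: Path, target_dir: Path) -> None:
    archive = download_dir / fold_archive_name(fold)
    if not archive.exists():  # skips the download only if the file is already there
        urllib.request.urlretrieve(fold_url(fold), archive)
    extract_archive(archive, target_dir)


if __name__ == "__main__":
    for fold in range(5):
        print(fold_url(fold))
        # uncomment to actually download and unpack each fold
        # download_and_extract(fold, Path("."), Path("/input/images"))
```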

Also, collect the training annotations. This can be done via the following command:

cd /input
git clone https://github.com/DIAGNijmegen/picai_labels

After cloning the repository with annotations, you should have a folder structure like this:

/input/picai_labels
├── anatomical_delineations
│   ├── ...
├── clinical_information
│   └── marksheet.csv
└── csPCa_lesion_delineations
    ├── ...
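To confirm the clone produced the structure shown above, a quick check along these lines may help. It is a sketch only: `missing_label_entries` is an illustrative helper, and the expected entries are just the ones visible in the tree above.

```python
from pathlib import Path

# entries taken from the folder tree shown above
EXPECTED_ENTRIES = [
    "anatomical_delineations",
    "clinical_information/marksheet.csv",
    "csPCa_lesion_delineations",
]


def missing_label_entries(labels_dir: Path) -> list:
    """Return the expected files/folders that are absent from labels_dir."""
    return [rel for rel in EXPECTED_ENTRIES if not (labels_dir / rel).exists()]


if __name__ == "__main__":
    missing = missing_label_entries(Path("/input/picai_labels"))
    if missing:
        print("Missing entries:", missing)
    else:
        print("All expected entries found.")
```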

Cross-Validation Splits

We have prepared 5-fold cross-validation splits of all 1500 cases in the PI-CAI: Public Training and Development Dataset. We have ensured there is no patient overlap between training/validation splits. You can load these splits as follows:

from picai_baseline.splits.picai import train_splits, valid_splits

for fold, ds_config in train_splits.items():
    print(f"Training fold {fold} has cases: {ds_config['subject_list']}")

for fold, ds_config in valid_splits.items():
    print(f"Validation fold {fold} has cases: {ds_config['subject_list']}")
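The no-overlap property can be checked on any pair of split dictionaries with the structure used above. The sketch below assumes that case IDs have the form `<patient_id>_<study_id>`; that format, the helper names and the toy IDs are assumptions made for illustration.

```python
def patient_id(case_id: str) -> str:
    """Assumes case IDs look like '<patient_id>_<study_id>'."""
    return case_id.split("_")[0]


def overlapping_patients(train_splits: dict, valid_splits: dict) -> dict:
    """For each fold, return the patients present in both training and validation."""
    overlap = {}
    for fold, train_config in train_splits.items():
        train_patients = {patient_id(c) for c in train_config["subject_list"]}
        valid_patients = {patient_id(c) for c in valid_splits[fold]["subject_list"]}
        overlap[fold] = train_patients & valid_patients
    return overlap


if __name__ == "__main__":
    # toy splits (hypothetical IDs) with the same structure as the real ones
    toy_train = {0: {"subject_list": ["10000_1000000", "10001_1000001"]}}
    toy_valid = {0: {"subject_list": ["10002_1000002"]}}
    print(overlapping_patients(toy_train, toy_valid))  # prints {0: set()}
```

Passing the imported `train_splits` and `valid_splits` instead of the toy dictionaries should yield an empty set for every fold if the splits are patient-disjoint.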

Additionally, we prepared 5-fold cross-validation splits of all cases with an expert-derived csPCa annotation. These splits are subsets of the splits above. You can load these splits as follows:

from picai_baseline.splits.picai_nnunet import train_splits, valid_splits

When using picai_eval from the command line, we recommend saving the splits to disk. Then, you can pass these to picai_eval to ensure all cases were found. You can export the labelled cross-validation splits using:

python -m picai_baseline.splits.picai_nnunet --output "/workdir/splits/picai_nnunet"

Data Preprocessing

We follow the nnU-Net Raw Data Archive format to prepare our dataset for usage. For this, you can use the picai_prep module. The picai_prep module can resample all cases to the same resolution: you can either resample each case individually, so that its sequences share one resolution, or resample the full dataset to a single resolution. For details on the available options to convert the dataset in /input/ into the nnU-Net Raw Data Archive format, and store it in /workdir/nnUNet_raw_data, please see the instructions provided here. Below we give the conversion as performed for the baseline semi-supervised nnU-Net. For the U-Net baseline, please see the U-Net tutorial for extra instructions.

Note, the picai_prep module should be automatically installed when installing the picai_baseline module, and is installed within the picai_nnunet and picai_nndetection Docker containers as well.

python src/picai_baseline/prepare_data_semi_supervised.py

For the baseline semi-supervised U-Net algorithm, specify the dataset-wise resolution: --spacing 3.0 0.5 0.5. To adapt the preprocessing pipeline or its default specifications, either check the available command line options (use the -h flag to list them) or edit the prepare_data_semi_supervised.py script.

Alternatively, you can use Docker to run the Python script:

docker run --cpus=2 --memory=16gb --rm \
    -v /path/to/input/:/input/ \
    -v /path/to/workdir/:/workdir/ \
    -v /path/to/picai_baseline:/scripts/picai_baseline/ \
    joeranbosma/picai_nnunet:latest python3 /scripts/picai_baseline/src/picai_baseline/prepare_data_semi_supervised.py

If you don't want to include the AI-generated annotations, you can also use the supervised data preparation script: prepare_data.py.
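As a rough illustration of what a dataset-wise spacing implies, the physical extent of a resampled volume is the voxel count times the spacing along each axis. The sketch below assumes the spacing is given in millimetres per voxel and pairs it with a matrix size of 20 x 256 x 256 voxels, a value taken from the prepare_data.py excerpt quoted in an issue further down this page; both are assumptions here rather than documented defaults.

```python
def physical_extent_mm(matrix_size, spacing):
    """Extent covered along each axis: number of voxels times voxel spacing."""
    return [n * s for n, s in zip(matrix_size, spacing)]


spacing = [3.0, 0.5, 0.5]      # from --spacing 3.0 0.5 0.5 (assumed mm per voxel)
matrix_size = [20, 256, 256]   # assumed; see the lead-in above

print(physical_extent_mm(matrix_size, spacing))  # prints [60.0, 128.0, 128.0]
```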

Baseline Algorithms

We provide end-to-end training pipelines for csPCa detection/diagnosis in 3D. Each baseline includes a template to encapsulate the trained AI model in a Docker container and upload it to the grand-challenge.org platform as an "algorithm".

U-Net

We include a baseline U-Net to provide a playground environment for participants and kickstart their development cycle. The U-Net baseline generates quick results with minimal complexity, but does so at the expense of sub-optimal performance and low flexibility in adapting to any other task.

→ Read the full documentation here.

nnU-Net

The nnU-Net framework [1] provides a performant framework for medical image segmentation, which is straightforward to adapt for csPCa detection.

→ Read the full documentation here.

nnDetection

The nnDetection framework is geared towards medical object detection [2]. Setting up nnDetection and tweaking its implementation is not as straightforward as for the nnU-Net or U-Net baselines, but it can provide a strong csPCa detection model.

→ Read the full documentation here.

References

[1] Fabian Isensee, Paul F. Jaeger, Simon A. A. Kohl, Jens Petersen and Klaus H. Maier-Hein. "nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation". Nature Methods 18.2 (2021): 203-211.

[2] Michael Baumgartner, Paul F. Jaeger, Fabian Isensee, Klaus H. Maier-Hein. "nnDetection: A Self-configuring Method for Medical Object Detection". International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham, 2021.

[3] Joeran Bosma, Anindo Saha, Matin Hosseinzadeh, Ilse Slootweg, Maarten de Rooij, Henkjan Huisman. "Semi-supervised learning with report-guided lesion annotation for deep learning-based prostate cancer detection in bpMRI". arXiv:2112.05151.

[4] Joeran Bosma, Natalia Alves and Henkjan Huisman. "Performant and Reproducible Deep Learning-Based Cancer Detection Models for Medical Imaging". Under Review.

If you are using this codebase or some part of it, please cite the following article:

Saha A, Bosma JS, Twilt JJ, et al. Artificial intelligence and radiologists in prostate cancer detection on MRI (PI-CAI): an international, paired, non-inferiority, confirmatory study. Lancet Oncol 2024; 25: 879–887

If you are using the AI-generated annotations (i.e., semi-supervised learning), please cite the following article:

J. S. Bosma, A. Saha, M. Hosseinzadeh, I. Slootweg, M. de Rooij, and H. Huisman, "Semisupervised Learning with Report-guided Pseudo Labels for Deep Learning–based Prostate Cancer Detection Using Biparametric MRI", Radiology: Artificial Intelligence, 230031, 2023. doi:10.1148/ryai.230031

BibTeX:

@ARTICLE{SahaBosmaTwilt2024,
  title = {Artificial intelligence and radiologists in prostate cancer detection on MRI (PI-CAI): an international, paired, non-inferiority, confirmatory study},
  journal = {The Lancet Oncology},
  year = {2024},
  issn = {1470-2045},
  volume={25},
  number={7},
  pages={879--887},
  doi = {https://doi.org/10.1016/S1470-2045(24)00220-1},
  author = {Anindo Saha and Joeran S Bosma and Jasper J Twilt and Bram {van Ginneken} and Anders Bjartell and Anwar R Padhani and David Bonekamp and Geert Villeirs and Georg Salomon and Gianluca Giannarini and Jayashree Kalpathy-Cramer and Jelle Barentsz and Klaus H Maier-Hein and Mirabela Rusu and Olivier Rouvière and Roderick {van den Bergh} and Valeria Panebianco and Veeru Kasivisvanathan and Nancy A Obuchowski and Derya Yakar and Mattijs Elschot and Jeroen Veltman and Jurgen J Fütterer and Constant R. Noordman and Ivan Slootweg and Christian Roest and Stefan J. Fransen and Mohammed R.S. Sunoqrot and Tone F. Bathen and Dennis Rouw and Jos Immerzeel and Jeroen Geerdink and Chris {van Run} and Miriam Groeneveld and James Meakin and Ahmet Karagöz and Alexandre Bône and Alexandre Routier and Arnaud Marcoux and Clément Abi-Nader and Cynthia Xinran Li and Dagan Feng and Deniz Alis and Ercan Karaarslan and Euijoon Ahn and François Nicolas and Geoffrey A. Sonn and Indrani Bhattacharya and Jinman Kim and Jun Shi and Hassan Jahanandish and Hong An and Hongyu Kan and Ilkay Oksuz and Liang Qiao and Marc-Michel Rohé and Mert Yergin and Mohamed Khadra and Mustafa E. Şeker and Mustafa S. Kartal and Noëlie Debs and Richard E. Fan and Sara Saunders and Simon J.C. Soerensen and Stefania Moroianu and Sulaiman Vesal and Yuan Yuan and Afsoun Malakoti-Fard and Agnė Mačiūnien and Akira Kawashima and Ana M.M. de M.G. {de Sousa Machadov} and Ana Sofia L. Moreira and Andrea Ponsiglione and Annelies Rappaport and Arnaldo Stanzione and Arturas Ciuvasovas and Baris Turkbey and Bart {de Keyzer} and Bodil G. Pedersen and Bram Eijlers and Christine Chen and Ciabattoni Riccardo and Deniz Alis and Ewout F.W. {Courrech Staal} and Fredrik Jäderling and Fredrik Langkilde and Giacomo Aringhieri and Giorgio Brembilla and Hannah Son and Hans Vanderlelij and Henricus P.J. Raat and Ingrida Pikūnienė and Iva Macova and Ivo Schoots and Iztok Caglic and Jeries P. 
Zawaideh and Jonas Wallström and Leonardo K. Bittencourt and Misbah Khurram and Moon H. Choi and Naoki Takahashi and Nelly Tan and Paolo N. Franco and Patricia A. Gutierrez and Per Erik Thimansson and Pieter Hanus and Philippe Puech and Philipp R. Rau and Pieter {de Visschere} and Ramette Guillaume and Renato Cuocolo and Ricardo O. Falcão and Rogier S.A. {van Stiphout} and Rossano Girometti and Ruta Briediene and Rūta Grigienė and Samuel Gitau and Samuel Withey and Sangeet Ghai and Tobias Penzkofer and Tristan Barrett and Varaha S. Tammisetti and Vibeke B. Løgager and Vladimír Černý and Wulphert Venderink and Yan M. Law and Young J. Lee and Maarten {de Rooij} and Henkjan Huisman},
}
@article{Bosma23,
    author={Joeran S. Bosma and Anindo Saha and Matin Hosseinzadeh and Ivan Slootweg and Maarten de Rooij and Henkjan Huisman},
    title={Semisupervised Learning with Report-guided Pseudo Labels for Deep Learning–based Prostate Cancer Detection Using Biparametric MRI},
    journal={Radiology: Artificial Intelligence},
    pages={e230031},
    year={2023},
    doi={10.1148/ryai.230031},
    publisher={Radiological Society of North America}
}

Managed By

Diagnostic Image Analysis Group, Radboud University Medical Center, Nijmegen, The Netherlands

Contact Information

picai_baseline's People

Contributors

anindox8, joeranbosma

picai_baseline's Issues

Overfitting Issue with PICAI Baseline Model nnDetection

Dear PICAI team,

Recently, I have been working with the baseline model provided for the PICAI dataset. Although I carefully followed the instructions, I keep encountering significant overfitting issues. Attached, you can find a plot illustrating the training process.

[Image: plot of the training process]

I’m wondering if there might be an issue with my training environment, or if there are adjustments I should consider making to the trainer.

Here are some details about my setup:

  • Training environment: Official picai_baseline Docker on A100 40GB
  • Cross-validation: Custom stratified 5-fold CV
  • Preprocessing: I used prepare_data.py and obtained 1295 cases, 80% of which were used in the 5-fold CV to train the model.

Thank you in advance for your assistance, and I hope you’re having a great summer!

plans.pkl file

Hi,

May I know how this plans.pkl file is generated in train.py?

shutil.copy(workdir / "results/UNet/weights_semisupervised/plans.pkl",
                output_dir / "picai_unet_semi_supervised_gc_algorithm/results/UNet/weights_semisupervised/plans.pkl")

UNet training returns ValueError shape mismatch

I followed the general guide and the model-specific guide but I encountered a problem on the UNet training part. I am running the code in Google Colab. I've tried implementing the bugfix PR but the error remains the same. Below is a code snippet of the error:

CPU Threads: 4
Dataset Definition: --------------------------------------------------------------------------------
Fold Number: 0
Data Classes: [0.0, 2.0, 3.0, 4.0, 5.0]
Train-Time Class Weights: [0.46153846 0.53846154]
Training Samples [-:140;+:120]: 260
Validation Samples [-:140;+:120]: 260
Traceback (most recent call last):
  File "/content/drive/MyDrive/300/picai_baseline/src/picai_baseline/unet/train.py", line 161, in <module>
    main()
  File "/content/drive/MyDrive/300/picai_baseline/src/picai_baseline/unet/train.py", line 99, in main
    train_gen, valid_gen, class_weights = prepare_datagens(args=args, fold_id=f)
  File "/content/drive/MyDrive/300/picai_baseline/src/picai_baseline/unet/training_setup/data_generator.py", line 108, in prepare_datagens
    data_pair = monai.utils.misc.first(check_loader)
  File "/usr/local/lib/python3.10/dist-packages/monai/utils/misc.py", line 116, in first
    for i in iterable:
  File "/usr/local/lib/python3.10/dist-packages/batchgenerators/dataloading/data_loader.py", line 126, in __next__
    return self.generate_train_batch()
  File "/content/drive/MyDrive/300/picai_baseline/src/picai_baseline/unet/training_setup/data_generator.py", line 72, in generate_train_batch
    return self.collate_fn(batch)
  File "/content/drive/MyDrive/300/picai_baseline/src/picai_baseline/unet/training_setup/data_generator.py", line 40, in default_collate
    return {key: default_collate([d[key] for d in batch]) for key in batch[0]}
  File "/content/drive/MyDrive/300/picai_baseline/src/picai_baseline/unet/training_setup/data_generator.py", line 40, in <dictcomp>
    return {key: default_collate([d[key] for d in batch]) for key in batch[0]}
  File "/content/drive/MyDrive/300/picai_baseline/src/picai_baseline/unet/training_setup/data_generator.py", line 32, in default_collate
    return np.vstack(batch)
  File "<__array_function__ internals>", line 180, in vstack
  File "/usr/local/lib/python3.10/dist-packages/numpy/core/shape_base.py", line 282, in vstack
    return _nx.concatenate(arrs, 0)
  File "<__array_function__ internals>", line 180, in concatenate
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 2, the array at index 0 has size 21 and the array at index 2 has size 19
---------------------------------------------------------------------------
CalledProcessError                        Traceback (most recent call last)
[<ipython-input-6-6e18c76e07c0>](https://localhost:8080/#) in <cell line: 1>()
----> 1 get_ipython().run_cell_magic('shell', '', "python -u  /content/drive/MyDrive/300/picai_baseline/src/picai_baseline/unet/train.py \\\n  --weights_dir='/content/drive/MyDrive/300/workdir/UNet/weights/' \\\n  --overviews_dir='/content/drive/MyDrive/300/workdir/results/UNet/overviews/Task2201_picai_baseline' \\\n  --folds 0 --max_threads 4 --enable_da 0 --num_epochs 2 \\\n  --validate_n_epochs 1 --validate_min_epoch 0\n")

3 frames
[/usr/local/lib/python3.10/dist-packages/google/colab/_system_commands.py](https://localhost:8080/#) in check_returncode(self)
    135   def check_returncode(self):
    136     if self.returncode:
--> 137       raise subprocess.CalledProcessError(
    138           returncode=self.returncode, cmd=self.args, output=self.output
    139       )

CalledProcessError: Command 'python -u  /content/drive/MyDrive/300/picai_baseline/src/picai_baseline/unet/train.py \
  --weights_dir='/content/drive/MyDrive/300/workdir/UNet/weights/' \
  --overviews_dir='/content/drive/MyDrive/300/workdir/results/UNet/overviews/Task2201_picai_baseline' \
  --folds 0 --max_threads 4 --enable_da 0 --num_epochs 2 \
  --validate_n_epochs 1 --validate_min_epoch 0
' returned non-zero exit status 1.
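The ValueError above comes from np.vstack, which can only stack arrays whose sizes match along every axis except the first; the sizes 21 and 19 in the message indicate that samples of different shapes ended up in one batch. One way to narrow this down is to collect the shape of every preprocessed case and list the ones that deviate from the most common shape. The sketch below works on plain shape tuples; the helper name, case names and shapes are hypothetical.

```python
from collections import Counter


def find_shape_outliers(shapes: dict) -> dict:
    """Return the cases whose shape differs from the most common shape."""
    if not shapes:
        return {}
    most_common_shape, _ = Counter(shapes.values()).most_common(1)[0]
    return {case: shape for case, shape in shapes.items() if shape != most_common_shape}


if __name__ == "__main__":
    # hypothetical cases; 21 and 19 echo the sizes reported in the error message
    shapes = {
        "case_a": (21, 256, 256),
        "case_b": (21, 256, 256),
        "case_c": (19, 256, 256),
    }
    print(find_shape_outliers(shapes))  # prints {'case_c': (19, 256, 256)}
```

The shapes themselves could be gathered with, for example, `sitk.GetArrayFromImage(sitk.ReadImage(path)).shape` for each preprocessed file.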

Unet train error at focal.py

Hello,
I followed the guide but I got stuck at the UNet train part.

Traceback (most recent call last):
  File "/home/usr/projects/bitirme/Codes/picai_baseline/src/picai_baseline/unet/train.py", line 161, in <module>
    main()
  File "/home/usr/projects/bitirme/Codes/picai_baseline/src/picai_baseline/unet/train.py", line 134, in main
    model, optimizer, train_gen, tracking_metrics, writer = optimize_model(
  File "/home/usr/miniconda3/envs/demo/lib/python3.9/site-packages/picai_baseline/unet/training_setup/callbacks.py", line 119, in optimize_model
    loss = loss_func(outputs, labels)
  File "/home/usr/miniconda3/envs/demo/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/usr/miniconda3/envs/demo/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/usr/miniconda3/envs/demo/lib/python3.9/site-packages/picai_baseline/unet/training_setup/loss_functions/focal.py", line 34, in forward
    targets = F.one_hot(targets, num_classes=self.num_classes).float()
RuntimeError: one_hot is only applicable to index tensor.

For debugging I checked the targets dtype and it returned float32. When I converted from float32 to long the issue resolved but now I am getting a new one.

Traceback (most recent call last):
  File "/home/usr/projects/bitirme/Codes/picai_baseline/src/picai_baseline/unet/train.py", line 161, in <module>
    main()
  File "/home/usr/projects/bitirme/Codes/picai_baseline/src/picai_baseline/unet/train.py", line 134, in main
    model, optimizer, train_gen, tracking_metrics, writer = optimize_model(
  File "/home/usr/miniconda3/envs/demo/lib/python3.9/site-packages/picai_baseline/unet/training_setup/callbacks.py", line 119, in optimize_model
    loss = loss_func(outputs, labels)
  File "/home/usr/miniconda3/envs/demo/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/usr/miniconda3/envs/demo/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/usr/miniconda3/envs/demo/lib/python3.9/site-packages/picai_baseline/unet/training_setup/loss_functions/focal.py", line 37, in forward
    ce_loss = F.binary_cross_entropy(inputs, targets, reduction="none")
  File "/home/usr/miniconda3/envs/demo/lib/python3.9/site-packages/torch/nn/functional.py", line 3113, in binary_cross_entropy
    raise ValueError(
ValueError: Using a target size (torch.Size([8, 256, 1, 20, 256, 2])) that is different to the input size (torch.Size([8, 2, 20, 256, 256])) is deprecated. Please ensure they have the same size.

targets.shape = torch.Size([8, 1, 20, 256, 256])
inputs.shape = torch.Size([8, 2, 20, 256, 256])
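For reference, the shape bookkeeping behind this kind of mismatch can be traced without any tensors. torch.nn.functional.one_hot appends the class axis as the last dimension, so a label tensor that still carries a singleton channel axis will not line up with channel-first model outputs. The sketch below is plain Python on shape tuples and only illustrates one possible alignment (drop the channel axis, one-hot encode, move the class axis to position 1); it does not describe what focal.py actually does.

```python
def one_hot_shape(shape: tuple, num_classes: int) -> tuple:
    """Shape after F.one_hot(targets, num_classes): class axis appended last."""
    return shape + (num_classes,)


def drop_channel_axis(shape: tuple) -> tuple:
    """Shape after indexing targets[:, 0]: the singleton channel axis is removed."""
    assert shape[1] == 1, "expected a single-channel label tensor"
    return shape[:1] + shape[2:]


def move_last_axis_to_channel(shape: tuple) -> tuple:
    """Shape after tensor.movedim(-1, 1): class axis placed right after the batch axis."""
    return shape[:1] + shape[-1:] + shape[1:-1]


targets = (8, 1, 20, 256, 256)  # shapes reported above
inputs = (8, 2, 20, 256, 256)

naive = one_hot_shape(targets, num_classes=2)
print(naive)  # prints (8, 1, 20, 256, 256, 2)

aligned = move_last_axis_to_channel(
    one_hot_shape(drop_channel_axis(targets), num_classes=2)
)
print(aligned)  # prints (8, 2, 20, 256, 256)
```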

Inference with pre-trained ensembled nnUNet

Hi,

I would like to reproduce the results in the pi-cai forum (here) on the Public Training and Development Dataset, by using the pre-trained nnUNet models provided (picai_nnunet_gc_algorithm, picai_nnunet_semi_supervised_gc_algorithm).

Following the instructions on the github page (nnunet_baseline.md), could you help me figure out how to consolidate the pre-trained folds, to then run inference (as specified in "or ensemble the models for the test set")? It seems that in contrast to the nnDetection documentation, there is no consolidate script provided in the picai_baseline repo?

Maybe this might also be interesting to other people, hence worth updating the documentation :)

Thank you for your time!

Best,

Patrick

Running baseline models on cpu not possible with nvidia docker base image

Hi,

I've been trying to do inference with the pre-trained nnDetection baseline models (picai_nndetection_gc_algorithm, picai_nndetection_gc_algorithm) on a local prostate MRI dataset. My current approach was to first try to reproduce some of the results mentioned in the pi-cai forum (here) on the Public Training and Development Dataset and on my local computer (macos, m1 chip).

It seems the base docker image referenced in the Dockerfile "nvcr.io/nvidia/pytorch:20.12-py3" described in nndetection_baseline.md requires an NVIDIA account to be built. Also when using the pre-built version from Docker hub, although the preprocessing of the data works, when trying to run inference on my local machine (no GPU) with

docker run -it --rm \
    -v /path/to/workdir:/workdir \
    -v /path/to/images:/input/images \
    joeranbosma/picai_nndetection:latest nndet predict Task2201_picai_baseline RetinaUNetV001_D3V001_3d /workdir \
    --fold -1 --check --resume --input /input/images --output /workdir/predictions/ --results /workdir/results/nnDet

a runtime error is generated ('RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx').
Is there an alternative docker base image that one can use for building the dockerfile, compatible with mac silicon architecture + no cuda / nvidia gpu used?

I've tried 'FROM armswdev/pytorch-arm-neoverse:r23.03-torch-1.13.0-openblas' instead, but it created package issues after building, during pre-processing.

Thanks for your help!

Best,
Patrick

Error when running `python src/picai_baseline/unet/plan_overview.py`

Hi, thanks for the code!
I encountered the following error while running python src/picai_baseline/unet/plan_overview.py.
Could you help me figure it out?

Preparing fold 0..
Traceback (most recent call last):
  File "src/picai_baseline/unet/plan_overview.py", line 51, in <module>
    lbl = sitk.GetArrayFromImage(sitk.ReadImage(str(preprocessed_data_path / 'labelsTr' / f'{subject_id}.nii.gz')))
  File "/research/d4/gds/zwang21/anaconda3/envs/surgical/lib/python3.8/site-packages/SimpleITK/extra.py", line 346, in ReadImage
    return reader.Execute()
  File "/research/d4/gds/zwang21/anaconda3/envs/surgical/lib/python3.8/site-packages/SimpleITK/SimpleITK.py", line 8015, in Execute
    return _SimpleITK.ImageFileReader_Execute(self)
RuntimeError: Exception thrown in SimpleITK ImageFileReader_Execute: /tmp/SimpleITK/Code/IO/src/sitkImageReaderBase.cxx:97:
sitk::ERROR: The file "workdir/nnUNet_raw_data/Task2201_picai_baseline/labelsTr/10133_1000135.nii.gz" does not exist.

BTW, I only uncommented the following code in src/picai_baseline/prepare_data.py:

if mha2nnunet_settings_path.exists():
    print(f"Found mha2nnunet settings at {mha2nnunet_settings_path}, skipping..")
else:
    # generate mha2nnunet conversion plan
    Path(mha2nnunet_settings_path.parent).mkdir(parents=True, exist_ok=True)
    generate_mha2nnunet_settings(
        archive_dir=mha_archive_dir,
        annotations_dir=annotations_dir,
        output_path=mha2nnunet_settings_path,
    )

    # read mha2nnunet_settings
    with open(mha2nnunet_settings_path) as fp:
        mha2nnunet_settings = json.load(fp)

    # note: modify preprocessing settings here
    mha2nnunet_settings["preprocessing"]["matrix_size"] = [20, 256, 256]
    mha2nnunet_settings["preprocessing"]["spacing"] = [3.0, 0.5, 0.5]

    # save mha2nnunet_settings
    with open(mha2nnunet_settings_path, "w") as fp:
        json.dump(mha2nnunet_settings, fp, indent=4)
    print(f"Saved mha2nnunet settings to {mha2nnunet_settings_path}")
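Since the error reports a label file that does not exist, it can help to list every missing label before running plan_overview.py. This is a minimal sketch, assuming labels are stored as `<subject_id>.nii.gz` under `labelsTr` as in the error message; the helper name is illustrative and the subject list would come from the cross-validation splits.

```python
from pathlib import Path


def missing_labels(task_dir: Path, subject_ids: list) -> list:
    """Return the subject IDs without a label file in <task_dir>/labelsTr."""
    labels_dir = task_dir / "labelsTr"
    return [s for s in subject_ids if not (labels_dir / f"{s}.nii.gz").exists()]


if __name__ == "__main__":
    task_dir = Path("workdir/nnUNet_raw_data/Task2201_picai_baseline")
    # subject ID taken from the error message; extend with the IDs of your split
    print(missing_labels(task_dir, ["10133_1000135"]))
```

A long list of missing IDs would suggest that the conversion step did not finish, whereas a handful of missing IDs points at individual cases.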

I am unable to run this.

Traceback (most recent call last):
  File "train.py", line 165, in <module>
    main()
  File "train.py", line 140, in main
    args=args, tracking_metrics=tracking_metrics, device=device, writer=writer
  File "/cs/research/external/home/namjad/Downloads/picai_baseline/src/picai_baseline/unet/training_setup/callbacks.py", line 117, in optimize_model
    loss = loss_func(outputs, labels[:,0, ...].long())
  File "/cs/research/external/home/namjad/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/cs/research/external/home/namjad/Downloads/picai_baseline/src/picai_baseline/unet/training_setup/loss_functions/focal.py", line 32, in forward
    targets = F.one_hot(targets, num_classes=self.num_classes).float()
RuntimeError: CUDA error: device-side assert triggered
88 namjad@blaze% python3 -u train.py --weights_dir='/cs/research/external/home/namjad/Downloads/picai_baseline/src/picai_baseline/workdir/results/UNet/weights/' --overviews_dir='/cs/research/external/home/namjad/Downloads/picai_baseline/src/picai_baseline/workdir/results/UNet/overviews/' --folds 1 --max_threads 6 --enable_da 1 --num_epochs 250 --validate_n_epochs 1 --validate_min_epoch 0 --num_classes 5
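One well-known cause of a device-side assert inside F.one_hot is a label value outside the range [0, num_classes); whether that is the cause here cannot be established from the log alone. The sketch below shows a simple range check that could be run on the unique label values before training. The helper name is illustrative and the sample values are hypothetical, modelled on the "Data Classes" line printed in an earlier log on this page.

```python
def out_of_range_labels(label_values, num_classes: int) -> list:
    """Return the distinct label values that one-hot encoding cannot represent."""
    return sorted(v for v in set(label_values) if not 0 <= v < num_classes)


if __name__ == "__main__":
    # hypothetical label values; valid indices for 5 classes are 0, 1, 2, 3, 4
    labels = [0, 2, 3, 4, 5]
    print(out_of_range_labels(labels, num_classes=5))  # prints [5]
```

If the check returns any values, either the labels need remapping or the number of classes passed to the training script does not match the data.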

`nndet prep_train` fails using local Dockerfile

Hello,

When trying to run your Docker version of the nnDetection network:

cd src/picai_baseline/nndetection/training_docker/
docker build . --tag joeranbosma/picai_nndetection:latest

The preprocessing steps worked well but when running the Docker command of nndet prep_train:

docker run --cpus=6 --gpus='"device=0"' -it --rm \
        -v /workdir:/workdir \
        joeranbosma/picai_nndetection:latest nndet prep_train \
        Task2203_picai_baseline /workdir/ \
        --custom_split /workdir/nnUNet_raw_data/Task2203_picai_baseline/splits.json \
        --fold 0

I have the following error message:

=============
== PyTorch ==
=============

NVIDIA Release 20.12 (build 17950526)
PyTorch Version 1.8.0a0+1606899

Container image Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.

Copyright (c) 2014-2020 Facebook Inc.
Copyright (c) 2011-2014 Idiap Research Institute (Ronan Collobert)
Copyright (c) 2012-2014 Deepmind Technologies    (Koray Kavukcuoglu)
Copyright (c) 2011-2012 NEC Laboratories America (Koray Kavukcuoglu)
Copyright (c) 2011-2013 NYU                      (Clement Farabet)
Copyright (c) 2006-2010 NEC Laboratories America (Ronan Collobert, Leon Bottou, Iain Melvin, Jason Weston)
Copyright (c) 2006      Idiap Research Institute (Samy Bengio)
Copyright (c) 2001-2004 Idiap Research Institute (Ronan Collobert, Samy Bengio, Johnny Mariethoz)
Copyright (c) 2015      Google Inc.
Copyright (c) 2015      Yangqing Jia
Copyright (c) 2013-2016 The Caffe contributors
All rights reserved.

NVIDIA Deep Learning Profiler (dlprof) Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.      

Various files include modifications (c) NVIDIA CORPORATION.  All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying project or file.       

NOTE: MOFED driver for multi-node communication was not detected.
      Multi-node communication performance may be reduced.       

NOTE: The SHMEM allocation limit is set to the default of 64MB.  This may be   
   insufficient for PyTorch.  NVIDIA recommends the use of the following flags:
   nvidia-docker run --ipc=host ...

[#] Creating plans and preprocessing data
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/pkg_resources/__init__.py", line 567, in _build_master
    ws.require(__requires__)
  File "/opt/conda/lib/python3.8/site-packages/pkg_resources/__init__.py", line 884, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/opt/conda/lib/python3.8/site-packages/pkg_resources/__init__.py", line 775, in resolve
    raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (packaging 20.4 (/opt/conda/lib/python3.8/site-packages), Requirement.parse('packaging>20.9'), {'shap'})

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/bin/nndet_prep", line 33, in <module>
    sys.exit(load_entry_point('nndet', 'console_scripts', 'nndet_prep')())
  File "/opt/conda/bin/nndet_prep", line 25, in importlib_load_entry_point
    return next(matches).load()
  File "/opt/conda/lib/python3.8/importlib/metadata.py", line 77, in load
    module = import_module(match.group('module'))
  File "/opt/conda/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 783, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/opt/code/nnDetection/scripts/preprocess.py", line 36, in <module>
    from nndet.planning import DatasetAnalyzer
  File "/opt/code/nnDetection/nndet/planning/__init__.py", line 2, in <module>
    from nndet.planning.experiment import PLANNER_REGISTRY
  File "/opt/code/nnDetection/nndet/planning/experiment/__init__.py", line 6, in <module>
    from nndet.planning.experiment.v001 import D3V001
  File "/opt/code/nnDetection/nndet/planning/experiment/v001.py", line 6, in <module>
    from nndet.ptmodule import MODULE_REGISTRY
  File "/opt/code/nnDetection/nndet/ptmodule/__init__.py", line 3, in <module>
    from nndet.ptmodule.base_module import LightningBaseModule
  File "/opt/code/nnDetection/nndet/ptmodule/base_module.py", line 24, in <module>
    import pytorch_lightning as pl
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/__init__.py", line 20, in <module>
    from pytorch_lightning import metrics  # noqa: E402
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/metrics/__init__.py", line 15, in <module>
    from pytorch_lightning.metrics.classification import (  # noqa: F401
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/metrics/classification/__init__.py", line 14, in <module>
    from pytorch_lightning.metrics.classification.accuracy import Accuracy  # noqa: F401
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/metrics/classification/accuracy.py", line 16, in <module>
    from torchmetrics import Accuracy as _Accuracy
  File "/opt/conda/lib/python3.8/site-packages/torchmetrics/__init__.py", line 14, in <module>
    from torchmetrics import functional  # noqa: E402
  File "/opt/conda/lib/python3.8/site-packages/torchmetrics/functional/__init__.py", line 14, in <module>
    from torchmetrics.functional.audio.pit import permutation_invariant_training, pit, pit_permutate
  File "/opt/conda/lib/python3.8/site-packages/torchmetrics/functional/audio/__init__.py", line 14, in <module>
    from torchmetrics.functional.audio.pit import permutation_invariant_training, pit, pit_permutate  # noqa: F401
  File "/opt/conda/lib/python3.8/site-packages/torchmetrics/functional/audio/pit.py", line 24, in <module>
    from torchmetrics.utilities.imports import _SCIPY_AVAILABLE
  File "/opt/conda/lib/python3.8/site-packages/torchmetrics/utilities/imports.py", line 22, in <module>
    from pkg_resources import DistributionNotFound, get_distribution
  File "/opt/conda/lib/python3.8/site-packages/pkg_resources/__init__.py", line 3239, in <module>
    def _initialize_master_working_set():
  File "/opt/conda/lib/python3.8/site-packages/pkg_resources/__init__.py", line 3222, in _call_aside
    f(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/pkg_resources/__init__.py", line 3251, in _initialize_master_working_set
    working_set = WorkingSet._build_master()
  File "/opt/conda/lib/python3.8/site-packages/pkg_resources/__init__.py", line 569, in _build_master
    return cls._build_from_requirements(__requires__)
  File "/opt/conda/lib/python3.8/site-packages/pkg_resources/__init__.py", line 582, in _build_from_requirements
    dists = ws.resolve(reqs, Environment())
  File "/opt/conda/lib/python3.8/site-packages/pkg_resources/__init__.py", line 775, in resolve
    raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (packaging 20.4 (/opt/conda/lib/python3.8/site-packages), Requirement.parse('packaging>20.9'), {'shap'})
Traceback (most recent call last):
  File "/usr/local/bin/nndet", line 369, in <module>
    action(sys.argv[2:])
  File "/usr/local/bin/nndet", line 148, in nndet_prep_train
    subprocess.check_call(cmd)
  File "/opt/conda/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['nndet_prep', '2203']' returned non-zero exit status 1.

However, if I use the version from Docker Hub:

docker pull joeranbosma/picai_nndetection:latest

then docker run ... nndet prep_train runs without issue. I suspect a problem with how the dependencies are installed in the Dockerfile, but I could not find the cause.
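To narrow down a conflict like the one in the traceback above (packaging 20.4 installed, while shap requires packaging>20.9), a quick check inside the container can help. The following is only a minimal sketch using the standard library: the helper names version_tuple, is_newer_than and installed_version are hypothetical, and the comparison handles only plain dotted numeric versions, not the full version specifier syntax.

```python
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Turn a simple dotted version such as '20.4' into (20, 4).

    Non-numeric parts (e.g. 'post1') are skipped; this is a simplification.
    """
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def is_newer_than(installed: str, minimum_exclusive: str) -> bool:
    """Return True when `installed` is strictly newer than `minimum_exclusive`."""
    return version_tuple(installed) > version_tuple(minimum_exclusive)


def installed_version(name: str):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("packaging")
    print("packaging installed:", found)
    if found is not None:
        # 'packaging>20.9' is the requirement reported for shap in the traceback.
        print("satisfies packaging>20.9:", is_newer_than(found, "20.9"))
```

Running this in both the locally built image and the Docker Hub image would show whether the two images ship different versions of packaging.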

Best,
Alexandre

AttributeError in MultiThreadedAugmenter during UNet training.

Hello,
I am encountering an AttributeError in my semi-supervised UNet training setup. Initially, I received the error consistently in every epoch. After increasing my WSL RAM to 16GB, the error persists but now occurs after a varying number of epochs.

I am running this command in an Ubuntu terminal:

python3 -u train.py \
    --weights_dir='/home/buraknebio/picai_baseline/workdir/results/UNet/weights/' \
    --overviews_dir='/home/buraknebio/picai_baseline/workdir/results/UNet/overviews/Task2201_picai_baseline' \
    --folds 0 1 2 3 4 --max_threads 6 --enable_da 1 --num_epochs 250 \
    --validate_n_epochs 1 --validate_min_epoch 0

Error Details:

Epoch 59/250 (Train. Loss: 11120.5093; Time: 99sec; Steps Completed: 100)
Valid. Performance [Benign or Indolent PCa (n=37) vs. csPCa (n=19)]:
Ranking Score = 0.456, AP = 0.240, AUROC = 0.673
Traceback (most recent call last):
  File "/home/buraknebio/picai_baseline/src/picai_baseline/unet/train.py", line 161, in <module>
    main()
  File "/home/buraknebio/picai_baseline/src/picai_baseline/unet/train.py", line 134, in main
    model, optimizer, train_gen, tracking_metrics, writer = optimize_model(
  File "/home/buraknebio/.local/lib/python3.10/site-packages/picai_baseline/unet/training_setup/callbacks.py", line 111, in optimize_model
    for batch_data in train_gen:
  File "/home/buraknebio/.local/lib/python3.10/site-packages/picai_baseline/unet/training_setup/augmentations/multi_threaded_augmenter.py", line 200, in __next__
    item = self.__get_next_item()
  File "/home/buraknebio/.local/lib/python3.10/site-packages/picai_baseline/unet/training_setup/augmentations/multi_threaded_augmenter.py", line 188, in __get_next_item
    if not self.pin_memory_queue.empty():
AttributeError: 'MultiThreadedAugmenter' object has no attribute 'pin_memory_queue'. Did you mean: 'pin_memory_thread'?

Iterative Training Attempts

Due to the recurring AttributeError and the inconsistency in the number of epochs at which it occurs, I've attempted to mitigate the issue by running the training script in a loop for several iterations. Each iteration involves restarting the training process from the beginning, utilizing the weights of the previous epoch.
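The restart loop described above can be sketched roughly as follows. This is an illustration under assumptions, not the script actually used: run_with_retries is a hypothetical helper, and the demo command is a stand-in for the real training command. Whether training resumes from the most recent epoch depends entirely on which checkpoint the training script reloads on start, which this wrapper does not control.

```python
import subprocess
import sys


def run_with_retries(cmd, max_attempts=5):
    """Run `cmd` until it exits with status 0 or `max_attempts` is reached.

    Returns the number of attempts used, or None if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return attempt
        print(f"attempt {attempt} exited with status {result.returncode}; restarting")
    return None


if __name__ == "__main__":
    # Stand-in command for illustration; replace with the real training command.
    demo_cmd = [sys.executable, "-c", "print('training step')"]
    print("attempts used:", run_with_retries(demo_cmd, max_attempts=3))
```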

epoch  train_loss  valid_auroc  valid_ap  valid_ranking
101    7638.408    0.607397     0.256964  0.43218
102    8185.666    0.600284     0.290952  0.445618
103    8492.908    0.611664     0.259046  0.435355
104    8340.153    0.620199     0.255093  0.437646
48     11815.43    0.657183     0.22646   0.441822
49     11887.91    0.577525     0.213911  0.395718
50     14961.47    0.608819     0.226688  0.417754

As the table shows, the error occurred at epoch 104. In that iteration, the weights went back to an earlier state and training continued from epoch 48.

Unfortunately, while this approach allows the script to continue running and avoids termination, it appears to consistently revert to a specific epoch in each iteration. This behavior raises concerns about the progress of the training and the accurate updating of weights. I am seeking guidance on how to address this issue.

Environment Details:

OS: Ubuntu on WSL 2
Python Version: 3.10.12
GPU: RTX 4060 8GB
Processor: AMD Ryzen 5 5600
RAM: 16GB (100% allocated to WSL)

               total        used        free      shared  buff/cache   available
Mem:            15Gi       482Mi        10Gi       2.0Mi       4.7Gi        14Gi
Swap:          4.0Gi        33Mi       4.0Gi

Which data directory should be used as the input directory for inference?

Hi, I am following the provided instructions to train nnU-Net. For inference, there is an argument --input /input/images/. Which directory should we use: the raw data, the preprocessed data, or the cropped data?

docker run --cpus=8 --memory=28gb --gpus='"device=0"' --rm \
    -v /path/to/test_set/images:/input/images \
    -v /path/to/workdir/results:/workdir/results \
    -v /path/to/workdir/predictions:/output/predictions \
    joeranbosma/picai_nnunet:latest nnunet predict Task2201_picai_baseline \
    --trainer nnUNetTrainerV2_Loss_FL_and_CE_checkpoints \
    --fold 0 --checkpoint model_best \
    --results /workdir/results \
    --input /input/images/ \
    --output /output/predictions \
    --store_probability_maps

Thanks!

Wrong path to labels when running preprocessing with prepare_data.py

Hi,

It seems that the labels path '/input/labels/' described in the repo does not match the path '/input/picai_labels/' that is defined in the prepare_data.py script (and passed in the 'mha2nnunet_settings_path' function as 'annotations_dir') and expected in picai_archive.py. When running the Docker command for data preprocessing (converting from MHA to the nnU-Net Raw Data Archive), this leaves 'all_scans_found' unassigned and produces the error message "Did not find any MHA scans in {archive_dir}, {archive_list} {all_scans_found} aborting."

I solved the issue by deleting the "labels" folder and placing the "picai_labels" subfolder directly in the input folder. A simple general fix would be to update the folder structure description in the picai_baseline repo (from "labels" to "picai_labels") and change the cloning command for picai_labels to "git clone https://github.com/DIAGNijmegen/picai_labels /input/".
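A mismatch like this can be caught before preprocessing starts with a short check of the input folder. The sketch below assumes the layout discussed here (an "images" and a "picai_labels" subfolder under the input directory); missing_folders is a hypothetical helper and not part of picai_baseline or picai_prep.

```python
from pathlib import Path


def missing_folders(input_dir, expected=("images", "picai_labels")):
    """Return the expected subfolders that are absent from `input_dir`."""
    root = Path(input_dir)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    # '/input' is the mount point assumed in this thread; adjust as needed.
    missing = missing_folders("/input")
    if missing:
        print("Missing under /input:", ", ".join(missing))
    else:
        print("Folder structure looks complete.")
```

With the annotations placed under "labels" instead of "picai_labels", the check would report "picai_labels" as missing.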

Best,

Patrick
