icon-lab / resvit
Official Implementation of ResViT: Residual Vision Transformers for Multi-modal Medical Image Synthesis
License: Other
Could you please provide code for ResViT to visualize the attention maps? Attention rollout and attention flow are mentioned in the paper, but it is unclear how to implement them.
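For reference, here is a minimal sketch of attention rollout as described by Abnar & Zuidema (2020). It assumes you can collect the per-layer attention tensors from the model yourself; the released code does not obviously expose them, so the input shapes here are assumptions on my part:

```python
import torch

def attention_rollout(attentions):
    """Attention rollout: propagate attention through layers while
    accounting for residual connections (Abnar & Zuidema, 2020).

    attentions: list of per-layer tensors, each of shape
                (num_heads, num_tokens, num_tokens).
    Returns a (num_tokens, num_tokens) rollout matrix.
    """
    num_tokens = attentions[0].size(-1)
    rollout = torch.eye(num_tokens)
    for attn in attentions:
        attn = attn.mean(dim=0)                       # fuse heads by averaging
        attn = attn + torch.eye(num_tokens)           # identity for the residual path
        attn = attn / attn.sum(dim=-1, keepdim=True)  # re-normalize rows
        rollout = attn @ rollout                      # compose with earlier layers
    return rollout
```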
The following command is wrong:
wget https://storage.googleapis.com/vit_models/imagenet21k/R50-ViT-B_16.npz
It should be:
wget https://storage.googleapis.com/vit_models/imagenet21k/R50+ViT-B_16.npz
Hey, and thanks for a great repository.
While setting the code up I found that some of the dependencies were missing from the README: specifically scikit-image, h5py, and ml_collections.
Additionally, in more recent versions of scikit-image, compare_psnr has been moved and renamed from skimage.measure.compare_psnr to skimage.metrics.peak_signal_noise_ratio.
In more recent versions of PyTorch, the .cuda(async=...) keyword has also been replaced with .cuda(non_blocking=...), since async is a reserved keyword in Python >= 3.7.
I see two ways to fix these issues: either by pinning compatible package versions or, my preferred solution, updating the codebase to the new keywords and functions (a sketch follows below).
I'd gladly submit a pull request with the mentioned changes if you'd like me to.
I also made a Dockerfile for setting up an environment with the requirements, if that should be of any interest.
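For reference, a minimal sketch of the two renames as a compatibility shim (the try/except import is my own suggestion, not something currently in the repository):

```python
import numpy as np
import torch

# compare_psnr moved in scikit-image >= 0.16; fall back for older versions.
try:
    from skimage.metrics import peak_signal_noise_ratio as compare_psnr
except ImportError:
    from skimage.measure import compare_psnr

psnr = compare_psnr(np.ones((8, 8)), np.ones((8, 8)) * 0.5, data_range=1.0)

# `async` is reserved in Python >= 3.7, so the old .cuda(async=True)
# call must become .cuda(non_blocking=True).
tensor = torch.randn(1, 3, 256, 256)
if torch.cuda.is_available():
    tensor = tensor.cuda(non_blocking=True)
```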
Regarding the experiments on the BraTS dataset, how did you divide the training and test sets?
Can you please explain the data organization for multi-modal synthesis?
How should the data be arranged for T1,T2 -> FLAIR in aligned data mode?
In the train folder there are two subfolders, A and B. Do we keep the T1 and T2 images together in folder A, or split them across A and B? And where do the target images (FLAIRs) go?
Secondly, you mentioned that for multi-modal inputs the contrasts should go in the green and red channels, so do we need to convert them to green and red? For example, I have two CT images acquired at different time points and want to synthesize a target image, analogous to red = T1, green = T2, plus the target image. Or do we stack red and green together in folder A and put the targets in folder B?
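To make the question concrete, here is a hypothetical sketch of the channel-stacked arrangement I am asking about: the two source contrasts packed into the red and green channels of one image in folder A, with the target in folder B. The file names, paths, and the assumption that slices are already co-registered 8-bit PNGs are all mine, not the repository's:

```python
import numpy as np
from PIL import Image

# Hypothetical co-registered 8-bit grayscale slices.
t1    = np.array(Image.open('t1_slice.png').convert('L'))
t2    = np.array(Image.open('t2_slice.png').convert('L'))
flair = np.array(Image.open('flair_slice.png').convert('L'))

# Pack the two source contrasts into one RGB image: red <- T1, green <- T2.
src = np.zeros((*t1.shape, 3), dtype=np.uint8)
src[..., 0] = t1
src[..., 1] = t2

Image.fromarray(src).save('train/A/0001.png')    # inputs in folder A
Image.fromarray(flair).save('train/B/0001.png')  # targets in folder B
```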
I have a few questions about data preprocessing, as listed below:
How do you normalize the BraTS and IXI image volumes?
Are the intensity values of each volume normalized to the range [-1, 1] (max-min normalization)?
Since the BraTS dataset contains images acquired under different clinical protocols and scanners, how does its normalization differ from IXI's?
Could you share your Python script for data preprocessing?
Thanks!
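To clarify, this is the max-min normalization to [-1, 1] I am asking about; a sketch of the generic formula, not the authors' actual preprocessing:

```python
import numpy as np

def minmax_normalize(volume: np.ndarray) -> np.ndarray:
    """Scale a volume's intensities into [-1, 1] via max-min normalization."""
    vmin, vmax = volume.min(), volume.max()
    return 2.0 * (volume - vmin) / (vmax - vmin) - 1.0
```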
After I run the test code, the visdom tool generates nothing. As far as I can tell there is no bug in the visdom-related code, but nothing appears at http://localhost:8097/.
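A hedged note: in pix2pix/CycleGAN-derived codebases like this one, the visdom server usually has to be started separately before running the scripts, otherwise nothing is served at localhost:8097:

```
python -m visdom.server
```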
Hello, I don't understand how your dataset should be laid out. Can you describe it in detail? I can see aligned, unaligned, and single dataset modes here. In the aligned dataset, are the data of different modalities (such as T1 and T2) placed in the same folder? And is the data stored as JPG images? I hope you can describe the placement of the dataset images in detail. Thank you!
Why is the L1 loss-term weight set to 100?
Dear Mr. Dalmaz and rest of the team,
First, I would like to thank you for your work and for making it available to the public.
I am trying to use it for the task of sCT generation with my own dataset. Unfortunately, I cannot find where in the code the second term of the loss (as explained in your paper) is implemented: the pixel-wise consistency loss, or Lrec. Based on my understanding, this loss computes the L1 distance between the source image (MR in this case) and the MR image reconstructed by the generator from the sCT. Is that right?
Would you mind pointing out where that happens in the code? I can only locate the pixel-wise L1 loss between the CT and the sCT, and the adversarial loss.
Thanks in advance!
Best regards
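To make the question concrete, here is a hypothetical PyTorch sketch of the two pixel-wise terms being discussed; the variable names are made up for illustration, not ResViT's actual names:

```python
import torch
import torch.nn as nn

l1 = nn.L1Loss()

# Dummy tensors standing in for the images above (hypothetical names).
real_mr = torch.randn(1, 1, 256, 256)  # source MR
real_ct = torch.randn(1, 1, 256, 256)  # ground-truth CT
fake_ct = torch.randn(1, 1, 256, 256)  # generator output (sCT)
rec_mr  = torch.randn(1, 1, 256, 256)  # MR reconstructed from the sCT

loss_pix = l1(fake_ct, real_ct)  # the pixel-wise L1 term that is easy to locate
loss_rec = l1(rec_mr,  real_mr)  # the consistency term (Lrec) in question
```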
Hello, it is an honor to communicate with you. Inspired by your work, I have recently been studying modality synthesis, and I have run into some problems along the way that I hope you can answer.
The problem is: when one modality generates another, we need to splice the two modalities together. For example, when generating T2 from T1, we splice the slices of the two modalities together horizontally. What happens when we need multiple modalities to generate a single one? For example, when T1 and T2 generate FLAIR, do we splice the data of all three modalities horizontally, or is there some other processing method?
I sincerely hope to get your answer!
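To illustrate what I mean by splicing horizontally, here is a sketch with made-up file names; extending the one-to-one convention to three modalities is exactly the part I am unsure about:

```python
import numpy as np
from PIL import Image

# Hypothetical co-registered slices, loaded as grayscale arrays.
t1    = np.array(Image.open('t1.png').convert('L'))
t2    = np.array(Image.open('t2.png').convert('L'))
flair = np.array(Image.open('flair.png').convert('L'))

# One-to-one case: source and target side by side.
pair = np.concatenate([t1, t2], axis=1)

# Many-to-one case (?): all three spliced horizontally.
triple = np.concatenate([t1, t2, flair], axis=1)
Image.fromarray(triple).save('train/0001.png')
```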
Hi,
When I try to fine-tune ResViT with your provided code, a pickle error is raised:
OSError: Failed to interpret file './model/vit_checkpoint/imagenet21k/R50+ViT-B_16.npz' as a pickle
How can I solve it?
(See Line 167 in 1963c1b.)
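A hedged sanity check: that OSError from NumPy usually means the .npz file is not a valid archive, for example a truncated download or an HTML error page saved under the .npz name. Re-downloading and confirming the file loads is one way to rule that out:

```python
import numpy as np

path = './model/vit_checkpoint/imagenet21k/R50+ViT-B_16.npz'
weights = np.load(path)   # succeeds only for a valid NumPy archive
print(weights.files[:5])  # show a few stored parameter names
```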
Hello Mr. Dalmaz, thank you for your nice work. To better reproduce your experiments, could you provide the preprocessed datasets used in the paper?
Hi,
Thanks for sharing your code!
Could you share more details about how to train the unified model?
Thank you!
Reuben
Hi!
Thank you for sharing the code for this excellent paper! Could you share the pre-trained models for the results mentioned in the paper (BraTS, IXI, etc.)?
Thank you very much!