icon-lab / resvit
Official Implementation of ResViT: Residual Vision Transformers for Multi-modal Medical Image Synthesis
License: Other
Could you please provide code for ResViT to visualize the attention maps? Attention rollout and attention flow are mentioned in the paper, but it is unclear how to implement them.
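For reference, here is a minimal sketch of attention rollout as described by Abnar & Zuidema (2020). It assumes you can collect the per-layer attention tensors from the model yourself; the released code does not obviously expose them, so the input shapes here are assumptions on my part:

```python
import torch

def attention_rollout(attentions):
    """Attention rollout: propagate attention through layers while
    accounting for residual connections (Abnar & Zuidema, 2020).

    attentions: list of per-layer tensors, each of shape
                (num_heads, num_tokens, num_tokens).
    Returns a (num_tokens, num_tokens) rollout matrix.
    """
    num_tokens = attentions[0].size(-1)
    rollout = torch.eye(num_tokens)
    for attn in attentions:
        attn = attn.mean(dim=0)                       # fuse heads by averaging
        attn = attn + torch.eye(num_tokens)           # identity for the residual path
        attn = attn / attn.sum(dim=-1, keepdim=True)  # re-normalize rows
        rollout = attn @ rollout                      # compose with earlier layers
    return rollout
```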
The following command is wrong:
wget https://storage.googleapis.com/vit_models/imagenet21k/R50-ViT-B_16.npz
It should be:
wget https://storage.googleapis.com/vit_models/imagenet21k/R50+ViT-B_16.npz
Hey, and thanks for a great repository.
While setting the code up I found that some of the dependencies were missing from the README: specifically scikit-image, h5py, and ml_collections.
Additionally, in more recent versions of scikit-image, compare_psnr has been moved and renamed from skimage.measure.compare_psnr to skimage.metrics.peak_signal_noise_ratio.
In more recent versions of PyTorch, the .cuda(async=...) keyword has also been replaced with .cuda(non_blocking=...), since async is a reserved keyword in Python >= 3.7.
I see two ways to fix these issues: either by pinning compatible package versions or, my preferred solution, updating the codebase to the new keywords and functions (a sketch follows below).
I'd gladly submit a pull request with the mentioned changes if you'd like me to.
I also made a Dockerfile for setting up an environment with the requirements, if that should be of any interest.
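For reference, a minimal sketch of the two renames as a compatibility shim (the try/except import is my own suggestion, not something currently in the repository):

```python
import numpy as np
import torch

# compare_psnr moved in scikit-image >= 0.16; fall back for older versions.
try:
    from skimage.metrics import peak_signal_noise_ratio as compare_psnr
except ImportError:
    from skimage.measure import compare_psnr

psnr = compare_psnr(np.ones((8, 8)), np.ones((8, 8)) * 0.5, data_range=1.0)

# `async` is reserved in Python >= 3.7, so the old .cuda(async=True)
# call must become .cuda(non_blocking=True).
tensor = torch.randn(1, 3, 256, 256)
if torch.cuda.is_available():
    tensor = tensor.cuda(non_blocking=True)
```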
Regarding the experiments on the BraTS dataset, how did you divide the training and test sets?
Can you please explain the data organization for multi-modal synthesis?
How should the data be arranged for T1,T2 -> FLAIR in aligned data mode?
In the train folder there are two subfolders, A and B. Do we keep the T1 and T2 images together in folder A, or split them across A and B? And where do the target images (FLAIRs) go?
Secondly, you mentioned that for multi-modal inputs the contrasts should go in the green and red channels, so do we need to convert them to green and red? For example, I have two CT images acquired at different time points and want to synthesize a target image, analogous to red = T1, green = T2, plus the target image. Or do we stack red and green together in folder A and put the targets in folder B?
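To make the question concrete, here is a hypothetical sketch of the channel-stacked arrangement I am asking about: the two source contrasts packed into the red and green channels of one image in folder A, with the target in folder B. The file names, paths, and the assumption that slices are already co-registered 8-bit PNGs are all mine, not the repository's:

```python
import numpy as np
from PIL import Image

# Hypothetical co-registered 8-bit grayscale slices.
t1    = np.array(Image.open('t1_slice.png').convert('L'))
t2    = np.array(Image.open('t2_slice.png').convert('L'))
flair = np.array(Image.open('flair_slice.png').convert('L'))

# Pack the two source contrasts into one RGB image: red <- T1, green <- T2.
src = np.zeros((*t1.shape, 3), dtype=np.uint8)
src[..., 0] = t1
src[..., 1] = t2

Image.fromarray(src).save('train/A/0001.png')    # inputs in folder A
Image.fromarray(flair).save('train/B/0001.png')  # targets in folder B
```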
I have a few questions about data preprocessing, as listed below:
How do you normalize the BraTS and IXI image volumes?
Are the intensity values of each volume normalized to the range [-1, 1] (max-min normalization)?
Since the BraTS dataset contains images acquired under different clinical protocols and scanners, how does its normalization differ from IXI's?
Could you share your Python script for data preprocessing?
Thanks!
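To clarify, this is the max-min normalization to [-1, 1] I am asking about; a sketch of the generic formula, not the authors' actual preprocessing:

```python
import numpy as np

def minmax_normalize(volume: np.ndarray) -> np.ndarray:
    """Scale a volume's intensities into [-1, 1] via max-min normalization."""
    vmin, vmax = volume.min(), volume.max()
    return 2.0 * (volume - vmin) / (vmax - vmin) - 1.0
```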
After I run the test code, the visdom tool generates nothing. As far as I can tell there is no bug in the visdom-related code, but nothing appears at http://localhost:8097/.
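A hedged note: in pix2pix/CycleGAN-derived codebases like this one, the visdom server usually has to be started separately before running the scripts, otherwise nothing is served at localhost:8097:

```
python -m visdom.server
```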
Hello, I don't understand how your dataset should be laid out. Can you describe it in detail? I can see aligned, unaligned, and single dataset modes here. In the aligned dataset, are the data of different modalities (such as T1 and T2) placed in the same folder? And is the data stored as JPG images? I hope you can describe the placement of the dataset images in detail. Thank you!
Why is the L1 loss-term weight set to 100?
Dear Mr. Dalmaz and rest of the team,
First, I would like to thank you for your work and for making it available to the public.
I am trying to use it for the task of sCT generation with my own dataset. Unfortunately, I cannot find where in the code the second term of the loss (as explained in your paper) is implemented: the pixel-wise consistency loss, or Lrec. Based on my understanding, this loss computes the L1 distance between the source image (MR in this case) and the MR image reconstructed by the generator from the sCT. Is that right?
Would you mind pointing out where that happens in the code? I can only locate the pixel-wise L1 loss between the CT and the sCT, and the adversarial loss.
Thanks in advance!
Best regards
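To make the question concrete, here is a hypothetical PyTorch sketch of the two pixel-wise terms being discussed; the variable names are made up for illustration, not ResViT's actual names:

```python
import torch
import torch.nn as nn

l1 = nn.L1Loss()

# Dummy tensors standing in for the images above (hypothetical names).
real_mr = torch.randn(1, 1, 256, 256)  # source MR
real_ct = torch.randn(1, 1, 256, 256)  # ground-truth CT
fake_ct = torch.randn(1, 1, 256, 256)  # generator output (sCT)
rec_mr  = torch.randn(1, 1, 256, 256)  # MR reconstructed from the sCT

loss_pix = l1(fake_ct, real_ct)  # the pixel-wise L1 term that is easy to locate
loss_rec = l1(rec_mr,  real_mr)  # the consistency term (Lrec) in question
```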
Hello, it is an honor to communicate with you. Inspired by your work, I have recently been studying modality synthesis, and I have run into some problems along the way that I hope you can answer.
The problem is: when one modality generates another, we need to splice the two modalities together. For example, when generating T2 from T1, we splice the slices of the two modalities together horizontally. What happens when we need multiple modalities to generate a single one? For example, when T1 and T2 generate FLAIR, do we splice the data of all three modalities horizontally, or is there some other processing method?
I sincerely hope to get your answer!
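To illustrate what I mean by splicing horizontally, here is a sketch with made-up file names; extending the one-to-one convention to three modalities is exactly the part I am unsure about:

```python
import numpy as np
from PIL import Image

# Hypothetical co-registered slices, loaded as grayscale arrays.
t1    = np.array(Image.open('t1.png').convert('L'))
t2    = np.array(Image.open('t2.png').convert('L'))
flair = np.array(Image.open('flair.png').convert('L'))

# One-to-one case: source and target side by side.
pair = np.concatenate([t1, t2], axis=1)

# Many-to-one case (?): all three spliced horizontally.
triple = np.concatenate([t1, t2, flair], axis=1)
Image.fromarray(triple).save('train/0001.png')
```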
Hi,
When I try to fine-tune ResViT with your provided code, a pickle error is raised:
OSError: Failed to interpret file './model/vit_checkpoint/imagenet21k/R50+ViT-B_16.npz' as a pickle
How can I solve it?
(See Line 167 in 1963c1b.)
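A hedged sanity check: that OSError from NumPy usually means the .npz file is not a valid archive, for example a truncated download or an HTML error page saved under the .npz name. Re-downloading and confirming the file loads is one way to rule that out:

```python
import numpy as np

path = './model/vit_checkpoint/imagenet21k/R50+ViT-B_16.npz'
weights = np.load(path)   # succeeds only for a valid NumPy archive
print(weights.files[:5])  # show a few stored parameter names
```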
Hello Mr. Dalmaz, thank you for your nice work. To better reproduce your experiments, could you provide the preprocessed datasets used in the paper?
Hi,
Thanks for sharing your code!
Could you share more details about how to train the unified model?
Thank you!
Reuben
Hi!
Thank you for sharing the code for this excellent paper! Could you share the pre-trained models for the results mentioned in the paper (BraTS, IXI, etc.)?
Thank you very much!