jleuschn / dival
Deep Inversion Validation Library
License: MIT License
Hi! Thanks for these awesome datasets and library!
The library works perfectly well for me when training models on a single GPU. But when using two GPUs with torch.multiprocessing, an error related to num_workers and RandomAccessTorchDataset occurs. Basically, whenever I use num_workers > 0, torch.multiprocessing somehow breaks down. A script that showcases this problem is attached at the end. I am not sure whether this is directly related to the library itself, so feel free to close the issue, but it would be great if you have any insights on this problem. Thanks a lot!
The debug.py script attached below appears to be long, but the essential part is just the main_worker function. When I run python debug.py --num_workers 0, the line x, d = next(iter(train_loader)) is executed successfully; but with python debug.py --num_workers 4, the following error occurs:
(base) liu0003@dmi-20-pc-09:~/Desktop/projects/ct$ python debug.py --num_workers 4
Use GPU: 0 for training
Use GPU: 1 for training
start iterating
start iterating
Traceback (most recent call last):
  File "debug.py", line 135, in <module>
    main()
  File "debug.py", line 57, in main
    mp.spawn(main_worker, nprocs=ngpus_per_node, args=(ngpus_per_node, args))
  File "/home/liu0003/anaconda3/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 200, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/liu0003/anaconda3/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 158, in start_processes
    while not context.join():
  File "/home/liu0003/anaconda3/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 106, in join
    raise Exception(
Exception: process 1 terminated with signal SIGKILL
Here is the debug.py script:
import os, torch
import numpy as np
import random
from os import path
import torch.distributed as dist
import torch.multiprocessing as mp
import argparse
from torch.utils.data import DataLoader
from dival import get_standard_dataset
from dival.datasets.fbp_dataset import get_cached_fbp_dataset
from torch.utils.data.distributed import DistributedSampler
from torch.utils.data import Dataset as TorchDataset

parser = argparse.ArgumentParser(description='debug')
# Dataset settings
parser.add_argument('--BATCH_SIZE', type=int, default=4, help='mini-batch size for training.')
# DistributedDataParallel settings
parser.add_argument('--num_workers', type=int, default=8, help='')
parser.add_argument("--gpu_devices", type=int, nargs='+', default=[0, 1], help="")
parser.add_argument('--gpu', default=None, type=int, help='GPU id to use.')
parser.add_argument('--dist-url', default='tcp://127.0.0.1:3456', type=str, help='')
parser.add_argument('--dist-backend', default='nccl', type=str, help='')
parser.add_argument('--rank', default=0, type=int, help='')
parser.add_argument('--world_size', default=1, type=int, help='')
parser.add_argument('--distributed', action='store_true', help='')
args = parser.parse_args()

gpu_devices = ','.join([str(id) for id in args.gpu_devices])
os.environ["CUDA_VISIBLE_DEVICES"] = gpu_devices


def set_random_seeds(random_seed=0):
    torch.manual_seed(random_seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    np.random.seed(random_seed)
    random.seed(random_seed)


def main():
    set_random_seeds()
    args = parser.parse_args()
    ngpus_per_node = torch.cuda.device_count()
    args.world_size = ngpus_per_node * args.world_size
    mp.spawn(main_worker, nprocs=ngpus_per_node, args=(ngpus_per_node, args))


def main_worker(gpu, ngpus_per_node, args):
    args.gpu = gpu
    ngpus_per_node = torch.cuda.device_count()
    print("Use GPU: {} for training".format(args.gpu))
    args.rank = args.rank * ngpus_per_node + gpu
    dist.init_process_group(backend=args.dist_backend, init_method=args.dist_url,
                            world_size=args.world_size, rank=args.rank)
    datasets = CTdatasets()
    batch_sizes = {'train': args.BATCH_SIZE, 'validation': args.BATCH_SIZE, 'test': 1}
    dataloaders = {x: DataLoader(datasets[x], batch_size=batch_sizes[x],
                                 num_workers=args.num_workers, pin_memory=True,
                                 sampler=DistributedSampler(datasets[x]))
                   for x in ['train', 'validation', 'test']}
    train_loader = dataloaders['train']
    print('start iterating')
    x, d = next(iter(train_loader))
    print(x.flatten()[:10])  # executes successfully with --num_workers 0, but fails with --num_workers > 0


class RandomAccessTorchDataset(TorchDataset):
    def __init__(self, dataset, part, reshape=None):
        self.dataset = dataset
        self.part = part
        self.reshape = reshape or (
            (None,) * self.dataset.get_num_elements_per_sample())

    def __len__(self):
        return self.dataset.get_len(self.part)

    def __getitem__(self, idx):
        arrays = self.dataset.get_sample(idx, part=self.part)
        mult_elem = isinstance(arrays, tuple)
        if not mult_elem:
            arrays = (arrays,)
        tensors = []
        for arr, s in zip(arrays, self.reshape):
            t = torch.from_numpy(np.asarray(arr))
            if s is not None:
                t = t.view(*s)
            tensors.append(t)
        return tuple(tensors) if mult_elem else tensors[0]


def CTdatasets(IMPL='skimage', cache_dir='/home/liu0003/Desktop/projects/dival/', **kwargs):
    CACHE_FILES = {
        'train': (path.join(cache_dir, 'cache_lodopab_train_fbp.npy'), None),
        'validation': (path.join(cache_dir, 'cache_lodopab_validation_fbp.npy'), None)}
    standard_dataset = get_standard_dataset('lodopab', impl=IMPL)
    ray_trafo = standard_dataset.get_ray_trafo(impl=IMPL)
    dataset = get_cached_fbp_dataset(standard_dataset, ray_trafo, CACHE_FILES)
    # create PyTorch datasets
    dataset_train = RandomAccessTorchDataset(
        dataset=dataset, part='train',
        reshape=((1,) + dataset.space[0].shape, (1,) + dataset.space[1].shape))
    dataset_validation = RandomAccessTorchDataset(
        dataset=dataset, part='validation',
        reshape=((1,) + dataset.space[0].shape, (1,) + dataset.space[1].shape))
    dataset_test = RandomAccessTorchDataset(
        dataset=dataset, part='test',
        reshape=((1,) + dataset.space[0].shape, (1,) + dataset.space[1].shape))
    datasets = {'train': dataset_train, 'validation': dataset_validation, 'test': dataset_test}
    return datasets


if __name__ == '__main__':
    main()
Please run python debug.py --num_workers 4 to see the error I mentioned earlier. Thanks in advance!
The iterative reconstructors in dival.reconstructors.odl_reconstructors inherit from dival.reconstructors.reconstructor.IterativeReconstructor and should therefore respect the hyper parameter iterations. Instead, they currently use an unrelated attribute niter. These should be merged, ideally in a backward-compatible way.
Hi,
I have a question about the PSNR Measure.
I am using from dival.measure import PSNR to compute the PSNR values of my results. Just to be sure that everything is working, I decided to compute the PSNR values of FBP against the ground truth, using reconstructor = construct_reconstructor('fbp', 'lodopab', impl='astra_cpu') as the FBP reconstructor.
Now in the paper 'The LoDoPaB-CT Dataset' the PSNR for the FBP reconstruction of test sample 0 is given as 16.1 (Fig. 5). In the paper 'Computed Tomography Reconstruction using Deep Image Prior' the PSNR for the FBP reconstruction of test sample 0 is given as 27.3 (Fig. 8). And if I compute it in my code I get a PSNR of 14.6.
Why are these results so different?
Hi!
I am wondering if I can ask a few questions related to the reference fbp reconstructor.
First, for some reason, the fbp reconstructor's performance on lodopab in my experiments (code attached below) seems to be much better than what is reported in the dataset paper. For example, the PSNR I got on the test dataset is 30.51, but the corresponding number in the dataset paper is 24.43. I found that the fbp reconstructor results provided in Table 1 of this paper seem to be consistent with my experiments. I wonder if the results provided by the dataset paper should be considered outdated? Below is the code I used for the fbp reconstructor experiment:
import dival
import numpy as np
from dival.measure import PSNR
from dival.datasets.fbp_dataset import get_cached_fbp_dataset
from tqdm import tqdm

dataset = dival.get_standard_dataset('lodopab')
data_dict = {x: dataset.get_data_pairs(x) for x in ['test', 'validation']}
reconstructor = dival.get_reference_reconstructor('fbp', 'lodopab')
phase = 'test'

#%% evaluate
psnrs = []
running_psnr = 0
running_size = 0
with tqdm(data_dict[phase]) as pbar:
    for obs, gt in pbar:
        reco = reconstructor.reconstruct(obs)
        current_psnr = PSNR(reco, gt)
        psnrs.append(current_psnr)
        running_psnr += current_psnr
        running_size += 1
        pbar.set_postfix({'phase': phase,
                          'psnr': running_psnr / running_size})
print('mean psnr: {:f}'.format(np.mean(psnrs)))
Second, it is a bit strange to me that there is a small but noticeable gap between the fbp reconstructor's performance and the cached fbp results. As mentioned above, the PSNR of the fbp reconstructor on the test data is 30.51. However, if I directly use the cached fbp results to compute the PSNR, the result drops to 29.80. I wonder if this is normal. The code I used for the cached fbp dataset is as follows:
import dival
import numpy as np
from dival.measure import PSNR, SSIM
from dival.util.plot import plot_images
from dival.datasets.fbp_dataset import get_cached_fbp_dataset
from tqdm import tqdm
import os

IMPL = 'astra_cuda'
CACHE_DIR = '/home/liu0003/Desktop/datasets/cache_lodopad/'
CACHE_FILES = {
    'test':
        (os.path.join(CACHE_DIR, 'cache_lodopab_test_fbp.npy'), None)}
dataset = dival.get_standard_dataset('lodopab', impl=IMPL)  # standard dataset providing the ray transform below
cached_fbp_dataset = get_cached_fbp_dataset(dataset, dataset.get_ray_trafo(impl=IMPL), CACHE_FILES)
# dataset.fbp_dataset = cached_fbp_dataset
# dataset = cached_fbp_dataset
# create PyTorch datasets
pytorch_dataset = {x: cached_fbp_dataset.create_torch_dataset(
    part=x, reshape=((1,) + cached_fbp_dataset.space[0].shape,
                     (1,) + cached_fbp_dataset.space[1].shape))
    for x in ['train', 'validation', 'test']}
phase = 'test'

#%% evaluate
psnrs = []
running_psnr = 0
running_size = 0
with tqdm(pytorch_dataset[phase]) as pbar:
    for fbp, gt in pbar:
        current_psnr = PSNR(fbp, gt)
        psnrs.append(current_psnr)
        running_psnr += current_psnr
        running_size += 1
        pbar.set_postfix({'phase': phase,
                          'psnr': running_psnr / running_size})
print('mean psnr: {:f}'.format(np.mean(psnrs)))
where the cache_lodopab_test_fbp.npy file was generated in the following way:
import torch
import numpy as np
from dival import get_standard_dataset
from dival.datasets.fbp_dataset import (
    generate_fbp_cache_files, get_cached_fbp_dataset)
from dival.reference_reconstructors import (
    check_for_params, download_params, get_hyper_params_path)
from dival.util.plot import plot_images
from torch.utils.data import DataLoader
from torch.utils.data import Dataset as TorchDataset
from os import path
IMPL = 'astra_cuda'
LOG_DIR = './logs/lodopab_fbpunet'
CACHE_DIR = '/home/liu0003/Desktop/datasets/cache_lodopad/'
SAVE_BEST_LEARNED_PARAMS_PATH = './params/lodopab_fbpunet'
CACHE_FILES = { 'test': (path.join(CACHE_DIR, 'cache_lodopab_test_fbp.npy'), None)}
dataset = get_standard_dataset('lodopab', impl=IMPL)
ray_trafo = dataset.get_ray_trafo(impl=IMPL)
generate_fbp_cache_files(dataset, ray_trafo, CACHE_FILES)
Another naive question: it seems that none of the experiments in the examples folder set PSNR.data_range = 1. But if we do that, the PSNR scores will be higher. It also seems legitimate to me, because the pixel intensities are bounded between 0 and 1. Is there a particular reason that PSNR.data_range = 1 was not used?
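For intuition, here is a minimal sketch (plain NumPy, not dival's actual implementation) of how much the data_range convention alone moves the number; fbp and gt stand for a reconstruction/ground-truth pair such as the one from the evaluation loop above, and the observation that the default behaves like a per-image range is inferred from the question, not verified against the library.
import numpy as np

def psnr(reco, gt, data_range):
    # standard definition: 10 * log10(data_range^2 / MSE)
    reco, gt = np.asarray(reco), np.asarray(gt)
    mse = np.mean((reco - gt) ** 2)
    return 10 * np.log10(data_range ** 2 / mse)

gt_np = np.asarray(gt)
# per-image range of the ground truth (well below 1 for LoDoPaB slices)
psnr_range_gt = psnr(fbp, gt_np, data_range=gt_np.max() - gt_np.min())
# fixed data_range of 1, since pixel intensities are bounded in [0, 1]
psnr_range_one = psnr(fbp, gt_np, data_range=1.0)
# the two differ by 20 * log10(1 / (gt_np.max() - gt_np.min())), independent of the reconstruction
This shift of several dB is enough to explain much of the gap between differently reported FBP scores.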
Thanks a million!
Cheers,
Tianlin
Hi, thanks for this great dataset and Python tool. It makes it easier for people like me who are new to CT to explore this area.
Now I have managed to reconstruct the image from the observation data using the fbp algorithm. The code looks like this:
fbp_reconstructor = dival.get_reference_reconstructor('fbp', 'lodopab')
fbp_img = fbp_reconstructor.reconstruct(obdata) # obdata.shape: (1000, 513)
I have two questions here:
1. How to reconstruct the image from only part of the observation data?
The original observation data covers 1000 angles. If the number of angles is halved to 500, the observation data becomes:
inds = np.round(np.linspace(0, 999, 500)).astype(np.int32)
obdata_half = obdata[inds]
How do I get the reconstruction result from this observation data, i.e. how do I run img = fbp_reconstructor.reconstruct(obdata_half) correctly? (A possible sketch is given right after these two questions.)
2. Which iterative algorithm should I choose?
There are so many iterative algorithms. Which one is the best or the newest, and which should I test? Please recommend one, thanks!
That's my problem, thanks for your help!
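Regarding question 1, here is a hedged sketch of one possible approach (not verified against the library; the assumption that the LoDoPaB geometry is parallel-beam and the use of odl.nonuniform_partition are mine): build a ray transform restricted to the kept angles and hand that to a new FBPReconstructor, then reconstruct from the subsampled observation.
import numpy as np
import odl
from dival import get_standard_dataset
from dival.reconstructors.odl_reconstructors import FBPReconstructor

dataset = get_standard_dataset('lodopab', impl='astra_cpu')
ray_trafo_full = dataset.get_ray_trafo(impl='astra_cpu')
geometry_full = ray_trafo_full.geometry

# keep 500 of the 1000 angles, exactly as in the question above
inds = np.round(np.linspace(0, 999, 500)).astype(np.int32)
angles_half = geometry_full.angles[inds]

# restricted geometry and ray transform (assuming a parallel-beam geometry)
apart_half = odl.nonuniform_partition(angles_half)
geometry_half = odl.tomo.Parallel2dGeometry(apart_half, geometry_full.det_partition)
ray_trafo_half = odl.tomo.RayTransform(ray_trafo_full.domain, geometry_half,
                                       impl='astra_cpu')

fbp_half = FBPReconstructor(ray_trafo_half)
obdata_half = obdata[inds]               # obdata: the full (1000, 513) observation
img = fbp_half.reconstruct(obdata_half)
The key point is that the reconstructor must be given an operator whose angle grid matches the subsampled data, rather than the original 1000-angle ray transform.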
Thank you for this very nice package.
I have trouble finding how to get ordered slices from the zenodo repo: https://zenodo.org/record/3384092
Indeed, reading the hdf5 works fine. However, the slices do not seem ordered:
import numpy as np
import h5py
import matplotlib.pyplot as plt

filename = 'ground_truth_train/ground_truth_train_000.hdf5'
with h5py.File(filename, "r") as f:
    data = list(f['data'])
plt.subplot(131).imshow(data[0])
plt.subplot(132).imshow(data[1])
plt.subplot(133).imshow(data[2])
plt.show()
Is there a way to get the patient index and the slice order?
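Not a verified answer, only a sketch of the direction one could try: the Zenodo record also ships patient_ids_rand_*.csv files (the same files appear in another issue further down this page). If each of them lists the patient id of every sample in the same randomized order as the HDF5 files, then grouping by those ids recovers at least the patient assignment. The file name, its format, and the sorted_by_patient keyword below are assumptions.
import numpy as np

# assumed: one integer patient id per line, in the randomized sample order
patient_ids = np.loadtxt('patient_ids_rand_train.csv', dtype=int)

# indices of the randomized samples, grouped by patient; a stable sort keeps the
# within-file order of each patient's slices, which may or may not be anatomical order
idx_sorted_by_patient = np.argsort(patient_ids, kind='stable')

# alternatively, dival itself may already support this (assumption):
# from dival import get_standard_dataset
# dataset = get_standard_dataset('lodopab', sorted_by_patient=True)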
I am trying to reproduce the results in the LoDoPaB-CT paper for FBP and FBPUNet. When I do
fbpunet = construct_reconstructor('fbpunet', 'lodopab')
if not check_for_params('fbpunet', 'lodopab', include_learned=True):
    download_params('fbpunet', 'lodopab', include_learned=True)
hyper_params_path = get_hyper_params_path('fbpunet', 'lodopab')
fbpunet.load_hyper_params(hyper_params_path)
fbpunet.reconstruct(sinogram)
I get the error
...
File "/dival/dival/reconstructors/reconstructor.py", line 198, in reconstruct
reco = self._reconstruct(observation)
File "/dival/dival/reconstructors/fbpunet_reconstructor.py", line 131, in _reconstruct
self.model.eval()
AttributeError: 'NoneType' object has no attribute 'eval'
So it seems that the model has not been created.
Is there a way to access the pre-trained FBPUNet model or do I need to train it myself?
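For what it's worth, a sketch of an alternative route, assuming that 'fbpunet' is available as a reference reconstructor for 'lodopab' in the same way 'fbp' is used elsewhere on this page: get_reference_reconstructor downloads and loads the pretrained parameters itself, so the model exists before reconstruct is called.
import dival

# assumption: 'fbpunet' is a registered reference reconstructor for 'lodopab'
fbpunet = dival.get_reference_reconstructor('fbpunet', 'lodopab')
reco = fbpunet.reconstruct(sinogram)  # sinogram: a LoDoPaB observation, as above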
I can use the fbp reconstructor properly. The code looks like this:
import dival
import numpy as np
from dival.reconstructors.odl_reconstructors import FBPReconstructor

num_angles = 50
dataset = dival.get_standard_dataset('lodopab', num_angles=num_angles)
fbp_reconstructor = FBPReconstructor(dataset.get_ray_trafo(), hyper_params={
    "filter_type": "Hann",
    "frequency_scaling": 0.641025641025641
})

imorg = <image from lodopab dataset>        # shape = (362, 362)
oborg = <observations from lodopab>         # shape = (1000, 513)
num_thetas = oborg.shape[0]
thetas = np.linspace(0, np.pi, num_thetas + 1)[:-1]
inds = range(0, num_thetas, num_thetas // num_angles)  # select num_angles of the observation data
nthetas = thetas[inds]
nob = oborg[inds]
img = fbp_reconstructor.reconstruct(nob)
The code above works fine. If the reconstructor is changed from FBPReconstructor to ISTAReconstructor, how should I write my code? I do not want to use the code in dival/examples/ct_example.py, which is too heavily packaged. Thanks!
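A sketch only, with the constructor arguments treated as assumptions (I have not checked ISTAReconstructor's exact signature): the general pattern stays the same as for FBP, i.e. pass the ray transform to the reconstructor, set the iteration count via the hyper parameters, and call reconstruct on the matching observation.
from dival.reconstructors.odl_reconstructors import ISTAReconstructor

# dataset and nob as in the snippet above; the hyper parameter name 'iterations'
# is taken from the IterativeReconstructor issue earlier on this page
ista_reconstructor = ISTAReconstructor(dataset.get_ray_trafo())  # assumed signature
ista_reconstructor.hyper_params['iterations'] = 100
img = ista_reconstructor.reconstruct(nob)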
I want to load my own CT dataset when using the iradonmap reconstructor.
I have loaded the dataset in my code:
data_path = folder_path + 'data.pkl'
state = torch.load(data_path)
data_obs_train = state['data_obs_train']
data_obs_test = state['data_obs_test']
u_train = state['u_true_train']
u_test = state['u_true_test']
data_train = TensorDataset(u_train[0:n_samples, :, :, :], data_obs_train[0:n_samples, :, :, :])
whereas the iradonmap reconstructor example loads its dataset like this:
dataset = get_standard_dataset('lodopab', impl=IMPL)
How can I convert my own dataset into this standard_dataset format so that it can be read by the iradonmap reconstructor?
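I am not certain of the exact base-class contract, so treat the following as a rough sketch rather than the library's intended API: the idea is to subclass dival's Dataset so that it exposes the same pieces the other snippets on this page rely on (space, get_len, get_sample, get_num_elements_per_sample). The ODL spaces, the constructor details and the attribute names are assumptions.
import numpy as np
import odl
from dival.datasets.dataset import Dataset


class MyTensorDataset(Dataset):
    """Wraps pre-loaded (observation, ground truth) tensors as a dival-style dataset."""

    def __init__(self, obs_train, gt_train, obs_test, gt_test):
        self.arrays = {'train': (obs_train, gt_train),
                       'test': (obs_test, gt_test)}
        # assumed: square pixel grids on [0, 1]^2 with float32 values
        obs_space = odl.uniform_discr([0, 0], [1, 1], obs_train.shape[-2:],
                                      dtype=np.float32)
        reco_space = odl.uniform_discr([0, 0], [1, 1], gt_train.shape[-2:],
                                       dtype=np.float32)
        self.train_len = len(obs_train)
        self.test_len = len(obs_test)
        self.random_access = True
        self.num_elements_per_sample = 2
        super().__init__(space=(obs_space, reco_space))

    def get_len(self, part='train'):
        return len(self.arrays[part][0])

    def get_sample(self, index, part='train', out=None):
        obs, gt = self.arrays[part]
        return (np.asarray(obs[index]).squeeze(),
                np.asarray(gt[index]).squeeze())


dataset = MyTensorDataset(data_obs_train, u_train, data_obs_test, u_test)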
Hi
First of all, thanks a lot for this excellent library!
When I tried the DIPTV demo in the examples folder, it returned the error shown below, which seems to be related to the new autograd interface of PyTorch used with ODL. Do you have any hint on how to solve this issue? For example, would installing an older version of PyTorch (or ODL) help? Thanks in advance.
(base) zan@pc-kw-60:~/projects/dival/dival/examples$ python3 ct_diptv.py
DIP: 0%| | 0/17000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "ct_diptv.py", line 36, in <module>
    reco = reconstructor.reconstruct(obs)
  File "/home/zan/anaconda3/lib/python3.7/site-packages/dival/reconstructors/reconstructor.py", line 529, in reconstruct
    reconstruction = super().reconstruct(observation, out=out)
  File "/home/zan/anaconda3/lib/python3.7/site-packages/dival/reconstructors/reconstructor.py", line 198, in reconstruct
    reco = self._reconstruct(observation)
  File "/home/zan/anaconda3/lib/python3.7/site-packages/dival/reconstructors/dip_ct_reconstructor.py", line 154, in _reconstruct
    loss = criterion(self.ray_trafo_module(output),
  File "/home/zan/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/zan/anaconda3/lib/python3.7/site-packages/odl/contrib/torch/operator.py", line 397, in forward
    results.append(self.op_func(x_flat_xtra[i]))
  File "/home/zan/anaconda3/lib/python3.7/site-packages/torch/autograd/function.py", line 149, in __call__
    "Legacy autograd function with non-static forward method is deprecated. "
RuntimeError: Legacy autograd function with non-static forward method is deprecated. Please use new-style autograd function with static forward method. (Example: https://pytorch.org/docs/stable/autograd.html#torch.autograd.Function)
I have successfully downloaded the test and validation data. But when trying to download the training data, an HTTP error 429 Too Many Requests always comes up after some time (usually after 3-4 GB are done). What should I do?
Connecting to zenodo.org (zenodo.org)|137.138.76.77|:443... connected.
HTTP request sent, awaiting response... 429 Too Many Requests
2021-06-10 16:42:09 ERROR 429: Too Many Requests.
No URLs found in https://zenodo.org/api/files/cfd986de-367d-4a04-be96-b9ea84cd3690/ground_truth_train.zip.
Hello!
Thanks for this amazing work, it looks like tons of care have been put in building this library.
Unfortunately, it seems I am unable to even use the datasets functionality. I have tried twice the following:
from dival import get_standard_dataset
dataset = get_standard_dataset('lodopab')
And in every case it manages to download the first six files, but when it gets to downloading file 7/9: 'patient_ids_rand_test.csv', it crashes with a key error:
BTW I also tried with:
import dival
dival.datasets.lodopab_dataset.download_lodopab()
And got almost the same error:
Any clues about how to fix this? Thanks!
Hi,
I wanted to use dival to get access to your standard datasets (ellipses and lodopab). I installed dival via pip install dival. Then I executed the following lines:
import dival
ellipses = dival.get_standard_dataset('ellipses')
And got this error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-2-6beeed4e0fd9> in <module>
----> 1 ellipses = dival.get_standard_dataset('ellipses')
~/Programs/miniconda3/envs/pI/lib/python3.7/site-packages/dival/datasets/standard.py in get_standard_dataset(name, **kwargs)
130
131 impl = kwargs.pop('impl', 'astra_cuda')
--> 132 ray_trafo = odl.tomo.RayTransform(space, geometry, impl=impl)
133
134 def get_reco_ray_trafo(**kwargs):
~/Programs/miniconda3/envs/pI/lib/python3.7/site-packages/odl/tomo/operators/ray_trafo.py in __init__(self, domain, geometry, **kwargs)
381 super(RayTransform, self).__init__(
382 reco_space=domain, proj_space=range, geometry=geometry,
--> 383 variant='forward', **kwargs)
384
385 def _call_real(self, x_real, out_real):
~/Programs/miniconda3/envs/pI/lib/python3.7/site-packages/odl/tomo/operators/ray_trafo.py in __init__(self, reco_space, geometry, variant, **kwargs)
150 raise ValueError('`impl` {!r} not understood'.format(impl_in))
151 if impl not in _AVAILABLE_IMPLS:
--> 152 raise ValueError('{!r} back-end not available'.format(impl))
153
154 # Cache for input/output arrays of transforms
ValueError: 'astra_cuda' back-end not available
Would you recommend downloading the data via Zenodo instead?
Cheers
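Not an official answer, but a sketch of a possible workaround: the traceback shows that get_standard_dataset defaults to impl='astra_cuda', and the ValueError only says that this particular back-end is unavailable. Passing a CPU back-end explicitly, as other snippets on this page do, may be enough, assuming the astra toolbox (CPU) or scikit-image is installed.
import dival

# 'astra_cpu' requires the astra-toolbox; 'skimage' requires scikit-image
ellipses = dival.get_standard_dataset('ellipses', impl='astra_cpu')
# or: ellipses = dival.get_standard_dataset('ellipses', impl='skimage')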
Hi!
I am wondering if there is a recommended way of adding various levels of noise to sinogram data? A naive way seems to be the following:
from dival import get_standard_dataset
import numpy as np
import matplotlib.pyplot as plt
dataset = get_standard_dataset('lodopab')
val_data = dataset.get_data_pairs('validation')
sample_at = 0
sinogram, single_gt = val_data[sample_at]
poisson_lam = 0.00001 # noise level
noisy_sinogram = (sinogram.data + np.random.poisson(poisson_lam, sinogram.shape))
fig, ax = plt.subplots(1, 2, figsize=(10, 10))
ax[0].imshow(sinogram.data)
ax[1].imshow(noisy_sinogram)
However, this naive way is perhaps not ideal: if I understood correctly, the sinogram in val_data[sample_at] has already been corrupted once during the construction of the LoDoPaB dataset, as it is mentioned that "Poisson noise corresponding to 4096 incident photons per pixel before attenuation is applied to the projection data". Thus, the superposition (sinogram.data + np.random.poisson(poisson_lam, sinogram.shape)) seems to add another round of noise on top of already noisy data. My questions are the following:
Thanks a lot!! Also, big congrats for the paper published in the Journal of Imaging!
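In case it helps to make the concern concrete, here is a sketch of the alternative one could compare against; it is not dival API, and both the photon count N0 and the assumption that sinogram.data holds post-log line integrals (ignoring any additional normalisation LoDoPaB applies) are assumptions: re-apply the Poisson photon-count model to the pre-log data instead of adding Poisson samples to the post-log sinogram.
import numpy as np

N0 = 1024  # assumed photon count; LoDoPaB used 4096, so a smaller N0 means more noise
post_log = np.asarray(sinogram.data)          # treated as post-log line integrals (assumption)
counts = np.random.poisson(N0 * np.exp(-post_log))
counts = np.maximum(counts, 1)                # avoid log(0) for zero photon counts
noisy_sinogram = -np.log(counts / N0)         # back to the post-log domain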
Hi,
Thank you for providing this awesome resource!
I am looking at the LoDoPaB dataset and have a question regarding the intensity scaling of the reconstructions. I am comparing the ground-truth image to the noisy reconstruction and they seem to have completely different intensity ranges.
I run the following code to obtain the described results:
import dival
import numpy as np

lodo = dival.get_standard_dataset('lodopab')
rec = dival.get_reference_reconstructor('fbp', 'lodopab')
x, y = lodo.get_sample(0)
rec_x = rec.reconstruct(x)


def stats(name, d):
    print('{}: min={}; mean={}; max={}'.format(name, np.min(d), np.mean(d), np.max(d)))


stats('y', y)
stats('reconstruction x', rec_x)
Which produces this output:
y: min=0.0; mean=0.13288576900959015; max=0.48916396498680115
reconstruction x: min=-3.407857112822099e-12; mean=5.093645644160816e-11; max=1.6748728792759238e-10
The reconstruction intensities are much smaller. I am wondering where this difference comes from and how I could obtain reconstructions which are in the same intensity range as the ground truth.
Note: visually the reconstructions look sound.
(Image: left, reconstruction; right, ground truth.)
Hi Johannes,
I met a curious problem when submitting the prediction zip file to the challenge website https://lodopab.grand-challenge.org/. After I uploaded the prediction zip file and pressed the "save" button, an error "Predictions file is too large" occurred. This seems unusual because I previously made several submissions where the prediction files were generated with the same piece of code. The zip file that I wish to submit is 1.9 GB in size (is this a normal size?). Could there be a problem with the challenge website? Thank you!
Cheers,
Tianlin