I downloaded ISIC2019 running python3 dataset_creation_scrip

Issue resizing ISIC2019 about flamby HOT 8 CLOSED

philipco commented on May 28, 2024

Issue resizing ISIC2019

from flamby.

Comments (8)

jeandut commented on May 28, 2024

Thanks @philipco for looking into this issue. That looks like a download error. The fact that it is not handled means the logic needs to be improved to check the integrity of all images. We will look into that next week!
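
As an illustration of what such an integrity check could look like, here is a minimal sketch. It is not FLamby's code: the function names are hypothetical, and it assumes the downloaded images are `.jpg` files. It relies on a heuristic (a complete JPEG starts with the SOI marker `FF D8` and ends with the EOI marker `FF D9`), so it only catches files cut off mid-download. Fully decoding each file with Pillow (`Image.open(path).load()`) would be a stricter check.

```python
import tempfile
from pathlib import Path

JPEG_START = b"\xff\xd8"  # SOI (start of image) marker
JPEG_END = b"\xff\xd9"    # EOI (end of image) marker


def looks_complete_jpeg(path):
    """Heuristic: a complete JPEG starts with SOI and ends with EOI."""
    data = Path(path).read_bytes()
    return data.startswith(JPEG_START) and data.endswith(JPEG_END)


def find_suspect_images(image_dir):
    """Return the sorted names of .jpg files that fail the heuristic."""
    return sorted(
        p.name for p in Path(image_dir).glob("*.jpg") if not looks_complete_jpeg(p)
    )


# Demo on synthetic files (not real ISIC images).
with tempfile.TemporaryDirectory() as tmp:
    payload = JPEG_START + b"fake image data" + JPEG_END
    Path(tmp, "complete.jpg").write_bytes(payload)
    Path(tmp, "truncated.jpg").write_bytes(payload[:-5])  # simulate a cut-off download
    suspects = find_suspect_images(tmp)

print(suspects)
```

Files flagged this way could be downloaded again before running the resizing script.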

jeandut commented on May 28, 2024

@philipco I cannot reproduce this. Do you have enough disk space to hold the data?
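
A quick way to answer the disk space question is the standard library's `shutil.disk_usage`. This is only a sketch: the helper name is hypothetical, and the required size below is a placeholder, not the actual size of ISIC2019.

```python
import shutil


def has_free_space(path, required_bytes):
    """Return True if the filesystem containing `path` has enough free bytes."""
    return shutil.disk_usage(path).free >= required_bytes


# Placeholder requirement; replace with the real size of the dataset.
required_bytes = 10 * 1024**3
free_gib = shutil.disk_usage(".").free / 1024**3
print(f"free: {free_gib:.1f} GiB, enough: {has_free_space('.', required_bytes)}")
```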

jeandut commented on May 28, 2024

I have tested commit b9f26aacab7383daff2c0a77504a3c11cdf570a0 with a fresh install.

jeandut commented on May 28, 2024

Here is a dump of my environment:

absl-py==1.0.0
albumentations==1.1.0
astor==0.8.1
attrs==21.4.0
autograd==1.4
autograd-gamma==0.5.0
cachetools==5.0.0
certifi==2021.10.8
cfgv==3.3.1
charset-normalizer==2.0.12
cloudpickle==2.0.0
cycler==0.11.0
dask==2022.5.0
dicom-numpy==0.6.2
distlib==0.3.4
efficientnet-pytorch==0.7.1
filelock==3.6.0
-e git+https://github.com/owkin/FLamby.git@b9f26aacab7383daff2c0a77504a3c11cdf570a0#egg=flamby
fonttools==4.33.3
formulaic==0.3.4
fsspec==2022.3.0
future==0.18.2
google-api-core==2.7.3
google-api-python-client==2.47.0
google-auth==2.6.6
google-auth-httplib2==0.1.0
google-auth-oauthlib==0.4.6
googleapis-common-protos==1.56.0
grpcio==1.46.0
histolab==0.5.1
httplib2==0.20.4
identify==2.5.0
idna==3.3
imageio==2.19.1
importlib-metadata==4.11.3
iniconfig==1.1.1
interface-meta==1.3.0
joblib==1.1.0
kiwisolver==1.4.2
large-image==1.14.3
large-image-source-openslide==1.14.3
lifelines==0.27.0
locket==1.0.0
Markdown==3.3.7
matplotlib==3.5.2
networkx==2.8
nibabel==3.2.2
nodeenv==1.6.0
numpy==1.22.3
oauth2client==4.1.3
oauthlib==3.2.0
opencv-python-headless==4.5.5.64
openslide-python==1.1.2
packaging==21.3
palettable==3.3.0
pandas==1.4.2
partd==1.2.0
Pillow==9.1.0
platformdirs==2.5.2
pluggy==1.0.0
pre-commit==2.19.0
protobuf==3.20.1
psutil==5.9.0
py==1.11.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pydicom==2.3.0
PyDrive==1.3.1
pyparsing==3.0.8
pytest==7.1.2
python-dateutil==2.8.2
pytz==2022.1
PyWavelets==1.3.0
PyYAML==6.0
qudida==0.0.4
requests==2.27.1
requests-oauthlib==1.3.1
rsa==4.8
scikit-image==0.19.2
scikit-learn==1.0.2
scipy==1.8.0
six==1.16.0
tensorboard==2.9.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
threadpoolctl==3.1.0
tifffile==2022.5.4
tifftools==1.3.4
toml==0.10.2
tomli==2.0.1
toolz==0.11.2
torch==1.11.0
torchvision==0.12.0
tqdm==4.64.0
typing_extensions==4.2.0
uritemplate==4.1.1
urllib3==1.26.9
virtualenv==20.14.1
Werkzeug==2.1.2
wget==3.2
wrapt==1.14.1
zipp==3.8.0

Pillow is 9.1.0. Can you check whether you can open and visualize the image, in order to narrow down the issue? Maybe you can retry the download?

jeandut commented on May 28, 2024

Can you inspect the image visually by doing:

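from PIL import Image, ImageFile
# Tolerate truncated files instead of raising an error when the data is loaded.
ImageFile.LOAD_TRUNCATED_IMAGES = True
im = Image.open(path_to_faulty_image)  # placeholder: path of the failing image
im.show()  # display it in the default viewer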

philipco commented on May 28, 2024

Hello Jean,

I retried the download. As before, I got the final validation message saying that the process was complete. But this time, the script python dataset_creation_scripts/resize_images.py ran correctly. So I think that during my first try there was a download error that was not raised.

Furthermore, I was wondering what sizes the data are supposed to have after the script has run. I observe that centers can have different sizes:
Center 0: torch.Size([9930, 3, 224, 224])
Center 1: torch.Size([3163, 3, 224, 298])
Center 2: torch.Size([2691, 3, 224, 298])
Center 3: torch.Size([1807, 3, 224, 298])
Center 4: RuntimeError: stack expects each tensor to be equal size, but got [3, 224, 337] at entry 0 and [3, 224, 334] at entry 3
Center 5: torch.Size([351, 3, 224, 298])
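
The RuntimeError above comes from stacking tensors of different shapes into one batch. A small sketch of a check that can be run on a list of sample shapes before stacking (hypothetical helper names, plain Python, no torch required). The two distinct shapes are taken from the error message; that entries 1 and 2 match entry 0 is an assumption based on the error reporting entry 3 as the mismatch.

```python
from collections import Counter


def shape_counts(shapes):
    """Count how many samples share each (C, H, W) shape."""
    return Counter(tuple(s) for s in shapes)


def can_stack(shapes):
    """Stacking into a single batch requires one common shape."""
    return len(shape_counts(shapes)) <= 1


# Entries 0-2 share one shape and entry 3 differs (assumed layout).
shapes = [(3, 224, 337)] * 3 + [(3, 224, 334)]
counts = shape_counts(shapes)
stackable = can_stack(shapes)
print(counts, stackable)
```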

In fed_isic2019/benchmark.py, I see that there is a cropping step: albumentations.RandomCrop(sz, sz) with sz = 200. I guess it is mandatory to load all the pictures with this cropping to get a size of [3, 200, 200]?

I would mention in the README the need to crop the images before loading them. May I also ask why you are doing a RandomCrop and not a CenterCrop?

Cheers.

jeandut commented on May 28, 2024

The preprocessing step fixes the image width to 224, as you can see, while keeping the aspect ratio intact (no hard resizing, which would impact the shape of the naevi).
You are right: we need a better default for the transform used in the dataset, so that it crops images by default. I'll open an issue about that.
RandomCrop is the data augmentation version of CenterCrop: it introduces more variability into the training images.
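
To make the difference concrete, here is a sketch of how the two crop windows could be computed for the image sizes reported in this thread. This is not albumentations' implementation; the function names are hypothetical and only the box coordinates are computed. Both variants yield a 200 x 200 window whatever the input size, which is why cropping makes the images stackable.

```python
import random


def center_crop_box(height, width, size):
    """Return (top, left, bottom, right) of a centered size x size window."""
    if size > height or size > width:
        raise ValueError("crop size exceeds image size")
    top = (height - size) // 2
    left = (width - size) // 2
    return top, left, top + size, left + size


def random_crop_box(height, width, size, rng):
    """Return (top, left, bottom, right) of a randomly placed size x size window."""
    if size > height or size > width:
        raise ValueError("crop size exceeds image size")
    top = rng.randint(0, height - size)   # randint is inclusive on both ends
    left = rng.randint(0, width - size)
    return top, left, top + size, left + size


rng = random.Random(0)  # seeded only to make the demo repeatable
sizes = [(224, 224), (224, 298), (224, 337), (224, 334)]  # (H, W) from the thread
center_boxes = [center_crop_box(h, w, 200) for h, w in sizes]
random_boxes = [random_crop_box(h, w, 200, rng) for h, w in sizes]
print(center_boxes)
print(random_boxes)
```

The center crop always returns the same window for a given image, while the random crop returns a different window on each call, so the model sees shifted views of the same image across epochs.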

philipco commented on May 28, 2024

Thanks Jean! I think it is worth adding these details to the ISIC README.
