
kits19's Introduction

NEW: The KiTS23 Challenge is Underway!

See the KiTS23 Homepage for more details, including:

  • A larger dataset
  • Additional contrast phases

KiTS19

The official 2019 KiTS Challenge repository.

Usage

To get the data for this challenge, please clone this repository (~500MB), and then run get_imaging.py. For example

git clone https://github.com/neheller/kits19
cd kits19
pip3 install -r requirements.txt
python3 -m starter_code.get_imaging

This will download the much larger and static image files from a separate source. The data/ directory should then be structured as follows

data
├── case_00000
|   ├── imaging.nii.gz
|   └── segmentation.nii.gz
├── case_00001
|   ├── imaging.nii.gz
|   └── segmentation.nii.gz
...
├── case_00209
|   ├── imaging.nii.gz
|   └── segmentation.nii.gz
└── kits.json

We've provided some basic Python scripts in starter_code/ for loading and/or visualizing the data.

Loading Data

from starter_code.utils import load_case

volume, segmentation = load_case("case_00123")
# or
volume, segmentation = load_case(123)

This will give you two Nifti1Image objects. Their shapes will be (num_slices, height, width), and their pixel datatypes will be np.float32 and np.uint8 respectively. In the segmentation, a value of 0 represents background, 1 represents kidney, and 2 represents tumor.

For information about using a Nifti1Image, see the Nibabel Documentation (Getting Started).

Visualizing Data

The visualize.py file will dump a series of PNG files depicting a case's imaging with the segmentation label overlaid. By default, red represents kidney and blue represents tumor.

From Bash:

python3 starter_code/visualize.py -c case_00123 -d <destination>
# or
python3 starter_code/visualize.py -c 123 -d <destination>

From Python:

from starter_code.visualize import visualize

visualize("case_00123", <destination (str)>)
# or
visualize(123, <destination (str)>)

Voxel Spacing

Each Nifti1Image object has an attribute called affine. This is a 4x4 matrix, and in our case, it takes the value

array([[0.                          , 0.                      , -1*captured_pixel_width , 0. ],
       [0.                          , -1*captured_pixel_width , 0.                      , 0. ],
       [-1*captured_slice_thickness , 0.                      , 0.                      , 0. ],
       [0.                          , 0.                      , 0.                      , 1. ]])

This information is also available in data/kits.json. Since this data was collected during routine clinical practice from many centers, these values vary quite a bit.
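As a rough illustration of how the spacing values relate to the affine above, the sketch below rebuilds the matrix from one kits.json-style record. It assumes that kits.json is a list of records with the fields case_id, captured_pixel_width and captured_slice_thickness. The helper names affine_from_record and load_spacings are hypothetical and are not part of starter_code.

```python
import json


def affine_from_record(record):
    """Rebuild the 4x4 affine (as nested lists) from one kits.json-style record."""
    width = float(record["captured_pixel_width"])
    thickness = float(record["captured_slice_thickness"])
    return [
        [0.0, 0.0, -width, 0.0],
        [0.0, -width, 0.0, 0.0],
        [-thickness, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def load_spacings(path="data/kits.json"):
    """Map each case_id to (slice_thickness, pixel_width, pixel_width).

    Assumes the file holds a list of records; adjust if the layout differs.
    """
    with open(path) as f:
        records = json.load(f)
    return {
        r["case_id"]: (
            float(r["captured_slice_thickness"]),
            float(r["captured_pixel_width"]),
            float(r["captured_pixel_width"]),
        )
        for r in records
    }


# Example record with illustrative values.
example = {
    "case_id": "case_00209",
    "captured_pixel_width": 0.78125,
    "captured_slice_thickness": 5,
}
affine = affine_from_record(example)
```

The nested list can be wrapped with numpy.array(affine) if an array is needed. The tuple order in load_spacings follows the (num_slices, height, width) axis order described under Loading Data.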

Since spatially inconsistent data might not be ideal for machine learning applications, we have created a branch called interpolated that contains the same data interpolated to a common affine transformation for every patient:

array([[ 0.        ,  0.        , -0.78162497,  0.        ],
       [ 0.        , -0.78162497,  0.        ,  0.        ],
       [-3.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  1.        ]])

Labeling Errors

We've gone to great lengths to produce the best segmentation labels that we could. That said, we're certainly not perfect. In an attempt to strike a balance between quality and stability, we've decided on the following policy:

If you find a problem with the data, please submit an issue describing it.

Challenge Results and References

The KiTS19 challenge was held in conjunction with MICCAI 2019 in Shenzhen, China. The official leaderboard for the challenge can be found here, and the live leaderboard for new submissions can be found on grand-challenge.org.

A paper describing the results and conclusions of the challenge has been accepted at Medical Image Analysis. For further reading, an in-depth description of how the dataset was collected and annotated can be found on arXiv. If this data is useful to your research, please cite these papers as

@article{heller2020state,
  title={The state of the art in kidney and kidney tumor segmentation in contrast-enhanced CT imaging: Results of the KiTS19 Challenge},
  author={Heller, Nicholas and Isensee, Fabian and Maier-Hein, Klaus H and Hou, Xiaoshuai and Xie, Chunmei and Li, Fengyi and Nan, Yang and Mu, Guangrui and Lin, Zhiyong and Han, Miofei and others},
  journal={Medical Image Analysis},
  pages={101821},
  year={2020},
  publisher={Elsevier}
}
@article{heller2019kits19,
  title={The kits19 challenge data: 300 kidney tumor cases with clinical context, ct semantic segmentations, and surgical outcomes},
  author={Heller, Nicholas and Sathianathen, Niranjan and Kalapara, Arveen and Walczak, Edward and Moore, Keenan and Kaluzniak, Heather and Rosenberg, Joel and Blake, Paul and Rengel, Zachary and Oestreich, Makinna and others},
  journal={arXiv preprint arXiv:1904.00445},
  year={2019}
}

kits19's People

Contributors

neheller

kits19's Issues

PermissionError

When I ran "python -m starter_code.get_imaging" to get the data, it led to an error:
"PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'D:\kits19-master\starter_code\temp.tmp' -> 'data\case_00000\imaging.nii.gz'"
But I was running nothing other than this command.

Is there another way to get the data, such as directly from GitHub?

Problems with cysts

Dear organizers,
Thanks for organizing the excellent challenge.

I found that some renal tumors and renal cysts seem very similar. I wonder if the tumor segmentation will be confused if the cyst is labeled as normal kidney?

Statistical comparisons for top solutions

Dear @neheller,

For the ranking scheme,

Teams will be ranked by their average score per case, with the highest being the winner.

Would it be possible to add pairwise statistical comparisons between top solutions as BraTS, Decathlon and iSeg did?

This may help to identify the real best solution because a small performance improvement can't indicate particular dominance of a method over the others.

Thanks for organizing the great challenge, and I really enjoy it.
Looking forward to your reply!

Best,
Jun

Constant data-spacing

React with a 👍 if you would find it useful to have separate branch(es) with the data and labels transformed to a fixed pixel_width and slice_thickness.

Information about kits.json

Hi, @neheller ,

Thanks for organizing this great challenge, from which I have learned a lot.

I would like to know when we will have access to the attribute information mentioned in the preprint, such as birth year, nephrectomy year, gender and so on.

Best,
Jet

How to rotate the image to standard view?

Dear organizers,
Thanks for organizing the excellent challenge.

I want to rotate the image to the standard view and wrote the code below, but it does not work.
Would it be possible for you to give me some instructions?

Following is my code.

import numpy as np
import nibabel as nb
import os

filepath = r'H:\Data\kits19\data'
savepath = r'H:\Data\kits19\preData'
if os.path.exists(savepath) is False: os.mkdir(savepath)

STD_AXCODES = ('S', 'A', 'R')
def do_reorientation(data_array, init_axcodes, final_axcodes):
    """
    Performs the reorientation (changing order of axes).

    :param data_array: Array to reorient
    :param init_axcodes: Initial orientation
    :param final_axcodes: Target orientation
    :return data_reoriented: New data array in its reoriented form
    """
    ornt_init = nb.orientations.axcodes2ornt(init_axcodes)
    ornt_fin = nb.orientations.axcodes2ornt(final_axcodes)
    if np.array_equal(ornt_init, ornt_fin):
        return data_array
    if np.any(np.isnan(ornt_init)) or np.any(np.isnan(ornt_fin)):
        raise ValueError
    try:
        ornt_transf = nb.orientations.ornt_transform(ornt_init, ornt_fin)
        data_reoriented = nb.orientations.apply_orientation(data_array, ornt_transf)
    except (ValueError, IndexError):
        raise ValueError

    return data_reoriented


filenames = os.listdir(filepath)
i = 0
for name in filenames[0:1]:
    img_nii = nb.load(os.path.join(filepath, name+'\\imaging.nii.gz'))
    img_data = img_nii.get_data()
    
    label_data = nb.load(os.path.join(filepath, name+'\\segmentation.nii.gz')).get_data()
    # reorientation to standard view
    _axcodes = tuple(nb.aff2axcodes(img_nii.affine)) # affine is a 4*4 matrix
    img_data = do_reorientation(img_data, _axcodes, STD_AXCODES)
    label_data = do_reorientation(label_data, _axcodes, STD_AXCODES)
    
    img_save_name = 'kidneyvol-' + str(i) + '.nii.gz'
    label_save_name = 'kidneymask-'+ str(i) + '.nii.gz'
#    new_affine = np.zeros_like(img_nii.affine)
#    new_affine[0][0] = img_nii.affine[2][2]
#    new_affine[2][2] = img_nii.affine[0][0]
    # save results
    save_img_nii = nb.Nifti1Image(img_data, img_nii.affine, img_nii.header)
    nb.save(save_img_nii, os.path.join(savepath, img_save_name))
    
    save_label_nii = nb.Nifti1Image(label_data, img_nii.affine, img_nii.header)
    nb.save(save_label_nii, os.path.join(savepath, label_save_name))

Results:
Markdown

Looking forward to your reply!
Best regards,
Edward

Data download issue

Cloning into 'kits19'...
remote: Enumerating objects: 3890, done.
remote: Counting objects: 100% (3890/3890), done.
remote: Compressing objects: 100% (3853/3853), done.
remote: Total 3890 (delta 48), reused 3867 (delta 32), pack-reused 0
Receiving objects: 100% (3890/3890), 470.30 KiB | 866.00 KiB/s, done.
Resolving deltas: 100% (48/48), done.
Downloading data/case_00000/imaging.nii.gz (226 MB)
Error downloading object: data/case_00000/imaging.nii.gz (cdae5f3): Smudge error: Error downloading data/case_00000/imaging.nii.gz (cdae5f3e0fbc7c98ab0430b3de42677abda2c1cf93ae0f86bd29fb8606688cb7): batch response: This repository is over its data quota. Purchase more data packs to restore access

Errors logged to /content/workspace/kits19/.git/lfs/objects/logs/20190606T062731.240685733.log
Use `git lfs logs last` to view the log.
error: external filter 'git-lfs filter-process' failed
fatal: data/case_00000/imaging.nii.gz: smudge filter lfs failed
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry the checkout with 'git checkout -f HEAD'

It shows that the repository's data quota has run out...

Regarding predictions submission

I see the rules say I need to submit files in .nii format.
I used .tif (converted .nii to .tif) to run a 2D U-Net and generate prediction scores. Is it possible to submit .tif-based test predictions instead of .nii?

ImageFileError

from starter_code.utils import load_case

volume, segmentation = load_case("case_00125")

throws the error : ImageFileError: Cannot work out file type of "/Users/divyanshuaggarwal/Desktop/kits19-master/data/case_00123/imaging.nii.gz"

Holes in annotations of large kidney tumors

I think this was not reported before: I noticed that cases 42 and 114 have a lot of holes in the annotations of the kidney tumor. Note that both cases have large kidney tumors; I did not see this problem on small kidney tumors. See the image of case 42 below:

kidney_tumor_holes

how can I find training, test and validation data?

Somehow I downloaded all 300 files into one folder. How can I separate the train set from the test set? Also, any insight into how the top teams generated their validation sets (a random split, a 70:30 split, or something else) would be very helpful. Thanks.
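One possible way to hold out a validation set is sketched below. It assumes, following the README's directory listing and the test-case numbering quoted elsewhere in these issues, that case_00000 through case_00209 carry segmentations and that case_00210 through case_00299 are the unlabeled test cases. The 80/20 ratio, the fixed seed and the function name split_cases are arbitrary choices for illustration, not a description of how any team split the data.

```python
import random


def split_cases(num_labeled=210, val_fraction=0.2, seed=0):
    """Randomly split the labeled case IDs into train and validation lists."""
    case_ids = [f"case_{i:05d}" for i in range(num_labeled)]
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(case_ids)
    num_val = int(round(num_labeled * val_fraction))
    val = sorted(case_ids[:num_val])
    train = sorted(case_ids[num_val:])
    return train, val


train_cases, val_cases = split_cases()

# Cases without a segmentation.nii.gz are treated as the test set.
test_cases = [f"case_{i:05d}" for i in range(210, 300)]
```

Splitting by case ID rather than by slice keeps all slices of one patient on the same side of the split.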

Full Installation Guide

I think it's worth mentioning that when cloning the repository with git clone https://github.com/neheller/kits19, the segmentations are not always cloned.

I ended up having to run

git lfs install --skip-smudge --skip-repo  # also added this
git clone https://github.com/neheller/kits19
cd kits19
git lfs pull  # added this

to get the segmentations to download fully.

I just thought this was worth noting in the README. Was this true for other people as well?

code error !!!

usage: ipykernel_launcher.py [-h] -c CASE_ID -d DESTINATION
[-u UPPER_HU_BOUND] [-l LOWER_HU_BOUND]
ipykernel_launcher.py: error: the following arguments are required: -c/--case_id, -d/--destination

Extracting images from NIFTI

The starter code works fine, and some minor details were cleared up by the nibabel docs, but now I have a numpy array of float64 with values in the +/- thousands and no clear understanding of what to do next: do I cast it to uint8, normalize it to [0, 1], or is it perhaps an int16 image? Could you please clarify?
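CT volumes are normally stored in Hounsfield units (HU), which would explain values in the plus or minus thousands. A common preprocessing step, sketched below, is to clip to a window of interest and rescale. The bounds of -512 and 512 HU and the function names are assumptions chosen for illustration. The -u and -l options of visualize.py (UPPER_HU_BOUND, LOWER_HU_BOUND) suggest that it applies a similar clip, but check the starter code for its actual defaults.

```python
import numpy as np


def window_hu(volume, lower=-512.0, upper=512.0):
    """Clip to [lower, upper] HU and rescale linearly to [0, 1] as float32."""
    clipped = np.clip(np.asarray(volume, dtype=np.float32), lower, upper)
    return (clipped - lower) / (upper - lower)


def to_uint8(volume, lower=-512.0, upper=512.0):
    """Windowed volume quantized to 0..255, e.g. for writing PNG slices."""
    return np.round(window_hu(volume, lower, upper) * 255.0).astype(np.uint8)


# Small demonstration on a handful of HU values.
demo = np.array([-2000.0, -512.0, 0.0, 512.0, 3000.0])
scaled = window_hu(demo)
quantized = to_uint8(demo)
```

Casting the raw array straight to uint8 would wrap values outside 0..255, so the clip and rescale should come first.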

Some cases have a common error

Thanks for sharing the large-scale dataset.

I opened these nii.gz files in ITK-SNAP and found that the following cases share a common error: case_00015, case_00025, case_00061 and case_00117 repeat some slices containing kidney and tumor, but without the corresponding 'kidney' and 'tumor' masks.

I recommend removing these slices from the cases listed above.

Is spacing and direction of amended training data correct?

3D Slicer shows the wrong image spacing and three-plane view, as follows (please look inside the borders).
wrong_image

I fixed the image spacing and direction with SimpleITK.

import SimpleITK as sitk
image = sitk.ReadImage('./data/case_00000/imaging.nii.gz')
image.SetDirection((0,0,1,0,1,0,-1,0,0))
image.SetSpacing((0.5, 0.919921875, 0.919921875)) # these values are from data/kits.json
sitk.WriteImage(image, './data/case_00000/imaging_fix.nii.gz')

3D Slicer now shows the correct image spacing and three-plane view, as follows.
correct_image

I checked only cases 00000 and 00001, but the other cases are probably wrong too.

Thank you.

Test set submission instructions are unclear

Hi there,
I took this from Rules -> Submission format on the KiTS Homepage. I'm not sure if a GitHub issue is the correct place to ask this, so please feel free to redirect me.

Predictions should be submitted as a zip (your-teamname.zip) archive of .nii.gz files named prediction_00210.nii.gz, ..., prediction_000299.npy corresponding to the ordered list of 90 test cases. That is, you should generate each of these prediction files in a particular directory, and then from that directory, run the following command:
zip your-teamname.zip case_*.nii.gz

The name of the predicted segmentations is all over the place. First it's prediction_XXXXX.nii.gz, then the next one is prediction_XXXXXX.npy and finally in the last line it is case_??.nii.gz. Would be great if you could update this description.

Thank you so much for organizing this challenge and putting all the effort into this! It's a really nice dataset and also quite nicely annotated, despite the occasional hiccup people seem to have noticed regarding some of the cases!

Best,
Fabian
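Until the description is clarified, one way to build the archive is sketched below. The file name pattern prediction_XXXXX.nii.gz with five digits is an assumption, since the quoted rules are inconsistent on exactly this point. The helper name zip_predictions is hypothetical, and the placeholder files in the demonstration stand in for real predictions.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_predictions(pred_dir, team_name, first_case=210, last_case=299):
    """Zip prediction_XXXXX.nii.gz files (assumed naming) into <team_name>.zip."""
    pred_dir = Path(pred_dir)
    names = [f"prediction_{i:05d}.nii.gz" for i in range(first_case, last_case + 1)]
    missing = [n for n in names if not (pred_dir / n).is_file()]
    if missing:
        raise FileNotFoundError(
            f"{len(missing)} prediction files missing, e.g. {missing[0]}"
        )
    archive = pred_dir / f"{team_name}.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in names:
            # Store files at the archive root, without the directory prefix.
            zf.write(pred_dir / name, arcname=name)
    return archive


# Demonstration with placeholder files in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(210, 300):
        (Path(tmp) / f"prediction_{i:05d}.nii.gz").write_bytes(b"placeholder")
    archive_path = zip_predictions(tmp, "your-teamname")
    with zipfile.ZipFile(archive_path) as zf:
        archived = zf.namelist()
```

Checking for missing files before zipping avoids submitting an incomplete archive by accident.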

Cannot download data

I cloned without running git lfs install, and this is the output:
$ git clone https://github.com/neheller/kits19.git
Cloning into 'kits19'...
remote: Enumerating objects: 679, done.
remote: Counting objects: 100% (679/679), done.
remote: Compressing objects: 100% (655/655), done.
Receiving objects: 74%remote: Total 679 (delta 19), reused 679 (delta 19), pack-reused 0
Receiving objects: 88% (598/679), 60.00 KiB | 57.00 KiB/s
Receiving objects: 100% (679/679), 93.87 KiB | 65.00 KiB/s, done.
Resolving deltas: 100% (19/19), done.
After this, there is no response and no "Filtering content" step.

When I cloned after running git lfs install, I found that each nii file is only 1 KB.

Have you solved this problem? Can anyone share this dataset on Google Drive or Baidu Drive?
Thanks a lot!

CID 155 cyst mislabeled as background?

I found the left kidney of cid=155 has a cyst-like blob labeled as background near the kidney hilum.

image

Is this not a cyst (which should be labeled 0) or a tumor (1)?
I would also appreciate it if the organizers could share how annotators distinguish cysts from tumors.

Thanks in advance.

Python script to download Test Files (case_00210 to case_00299)

import os
import requests
from tqdm import tqdm

dataDir = r""  # The target dir you want to store the downloaded files
caseNamePattern = "case_%05d"
dataUrl = "https://media.githubusercontent.com/media/neheller/kits19/master/data/"
imageUrlPattern = dataUrl + caseNamePattern + "/imaging.nii.gz"

# segmentationUrlPattern = dataUrl + caseNamePattern + "/segmentation.nii.gz"

# Use tqdm_notebook in a Jupyter notebook to show the download progress
for case in tqdm(range(210, 300)):
    caseDir = os.path.join(dataDir, caseNamePattern % case)
    print(caseDir)
    os.makedirs(caseDir, exist_ok=True)  # do not fail if the directory already exists
    imageUrl = imageUrlPattern % case
    imageR = requests.get(imageUrl)
    imageR.raise_for_status()  # stop on HTTP errors instead of saving an error page
    with open(os.path.join(caseDir, "imaging.nii.gz"), "wb") as f:
        f.write(imageR.content)

kits data download script.txt

Downloaded data is only 1 KB in size

When I use the command

git clone https://github.com/neheller/kits19

the output is

Cloning into 'kits19'...
remote: Enumerating objects: 3876, done.
remote: Counting objects: 100% (3876/3876), done.
remote: Compressing objects: 100% (3843/3843), done.
remote: Total 3876 (delta 41), reused 3855 (delta 28), pack-reused 0
Receiving objects: 100% (3876/3876), 468.09 KiB | 70.00 KiB/s, done.
Resolving deltas: 100% (41/41), done.
Checking connectivity... done.

and each imaging.nii.gz file is only 1 KB.

data download

How can I fix this error and continue downloading?

Error downloading object: data/case_00008/imaging.nii.gz (ce45627): Smudge error: Error downloading data/case_00008/imaging.nii.gz (ce45627ceffdcd225655132ae50d9d1e7661d1c41f171e3419bb9ebd286e14d8): cannot write data to tempfile "/data/kits19/.git/lfs/incomplete/ce45627ceffdcd225655132ae50d9d1e7661d1c41f171e3419bb9ebd286e14d8.tmp": LFS: unexpected EOF

Errors logged to /data/kits19/.git/lfs/logs/20190704T230255.737132676.log
Use git lfs logs last to view the log.
error: external filter git-lfs smudge -- %f failed 2
error: external filter git-lfs smudge -- %f failed
fatal: data/case_00008/imaging.nii.gz: smudge filter lfs failed

Data Record, KiTS.json file

Only 3 fields are present in KiTS.json. But in the research paper more than 50 attributes are described.

for each case the json file has only:

{
"case_id": "case_00209",
"captured_pixel_width": 0.78125,
"captured_slice_thickness": 5
}

Visualizing the data fails

The get_data() method in visualize.py fails with this Trace:
Traceback (most recent call last):
File "/home/jp/kits19/starter_code/visualize.py", line 188, in
plane=args.plane
File "/home/jp/kits19/starter_code/visualize.py", line 82, in visualize
vol = vol.get_data()
File "/usr/local/lib/python3.6/dist-packages/nibabel/deprecator.py", line 162, in deprecated_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/nibabel/dataobj_images.py", line 208, in get_data
data = np.asanyarray(self._dataobj)
File "/home/bubu/.local/lib/python3.6/site-packages/numpy/core/_asarray.py", line 138, in asanyarray
return array(a, dtype, copy=False, order=order, subok=True)
File "/usr/local/lib/python3.6/dist-packages/nibabel/arrayproxy.py", line 393, in array
arr = self._get_scaled(dtype=dtype, slicer=())
File "/usr/local/lib/python3.6/dist-packages/nibabel/arrayproxy.py", line 360, in _get_scaled
scaled = apply_read_scaling(self._get_unscaled(slicer=slicer), scl_slope, scl_inter)
File "/usr/local/lib/python3.6/dist-packages/nibabel/arrayproxy.py", line 339, in _get_unscaled
mmap=self._mmap)
File "/usr/local/lib/python3.6/dist-packages/nibabel/volumeutils.py", line 523, in array_from_file
n_read = infile.readinto(data_bytes)
File "/usr/lib/python3.6/gzip.py", line 276, in read
return self._buffer.read(size)
File "/usr/lib/python3.6/_compression.py", line 68, in readinto
data = self.read(len(byte_view))
File "/usr/lib/python3.6/gzip.py", line 471, in read
uncompress = self._decompressor.decompress(buf, size)
zlib.error: Error -3 while decompressing data: invalid code lengths set

CID 166 hydronephrosis mislabeled as cyst?

Dear organizers,
thank you for replying to issue #10 !

I found another suspicious ground truth CID=166, in which a water-value blob in the right kidney is labeled 0. When tracking the blob downwards, it looks like a part of the ureter, and thus it might be appropriate to exclude it from the "strict kidney" labeling. The cyst-like appearance might be a result of hydronephrosis instead of a parapelvic cyst.

image

Thank you in advance.

No module named starter_code

sayantan@kali:~/kits19$ python3 starter_code/visualize.py -c case 123 -d /home/sayantan/Desktop/
Traceback (most recent call last):
File "starter_code/visualize.py", line 7, in
from starter_code.utils import load_case
ModuleNotFoundError: No module named 'starter_code'

Test set optimization

Within 24 hours of submitting, you will receive an email prompting whether or not you would like to hear your score. Scores for each team will provided only twice, but you may keep submitting after receiving two scores. The most recent submission prior to the deadline will be the one used for the competition.

Erm... Now that is something I don't like to see. Can you please at least add Gaussian Noise (or any other type of noise that prevents us from knowing the exact Dice scores) to the reported values? Anything that obfuscates +- 1 dice point would be greatly appreciated.

I apologize if I appear a little rude, but optimizations on the test set really should be avoided. Someone could create more than one team and get plenty of feedback.

Best,
Fabian

Lack of information in kits.json file

In the 'The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository' paper, N. Heller et al. said that there is information about a lot of things, but we only have the pixel width and slice thickness. We don't have 'pathology t stage', 'pathology n stage', 'pathology m stage', etc. Can you please share the others too?
Thanks

Dimension (width) of case_00160 is not the same

The width of case_00160 is not the same as the others: its shape is (*, 512, 796), so it may cause an error when fed into a neural network.

The dimensions of all the other cases are (*, 512, 512).

  • * means the variable depth of the NIfTI image

Is this label correct on case 00015?

Hi,

I found that in the interpolated scan of case 00015, the mask label 1 signifying kidney extends beyond the kidney region. Please see the images below.

Screenshot 2019-05-02 at 8 33 28 PM

Screenshot 2019-05-02 at 8 33 53 PM

Is this label correct?

Thanks

Problem downloading data

I've installed git lfs successfully, and it seems that I could also clone the git repo successfully.
However, each .nii file I downloaded is only 134 bytes.
Could anyone help me?
Thanks!

Here's the output:
~/Desktop/Research/Github/kits19$ git clone https://github.com/neheller/kits19
Cloning into 'kits19'...
remote: Enumerating objects: 676, done.
remote: Counting objects: 100% (676/676), done.
remote: Compressing objects: 100% (671/671), done.
remote: Total 676 (delta 17), reused 638 (delta 0), pack-reused 0
Receiving objects: 100% (676/676), 92.89 KiB | 0 bytes/s, done.
Resolving deltas: 100% (17/17), done.
Checking connectivity... done.
Checking out files: 100% (429/429), done.
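Files of roughly 130 bytes are typically Git LFS pointer files rather than the images themselves: small text stubs that Git LFS is supposed to replace with the real content at checkout. A quick way to check, sketched below, is to look for the pointer header at the start of each file. The helper names are hypothetical, and the demonstration writes stand-in files to a throwaway directory.

```python
import tempfile
from pathlib import Path

# First line of every Git LFS pointer file.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file starts with the Git LFS pointer header."""
    with open(path, "rb") as f:
        return f.read(len(LFS_HEADER)) == LFS_HEADER


def find_pointer_files(data_dir="data"):
    """List .nii.gz files under data_dir that are still LFS pointers."""
    return sorted(
        p for p in Path(data_dir).rglob("*.nii.gz") if is_lfs_pointer(p)
    )


# Demonstration: one pointer stub and one file with gzip magic bytes.
with tempfile.TemporaryDirectory() as tmp:
    case_dir = Path(tmp) / "case_00000"
    case_dir.mkdir()
    (case_dir / "imaging.nii.gz").write_bytes(
        LFS_HEADER + b"\noid sha256:<hash>\nsize <bytes>\n"
    )
    (case_dir / "segmentation.nii.gz").write_bytes(b"\x1f\x8b\x08\x00")
    pointer_names = [p.name for p in find_pointer_files(tmp)]
```

If find_pointer_files() returns anything on a real clone, the actual files were never fetched. Running git lfs pull inside the clone or, for the imaging files, python3 -m starter_code.get_imaging as described in the README may replace them.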
