Hi Julian, I am trying to build a nodule detector based on your work, and thanks ve

Issue about the negative data and label about kaggle_ndsb2017 HOT 6 CLOSED

juliandewit commented on May 21, 2024

Issue about the negative data and label

from kaggle_ndsb2017.

Comments (6)

juliandewit commented on May 21, 2024

Hello. The candidatesv2 entries are also negative examples (there are around 400,000 negatives there).
Basically, that is the most important source of negatives. The edge examples only let the network know that non-lung tissue is also not a lung nodule.
Another (small) source of negatives is the set of false positives that were predicted after one round of training on LUNA16.

The network learns 2 things at once:
1: Lung nodule y/n (non-lung tissue should always be n).
2: Malignancy (0 if not a lung nodule, 0.1-25 if lung nodule).
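As a minimal sketch of the two targets described above, assuming each training cube comes with a 0/1 nodule flag and a malignancy score (the function name `make_targets` and the input format are hypothetical, not taken from the repository):

```python
import numpy as np

def make_targets(is_nodule, malignancy):
    """Build the two training targets for a batch of cubes.

    is_nodule:  0/1 flags, one per cube (hypothetical input format).
    malignancy: malignancy scores, ignored wherever is_nodule == 0.
    """
    is_nodule = np.asarray(is_nodule, dtype=np.float32)
    malignancy = np.asarray(malignancy, dtype=np.float32)
    # Target 1: nodule yes/no. Non-lung tissue and other negatives are 0.
    class_target = is_nodule
    # Target 2: malignancy, forced to 0 wherever the cube is not a nodule.
    malignancy_target = malignancy * is_nodule
    return class_target, malignancy_target

# Second cube is a negative, so its malignancy target becomes 0.
cls, mal = make_targets([1, 0, 1], [4.0, 9.0, 25.0])
```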

I train/predict on 32x32x32 cubes. The prediction is nodule Y/N.
If yes, then I also look at the malignancy.
Malignancy is the only thing I work with for the final prediction.
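The cube-then-filter flow can be sketched roughly as follows. This is an illustration under assumptions, not the repository's code: the helper names, the (z, y, x) axis order, and the 0.5 threshold are all hypothetical.

```python
import numpy as np

CUBE_SIZE = 32  # cube edge length mentioned in the thread

def crop_cube(volume, center, size=CUBE_SIZE):
    """Crop a size^3 cube around center (z, y, x), shifted to stay inside the volume.

    Assumes every volume dimension is at least `size`.
    """
    starts = []
    for c, dim in zip(center, volume.shape):
        start = int(c) - size // 2
        start = max(0, min(start, dim - size))  # clamp to valid range
        starts.append(start)
    z, y, x = starts
    return volume[z:z + size, y:y + size, x:x + size]

def malignancy_of_detected(p_nodule, malignancy, threshold=0.5):
    """Keep malignancy predictions only for cubes classified as nodules.

    The threshold value is an assumption, not a number from the thread.
    """
    p_nodule = np.asarray(p_nodule, dtype=np.float32)
    malignancy = np.asarray(malignancy, dtype=np.float32)
    return malignancy[p_nodule > threshold]

volume = np.zeros((40, 64, 64), dtype=np.float32)
cube = crop_cube(volume, center=(2, 60, 30))
kept = malignancy_of_detected([0.9, 0.2, 0.7], [3.0, 8.0, 1.5])
```

How the kept malignancy values are combined into one prediction per patient is not stated in this thread, so the sketch stops at the filtering step.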

I hope that makes things clearer.
It's quite a complex solution with all the different label sources.

ypflll commented on May 21, 2024

Quite clear, and really complicated and refined work.

But the question is: where does candidatesv2 come from?
In step1_preprocess_luna16.py, it seems that you generate your negative samples from two files: lidc.xml and annotation_excluded.csv (is that candidate.csv?).
So, where are they from?
If I don't have such files in my case, should I manually cut random lung-tissue cubes (that do not contain a nodule) as negative samples?
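For illustration, random negative sampling of the kind asked about could look like the sketch below. It is only an assumption of how one might do it: the function name, the minimum distance, and the retry limit are hypothetical, and it samples anywhere in the volume, so restricting centers to lung tissue would additionally need a lung mask.

```python
import numpy as np

def sample_negative_centers(shape, nodule_centers, n_samples,
                            min_dist=32, cube=32, seed=0, max_tries=10000):
    """Pick random cube centers (z, y, x) that stay away from known nodules.

    shape:          volume shape in voxels, each dimension larger than `cube`.
    nodule_centers: list of (z, y, x) nodule positions in voxels (may be empty).
    """
    rng = np.random.default_rng(seed)
    nodules = np.asarray(nodule_centers, dtype=np.float64).reshape(-1, 3)
    half = cube // 2
    low = np.full(3, half)
    high = np.asarray(shape) - half  # exclusive upper bound, keeps the cube inside
    picked = []
    for _ in range(max_tries):
        if len(picked) == n_samples:
            break
        center = rng.integers(low, high)
        # Accept only centers at least min_dist voxels from every nodule.
        if nodules.size == 0 or np.linalg.norm(nodules - center, axis=1).min() >= min_dist:
            picked.append(tuple(int(v) for v in center))
    return picked

centers = sample_negative_centers((64, 128, 128), [(32, 64, 64)], n_samples=5)
```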

juliandewit commented on May 21, 2024

In the resources folder there is a link to "resources.rar" in the readme.md.
This file contains all the data you need and even more.

In the resources.rar there is a folder "luna16_annotations".
In that folder there is candidatesv2.csv.
This file is directly taken from the LUNA16 competition.
Look here for more:
LUNA16 data
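A minimal sketch of pulling the negatives out of that file, assuming the LUNA16 column layout `seriesuid,coordX,coordY,coordZ,class` with `class` equal to 0 for non-nodules (check the header of your copy; the function name is hypothetical and the sample rows below are made up):

```python
import csv
import io

def load_negative_candidates(fileobj):
    """Return (seriesuid, x, y, z) tuples for all rows labelled as non-nodules."""
    negatives = []
    for row in csv.DictReader(fileobj):
        if row["class"] == "0":
            negatives.append((
                row["seriesuid"],
                float(row["coordX"]),
                float(row["coordY"]),
                float(row["coordZ"]),
            ))
    return negatives

# Made-up rows standing in for the real file contents.
sample = io.StringIO(
    "seriesuid,coordX,coordY,coordZ,class\n"
    "scan-a,-56.08,-67.85,-311.92,0\n"
    "scan-a,53.21,-244.41,-245.17,1\n"
)
negatives = load_negative_candidates(sample)
```

With the real file you would pass an open file handle instead of the `io.StringIO` sample. The coordinates are in world (millimetre) space, so they still need converting to voxel indices using each scan's origin and spacing before cubes can be cut.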

ypflll commented on May 21, 2024

Got it.
I am in Tianchi (a competition held by Alibaba, China). In my case, only nodule information was given.
It seems that I first need to train a 3D U-Net to generate false positive samples.

juliandewit commented on May 21, 2024

Hi, I looked at the competition.
My Chinese is not too good :S

I do think this approach can be translated to that competition, since the #6 team of the Data Science Bowl is #1 now at your competition.

Good luck!

ypflll commented on May 21, 2024

Tianchi's English version lacks important information -_-||

4th place in Kaggle is now the first place in Tianchi.
So we need to do more.
Thanks.
