
deeppath's People

Contributors

ncoudray, pnayebvali, teosakel

deeppath's Issues

3-way classifier

Hi,
I have a question about the 3-way classifier step.
When I run the code as in the lung cancer example (run 0d_SortTiles.py --SourceFolder='E:/marklee/master/git/DeepPATH/DeepPATH_code/preprocessing/512px_Tiled' --Magnification=20.0 --MagDiffAllowed=0 --SortingOption=3 --PatientID=12 --nSplit 0 --JsonFile='D:/evi/downloads/lung10.29/metadata.cart.2019-10-29.json' --PercentTest=15 --PercentValid=15), it errors as shown in the image below.
[screenshot of error message]
Could you help me solve this problem? Thank you very much.

Data augmentation

Thanks so much for your work in preparing this repository; it is an incredibly helpful resource. I was wondering whether you (or anyone reading) has experience combining this workflow with on-the-fly data augmentation (rotation/flip, etc.). There seems to be a reasonable body of evidence that augmentation improves performance in digital pathology tasks.

Best wishes,

Dr Sam Kleeman
Barts Cancer Institute
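
(A minimal sketch of on-the-fly flip/rotation augmentation for a decoded tile tensor, assuming a TF 1.x input pipeline like the one this repository uses; it is not part of the repository itself.)

import tensorflow as tf

def augment_tile(image):
    # 'image' is assumed to be a decoded [height, width, 3] tile tensor.
    # Random flips plus a random 0/90/180/270-degree rotation.
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_flip_up_down(image)
    k = tf.random_uniform([], minval=0, maxval=4, dtype=tf.int32)
    image = tf.image.rot90(image, k=k)
    return image

Calling augment_tile on each training tile before batching gives fresh random orientations every epoch.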

Issue about Gene mutation prediction

Hi,

We are working on essentially the same task as yours. In particular, we want to reproduce the case studies in your Nature paper. I have several concerns about gene mutation prediction on LUAD. We have all the data from TCGA and we are currently using PyTorch. We used the same experimental setup you describe in your Nature paper. We tried for four months but failed to obtain the model performance you demonstrated. After reviewing your paper, I found some points that confused me, so I would like to ask:

  • Can you give us a comprehensive description of the gene mutation prediction setup, including all parameters and operations? The description in your Nature paper is not detailed enough for us to reproduce the work.

  • You mentioned that you used the labels from TCGA but tested your model using the results given by the LUAD classifier. I'm a little confused: why didn't you keep things consistent? To me, it would make sense to use TCGA labels for both training and testing.

Rowen
13/12/2019

How is the loss after transfer learning?

I am trying to use about 100 slides to fine-tune the Inception model, going through all the steps, but the loss drops very, very slowly. Is that normal? Should I use more slides for training? Looking forward to your reply; thank you very much!
[screenshot of training loss]

How to get the ROC curve

Hi,
I have run 0h_ROC_MultiOutput_BootStrap.py and got the txt output shown in the photo below. How can I get the ROC curve itself?
Thanks,
Mark
[screenshot of txt output]
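
(A minimal sketch of drawing a ROC curve from per-tile labels and scores with scikit-learn; the toy arrays below stand in for the values you would parse out of the stats file, whose exact column layout I am not assuming here.)

import matplotlib.pyplot as plt
from sklearn.metrics import auc, roc_curve

# Replace these with the true labels and predicted probabilities from your file.
y_true = [0, 0, 1, 1, 1]
y_score = [0.10, 0.40, 0.35, 0.80, 0.90]

fpr, tpr, _ = roc_curve(y_true, y_score)
plt.plot(fpr, tpr, label='AUC = %.3f' % auc(fpr, tpr))
plt.plot([0, 1], [0, 1], linestyle='--')  # chance diagonal
plt.xlabel('False positive rate')
plt.ylabel('True positive rate')
plt.legend()
plt.savefig('roc.png')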

Some questions about the dataset

Dear Nicolas,

Thanks for your brilliant work, high-quality code, and detailed documentation; they really helped me a lot. I still have a few questions, listed below; I hope to hear from you! Thanks!
1: The images from TCGA are not as clean as we expected; some are low quality or marked with pen by the doctors. Do these "dirty" images/tiles hurt model performance? Did you do any manual or automatic filtering of these dirty tiles?
2: The labels are not always correct at the tile level. As far as I know, tumour images usually contain normal regions, and the so-called normal images sometimes contain cancerous regions. Did you apply any special processing for these wrongly labelled tiles?
3: How are your AUC values calculated? Are they based on the per-tile predictions?
4: Have you evaluated the influence of the tile size?
5: How do you expect your methodology to perform on other types of cancer? Could you share some insights on this?

Thank you very much! Looking forward to hearing from you!
Best wishes!
D

Validation and testing questions

Hello,

I've run through all the steps except for generating the ROC curves, and have a few questions on interpreting results that I am hoping you can help me with.
More specifically, here is what I've done: I tiled all .svs files at 5x magnification only, and have only performed transfer learning (on Inception v3), for 10,000 batches of 30 tiles, classifying for normal, LUAD, and LUSC.

Running on a Google VM with a single K80 GPU.

  1. During validation, nc_imagenet_eval.py appears to complete successfully, but reports the following for each image in the validation set:
    ERROR: Object was never used (type <class 'tensorflow.python.framework.ops.Operation'>): <tf.Operation 'init' type=NoOp> If you want to mark it as used call its "mark_used()" method.
    There are multiple reports of this error on stackoverflow, but no solution that I can find. Have you encountered this? I'm running tensorflow-gpu 1.13.1.

  2. As I mentioned, validation does seem to complete successfully, generating these five AUC files:
    ::::::::::::::
    valid_out2_AvPb_AUCs_1.txt
    ::::::::::::::
    /mnt/disks/deeppath-data/Data/images/tilings/299Px0Ol25Bg5Mg/eval results/ 10000 auc 0.9758 CIs 0.9517 0.9942 t0.127599
    ::::::::::::::
    valid_out2_AvPb_AUCs_2.txt
    ::::::::::::::
    /mnt/disks/deeppath-data/Data/images/tilings/299Px0Ol25Bg5Mg/eval results/ 10000 auc 0.9493 CIs 0.9218 0.9712 t0.573981
    ::::::::::::::
    valid_out2_AvPb_AUCs_3.txt
    ::::::::::::::
    /mnt/disks/deeppath-data/Data/images/tilings/299Px0Ol25Bg5Mg/eval results/ 10000 auc 0.9641 CIs 0.9411 0.9815 t0.437919
    ::::::::::::::
    valid_out2_AvPb_AUCs_macro.txt
    ::::::::::::::
    /mnt/disks/deeppath-data/Data/images/tilings/299Px0Ol25Bg5Mg/eval results/ 10000 auc 0.9654 CIs 0.9455 0.9795
    ::::::::::::::
    valid_out2_AvPb_AUCs_micro.txt
    ::::::::::::::
    /mnt/disks/deeppath-data/Data/images/tilings/299Px0Ol25Bg5Mg/eval results/ 10000 auc 0.9496 CIs 0.9329 0.9632
    Can you explain what these are about? I presume the first three files correspond to Normal, LUAD and LUSC respectively. What are the macro and micro files, and what are the CI and t values?

  3. The file out_filename_Stats.txt that is generated by nc_imagenet_eval.py during testing seems to repeat the evaluation for each tile multiple times, e.g.:
    test_TCGA-58-8386-11A-01-TS1.7276c7ee-44c5-45e8-a8cd-1e456dc4795f_3_4.dat True [0.02668606 0.7224843 0.20596845 0.04486124] 0.7422931346515897 labels: 1
    test_TCGA-58-8386-11A-01-TS1.7276c7ee-44c5-45e8-a8cd-1e456dc4795f_3_5.dat True [0.01350667 0.9084061 0.06515231 0.01293499] 0.9208435613386187 labels: 1
    test_TCGA-58-8386-11A-01-TS1.7276c7ee-44c5-45e8-a8cd-1e456dc4795f_3_6.dat True [0.02568119 0.62040573 0.24854586 0.10536721] 0.6367584581442096 labels: 1
    test_TCGA-58-8386-11A-01-TS1.7276c7ee-44c5-45e8-a8cd-1e456dc4795f_4_6.dat False [0.03846446 0.29852733 0.4035099 0.25949836] 0.31046934641665214 labels: 1
    test_TCGA-58-8386-11A-01-TS1.7276c7ee-44c5-45e8-a8cd-1e456dc4795f_3_4.dat True [0.0266861 0.722484 0.20596856 0.04486128] 0.7422929477823531 labels: 1
    test_TCGA-58-8386-11A-01-TS1.7276c7ee-44c5-45e8-a8cd-1e456dc4795f_3_5.dat True [0.01350666 0.908406 0.0651524 0.01293498] 0.9208434861390423 labels: 1
    test_TCGA-58-8386-11A-01-TS1.7276c7ee-44c5-45e8-a8cd-1e456dc4795f_3_6.dat True [0.02568119 0.62040573 0.24854586 0.10536721] 0.6367584581442096 labels: 1
    test_TCGA-58-8386-11A-01-TS1.7276c7ee-44c5-45e8-a8cd-1e456dc4795f_4_6.dat False [0.03846446 0.29852733 0.4035099 0.25949836] 0.31046934641665214 labels: 1
    ... etc
    The above file, for example, is repeated 75 times. What are these repetitions about? And can you explain the four values in brackets?

Thanks as always.
Bill

some bugs in your code

Well, in your code you use rstrip to remove a suffix from a string, which can cause problems. For example, imgRootName = imgRootName.rstrip('_files') in the file "0d_SortTiles.py".

>>> 'helle_files'.rstrip('_files')
'h'

In this case, imgRootName will not be found in the jdata map, and the whole directory will not be copied to the train/valid/test directory.
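
(A minimal fix sketch: rstrip('_files') strips any run of the characters {_, f, i, l, e, s} from the right, whereas the helper below removes only the literal suffix. On Python 3.9+, str.removesuffix('_files') does the same thing.)

def strip_suffix(name, suffix):
    # Remove an exact trailing suffix, unlike rstrip, which strips a character set.
    if name.endswith(suffix):
        return name[:-len(suffix)]
    return name

>>> strip_suffix('helle_files', '_files')
'helle'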

Use 0b_tileLoop_deepzoom4.py to only get tile coordinates

Hi,
I am not very familiar with the OpenSlide code, so apologies if this is a trivial question. I would like to get only the coordinates where tissue can be found, without actually creating tiles for the WSI. Could I use 0b_tileLoop_deepzoom4.py for that? Getting the file names of the tiles would suffice.
Thanks
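
(A minimal sketch of enumerating DeepZoom tile addresses with openslide-python and recording only the ones that look like tissue, rather than using the repo's tiling script; 'slide.svs', the 512-pixel tile size, and the 220/50% background rule are illustrative assumptions.)

import numpy as np
import openslide
from openslide.deepzoom import DeepZoomGenerator

slide = openslide.OpenSlide('slide.svs')
dz = DeepZoomGenerator(slide, tile_size=512, overlap=0, limit_bounds=True)
level = dz.level_count - 1            # highest-resolution DeepZoom level
cols, rows = dz.level_tiles[level]

tissue_coords = []
for col in range(cols):
    for row in range(rows):
        tile = np.asarray(dz.get_tile(level, (col, row)).convert('RGB'))
        # Treat near-white pixels (all channels above 220) as background.
        bg_fraction = np.mean(np.all(tile > 220, axis=-1))
        if bg_fraction < 0.5:
            tissue_coords.append((col, row))
print(tissue_coords)

Each tile is still decoded in memory for the tissue test, but nothing is written to disk.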

Add requirements.txt or config.yml

Please add a requirements.txt or config.yml for the venv you use to run all the code successfully. An example of why this matters: TF v2 does not work, as tf.app is deprecated, while e.g. TF v1.15 does work. (A starting point is sketched below.)
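
(A plausible starting point for such a requirements.txt; the TF 1.15 pin follows from the issue above, while the other package choices are assumptions and left unpinned on purpose.)

tensorflow-gpu==1.15.0
numpy
scipy
scikit-learn
matplotlib
Pillow
openslide-python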

PostProcessing issue

Hi. I used the following command to generate a heatmap with this repository.

python3 0f_HeatMap_nClasses.py --image_file '/data/Genomic/asian_sort_512' --tiles_overlap 0 --output_dir '/data/Genomic/heatmap_sample' --tiles_stat '/data/Genomic/Manifest_Cancer10RandomSampling.txt' --resample_factor 10 --slide_filter 'TCGA-57-1993' --tiles_size 256

The contents of the file passed to --tiles_stat are as follows:

TCGA-04-1348-01A-01-TS1.ffb07f65-72b7-494c-abf8-c94d8007321b.svs
TCGA-61-2610-02A-01-TS1.aade8dd8-10dc-446d-a06b-39baf5dc92d2.svs

However, when I run it, I get an error saying that the HeatMap_divider variable is referenced before assignment. What is the problem?

sub_dirs:
['/data/Genomic/asian_sort_512/cancer', '/data/Genomic/asian_sort_512/Solid_Tissue_Normal']
Traceback (most recent call last):
  File "0f_HeatMap_nClasses.py", line 577, in <module>
    main()
  File "0f_HeatMap_nClasses.py", line 491, in main
    skip = saveMap(HeatMap_divider, HeatMap_0, WholeSlide_0, SlideRootName, NewSlide, dir_name)
UnboundLocalError: local variable 'HeatMap_divider' referenced before assignment

Training the dataset using Folder name as image Labels

Hello,
I read in your repo that the TCGA dataset is not annotated (only metadata + svs images), so we divide/sort the tiles (as described) into the folders LUSC, LUAD, and Normal_Solid_Tissue. How does the model learn without annotations?

Forgive me if this is a trivial question: I assumed that without annotations, similar tile features would be clustered and then inferred, but here we are just classifying.

TFRecords are used to keep track of which tiles belong to which patient (full tissue image).

So how does the model learn? Are some TF libraries doing the job?

validation error - cannot generate "out_filename_Stats.txt"

Hi Nicholas,
I was running validation for classification of lung cancer into three types: normal, LUAD, and LUSC. The command is shown in the two screenshots below:
[screenshots of validation command]

However, there was an error message like the one below:
[screenshot of error message]

My training command is shown below, using a step size of 2300:
~/tools/DeepPATH-master/DeepPATH_code/01_training/xClasses/bazel-bin/inception/imagenet_train --num_gpus=4 --batch_size=32 --train_dir='/public/home/yangzhzh/projects/imaging/3_training/r1_results' --data_dir='/public/home/yangzhzh/projects/imaging/2_preprocessing/r1_TFRecord_train' --ClassNumber=3 --mode='0_softmax' --NbrOfImages=923688 --save_step_for_chekcpoint=2300 --max_steps=230001

I figured the error likely comes from "nc_imagenet_eval.py", so I submitted the job locally for one checkpoint only, with the command shown below:

python ~/tools/DeepPATH-master/DeepPATH_code/02_testing/xClasses/nc_imagenet_eval.py --checkpoint_dir='/public/home/yangzhzh/projects/imaging/4_validation/r1_valid/tmp_checkpoints' --eval_dir='/public/home/yangzhzh/projects/imaging/4_validation/r1_valid' --data_dir='/public/home/yangzhzh/projects/imaging/2_preprocessing/r1_TFRecord_valid' --batch_size 300 --run_once --ImageSet_basename='valid_' --ClassNumber 3 --mode='0_softmax' --TVmode='test'

And I got error message like below:
New Slide ------------ 0
label 3: label_3
tf_record_pattern:
/public/home/yangzhzh/projects/imaging/2_preprocessing/r1_TFRecord_valid/valid_TCGA-60-2711-01A-01-BS1.2591160e-4240-41ba-a7ef-38f7b135313e_3.TFRecord
['/public/home/yangzhzh/projects/imaging/2_preprocessing/r1_TFRecord_valid/valid_TCGA-60-2711-01A-01-BS1.2591160e-4240-41ba-a7ef-38f7b135313e_3.TFRecord']
tf_record_pattern:
/public/home/yangzhzh/projects/imaging/2_preprocessing/r1_TFRecord_valid/valid_TCGA-60-2711-01A-01-BS1.2591160e-4240-41ba-a7ef-38f7b135313e_3.TFRecord
['/public/home/yangzhzh/projects/imaging/2_preprocessing/r1_TFRecord_valid/valid_TCGA-60-2711-01A-01-BS1.2591160e-4240-41ba-a7ef-38f7b135313e_3.TFRecord']
/public/home/yangzhzh/projects/imaging/2_preprocessing/r1_TFRecord_valid/valid_TCGA-60-2711-01A-01-BS1.2591160e-4240-41ba-a7ef-38f7b135313e_3.TFRecord FAILED to be processed properly

In the "nc_imagenet_eval.py" script, there is a block of code which shows how this may be caused:

Screen Shot 2019-09-26 at 5 27 54 PM

So I am guessing the error is likely because the following call failed:

try:
    precision_at_1, current_score = nc_inception_eval.evaluate(dataset)

Do you perhaps know why the valid_*.TFRecord files failed to be processed properly?

Thanks very much.
Zhenzhen

Set mode '1_sigmoid' for classification of Normal, LUAD, LUSC?

  1. If I want to train a classifier for Normal, LUAD, and LUSC, do I need to set the training mode to '1_sigmoid'?

  2. The abstract of the paper states an average area under the curve (AUC) of 0.97 for classifying Normal, LUAD, and LUSC. I want to know how you calculate AUC for a multi-class task, since as far as I know AUC is defined for binary classification.
    Is it computed by calculating the following three AUCs and averaging them (see the sketch after this list)?
    (1) Normal vs LUAD and LUSC,
    (2) LUAD vs Normal and LUSC,
    (3) LUSC vs Normal and LUAD.

  3. Why not simply apply softmax and select the max prediction score per tile, as in most classification tasks? Is there another reason for using a sigmoid on each output class?
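
(A minimal sketch of one-vs-rest, macro- and micro-averaged AUC with scikit-learn; the toy arrays are made up for illustration.)

import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import label_binarize

# Toy example with 3 classes: 0=Normal, 1=LUAD, 2=LUSC.
y_true = np.array([0, 1, 2, 1, 0, 2])
y_prob = np.array([[0.8, 0.1, 0.1],
                   [0.2, 0.7, 0.1],
                   [0.1, 0.2, 0.7],
                   [0.3, 0.5, 0.2],
                   [0.6, 0.3, 0.1],
                   [0.2, 0.2, 0.6]])

y_bin = label_binarize(y_true, classes=[0, 1, 2])
per_class = [roc_auc_score(y_bin[:, k], y_prob[:, k]) for k in range(3)]
macro = np.mean(per_class)                            # average the 3 one-vs-rest AUCs
micro = roc_auc_score(y_bin.ravel(), y_prob.ravel())  # pool all decisions first
print(per_class, macro, micro)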

The problem of downloading images and the corresponding JSON and xml files

Hi, sorry to interrupt.

As shown in the first image, I have tried a lot to download the files from https://portal.gdc.cancer.gov/legacy-archive/search/f.
[screenshot of GDC legacy archive page]
When I test the code, step 0.1 (tile the svs slide images) works well, but step 0.2 (sort the tiles into train/valid/test sets according to the defined classes) fails, as shown in the second image.
[screenshot of sorting error]
I think I downloaded the wrong images, xml, and JSON files, since the website has changed a lot.

Could you help me to find the data you use in this program?

Thanks a lot for your kindness!

How to resume training?

Hello again,
I've been working on reproducing your mutation results, running on a single puny Google VM with one GPU, fully training the model. However, after 300K batches, my process halted; I think Google had some kind of networking problem yesterday. So... I tried to resume training by setting the pretrained_model_checkpoint_path parameter to the model.ckpt-300000.data-00000-of-00001 intermediate checkpoint file, but without success. Do you know if it is possible to resume training from such a checkpoint? (See the sketch at the end of this message.)

BTW, even at this point I'm seeing an AUC of 0.76 for EGFR, though the other mutations look like a coin toss:
[screenshot of per-mutation AUCs]
I refined the mutation calling by excluding silent mutations. Here at ISB-CGC we have all the TCGA genomics data as well as the pathology image metadata in BigQuery, so it was easy to write an SQL query to create a mutation calling manifest. We also have the TCGA pathology images available in GCS, so I am pulling pathology images from GCS rather than from TCIA.

Did you find it necessary to fully train the model for mutation classification or was transfer learning sufficient?

Looking forward to your next paper.

Thanks as always,
Bill
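
(A minimal TF 1.x sketch of resuming from an intermediate checkpoint; the key point is that saver.restore takes the checkpoint prefix, e.g. model.ckpt-300000, not the .data-00000-of-00001 shard. This is the generic TensorFlow pattern, not this repo's exact training loop, and the path is hypothetical.)

import tensorflow as tf

# Build (or import) the same graph that produced the checkpoint first;
# a single variable stands in for it here.
w = tf.get_variable('w', shape=[2048, 4])
saver = tf.train.Saver()

with tf.Session() as sess:
    # latest_checkpoint reads the 'checkpoint' file in the directory and
    # returns the prefix, e.g. '/path/to/train_dir/model.ckpt-300000'.
    ckpt = tf.train.latest_checkpoint('/path/to/train_dir')
    saver.restore(sess, ckpt)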

Work with .bif .tif images from Ventana software

Hi! I would like to start a project with digital slides scanned by Ventana (Roche). I saw that the preprocessing can handle only svs and jpg files.

Is there any way I could convert the files or modify the code to fit these images in your analysis pipeline?

Thanks!

Issue with tensorflow

When I try to run the "build_image_data" pre-processing step to convert the JPEG tiles into TFRecords, I get the following errors. For some reason, it says it's complete, but I don't know if all the files are actually there; in the last photo, you can see that it only goes up to 770 out of 1024.

[screenshots of TensorFlow errors]

Advice needed on mutation classification

Hi Nicolas,

I've done mutation classification training (from scratch) for about 500K batches at 30 tiles/batch. Running classification on test data shows the model seems to be recognizing EGFR pretty well (AUC ~ 0.82), but AUC for all the other mutations kind of dances around 0.5. However, the AUCs for the validation data set look, to my naive eye, more like what I would expect. Here is a chart:
[chart of validation vs. testing AUCs]
Is this seeming lack of correlation between validation and testing results to be expected?
Am I just in the early stages of training? How many batches did you need to run when you trained for mutation classification?
Is your final checkpoint available somewhere? It would be useful in determining whether my testing/validation/viz steps are functioning as expected.

Regards,
Bill

Step 0.2 File Clarification

Hi,
I'm trying to run step 0.2 and sort my tiles. However, there is one step I am confused about.

"--SortingOption. In the current directory, create one sub-folder per class, and fill each sub-folder with train_, test_ and valid_ test files. Images will be sorted into classes depending on the sorting option."

What does this mean? What is my current directory, and what are the train_, test_, and valid_ files?
Sorry if this is a bad question; I am fairly new to this type of programming.
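
(A sketch of the layout step 0.2 produces, following the quoted description; the class and file names are hypothetical, and "current directory" is simply wherever you run 0d_SortTiles.py from.)

./                          <- where 0d_SortTiles.py is run
├── LUAD/
│   ├── train_TCGA-05-4244-..._5_9.jpeg
│   └── valid_TCGA-05-4245-..._2_7.jpeg
├── LUSC/
│   └── test_TCGA-18-3406-..._4_3.jpeg
└── Normal/
    └── train_TCGA-44-2655-..._1_6.jpeg

Each class gets one sub-folder, and every tile in it carries a train_/valid_/test_ prefix that the later conversion steps can use to build the three sets.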

Issue with bazel-bin

I get the following error, which says I don't have the inception module:
[screenshot of error]
Any idea how to fix this?

0f_HeatMap_nClasses.py parameterization

Hello,
I'm trying to run 03_postprocessing/0f_HeatMap_nClasses.py. The --image_file parameter is presumably the output directory of the tiling stage. In that directory, I find subdirectories named "TCGA_xxx_files". It looks like 0f_HeatMap_nClasses.py expects to find jpegs in those subdirectories. However, those subdirectories only contain magnification-specific subdirectories, e.g. "5.0" or "20.0"; it is the magnification-specific subdirectories that hold the jpegs. As a result, the code in lines 347-356 doesn't find the file that corresponds to each tile.

Am I missing something here? It's easy enough to change the code, but before I do that I thought I should ask if I'm not understanding something.
Thanks.

Help with training

Hi, I get this error when I do this step:
bazel-bin/inception/imagenet_train --num_gpus=1 --batch_size=30 --train_dir='output_directory' --data_dir='TFRecord_images_directory' --ClassNumber=3 --mode='0_softmax'

[screenshot of error]

Assign requires shapes of both tensors to match. lhs shape= [2048,3] rhs shape= [2048,4]

Hello,
When I test the data using:
python3 /home/revanth/DeepPATH/DeepPATH_code/02_testing/xClasses/nc_imagenet_eval.py --checkpoint_dir='/home/revanth/training' --eval_dir='/home/revanth/output_data' --data_dir="/home/revanth/test" --batch_size=10 --ImageSet_basename='test_' --run_once --ClassNumber 2 --mode='0_softmax' --TVmode='test'

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [2048,3] rhs shape= [2048,4]
[[Node: save/Assign_31 = Assign[T=DT_FLOAT, _class=["loc:@logits/logits/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](logits/logits/weights, save/RestoreV2:31)]]

During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/revanth/DeepPATH/DeepPATH_code/02_testing/xClasses/nc_imagenet_eval.py", line 268, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "/home/revanth/DeepPATH/DeepPATH_code/02_testing/xClasses/nc_imagenet_eval.py", line 64, in main
    precision_at_1, current_score = nc_inception_eval.evaluate(dataset)
  File "/home/revanth/DeepPATH/DeepPATH_code/02_testing/xClasses/inception/nc_inception_eval.py", line 411, in evaluate
    precision_at_1, current_score = _eval_once(saver, summary_writer, top_1_op, top_5_op, summary_op, max_percent, all_filenames, filename_queue, net2048, sel_end_points, logits, labels)
  File "/home/revanth/DeepPATH/DeepPATH_code/02_testing/xClasses/inception/nc_inception_eval.py", line 72, in _eval_once
    saver.restore(sess, ckpt.model_checkpoint_path)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1759, in restore
    err, "a mismatch between the current graph and the graph")
tensorflow.python.framework.errors_impl.InvalidArgumentError: Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [2048,3] rhs shape= [2048,4]
[[Node: save/Assign_31 = Assign[T=DT_FLOAT, _class=["loc:@logits/logits/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](logits/logits/weights, save/RestoreV2:31)]]

Caused by op 'save/Assign_31', defined at:
  File "/home/revanth/DeepPATH/DeepPATH_code/02_testing/xClasses/nc_imagenet_eval.py", line 268, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "/home/revanth/DeepPATH/DeepPATH_code/02_testing/xClasses/nc_imagenet_eval.py", line 64, in main
    precision_at_1, current_score = nc_inception_eval.evaluate(dataset)
  File "/home/revanth/DeepPATH/DeepPATH_code/02_testing/xClasses/inception/nc_inception_eval.py", line 402, in evaluate
    saver = tf.train.Saver(variables_to_restore)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1281, in __init__
    self.build()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1293, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1330, in _build
    build_save=build_save, build_restore=build_restore)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 778, in _build_internal
    restore_sequentially, reshape)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 419, in _AddRestoreOps
    assign_ops.append(saveable.restore(saveable_tensors, shapes))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 112, in restore
    self.op.get_shape().is_fully_defined())
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/state_ops.py", line 216, in assign
    validate_shape=validate_shape)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_state_ops.py", line 60, in assign
    use_locking=use_locking, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 454, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3155, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1717, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [2048,3] rhs shape= [2048,4]
[[Node: save/Assign_31 = Assign[T=DT_FLOAT, _class=["loc:@logits/logits/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](logits/logits/weights, save/RestoreV2:31)]]

I tried removing all the partially trained models, and tried training both with transfer learning and without. When it comes to testing, I get the above-mentioned error. Do you know where the issue is?

Validation step does not output "out_filename_Stats.txt"

The code is run on an AWS EC2 p3.16xlarge instance, ami-004852354728c0e51.

There seems to be an issue with the output generated by the validation step when the following script is run. The code does not work if an absolute path is used; it only works with relative paths. This is true for the preprocessing steps as well.

export CHECKPOINT_PATH='./t3_results'
export OUTPUT_DIR='./r1_valid'
export DATA_DIR='./TF_valid'
export LABEL_FILE='./labels.txt'
# create temporary directory for checkpoints
mkdir -p $OUTPUT_DIR/tmp_checkpoints
export CUR_CHECKPOINT=$OUTPUT_DIR/tmp_checkpoints
# check if next checkpoint available
declare -i count=1300
declare -i step=1300
declare -i NbClasses=2
 while true; do
	echo $count
	if [ -f $CHECKPOINT_PATH/model.ckpt-$count.meta ]; then
		echo $CHECKPOINT_PATH/model.ckpt-$count.meta " exists"
		# check if there's already a computation for this checkpoint
		export TEST_OUTPUT=$OUTPUT_DIR/test_$count'k'
		if [ ! -d $TEST_OUTPUT ]; then
			mkdir -p $TEST_OUTPUT
			
			ln -s $CHECKPOINT_PATH/*-$count.* $CUR_CHECKPOINT/.
			touch $CUR_CHECKPOINT/checkpoint
			echo 'model_checkpoint_path: "'$CUR_CHECKPOINT'/model.ckpt-'$count'"' > $CUR_CHECKPOINT/checkpoint
			echo 'all_model_checkpoint_paths: "'$CUR_CHECKPOINT'/model.ckpt-'$count'"' >> $CUR_CHECKPOINT/checkpoint
			# Test
                        echo "Now entering Imagenet_eval"
			python 02_testing/xClasses/nc_imagenet_eval.py --checkpoint_dir=$CUR_CHECKPOINT --eval_dir=$OUTPUT_DIR --data_dir=$DATA_DIR --mode='0_softmax' --TVmode='test'
			# wait
			mv $OUTPUT_DIR/out* $TEST_OUTPUT/.
			# ROC
			export OUTFILENAME=$TEST_OUTPUT/out_filename_Stats.txt
                        echo "Now entering BootStrap"
			python 03_postprocessing/0h_ROC_MultiOutput_BootStrap.py --file_stats=$OUTFILENAME --output_dir=$TEST_OUTPUT 

		else
			echo 'checkpoint '$TEST_OUTPUT' skipped'
		fi
	else
		echo $CHECKPOINT_PATH/model.ckpt-$count.meta " does not exist"
		break
	fi
	# next checkpoint
	count=`expr "$count" + "$step"`
done

The error received:
[screenshot of error]

Only one file, out_All_stats.txt, is generated for each checkpoint. The 0h_ROC_MultiOutput_BootStrap.py script tries to read out_filename_Stats.txt, but that file is never generated by nc_imagenet_eval.py.

Contents of directories:
[screenshot of directory contents]

data preprocessing

Hello Nicolas,
I have two questions regarding the data preprocessing step.
First, why is the patient ID 12? The images downloaded for lung cancer are named, for instance, "TCGA-44-6147-01B-05-BS5.B838E2DC-8869-4C72-9F1D-A066FF307579.svs".

The second question is how to get the ".json" file. The file I obtained for the lung cancer samples is only 1.8 MB, much smaller than the file posted under the "example" folder in the software. The manifest file I obtained from the TCGA website is the same as the one listed in the software.

Thanks for your time.
Zhenzhen

missing checkpoint data file

Hello
I am a graduate student at Syracuse University. Our team is working on a DL class project connecting deep learning with disease diagnosis.
Your work is very inspiring to us, but it seems that the checkpoint data file is missing. Unfortunately, we don't have enough computing power to retrain the model for 400k iterations. Would you mind sharing the checkpoint data with us for study purposes?

https://www.dropbox.com/request/igSH0wxH46VhrEboxyqz
I created a Dropbox upload link here; I would be very grateful if you could share it with us.

UnboundLocalError: local variable 'HeatMap_divider' referenced before assignment

Thanks for your great work on DeepPATH. I want to use 0f_HeatMap_nClasses.py to generate the heatmap; however, I got the error "UnboundLocalError: local variable 'HeatMap_divider' referenced before assignment". My input file has the same structure as your "out_filename_Stats.txt", and my "directory_to_jpeg_classes" is a directory composed of two sub-directories (the two classes) with the jpeg files inside. Could you do me a favor?
NewestTestDataSetStat.txt

Checkpoint Data Missing

Hello!

I'm a graduate student doing a class project on what it would take to bring machine learning models from the lab to the clinic, and was hoping to leverage your work here as a useful case study of a well-built-out model. We were hoping to use some of your existing models on augmented data, but it seems the actual data files are missing from the DeepPATH_code/example_TCGA_lung/checkpoints directory. Given the computational cost of retraining some of these models, would it be possible for you to share the complete checkpoints?

Removal of ~empty slides

In the preprint of your paper, you state that "The slides with a low amount of information were removed, that is all the tiles where more than 50% of the surface is covered by background (where all the values are below 220 in the RGB color space)." I don't see this happening in the tiling code (v0b_tileLoop_deepzoom4.py).
Did I miss it, or did you determine that culling such images is not important?

Thanks for making this very interesting work available.
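
(A minimal sketch of the background test described in that quote; here "background" is read as near-white pixels whose R, G, and B values are all above 220, which appears to be the intended meaning even though the preprint says "below". Paths and thresholds are illustrative.)

import numpy as np
from PIL import Image

def is_mostly_background(tile_path, white_threshold=220, max_bg_fraction=0.5):
    # Reject a tile if more than half of its surface looks like white background.
    tile = np.asarray(Image.open(tile_path).convert('RGB'))
    bg_fraction = np.mean(np.all(tile > white_threshold, axis=-1))
    return bg_fraction > max_bg_fraction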

Model checkpoints missing data file

Hi

I am trying to use the model checkpoints to predict on a new patch/image. Generally, each checkpoint consists of three files: a meta file, an index file, and a data file. However, the checkpoints in DeepPATH/DeepPATH_code/example_TCGA_lung/checkpoints/ all contain only the meta and index files; the data file containing the model weights is missing.
Is it still possible to load the weights used in the paper for evaluation? If yes, can you point me to how? Or are the checkpoints dummies that serve only to let the pipeline run?

When I try to load the model, it says run1a_3D_classifier/model.ckpt-69000.data-00000-of-00001; No such file or directory

How do I evaluate a single TFRecord to determine the most likely label?

I'm at the point where I have the models (checkpoints) and an AUC that shows me which checkpoint is best. Let's say I have just 3 labels: normal, luad, and lusc, and I have a new slide (svs file). Assuming I get that svs file into a TFRecord, how can I evaluate the most likely label with my model? I.e., I would like to run a script that points to my checkpoint, those three labels, and the directory of my TFRecord, and produces a simple result: normal, luad, OR lusc (with perhaps a percentage likelihood).

(As an aside, is there a better place to ask beginner questions like these, e.g., Stack Overflow?)

Typo on example_TCGA_lung/checkpoints/README.md

The class label stated for the run1a_3D_classifier and run2a_3D_classifier checkpoints is Normal/LUAD/LUAD.
Shouldn't this be Normal/LUAD/LUSC, since these checkpoints use the class labels from example_TCGA_lung/labelref_r1.txt?

Regards,
Xiao

How to select Hugo Symbol from mutation file?

Hi Nicolas,

I've downloaded the mutation (MAF) file. Tiles are generated at 20X, 299 pixels, 0 overlap, 25% background threshold.

Currently, the Hugo Symbols are filtered by "Variant_Classification": I only keep a Hugo Symbol when its classification is in ['Missense_Mutation', 'Nonsense_Mutation', 'Frame_Shift_Del', 'Frame_Shift_Ins'], but the performance is only slightly better than #37.

I want to know the correct way to use the mutation file. Did you use all the Hugo Symbols that appear in the MAF file?

Many thanks!
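
(A minimal pandas sketch of the Variant_Classification filtering described above; 'luad.maf' and its tab-separated layout are assumptions about the downloaded mutation file.)

import pandas as pd

KEEP = {'Missense_Mutation', 'Nonsense_Mutation', 'Frame_Shift_Del', 'Frame_Shift_Ins'}

maf = pd.read_csv('luad.maf', sep='\t', comment='#', low_memory=False)
maf = maf[maf['Variant_Classification'].isin(KEEP)]

# One row per (patient, gene) mutation call.
calls = maf[['Tumor_Sample_Barcode', 'Hugo_Symbol']].drop_duplicates()
print(calls.head())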

The problem of the JSON file download.

I have tried a lot to download the right JSON file for the data split step.
As shown in the image, after adding all the slide images to the cart, I can download the "manifest" file as well as the "metadata" file, but the content of the metadata file is not the same as the one you used: some keys are missing, though I can find them in the "Biospecimen" or "Clinical" files. Could you help me figure out how to obtain a correct JSON file? Thank you very much!
[screenshot of GDC cart page]

Binary masks for svs files

Hi,

I am generating binary masks for .svs files. I see in your code that masks can be used with .jpg or .dcm files to restrict tiling to an ROI. Is there any reason it is not possible to do this with svs files?

Also, I do not completely understand the naming convention I should use. If I have an svs file named "Sample1.svs", should the mask be named "Sample1mask.jpg"?

Thanks

Configuration Requirement

Hi, I want to use your model to train on my dataset. Are there any hardware requirements, such as GPU, CPU, or memory size? Every time I try to test my trained model, my computer crashes. Could you please tell me your hardware parameters, particularly memory size? Thank you.

Tiling - can result in high FP?

When creating tiles (using TCGA metadata) into separate class folders, won't there be tiles containing no cancerous cells? This could result in a high false-positive rate. How is this issue solved?
If annotated data is divided into tiles, each tile should also be associated with the correct annotation. How is this achieved?

GPU Util is 0 while training with tensorflow-gpu

Hello Nicolas,
I am running training with tensorflow-gpu. I think there is no problem with my GPU setup, as can be seen from the first attachment. However, nvidia-smi shows the GPU usage as 0 (see the second attachment).
Does that mean the GPU is not being utilized? I also found the following post:
https://stackoverflow.com/questions/56271551/tensorflow-not-utilizing-gpu
[first screenshot: GPU setup check]

[second screenshot: nvidia-smi showing 0% GPU utilization]

The following is my command:

module load compiler/cuda/7/9.0
export CUDA_VISIBLE_DEVICES=0

~/tools/DeepPATH-master/DeepPATH_code/01_training/xClasses/bazel-bin/inception/imagenet_train --num_gpus=1 --batch_size=64 --train_dir='/public/home/yangzhzh/projects/imaging_bladder/3_training/r1_results' --data_dir='/public/home/yangzhzh/projects/imaging_bladder/2_preprocessing/3_convert2TFRecord/r1_TFRecord_train' --ClassNumber=2 --mode='0_softmax' --NbrOfImages=173000 --save_step_for_chekcpoint=2300 --max_steps=230001

Does this matter? I am concerned that this is unexpected. If you could share your thoughts, that would be great!
Looking forward to your reply.

Thanks,
Zhenzhen
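
(A minimal TF 1.x sketch for checking whether TensorFlow actually sees and uses the GPU; this is generic TensorFlow, not part of the repo.)

import tensorflow as tf

print(tf.test.is_gpu_available())  # True if a usable CUDA device is visible

# Log where each op is placed; GPU ops show up as device:GPU:0 in the output.
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    a = tf.random_uniform([1000, 1000])
    print(sess.run(tf.reduce_sum(tf.matmul(a, a))))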

Some questions about mutations classifier and LUAD vs LUSC classifier

Dear authors, many thanks for your great work, which has made a great contribution to society. Could I ask some questions about the mutation classifiers? Thank you!

  1. I want to recreate the mutation classifier in your paper (Table 1). What is the correct "--SortingOption" for step "0.2 Sort the tiles"?
  2. In step "0.3b Convert the JPEG", why are the training and validation sets in the same output directory? This differs from the 2- or 3-class jobs. Should I separate them into two different folders before training?
  3. What is the meaning of the micro- and macro-averages in your paper (Table 1)?

LUAD vs LUSC classifier problem

  1. I set "--SortingOption=4 Sort according to type of cancer (LUSC, LUAD)" in step "0.2 Sort the tiles". After fully training Inception v3, with about 500,000 batches, the AUC is only 0.86, which is much lower than in your paper (Table 1).

When I instead set "--SortingOption=3 Sort according to type of cancer (LUSC, LUAD, or Normal Tissue)", remove the Normal Tissue dataset, and train the same way, the AUC reaches 0.956. Do you know why this happens?
Thank you!

Per_slide statistics

We took 250 SVS images from TCGA. We used a 3-class classifier, training for 50,000 steps and validating. The loss reached 1.4.

After testing, when I ran the script for the ROC curves, I got a file named out2_perSlideStats.txt in the output folder, in which the lines look like this:

$ test_TCGA-66-2792-11A-01-TS1.fb255c48-b47f-45b1-9f04-5107b8c16e4e_0100 true_label: [1.0, 0.0, 0.0] Percent_Selected: 0.301299 0.002597 0.696104 Average_Probability: 0.306132 0.245624 0.448244 tiles#: 385.000000

$ test_TCGA-55-6975-11A-01-TS1.9bd5efc7-f279-4150-a410-19fb057f9df8_0100 true_label: [1.0, 0.0, 0.0] Percent_Selected: 0.343254 0.000000 0.656746 Average_Probability: 0.345403 0.228446 0.426151 tiles#: 504.000000

$ test_TCGA-67-3773-01A-01-BS1.2e8279d9-57a3-4343-8e8b-0fa7e500a531_0010 true_label: [0.0, 1.0, 0.0] Percent_Selected: 0.110599 0.000000 0.889401 Average_Probability: 0.162894 0.306063 0.531043 tiles#: 434.000000

Percent_Selected for LUAD remained 0 for every slide except the second one. What does Percent_Selected actually convey, and is it correct to get output like this?
How is the final prediction calculated for a slide, and which program calculates it? (A sketch of the likely aggregation follows below.)
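
(A minimal sketch of the two slide-level aggregations these columns suggest: the fraction of tiles whose argmax is each class, and the mean per-class probability. The tile probabilities are made up; the repo's exact computation lives in its post-processing scripts.)

import numpy as np

# Toy per-tile probabilities for one slide; columns = [Normal, LUAD, LUSC].
tile_probs = np.array([[0.31, 0.25, 0.44],
                       [0.20, 0.30, 0.50],
                       [0.60, 0.15, 0.25]])

winners = tile_probs.argmax(axis=1)                      # per-tile predicted class
percent_selected = np.bincount(winners, minlength=3) / len(winners)
average_probability = tile_probs.mean(axis=0)
print(percent_selected, average_probability)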
