segcaps's Introduction

Capsules for Object Segmentation (SegCaps)

Modified by Cheng-Lin Li

Objectives: Build an end-to-end pipeline for object segmentation experiments with SegCaps on binary image segmentation tasks, covering not only 3D CT images (LUNA 16) but also 2D color images (MS COCO 2017).

This repository was downloaded from the official SegCaps implementation, then restructured and enhanced.

The original paper for SegCaps can be found at https://arxiv.org/abs/1804.04241.

The original source code can be found at https://github.com/lalonderodney/SegCaps

Author's project page for this work can be found at https://rodneylalonde.wixsite.com/personal/research-blog/capsules-for-object-segmentation.

Getting Started Guide

This is the presentation file for this project.

This is my experiment to test SegCaps Net R3. I overfit on a single image, then tested how the model performed as the image orientation was changed. Pre-trained weights are included in 'data/saved_models/segcapsr3/split-0_batch-1_shuff-1_aug-0_loss-mar_slic-1_sub--1_strid-1_lr-0.01_recon-20.0_model_20180723-235354.hdf5'

Enhancements & Modifications

  1. The program was modified to support Python 3.6 on Ubuntu 18.04 and Windows 10.
  2. Support not only 3D computed tomography scan images but also 2D images from the Microsoft Common Objects in COntext (MS COCO) dataset.
  3. Change the dice loss function from the Sørensen coefficient to the Jaccard coefficient for comparing similarity.
  4. Add a Kfold parameter so users can customize the cross-validation task. K = 1 forces the model to overfit.
  5. Add a retrain parameter so users can reload pre-trained weights and retrain the model.
  6. Add an adjustable initial learning rate.
  7. Add an adjustable number of steps per epoch.
  8. Add a patience parameter so users can control early stopping of training.
  9. Add a 'bce_dice' loss function: binary cross entropy + soft dice coefficient.
  10. Change 'train', 'test', and 'manip' from 0/1 values to flags whose presence selects the behavior of the main program.
  11. Add a new webcam integration program for video stream segmentation.
  12. Accept images of any size; the program automatically converts them to 512 x 512 resolution.
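Items 3 and 9 can be illustrated with a small NumPy sketch. This is not the repository's loss code, which operates on Keras tensors; the function names, the smoothing constant, and the choice of `1 - dice` as the dice loss term are assumptions made for illustration.

```python
import numpy as np

def soft_dice(y_true, y_pred, smooth=1e-5):
    """Sørensen form: 2 * |A ∩ B| / (|A| + |B|)."""
    inter = np.sum(y_true * y_pred)
    return (2.0 * inter + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)

def soft_jaccard(y_true, y_pred, smooth=1e-5):
    """Jaccard form: |A ∩ B| / |A ∪ B|."""
    inter = np.sum(y_true * y_pred)
    union = np.sum(y_true) + np.sum(y_pred) - inter
    return (inter + smooth) / (union + smooth)

def bce(y_true, y_pred, eps=1e-7):
    """Mean binary cross entropy; predictions are clipped to avoid log(0)."""
    p = np.clip(y_pred, eps, 1.0 - eps)
    return float(np.mean(-(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))))

def bce_dice_loss(y_true, y_pred):
    """Hypothetical combination: cross entropy plus (1 - soft dice)."""
    return bce(y_true, y_pred) + (1.0 - soft_dice(y_true, y_pred))
```

For the same prediction the two coefficients are related by J = D / (2 - D), so the Jaccard form is always the stricter score except at 0 and 1.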

Procedures

1. Download this repo to your own folder

1-1. Download this repository via https://github.com/Cheng-Lin-Li/SegCaps/archive/master.zip

1-2. Extract the zip file into a folder.

1-3. Change your current directory to the project folder.

cd ./SegCaps-master/SegCaps-master

2. Install Required Packages on Ubuntu / Windows

This code is written for Keras using the TensorFlow backend. By default, requirements.txt installs the CPU build of TensorFlow. You may need to adjust requirements.txt for your environment (CPU only, or GPU for the TensorFlow installation).

Please install all required packages before using the programs.

pip install -r requirements.txt

You may need to install an additional library on Ubuntu 17 or later.

If you get the following error:

ImportError: libjasper.so.1: cannot open shared object file: No such file or directory

These steps will resolve it:

sudo apt-get update
sudo apt-get install libjasper-dev

3. Make your data directory.

The commands below do the following:

3-1. Create a root folder named 'data' in the repo folder. All models, results, etc. are saved to this root directory.

3-2. Create 'imgs' and 'masks' folders for image and mask files.

3-3. If you would like to use the data folder that comes with this repo, leave the repo as is.

mkdir data
chmod 755 data
cd ./data
mkdir imgs
mkdir masks
chmod 755 *
cd ..

4. Select Your dataset

4-1. Test the result on original LUNA 16 dataset.

  1. Go to the LUng Nodule Analysis 2016 (LUNA 16) Grand Challenge website.
  2. Register for an account.
  3. Join the 'LUNA 16' challenge: click 'All Challenges' in the top tab, click 'Join', then go to the 'Download' section to get your data.
  4. Copy your image files into BOTH the ./data/imgs and ./data/masks folders.

4-2. Test on the Microsoft Common Objects in COntext (MS COCO) 2017 dataset.

The repo includes a crawler program that downloads images of the classes you choose for training. You have to download the annotation file first.

Click Microsoft COCO 2017 to download it.

The zip file contains two JSON files. Extract them into a folder.

In this example, these two annotation files were extracted into the folder ~/SegCaps/annotations/

Example 1: Download 10 images and mask files with 'person' class from MS COCO validation dataset.

cd ./cococrawler
python3 getcoco17.py --data_root_dir ../data --category person --annotation_file ./annotations/instances_val2017.json --number 10

Example 2: Download image IDs 22228 and 178040, with mask images for the person class only, from the MS COCO 2017 training dataset.

cd ./cococrawler
python3 getcoco17.py --data_root_dir ../data/coco --category person --annotation_file ./annotations/instances_train2017.json --number 10 --id 22228 178040

You can choose multiple classes if you want. Just separate the category names with spaces.

Example: --category person dog cat
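To make the category selection concrete, here is a minimal sketch of how image ids could be picked from a COCO instances annotation file for a list of category names. It relies only on the public COCO JSON layout (`categories`, `annotations`); the function name and the toy data are hypothetical, and getcoco17.py may work differently.

```python
import json

def image_ids_for_categories(coco, category_names):
    """Return sorted ids of images containing at least one object of the given categories."""
    wanted = {c["id"] for c in coco["categories"] if c["name"] in category_names}
    return sorted({a["image_id"] for a in coco["annotations"] if a["category_id"] in wanted})

# Toy annotation dict in COCO layout (image ids invented for illustration).
toy = {
    "categories": [{"id": 1, "name": "person"}, {"id": 18, "name": "dog"}],
    "annotations": [
        {"image_id": 10, "category_id": 1},
        {"image_id": 11, "category_id": 18},
        {"image_id": 12, "category_id": 1},
        {"image_id": 12, "category_id": 18},
    ],
}
print(image_ids_for_categories(toy, ["person", "dog"]))

# With a real file (path is an example):
# with open("./annotations/instances_val2017.json") as f:
#     coco = json.load(f)
```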

Run the command below to list all parameters of the crawler program.

python3 getcoco17.py -h
usage: getcoco17.py [-h] [--data_root_dir DATA_ROOT_DIR]
                    [--category CATEGORY [CATEGORY ...]]
                    [--annotation_file ANNOTATION_FILE]
                    [--resolution RESOLUTION] [--id ID [ID ...]]
                    [--number NUMBER]

Download COCO 2017 image Data

optional arguments:
  -h, --help            show this help message and exit
  --data_root_dir DATA_ROOT_DIR
                        The root directory for your data.
  --category CATEGORY [CATEGORY ...]
                        MS COCO object categories list (--category person dog
                        cat). default value is person
  --annotation_file ANNOTATION_FILE
                        The annotation json file directory of MS COCO object
                        categories list. file name should be
                        instances_val2017.json
  --resolution RESOLUTION
                        The resolution of images you want to transfer. It will
                        be a square image.Default is 0. resolution = 0 will
                        keep original image resolution
  --id ID [ID ...]      The id of images you want to download from MS COCO
                        dataset.Number of images is equal to the number of
                        ids. Masking will base on category.
  --number NUMBER       The total number of images you want to download.

4-3. Test on your own dataset.

The program has only been tested on the LUNA 16 and MS COCO 2017 datasets, but it can also support your own dataset.
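Before training on your own data, it may help to check that the 'imgs' and 'masks' folders line up. The sketch below assumes each image and its mask share the same file name; that pairing rule is an assumption, not something the program documents, and the function name is hypothetical.

```python
from pathlib import Path

def unpaired_files(data_root):
    """Return (images without a mask, masks without an image), matched by file name."""
    root = Path(data_root)
    imgs = {p.name for p in (root / "imgs").iterdir() if p.is_file()}
    masks = {p.name for p in (root / "masks").iterdir() if p.is_file()}
    return sorted(imgs - masks), sorted(masks - imgs)
```

Calling `unpaired_files("data")` from the project folder returns two empty lists when every image has a matching mask.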

4-3-1. For 2D images

Dimension of images: (Width, Height, Channels)

  • Channels = 1 or 3

  • Program parameters: --dataset mscoco17

4-3-2. For 3D images

Dimension of images: (Width, Height, Slices)

  • Slices = 1 (default) or integer (Number of slices to include for training/testing.)

  • Program parameters: --dataset luna16 --slices 1
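As a rough illustration of how --slices and --stride interact for 3D volumes, the sketch below lists the slice windows a volume would be cut into. It follows only the parameter descriptions in the help text; the function name is hypothetical, and the actual loader may treat volume borders and --subsamp differently.

```python
def slice_windows(volume_depth, slices=1, stride=1):
    """Return (start, end) index pairs of consecutive samples along the slice axis."""
    if slices < 1 or stride < 1:
        raise ValueError("slices and stride must be positive")
    return [(s, s + slices) for s in range(0, volume_depth - slices + 1, stride)]

# A 5-slice volume, 3 slices per sample, moving 1 slice each time.
print(slice_windows(5, slices=3, stride=1))
```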

4-3-3. Mask files

The program supports only binary image segmentation, so each mask pixel must be either 0 (background, black) or 1 (foreground, white).

Dimension of images: (Width, Height, 1)
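If your masks are stored as ordinary grayscale or color images (for example 0/255 PNG files), they can be brought into this form with NumPy. The sketch below is an assumption about a reasonable conversion, not the program's own loader: it treats any non-zero pixel as foreground, and the function name is hypothetical.

```python
import numpy as np

def to_binary_mask(mask):
    """Convert a 2D or multi-channel mask array to 0/1 values with a single trailing channel."""
    mask = np.asarray(mask)
    if mask.ndim == 3:
        mask = mask.max(axis=-1)            # collapse color channels
    binary = (mask > 0).astype(np.uint8)    # assumption: non-zero means foreground
    return binary[..., np.newaxis]          # add the channel axis

example = to_binary_mask([[0, 255], [128, 0]])
print(example.shape)
```

The first two axes keep whatever order the input array uses.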

5. Train your model

5-1 Main File

From the main file (main.py) you can train, test, and manipulate the segmentation capsules of various networks. Simply set the --train, --test, or --manip flags to turn these on respectively. The argument --data_root_dir is the only required argument and should be set to the directory containing your 'imgs' and 'masks' folders. There are many more arguments that can be set and these are all explained in the main.py file.

Please be aware that the manipulate function supports only 3D images. This version of the source code does NOT support 2D images.

The program converts all image files to NumPy format, then stores the training/testing images in ./data/np_files and the training (and testing) file lists in ./data/split_lists. Remove these two folders every time you replace your training image and mask files, because the program only reads data from the np_files folder.
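Clearing the cached folders by hand is easy to forget, so a small helper can do it before each run with new data. This is a sketch with a hypothetical function name; the folder names follow the program structure listing in this README, so adjust them if your checkout uses different names.

```python
import shutil
from pathlib import Path

def clear_cached_data(data_root, folders=("np_files", "split_lists")):
    """Delete the cached NumPy files and split lists; return the folder names that were removed."""
    removed = []
    for name in folders:
        target = Path(data_root) / name
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(name)
    return removed
```

For example, `clear_cached_data("data")` run from the project folder forces the next training or testing run to regenerate both folders.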

5-2 Train your model:

The program uses K-fold cross-validation for training and testing, with K = 4 by default. If you have fewer than 4 image files, set --Kfold to the number of image files you have.

Example: You have only 2 images, and you indicate --Kfold 2, which means you will use 1 image file for training, and 1 image file for testing.
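The effect of --Kfold on a file list can be sketched in a few lines of plain Python. This is only an illustration with a hypothetical function name: the program's own split logic may shuffle or order files differently, and the K = 1 branch (train and test on the same files) is an assumption based on the description of the over-fitting test.

```python
def kfold_splits(files, k=4):
    """Return a list of (train_files, test_files) pairs, one per fold."""
    files = list(files)
    if k < 1 or k > len(files):
        raise ValueError("k must be between 1 and the number of files")
    if k == 1:
        # Assumption: the over-fitting test trains and tests on the same files.
        return [(files, files)]
    folds = [files[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        train = [f for j, fold in enumerate(folds) if j != i for f in fold]
        splits.append((train, folds[i]))
    return splits

# Two images with K = 2: one for training, one for testing in each split.
print(kfold_splits(["a.png", "b.png"], k=2))
```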

Example command: Train SegCaps R3 on MS COCO dataset without GPU support. Assume you have 4 or more images.

python3 ./main.py --train --initial_lr 0.1 --net segcapsr3 --loss dice --data_root_dir=data --which_gpus=-2 --gpus=0 --dataset mscoco17 

Example command: Train the basic Capsule Net on the MS COCO dataset with GPU support (number of GPUs = 1). K = 1 means the model is trained on a single image as an overfitting test.

python3 ./main.py --train --data_root_dir=data --net capsbasic --initial_lr 0.0001 --loglevel 2 --Kfold 1 --loss dice --dataset mscoco17 --recon_wei 20 --which_gpus -1 --gpus 1 --aug_data 0
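The --which_gpus values used in these commands ("-2" for CPU only, "-1" for all GPUs, or a comma-separated list of ids, per the help text) map naturally onto the CUDA_VISIBLE_DEVICES environment variable. The sketch below shows one way such a mapping could look; it is an assumption about the mechanism, not a copy of main.py, and the function name is hypothetical.

```python
def cuda_visible_devices(which_gpus):
    """Map a --which_gpus value to a CUDA_VISIBLE_DEVICES setting (None means leave it unset)."""
    if which_gpus == "-2":
        return ""       # an empty value hides every GPU, so TensorFlow runs on the CPU
    if which_gpus == "-1":
        return None     # leave the variable alone so all GPUs stay visible
    ids = [part.strip() for part in which_gpus.split(",")]
    if not all(part.isdigit() for part in ids):
        raise ValueError("expected -2, -1, or a comma-separated list of GPU ids")
    return ",".join(ids)

# Typical use, before TensorFlow is imported:
# import os
# value = cuda_visible_devices("0,1,4")
# if value is not None:
#     os.environ["CUDA_VISIBLE_DEVICES"] = value
```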

5-3 Test your model:

For testing you will also have to specify the number of Kfolds as above if you have fewer than 4 images.

Again, the program converts all image files to NumPy format and stores the training/testing images in ./data/np_files and the testing (and training) file lists in ./data/split_lists. Remove these two folders every time you replace your training image and mask files, because the program only reads data from the np_files folder.

Example: You have only 2 images, and you indicate --Kfold 2, which means you will use 1 image file for training, and 1 image file for testing.

Example command: Test SegCaps R3 on the MS COCO dataset without GPU support. Your pre-trained weight file is stored at ./data/saved_models/segcapsr3/split-0_batch-1_shuff-1_aug-1_loss-dice_slic-1_sub--1_strid-1_lr-0.01_recon-2.0_model_20180702-055808.hdf5

python3 ./main.py --test --Kfold 2 --net segcapsr3 --data_root_dir=data --loglevel 2 --which_gpus=-2 --gpus=0 --dataset mscoco17 --weights_path data/saved_models/segcapsr3/split-0_batch-1_shuff-1_aug-1_loss-dice_slic-1_sub--1_strid-1_lr-0.01_recon-2.0_model_20180702-055808.hdf5

5-4 List all parameters:

Run the command below to list all parameters of the main program.

python3 main.py -h

And the result will be:

usage: main.py [-h] --data_root_dir DATA_ROOT_DIR
               [--weights_path WEIGHTS_PATH] [--split_num SPLIT_NUM]
               [--net {segcapsr3,segcapsr1,capsbasic,unet,tiramisu}] [--train]
               [--test] [--manip] [--shuffle_data {0,1}] [--aug_data {0,1}]
               [--loss {bce,w_bce,dice,bce_dice,mar,w_mar}] [--batch_size BATCH_SIZE]
               [--initial_lr INITIAL_LR] [--steps_per_epoch STEPS_PER_EPOCH]
               [--epochs EPOCHS] [--patience PATIENCE] [--recon_wei RECON_WEI]
               [--slices SLICES] [--subsamp SUBSAMP] [--stride STRIDE]
               [--verbose {0,1,2}] [--save_raw {0,1}] [--save_seg {0,1}]
               [--save_prefix SAVE_PREFIX] [--thresh_level THRESH_LEVEL]
               [--compute_dice COMPUTE_DICE]
               [--compute_jaccard COMPUTE_JACCARD]
               [--compute_assd COMPUTE_ASSD] [--which_gpus WHICH_GPUS]
               [--gpus GPUS] [--dataset {luna16,mscoco17}]
               [--num_class NUM_CLASS] [--Kfold KFOLD] [--retrain {0,1}]
               [--loglevel LOGLEVEL]

Train on Medical Data or MS COCO dataset

optional arguments:
  -h, --help            show this help message and exit
  --data_root_dir DATA_ROOT_DIR
                        The root directory for your data.
  --weights_path WEIGHTS_PATH
                        /path/to/trained_model.hdf5 from root. Set to "" for
                        none.
  --split_num SPLIT_NUM
                        Which training split to train/test on.
  --net {segcapsr3,segcapsr1,capsbasic,unet,tiramisu}
                        Choose your network.
  --train               Add this flag to enable training.
  --test                Add this flag to enable testing.
  --manip               Add this flag to enable manipulation.
  --shuffle_data {0,1}  Whether or not to shuffle the training data (both per
                        epoch and in slice order.
  --aug_data {0,1}      Whether or not to use data augmentation during
                        training.
  --loss {bce,w_bce,dice,bce_dice,mar,w_mar}
                        Which loss to use. "bce" and "w_bce": unweighted and
                        weighted binary cross entropy, "dice": soft dice
                        coefficient, "bce_dice": binary cross entropy + soft dice coefficient, "mar" and "w_mar": unweighted and
                        weighted margin loss.
  --batch_size BATCH_SIZE
                        Batch size for training/testing.
  --initial_lr INITIAL_LR
                        Initial learning rate for Adam.
  --steps_per_epoch STEPS_PER_EPOCH
                        Number of iterations in an epoch.
  --epochs EPOCHS       Number of epochs for training.
  --patience PATIENCE   Number of patience indicates the criteria of early
                        stop training.If score of metrics do not improve
                        during the patience of epochs, the training will be
                        stopped.
  --recon_wei RECON_WEI
                        If using capsnet: The coefficient (weighting) for the
                        loss of decoder
  --slices SLICES       Number of slices to include for training/testing.
  --subsamp SUBSAMP     Number of slices to skip when forming 3D samples for
                        training. Enter -1 for random subsampling up to 5% of
                        total slices.
  --stride STRIDE       Number of slices to move when generating the next
                        sample.
  --verbose {0,1,2}     Set the verbose value for training. 0: Silent, 1: per
                        iteration, 2: per epoch.
  --save_raw {0,1}      Enter 0 to not save, 1 to save.
  --save_seg {0,1}      Enter 0 to not save, 1 to save.
  --save_prefix SAVE_PREFIX
                        Prefix to append to saved CSV.
  --thresh_level THRESH_LEVEL
                        Enter 0.0 for masking refine by Otsu algorithm. Or set
                        a value for thresholding level of masking. Value
                        should between 0 and 1.
  --compute_dice COMPUTE_DICE
                        0 or 1
  --compute_jaccard COMPUTE_JACCARD
                        0 or 1
  --compute_assd COMPUTE_ASSD
                        0 or 1
  --which_gpus WHICH_GPUS
                        Enter "-2" for CPU only, "-1" for all GPUs available,
                        or a comma separated list of GPU id numbers ex:
                        "0,1,4".
  --gpus GPUS           Number of GPUs you have available for training. If
                        entering specific GPU ids under the --which_gpus arg
                        or if using CPU, then this number will be inferred,
                        else this argument must be included.
  --dataset {luna16,mscoco17}
                        Enter "mscoco17" for COCO dataset, "luna16" for CT
                        images
  --num_class NUM_CLASS
                        Number of classes to segment. Default is 2. If number
                        of classes > 2, the loss function will be softmax
                        entropy and only apply on SegCapsR3** Current version
                        only support binary classification tasks.
  --Kfold KFOLD         Define K value for K-fold cross validate default K =
                        4, K = 1 for over-fitting test
  --retrain {0,1}       Retrain your model based on existing weights. default
                        0 = train your model from scratch, 1 = retrain
                        existing model. The weights file location of the model
                        has to be provided by --weights_path parameter
  --loglevel LOGLEVEL   loglevel 3 = debug, 2 = info, 1 = warning, 4 = error,
                        > 4 =critical
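The --thresh_level option above chooses between Otsu's method (value 0.0) and a fixed cut-off when turning the network output into a binary mask. The NumPy sketch below illustrates that choice; it is not the code in test.py, the function names are hypothetical, and the Otsu implementation is a plain histogram version written for this example.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Pick the threshold that maximizes the between-class variance of the histogram."""
    hist, edges = np.histogram(np.ravel(values), bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    weighted = hist * centers
    w0 = np.cumsum(hist)                      # pixel count at or below each bin
    w1 = np.cumsum(hist[::-1])[::-1]          # pixel count at or above each bin
    m0 = np.cumsum(weighted) / np.maximum(w0, 1)
    m1 = (np.cumsum(weighted[::-1]) / np.maximum(np.cumsum(hist[::-1]), 1))[::-1]
    between = w0[:-1] * w1[1:] * (m0[:-1] - m1[1:]) ** 2
    return centers[np.argmax(between)]

def threshold_output(prob, thresh_level=0.0):
    """Binarize a probability map: Otsu when thresh_level is 0.0, fixed level otherwise."""
    prob = np.asarray(prob, dtype=float)
    level = otsu_threshold(prob) if thresh_level == 0.0 else thresh_level
    return (prob > level).astype(np.uint8)
```

Otsu adapts the cut-off to each output image, while a fixed level such as 0.5 keeps the decision rule identical across images; which one works better depends on how well calibrated the network output is.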

5-5 Test your model on video stream:

Please note that the segcapsr3 model is too big to load into the memory of a Raspberry Pi 2/3. This program can only be run on a laptop/desktop with a webcam.

The segmentation task with segcapsr3 takes 45~50 seconds on a laptop without GPU support; the capsbasic net takes around 20 seconds. Although the program supports pressing 'ESC' or 'q' to terminate, you may need to close the console to stop the program because of the model inference latency.

Example: Run the segcapsr3 model without GPU, using the pre-trained weight file at ./data/saved_models/segcapsr3/split-0_batch-1_shuff-1_aug-0_loss-dice_slic-1_sub--1_strid-1_lr-0.0001_recon-20.0_model_20180705-092846.hdf5

python3 gen_mask.py --weights_path data/saved_models/segcapsr3/split-0_batch-1_shuff-1_aug-0_loss-dice_slic-1_sub--1_strid-1_lr-0.0001_recon-20.0_model_20180705-092846.hdf5 --which_gpus=-2 --gpus=0 --net segcapsr3

This is the test result based on the pre-trained weight file 'data/saved_models/segcapsr3/split-0_batch-1_shuff-1_aug-0_loss-dice_slic-1_sub--1_strid-1_lr-0.0001_recon-20.0_model_20180705-092846.hdf5', which is included in this GitHub repository.

Run the command below to list all parameters of gen_mask.py.

python3 gen_mask.py -h
usage: gen_mask.py [-h] [--net {segcapsr3,segcapsr1,capsbasic,unet,tiramisu}]
                   --weights_path WEIGHTS_PATH [--num_class NUM_CLASS]
                   [--which_gpus WHICH_GPUS] [--gpus GPUS]

Mask image by segmentation algorithm

optional arguments:
  -h, --help            show this help message and exit
  --net {segcapsr3,segcapsr1,capsbasic,unet,tiramisu}
                        Choose your network.
  --weights_path WEIGHTS_PATH
                        /path/to/trained_model.hdf5 from root. Set to "" for
                        none.
  --num_class NUM_CLASS
                        Number of classes to segment. Default is 2. If number
                        of classes > 2, the loss function will be softmax
                        entropy and only apply on SegCapsR3** Current version
                        only support binary classification tasks.
  --which_gpus WHICH_GPUS
                        Enter "-2" for CPU only, "-1" for all GPUs available,
                        or a comma separated list of GPU id numbers ex:
                        "0,1,4".
  --gpus GPUS           Number of GPUs you have available for training. If
                        entering specific GPU ids under the --which_gpus arg
                        or if using CPU, then this number will be inferred,
                        else this argument must be included.

5-6 Pretrained weights for your testing:

Weights are stored under your data folder.

Example: ./data/saved_models/segcapsr3

  1. Pretrained weights for general-purpose use on the person class, with 40 images.

split-0_batch-1_shuff-1_aug-0_loss-dice_slic-1_sub--1_strid-1_lr-0.0001_recon-20.0_model_20180705-092846.hdf5

  2. Pretrained weights for a portrait of a man. (Overfit test, not good for general-purpose use.)

split-0_batch-1_shuff-1_aug-0_loss-dice_slic-1_sub--1_strid-1_lr-0.01_recon-20.0_model_20180713-041900.hdf5

  3. Pretrained weights for 3 girls on the street. (Overfit test, but can still be tested in general environments.)

split-0_batch-1_shuff-1_aug-0_loss-dice_slic-1_sub--1_strid-1_lr-0.01_recon-20.0_model_20180707-222802.hdf5

6. Program Descriptions

  1. main.py: The entry point of this project.
  2. train.py: The major training module.
  3. test.py: The major testing module.
  4. manip.py: The manipulation module of the model. It supports only 3D images; this version does NOT support 2D images.
  5. gen_mask.py: A video stream capture program integrated with SegCaps for the segmentation task.

7. Program Structures:

----SegCaps-master  (Project folder)
    |
    \-cococrawler (Crawler program folder)
    |   \-annotations (Folder of Microsoft COCO annotation files)
    \-data  (The root folder of program output)
    |   \-imgs (Folder of training and testing images)
    |   \-masks (Folder of training and testing masking data)
    |   \-np_files (Folder to store processed image and mask files in numpy form.)
    |   \-split_lists (Folder for training and testing image splits list)
    |   \-logs (Training logs)
    |   \-plots (Trend diagram for Training period. Only generate after the training completed )
    |   \-figs (Convert image to numpy format, part of images stored for checking)
    |   \-saved_models (All model weights will be stored under this folder)
    |   \-results (Test result images will be stored in this folder)
    |     |
    |     \segcapr3\split_0\final_output
    |                      \raw_output
    |
    \-models (Reference model files: Unet and DenseNet)
    |
    \-segcapsnet (main modules for Capsule nets and SegCaps)
    |
    \-utils (image loader, loss functions, metrics, image augmentation, and thread safe models)
    |
    \-notebook (Some experiment notebooks for reference)
    |
    \-raspberrypi (Raspberry Pi software installation scripts) 
    |
    \-imgs (image file for this readme)

8. Install package on Raspberry Pi 2/3

This section is under construction. So far, the SegCaps R3 model cannot fit into the memory of a Raspberry Pi 2/3.

The description below is for installing onto your Raspberry Pi environment.

Download the pre-compiled TensorFlow build for ARM v7.

Tensorflow for ARM - Github Repo: https://github.com/lhelontra/tensorflow-on-arm/releases

Installation instructions:

https://medium.com/@abhizcc/installing-latest-tensor-flow-and-keras-on-raspberry-pi-aac7dbf95f2

OpenCV installation on Raspberry Pi 2/3

https://www.alatortsev.com/2018/04/27/installing-opencv-on-raspberry-pi-3-b/

There are two scripts under ./raspberrypi/. You can follow the steps below to speed up the installation. The whole installation takes a very long time: around 1 day on the Raspberry Pi 2.

*** Please note that these scripts were only tested on the Raspberry Pi 2, but they should work on the Pi 3. ***

cd ~/SegCaps/raspberrypi
chmod 755 *
./Raspi3-install.sh

The Raspberry Pi will reboot after the installation.

Then execute the second script.

cd ~/SegCaps/raspberrypi
./opencv-install.sh

TODO List:

  1. Execute programs on LUNA 16 dataset. =>Completed

1-1. Porting program from python 2.7 to 3.6 (Jun 11)

1-2. Execute manipulation function. (Jun 11)

1-3. Execute test function on one image without pre-trained weight (Jun 11)

1-4. Execute train function on 3 images. (Jun 12)

1-5. Execute test function on trained model (Jun 12)

1-6. Display original image and result mask image. (Jun 12)

1-7. Identify input image mask format. (Jun 14)

  2. Find right dataset for person/cat/dog segmentation. Candidate dataset is MS COCO. =>Completed

2-1. Identify COCO stuff 2017 as target dataset. (Jun 15)

2-2. Download annotation files for COCO 2017. (Jun 15)

  3. Test existing program on color images. =>Completed

3-1. Generate single class mask on COCO masked image data. (Jun 13)

3-2. Convert the image mask format to background=0, objects=1. (Jun 18)

3-3. Convert the color image to gray scale image (Jun 18)

3-4. Feed the color images with single classes mask to model for training. (Jun 21)

  4. Pipeline up: =>Completed

4-1. Modify code to support experiments. (Jun 25)

  4-1-1. Models persistent by version with configuration and dataset. (Jun 26)

  4-1-2. Notebook folder build up to store experiment results.

4-2. Test pipeline (Jun 27)

  5. Modify program for color images. =>Completed

  6. Model training =>Completed

6-1. Enhance MSCOCO crawler to download image by IDs with specific class of image masking file. (Jun 28)

  7. Integrate model with webcam. =>Completed

  8. Modify the manipulate module to support 2D images.

  9. Reduce the model size to fit into Raspberry memory (1GB)

  10. Support multiclass segmentation.

Citation

This project is based on the official codebase of Capsules for Object Segmentation:

@article{lalonde2018capsules,
  title={Capsules for Object Segmentation},
  author={LaLonde, Rodney and Bagci, Ulas},
  journal={arXiv preprint arXiv:1804.04241},
  year={2018}
}

Questions or Comments

For this modified version, please email me at [email protected]

For the original implementation, please direct any questions or comments to the author. You can either comment on the project page or email the author directly at [email protected].

Original README.md description:

Condensed Abstract

Convolutional neural networks (CNNs) have shown remarkable results over the last several years for a wide range of computer vision tasks. A new architecture recently introduced by Sabour et al., referred to as capsule networks with dynamic routing, has shown great initial results for digit recognition and small image classification. Our work expands the use of capsule networks to the task of object segmentation for the first time in the literature. We extend the idea of convolutional capsules with locally-connected routing and propose the concept of deconvolutional capsules. Further, we extend the masked reconstruction to reconstruct the positive input class. The proposed convolutional-deconvolutional capsule network, called SegCaps, shows strong results for the task of object segmentation with substantial decrease in parameter space. As an example application, we applied the proposed SegCaps to segment pathological lungs from low dose CT scans and compared its accuracy and efficiency with other U-Net-based architectures. SegCaps is able to handle large image sizes (512 x 512) as opposed to baseline capsules (typically less than 32 x 32). The proposed SegCaps reduced the number of parameters of U-Net architecture by 95.4% while still providing a better segmentation accuracy.

Baseline Capsule Network for Object Segmentation

SegCaps (R3) Network Overview

Quantitative Results on the LUNA16 Dataset

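| Method        | Parameters | Split-0 (%) | Split-1 (%) | Split-2 (%) | Split-3 (%) | Average (%) |
|---------------|------------|-------------|-------------|-------------|-------------|-------------|
| U-Net         | 31.0 M     | 98.353      | 98.432      | 98.476      | 98.510      | 98.449      |
| Tiramisu      | 2.3 M      | 98.394      | 98.358      | 98.543      | 98.339      | 98.410      |
| Baseline Caps | 1.7 M      | 82.287      | 79.939      | 95.121      | 83.608      | 83.424      |
| SegCaps (R1)  | 1.4 M      | 98.471      | 98.444      | 98.401      | 98.362      | 98.419      |
| SegCaps (R3)  | 1.4 M      | 98.499      | 98.523      | 98.455      | 98.474      | 98.479      |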

Results of Manipulating the Segmentation Capsule Vectors

segcaps's Issues

test.py

script file: test.py --> function threshold_mask

Why did you use the threshold_otsu filter? During segmentation normally you are not supposed to use any manually defined filter since it will make the network lose valuable information?

Training on own dataset

Hello,
First, thank you for your implementation of SegCaps.
I am trying to train on a small dataset consisting of 100 pairs of greyscale images and 4-level ground-truth ("masks") images:

pair-grey_mask
The original dataset is available

100 pairs of images were copied in SegCaps/data/ :

  • from: ...data/imgs/train0000000.png and ...data/masks/train0000000.png

  • to : ...data/imgs/train0000099.png and ...data/masks/train0000099.png

Then I tried to train SegCaps as follows. The given arguments are wrong; how could I modify them to train SegCaps on those images?

Thank you.

(DeepFish) jeanpat@Dell-T5500:~/Developpement/SegCaps$ python3 ./main.py --test --Kfold 2 --net segcapsr3 --data_root_dir=data --loglevel 2 --which_gpus=-2 --gpus=0 --dataset mscoco17 --weights_path data/saved_models/segcapsr3/overlap_test.hdf5
Using TensorFlow backend.
INFO 2018-08-25 17:55:49,226: 
No existing training, validate, test files...System will generate it.
basename=['train0000050.png']
basename=['train0000051.png']
basename=['train0000052.png']
basename=['train0000053.png']
basename=['train0000054.png']
basename=['train0000055.png']
basename=['train0000056.png']
basename=['train0000057.png']
basename=['train0000058.png']
basename=['train0000059.png']
basename=['train0000060.png']
basename=['train0000061.png']
basename=['train0000062.png']
basename=['train0000063.png']
basename=['train0000064.png']
basename=['train0000065.png']
basename=['train0000066.png']
basename=['train0000067.png']
basename=['train0000068.png']
basename=['train0000069.png']
basename=['train0000070.png']
basename=['train0000071.png']
basename=['train0000072.png']
basename=['train0000073.png']
basename=['train0000074.png']
basename=['train0000075.png']
basename=['train0000076.png']
basename=['train0000077.png']
basename=['train0000078.png']
basename=['train0000079.png']
basename=['train0000080.png']
basename=['train0000081.png']
basename=['train0000082.png']
basename=['train0000083.png']
basename=['train0000084.png']
basename=['train0000085.png']
basename=['train0000086.png']
basename=['train0000087.png']
basename=['train0000088.png']
basename=['train0000089.png']
basename=['train0000090.png']
basename=['train0000091.png']
basename=['train0000092.png']
basename=['train0000093.png']
basename=['train0000094.png']
basename=['train0000095.png']
basename=['train0000096.png']
basename=['train0000097.png']
basename=['train0000098.png']
basename=['train0000099.png']
basename=['train0000000.png']
basename=['train0000001.png']
basename=['train0000002.png']
basename=['train0000003.png']
basename=['train0000004.png']
basename=['train0000005.png']
basename=['train0000006.png']
basename=['train0000007.png']
basename=['train0000008.png']
basename=['train0000009.png']
basename=['train0000010.png']
basename=['train0000011.png']
basename=['train0000012.png']
basename=['train0000013.png']
basename=['train0000014.png']
basename=['train0000015.png']
basename=['train0000016.png']
basename=['train0000017.png']
basename=['train0000018.png']
basename=['train0000019.png']
basename=['train0000020.png']
basename=['train0000021.png']
basename=['train0000022.png']
basename=['train0000023.png']
basename=['train0000024.png']
basename=['train0000025.png']
basename=['train0000026.png']
basename=['train0000027.png']
basename=['train0000028.png']
basename=['train0000029.png']
basename=['train0000030.png']
basename=['train0000031.png']
basename=['train0000032.png']
basename=['train0000033.png']
basename=['train0000034.png']
basename=['train0000035.png']
basename=['train0000036.png']
basename=['train0000037.png']
basename=['train0000038.png']
basename=['train0000039.png']
basename=['train0000040.png']
basename=['train0000041.png']
basename=['train0000042.png']
basename=['train0000043.png']
basename=['train0000044.png']
basename=['train0000045.png']
basename=['train0000046.png']
basename=['train0000047.png']
basename=['train0000048.png']
basename=['train0000049.png']
INFO 2018-08-25 17:55:49,245: 
Read image files...data/imgs/train0000077.png
Traceback (most recent call last):
  File "./main.py", line 276, in <module>
    main(arguments)
  File "./main.py", line 94, in main
    model_list = create_model(args=args, input_shape=net_input_shape, enable_decoder=True)
  File "/home/jeanpat/Developpement/SegCaps/utils/model_helper.py", line 29, in create_model
    model_list = CapsNetR3(input_shape, args.num_class, enable_decoder)
  File "/home/jeanpat/Developpement/SegCaps/segcapsnet/capsnet.py", line 55, in CapsNetR3
    name='deconv_cap_1_1')(conv_cap_4_1)
  File "/home/jeanpat/anaconda3/envs/DeepFish/lib/python3.6/site-packages/keras/engine/base_layer.py", line 457, in __call__
    output = self.call(inputs, **kwargs)
  File "/home/jeanpat/Developpement/SegCaps/segcapsnet/capsule_layers.py", line 250, in call
    out_height = deconv_length(self.input_height, self.scaling, self.kernel_size, self.padding)
TypeError: deconv_length() missing 1 required positional argument: 'output_padding'

PS: The GPU is a GTX 960 with 4 GB of memory.

$ nvidia-smi 
Sun Aug 26 10:00:11 2018       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.51                 Driver Version: 396.51                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 960     Off  | 00000000:03:00.0  On |                  N/A |
| 39%   29C    P8    12W / 130W |    312MiB /  4042MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
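
A note on the traceback above: the call in capsule_layers.py passes four arguments, while the installed Keras deconv_length evidently expects a fifth one, output_padding. One way to sidestep the version difference is to compute the output length locally. The sketch below is a minimal stand-in, assuming no explicit output padding and no dilation; the name deconv_out_length is hypothetical and is not part of SegCaps or Keras.

```python
def deconv_out_length(dim_size, stride, kernel_size, padding):
    """Length of a transposed-convolution output along one spatial axis.

    Assumes no explicit output padding and no dilation.
    """
    if dim_size is None:
        # An unknown (dynamic) dimension stays unknown.
        return None
    if padding == "same":
        return dim_size * stride
    if padding == "valid":
        return dim_size * stride + max(kernel_size - stride, 0)
    if padding == "full":
        return dim_size * stride - (stride + kernel_size - 2)
    raise ValueError("padding must be 'same', 'valid' or 'full', got %r" % (padding,))


# Example values (illustrative): upsampling by 2 with a 4x4 kernel.
out_height = deconv_out_length(28, 2, 4, "same")
print(out_height)
```

Passing None as the extra output_padding argument to the library function may also work, but that depends on the installed Keras version and is an assumption to verify.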

OOM error when training on my own dataset

Hello, thanks for the code and documents, they are easy to understand.

When I train the model on my own dataset (16 grayscale 2D PNG images of 512*512 pixels, with black-and-white masks), I always get the following error: tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[16,1,4,512,512,2] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc

I have two GTX 1080 GPUs with 8 GB of dedicated memory. The options I use for training are:
python ./main.py --train --split_num=0 --batch_size=2 --aug_data=1 --loglevel=2 --net segcapsr3 --data_root_dir=data --which_gpus=-1 --gpus=2 --loss bce --dataset mscoco17

In addition, is there a way to produce test outputs in black and white instead of yellow and purple?
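
For a sense of scale, the size of the tensor named in the OOM message can be worked out directly from its shape. The sketch below assumes 4 bytes per element (the message says type float, which I read as float32); tensor_bytes is a hypothetical helper, not part of the repository.

```python
from functools import reduce
from operator import mul


def tensor_bytes(shape, bytes_per_element=4):
    """Memory needed for one dense tensor of the given shape."""
    return reduce(mul, shape, 1) * bytes_per_element


# Shape copied from the error message above.
shape = (16, 1, 4, 512, 512, 2)
size_mib = tensor_bytes(shape) / 2**20
print("one tensor of shape %s needs %.0f MiB" % (shape, size_mib))
```

A single tensor of this shape is modest on its own, so the failure presumably comes from many such intermediates being alive at once at 512*512 resolution. Smaller input images or a smaller batch size would be the first things to try, though that is a guess rather than a confirmed fix.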

Iteration Time Issue

Hi Cheng_Lin_Li,

I hope this message finds you well. I've been using your GitHub project and I encountered an issue regarding the iteration time. Specifically, I selected 10 person photos for processing, but each iteration is taking more than 80 hours. I wanted to reach out to you and confirm whether this duration is expected or if there might be some issue on my end.

I appreciate your time and assistance in resolving this matter. If you need any additional information or logs from my side, please let me know.

Thank you,
AiBing
path_to_np=data/np_files/train9.npz
1/10000 [..............................] - ETA: 89:09:12 - loss: 0.8261 - out_seg_loss: 0.5970 - out_recon_loss: 0.2291 - out_seg_dice_hard: 0.2155INFO 2023-12-24 13:36:55,324:
path_to_np=data/np_files/train4.npz
2/10000 [..............................] - ETA: 79:53:11 - loss: 0.7541 - out_seg_loss: 0.5197 - out_recon_loss: 0.2344 - out_seg_dice_hard: 0.2829INFO 2023-12-24 13:37:20,756:
path_to_np=data/np_files/train4.npz
3/10000 [..............................] - ETA: 77:02:40 - loss: 0.8524 - out_seg_loss: 0.6285 - out_recon_loss: 0.2239 - out_seg_dice_hard: 0.1892INFO 2023-12-24 13:37:46,455:
path_to_np=data/np_files/train9.npz
4/10000 [..............................] - ETA: 75:12:46 - loss: 0.9051 - out_seg_loss: 0.6828 - out_recon_loss: 0.2223 - out_seg_dice_hard: 0.1423INFO 2023-12-24 13:38:11,586:
path_to_np=data/np_files/train6.npz
5/10000 [..............................] - ETA: 74:24:26 - loss: 0.8601 - out_seg_loss: 0.6315 - out_recon_loss: 0.2285 - out_seg_dice_hard: 0.1833INFO 2023-12-24 13:38:37,227:
path_to_np=data/np_files/train8.npz
6/10000 [..............................] - ETA: 73:38:04 - loss: 0.7891 - out_seg_loss: 0.5567 - out_recon_loss: 0.2325 - out_seg_dice_hard: 0.3061INFO 2023-12-24 13:39:02,370:
path_to_np=data/np_files/train3.npz
7/10000 [..............................] - ETA: 73:05:20 - loss: 0.7929 - out_seg_loss: 0.5625 - out_recon_loss: 0.2304 - out_seg_dice_hard: 0.2835INFO 2023-12-24 13:39:27,537:
path_to_np=data/np_files/train7.npz
8/10000 [..............................] - ETA: 72:38:26 - loss: 0.7882 - out_seg_loss: 0.5554 - out_recon_loss: 0.2328 - out_seg_dice_hard: 0.2500INFO 2023-12-24 13:39:52,595:
path_to_np=data/np_files/train6.npz
9/10000 [..............................] - ETA: 72:22:19 - loss: 0.7823 - out_seg_loss: 0.5525 - out_recon_loss: 0.2298 - out_seg_dice_hard: 0.2298INFO 2023-12-24 13:40:17,919:
path_to_np=data/np_files/train9.npz
10/10000 [..............................] - ETA: 72:15:30 - loss: 0.7474 - out_seg_loss: 0.5154 - out_recon_loss: 0.2319 - out_seg_dice_hard: 0.2990INFO 2023-12-24 13:40:43,619:
path_to_np=data/np_files/train7.npz
11/10000 [..............................] - ETA: 72:02:35 - loss: 0.7496 - out_seg_loss: 0.5167 - out_recon_loss: 0.2329 - out_seg_dice_hard: 0.3056INFO 2023-12-24 13:41:08,827:
path_to_np=data/np_files/train8.npz
12/10000 [..............................] - ETA: 71:50:43 - loss: 0.7657 - out_seg_loss: 0.5331 - out_recon_loss: 0.2326 - out_seg_dice_hard: 0.2858INFO 2023-12-24 13:41:33,973:
path_to_np=data/np_files/train3.npz
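
The ETA in this log can be checked from the INFO timestamps: it is the time per step multiplied by the 10000 steps shown in the progress bar. A small sketch of that arithmetic follows; the two timestamps are copied from the log above, and the helper name is hypothetical.

```python
from datetime import datetime

# Timestamp layout used by the INFO lines in the log.
FMT = "%Y-%m-%d %H:%M:%S,%f"


def seconds_between(start, end):
    """Elapsed seconds between two log timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


# Timestamps printed after step 1 and step 2.
step_seconds = seconds_between("2023-12-24 13:36:55,324", "2023-12-24 13:37:20,756")
total_steps = 10000  # denominator of the progress bar
eta_hours = step_seconds * total_steps / 3600

print("%.1f s per step, about %.1f hours per epoch" % (step_seconds, eta_hours))
```

So the long estimate follows from roughly 25 seconds per step combined with 10000 steps per epoch, not from the number of images. Lowering the steps-per-epoch setting should shorten an epoch proportionally; whether 25 seconds per step is normal for this hardware cannot be told from the log alone.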

Instruction for running the code from .jpg files

Hello, I have put my .jpg images and masks under the imgs and masks folders respectively. After running the code I get the error shown in the following screenshot. How should I run your code? My images are grayscale and the OS is Ubuntu 16.04.
Screenshot from 2019-07-04 18-05-42

RGB images with binary masks

Hi @Cheng-Lin-Li, thanks for your great effort.

I have a question about using custom dataset.
Is it possible to train SegCapsR3 with 3-channel RGB images and a 1-channel grayscale mask?

UNet works with the same configuration (3 channels input + 1 channel mask) but SegCapsR3 doesn't.

SegCapsR3 fails with the error message of ValueError: Error when checking target: expected out_recon to have shape (224, 224, 1) but got array with shape (224, 224, 3)

Here is the network summary. I didn't do anything on the network and I'm at the very beginning of this road and couldn't find a solution.
Regards,

Layer (type)                         Output Shape              Param #   Connected to             


input_1 (InputLayer)                 (None, 224, 224, 3)       0                                  
conv1 (Conv2D)                       (None, 224, 224, 16)      1216      input_1[0][0]            
reshape_1 (Reshape)                  (None, 224, 224, 1, 16)   0         conv1[0][0]              
primarycaps (ConvCapsuleLayer)       (None, 112, 112, 2, 16)   12832     reshape_1[0][0]          
conv_cap_2_1 (ConvCapsuleLayer)      (None, 112, 112, 4, 16)   25664     primarycaps[0][0]        
conv_cap_2_2 (ConvCapsuleLayer)      (None, 56, 56, 4, 32)     51328     conv_cap_2_1[0][0]       
conv_cap_3_1 (ConvCapsuleLayer)      (None, 56, 56, 8, 32)     205056    conv_cap_2_2[0][0]       
conv_cap_3_2 (ConvCapsuleLayer)      (None, 28, 28, 8, 64)     410112    conv_cap_3_1[0][0]       
conv_cap_4_1 (ConvCapsuleLayer)      (None, 28, 28, 8, 32)     409856    conv_cap_3_2[0][0]       
deconv_cap_1_1 (DeconvCapsuleLayer)  (None, 56, 56, 8, 32)     131328    conv_cap_4_1[0][0]       
up_1 (Concatenate)                   (None, 56, 56, 16, 32)    0         deconv_cap_1_1[0][0]     
                                                                         conv_cap_3_1[0][0]       
deconv_cap_1_2 (ConvCapsuleLayer)    (None, 56, 56, 4, 32)     102528    up_1[0][0]               
deconv_cap_2_1 (DeconvCapsuleLayer)  (None, 112, 112, 4, 16)   32832     deconv_cap_1_2[0][0]     
up_2 (Concatenate)                   (None, 112, 112, 8, 16)   0         deconv_cap_2_1[0][0]     
                                                                         conv_cap_2_1[0][0]       
deconv_cap_2_2 (ConvCapsuleLayer)    (None, 112, 112, 4, 16)   25664     up_2[0][0]               
deconv_cap_3_1 (DeconvCapsuleLayer)  (None, 224, 224, 2, 16)   8224      deconv_cap_2_2[0][0]     
up_3 (Concatenate)                   (None, 224, 224, 3, 16)   0         deconv_cap_3_1[0][0]     
                                                                         reshape_1[0][0]          
seg_caps (ConvCapsuleLayer)          (None, 224, 224, 1, 16)   272       up_3[0][0]               
input_2 (InputLayer)                 (None, 224, 224, 1)       0                                  
mask_1 (Mask)                        (None, 224, 224, 1, 16)   0         seg_caps[0][0]           
                                                                         input_2[0][0]            
reshape_2 (Reshape)                  (None, 224, 224, 16)      0         mask_1[0][0]             
recon_1 (Conv2D)                     (None, 224, 224, 64)      1088      reshape_2[0][0]          
recon_2 (Conv2D)                     (None, 224, 224, 128)     8320      recon_1[0][0]            
out_seg (Length)                     (None, 224, 224, 1)       0         seg_caps[0][0]           
out_recon (Conv2D)                   (None, 224, 224, 1)       129       recon_2[0][0]            
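
The error says out_recon produces one channel while the target array has three, which suggests the reconstruction target is built from the 3-channel input image. Two options seem possible: give the reconstruction output three filters, or reduce the target to one channel. The sketch below illustrates the second option, assuming NumPy arrays in (H, W, C) layout; to_single_channel is a hypothetical helper, and where it would be called in the data loader is not shown here.

```python
import numpy as np


def to_single_channel(img):
    """Return an (H, W, 1) float32 array from a grayscale or RGB image.

    RGB is collapsed with the ITU-R BT.601 luma weights.
    """
    img = np.asarray(img, dtype=np.float32)
    if img.ndim == 2:
        # Plain grayscale: only the channel axis is missing.
        return img[..., np.newaxis]
    if img.shape[-1] == 1:
        return img
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    return (img[..., :3] @ weights)[..., np.newaxis]


rgb = np.random.rand(224, 224, 3)
target = to_single_channel(rgb)
print(target.shape)
```

The network input itself can stay at three channels under this approach; only the array compared against out_recon changes.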

Value Error

Hi everyone,

When I run the code, I get the error below. Could you help me with it, please?
[Screenshot: error]

input size

Hello,
My data consists of 3D volumes with dimensions of (217, 181, 50). In the readme, you mentioned that the code can handle the input size itself, but I get this error:
"ValueError: could not broadcast input array from shape (217,181,50) into shape (512,512,1)".
Another issue is that when I try to load my own data, load_2d_data is invoked, although my data is 3D.
Is there anything that I should take care of?
Thank you
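
The broadcast error indicates that a whole (217, 181, 50) volume is being copied into a buffer meant for one (512, 512, 1) slice. One possible preprocessing step, sketched below with NumPy, is to split the volume into 2D slices and zero-pad each to 512*512. It assumes the slices lie along the last axis and that zero padding is acceptable for the data; volume_to_padded_slices is a hypothetical helper, not something the repository provides.

```python
import numpy as np


def volume_to_padded_slices(volume, size=512):
    """Turn an (H, W, N) volume into N centred, zero-padded (size, size, 1) slices."""
    h, w, _ = volume.shape
    if h > size or w > size:
        raise ValueError("slice of %dx%d does not fit into %dx%d" % (h, w, size, size))
    pad_h, pad_w = size - h, size - w
    padded = np.pad(
        volume,
        ((pad_h // 2, pad_h - pad_h // 2), (pad_w // 2, pad_w - pad_w // 2), (0, 0)),
        mode="constant",
    )
    # Move the slice axis to the front and append a channel axis.
    return np.moveaxis(padded, -1, 0)[..., np.newaxis]


slices = volume_to_padded_slices(np.zeros((217, 181, 50), dtype=np.float32))
print(slices.shape)
```

Each resulting slice could then be saved as an individual 2D image and mask pair, which would match the 2D loader that is being invoked.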

Can not load images and masks

Unable to load img or masks for 43
index 3 is out of bounds for axis 2 with size 3
Skipping file

INFO 2019-07-23 21:14:04,744:
path_to_np=data/np_files/404.npz
INFO 2019-07-23 21:14:04,744:
Pre-made numpy array not found for 404.
Creating now...
DEBUG 2019-07-23 21:14:04,744: STREAM b'IHDR' 16 13
DEBUG 2019-07-23 21:14:04,744: STREAM b'pHYs' 41 9
DEBUG 2019-07-23 21:14:04,744: STREAM b'IDAT' 62 8192

This error keeps happening over and over, and the training doesn't start. Please help me. The output after the dotted line is another message that keeps repeating for all of the images and masks. And yes, I have the files in the data/imgs and data/masks folders.

Problem with deconv_length from keras.utils.conv_utils

I'm trying to evaluate the performance of the code for segmentation of my 2D CT images. However, there is a problem importing deconv_length from keras.utils.conv_utils, or even from tensorflow.keras.utils. I googled the problem and replaced "deconv_length" with "deconv_output_length". Now an error occurs for the following line:

deconv_cap_1_1 = DeconvCapsuleLayer(kernel_size=4, num_capsule=8, num_atoms=32, upsamp_type='deconv', scaling=2, routings=3, padding='same', name='deconv_cap_1_1')(conv_cap_4_1)

Exception has occurred: AssertionError
Exception encountered when calling layer "deconv_cap_1_1" (type DeconvCapsuleLayer).

in user code:

File "d:\My_Codes\New_Cheng\segcapsnet\capsule_layers.py", line 250, in call  *
    out_height = deconv_output_length(self.input_height, self.scaling, self.kernel_size, self.padding)
File "C:\Users\ASUS\anaconda3\lib\site-packages\keras\utils\conv_utils.py", line 177, in deconv_output_length  **
    assert padding in {'same', 'valid', 'full'}

AssertionError: 

Call arguments received:
• input_tensor=tf.Tensor(shape=(None, 64, 64, 8, 32), dtype=float32)
• training=None
File "C:\Users\ASUS\AppData\Local\Temp_autograph_generated_filekx3mg4bh.py", line 63, in tf__call
ag__.if_stmt(ag__.ld(self).upsamp_type == 'resize', if_body_1, else_body_1, get_state_1, set_state_1, ('outputs',), 1)
File "C:\Users\ASUS\AppData\Local\Temp_autograph_generated_filekx3mg4bh.py", line 55, in else_body_1
ag__.if_stmt(ag__.ld(self).upsamp_type == 'subpix', if_body, else_body, get_state, set_state, ('outputs',), 1)
File "C:\Users\ASUS\AppData\Local\Temp_autograph_generated_filekx3mg4bh.py", line 45, in else_body
out_height = ag__.converted_call(ag__.ld(deconv_output_length), (ag__.ld(self).input_height, ag__.ld(self).scaling, ag__.ld(self).kernel_size, ag__.ld(self).padding), None, fscope)

During handling of the above exception, another exception occurred:

During handling of the above exception, another exception occurred:

File "D:\My_Codes\New_Cheng\segcapsnet\capsnet.py", line 53, in CapsNetR3
deconv_cap_1_1 = DeconvCapsuleLayer(kernel_size=4, num_capsule=8, num_atoms=32, upsamp_type='deconv',
File "D:\My_Codes\New_Cheng\utils\model_helper.py", line 29, in create_model
model_list = CapsNetR3(input_shape, args.num_class, enable_decoder)
File "D:\My_Codes\New_Cheng\main.py", line 98, in main
model_list = create_model(args=args, input_shape=net_input_shape, enable_decoder=True)
File "D:\My_Codes\New_Cheng\main.py", line 284, in <module>
main(arguments)
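
A possible reading of the AssertionError above: if deconv_output_length takes its parameters in a different order than the old deconv_length (input length, filter size, padding, then stride as a keyword; this order is my assumption and should be checked against the installed Keras), then the four positional arguments in capsule_layers.py put kernel_size where padding is expected. The sketch below uses a local stand-in, not the library function, to show the mismatch and a keyword-based adapter. Both function names are hypothetical.

```python
def stand_in_deconv_output_length(input_length, filter_size, padding,
                                  output_padding=None, stride=0, dilation=1):
    """Stand-in with the ASSUMED newer parameter order (not the Keras function).

    output_padding and dilation are accepted but ignored in this sketch.
    """
    if padding not in {"same", "valid", "full"}:
        raise AssertionError("padding must be 'same', 'valid' or 'full'")
    if input_length is None:
        return None
    if padding == "same":
        return input_length * stride
    if padding == "valid":
        return input_length * stride + max(filter_size - stride, 0)
    return input_length * stride - (stride + filter_size - 2)


def deconv_length_old_order(dim_size, stride_size, kernel_size, padding,
                            impl=stand_in_deconv_output_length):
    """Adapter: accept the old (dim, stride, kernel, padding) order."""
    return impl(dim_size, kernel_size, padding, stride=stride_size)


def old_order_call_fails():
    """Return True if the old positional order triggers the padding assertion."""
    try:
        # padding receives the integer 4 here, as in the traceback's call.
        stand_in_deconv_output_length(64, 2, 4, "same")
    except AssertionError:
        return True
    return False


print(old_order_call_fails(), deconv_length_old_order(64, 2, 4, "same"))
```

If the assumption about the parameter order holds, calling the real helper with keywords (stride=self.scaling) in capsule_layers.py would be the equivalent change.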

ValueError: No gradients provided for any variable

When running the code on Google Colab, I get the following error. Please help:

ValueError: in user code:

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:805 train_function  *
    return step_function(self, iterator)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:795 step_function  **
    outputs = model.distribute_strategy.run(run_step, args=(data,))
/usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:1259 run
    return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:2730 call_for_each_replica
    return self._call_for_each_replica(fn, args, kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:3417 _call_for_each_replica
    return fn(*args, **kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:788 run_step  **
    outputs = model.train_step(data)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:757 train_step
    self.optimizer.minimize(loss, self.trainable_variables, tape=tape)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:498 minimize
    return self.apply_gradients(grads_and_vars, name=name)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:598 apply_gradients
    grads_and_vars = optimizer_utils.filter_empty_gradients(grads_and_vars)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/optimizer_v2/utils.py:79 filter_empty_gradients
    ([v.name for _, v in grads_and_vars],))

ValueError: No gradients provided for any variable: ['conv1/kernel:0', 'conv1/bias:0', 'primarycaps/W:0', 'primarycaps/b:0', 'seg_caps/W:0', 'seg_caps/b:0', 'recon_1/kernel:0', 'recon_1/bias:0', 'recon_2/kernel:0', 'recon_2/bias:0', 'out_recon/kernel:0', 'out_recon/bias:0'].

dependencies version

Hi Cheng Lin Li,

Could you list all the dependencies versions in the requirements.txt as well?
I just can't import print_summary from keras.utils, so I really need to know the keras version that you used.

Thank you,
Ian

ValueError: Dimension 0 in both shapes must be equal, but are 3 and 16.

I am trying to use the Jupyter notebook 20180701-SegCapsR3-image-segmentation-with Color image input.ipynb, but while loading the model it throws an error:

ValueError: Dimension 0 in both shapes must be equal, but are 3 and 16. Shapes are [3,3,1,32] and [16,1,5,5]. for 'Assign' (op: 'Assign') with input shapes: [3,3,1,32], [16,1,5,5].

Any idea why it's not able to load the model?

Why is the transformation matrix shared by different child capsule types?

@Cheng-Lin-Li Thanks for sharing this wonderful work!

I have a question about the transformation matrix. It's mentioned in the paper that different transformation matrices are used for each type of capsule. Does that refer to both child and parent capsule types, or only to parent capsule types? It seems to me that the transformation matrices are shared between different child capsule types. If so, why do you use multiple child capsule types, and what is the meaning of applying the same transformation matrix to different child capsule types?

I'd appreciate it a lot if you could help me!
Thank you!

Creating MSCOCO dataset

How did you create the label vector (y) when using the model on MSCOCO dataset? For each input (image), MSCOCO has multiple outputs (categories). How to create the dataset in this case (i.e. the input vector x and the label vector y)?
Any help would be appreciated. Thank You.

Training image and mask error

Thank you for your implementation of SegCaps.
I trained on my own dataset and got the error "Unable to load img or masks for 00001, too many indices for array, Skipping file".
When loading the MSCOCO dataset with 10 images, it works.
My images are 2D uint8 grayscale arrays of shape (512,512), and the masks are also 2D uint8 grayscale arrays of shape (512,512). I checked the Issues, and someone said masks should be (512,512,1). Do my images and masks fit the input format? If not, what should I do?
Any advice would be appreciated.
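
"Too many indices for array" is the error NumPy raises when a 2D array is indexed as if it had a third axis, which fits the (512,512) versus (512,512,1) difference mentioned above. A minimal sketch of adding the missing channel axis before the arrays reach the loader follows; ensure_channel_axis is a hypothetical helper, and whether the loader needs this for images, masks, or both is an assumption.

```python
import numpy as np


def ensure_channel_axis(arr):
    """Return the array as (H, W, C), adding a trailing axis to 2D input."""
    arr = np.asarray(arr)
    if arr.ndim == 2:
        return arr[..., np.newaxis]
    if arr.ndim == 3:
        return arr
    raise ValueError("expected a 2D or 3D array, got %d dimensions" % arr.ndim)


mask = np.zeros((512, 512), dtype=np.uint8)
mask3d = ensure_channel_axis(mask)
print(mask.shape, "->", mask3d.shape)
```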

TypeError: deconv_length() missing 1 required positional argument: 'output_padding'

Traceback (most recent call last):
File "./main.py", line 281, in <module>
main(arguments)
File "./main.py", line 95, in main
model_list = create_model(args=args, input_shape=net_input_shape, enable_decoder=True)
File "/home/nd/capsnet/SegCaps-master (copy)/utils/model_helper.py", line 29, in create_model
model_list = CapsNetR3(input_shape, args.num_class, enable_decoder)
File "/home/nd/capsnet/SegCaps-master (copy)/segcapsnet/capsnet.py", line 55, in CapsNetR3
name='deconv_cap_1_1')(conv_cap_4_1)
File "/home/nd/anaconda3/lib/python3.6/site-packages/keras/engine/base_layer.py", line 457, in __call__
output = self.call(inputs, **kwargs)
File "/home/nd/capsnet/SegCaps-master (copy)/segcapsnet/capsule_layers.py", line 250, in call
out_height = deconv_length(self.input_height, self.scaling, self.kernel_size, self.padding)
TypeError: deconv_length() missing 1 required positional argument: 'output_padding'

Multiclass Segmentation

Hi Cheng-Lin-Li,

I found your modification of this repo really good, especially the adaptations for training on the MS COCO dataset. Did you also try it on images with more than 2 classes, beyond binary segmentation?

Thanks man!
Cheers,
Oushesh

reconstruction layer

Hi Cheng

I looked at your code and description. I just want to know why we need the reconstruction layer. What if we just use the original images as input and the pixel-wise true labels as output?

Best

What does num_atom mean?

What does num_atom mean?
Does this mean the number of feature maps in each capsule?

Can I put another image instead of a mask file?
Must the values be 0 and 1?
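
On the first question: judging from the layer summary and the DeconvCapsuleLayer call quoted elsewhere on this page (num_capsule=8, num_atoms=32 giving an output of (None, 56, 56, 8, 32)), num_atoms appears to be the length of each capsule's vector, while num_capsule is the number of capsule types at each pixel. The sketch below only restates that reading of the shapes; both helpers are hypothetical.

```python
def capsule_output_shape(height, width, num_capsule, num_atoms):
    """Shape of a capsule layer output: batch, height, width, capsule types, vector length."""
    return (None, height, width, num_capsule, num_atoms)


def values_per_pixel(num_capsule, num_atoms):
    """How many numbers describe one pixel position in such a layer."""
    return num_capsule * num_atoms


# Values from the deconv_cap_1_1 row of the summary.
print(capsule_output_shape(56, 56, 8, 32), values_per_pixel(8, 32))
```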

bad mask

I ran the following command:
python3 ./main.py --test --Kfold 2 --net segcapsr3 --data_root_dir=data --loglevel 2 --which_gpus=-2 --gpus=0 --dataset mscoco17 --weights_path saved_models/segcapsr3/split-0_batch-1_shuff-1_aug-0_loss-dice_slic-1_sub--1_strid-1_lr-0.0001_recon-20.0_model_20180705-092846.hdf5.
I attached the image, the reference mask and the output mask (data/results/segcapsr3/split0/final_output):
train2
train2
train2_final_output
What am I doing wrong?

Training Performance Does Not Improve

Is this training result reasonable, and should I let training run to the final epoch?
It looks like dice_hard does not improve and the optimizer has settled in a local minimum.

I use the MRI dataset from ISLES 2017 and have adjusted the data loading process to not use K-fold.

Epoch 1/50
369/369 [==============================] - 1370s 4s/step - loss: 1.4192 - out_seg_loss: 1.2236 - out_recon_loss: 0.1956 - out_seg_dice_hard: 0.0746 - val_loss: 1.1187 - val_out_seg_loss: 1.0050 - val_out_recon_loss: 0.1137 - val_out_seg_dice_hard: 0.0304
Epoch 00001: val_out_seg_dice_hard improved from -inf to 0.03035, saving model to [my folder]

Epoch 2/50
369/369 [==============================] - 1371s 4s/step - loss: 1.1169 - out_seg_loss: 1.0563 - out_recon_loss: 0.0605 - out_seg_dice_hard: 0.0995 - val_loss: 1.0128 - val_out_seg_loss: 1.0010 - val_out_recon_loss: 0.0118 - val_out_seg_dice_hard: 0.0522
Epoch 00002: val_out_seg_dice_hard improved from 0.03035 to 0.05218, saving model to [my folder]

Epoch 3/50
369/369 [==============================] - 1365s 4s/step - loss: 1.0673 - out_seg_loss: 1.0443 - out_recon_loss: 0.0229 - out_seg_dice_hard: 0.1156 - val_loss: 1.0043 - val_out_seg_loss: 0.9994 - val_out_recon_loss: 0.0049 - val_out_seg_dice_hard: 0.0485
Epoch 00003: val_out_seg_dice_hard did not improve from 0.05218

Epoch 4/50
369/369 [==============================] - 1365s 4s/step - loss: 1.0133 - out_seg_loss: 0.9917 - out_recon_loss: 0.0216 - out_seg_dice_hard: 0.0873 - val_loss: 0.9998 - val_out_seg_loss: 0.9957 - val_out_recon_loss: 0.0041 - val_out_seg_dice_hard: 9.4607e-09
Epoch 00004: val_out_seg_dice_hard did not improve from 0.05218

Epoch 5/50
369/369 [==============================] - 1370s 4s/step - loss: 1.0076 - out_seg_loss: 0.9868 - out_recon_loss: 0.0207 - out_seg_dice_hard: 0.0623 - val_loss: 0.9991 - val_out_seg_loss: 0.9952 - val_out_recon_loss: 0.0039 - val_out_seg_dice_hard: 9.4697e-09
.....

Epoch 14/50
369/369 [==============================] - 1373s 4s/step - loss: 1.0047 - out_seg_loss: 0.9830 - out_recon_loss: 0.0217 - out_seg_dice_hard: 0.0644 - val_loss: 0.9982 - val_out_seg_loss: 0.9945 - val_out_recon_loss: 0.0038 - val_out_seg_dice_hard: 9.9502e-09
Epoch 00014: val_out_seg_dice_hard did not improve from 0.05218

Final output and raw output are always same no matter what input images are given

@Cheng-Lin-Li Thank you for your nice work!

I have one puzzling question. I can successfully run the code (both train and test on segcapsr3) using both MSCOCO17 and my own grayscale images. However, I found that the final output and raw output images (stored in the folder ../SegCaps/data/results/segcapsr3/split_0) are always the same, no matter what input images are given.

Any suggestions will be appreciated. The following is my testing code.

python3 ./main.py --test --Kfold 2 --net segcapsr3 --data_root_dir=data --loglevel 2 --which_gpus=-2 --gpus=0 --dataset mscoco17 --weights_path saved_models/segcapsr3/split-0_batch-1_shuff-1_aug-1_loss-dice_slic-1_sub--1_strid-1_lr-0.1_recon-131.072_model_20190918-151252.hdf5

expected out_recon shape (512, 512, 1)

I'm training CapsNetR3 on my own dataset. I get the following error

Error when checking target: expected out_recon to have shape (512, 512, 1) but got array with shape (512, 512, 3)

My input shape is (512, 512, 3). Any clue or hint is highly appreciated.
