
Embedding-based Instance Segmentation in Microscopy

Introduction

This repository hosts the version of the code used for the publication Embedding-based Instance Segmentation of Microscopy Images.

We refer to the techniques elaborated in the publication as EmbedSeg. EmbedSeg is a method to perform instance segmentation of objects in microscopy images, based on the ideas of Neven et al., 2019.

With EmbedSeg, we obtain state-of-the-art results on multiple real-world microscopy datasets. EmbedSeg has a small enough memory footprint (between 0.7 and about 3 GB) to allow network training on virtually all CUDA-enabled hardware, including laptops.

Dependencies

To run this branch with GPU support, one could execute the following lines of code:

mamba create -n EmbedSeg python
mamba activate EmbedSeg
mamba install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
git clone https://github.com/juglab/EmbedSeg.git
cd EmbedSeg
pip install -e .
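
A quick sanity check (not part of the original instructions) confirms that the CUDA build of PyTorch is active:

import torch
print(torch.__version__)          # should report a CUDA build, e.g. ...+cu117
print(torch.cuda.is_available())  # should print True on CUDA-enabled hardware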

For CPU support, one could execute the following lines of code:

mamba create -n EmbedSeg python
mamba activate EmbedSeg
pip install torch torchvision
git clone https://github.com/juglab/EmbedSeg.git
cd EmbedSeg
pip install -e .

Getting Started

Look in the examples directory, and try out the DSB-2018 notebooks for 2D images or the Mouse-Organoid-Cells-CBG notebooks for volumetric (3D) images. Please make sure to select Kernel > Change kernel to EmbedSegEnv.
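
If EmbedSegEnv does not appear in the kernel list, one way to register the environment as a Jupyter kernel is the command used in an issue report further below:

python3 -m ipykernel install --sys-prefix --name EmbedSegEnv --display-name "EmbedSegEnv"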

Datasets

3D datasets are available as release assets here.
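
As a sketch of fetching such an asset programmatically (the URL below is a placeholder, not a real asset link; the pattern mirrors how the prediction notebooks download pretrained models):

import zipfile
import torch

url = 'https://github.com/juglab/EmbedSeg/releases/download/<tag>/<dataset>.zip'  # placeholder
torch.hub.download_url_to_file(url=url, dst='dataset.zip', progress=True)
with zipfile.ZipFile('dataset.zip', 'r') as zip_ref:
    zip_ref.extractall('data')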

Training and Inference on your data

*.tif images and the corresponding instance masks should be present under images and masks directories, respectively, within the train, val and test directories. (In order to prepare such instance masks, one could use the Fiji plugin Labkit, as suggested here.) The following shows the desired structure for the prepared data; a helper sketch for creating this skeleton follows the tree.

$data_dir
└───$project-name
    ├───train
    │   ├───images
    │   │   ├───X0.tif
    │   │   ├───...
    │   │   └───Xn.tif
    │   └───masks
    │       ├───Y0.tif
    │       ├───...
    │       └───Yn.tif
    ├───val
    │   ├───images
    │   │   └───...
    │   └───masks
    │       └───...
    └───test
        ├───images
        │   └───...
        └───masks
            └───...
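
A small helper (a minimal sketch, assuming the layout above; names are illustrative) can create this skeleton before copying data into it:

from pathlib import Path

data_dir = Path('data')        # corresponds to $data_dir
project_name = 'my-project'    # corresponds to $project-name

# Create images/ and masks/ under each of train/, val/ and test/.
for split in ('train', 'val', 'test'):
    for sub in ('images', 'masks'):
        (data_dir / project_name / split / sub).mkdir(parents=True, exist_ok=True)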

Issues

If you encounter any problems, please file an issue along with a detailed description.

Citation

If you find our work useful in your research, please consider citing:

@InProceedings{lalit2021embedseg,
  title = {Embedding-based Instance Segmentation in Microscopy},
  author = {Lalit, Manan and Tomancak, Pavel and Jug, Florian},
  booktitle = {Proceedings of the Fourth Conference on Medical Imaging with Deep Learning},
  pages = {399--415},
  year = {2021},
  editor = {Heinrich, Mattias and Dou, Qi and de Bruijne, Marleen and Lellmann, Jan and Schläfer, Alexander and Ernst, Floris},
  volume = {143},
  series = {Proceedings of Machine Learning Research},
  month = {07--09 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v143/lalit21a/lalit21a.pdf},
  url = {https://proceedings.mlr.press/v143/lalit21a.html},
}

and

@article{lalit2022mia,
  title = {EmbedSeg: Embedding-based Instance Segmentation for Biomedical Microscopy Data},
  journal = {Medical Image Analysis},
  volume = {81},
  pages = {102523},
  year = {2022},
  issn = {1361-8415},
  doi = {https://doi.org/10.1016/j.media.2022.102523},
  url = {https://www.sciencedirect.com/science/article/pii/S1361841522001700},
  author = {Manan Lalit and Pavel Tomancak and Florian Jug},
}

Acknowledgements

The authors would like to thank the Scientific Computing Facility at MPI-CBG, and Matthias Arzt, Joran Deschamps and Nuno Pimpao Martins for feedback and testing. Alf Honigmann and Anna Goncharova provided the Mouse-Organoid-Cells-CBG data and annotations. Jacqueline Tabler and Diana Afonso provided the Mouse-Skull-Nuclei-CBG dataset and annotations. This work was supported by the German Federal Ministry of Research and Education (BMBF) under the codes 031L0102 (de.NBI) and 01IS18026C (ScaDS2), and by the German Research Foundation (DFG) under the codes JU3110/1-1 (FiSS) and TO563/8-1 (FiSS). P.T. was supported by the European Regional Development Fund in the IT4Innovations national supercomputing center, project number CZ.02.1.01/0.0/0.0/16013/0001791, within the Program Research, Development and Education.

The authors would like to thank the authors of the StarDist repository for several useful helper functions. The authors would also like to thank Sahar Kakavand and Marco Dalla Vecchia for feedback on the notebooks.


embedseg's Issues

Updating pip package and compatibility with ZeroCostDL4Mic/DL4MicEverywhere

Hi there!

I'm updating the EmbedSeg notebook in ZeroCostDL4Mic so we can also use it in DL4MicEverywhere with Docker containers. For the containerisation on Mac we are having some issues that could be solved with some feature changes directly in EmbedSeg. We were wondering if the following changes are easy or doable for you to deploy:

In the setup.py of EmbedSeg, imagecodecs is listed as a dependency. Do you think it would be possible to remove it from the basic installation and leave it as an extra requirement for the environment?

Also, would it be possible to update the pip package with the new version of EmbedSeg? When installing it from pip, there are multiple issues with old versions of numpy; however, when installing it directly from your repo, things work nicely.

Thank you!
Sincerely,

Esti

RuntimeError: The size of tensor a (96) must match the size of tensor b (0) at non-singleton dimension 2

I am trying to train the model with 02-train.ipynb on the available CTC data, since the pretrained model is not available online. I get the following error:

RuntimeError Traceback (most recent call last)
/Users/.../EmbedSeg/examples/2d/dsb-2018/02-train.ipynb Cell 36 line 1
----> 1 begin_training(train_dataset_dict, val_dataset_dict, model_dict, loss_dict, configs, color_map=new_cmap)

File .../EmbedSeg/criterions/my_loss.py:41, in SpatialEmbLoss.forward(self, prediction, instances, labels, center_images, w_inst, w_var, w_seed, iou, iou_meter)
37 loss = 0
39 for b in range(0, batch_size):
---> 41 spatial_emb = torch.tanh(prediction[b, 0:2]) + xym_s # 2 x h x w #TODO
42 sigma = prediction[b, 2:2 + self.n_sigma] # n_sigma x h x w
43 seed_map = torch.sigmoid(prediction[b, 2 + self.n_sigma:2 + self.n_sigma + 1]) # 1 x h x w

RuntimeError: The size of tensor a (96) must match the size of tensor b (0) at non-singleton dimension 2

Any ideas on how to solve this? Is it alternatively possible to get access to the pretrained models? Thanks!

[ISSUE] No json file for Evaluation Configuration

Describe the bug
I cannot find the json file mentioned in EmbedSeg/tree/main/examples/2d/dsb-2018/03-predict.ipynb. This file is important for configuring the evaluation process.
The code from the notebook is as follows.

torch.hub.download_url_to_file(url='https://owncloud.mpi-cbg.de/index.php/s/H1pXwhq3aO4kJK3/download',
                               dst='pretrained_model', progress=True)
import zipfile
with zipfile.ZipFile('pretrained_model', 'r') as zip_ref:
    zip_ref.extractall('')
checkpoint_path = os.path.join(project_name + '-' + 'demo', 'best_iou_model.pth')
if os.path.isfile(os.path.join(project_name + '-' + 'demo', 'data_properties.json')):
    with open(os.path.join(project_name + '-' + 'demo', 'data_properties.json')) as json_file:
        data = json.load(json_file)
        one_hot, data_type, min_object_size, n_y, n_x, avg_bg = \
            data['one_hot'], data['data_type'], int(data['min_object_size']), \
            int(data['n_y']), int(data['n_x']), float(data['avg_background_intensity'])
if os.path.isfile(os.path.join(project_name + '-' + 'demo', 'normalization.json')):
    with open(os.path.join(project_name + '-' + 'demo', 'normalization.json')) as json_file:
        data = json.load(json_file)
        norm = data['norm']

Can you provide this file in the project?

Pretrained models not found

Hello,

I found that the links to the pretrained models on this project page are 404.
Are they still available? I want to try your models on our private dataset for 3D nuclei instance segmentation.

Thank you!
Best wishes.

creating prediction without having val files

Hi,
I am trying to create/generate predictions (part 3), but my dataset lacks validation files, which prevents me from going further.
I was wondering whether there is a specific function or code that can be used to tackle this issue, or whether validation files are required by default to generate predictions?

dsb-2018/01-data.ipynb ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject

Hi,

I am trying to run the first example notebook, and I am failing at the very first cell...

Miniconda installation, creating the environment from your directions:

conda env create -f EmbedSeg_environment.yml
conda activate EmbedSegEnv
python3 -m pip install -e .
python3 -m ipykernel install --sys-prefix  --name EmbedSegEnv --display-name "EmbedSegEnv"

(instead of --user to install it into the virtualenv instead of $HOME/.local)

(EmbedSegEnv) [tru@sillage EmbedSeg]$ pip3 list |grep numpy
numpy                             1.19.4
(EmbedSegEnv) [tru@sillage EmbedSeg]$ pip3 list |grep hdm
hdmedians                         0.14.1
from tqdm import tqdm
from glob import glob
import tifffile
import numpy as np
import os
from EmbedSeg.utils.preprocess_data import extract_data, split_train_val
from EmbedSeg.utils.generate_crops import *

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-1-54e5f42b447e> in <module>
      5 import os
      6 from EmbedSeg.utils.preprocess_data import extract_data, split_train_val
----> 7 from EmbedSeg.utils.generate_crops import *

~/git/github/juglab/EmbedSeg/EmbedSeg/utils/generate_crops.py in <module>
      5 from scipy.ndimage.morphology import binary_fill_holes
      6 from scipy.spatial import distance_matrix
----> 7 import hdmedians as hd
      8 from numba import jit
      9 

/c7/home/tru/miniconda3/envs/EmbedSegEnv/lib/python3.7/site-packages/hdmedians/__init__.py in <module>
      4 
      5 from .medoid import medoid, nanmedoid
----> 6 from .geomedian import geomedian, nangeomedian

hdmedians/geomedian.pyx in init hdmedians.geomedian()

ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
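
A commonly suggested workaround for this class of numpy ABI mismatch is to rebuild hdmedians against the installed numpy, or to upgrade numpy itself (whether this resolves the issue here is untested):

pip install --upgrade numpy
pip install --force-reinstall --no-binary hdmedians hdmedians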

Where is medoid used?

Hi.

I have a question about the implementation of the medoid, which is mentioned in the paper.
I found that it is calculated in

def generate_center_image(instance, center, ids, one_hot):

but I could not find where this function is called from any other Python script.

Could you tell me how the medoid is used in the code?

TypeError: forward() missing 4 required positional arguments: 'prediction', 'instances', 'labels', and 'center_images' [BUG]

Hello. I tried the bbbc010-2012 tutorial Jupyter notebooks, but this error happened and I don't know the solution. Could you tell me what I should do?

I ran 01-data.ipynb and 02-train.ipynb. When I ran begin_training(train_dataset_dict, val_dataset_dict, model_dict, loss_dict, configs, color_map=new_cmap), the following error happened.
(screenshots of the error)

Environment

  • OS: Ubuntu 18.04
  • GPU: Tesla
  • Python 3.7, torch 1.1.0, torchvision 0.3.0, CUDA 10.0

[BUG]RuntimeError: result type Byte can't be cast to the desired output type Bool

Hi again,

When I run begin_evaluating(test_configs, verbose = False, avg_bg = avg_bg/normalization_factor) in the predict notebook,
I get the following error:

2-D `test` dataloader created! Accessing data from ../../../data/bbbc010-2012/test/
Number of images in `test` directory is 50
Number of instances in `test` directory is 50
Number of center images in `test` directory is 0
*************************
Creating branched erfnet with [4, 1] classes

0%| | 0/50 [00:01<?, ?it/s]


RuntimeError Traceback (most recent call last)
/tmp/ipykernel_33/4185926816.py in
----> 1 begin_evaluating(test_configs, verbose = False, avg_bg= avg_bg/normalization_factor)

/kaggle/input/embedsegv1/EmbedSeg/test.py in begin_evaluating(test_configs, verbose, mask_region, mask_intensity, avg_bg)
62 test(verbose = verbose, grid_x = test_configs['grid_x'], grid_y = test_configs['grid_y'],
63 pixel_x = test_configs['pixel_x'], pixel_y = test_configs['pixel_y'],
---> 64 one_hot = test_configs['dataset']['kwargs']['one_hot'], avg_bg = avg_bg, n_sigma=n_sigma)
65 elif(test_configs['name']=='3d'):
66 test_3d(verbose=verbose,

/kaggle/input/embedsegv1/EmbedSeg/test.py in test(verbose, grid_y, grid_x, pixel_y, pixel_x, one_hot, avg_bg, n_sigma)
126
127 center_x, center_y, samples_x, samples_y, sample_spatial_embedding_x, sample_spatial_embedding_y, sigma_x, sigma_y,
--> 128 color_sample_dic, color_embedding_dic = prepare_embedding_for_test_image(instance_map = instance_map, output = output, grid_x = grid_x, grid_y = grid_y, pixel_x = pixel_x, pixel_y =pixel_y, predictions =predictions, n_sigma = n_sigma)
129
130 base, _ = os.path.splitext(os.path.basename(sample['im_name'][0]))

/kaggle/input/embedsegv1/EmbedSeg/utils/utils.py in prepare_embedding_for_test_image(instance_map, output, grid_x, grid_y, pixel_x, pixel_y, predictions, n_sigma)
483 sample_spatial_embedding_y[id.item()] = add_samples(samples_spatial_embeddings, 1, grid_y - 1, pixel_y)
484 center_image = predictions[id.item() - 1]['center-image'] # predictions is a list!
--> 485 center_mask = in_mask & center_image.byte()
486
487

RuntimeError: result type Byte can't be cast to the desired output type Bool
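
A possible local fix (a sketch only; newer PyTorch refuses to implicitly cast Byte to Bool in a bitwise AND) is to cast both operands explicitly, as this toy reproduction shows:

import torch

in_mask = torch.zeros(4, 4, dtype=torch.bool)
center_image = torch.zeros(4, 4, dtype=torch.uint8)

# center_mask = in_mask & center_image.byte()   # fails: Byte can't be cast to Bool
center_mask = in_mask & center_image.bool()     # explicit casts keep dtypes consistent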

License and general questions

EmbedSeg seems promising.

  • Why not use a BSD or Apache license?
  • How does EmbedSeg compare to DenoiSeg in segmenting connected components, performance, efficiency, etc.?

RuntimeError: CUDA out of memory.

I have 4 images, and the batch size is only 1, but when I start begin_training(train_dataset_dict, val_dataset_dict, model_dict, loss_dict, configs), I get: RuntimeError: CUDA out of memory. Tried to allocate 128.00 MiB (GPU 0; 31.75 GiB total capacity; 30.71 GiB already allocated; 62.50 MiB free; 12.93 MiB cached). Please let me know how I can solve it.
Thanks

Trouble installing dependencies

Hello team, thanks for this repo! While installing EmbedSeg and creating the env, I get the error:

PackagesNotFoundError: The following packages are not available from current channels:

  - cudatoolkit=10.2

Any ideas how to bypass this?

Thanks, Ajinkya

Where is cmap_60.npy?

Hello again.

I have a question about your elaborate notebook.
I get stuck at one section, when loading cmap_60.npy.

When I tried to load it, I got FileNotFoundError: [Errno 2] No such file or directory: '../../../cmaps/cmap_60.npy'.

How can I prepare it?
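
In case the file only needs to exist, could it be regenerated, e.g. as 60 random RGB rows in [0, 1] (just a guess at the expected format)?

import numpy as np
np.save('cmap_60.npy', np.random.rand(60, 3))  # 60 random RGB colors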

How can I reduce memory for inference

Hi.

I separately ran the inference notebook (bbbc010-2012) provided by this repo, but I had a memory allocation issue.
I used a batch size of 1.

Are there any other parameters to reduce the memory requirement?

I also set normalization_factor = 32767 if data_type=='8-bit' else 255
instead of normalization_factor = 65535 if data_type=='16-bit' else 255,
but nothing changed.

[BUG] Pre-trained model unloadable

Describe the bug
In the bbbc010-2012 predict notebook, the default cell loading the pre-trained model fails.

It seems it cannot load the state_dict because the module path has changed (encoder.initial_block.conv.weight -> module.encoder.initial_block.conv.weight):

File /localscratch/miniconda3/envs/EmbedSeg/lib/python3.10/site-packages/torch/nn/modules/module.py:2152, in Module.load_state_dict(self, state_dict, strict, assign)
   2147         error_msgs.insert(
   2148             0, 'Missing key(s) in state_dict: {}. '.format(
   2149                 ', '.join(f'"{k}"' for k in missing_keys)))
   2151 if len(error_msgs) > 0:
-> 2152     raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
   2153                        self.__class__.__name__, "\n\t".join(error_msgs)))
   2154 return _IncompatibleKeys(missing_keys, unexpected_keys)

RuntimeError: Error(s) in loading state_dict for BranchedERFNet:
	Missing key(s) in state_dict: "encoder.initial_block.conv.weight", "encoder.initial_block.conv.bias", [...]
	Unexpected key(s) in state_dict: "module.encoder.initial_block.conv.weight", "module.encoder.initial_block.conv.bias", [...]

(I shortened the error; obviously it lists all the layers.)

I used PyTorch 2.1.0.
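
A common workaround for checkpoints saved from a DataParallel-wrapped model (a sketch; it assumes the checkpoint stores its weights under 'model_state_dict', and model/checkpoint_path follow the notebook) is to strip the module. prefix before loading:

import torch

state = torch.load(checkpoint_path, map_location='cpu')
state_dict = state.get('model_state_dict', state)  # assumption about the checkpoint layout
state_dict = {k[len('module.'):] if k.startswith('module.') else k: v
              for k, v in state_dict.items()}
model.load_state_dict(state_dict)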

cublas Run time error

Describe the bug
I am trying the example notebooks and successfully ran 01-data.
However, when I try the training notebook and begin training the model, it takes a long time to initialise and then I get the following error:
cublas runtime error : the GPU program failed to execute at C:/w/1/s/tmp_conda_3.7_044431/conda/conda-bld/pytorch_1556686009173/work/aten/src/THC/THCBlas.cu:259

Desktop (please complete the following information):

  • OS: Tried this on Windows 10 and Windows 11
  • Graphics NVIDIA RTX 3080

Additional context
Not sure if it's a compatibility issue with RTX 30 series cards.
I found a similar error for RTX 2080 cards on older PyTorch:
pytorch/pytorch#17334

[BUG] `workers` Parameter not Respected by DataLoaders

Describe the bug
Only 1 thread (core) is used for the dataloaders.

To Reproduce
Steps to reproduce the behavior:

  1. Spin up any of the training examples
  2. Set batch_size to something respectable, like 512
  3. Adjust workers dataloader parameter
  4. Examine CPU utilization

Expected behavior
Multiple cores get engaged and are used to feed the GPU(s).

Screenshots
Only 1 CPU Core Engaged

Desktop (please complete the following information):

  • OS: Ubuntu 20.04.2 LTS
  • Graphics 2x GeForce GTX 3090

Additional context

train_dataset_dict = create_dataset_dict(
	data_dir = data_dir, 
	project_name = project_name,  
	center = center, 
	size = train_size, 
	batch_size = train_batch_size, 
	virtual_batch_multiplier = virtual_train_batch_multiplier, 
	normalization_factor= normalization_factor,
	one_hot = one_hot,
	workers=16,
	type = 'train'
)

To help debug, from the same virtual environment I put together this dummy script:

import random
import numpy as np
from torch.utils.data import Dataset
import torch
from tqdm.auto import tqdm

class TestDS(Dataset):
    def __len__(self):
        return 5000

    def __getitem__(self, index):
        z = np.zeros((256*256))
        for i in range(256*256): z[i] = i
        return z
        

val_dataset = TestDS()
val_dataset_it = torch.utils.data.DataLoader(
    val_dataset,
    batch_size=32,
    shuffle=True,
    drop_last=True,
    num_workers=12,
    pin_memory=True
)

while True:
    for i, sample in enumerate(tqdm(val_dataset_it)):
        sample = sample.to('cuda:1')

Running the above results in proper core utilization:
Cores Properly Engaged

Even adding the following code at the head of the EmbedSeg training script does not help:

import os
os.environ["MKL_NUM_THREADS"] = "20"
os.environ["OMP_NUM_THREADS"] = "20"

Possibility to train on non-GPU machines?

Hello team, I am trying to run the 3 DSB notebooks from the examples on my MacBook (no GPU), and I run into the error FileNotFoundError: [Errno 2] No such file or directory: 'nvidia-smi': 'nvidia-smi'. This part of the code lives in EmbedSeg/utils/preprocess_data.py, specifically in def get_gpu_memory().
Is there a way to "switch off" this GPU-memory check for training on non-GPU machines?
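
For example, a guard along these lines (a hypothetical patch, not the repository's actual code) would fall back gracefully when nvidia-smi is absent:

import subprocess

def get_gpu_memory_safe():
    """Query nvidia-smi for total GPU memory; return None when no GPU driver is present."""
    try:
        out = subprocess.check_output(
            ['nvidia-smi', '--query-gpu=memory.total', '--format=csv,noheader'])
        return out.decode()
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None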

Thanks, Ajinkya
