deepcell-label's Introduction

DeepCell Label: Cloud-Based Labeling for Single-Cell Analysis

License: Apache 2.0

DeepCell Label is a web-based tool to visualize and label biological images. It can segment an image, assign cells across a time-lapse, and track divisions in multiplexed images, 3D image stacks, and time-lapse movies.

Because it runs in a browser, DeepCell Label can be used to crowdsource data labeling, or by domain experts to review, correct, and curate labels.

The site is built with React, XState, and Flask and runs locally or in the cloud.

Visit label.deepcell.org to create a project from an example file or your own .tiff, .png, or .npz. Dropdown instructions are available while working on a project in DeepCell Label.

deepcell-label's People

Contributors

ccamacho89, dependabot[bot], enricozb, geneva-miller, jannieyu, mekwarrior, ngreenwald, rossbar, tddough98, tyler4p, willgraf, ykevu

deepcell-label's Issues

edit mode visual improvements

Miscellaneous small improvements would give the pixel-editing mode a nicer feel. These include:

  • color of brush preview should match color of cell that is being annotated

  • brush should be somewhat transparent

  • annotations are currently composited on top of the image, making them difficult to see against a dim background. Maybe some combination of compositing and adjusting alpha?

  • (bonus) it would be awesome if the user could adjust the transparency of the brush/annotations. The simplest version could be toggling the overlay on/off (this is implemented in desktop caliban), but a nicer version might include a slider that sets the alpha value used when drawing the overlays

eb deploy fails for browser caliban

Amazon's Elastic Beanstalk (eb) command-line interface works to create new caliban apps, but "deploy" does not work to update existing apps with modified code. It would be awesome if we could fix this, since otherwise, updating the eb caliban app requires creating a new one, modifying the configuration settings, and then reconfiguring caliban.deepcell.org to use the newest eb site. If there is something we can do on our end to get deploy to work correctly, it should be much easier to update the app with code changes. (This would also help prevent configuration errors; these are more likely if a person must remember to change each setting every time this happens.)

Update readme with corrected docker instructions

Volume mounted should be $PWD/desktop not $PWD/caliban/desktop.

Could also include some of the pitfalls we've encountered so far? Eg, what to do if port 5900 isn't free, some of the Windows troubleshooting, etc.

refine parent assignment in trk files

Currently, the "parent" action for trk files will prevent duplicate daughters from being added to the daughters list of a parent cell (implemented in desktop in #36 and in browser in #78), but additional changes to the action could help it perform correctly/avoid bugs in a wider variety of cases.

  • parent/daughter relationship should not be assigned if parent label ever appears in same frame as daughter label, or if parent label ever appears in movie again once daughter label has appeared (implemented for browser in #78)

  • label should not be assigned as parent to itself (implemented for browser in #78)

  • frame_div should be set to the earliest frame in which any daughter appears (for edge divisions, the second daughter sometimes appears a frame after the first daughter does, which can confuse frame_div assignment) (implemented for browser in #78)

  • if a parent/daughter relationship is assigned incorrectly, and a new parent is assigned to the daughter cell(s), the daughter label should be cleared from the old, incorrect parent's lineage info
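
The rules above could be sketched roughly as follows. This is only an illustration, not the code in caliban.py: the function name `assign_parent` and the layout of the `tracks` dictionary (keys `frames`, `daughters`, `parent`, `frame_div`) are assumptions made for the example.

```python
def assign_parent(tracks, parent, daughter):
    """Try to record `daughter` as a daughter of `parent`; return True on success.

    `tracks` maps label -> {"frames": [...], "daughters": [...],
    "parent": label or None, "frame_div": int or None} (assumed layout).
    """
    if parent == daughter:
        return False  # a label cannot be its own parent

    first_daughter_frame = min(tracks[daughter]["frames"])
    # The parent must not share a frame with the daughter, nor reappear later.
    if max(tracks[parent]["frames"]) >= first_daughter_frame:
        return False

    # Clear the daughter from a previous, incorrect parent's lineage info.
    old_parent = tracks[daughter]["parent"]
    if old_parent is not None and old_parent != parent:
        old = tracks[old_parent]
        if daughter in old["daughters"]:
            old["daughters"].remove(daughter)
        remaining = [min(tracks[d]["frames"]) for d in old["daughters"]]
        old["frame_div"] = min(remaining) if remaining else None

    if daughter not in tracks[parent]["daughters"]:  # no duplicate daughters
        tracks[parent]["daughters"].append(daughter)
    tracks[daughter]["parent"] = parent

    # frame_div is the earliest frame in which any daughter appears.
    tracks[parent]["frame_div"] = min(
        min(tracks[d]["frames"]) for d in tracks[parent]["daughters"]
    )
    return True
```

Returning False instead of raising lets the caller decide how to report a rejected assignment to the annotator.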

submit button should look more like a button

The button to submit a file just looks like text. This should be visually updated to more obviously be the submit button.

Also, we should consider adding a confirmation dialog to the button in case of misclick.

Readme should have more information for users

Effects of different commands should be explained, and list of commands should be expanded as functionality expands. Will also need significant update for other Caliban use modes (eg, zstack editing).

Text Editing

Allow for the deletion/modification of lineages (parent/children/capped/etc) manually via text

rendering problems with large images

The interface gets buggy with large images. Nothing appears unless the window is resized, at which point the labels appear; this seems to only work sometimes. Perhaps enforce a maximum image size in the short term. Long term, we can decide whether it's worth the effort to support 1024x1024 images.

add single-frame versions of actions

Many of the actions in Caliban were implemented with trk editing in mind, where changing only one frame was not the desired behavior. However, in npz corrections, we often want to only modify the current frame with actions such as create or replace. Desktop caliban (npz) has added single-frame versions of many of these actions. These would be useful to have in browser caliban because they enable easier (and less mistake-prone) fixes of certain annotation errors.

Each single-frame action should:

  • be added to caliban.py as a function
  • be added to the javascript as an action
  • have display text when action is selected (ie, instead of showing "space/esc" in side info panel, it should be "space/s/esc")
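
As a rough sketch of what a single-frame variant might look like on the Python side, here is a replace action with an optional `frame` argument. The function name `replace_label` and the nested-list layout (a list of frames, each a list of rows of integer labels, standing in for the label array) are assumptions for illustration only.

```python
def replace_label(annotations, old, new, frame=None):
    """Relabel `old` as `new`; in one frame if `frame` is given, else in all.

    `annotations` is a list of frames, each a list of rows of integer labels
    (assumed layout). The data is modified in place and also returned.
    """
    frames = annotations if frame is None else [annotations[frame]]
    for image in frames:
        for row in image:
            for x, value in enumerate(row):
                if value == old:
                    row[x] = new
    return annotations
```

Keeping both behaviors in one function with a `frame` argument is just one option; separate functions per action would work equally well.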

Caliban should be able to open files with no annotations

It is possible that we will deal with npz (or trk?) files where no annotations exist. This could be because there are no objects that need to be annotated, or because objects that need annotations do not have corresponding annotations. Currently, the reshape npz (preprocessing) function in deepcell-toolbox addresses this by not saving npzs with empty annotation files. However, we may want to run "empty" annotation files through fig8 in the future, or even just check through files with caliban.

add timestamps to db rows

Would help to determine which rows in the database can be deleted without causing problems (eg, if a row hasn't been updated in a week). Not urgent but may be useful as browser caliban sees more traffic, as files won't necessarily be submitted each time they are opened (eg, demos, debugging, checking fig8 results), which will lead to rows that never get deleted from db.
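
A minimal sketch of the idea, using the standard-library `sqlite3` module as a stand-in for whatever database the app actually uses. The table name `projects`, the column names, and the helper functions are all hypothetical.

```python
import sqlite3
import time

WEEK_SECONDS = 7 * 24 * 60 * 60

def create_table(conn):
    # updated_at holds seconds since the epoch (hypothetical schema).
    conn.execute(
        "CREATE TABLE projects ("
        "id INTEGER PRIMARY KEY, state BLOB, updated_at REAL NOT NULL)"
    )

def save_state(conn, project_id, state, now=None):
    """Insert or overwrite a row and stamp it with the current time."""
    now = time.time() if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO projects (id, state, updated_at) VALUES (?, ?, ?)",
        (project_id, state, now),
    )

def delete_stale(conn, max_age=WEEK_SECONDS, now=None):
    """Delete rows not updated within `max_age` seconds; return the count."""
    now = time.time() if now is None else now
    cursor = conn.execute(
        "DELETE FROM projects WHERE updated_at < ?", (now - max_age,)
    )
    return cursor.rowcount
```

The optional `now` argument only exists to make the cleanup easy to test; in normal use both functions fall back to the current time.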

Display color overlays additional channel

Currently, each channel is treated as a greyscale image that is given color by intensity scaling. A nice potential improvement would be the ability to display color overlays in the form of RGB images that are pre-defined by the user and then loaded into the npz file. This would require 1) the ability to render RGB images, and 2) an additional dimension to separate out multiple RGB and greyscale channels from each other, so that they could be scrolled through as is currently implemented.

save checkpoint or undo feature for browser users

In desktop Caliban, users can save frequently so that if they make a mistake, they can go back to a previous save instead of starting annotation over from scratch. We don't have a way to allow this for browser users, since the file is saved only when it is submitted. Pixel-editing is a little more robust to mistakes since annotators can erase or re-draw annotations, but bulk label editing mistakes can take more work to undo.

Ideally, we would have an "undo" command that undoes the most recent command. (I'm not sure how we would implement that, or if we'd be able to extend it to being able to undo/redo multiple commands.) What may be easier to implement is a save/load checkpoint feature. One checkpoint could be stored at a time that annotators could use to reload their file from. Since this relies on annotators remembering to checkpoint their file at appropriate intervals, this is less ideal than "undo" but could still work as a compromise until undo is implemented.
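
One way both ideas could share an implementation is a bounded stack of snapshots: with a limit of 1 it behaves like a single checkpoint, with a larger limit it gives multi-step undo. The `History` class below is a hypothetical sketch, not existing Caliban code, and it assumes `max_undo` is at least 1 and that the state is small enough to deep-copy.

```python
import copy

class History:
    """Bounded undo stack of annotation snapshots (illustrative only)."""

    def __init__(self, state, max_undo=10):
        self.state = state
        self.max_undo = max_undo  # assumed to be >= 1
        self._undo = []

    def apply(self, action):
        """Snapshot the current state, then replace it with action(state)."""
        self._undo.append(copy.deepcopy(self.state))
        del self._undo[:-self.max_undo]  # keep only the newest snapshots
        self.state = action(self.state)

    def undo(self):
        """Restore the most recent snapshot; return False if none is left."""
        if not self._undo:
            return False
        self.state = self._undo.pop()
        return True
```

Full snapshots are simple but memory-hungry for large movies; storing only the changed frame per action would be a possible refinement.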

we should have more files in our test folder

We should have several files in the test folder so we can check browser caliban functionality over a range of use cases. Some variables I'd like to make sure get included are:

  • a range of image sizes

  • different data types (trk, untracked movie as npz, zstack, single frame npz) with both nuclear and cytoplasmic images

  • easy and difficult tasks

  • files at various stages of completion (ie, things that would get fixed with different sets of tools, such as the bulk mode operations vs various pixel-level tools that we haven't implemented yet)

Each file in the test folder should be included in the dropdown list on the caliban website. We may want to include a way to say which features each file has (ie, test2.npz is multichannel cytoplasm zstack, uncorrected, with shape of (slices, y, x, channels)). Including comments along those lines in the code or the browser caliban readme may also be helpful. Files will likely get added to the test folder as we find them in the course of annotating data, and the list of useful test examples may change as time goes on.

scale image display to fill available space

The html canvas element is likely to remain the same size across different jobs, but npzs and trks might have a range of different sizes. Currently, browser caliban scales these files by 2 to display them. Preferably, the javascript load_file function would include the available canvas size as an arg so that upon initialization, the python object (ZStackReview or TrackReview) would scale to fit that size. Scaling is currently the only way we have to magnify the image (until a zoom feature can be implemented), so we should use it to the fullest extent we can.
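
The scale factor itself is a one-line calculation. A sketch, with a hypothetical function name and assuming the canvas size is passed in from the javascript side:

```python
def fit_scale(image_width, image_height, canvas_width, canvas_height):
    """Largest uniform scale at which the whole image fits inside the canvas."""
    return min(canvas_width / image_width, canvas_height / image_height)
```

For example, a 640x480 image in a 1280x600 canvas is limited by its height, so it would be scaled by 600 / 480 = 1.25 rather than by 2.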

display channel names

Currently, channels are displayed by index. Being able to provide a text label for each channel and display it would help annotators know which channel is being shown.

add keybind to cycle backwards through channels

Currently, "c" advances the channel being viewed. It would be nice to add another keybind, perhaps shift+c, to cycle through channels in the other direction. That way, with npzs that contain >2 channels, contributors don't need to cycle through all of the channels to get back to a previous channel.
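
Cycling in either direction can share one helper that wraps around with the modulo operator. The function name and the `step` parameter are assumptions for this sketch.

```python
def next_channel(current, n_channels, step=1):
    """Return the channel index `step` places away, wrapping at both ends."""
    return (current + step) % n_channels
```

With three channels, `next_channel(0, 3, step=-1)` wraps around to channel 2, because Python's `%` returns a non-negative result for a positive divisor.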

favicon for browser caliban

Add a favicon to the flask app. Can use the deepcell.org favicon or a custom favicon. Minor cosmetic detail but will also put an end to "favicon not found" errors.

Swap cell masks in just one frame

Occasionally, cell tracking messes up in just one frame, such that cell 1 is misidentified as cell 2 only in that frame and vice versa. The current swap feature can't fix this because it swaps the track information between the two cells for all frames of the movie. This is an uncommon error but does happen occasionally. See cells 9 and 10 (erroneously swapped in frame 13) in attached .trk file for example.
HEK293_S0P1_Batch44.zip
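
A single-frame swap only needs to touch the pixels of one frame. A sketch, assuming the annotations are a list of frames of nested lists of integer labels (a stand-in for the label array) and using a hypothetical function name:

```python
def swap_in_frame(annotations, frame, label_a, label_b):
    """Exchange two labels in one frame only, leaving other frames untouched."""
    swap = {label_a: label_b, label_b: label_a}
    annotations[frame] = [
        [swap.get(value, value) for value in row] for row in annotations[frame]
    ]
    return annotations
```

If both labels are present in that frame, the set of frames each label appears in stays the same, so under that assumption the lineage data would not need to change.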

Separate non-adjacent cell masks

Watershed can be used when cells are touching, but sometimes a cell mask appears on opposite sides of the movie (ie, the real cell 5 is near the left edge, but a few pixels called "cell 5" are on the right edge). Currently there is no way to separate those stray pixels from the real cell using Caliban.
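
One possible approach is connected-component labeling: find the separate pieces that share a label, keep the largest, and give the rest new labels. The sketch below does this with a breadth-first search over a nested list of integer labels; the function name, the 4-connectivity choice, and the "keep the largest piece" rule are all assumptions.

```python
from collections import deque

def split_stray_pixels(image, label, next_label):
    """Keep the largest 4-connected piece of `label`; relabel the other pieces.

    `image` is a list of rows of integer labels, modified in place.
    Returns the first label value that is still unused afterwards.
    """
    height, width = len(image), len(image[0])
    seen = set()
    pieces = []
    for y in range(height):
        for x in range(width):
            if image[y][x] != label or (y, x) in seen:
                continue
            # Breadth-first search collects one connected piece.
            piece, queue = [], deque([(y, x)])
            seen.add((y, x))
            while queue:
                cy, cx = queue.popleft()
                piece.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and image[ny][nx] == label and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            pieces.append(piece)
    pieces.sort(key=len, reverse=True)
    for piece in pieces[1:]:  # every piece except the largest
        for y, x in piece:
            image[y][x] = next_label
        next_label += 1
    return next_label
```

Whether the stray pieces should get new labels or simply be erased is a design question; erasing would mean writing 0 instead of `next_label`.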

add updated color map system to TrackReview

ZStackReview has been recently updated to allow for a new color map system. Now, when viewing .trk files, we get the following error:

'TrackReview' object has no attribute 'get_array'

To fix this, add updated color map system to TrackReview.

zoom in and out

It would be useful to add ability to zoom in and out while annotating

Investigate Containerization

Use deepcell-tf as roadmap to investigate containerization. It is likely a graphics port will need to be exposed.

add keybind to set brush color to unused value

We should have a keybind in edit mode (perhaps "n") to set the brush value to an unused value. (Perhaps setting the brush preview to show the same color as highlighted cells?) This would make it easier for annotators to draw in new cells without accidentally duplicating labels.

Browser Caliban should be able to load files from bucket even if they are not in a subfolder

I'm having trouble loading the npz file we use for testing (caliban-input/test.npz, no subfolders). This may be because the landing page for Caliban has not been updated to reflect the new way of accessing files (where input_bucket, output_bucket, and folder structure are encoded in the url). The caliban website should be updated so that the dropdown list of files leads to working caliban sessions. This may not be the issue, but we should also check that caliban is able to access files even if they aren't in subfolders in the s3 bucket.

Watershed clears nearby cell masks

When using watershed to separate one mask into two (eg cell 1 -> cell 1 and cell 2), nearby cells will have portions of their segmentation masks overwritten (ie, a chunk of cell 3 that is near cell 1 goes missing). These masks should be left unmodified by watershed.
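
Whatever the cause turns out to be, one possible guard is to copy the watershed output back only where the original pixel belonged to the cell being split. The helper below is a hypothetical sketch on nested lists of integer labels, not a fix verified against caliban.py.

```python
def apply_within_mask(original, proposed, target):
    """Take labels from `proposed` only where `original` equals `target`."""
    return [
        [new if old == target else old for old, new in zip(original_row, proposed_row)]
        for original_row, proposed_row in zip(original, proposed)
    ]
```

Here `proposed` stands for the relabeled region returned by the watershed step, and `target` for the label of the cell that was being separated.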

actions should use add and del cell info helper functions

Browser caliban.py should use the cell info helper functions in action functions (eg, watershed, replace, create, etc). This will clean up the code and keep behavior consistent between these functions. Consistently using the helper functions will also make it easier to add other actions, such as the single-cell versions of several actions (implemented in the npz class in desktop/caliban.py).

Corrupted .npz file after sending to S3 bucket

The .npz file isn't being sent to the S3 bucket properly. We had the same issue before when sending .trk files, but that has been fixed. Look into the TrackReview class at load()/loadtrk() for solutions.

Add new mask to annotation

Useful in cases where annotation is incorrect (missing pixels) or user has erroneously deleted a cell mask in a frame.

This option should:

  • require user input to determine where annotation should be created
    • two clicks to determine corners of bounding box that contains cell to be annotated
    • third click to determine seed location for watershed transform
  • create new mask in that frame using largest cell value in movie + 1 (new unique mask)
  • create new lineage information corresponding to cell mask
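
Picking the new label value ("largest cell value in movie + 1") could look like this. The function name and the nested-list layout of the annotations are assumptions for the sketch.

```python
def next_unused_label(annotations):
    """Return the largest label in the movie plus one (1 for an empty movie)."""
    largest = max(
        (value for frame in annotations for row in frame for value in row),
        default=0,
    )
    return largest + 1
```

The `default=0` keeps the helper usable on files with no annotations at all, where the first new cell would get label 1.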

"c- relabel selected cell with an unused label" doesn't work in npz mode

https://github.com/vanvalenlab/caliban/blob/23e8aa39cafa4fdb3909c5e7c2945882edb3c96d/desktop/caliban.py#L352

When relabeling a cell in npz mode, all cells that have that label get moved to a new label, rather than just one of those cells.

Perhaps this isn't how we're supposed to be using c?
As it currently stands, splitting a single cell into two goes like this: we erase half of the cell, pick an arbitrary label for the new second half, then select the new cell and use c to relabel it. Both that cell and whatever other cell happens to share that label then get moved to a new, unused label.

Let me know if there's a better way to be doing this!

Highlight cell

Sometimes it is difficult to identify mislabeled pixels in an image. This can be because of low contrast between cell masks, small mistakes (eg, a single pixel annotated incorrectly in the corner of the image), or even both. Movies that are annotated incorrectly can lead to noticeable errors in tracking, most often due to the center of the "cell" shifting drastically between two frames. These errors can be time consuming to fix because they require locating the incorrect pixels.

Distinguishing cells can also be difficult in trk files with many cells. Normally, training data is made with fairly small (~30 cell) .trk files, but for benchmarking/challenges/unforeseen use cases trk files may have many (200+) cells per frame. This may also be the case for small field of view but long timescale tracking movies that have many divisions or cells crossing in/out of the movie. In these cases, the contrast between masks may be low (even with enhancements such as adjusting contrast with the scroll wheel).

A highlight cell option would help make difficult pixels more visible. Such an option would display a cell mask with a different color. Some ways this might be implemented:

  • select a single cell, use "h" to toggle highlighting of that cell
  • with no cells selected, use "h" to prompt input. type in the cell id of the cell to be highlighted. (it is rare but possible to have a few pixels labeled but to only know by inference, eg a gap in the numbering of the cells you can find easily)
  • toggle highlighting visible/invisible with h; one cell is always selected for highlighting and can be cycled through with other keys (eg, m/n). Ie, when highlighting is toggled invisible, display is the same as always and cycling through highlighted cell does nothing. when highlighting toggled visible, can use cycling keys to display one cell at a time as highlighted

Preferably, cell would be highlighted with a color distinct from the default cmap. (Bright red?)

display channels and labels simultaneously

Currently either channels or labels can be displayed, but not both. It would be great if you could draw your labels directly over the channels data, rather than over the grey background.

It looks like some sort of transformed version of the current channel can be viewed, but not the actual image itself

colormaps should be robust to different ranges of labels

Eg, an image that has labels between 100 and 120 should be just as easy to look at as an image with labels between 0 and 20. Ideally, even labels that span a wide range of values (eg, if the labels in an image were [1, 5, 10, 50, 100, 500]) should be easy to tell apart.

I'd like to avoid having user-adjustable label colors in browser caliban as they exist in desktop caliban. Since browser caliban.py masks background with black, setting vmin = min(self.tracks) might be an easy step towards a solution. There may be additional ways we can improve the quality of the colormapping.
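
A different option from the vmin adjustment is to color labels by rank rather than by value, so that gaps in the numbering do not matter. The two helper functions below are hypothetical and only compute positions in [0, 1] that a colormap could then be sampled at.

```python
def label_ranks(labels):
    """Map each distinct nonzero label to its rank 1..n; background 0 stays 0."""
    distinct = sorted(set(labels) - {0})
    ranks = {0: 0}
    for rank, label in enumerate(distinct, start=1):
        ranks[label] = rank
    return ranks

def color_positions(labels):
    """Position of each label in [0, 1], evenly spaced by rank, not by value."""
    ranks = label_ranks(labels)
    n = max(len(ranks) - 1, 1)  # avoid dividing by zero when there are no cells
    return {label: rank / n for label, rank in ranks.items()}
```

With labels [1, 5, 10, 50, 100, 500], label 10 lands at 0.5 and label 500 at 1.0, so the six cells are spread evenly across the colormap instead of five of them crowding the low end.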

Delete annotation option

Option to delete extraneous cell masks by selecting and deleting them (perhaps the x key). Delete should only remove mask from one frame at a time. (If it is useful to delete a whole track at a time, this should be a separate command.) Delete should change all of the pixels of selected value to zero in the selected frame, and remove that frame from the list of frames for that cell id in the lineage data. If that is the only frame the cell appears in, the rest of the lineage data should be deleted.
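
The behavior described above could be sketched as follows. The function name, the nested-list annotations, and the `lineage` dictionary with a `frames` list per label are assumptions made for the example.

```python
def delete_in_frame(annotations, lineage, label, frame):
    """Zero out `label` in one frame and update its lineage entry."""
    annotations[frame] = [
        [0 if value == label else value for value in row]
        for row in annotations[frame]
    ]
    info = lineage.get(label)
    if info is not None and frame in info["frames"]:
        info["frames"].remove(frame)
        if not info["frames"]:
            # The cell no longer appears anywhere, so drop its lineage data.
            del lineage[label]
```

This sketch does not touch parent or daughter references in other cells' lineage entries; those would also need cleaning up when a cell disappears entirely.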

Tracks created by watershed not the same as tracks created with c

When watershed is used to separate an annotation into two, the new annotation should be associated with a new entry in the lineage data. Currently, cells created by watershed have different behavior than cells created with "create new track". Watershed-created cells behave normally if they are replaced by an existing track.

For an example of behavior differences:
Create two new tracks in a .trk file. The cell ids should be different. Separate a cell with watershed. Without replacing the "new cell" created by watershed, create a new track with c. The cell ids of these cells will be the same. This is a problem in cases where the appropriate track to replace the watershed-created cell does not already exist.

reduce lag for browser caliban

  • try timing functions to identify slowest parts
  • look into reducing how often object needs to be pickled/unpickled
  • do connections need to be closed each time, or can we maintain a persistent connection to the db?
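
For the first point, a small timing decorator built on the standard library may be enough to find the slowest functions. The names `timed`, `timings`, and `slowest` are hypothetical helpers, not part of the existing code.

```python
import functools
import time
from collections import defaultdict

# Maps function name -> list of elapsed times in seconds.
timings = defaultdict(list)

def timed(func):
    """Record how long each call to `func` takes."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            timings[func.__name__].append(time.perf_counter() - start)
    return wrapper

def slowest(n=5):
    """Return up to `n` (name, total_seconds) pairs, slowest first."""
    totals = {name: sum(samples) for name, samples in timings.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]
```

Decorating the load, action, and pickle/unpickle functions with `@timed` and checking `slowest()` after a short session would show where the time goes.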
