This notebook contains information on the use of _deepflash2_.

Required Data Structure and Naming

Ground Truth Estimation

Input Details: deepflash2 fuses

  • binary segmentations of an image, that is, there must be a single foreground value that represents positively classified pixels
    • Segmentation pixel values: background-class: 0; foreground-class: 1 or 255
  • instance segmentations of an image (instances represent positively classified pixels)
    • Segmentation pixel values: background-class: 0; foreground-instances: 1,2,...,I
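
The pixel-value conventions above can be checked before fusing the segmentations. The following is a minimal sketch, not part of deepflash2: the function name `describe_mask` is hypothetical, and it assumes the mask is already loaded as a 2D array (nested lists or a NumPy array). A mask with only the values 0 and 1 is reported as binary, although it could equally be an instance mask with a single instance.

```python
def describe_mask(mask):
    """Classify a 2D mask (nested lists or a NumPy array) by its pixel values.

    Hypothetical helper for illustration; not a deepflash2 function.
    """
    # Collect the distinct pixel values of the mask.
    values = {int(v) for row in mask for v in row}
    foreground = sorted(values - {0})
    if not foreground:
        return "empty"      # background only
    if foreground in ([1], [255]):
        return "binary"     # single foreground value: 1 or 255
    if all(v > 0 for v in foreground):
        return "instance"   # several positive labels: 1, 2, ..., I
    return "invalid"        # negative values are not covered by the conventions


print(describe_mask([[0, 255], [255, 0]]))
print(describe_mask([[0, 1], [2, 3]]))
```

With a NumPy array, `numpy.unique(mask)` would yield the same set of values more efficiently.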

Exemplary input folder structure:

expert_segmentations  -> one parent folder
│                     
│───expert1           -> one folder per expert
│   │   mask1.png     -> segmentation masks
│   │   mask2.png
│   
└───expert2
    │   mask1.png
    │   mask2.png

All common image formats (tif, png, etc.) are supported. See imageio docs.
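
To see which experts provided a segmentation for which image, the folder layout above can be traversed with the standard library. This is an illustrative sketch only: `group_masks_by_name` is a hypothetical helper, and it assumes that masks belonging to the same image share the same file name across the expert folders, as in the example tree.

```python
from collections import defaultdict
from pathlib import Path


def group_masks_by_name(parent):
    """Map each mask file name to {expert folder name: path to that expert's mask}.

    Hypothetical helper for illustration; not a deepflash2 function.
    """
    groups = defaultdict(dict)
    for expert_dir in sorted(Path(parent).iterdir()):
        if not expert_dir.is_dir():
            continue  # skip stray files next to the expert folders
        for mask_path in sorted(expert_dir.iterdir()):
            if mask_path.is_file():
                groups[mask_path.name][expert_dir.name] = mask_path
    return dict(groups)
```

Calling `group_masks_by_name("expert_segmentations")` on the example layout would return one entry per mask name, each listing `expert1` and `expert2`.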

Training

  • Images must have unique name or ID
    • 0001.tif --> name/ID: 0001; img_5.png --> name/ID: img_5, ...
    • Arbitrary number of channels (e.g., 1 greyscale; 3 RGB)
  • Corresponding masks must start with name or ID + a mask suffix
    • Semantic segmentation mask pixel values: background-class: 0; foreground-classes: 1,2,...,C (or 255 if binary)
    • Instance segmentation mask pixel values (binary only): background-class: 0; foreground-instances: 1,2,...,I
    • 0001 -> 0001_mask.png (mask_suffix = "_mask.png")
    • 0001 -> 0001.png (mask_suffix = ".png")
    • mask suffix is inferred automatically
    • binary segmentations of an image, that is, there must be a single foreground value that represents positively classified pixels
    • instance segmentations of an image (instances represent positively classified pixels)

Exemplary input folder structure:

──images            -> one image folder
  │   0001.tif      
  │   0002.tif
──masks             -> one mask folder
  │   0001_mask.png
  │   0002_mask.png

All common image formats (tif, png, etc.) are supported. See imageio docs.
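
The naming rule (mask name = image name/ID + mask suffix) can be illustrated with a short sketch. Both functions are hypothetical and do not reproduce how deepflash2 infers the suffix internally; the inference shown here simply takes the remainder of the first mask name that starts with the image ID.

```python
from pathlib import Path


def infer_mask_suffix(image_name, mask_names):
    """Guess the mask suffix from the first mask that starts with the image ID.

    Hypothetical helper; returns None if no mask name starts with the ID.
    """
    image_id = Path(image_name).stem  # "0001.tif" -> "0001"
    for mask_name in mask_names:
        if mask_name.startswith(image_id):
            return mask_name[len(image_id):]
    return None


def pair_images_with_masks(image_names, mask_names, mask_suffix):
    """Return {image name: mask name}; raise if an expected mask is missing."""
    available = set(mask_names)
    pairs = {}
    for image_name in image_names:
        expected = Path(image_name).stem + mask_suffix
        if expected not in available:
            raise FileNotFoundError(f"No mask {expected!r} for image {image_name!r}")
        pairs[image_name] = expected
    return pairs


images = ["0001.tif", "0002.tif"]
masks = ["0001_mask.png", "0002_mask.png"]
suffix = infer_mask_suffix(images[0], masks)
print(suffix, pair_images_with_masks(images, masks, suffix))
```

Checking every image against the inferred suffix catches inconsistent naming early, before training starts.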

Prediction

  • One folder for images
    • Images must have unique name or ID
      • 0001.tif --> name/ID: 0001; img_5.png --> name/ID: img_5, ...
    • Same number of channels as training images (e.g., 1 greyscale; 3 RGB)
  • For evaluation: Corresponding masks must start with name or ID + a mask suffix
  • One folder containing trained models (ensemble)
    • Ensemble folder and models will be created during Training
      • Do not change the naming of the models
      • If you want to train different ensembles, simply rename the ensemble folder

Exemplary input folder structure:

──images            -> one image folder
  │   0001.tif      
  │   0002.tif

──masks             -> one masks folder (evaluation only)
  │   0001_mask.png
  │   0002_mask.png

──ensemble          -> one model folder
  │   Unet_resnet34_2classes-fold1.pth
  │   Unet_resnet34_2classes-fold2.pth

Train-validation-split

The train-validation split is defined by _k-fold cross-validation_ with n_splits folds.

  • n_splits is the minimum of the number of files in the dataset and max_splits (default: 5)
  • By default, the number of models per ensemble is limited to n_splits

Example for a dataset containing 15 images

  • model_1 is trained on 12 images (3 validation images)
  • model_2 is trained on 12 images (3 different validation images)
  • ...
  • model_5 is trained on 12 images (3 different validation images)

Example for a dataset containing 2 images

  • model_1 is trained on 1 image (1 validation image)
  • model_2 is trained on 1 image (1 different validation image)
  • Only two models per ensemble
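
The two examples above follow directly from the rule n_splits = min(number of files, max_splits). The sketch below reproduces the fold sizes with a plain, unshuffled k-fold split. It is an illustration under that assumption, not the deepflash2 implementation, and `kfold_splits` is a hypothetical name.

```python
def kfold_splits(files, max_splits=5):
    """Return a list of (train_files, validation_files) tuples, one per model.

    Hypothetical, unshuffled sketch of k-fold cross-validation.
    """
    if len(files) < 2:
        raise ValueError("k-fold cross-validation needs at least two files")
    n_splits = min(len(files), max_splits)
    # Fold sizes differ by at most one if len(files) is not divisible by n_splits.
    base, extra = divmod(len(files), n_splits)
    splits, start = [], 0
    for fold in range(n_splits):
        size = base + (1 if fold < extra else 0)
        validation = files[start:start + size]
        train = files[:start] + files[start + size:]
        splits.append((train, validation))
        start += size
    return splits


files = [f"{i:04d}.tif" for i in range(1, 16)]  # 15 images
for train, validation in kfold_splits(files):
    print(len(train), len(validation))
```

Every image is used for validation by exactly one model, so the validation sets of the ensemble members never overlap.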

Training Epochs and Iterations

To streamline the training process and allow an easier comparison across differently sized datasets, we decided to use the number of training iterations instead of epochs to define the length of a training cycle.

Some useful definitions (adapted from Stack Overflow):

  • Epoch: one forward pass and one backward pass of all the training examples
  • Batch size: the number of training examples in one forward/backward pass. The higher the batch size, the more memory space you'll need.
  • Iteration: One forward pass and one backward pass using [batch size] number of examples.
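
The relationship between these three terms is simple arithmetic, sketched below. The function names are hypothetical, and the sketch assumes that the last, possibly smaller, batch of an epoch still counts as one iteration.

```python
import math


def iterations_per_epoch(n_examples, batch_size):
    """Number of iterations needed to see every training example once."""
    return math.ceil(n_examples / batch_size)


def epochs_covered(n_iterations, n_examples, batch_size):
    """Approximate number of epochs a fixed number of iterations corresponds to."""
    return n_iterations * batch_size / n_examples


# 1000 training examples with a batch size of 500: 2 iterations per epoch.
print(iterations_per_epoch(1000, 500))
# The same 1000 iterations cover far more epochs on a small dataset.
print(epochs_covered(1000, n_examples=12, batch_size=4))
print(epochs_covered(1000, n_examples=1200, batch_size=4))
```

Fixing the number of iterations therefore keeps the number of weight updates constant across datasets, while the number of epochs varies with the dataset size.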