Required Data Structure and Naming

Ground Truth Estimation

Input Details: deepflash2 fuses one of the following kinds of expert segmentations:

binary segmentations of an image, that is, there must be a single foreground value that represents positively classified pixels
- Segmentation pixel values: background-class: 0; foreground-class: 1 or 255
instance segmentations of an image (instances represent positively classified pixels)
- Segmentation pixel values: background-class: 0; foreground-instances: 1,2,...,I
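
The pixel-value conventions above can be checked before running ground truth estimation. The following is a minimal sketch and not part of deepflash2: the function name `classify_mask_values` is hypothetical, and it assumes you have already collected the unique pixel values of a mask (for example with `numpy.unique`). It does not verify that instance labels are consecutive.

```python
def classify_mask_values(unique_values):
    """Classify a mask by its unique pixel values (hypothetical helper).

    Returns "empty", "binary", or "instance".
    Raises ValueError if the values match neither convention.
    """
    values = {int(v) for v in unique_values}
    if any(v < 0 for v in values):
        raise ValueError(f"negative pixel values found: {sorted(values)}")

    foreground = values - {0}  # 0 is always the background class
    if not foreground:
        return "empty"  # mask contains background only
    if foreground in ({1}, {255}):
        # A single foreground value of 1 or 255. Note that {1} could also
        # be an instance mask containing exactly one instance.
        return "binary"
    if len(foreground) > 1:
        return "instance"  # several labels: 1, 2, ..., I
    raise ValueError(f"unexpected single foreground value: {sorted(foreground)}")


print(classify_mask_values({0, 255}))
print(classify_mask_values([0, 1, 2, 3]))
```

A mask that raises `ValueError` here (for example, values `{0, 7}`) would need to be relabelled before it matches either convention.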

Exemplary input folder structure:

expert_segmentations  -> one parent folder
│                     
│───expert1           -> one folder per expert
│   │   mask1.png     -> segmentation masks
│   │   mask2.png
│   
└───expert2
    │   mask1.png
    │   mask2.png

All common image formats (tif, png, etc.) are supported. See imageio docs.
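
To confirm that a folder follows this layout, a short script can list which experts provided which masks. This is a sketch under assumptions, not deepflash2 code: `collect_expert_masks` is a hypothetical helper, and it assumes that segmentations of the same image share the same file name across expert folders, as in the example tree above.

```python
from pathlib import Path


def collect_expert_masks(parent_dir):
    """Map each mask file name to the expert folders that contain it."""
    masks = {}
    for expert_dir in sorted(Path(parent_dir).iterdir()):
        if not expert_dir.is_dir():
            continue  # ignore stray files next to the expert folders
        for mask_path in sorted(expert_dir.iterdir()):
            if mask_path.is_file():
                masks.setdefault(mask_path.name, []).append(expert_dir.name)
    return masks


def report_incomplete(masks, n_experts):
    """Return the mask names that are missing for at least one expert."""
    return sorted(name for name, experts in masks.items()
                  if len(experts) < n_experts)
```

For the example tree, `collect_expert_masks("expert_segmentations")` would map `mask1.png` and `mask2.png` each to `["expert1", "expert2"]`, and `report_incomplete(masks, 2)` would return an empty list.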

Training

Images must have unique name or ID
- 0001.tif --> name/ID: 0001; img_5.png --> name/ID: img_5, ...
- Arbitrary number of channels (e.g., 1 greyscale; 3 RGB)
Corresponding masks must start with name or ID + a mask suffix
- Semantic segmentation mask pixel values: background-class: 0; foreground-classes: 1,2,...,C (or 255 if binary)
- Instance segmentation mask pixel values (binary only): background-class: 0; foreground-instances: 1,2,...,I
- 0001 -> 0001_mask.png (mask_suffix = "_mask.png")
- 0001 -> 0001.png (mask_suffix = ".png")
- mask suffix is inferred automatically
- binary segmentations of an image, that is, there must be a single foreground value that represents positively classified pixels
- instance segmentations of an image (instances represent positively classified pixels)

Exemplary input folder structure:

──images            -> one image folder
  │   0001.tif      
  │   0002.tif
──masks             -> one mask folder
  │   0001_mask.png
  │   0002_mask.png

All common image formats (tif, png, etc.) are supported. See imageio docs.
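
The naming rule (mask name = image name/ID + mask suffix) can be checked with a few lines of code. This is an illustrative sketch, not the way deepflash2 infers the suffix: `image_id` and `pair_images_with_masks` are hypothetical helpers, and the suffix is passed in explicitly rather than inferred.

```python
from pathlib import Path


def image_id(file_name):
    """Name/ID of an image: the file name without its extension."""
    return Path(file_name).stem


def pair_images_with_masks(image_names, mask_names, mask_suffix):
    """Pair each image with the mask named <ID><mask_suffix>.

    Returns (pairs, missing): a dict image -> mask, and the list of
    images for which no matching mask was found.
    """
    available = set(mask_names)
    pairs, missing = {}, []
    for name in image_names:
        expected = image_id(name) + mask_suffix
        if expected in available:
            pairs[name] = expected
        else:
            missing.append(name)
    return pairs, missing


pairs, missing = pair_images_with_masks(
    ["0001.tif", "0002.tif"],
    ["0001_mask.png", "0002_mask.png"],
    mask_suffix="_mask.png",
)
print(pairs, missing)
```

Any image listed in `missing` has no mask that follows the naming rule and would need to be renamed or removed before training.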

Prediction

One folder for images
- Images must have unique name or ID
  - 0001.tif --> name/ID: 0001; img_5.png --> name/ID: img_5, ...
- Same number of channels as training images (e.g., 1 greyscale; 3 RGB)
For evaluation: Corresponding masks must start with name or ID + a mask suffix
- same requirements as for training
One folder containing trained models (ensemble)
- Ensemble folder and models will be created during Training
  - Do not change the naming of the models
  - If you want to train different ensembles, simply rename the ensemble folder

Exemplary input folder structure:

──images            -> one image folder
  │   0001.tif      
  │   0002.tif

──masks             -> one masks folder (evaluation only)
  │   0001_mask.png
  │   0002_mask.png

──ensemble          -> one model folder
  │   Unet_resnet34_2classes-fold1.pth
  │   Unet_resnet34_2classes-fold2.pth

Train-validation-split

The train-validation split is defined as k-fold cross-validation with n_splits folds.

n_splits is the minimum of the number of files in the dataset and max_splits (default: 5).
By default, the number of models per ensemble is limited to n_splits

Example for a dataset containing 15 images

model_1 is trained on 12 images (3 validation images)
model_2 is trained on 12 images (3 different validation images)
...
model_5 is trained on 12 images (3 different validation images)

Example for a dataset containing 2 images

model_1 is trained on 1 image (1 validation image)
model_2 is trained on 1 image (1 different validation image)
Only two models per ensemble
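
The two examples above can be reproduced with a small pure-Python sketch. It only illustrates the arithmetic of `n_splits = min(number of files, max_splits)`; the function `kfold_splits` is hypothetical, and the way deepflash2 actually assigns files to folds (for example, whether it shuffles) may differ.

```python
def kfold_splits(files, max_splits=5):
    """Return a list of (train_files, validation_files), one per model."""
    n_splits = min(len(files), max_splits)
    # Distribute the files round-robin over n_splits validation folds.
    folds = [files[i::n_splits] for i in range(n_splits)]
    splits = []
    for i, validation in enumerate(folds):
        # Each model trains on every fold except its own validation fold.
        train = [f for j, fold in enumerate(folds) if j != i for f in fold]
        splits.append((train, validation))
    return splits


for n_images in (15, 2):
    files = [f"{i:04d}.tif" for i in range(1, n_images + 1)]
    for k, (train, val) in enumerate(kfold_splits(files), start=1):
        print(f"{n_images} images, model_{k}: "
              f"{len(train)} training, {len(val)} validation")
```

Every image appears in exactly one validation fold, so each model in the ensemble is validated on images it was not trained on.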

Training Epochs and Iterations

To streamline the training process and allow an easier comparison across differently sized datasets, we decided to use the number of training iterations instead of epochs to define the length of a training cycle.

Some useful definitions (adapted from stackoverflow):

Epoch: one training pass (one forward pass and one backward pass) over all the training examples.
Batch size: the number of training examples in one forward/backward pass. The higher the batch size, the more memory space you'll need.
Iteration: One forward pass and one backward pass using [batch size] number of examples.
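
The relation between these three quantities can be written out as a short calculation. This is a sketch under one stated assumption: a final, partially filled batch still counts as one iteration. Whether deepflash2 drops or keeps such a batch is not specified here, and the function names are hypothetical.

```python
import math


def iterations_per_epoch(n_examples, batch_size):
    """Number of forward/backward passes needed to see every example once."""
    return math.ceil(n_examples / batch_size)


def epochs_for_iterations(n_iterations, n_examples, batch_size):
    """How many epochs a fixed number of iterations corresponds to."""
    return n_iterations / iterations_per_epoch(n_examples, batch_size)


# 12 training images with a batch size of 4: 3 iterations per epoch.
print(iterations_per_epoch(12, 4))
# The same 300 iterations cover fewer epochs on a larger dataset.
print(epochs_for_iterations(300, 12, 4))
print(epochs_for_iterations(300, 120, 4))
```

With a fixed iteration budget, the number of weight updates is the same for small and large datasets, which is what makes training runs comparable across dataset sizes.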
