Dependence of deep learning-based whole organ segmentation on training dataset size in computed tomography (CT) images

ORAL

Abstract

A significant drawback of deep learning-based medical image segmentation is its reliance on large amounts of labeled training data. However, few studies have characterized how model performance depends on the amount of training data provided. Here, we examine this dependence for abdominal organ segmentation in patient CT images.
Two public datasets, BTCV (N=30) and VISCERAL.eu (N=20), were used for training. A third dataset, pancreasCT (N=43), was used as an independent test set. Segmentation was performed for five abdominal organs: liver, spleen, kidneys, stomach, and pancreas. The architecture used was DeepMedic, a 3D patch-based CNN. Instances of the same network were trained on varying numbers of randomly selected training scans (N=5-50). Performance was measured with the Dice coefficient, average surface distance, and 95% Hausdorff distance.
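For concreteness, a minimal sketch of how the three reported metrics could be computed for a pair of binary organ masks is given below. This is illustrative only, not the evaluation code used in this work, and it assumes isotropic voxels; anisotropic CT spacing would be passed to the distance transforms via their sampling argument.

```python
# Illustrative sketch (not the authors' code): Dice, average surface distance,
# and 95% Hausdorff distance for two binary masks, assuming isotropic voxels.
import numpy as np
from scipy import ndimage

def dice(pred, ref):
    """Dice coefficient between two binary masks."""
    intersection = np.logical_and(pred, ref).sum()
    return 2.0 * intersection / (pred.sum() + ref.sum())

def _surface_distances(a, b):
    """Distances from each surface voxel of mask `a` to the surface of mask `b`."""
    a_surface = np.logical_xor(a, ndimage.binary_erosion(a))
    b_surface = np.logical_xor(b, ndimage.binary_erosion(b))
    # Euclidean distance of every voxel to the nearest surface voxel of `b`
    # (pass sampling=voxel_spacing here for anisotropic CT grids).
    dist_to_b = ndimage.distance_transform_edt(~b_surface)
    return dist_to_b[a_surface]

def average_surface_distance(pred, ref):
    """Symmetric average surface distance."""
    d1 = _surface_distances(pred, ref)
    d2 = _surface_distances(ref, pred)
    return (d1.sum() + d2.sum()) / (d1.size + d2.size)

def hausdorff_95(pred, ref):
    """Symmetric 95th-percentile Hausdorff distance."""
    d1 = _surface_distances(pred, ref)
    d2 = _surface_distances(ref, pred)
    return max(np.percentile(d1, 95), np.percentile(d2, 95))
```

In a study of this design, such per-organ metrics would be averaged over the test scans for each training-set size to trace out the performance curve.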
We observe that segmentation performance improves with increasing training dataset size, but in some cases plateaus before the whole training set is used. The absolute performance of our model is comparable to that reported in the literature while minimizing the amount of labeled data required.
This work has implications for optimizing deep learning-based image segmentation pipelines by minimizing the time spent on unnecessary dataset labeling.

Presenters

  • Daniel Huff

    Medical Physics, University of Wisconsin - Madison

Authors

  • Daniel Huff

    Medical Physics, University of Wisconsin - Madison

  • Amy J Weisman

    Medical Physics, University of Wisconsin - Madison

  • Robert Jeraj

    Medical Physics, University of Wisconsin - Madison