Vienna, Austria

ESTRO 2023

Session Item

Sunday
May 14
10:30 - 11:30
Business Suite 3-4
Physics brachytherapy
Jacob Johansen, Denmark
Poster Discussion
Brachytherapy
Deep learning-based segmentation considering observer variation - evaluation in prostate MRI for BT
Peter Bosman, The Netherlands
PD-0498

Abstract

Deep learning-based segmentation considering observer variation - evaluation in prostate MRI for BT
Authors:

Arkadiy Dushatskiy1, Peter A. N. Bosman1,5, Karel A. Hinnen2, Jan Wiersma2, Henrike Westerveld3, Bradley Pieters2, Tanja Alderliesten4

1Centrum Wiskunde & Informatica, Evolutionary Intelligence, Amsterdam, The Netherlands; 2Amsterdam UMC, University of Amsterdam, Radiation Oncology, Amsterdam, The Netherlands; 3Erasmus Medical Center, Radiation Oncology, Rotterdam, The Netherlands; 4Leiden University Medical Center, Radiation Oncology, Leiden, The Netherlands; 5Delft University of Technology, Algorithmics, Delft, The Netherlands

Purpose or Objective

Recently, we proposed a novel deep learning-based method for (semi-)automatic scan segmentation that can output multiple segmentations representing the observer variation present in the training set. Here, our goal is to verify its potential for integration into clinical practice by comparing the automatically produced segmentations to the clinically approved segmentation and to one produced by a classical deep learning method (CDLM). Specifically, we consider prostate segmentation on MRI scans acquired for brachytherapy.

Material and Methods

In contrast to a CDLM, our method can capture and exploit the observer variation inherently present in the data; for example, the produced segmentations might correspond to different observer groups. For clinical use, this means that a clinician can select the preferred segmentation among multiple automatically produced ones (here we use two), potentially requiring less or no manual correction.
    Our method uses a multi-head U-Net with a ResNeXt-50 encoder. The CDLM uses the same neural network, but with a single head. In our method, the heads are trained on separate training-data subsets obtained by an optimization algorithm. The dataset was split into 40/13/13 scans for training/validation/test (the test set was used for the evaluation study).
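The multi-head idea can be sketched as a shared trunk with several independent output heads. The sketch below is a minimal illustration, not the authors' implementation: the trunk is a tiny stand-in for the U-Net with ResNeXt-50 encoder, and the class and variable names are hypothetical.

```python
import torch
import torch.nn as nn

class MultiHeadSegmenter(nn.Module):
    """Hypothetical sketch: shared trunk, one output head per observer group."""

    def __init__(self, n_heads=2, n_classes=1):
        super().__init__()
        # Tiny stand-in trunk; the actual model is a U-Net with a
        # ResNeXt-50 encoder.
        self.trunk = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
        )
        # One 1x1-convolution output head per head; in the method described
        # above, each head is trained on its own subset of the training data.
        self.heads = nn.ModuleList(
            nn.Conv2d(16, n_classes, 1) for _ in range(n_heads)
        )

    def forward(self, x):
        features = self.trunk(x)
        # Every head sees the same features; each produces its own mask.
        return [head(features) for head in self.heads]

model = MultiHeadSegmenter(n_heads=2)
scan = torch.randn(1, 1, 64, 64)   # one single-channel MRI slice (made-up size)
masks = model(scan)
print(len(masks), masks[0].shape)  # 2 torch.Size([1, 1, 64, 64])
```

A CDLM in this sketch would be the same module with `n_heads=1`, which is why the comparison isolates the effect of the multiple heads.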
    We used MRI scans previously used for HDR prostate brachytherapy, with catheters in situ. For each scan, four prostate segmentation variants were presented: 1) the clinically used segmentation (reference); 2) the segmentation produced by a CDLM; 3-4) the two segmentations produced by our method. The segmentation variants were labeled so as not to reveal their origin, enabling an unbiased, blinded study. For each scan, an experienced radiation oncologist was asked to grade the individual slices, grade the whole volumetric prostate segmentation, and, finally, rank the presented segmentations. The grading scale ranged from 1 to 4, meaning that a segmentation: 1) should be rejected; 2) requires major manual correction; 3) requires minor manual correction; 4) can be approved without correction. For our method, we used the best grade (or rank) of the two produced segmentation variants (per slice or per scan), since, at the time of use, a clinician can choose the preferred one.
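The best-of-two selection rule described above can be made concrete with a few lines of code. This is a sketch with made-up grades; the grade scale (1-4, higher is better) follows the description above.

```python
# Grade scale from the evaluation study: higher is better.
GRADES = {1: "reject", 2: "major correction", 3: "minor correction", 4: "approve"}

def best_grade_per_slice(grades_variant_a, grades_variant_b):
    """Take the better of the two variants' grades for each slice,
    mirroring a clinician choosing the preferred variant."""
    return [max(a, b) for a, b in zip(grades_variant_a, grades_variant_b)]

# Hypothetical per-slice grades for one scan, one list per produced variant:
variant_a = [3, 4, 2, 4]
variant_b = [4, 3, 3, 3]
print(best_grade_per_slice(variant_a, variant_b))  # [4, 4, 3, 4]
```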
    We tested for statistically significant differences between the segmentation methods using the chi-squared test with significance level α=0.05.
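As a minimal sketch of such a test, the chi-squared statistic for a 2x2 contingency table (method vs. count of acceptable / not-acceptable gradings) can be computed as follows. The counts below are invented for illustration; 3.841 is the standard critical value of the chi-squared distribution with 1 degree of freedom at p=0.05.

```python
def chi_squared(table):
    """Pearson chi-squared statistic for a contingency table
    (list of rows of observed counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Invented example counts: acceptable vs. needs-correction gradings.
observed = [[30, 10],   # our method
            [22, 18]]   # CDLM
stat = chi_squared(observed)
print(round(stat, 3), stat > 3.841)  # 3.516 False
```

With these invented counts the statistic stays below the critical value, so the difference would not be significant at α=0.05, which is the same kind of comparison reported in the Results.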



Results

The figures show the main results and selected segmentation examples. In both per-slice and per-scan grading, our method produces acceptable segmentations more often than the CDLM. However, the differences between our method and the CDLM, and between our method and the reference segmentations, are not statistically significant. Our method produces segmentations that are on average ranked better (p=0.01) than both the CDLM and the reference.


Conclusion

Deep learning-based automatic segmentation can produce high-quality segmentations. Our method, which produces multiple segmentation variants instead of a single one, achieved the best results in the evaluation, being ranked better even than the reference segmentations.