Deep learning-based segmentation considering observer variation - evaluation in prostate MRI for BT
PD-0498
Abstract
Authors: Arkadiy Dushatskiy1, Peter A. N. Bosman1,5, Karel A. Hinnen2, Jan Wiersma2, Henrike Westerveld3, Bradley Pieters2, Tanja Alderliesten4
1Centrum Wiskunde & Informatica, Evolutionary Intelligence, Amsterdam, The Netherlands; 2Amsterdam UMC, University of Amsterdam, Radiation Oncology, Amsterdam, The Netherlands; 3Erasmus Medical Center, Radiation Oncology, Rotterdam, The Netherlands; 4Leiden University Medical Center, Radiation Oncology, Leiden, The Netherlands; 5Delft University of Technology, Algorithmics, Delft, The Netherlands
Purpose or Objective
Recently, we proposed a novel deep learning-based method for (semi-)automatic scan segmentation that can output multiple segmentations reflecting the observer variation present in the training set. Here, our goal is to assess its potential for integration into clinical practice by comparing the automatically produced segmentations to the clinically approved segmentation and to one produced by a classical deep learning method (CDLM). Specifically, we consider prostate segmentation on MRI scans acquired for brachytherapy.
Material and Methods
In contrast to a CDLM, our method can capture and exploit the observer variation inherently present in the data; for example, the produced segmentations may correspond to different observer groups. For clinical use, this means that a clinician can select the preferred segmentation among multiple automatically produced ones (here, two), potentially requiring less or no manual correction.
Our method uses a multi-head U-Net with a ResNeXt-50 encoder; the CDLM uses the same network but with a single head. In our method, the heads are trained on separate subsets of the training data, obtained with an optimization algorithm (a minimal sketch of the multi-head design is given below). The dataset was split into 40/13/13 scans for training/validation/testing; the test set was used for the evaluation study.
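To illustrate the multi-head idea, the following is a minimal PyTorch sketch (not the authors' code): a shared ResNeXt-50 encoder from torchvision feeds several independent decoder heads, one per observer group. The lightweight upsampling heads stand in for the full U-Net decoders with skip connections; class names and shapes are illustrative assumptions.

    import torch
    import torch.nn as nn
    from torchvision.models import resnext50_32x4d

    class MultiHeadSegmenter(nn.Module):
        def __init__(self, n_heads=2, n_classes=1):
            super().__init__()
            backbone = resnext50_32x4d(weights=None)
            # Shared encoder: everything up to the final pooling/classifier layers.
            self.encoder = nn.Sequential(*list(backbone.children())[:-2])
            # One decoder head per observer group; a real U-Net decoder with
            # skip connections would replace these simple upsampling stacks.
            self.heads = nn.ModuleList([
                nn.Sequential(
                    nn.Conv2d(2048, 256, kernel_size=3, padding=1),
                    nn.ReLU(inplace=True),
                    nn.Upsample(scale_factor=32, mode="bilinear", align_corners=False),
                    nn.Conv2d(256, n_classes, kernel_size=1),
                )
                for _ in range(n_heads)
            ])

        def forward(self, x):
            features = self.encoder(x)
            # One segmentation logit map per head.
            return [head(features) for head in self.heads]

    model = MultiHeadSegmenter(n_heads=2)
    logits_per_head = model(torch.randn(1, 3, 256, 256))  # two candidate segmentations

During training, each head would receive gradients only from its assigned training subset, so the heads specialize toward different annotation styles while sharing one encoder.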
We used MRI scans previously acquired for HDR prostate brachytherapy with catheters in situ. For each scan, four prostate segmentation variants were presented: 1) the clinically used segmentation (reference); 2) the segmentation produced by the CDLM; 3-4) the two segmentations produced by our method. The variants were labeled so as not to reveal their origin, enabling an unbiased, blinded study. For each scan, an experienced radiation oncologist was asked to grade the individual slices and the whole volumetric prostate segmentation, and finally to rank the presented segmentations. Grades ranged from 1 to 4, meaning that a segmentation: 1) should be rejected; 2) requires major manual correction; 3) requires minor manual correction; 4) can be approved without correction. For our method, we use the better grade (or rank) of the two produced segmentation variants (per slice or per scan) because, at the time of use, a clinician can choose the preferred one.
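The best-of-two scoring rule can be stated in a few lines of Python (illustrative only; the function name is ours, not from the study):

    # Since the clinician can pick the preferred variant, the multi-head method
    # is credited with the better of its two grades (1=reject ... 4=approve as-is).
    def best_grade(grade_head_a: int, grade_head_b: int) -> int:
        return max(grade_head_a, grade_head_b)

    # Example: head A needs minor correction (3), head B can be approved (4).
    assert best_grade(3, 4) == 4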
We tested for statistically significant differences between the segmentation methods using the chi-squared test at a significance level of p=0.05.
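For concreteness, a chi-squared comparison of two methods' grade distributions could be run as follows with SciPy; this is a sketch of the assumed setup, and the counts are placeholders, not study data.

    from scipy.stats import chi2_contingency

    # Rows: methods; columns: number of slices graded 1, 2, 3, 4.
    contingency = [
        [2, 5, 20, 73],   # hypothetical counts for method A
        [4, 8, 25, 63],   # hypothetical counts for method B
    ]
    chi2, p_value, dof, expected = chi2_contingency(contingency)
    significant = p_value < 0.05  # significance level used in the study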
Results
The figures show the main results and example segmentations. In both per-slice and per-scan grading, our method produces acceptable segmentations more often than the CDLM. However, the differences between our method and the CDLM, and between our method and the reference segmentations, are not statistically significant. Our method produces segmentations that are on average ranked better (p-value=0.01) than both the CDLM and the reference.
Conclusion
Deep learning-based automatic segmentation can produce high-quality segmentations. Our method, which produces multiple segmentation variants instead of a single one, was found to give the best results in the evaluation, ranking even better than the reference segmentations.