AI auto-segmentation for MRgRT of prostate cancer: evaluating 269 MR images from two institutes
PD-0067
Abstract
Authors: Maria Kawula1, Indrawati Hadi2, Davide Cusumano3, Luca Boldrini3, Lorenzo Placidi4, Stefanie Corradini1, Claus Belka1,5, Guillaume Landry1, Christopher Kurz1
1University Hospital, LMU Munich, Radiation Oncology, Munich, Germany; 2University Hospital, LMU Munich, Radiation Oncology, Munich, Germany; 3Fondazione Policlinico Universitario “Agostino Gemelli” IRCCS, Radiation Oncology, Rome, Italy; 4Fondazione Policlinico Universitario “Agostino Gemelli” IRCCS, Radiation Oncology, Rome, Italy; 5German Cancer Consortium (DKTK), Munich, Germany
Purpose or Objective
The introduction of MR-Linacs into clinics has enabled
online adaptive radiotherapy at the cost of longer workflows, notably
due to the need for online recontouring. The aims of this work were (1)
to develop an AI-based segmentation of organs at risk (OARs) and
the CTV for prostate cancer treatments at the 0.35 T
MRIdian, (2) to examine the transferability of trained models between
institutes, and (3) to compare the fraction contours propagated by the
MRIdian treatment planning system (TPS) with the AI predictions.
Material and Methods
MR images of 19 prostate cancer patients (19 planning + 240 fraction
images) treated at our institution (cohort 1, C1) and 73 planning images
acquired at a collaborating
institution (cohort 2, C2) were included. The bladder, rectum and CTV
were manually segmented on planning MRIs by radiation oncologists, while
fraction contours were propagated by the TPS and corrected by
physicians shortly before the irradiation. We trained a 3D U-Net on C2
planning data and tested the network performance using the Dice
similarity coefficient (DSC) and the average and 95th percentile Hausdorff distances (HDavg and HD95) on three datasets: (i)
10 C2 planning images not used for training, (ii) 19 C1 planning
images, and (iii) 240 C1 fraction images. For the rectum, we evaluated
slices up to 1.5 cm above/below the PTV top/bottom. Additionally, for 5
C1 patients with 5 fractions each, we propagated the manual planning
contours to the anatomy of the day without further corrections using a simulated workflow in the
TPS. Finally, we divided the CTV test set
into subgroups of grade I&II (10%) and III&IV (90%) cases, due
to differences in inclusion of seminal vesicles. Post-prostatectomy
patients were excluded from the CTV analysis.
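
The evaluation metrics named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how DSC, HDavg and HD95 were computed, so the surface-based definitions, the symmetric treatment of the two contours, the voxel spacing argument, and all function names below (including the hypothetical helper crop_to_ptv_extent for the rectum slice restriction) are assumptions made for illustration.

```python
import numpy as np
from scipy import ndimage


def dice(a, b):
    """Dice similarity coefficient of two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    # Two empty masks are treated as a perfect match (assumption).
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0


def surface(mask):
    """Boundary voxels: mask voxels with at least one background neighbour."""
    mask = mask.astype(bool)
    return mask & ~ndimage.binary_erosion(mask)


def surface_distances(a, b, spacing):
    """Distances from each surface voxel of a to the nearest surface voxel of b."""
    sa, sb = surface(a), surface(b)
    # The EDT returns the distance to the nearest zero voxel, so invert sb.
    dist_to_b = ndimage.distance_transform_edt(~sb, sampling=spacing)
    return dist_to_b[sa]


def hausdorff_metrics(a, b, spacing=(1.0, 1.0, 1.0)):
    """Return (HDavg, HD95) in the units of spacing.

    Assumed definitions: HD95 is the larger of the two directed
    95th percentile surface distances; HDavg is the mean of the two
    directed average surface distances. Both masks must be non-empty.
    """
    d_ab = surface_distances(a, b, spacing)
    d_ba = surface_distances(b, a, spacing)
    hd95 = max(np.percentile(d_ab, 95), np.percentile(d_ba, 95))
    hd_avg = (d_ab.mean() + d_ba.mean()) / 2.0
    return hd_avg, hd95


def crop_to_ptv_extent(mask, ptv, slice_thickness_mm, margin_mm=15.0, axis=0):
    """Keep only slices within margin_mm of the PTV extent along axis.

    Hypothetical helper mirroring the 1.5 cm rectum restriction; the
    margin is rounded to a whole number of slices.
    """
    other_axes = tuple(i for i in range(ptv.ndim) if i != axis)
    idx = np.where(ptv.astype(bool).any(axis=other_axes))[0]
    margin = int(round(margin_mm / slice_thickness_mm))
    lo = max(idx[0] - margin, 0)
    hi = min(idx[-1] + margin, mask.shape[axis] - 1)
    out = np.zeros_like(mask)
    sl = [slice(None)] * mask.ndim
    sl[axis] = slice(lo, hi + 1)
    out[tuple(sl)] = mask[tuple(sl)]
    return out
```

In such a setup, the rectum masks (prediction and reference) would first be passed through crop_to_ptv_extent with the image's slice thickness, and dice and hausdorff_metrics would then be evaluated on the cropped masks with the voxel spacing in mm.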
Results
For OARs, the mean DSC, HDavg, and HD95
for C2 and C1 planning images were comparable, while the performance
for fractions decreased slightly (see Table 1 and Fig. 1). CTV
predictions showed higher network performance for C2 than C1 data and
higher performance for grade III&IV cases than for grade I&II cases. For the
bladder, apart from one case, network predictions were better than the
TPS-propagated contours, although both achieved an average DSC of 0.91 (0.11). The outlier cases were related to patients with limited bladder filling, which were absent from the C2 training set. For the rectum, average DSCpred = 0.86 (0.15) and DSCprop = 0.88 (0.16) were obtained.
Conclusion
Results for OARs suggest model transferability between institutes. However, this does
not apply to the CTV. Worse scores for fraction images might suggest higher
contour variability caused by time pressure during adaptation. The CTV
model performs poorly for grades ≤ II,
suggesting that separate training may be required. TPS-propagated
contours show comparable quality to the network predictions; however,
the analysis may be biased in favor of propagated contours, which were the basis for the manual corrections leading to the ground truth.
Acknowledgments:
Wilhelm Sander-Stiftung