Vienna, Austria

ESTRO 2023

Session Item

Saturday
May 13
10:30 - 11:30
Lehar 1-3
Autosegmentation & automation for QA
Daniel Sandys, United Kingdom;
Jan Lagendijk, The Netherlands
1250
Proffered Papers
Physics
11:30 - 11:40
Deep Learning Segmentation of Cardiac Substructures in Radiotherapy Planning
Leonard Nuernberg, The Netherlands
OC-0122

Abstract

Deep Learning Segmentation of Cardiac Substructures in Radiotherapy Planning
Authors:

Leonard Nürnberg (1), Dennis Bontempi (1), Karolien Verhoeven (1), Hayian Zeng (1), Richard Canters (1), Enrique Hortal Quesada (2), Francesca Romana Giglioli (3), Umberto Ricardi (4), Mario Levis (4), Dirk De Ruysscher (1), Alberto Traverso (1)

(1) GROW School for Oncology and Reproduction, Maastricht University Medical Centre+, Department of Radiation Oncology (Maastro), Maastricht, The Netherlands; (2) Maastricht University, Department of Data Science and Knowledge Engineering, Maastricht, The Netherlands; (3) A.O.U. Città della Salute e della Scienza di Torino, Medical Physics Unit, Turin, Italy; (4) University of Turin, Department of Oncology, Turin, Italy

Purpose or Objective

The radiosensitivity of the different substructures of the heart varies, so dose should be evaluated per substructure. However, manual segmentation of these substructures in planning computed tomography (pCT), which is required for dose re-calculations, is time-consuming and prone to inter- and intra-observer variability. In this study, we aim to develop a deep learning (DL) solution for segmenting the cardiac substructures in pCT.

Material and Methods

A cohort of 180 lymphoma and lung cancer patients treated with radiotherapy (RT) was retrospectively collected. Left and right ventricles (LV, RV), left and right atria (LA, RA), right coronary artery (RCA), left anterior descending artery (LAD), and the circumflex branch of the left coronary artery (CFLX) were manually annotated. The training, validation, and testing split was 70%/15%/15%. We extracted 53 axial 149x149 slices for each patient, limited to the maximum cardiac border, and developed seven individual 2D U-Net models, one per region of interest (ROI). We investigated the effect of heart masking (removing all CT data outside the heart contour) and of four different loss functions: binary cross-entropy loss (BCEL), binary Dice score loss (BDL), binary focal loss (BFL), and the combo loss (CL). We explored different thresholds (TH) for generating binary segmentation masks. As an evaluation metric, we used the Surface Dice score (SDS) with a tolerance of 3 mm for small ROIs and 5 mm for large ROIs. Two radiation oncologists (RO) conducted a quality assessment for all the automatic annotations obtained on the test set. Each RO estimated the time spent on modifications, which we compared to the original delineation time.
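The preprocessing and training choices described above can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the function names, the fill value of -1000 (assumed to stand for air in Hounsfield units), the focal parameter gamma, and in particular the combo loss written as a plain weighted sum of cross-entropy and Dice loss are all assumptions made for illustration, since the abstract does not give these details.

```python
import math

EPS = 1e-7  # numerical guard for log(0); an assumption of this sketch


def apply_heart_mask(ct_slice, heart_mask, fill_value=-1000.0):
    """Replace every pixel outside the heart contour with fill_value."""
    return [
        [v if m else fill_value for v, m in zip(row_v, row_m)]
        for row_v, row_m in zip(ct_slice, heart_mask)
    ]


def bce_loss(probs, targets):
    """Mean binary cross-entropy over flattened pixels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, EPS), 1.0 - EPS)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 minus the smoothed Dice coefficient."""
    inter = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * inter + smooth) / (denom + smooth)


def focal_loss(probs, targets, gamma=2.0):
    """Mean binary focal loss; down-weights pixels that are already easy."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, EPS), 1.0 - EPS)
        p_t = p if t == 1 else 1.0 - p
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(probs)


def combo_loss(probs, targets, alpha=0.5):
    """Assumed formulation: weighted sum of cross-entropy and Dice loss."""
    return alpha * bce_loss(probs, targets) + (1.0 - alpha) * dice_loss(probs, targets)


def binarize(probs, threshold=0.5):
    """Turn predicted probabilities into a binary segmentation mask."""
    return [1 if p >= threshold else 0 for p in probs]


# Toy example: four pixels, two of which belong to the structure.
probs = [0.9, 0.6, 0.4, 0.1]
targets = [1, 1, 0, 0]
mask_default = binarize(probs, threshold=0.5)  # [1, 1, 0, 0]
mask_lower = binarize(probs, threshold=0.3)    # [1, 1, 1, 0]
```

In the toy example, lowering the threshold from 0.5 to 0.3 adds a third foreground pixel, which is the mechanism by which a lower TH can enlarge a contour.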

Results

The average SDS for a test patient was 0.74 ± 0.21. The results per ROI were LV (0.88 ± 0.07), LA (0.88 ± 0.09), RA (0.87 ± 0.09), RV (0.83 ± 0.08), LAD (0.70 ± 0.16), CFLX (0.56 ± 0.27), and RCA (0.48 ± 0.18). The best results were obtained when masking was applied. The influence of the loss function differed for each ROI. While BCEL and BDL led to better results for the large ROIs (RV, LV, RA, LA), only the CL led to usable results for the small ones (RCA, LAD, CFLX). The selection of a lower TH led to better SDS for the RCA (p < 0.05) but caused over-contouring, as revealed by the RO assessment. All large ROIs (100%) and 55% of small ROIs were rated acceptable by all the ROs. Disagreement between experts on the acceptance of a ROI was only found for the LAD, CFLX, and RCA, in 9%, 18%, and 41% of cases, respectively. Using the automatic annotations as a prior could save ROs about 20 minutes per patient (> 50% of the original delineation time).
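As a quick arithmetic check on the reported scores, the per-ROI means can be averaged directly. The abstract does not state how the overall average was computed; the sketch below only shows that an unweighted mean of the seven per-ROI means agrees with the reported 0.74 at two decimals. The group means for large and small ROIs are derived here for illustration and are not reported in the abstract.

```python
from statistics import mean

# Per-ROI mean Surface Dice scores as reported in the abstract.
sds_per_roi = {
    "LV": 0.88, "LA": 0.88, "RA": 0.87, "RV": 0.83,
    "LAD": 0.70, "CFLX": 0.56, "RCA": 0.48,
}
large_rois = ("LV", "RV", "LA", "RA")
small_rois = ("LAD", "CFLX", "RCA")

overall_mean = mean(sds_per_roi.values())             # 0.74 at two decimals
large_mean = mean(sds_per_roi[r] for r in large_rois)  # derived, not reported
small_mean = mean(sds_per_roi[r] for r in small_rois)  # derived, not reported
```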

Conclusion

We trained a DL pipeline that delineates the LV, RV, LA, and RA with high clinical acceptance. We propose heart masking during preprocessing for all ROIs and highlight the CL for the smaller ones. Using our model, delineation times for all ROIs can be significantly reduced. We find that evaluation metrics cannot replace clinical evaluation: for the LA, RA, and CFLX, we observed higher mean SDS values at lower rated quality levels.
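To make concrete what the SDS does and does not capture, here is a minimal 2D, pixel-based sketch of a Surface Dice computation in standard-library Python. It is an illustrative approximation, not the implementation used in the study: the boundary definition (foreground pixels with a background or out-of-image 4-neighbour), the brute-force distance search, and the function names are assumptions.

```python
import math


def boundary_pixels(mask):
    """Foreground pixels with at least one background or out-of-image 4-neighbour."""
    rows, cols = len(mask), len(mask[0])
    points = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                outside = not (0 <= rr < rows and 0 <= cc < cols)
                if outside or not mask[rr][cc]:
                    points.append((r, c))
                    break
    return points


def surface_dice(mask_a, mask_b, tolerance_mm, spacing_mm=(1.0, 1.0)):
    """Fraction of boundary pixels of both masks lying within tolerance of the other boundary."""
    surf_a = boundary_pixels(mask_a)
    surf_b = boundary_pixels(mask_b)
    if not surf_a and not surf_b:
        return 1.0
    if not surf_a or not surf_b:
        return 0.0

    def count_within(src, dst):
        count = 0
        for r1, c1 in src:
            if any(
                math.hypot((r1 - r2) * spacing_mm[0], (c1 - c2) * spacing_mm[1]) <= tolerance_mm
                for r2, c2 in dst
            ):
                count += 1
        return count

    matched = count_within(surf_a, surf_b) + count_within(surf_b, surf_a)
    return matched / (len(surf_a) + len(surf_b))


# Two 3x3 squares on a 5x5 grid, the second shifted by one pixel.
square_a = [[1 if 1 <= r <= 3 and 1 <= c <= 3 else 0 for c in range(5)] for r in range(5)]
square_b = [[1 if 1 <= r <= 3 and 2 <= c <= 4 else 0 for c in range(5)] for r in range(5)]
sds_strict = surface_dice(square_a, square_b, tolerance_mm=0.0)   # 0.5
sds_lenient = surface_dice(square_a, square_b, tolerance_mm=1.0)  # 1.0
```

With a 1 mm tolerance, the shifted square scores a perfect 1.0 although the two contours differ, which illustrates why a tolerance-based boundary metric alone may not reflect what a clinician would accept.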