Vienna, Austria

ESTRO 2023

Session Item

Saturday
May 13
10:30 - 11:30
Lehar 1-3
Autosegmentation & automation for QA
Daniel Sandys, United Kingdom;
Jan Lagendijk, The Netherlands
Proffered Papers are presented in one of the sessions scheduled in the main session halls. Each author will present orally for 7 minutes, followed by 3 minutes for discussion. Sessions will be recorded and available on-demand.
Proffered Papers
Physics
10:40 - 10:50
First results on DAHANCA automatic segmentation algorithms of organs at risk
Ebbe Laugaard Lorenzen, Denmark
OC-0118

Abstract

Authors:

Ebbe Laugaard Lorenzen1, Ruta Zukauskaite2, Martin Kyndt3, Jesper Grau Eriksen4, Nis Sarup5, Jørgen Johansen2, Christian Maare6, Hanne Primdahl7, Åse Bratland8, Claus Andrup Kristensen9, Maria Andersen10, Jens Overgaard4, Carsten Brink1, Christian Rønn Hansen1

1Odense University Hospital, Laboratory of Radiation Physics, Department of Oncology, Odense, Denmark; 2Odense University Hospital, Department of Oncology, Odense, Denmark; 3MIM Software Inc., EU Office, Brussel, Belgium; 4Aarhus University Hospital, Department of Experimental Clinical Oncology, Aarhus, Denmark; 5Odense University Hospital, Laboratory of Radiation Physics, Department of Oncology, Odense, Denmark; 6Copenhagen University Hospital Herlev, Department of Oncology, Copenhagen, Denmark; 7Aarhus University Hospital, Department of Oncology, Aarhus, Denmark; 8Oslo University Hospital, The Norwegian Radium Hospital, Oslo, Norway; 9Copenhagen University Hospital/Rigshospitalet, Department of Oncology, Copenhagen, Denmark; 10Aalborg University Hospital, Department of Oncology, Aalborg, Denmark

Purpose or Objective

Automatic segmentation of organs at risk (OAR) has high potential; ideally, its precision should be comparable to that of clinical experts. The purpose of this study was to train both an open-source and a commercial automatic segmentation method for 16 OAR (shown in Figure 1A) according to the Danish Head and Neck Cancer Study Group (DAHANCA) guidelines and to validate both algorithms by comparison with the inter-observer variability among clinical DAHANCA experts.

Material and Methods

CT scans and clinical delineations from 600 patients from six centres in the DAHANCA 19 randomized study were included. These data were randomly divided into a validation set (N=70) and training batches of increasing size (total N=530). This abstract presents the first results, obtained with a training set of 50 patients. An experienced oncologist manually curated the training data to ensure adherence to the DAHANCA guidelines. The final test of the automatic segmentation algorithms was performed (only once) on a test set (N=26) with multiple independent delineations by clinical experts from all the DAHANCA centres (median 9 observers per patient). Two convolutional networks were trained: a 3D full-resolution network using the nnU-net open-source framework (nnU-net) and a U-net-like network developed in collaboration with MIM Software (MIM). Once both networks were considered final based on evaluation in the validation set, they were evaluated in the test set using multiple metrics, including the mean surface distance (MSD). For each segmentation under evaluation (both manual and automatic), metrics were calculated pairwise against all remaining manual segmentations for that organ and patient (see Figure 1B). The median of these metrics was then assigned to each segmentation. Differences in metrics between manual and automatic segmentations were evaluated with the Mann-Whitney U test.
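The evaluation scheme described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the representation of each surface as a list of points, the particular MSD definition (average of the two directed mean distances) and the toy coordinates are all assumptions made for illustration. Only the U statistic is computed; in practice a p-value could be obtained with, for example, scipy.stats.mannwhitneyu.

```python
from itertools import product
from math import dist
from statistics import median


def mean_surface_distance(surf_a, surf_b):
    """Symmetric MSD between two surfaces given as lists of points.

    Assumption: MSD is the average of the two directed mean
    nearest-point distances; other definitions exist.
    """
    a_to_b = [min(dist(p, q) for q in surf_b) for p in surf_a]
    b_to_a = [min(dist(q, p) for p in surf_a) for q in surf_b]
    return (sum(a_to_b) / len(a_to_b) + sum(b_to_a) / len(b_to_a)) / 2


def median_pairwise_metric(candidate, manual_surfaces, metric=mean_surface_distance):
    """Median of the metric between `candidate` and all remaining manual surfaces.

    A manual candidate is excluded from its own comparison set (by identity);
    an automatic candidate is compared against every manual surface.
    """
    others = [m for m in manual_surfaces if m is not candidate]
    return median(metric(candidate, m) for m in others)


def mann_whitney_u(x, y):
    """Mann-Whitney U statistic of sample x versus y (ties count one half)."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a, b in product(x, y))


# Hypothetical toy data: three manual delineations and one automatic
# delineation of the same organ in one patient, as 2D point lists.
manual = [
    [(0, 0), (1, 0)],
    [(0, 1), (1, 1)],
    [(0, 2), (1, 2)],
]
automatic = [(0, 1), (1, 1)]

# One median score per segmentation under evaluation.
manual_scores = [median_pairwise_metric(m, manual) for m in manual]
auto_scores = [median_pairwise_metric(automatic, manual)]

# Compare the automatic scores with the manual (inter-observer) scores.
u_statistic = mann_whitney_u(auto_scores, manual_scores)
print(manual_scores, auto_scores, u_statistic)
```

In a real analysis the scores would presumably be pooled per organ across all test patients before the test is applied, and the surfaces would be extracted from the delineated volumes rather than typed in by hand.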


Results

Both networks performed well and segmented all OAR with a precision comparable to that of clinical experts. The median MSDs are shown in Figure 2. For most organs, there was no statistically significant difference in precision between the experts and the automatic segmentations. Both networks were significantly worse than the experts for the PCM_Up and the LarynxSG but significantly better for the PCM_low and Oral Cavity. In addition, the nnU-net was better than the experts in the delineation of the Lips. The oesophagus is not shown in the figures; both algorithms segmented its entire length, whereas the experts only segmented the oesophagus in the head and neck region. As a result, the MSD differences were substantial even though the segmentations were very similar in the region relevant to HN cancer radiotherapy.

Conclusion

Even though both networks were trained on only 50 patients, they segmented all organs with a precision similar to that of clinical experts. Following publication, both networks will be made publicly available: the nnU-net network will be freely available for all to download, and MIM will distribute its network to customers eligible for Protege AI after regulatory approval.