Vienna, Austria

ESTRO 2023

Session Item

Automation
Poster (Digital)
Physics
Clinical evaluation of deep learning-based nodal structures segmentation for gynecological cancers
Shrikant Deshpande, Australia
PO-1633

Abstract

Authors:

Shrikant Deshpande1,3,4, Phillip Chlap2,5,6, Robert Finnegan7,5, Daniel Al Mouiee3,8, Lois Holloway6,3,4,9,10, Karen Lim6,4, Viet Do6,4

1Liverpool & Macarthur Cancer Therapy Centres, SWSLHD Cancer Services, Liverpool, NSW, Radiation Oncology, Sydney, Australia; 2UNSW Medicine & Health, South-West Sydney Clinical Campus, UNSW, School of Clinical Medicine, Sydney, Australia; 3Ingham Institute for Applied Medical Research, Liverpool, NSW, Medical Physics, Sydney, Australia; 4UNSW Medicine & Health, South-West Sydney Clinical Campus, UNSW, School of Clinical Medicine, Sydney, Australia; 5Ingham Institute for Applied Medical Research, Medical Physics, Sydney, Australia; 6Liverpool & Macarthur Cancer Therapy Centres, SWSLHD Cancer Services, Liverpool, NSW, Radiation Oncology, Sydney, Australia; 7Royal North Shore Hospital, St Leonards, NSW, Radiation Oncology, Sydney, Australia; 8UNSW Medicine & Health, South-West Sydney Clinical Campus, UNSW Sydney, School of Clinical Medicine, Sydney, Australia; 9Institute of Medical Physics, University of Sydney, NSW, Medical Physics, Sydney, Australia; 10University of Wollongong, Wollongong, NSW, Centre for Medical Radiation Physics, Sydney, Australia

Purpose or Objective

Manual delineation of nodal target structures is time-consuming and prone to inter- and intra-observer variability. The deep learning (DL) framework nnUNet has proven versatile across a range of clinical sites and imaging modalities. In this work we investigate the feasibility of using nnUNet, trained on local datasets, for auto-segmentation of clinical structures. The purpose of this study was to evaluate the accuracy and consistency of nodal clinical target volume (CTV) definition derived from DL-based auto-segmentation compared with manual delineation for gynecological cancers.

Material and Methods

A framework for training, evaluating and deploying auto-segmentation models using nnUNet was implemented. Using nnUNet in our framework removes the need for network adaptation and hyper-parameter tuning per clinical site. As a use case, automatic delineation of three CTV structures was performed using a dataset of 51 computed tomography (CT) scans from gynecological cancer patients. The dataset was randomly split into 41 cases for training and 10 for validation. The network was trained to perform multi-label prediction for the internal iliac, common iliac and obturator nodes using four approaches: 2D, 3D low resolution, 3D high resolution, and an ensemble model (a composite of the first three methods). Automatic segmentations derived from each of the four methods were evaluated against manual contours for the 10 CT validation scans using the Dice Similarity Coefficient (DSC), the 95th percentile of the Hausdorff Distance (HD95) and the Mean Distance to Agreement (MDA). Each DL-segmented contour was visually reviewed by a physicist to assess deviations from the ground-truth manual contours and to identify the regions where these deviations occur.
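For readers unfamiliar with the three metrics, the following is a minimal sketch of how DSC, HD95 and MDA can be computed for a pair of binary masks. It is an illustration under stated assumptions, not the evaluation code used in this study: masks are represented as sets of (x, y, z) voxel indices, the surface is taken as the voxels with a 6-connected neighbour outside the mask, HD95 is taken as the larger of the two directed 95th-percentile distances (one common convention; the abstract does not state which was used), and the function names and the `spacing` parameter are hypothetical.

```python
import math

# Illustrative only: masks are sets of (x, y, z) voxel indices.
# All names here are hypothetical and not taken from the study's framework.

NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def dice(a, b):
    """Dice Similarity Coefficient: 2|A n B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # two empty masks agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))


def surface(mask):
    """Voxels that have at least one 6-connected neighbour outside the mask."""
    return {
        v for v in mask
        if any((v[0] + dx, v[1] + dy, v[2] + dz) not in mask
               for dx, dy, dz in NEIGHBOURS)
    }


def directed_distances(src, dst, spacing):
    """Distance (in mm) from each voxel in src to its nearest voxel in dst."""
    return [
        min(math.sqrt(sum(((p[i] - q[i]) * spacing[i]) ** 2 for i in range(3)))
            for q in dst)
        for p in src
    ]


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def hd95_and_mda(a, b, spacing=(1.0, 1.0, 1.0)):
    """Return (HD95, MDA) in mm between the surfaces of masks a and b."""
    if not a or not b:
        raise ValueError("both masks must be non-empty")
    sa, sb = surface(a), surface(b)
    d_ab = directed_distances(sa, sb, spacing)
    d_ba = directed_distances(sb, sa, spacing)
    # Assumed convention: maximum of the two directed 95th percentiles.
    hd95 = max(percentile(d_ab, 95), percentile(d_ba, 95))
    # Mean of all surface-to-surface distances, pooled over both directions.
    mda = (sum(d_ab) + sum(d_ba)) / (len(d_ab) + len(d_ba))
    return hd95, mda


def cube(origin, size):
    """Helper for the example: a solid cube of voxels."""
    ox, oy, oz = origin
    return {(ox + i, oy + j, oz + k)
            for i in range(size) for j in range(size) for k in range(size)}


if __name__ == "__main__":
    manual = cube((0, 0, 0), 4)   # stand-in for a manual contour
    auto = cube((1, 0, 0), 4)     # same shape shifted by one voxel
    print("DSC:", dice(manual, auto))
    print("HD95, MDA (mm):", hd95_and_mda(manual, auto))
```

In this toy example the two cubes overlap in 48 of their 64 voxels each, giving a DSC of 0.75. The brute-force nearest-neighbour search is quadratic in the number of surface voxels, so a distance-transform-based implementation would be preferable for full-resolution CT volumes.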

Results

Table 1 summarises the quantitative metrics for DL-based auto-segmentation accuracy of the four approaches against manual segmentation. Our preliminary results were comparable to published data despite a relatively small training dataset. The ensemble model showed an incremental benefit over the other three approaches. Figure 1 compares DL and manual contours for the best and worst performing cases. Most of the variation between automatic and manual contours was seen at the superior and inferior boundaries.





Conclusion

Our preliminary results demonstrate the feasibility of using our framework to implement DL-based auto-segmentation models with a generic approach. Future work will assess clinical benefit through clinician grading and add more training data to improve accuracy.