Evaluation of measures for assessing automatic OAR delineation acceptability and time-savings
PO-1725
Abstract
Authors: Vaassen, Femke (1)* [femke.vaassen@maastro.nl]; Hazelaar, Colien (1); Vaniqui, Ana (1); Gooding, Mark (2); van der Heyden, Brent (1); Canters, Richard (1); van Elmpt, Wouter (1)
(1) Maastricht University Medical Centre, Department of Radiation Oncology MAASTRO, Maastricht, The Netherlands; (2) Mirada Medical Ltd., Science and Technology, Oxford, United Kingdom
Purpose or Objective
Automatic organ-at-risk (OAR) delineation algorithms allow for significantly faster delineation, but evaluating the resulting contours in a clinically relevant way remains challenging. Commonly used measures of automatic contour accuracy and acceptability, such as the volumetric Dice Similarity Coefficient (DSC) or Hausdorff distance, have been shown to be good measures of geometric similarity, but do not always correlate with clinical contour acceptability or adaptation time. This study aimed to introduce new evaluation measures for clinical delineation time-savings and to investigate how these and the commonly used measures correlate with delineation time and time-savings.
Material and Methods
Data from twenty lung cancer patients were used to compare user adjustments after atlas-based and deep-learning delineation with manual delineation. The lungs, esophagus, spinal cord, heart, and mediastinum were delineated. Novel measures (the surface DSC and the added path length (APL)) and conventional evaluation measures (volumetric DSC and Hausdorff distance) were correlated (R) with the recorded delineation times and the time-savings. The surface DSC was defined similarly to the volumetric DSC, as the exact intersection surface of the two contours normalized by the union of the two surface contours. The APL was calculated as the path length of contour that had to be added to meet the institutional guidelines for delineation (Figure 1). This APL measure considered all manual adjustments, both expansion and shrinkage of the automatically generated contour.
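The two novel measures can be illustrated with a minimal sketch on a single 2D slice. It makes several assumptions that the abstract does not state: contours are represented as sets of pixel coordinates, the path length is counted in pixels, and the surface DSC uses a Dice-style normalization (twice the shared length divided by the summed lengths) by analogy with the volumetric DSC. The function names and the toy rectangles are hypothetical and are not taken from the study.

```python
def rectangle_outline(x_min, y_min, x_max, y_max):
    """Return the outline pixels of an axis-aligned rectangle as a set of (x, y)."""
    return {
        (x, y)
        for x in range(x_min, x_max + 1)
        for y in range(y_min, y_max + 1)
        if x in (x_min, x_max) or y in (y_min, y_max)
    }


def added_path_length(auto_contour, edited_contour):
    """Count pixels of the edited contour that are absent from the automatic one.

    Assumption: one pixel equals one unit of path length.
    """
    return len(set(edited_contour) - set(auto_contour))


def surface_dsc(contour_a, contour_b):
    """Dice-style overlap of two contour outlines (assumed normalization)."""
    a, b = set(contour_a), set(contour_b)
    if not a and not b:
        return 1.0  # two empty contours are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


# Toy case: the user moves the right edge of the automatic contour out by one pixel.
auto = rectangle_outline(0, 0, 3, 3)
edited = rectangle_outline(0, 0, 4, 3)

apl = added_path_length(auto, edited)
sdsc = surface_dsc(auto, edited)
```

In this toy case only the newly drawn right edge contributes to the APL, whether the edit expands or shrinks the contour, which matches the description that all manual adjustments are counted.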
Results
The highest correlation (R=0.87) was found between the novel APL metric and absolute adaptation time (Figure 2). The surface DSC correlated with absolute adaptation time (R=-0.67) and relative time-saving (R=0.57), and the APL correlated with relative time-saving (R=-0.38). The conventional measures, volumetric DSC and Hausdorff distance, showed lower correlation coefficients for absolute adaptation time (R=-0.32 and 0.64, respectively) and relative time-saving (R=0.44 and -0.64, respectively).
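As an illustration of how a correlation coefficient R between a contour measure and a recorded time could be computed, the sketch below implements Pearson's correlation coefficient. The abstract does not state which coefficient was used, so Pearson's is an assumption here, and the input values are hypothetical toy numbers, not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-contour values (NOT study data): APL in pixels, time in minutes.
toy_apl = [10, 20, 30, 40]
toy_minutes = [1.0, 2.5, 2.0, 4.0]
r_toy = pearson_r(toy_apl, toy_minutes)
```

A positive R, as reported for the APL versus absolute adaptation time, means that contours needing more added path length tend to take longer to adjust; a negative R, as for the surface DSC, means that higher agreement goes with shorter adjustment time.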
Conclusion
Surface DSC and APL are better indicators of delineation time and time-saving when using auto-delineation, and provide a more clinically relevant quantitative surrogate for clinical acceptability than the commonly used geometry-based measures. Being able to evaluate the time-saving automatically, without labor-intensive manual time recording (only a calibration curve is needed), is an important aspect of introducing auto-delineation techniques and evaluating their clinical applicability. These new measures provide helpful information when analyzing time-saving with automatically generated contours. Furthermore, this finding suggests that, when considering clinical use of auto-contours, radiation oncologists and technologists should evaluate the amount of path length to be adjusted rather than volume overlap or maximum contour error.
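The calibration-curve idea mentioned above can be sketched as follows. The abstract does not specify the form of the curve, so a straight line fitted by ordinary least squares is assumed here purely for illustration. The function names and the calibration values are hypothetical and are not taken from the study.

```python
def fit_calibration_line(apl_values, times):
    """Fit time = slope * APL + intercept by ordinary least squares."""
    n = len(apl_values)
    if n != len(times) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(apl_values) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in apl_values)
    if sxx == 0:
        raise ValueError("APL values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(apl_values, times))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_time(apl, slope, intercept):
    """Estimate the adjustment time for a contour from its APL."""
    return slope * apl + intercept


# Hypothetical calibration set (NOT study data): APL in pixels, time in minutes.
slope, intercept = fit_calibration_line([0, 10, 20], [1.0, 3.0, 5.0])
estimated_minutes = predict_time(30, slope, intercept)
```

Once such a line has been fitted on a small set of timed cases, the adjustment time for a new contour could be estimated from its APL alone, without timing each delineation by hand.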