Session Item

Thursday
January 21
17:00 - 18:55
SBRT for oligometastatic NSCLC
0100
Module
Evaluation of atlas and deep learning based automatic contouring – A spatially resolved approach
PO-1741

Abstract

Authors: Steinseifer, Isabell (1)* [isabell.steinseifer@radboudumc.nl]; Brunenberg, Ellen (1); Linthorst, Charlotte (1); Wijsman, Robin (2); Bussink, Jan (1); Monshouwer, René (1)
(1) Radboud University Medical Center, Radiation Oncology, Nijmegen, The Netherlands; (2) University Medical Center Groningen, Radiation Oncology, Groningen, The Netherlands
Purpose or Objective

Contouring of organs at risk (OARs) is an important part of treatment planning. However, it is time-consuming and subject to high inter-observer variability. Automated contouring of OARs can save time and provide more consistent results. Measures such as the Dice Similarity Coefficient (DSC) and the Hausdorff Distance (HD) are often used to evaluate and compare OAR delineations. However, these measures carry no information about where the measured deviations occur. In this study we present a spatially resolved approach to evaluating the quality of ROI delineations, comparing the heart, lung and esophagus contours generated by Atlas-based and Deep Learning contouring (DLC) software with the manual delineation.
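
For reference, the two global measures named above are commonly defined as shown below. The notation is an assumption of this sketch and is not taken from the study: A and B denote the voxel sets of the automatic and the manual contour, and \partial A, \partial B their surfaces. The abstract does not state which implementation variant was used.

```latex
\[
\mathrm{DSC}(A,B) = \frac{2\,|A \cap B|}{|A| + |B|}
\]
\[
\mathrm{HD}(A,B) = \max\Bigl\{
  \max_{a \in \partial A} \min_{b \in \partial B} \lVert a - b \rVert,\;
  \max_{b \in \partial B} \min_{a \in \partial A} \lVert a - b \rVert
\Bigr\}
\]
```

Both are single numbers per organ: a DSC of 1 means perfect overlap, and the HD is the largest surface-to-surface distance. Neither says where in the organ the disagreement lies.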

Material and Methods

An experienced radiation oncologist manually delineated the heart, left and right lungs, and esophagus on planning CTs of 190 lung cancer patients. These manual delineations were used as ground truth in this contouring comparison. The same CT data sets were delineated by an Atlas-based algorithm trained on 20 lung cancer patients from the Radboudumc (WorkflowBox 2.0.1, Mirada Medical Ltd., Oxford, UK) and by a DLC algorithm (Mirada DLC Expert) trained on 450 lung cancer patients from Maastro, Maastricht, the Netherlands [1]. Agreement with the manually delineated contours was analyzed using global DSC and HD. For the HD, the 95th percentile (HD95) was reported to exclude the largest outliers. To obtain spatial information about the delineation performance, each contoured organ was split into 4 equally spaced bins in the craniocaudal direction (bin 1 = caudal, bin 4 = cranial), and DSC and HD95 were calculated for each of these 4 bins. Differences in DSC and HD95 between the two algorithms were tested using Mann-Whitney U-tests, with p<0.05 considered statistically significant. Multiple testing was accounted for with the Bonferroni correction.
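
The craniocaudal binning described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: contours are represented as sets of (z, y, x) voxel indices with z increasing from caudal to cranial, the bins span the combined slice range of both contours, and the helper names (dice, bin_index, dice_per_bin, box) are hypothetical. Only the DSC is shown; a per-bin HD95 would reuse the same splitting.

```python
from typing import List, Set, Tuple

# Assumption: a voxel is a (z, y, x) index, with z increasing from caudal to cranial.
Voxel = Tuple[int, int, int]


def dice(a: Set[Voxel], b: Set[Voxel]) -> float:
    """Dice Similarity Coefficient of two voxel sets; NaN if both are empty."""
    total = len(a) + len(b)
    if total == 0:
        return float("nan")
    return 2.0 * len(a & b) / total


def bin_index(z: int, z_min: int, z_max: int, n_bins: int) -> int:
    """Map slice z in [z_min, z_max] to one of n_bins equal-width bins (0 = caudal)."""
    extent = z_max - z_min + 1
    return (z - z_min) * n_bins // extent


def dice_per_bin(auto: Set[Voxel], manual: Set[Voxel], n_bins: int = 4) -> List[float]:
    """DSC per craniocaudal bin.

    Assumption: bin edges span the combined slice range of both contours,
    so every voxel of either contour falls into exactly one bin.
    """
    if not auto and not manual:
        return [float("nan")] * n_bins
    z_values = [voxel[0] for voxel in auto | manual]
    z_min, z_max = min(z_values), max(z_values)
    auto_bins: List[Set[Voxel]] = [set() for _ in range(n_bins)]
    manual_bins: List[Set[Voxel]] = [set() for _ in range(n_bins)]
    for voxel in auto:
        auto_bins[bin_index(voxel[0], z_min, z_max, n_bins)].add(voxel)
    for voxel in manual:
        manual_bins[bin_index(voxel[0], z_min, z_max, n_bins)].add(voxel)
    return [dice(a, m) for a, m in zip(auto_bins, manual_bins)]


def box(z_range, y_range, x_range) -> Set[Voxel]:
    """Toy helper: all voxels inside an axis-aligned box."""
    return {(z, y, x) for z in z_range for y in y_range for x in x_range}


# Synthetic example: the automatic contour is shifted sideways in the two
# most caudal slices only and matches the manual contour everywhere else.
manual = box(range(8), range(4), range(4))
auto = box(range(2), range(4), range(2, 6)) | box(range(2, 8), range(4), range(4))

global_dsc = dice(auto, manual)
binned_dsc = dice_per_bin(auto, manual)
print(global_dsc)  # 0.875
print(binned_dsc)  # [0.5, 1.0, 1.0, 1.0]
```

In this synthetic case the global DSC of 0.875 gives no hint of where the contours disagree, while the binned values show that the entire deviation sits in the most caudal bin.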

Results

Global DLC values were comparable to those of a previously reported contouring study [1]. In this study we were able to further localize the contouring deviations: the auto-delineations of the upper lung lobes were in good agreement with the manual delineation, but both algorithms had reduced scores in the lowest part of the lung, where most motion artifacts occur in the CTs (see Figure 1 and Table 1). The DLC had low scores in the cranial part of the heart, but performed better than the Atlas in the caudal part of the esophagus. Most likely this was due to differences between the contouring atlases used at the two institutes where the algorithms were trained.

Conclusion

The spatial analysis showed where the Atlas-based and DLC-based delineations performed differently and which areas need further editing. Furthermore, differences in contouring style between the two algorithms became apparent. This enables a more structured way of improving delineation algorithms.

References

[1] Lustberg T, et al. Clinical evaluation of atlas and deep learning based automatic contouring for lung cancer. Radiother Oncol. 2018 Feb;126(2):312-317.