Is Monte Carlo uncertainty a good predictor of manual adjustments of deep-learning contours?
Georgia-Valeria Ionescu,
United Kingdom
MO-0213
Abstract
Authors: Georgia Ionescu1, Pádraig Looney1, Julien M. Y. Willaime1, Femke Vaasen2, Wouter van Elmpt3, Mark J. Gooding1
1Mirada Medical, Science, Oxford, United Kingdom; 2Maastro Clinic, Medical Physics, Maastricht, The Netherlands; 3Maastro Clinic, Medical Physics, Maastricht, The Netherlands
Purpose or Objective
Deep learning contouring has been proven to be
highly efficient at delineating structures in radiotherapy, but models do not
generally provide information regarding regions that are difficult to contour,
where the model is uncertain. It has been suggested that regions with high
model uncertainty might require increased attention and more manual editing
than regions where models are certain.
This study aimed to assess whether the amount
of manual adjustment of auto-contouring in routine clinical practice can be predicted
by the uncertainty of the auto-contouring system.
Material and Methods
In this study, the heart and esophagus of 100
thorax cases were contoured using a deep learning model and subsequently reviewed
and edited for clinical use.
The uncertainty of the deep learning model was
calculated using Monte Carlo Dropout in the CNN’s final layers. The distances
between the clinicians’ contours and the automated contours were compared with the
uncertainty of the contours predicted by the automated model.
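As a rough illustration of the Monte Carlo Dropout idea, the sketch below keeps dropout active at inference time, repeats the forward pass many times, and summarises the spread of the predictions. It is not the model used in this study: the toy one-hidden-layer network, the weights, the dropout rate, the number of passes and the use of the standard deviation as the uncertainty measure are all assumptions made for illustration.

```python
import math
import random
import statistics


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def stochastic_forward(features, w_hidden, w_out, drop_prob, rng):
    """One forward pass of a toy one-hidden-layer network with dropout active."""
    keep_prob = 1.0 - drop_prob
    hidden = []
    for weights in w_hidden:
        act = max(0.0, sum(w * f for w, f in zip(weights, features)))  # ReLU
        # Dropout stays on at inference: keep the unit with probability
        # keep_prob and rescale it, otherwise zero it out.
        hidden.append(act / keep_prob if rng.random() < keep_prob else 0.0)
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)))


def mc_dropout(features, w_hidden, w_out, drop_prob=0.5, passes=50, seed=0):
    """Return (mean, standard deviation) of the foreground probability."""
    rng = random.Random(seed)
    samples = [stochastic_forward(features, w_hidden, w_out, drop_prob, rng)
               for _ in range(passes)]
    return statistics.mean(samples), statistics.pstdev(samples)


# Hypothetical weights and voxel features, for illustration only.
W_HIDDEN = [[0.8, -0.2], [0.3, 0.9], [-0.5, 0.4]]
W_OUT = [1.0, -1.5, 0.7]
mean_prob, uncertainty = mc_dropout([1.0, 0.5], W_HIDDEN, W_OUT)
print(f"mean probability={mean_prob:.3f}, uncertainty (std)={uncertainty:.3f}")
```

In a segmentation setting the same procedure would be applied per voxel, and the sampled segmentations could equally be summarised by percentile contours rather than a standard deviation.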
Spearman’s correlation coefficient was used for
measuring the degree of association between the amount of manual contour adjustments and the uncertainty.
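For readers unfamiliar with the statistic, Spearman’s coefficient is the Pearson correlation of the ranks of the two variables, with tied values sharing the average of their ranks. The sketch below shows that calculation in plain Python. The function names and the sample values are hypothetical, and the p-value reported in the Results is not computed here.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-point values in millimetres, not data from this study.
uncertainty_mm = [0.5, 1.2, 0.8, 2.0, 1.5]
edit_mm = [0.0, 0.4, 1.1, 0.9, 0.0]
print(f"Spearman rho = {spearman(uncertainty_mm, edit_mm):.2f}")
```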
Additionally, a qualitative assessment
investigated whether regions of the AI-generated contours that had no manual
adjustments were associated with low uncertainty of the deep learning model.
Results
The results showed a weak positive correlation
between contour edits and uncertainty. The Spearman coefficient was 0.22
(p<0.001) for the esophagus and 0.47 (p<0.001) for the heart. However, as shown in
Figure 2, some regions of the AI-generated contours that clinicians did not edit
corresponded to regions with varying levels of uncertainty.
A qualitative assessment showed numerous
examples of contour regions where low uncertainty corresponded to major edits,
and regions with high uncertainty that had no edits (see Figure 2).
Conclusion
The weak correlations between contour edits and
model uncertainty suggest that Monte Carlo uncertainty is not a strong
predictor of manual adjustments of contours. Model uncertainty therefore
appears to be an unsuitable way to highlight contour regions that may
need a clinician’s attention.
Future work will include investigating whether model
uncertainty correlates with clinical uncertainty and whether regions with high model uncertainty
and no edits might be caused by low image contrast.
Figure 1: Examples of contours. The automated
contours are shown in dark blue. The light blue contours represent the 10% and
90% of the contours predicted using Monte Carlo Dropout. Red contours are those
drawn by clinicians after seeing the automated contour. The distance between
dark blue and red is the edit; the distance between the two light
blue contours is the uncertainty. The images show: (a) high uncertainty and high edit, (b) low
uncertainty and high edit, (c) high uncertainty and low edit, and (d) low
uncertainty and low edit.
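The two distances defined in the caption could be computed per contour point along the following lines. This is only a sketch under stated assumptions: contours are taken to be lists of sampled 2D points, and a nearest-point distance is used, whereas the abstract does not state how the distances were actually measured. All names and the circular test contours are hypothetical.

```python
import math


def circle(radius, n=36):
    """Hypothetical contour: n points sampled evenly on a circle."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]


def nearest_distance(point, contour):
    """Distance from a point to the closest sampled point of a contour."""
    return min(math.dist(point, q) for q in contour)


def edit_and_uncertainty(auto, clinician, lower, upper):
    """For each automated-contour point, return (edit, uncertainty).

    edit: distance from the automated contour to the clinician contour.
    uncertainty: width of the band between the lower and upper contours,
    measured from the lower-contour point closest to the automated point.
    """
    rows = []
    for p in auto:
        edit = nearest_distance(p, clinician)
        q_low = min(lower, key=lambda q: math.dist(p, q))
        width = nearest_distance(q_low, upper)
        rows.append((edit, width))
    return rows


# Concentric circles stand in for the dark blue, red and light blue contours.
pairs = edit_and_uncertainty(circle(10.0), circle(12.0),
                             circle(9.0), circle(11.5))
print(pairs[0])
```

Pairs of this kind, collected over all contour points, are the sort of input that a rank correlation or an uncertainty-versus-edit heatmap would take.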
Figure 2: Heatmap of uncertainty vs edit for (a) heart and (b) esophagus.