Unedited deep-learning based OARs are suitable for rigorous head and neck treatment planning
PD-0325
Abstract
Authors: Jihye Koo (1,2), Jimmy J. Caudell (1), Kujtim Latifi (1), Eduardo G. Moros (1), Vladimir Feygelman (1)
(1) H. Lee Moffitt Cancer Center and Research Institute, Department of Radiation Oncology, Tampa, USA; (2) University of South Florida, Department of Physics, Tampa, USA
Purpose or Objective
The quality of organ at risk (OAR) autosegmentation is often judged by concordance metrics against the human-generated gold standard. However, the ultimate goal is to use unedited autosegmented OARs in treatment planning while maintaining the plan quality achieved with their manually segmented counterparts. We tested this approach with head and neck (HN) OARs generated by a prototype deep-learning (DL) model developed in collaboration with the vendor (Koo et al., https://doi.org/10.1016/j.radonc.2022.06.024).
Material and Methods
Forty previously treated oropharynx cancer patients were selected, with all structures delineated by an experienced physician. For each patient, a set of 13 OARs was generated by the DL model. Each patient was re-planned using the original targets and the unedited DL-produced OARs. The new dose distributions were then applied back to the manually delineated structures. Target coverage was evaluated with the conformity index (CI), homogeneity index (HI), inhomogeneity index (II), and the PTV_High volume of regret (the volume outside the PTV receiving at least the prescription (Rx) dose). For the OARs, the Dice similarity coefficient (DSC) of the areas under the dose-volume histogram (DVH) curves, individual DVH objectives, and a composite continuous plan quality metric (PQM) were compared.
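Two of these metrics can be illustrated with a short Python sketch. It is not the authors' implementation: it assumes the DSC of DVH areas is computed as twice the area shared by two cumulative DVH curves divided by the sum of their areas, and that dose and PTV are available as voxel arrays on a common grid. All function names, the dose grid, and the toy inputs are hypothetical.

```python
import numpy as np


def cumulative_dvh(dose_gy, dose_bins_gy):
    """Percent of a structure's voxels receiving at least each dose level."""
    dose = np.asarray(dose_gy, dtype=float).ravel()
    return np.array([100.0 * np.mean(dose >= d) for d in dose_bins_gy])


def _area(y, x):
    """Trapezoidal area under y(x)."""
    y = np.asarray(y, dtype=float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def dvh_area_dsc(dvh_a, dvh_b, dose_bins_gy):
    """Dice coefficient of the regions under two DVH curves (assumed definition).

    The shared region is approximated by the area under the pointwise minimum
    of the two curves sampled on the same dose grid.
    """
    a = np.asarray(dvh_a, dtype=float)
    b = np.asarray(dvh_b, dtype=float)
    x = np.asarray(dose_bins_gy, dtype=float)
    total = _area(a, x) + _area(b, x)
    if total == 0.0:
        return 1.0  # both curves are empty, treat as identical
    return 2.0 * _area(np.minimum(a, b), x) / total


def volume_of_regret_cc(dose_gy, ptv_mask, rx_dose_gy, voxel_volume_cc):
    """Volume outside the PTV that receives at least the prescription dose."""
    dose = np.asarray(dose_gy, dtype=float)
    outside = ~np.asarray(ptv_mask, dtype=bool)
    return float(np.count_nonzero((dose >= rx_dose_gy) & outside) * voxel_volume_cc)


# Toy example: the same dose sampled on a manual and a DL-generated contour.
bins = np.linspace(0.0, 70.0, 71)
manual_dvh = cumulative_dvh([10.0, 20.0, 30.0, 40.0], bins)
dl_dvh = cumulative_dvh([12.0, 20.0, 31.0, 38.0], bins)
print(f"DSC of DVH areas: {dvh_area_dsc(manual_dvh, dl_dvh, bins):.3f}")
```

A DSC of 1 means the two DVH curves coincide on the sampled dose grid; values below 1 indicate that the same dose distribution looks different when scored on the two structure sets.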
Results
Nearly identical primary target coverage was achieved for the original and re-generated plans, with volumes of regret of 23.0±16.2 cc and 21.7±14.5 cc, respectively, and the same CI, HI, and II values of 1.2±0.1, 0.1±0.1, and 0.1±0.03. For the 13 HN OARs, the overall average DSC of the areas under the corresponding pairs of DVH curves was 0.97±0.06, and the mean DSC for each individual OAR was above 0.9. The number of critical DVH points that met the clinical objectives with the re-planned dose and autosegmented structures but failed with the manual ones was 13 of 1131 (1.1%, Table 1). The average PQM score (out of 100) with the re-planned dose distributions was 69.1±15.4 and 69.2±15.2 for the manual and DL-generated OARs, respectively.
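The reported percentage follows directly from the counts above, as the short check below shows. Reading the complement as the share of DVH points on which the two structure sets agree is an interpretation, not something the abstract states; the variable names are illustrative.

```python
total_points = 1131  # individual dose-volume points reported in the abstract
discordant = 13      # met on DL structures but failed on manual structures

discordant_pct = 100.0 * discordant / total_points
concordant_pct = 100.0 - discordant_pct  # assumed reading: points where both agree

print(f"{discordant_pct:.1f}% discordant, {concordant_pct:.1f}% concordant")
```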
Table 1 Number of unmet clinical objectives with dose distributions from plans generated on unedited DL structures.
| OAR | Dose-volume limit (clinical DVH objective) | Failures to meet objective: replan dose on gold data structures | Failures to meet objective: replan dose on unedited DL structures | p-value |
|---|---|---|---|---|
| Brainstem_3mm | Max < 25 Gy | 16 | 15 | 0.12 |
| Bone_Mandible | Max < 50 Gy | 34 | 32 | <0.0001 |
| Cavity_Oral | Mean < 32 Gy | 13 | 11 | <0.0001 |
| SMG_contralateral | Mean < 39 Gy | 7 | 6 | 0.22 |
| Larynx | V(55 Gy) < 32% | 5 | 4 | 0.45 |
| Larynx | Mean < 51 Gy | 6 | 5 | 0.60 |
| Musc_Constrict_M | Mean < 54 Gy | 7 | 4 | 0.14 |
| Musc_Constrict_I | V(40 Gy) < 65% | 6 | 5 | <0.0001 |
Conclusion
The DL-generated HN OARs resulted in treatment plans of equivalent quality to the original ones, as judged by target coverage, PQM scores, and DVH analysis of the replanned dose distributions combined with the original OARs. The DL-based replanned dose distributions projected on the gold data OARs resulted in meeting clinical objectives in 99% of the total 1131 individual dose-volume points.