Vienna, Austria

ESTRO 2023

Session Item

Saturday
May 13
10:30 - 11:30
Strauss 3
Gynaecology
Diana-Cristina Pop (Patcas), Romania;
Maximilian Schmid, Austria
1270
Proffered Papers
Brachytherapy
10:50 - 11:00
Deep learning-based segmentation in cervical HDR brachytherapy with two types of applicators
Ruiyan Ni, Canada
OC-0131

Abstract

Deep learning-based segmentation in cervical HDR brachytherapy with two types of applicators
Authors:

Ruiyan Ni (1), Kathy Han (2,3), Benjamin Haibe-Kains (1,2), Alexandra Rink (1,2,3,4)

(1) University of Toronto, Department of Medical Biophysics, Toronto, Canada; (2) University Health Network, Princess Margaret Cancer Center, Toronto, Canada; (3) University of Toronto, Department of Radiation Oncology, Toronto, Canada; (4) University Health Network, TECHNA Institute, Toronto, Canada

Purpose or Objective

Manually delineating organs at risk (OARs) and targets is time-consuming in cervix brachytherapy (BT). Deep learning (DL)-based automatic segmentation approaches have demonstrated promising delineation results in significantly less time. Applicator selection, which depends primarily on disease extent and anatomy, is a key step in delivering appropriate treatment. However, it remains unclear whether mixing applicator types in the training data diminishes model performance, and how to retain the model’s generalizability across applicator types. In this study, we addressed this gap by developing a DL model on a dataset containing both interstitial ring and tandem (R&T) and Syed Neblett template (S-N) applicators, and assessed transfer learning (TL) strategies for using an existing model to auto-delineate patients treated with another type of applicator.

Material and Methods

A dataset of 165 T2-weighted MR images (130 with R&T and 35 with S-N) with clinically used contours was built from 74 cervical cancer patients (39 R&T and 35 S-N). Bladder, rectum, sigmoid, small bowel and HR-CTV were segmented. First, the R&T model was trained on 119 R&T cases using a self-adapting U-Net-based framework (nnU-Net). This baseline network (RTmodel) provided the initial weights for TL on the S-N dataset (25 fine-tuning cases for the TL25 model and 10 testing cases). TL efficiency was assessed by comparing the RTmodel and the fine-tuned model on the S-N testing cases. A Mixed model was trained on both the R&T training set and the S-N fine-tuning set, and tested on the R&T and S-N testing sets separately. Second, we examined TL data requirements by varying the number of fine-tuning cases (n=5-20). Additionally, models trained from scratch on the same subsets were compared with the fine-tuned models. Segmentation performance was evaluated with four metrics.
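The S-N data partition described above can be sketched as follows. This is a minimal illustration, not the authors' code: the case identifiers, the random seed, the specific subset sizes within the stated 5-20 range, and the choice to nest the smaller fine-tuning subsets inside the larger ones are all assumptions made for the example.

```python
import random


def make_finetune_subsets(case_ids, n_test=10, sizes=(5, 10, 15, 20, 25), seed=0):
    """Hold out a fixed test set, then draw fine-tuning subsets from the rest.

    The subset sizes and the nesting of smaller subsets inside larger ones
    are assumptions; the abstract only states n=5-20 plus the 25-case TL25 set.
    """
    rng = random.Random(seed)
    shuffled = list(case_ids)
    rng.shuffle(shuffled)

    test_cases = shuffled[:n_test]   # never used for fine-tuning
    pool = shuffled[n_test:]         # candidates for fine-tuning
    if max(sizes) > len(pool):
        raise ValueError("largest subset exceeds the available fine-tuning pool")

    # Each subset is a prefix of the pool, so subsets[5] is contained in subsets[10], etc.
    subsets = {n: pool[:n] for n in sizes}
    return test_cases, subsets


# Hypothetical identifiers for the 35 S-N cases (one image per patient).
sn_cases = [f"SN_{i:02d}" for i in range(1, 36)]
test_cases, subsets = make_finetune_subsets(sn_cases)
print(len(test_cases), {n: len(ids) for n, ids in subsets.items()})
```

Keeping the 10 testing cases fixed across all subset sizes means any change in the scores reflects the amount of fine-tuning data rather than a change in the evaluation set.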

Results

TL25 outperformed the RTmodel when tested on S-N cases [mean volumetric Dice similarity coefficient (vDSC) of 0.88 vs 0.81 for bladder, 0.88 vs 0.74 for rectum, 0.69 vs 0.57 for sigmoid, 0.65 vs 0.48 for small bowel, and 0.72 vs 0.54 for HR-CTV]. The Mixed model performed similarly to the RTmodel on R&T cases and to TL25 on S-N cases, indicating that mixing different applicators does not reduce model performance (mean DSC improved by 0.07 for S-N small bowel; difference < 0.02 for the others). TL significantly improved model performance with only 5 fine-tuning cases (mean vDSC increased by 0.03 to 0.14) but reached a plateau with ≥10 fine-tuning cases (mean vDSC difference < 0.03). The models trained from scratch performed worse than the TL models. Training took 14 hours for the RT/Mixed model and 2.5 hours for one TL model. The prediction time per image was 12 s.
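For readers unfamiliar with the overlap metric reported above, the following is a minimal sketch of a volumetric Dice computation on binary masks, DSC = 2|A∩B| / (|A| + |B|). It assumes NumPy and uses a toy 4×4×4 volume; the function name and the convention of returning 1.0 when both masks are empty are choices made for this example, not details taken from the study.

```python
import numpy as np


def volumetric_dice(pred, ref):
    """Dice similarity coefficient between two binary 3D masks."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    if pred.shape != ref.shape:
        raise ValueError("masks must have the same shape")

    denom = int(pred.sum()) + int(ref.sum())
    if denom == 0:
        # Both masks empty: treated as perfect agreement (a convention, assumed here).
        return 1.0
    overlap = int(np.logical_and(pred, ref).sum())
    return 2.0 * overlap / denom


# Toy example: reference is a 2x2x2 cube (8 voxels); the prediction
# extends it by one slice along the last axis (12 voxels, 8 shared).
ref = np.zeros((4, 4, 4), dtype=bool)
ref[1:3, 1:3, 1:3] = True
pred = np.zeros_like(ref)
pred[1:3, 1:3, 1:4] = True

print(volumetric_dice(pred, ref))  # 2*8 / (8+12) = 0.8
```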


Conclusion

We have demonstrated that (1) a DL model can handle segmentation with various applicators; and (2) TL can achieve results similar to the Mixed model with limited fine-tuning data and greatly reduced computational cost. This study shows the potential of TL for applying our model at different institutions in the future.