Copenhagen, Denmark
Onsite/Online

ESTRO 2022

Session Item

Imaging acquisition and processing
7000
Poster (digital)
Physics
Standardising Nomenclatures in Breast Radiotherapy Imaging Data using Machine Learning Algorithms
Phillip Chlap, Australia
PO-1618

Abstract

Authors:

Ali Haidar1,2,3, Matthew Field1,2,3, Vikneswary Batumalai1,4, Kirrily Cloak1,2,3, Daniel Al Mouiee1,2,3, Phillip Chlap1,2,3, Xiaoshui Huang5,2,3, Vicky Chin1,2,3, Martin Carolan6, Jonathan Sykes7,8, Shalini Vinod1,2,3, Geoffrey Delaney1,2,3, Lois Holloway1,2,3

1University of New South Wales, South Western Sydney Clinical School, Sydney, Australia; 2South Western Sydney Local Health District, Liverpool and Macarthur Cancer Therapy Centres, Sydney, Australia; 3Ingham Institute for Applied Medical Research, Medical Physics Research Group, Sydney, Australia; 4GenesisCare, Radiation Oncology, Sydney, Australia; 5University of Sydney, ImageX, Sydney, Australia; 6Wollongong Hospital, Illawarra Cancer Care Centre, Wollongong, Australia; 7Sydney West Radiation Oncology Network, Radiation Oncology, Sydney, Australia; 8University of Sydney, Institute of Medical Physics, Sydney, Australia

Purpose or Objective

Data mining and analyses using retrospective radiotherapy imaging datasets sourced from single or multiple centres require translation of local ontologies for structure names to a standardised ontology. Our aim was to investigate machine learning (ML) based tools for standardising target and organ-at-risk (OAR) volume definitions in breast cancer radiotherapy plans.

Material and Methods

Radiotherapy imaging data for 1613 breast cancer patients treated between 2014 and 2018 were collected from a single centre. The volumes were initially classified based on discussions with clinicians. 1440 patients were selected for ML model development, and 173 patients were used for testing (hold-out). To represent each target and OAR volume, four types of characteristics were generated: textual features, geometric features, dosimetry features, and a central slice, defined as the slice with the highest number of contoured pixels in the volume. Five datasets were created from the original cohort: the first four represented different subsets of volumes and the last represented the whole list of volumes (Table 1). For each dataset, 15 sets of feature combinations were created to see how the use of different attributes affected the standardisation performance.
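
The two feature-preparation steps described above can be illustrated with a minimal sketch. The abstract does not state how the 15 feature sets were formed; the sketch assumes they are all non-empty subsets of the four feature types, which gives 2^4 - 1 = 15. The function names, the feature labels, and the nested-list representation of a binary mask are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical labels for the four feature types named in the abstract.
FEATURE_TYPES = ("text", "geometry", "dosimetry", "central_slice")


def feature_combinations(feature_types=FEATURE_TYPES):
    """Return every non-empty subset of the feature types, smallest first.

    Assumption: the 15 feature sets in the abstract correspond to these
    subsets; the abstract itself does not confirm this.
    """
    return [
        combo
        for size in range(1, len(feature_types) + 1)
        for combo in combinations(feature_types, size)
    ]


def central_slice_index(mask):
    """Return the index of the slice with the most contoured pixels.

    `mask` is a sequence of 2D slices, each a sequence of rows holding
    0 (background) or 1 (contoured). Ties resolve to the lowest index.
    """
    counts = [sum(sum(row) for row in slice_) for slice_ in mask]
    return max(range(len(counts)), key=counts.__getitem__)
```

For a real structure set the mask would normally be a 3D array rasterised from the contours; nested lists are used here only to keep the sketch dependency-free.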


Three types of artificial neural networks were used to model different combinations of features: feed-forward neural networks (FFNNs), convolutional neural networks (CNNs), and multi-input neural networks (MINNs). FFNNs were trained on tabular feature combinations (e.g. text and dosimetry features), a CNN was trained on imaging data (central slices), and MINNs were trained on combinations of tabular and imaging features (e.g. text and imaging features). Classification accuracy on the hold-out dataset was used to compare the developed models against each other.
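
As an illustration of the comparison metric, the sketch below computes classification accuracy over a set of hold-out predictions and tallies the misclassified samples by the class they were assigned to. It is not the authors' evaluation code; the function names and the label strings in the usage note are hypothetical.

```python
from collections import Counter


def accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted class matches the true class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def misclassified_by_prediction(true_labels, predicted_labels):
    """Count misclassified samples, grouped by the class they were given."""
    return Counter(
        p for t, p in zip(true_labels, predicted_labels) if t != p
    )
```

Called with, say, true labels `["heart", "lung_left", "ctv", "exclude"]` and predictions `["heart", "exclude", "ctv", "exclude"]`, `accuracy` returns 0.75 and the tally shows one sample wrongly assigned to `exclude`.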


Table 1 Classes used in datasets.

 

Results

Classification accuracy of each of the developed models is shown in Fig. 1. The best model (a MINN) achieved 99.416% classification accuracy on the hold-out samples when used to standardise all the nomenclatures in a breast radiotherapy plan into 21 different classes (Dataset 5). Nineteen samples belonging to different classes were misclassified, 10 of which were predicted as ‘exclude’ (i.e. not to use). Three types of features were used with this model: textual features, dosimetry features, and images. Integrating several features resulted in greater classification accuracy than using a single feature type. Reliable performance was observed with all the datasets when the text feature was used as input to the model, which is consistent with the traditional approach, in which clinicians look at text first to standardise nomenclatures.




Fig. 1 Modelling results.

Conclusion

Standardisation of nomenclatures using ML is feasible on single-institution data if multiple features are included in the model. This is an ongoing project, in which federated ML will be investigated for standardising radiotherapy data across different centres, guidelines, and anatomical sites.