Copenhagen, Denmark
Onsite/Online

ESTRO 2022


Saturday
May 07
09:00 - 10:00
Poster Station 1
01: Image processing & analysis
René Winter, Norway
1180
Poster Discussion
Physics
Automatic detection of facial landmarks in paediatric CT scans using a convolutional neural network
Aaron Rankin, United Kingdom
PD-0069

Abstract

Automatic detection of facial landmarks in paediatric CT scans using a convolutional neural network
Authors:

Aaron Rankin1, Edward Henderson2, Oliver Umney2, Abigail Bryce-Atkinson2, Andrew Green3, Eliana Vásquez Osorio2

1University of Manchester, Division of Cancer Sciences, Manchester, United Kingdom; 2University of Manchester, Division of Cancer Sciences, School of Medical Sciences, Faculty of Biology, Medicine and Health, Manchester, United Kingdom; 3University of Manchester, Division of Cancer Sciences, School of Medical Sciences, Faculty of Biology, Medicine and Health, Derry, United Kingdom

Purpose or Objective

In children receiving radiotherapy, damage to healthy tissue and bone structures can lead to facial asymmetries. Tools for automatic evaluation of bone structures on 3D images would be useful to monitor and quantify the development of facial asymmetry in these patients. Due to the lack of available data, little research has assessed the performance of machine-learning models on paediatric data. Novel approaches such as transfer learning can facilitate the translation of adult-trained models to paediatric data. Here, we investigate the performance of a convolutional neural network (CNN) using transfer learning to detect facial anatomical landmarks in 3D CT scans.

Material and Methods

Facial landmarks were selected through an extensive literature review to contain both left and right components with minimal anatomical differences between adult and paediatric patients (fig 1A). A training dataset of 104 CT scans from adult head and neck cancer patients was annotated by two observers. A random point between the two annotations was used as a form of data augmentation and to reduce the impact of interobserver variation (IOV) bias. The UNet-based CNN model was trained for 100 epochs, using several commonly used data augmentation steps (e.g., translational shifts and left-right mirroring). The network produced a heatmap encoding its confidence in the predicted landmark location. The model was evaluated on a test set of 10 scans annotated by an expert. We investigated the IOV between the locations selected by the two observers and the expert (Table 1) to determine whether it had a considerable effect on the model's performance.
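The two annotation-handling steps described above — sampling a random point between the two observers' annotations, and converting the network's confidence heatmap into a landmark coordinate — could be sketched as follows. This is a minimal illustration only: the function names, the linear-interpolation scheme, and the argmax heatmap decoding are assumptions, not the authors' implementation.

```python
import numpy as np

def sample_between_annotations(p1, p2, rng=None):
    """Return a random point on the segment between two observers'
    annotations of the same landmark (hypothetical augmentation that
    also smooths interobserver variation)."""
    rng = np.random.default_rng() if rng is None else rng
    t = rng.uniform(0.0, 1.0)  # interpolation factor along the segment
    return (1.0 - t) * np.asarray(p1, float) + t * np.asarray(p2, float)

def heatmap_to_landmark(heatmap, spacing=(1.0, 1.0, 1.0)):
    """Decode a 3D confidence heatmap to a landmark position (mm) by
    taking the voxel of maximum confidence and scaling by voxel spacing
    (one simple decoding; subvoxel schemes are also common)."""
    idx = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return np.asarray(idx, float) * np.asarray(spacing, float)
```

Sampling a fresh point each epoch means the network never sees a single "ground-truth" voxel, which acts as label-level augmentation on top of the geometric augmentations (shifts, mirroring) mentioned above.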

The best-performing adult model was fine-tuned on 10 paediatric scans for a further 100 epochs and then evaluated on 6 unseen paediatric scans. All paediatric scans (ages 4-17 y) were annotated by an expert. We quantified model performance by measuring the 3D Euclidean point-to-point (p2p) distance between the predicted and manual annotations, for both the best adult CNN model and the final paediatric model.
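The p2p evaluation metric above is straightforward to express in code. A minimal sketch (function names are illustrative; the source does not specify the implementation):

```python
import numpy as np

def p2p_errors(pred, gt):
    """3D Euclidean point-to-point distances (mm) between predicted and
    manually annotated landmarks; both arrays shaped (n_landmarks, 3)."""
    pred = np.asarray(pred, float)
    gt = np.asarray(gt, float)
    return np.linalg.norm(pred - gt, axis=-1)

def summarise(errors):
    """Mean and standard deviation of p2p error across landmarks,
    matching the 'mean ± std' form used to report accuracy."""
    e = np.asarray(errors, float)
    return e.mean(), e.std()
```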


Results

The average accuracy across all 10 landmarks on adult CT scans was 5.6 ± 0.8 mm, comparable to the IOV limits of our training data and the resolution of the scans (Table 1). Following the adaptation of our model to paediatric data using transfer learning, the average p2p error was 5.4 ± 0.5 mm across all 10 landmarks, matching the performance on adult data and showing a similar variation in accuracy depending on landmark location (Table 1).

Conclusion

We developed a CNN model to locate facial landmarks in adult and paediatric 3D CT images. We demonstrated that a model trained on adult CT scans can be adapted with transfer learning to locate landmarks in paediatric images without a significant loss of accuracy. Further work will involve a larger paediatric dataset, including patients with facial asymmetry, to explore the model's ability to quantify structural asymmetries in different areas of the face.