Dealing with national datasets
Lasse Refsgaard, Denmark
SP-0369
Abstract
Authors: Lasse Refsgaard (1)
(1) Aarhus University Hospital, Department of Experimental Clinical Oncology, Aarhus, Denmark
Abstract Text
Radiotherapy (RT) planning generates large volumes of data. DICOM data, in particular, offers a wealth of information that can be used to advance our understanding of how effective our treatment approaches are.
To realise this potential, we need to conduct large-scale RT studies and move beyond highly condensed registration of RT data, such as a yes/no variable or only the prescribed dose and fractionation, and instead aim to include the full exposure data. This matters when we design new prospective studies, and equally when we want to learn from the data generated daily in treating patients who are not part of a clinical protocol.
Working with large datasets can be challenging in a single institution and even more so when moving to a national setup.
In this talk, we will discuss the main challenges of dealing with big national DICOM datasets and how to overcome them. These include:
- Automated collection of large DICOM datasets from multiple centres using different treatment planning systems.
- Data standardisation for non-protocol treatments in a setup where data is not formatted consistently across different institutions.
- The need for tools for storage, organisation, and analysis of large national datasets.
- Quality assurance and data curation.
We will present results and lessons learned from the national Danish Breast Cancer Group (DBCG) RT Nation study, where we collected and analysed DICOM data for 8000 consecutively treated high-risk breast cancer patients (2009-2016). Furthermore, we will highlight some of the possibilities and spin-off applications that came from collecting data for the study.