Author(s):
Mohamed E. Abazeed, MD, PhD
Radiation Oncologist, Cleveland Clinic, Taussig Cancer Institute
Mishka Gidwani
Student, Molecular Medicine PhD Program, Cleveland Clinic Lerner Research Institute
A prognostic, deep-learning model developed by investigators at Cleveland Clinic’s Taussig Cancer Institute in Ohio and collaborators brings radiation oncology into the realm of personalized medicine. The findings, published in Lancet Digital Health, demonstrate the ability of the model to learn from computed tomography (CT) scans of lung lesions, profiling patients for risk of failure after stereotactic body radiation therapy (SBRT).1 Combined with clinical variables, the model, named Deep Profiler, can recommend an individualized radiation therapy dose (iGray) that reduces the estimated risk of local failure to <5% 1 year after treatment.
Although advances in targeted therapy and immunotherapy have improved outcomes, lung cancer remains the leading cause of cancer-related mortality in men and women. High-dose radiation therapy, although a mainstay of treatment for patients with early-stage lung cancer, fails in nearly 20% of patients, leading to cancer recurrence.2 This high failure rate calls for methods to stratify patients into groups who would benefit from reduced or intensified therapy. In addition, SBRT is delivered in fixed high-dose regimens based on traditional clinical variables, primarily tumor site and tumor volume3; this approach does not address the known heterogeneity in cancers, and it underutilizes the vast latent data contained in patient images.
How the Model Works
Deep Profiler’s artificial intelligence (AI) approach consists of a convolutional neural network designed to analyze pretreatment chest CT images and synthesize each into a single risk score (Figure). In the study, the network was also tasked with re-creating a set of handcrafted radiomic features, which makes use of prior knowledge and reduces the risk of overfitting. A critical feature of the model is this use of multitask learning, in which the network is assigned 2 or more tasks rather than a single, focused task. By sharing “representations” between related tasks, the network can potentially perform better on each of them. Here, investigators used prior knowledge in the form of classical radiomics (ie, human-derived handcrafted features) to improve the network’s ability to classify radiation treatment failures. The balance of tasks can be tuned so that, as the size of the data set grows and overfitting is mitigated, clinicians can allow the network to learn without the auxiliary task. This approach can substantially improve the analysis of most current clinical data sets (<1000 cases). Although traditional radiomic features have been used for outcome prediction in a number of published models, Deep Profiler’s neural network-derived features are deformable, allowing for more expansive representations of encoded image data.
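To make the idea of a tunable task balance concrete, the following is a minimal sketch of a multitask loss, not the study's implementation. It assumes a binary failure label scored with cross-entropy and a radiomic reconstruction task scored with mean squared error; the function names and the `aux_weight` parameter are hypothetical and do not come from the published model.

```python
import math
from typing import Sequence


def binary_cross_entropy(p_failure: float, failed: int) -> float:
    """Primary task loss: predicted probability of local failure vs. the 0/1 outcome."""
    eps = 1e-12  # clamp the probability so log() never receives 0
    p = min(max(p_failure, eps), 1.0 - eps)
    return -(failed * math.log(p) + (1 - failed) * math.log(1.0 - p))


def mean_squared_error(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Auxiliary task loss: how closely the handcrafted radiomic features are re-created."""
    if len(predicted) != len(target) or len(target) == 0:
        raise ValueError("feature vectors must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def multitask_loss(
    p_failure: float,
    failed: int,
    radiomics_pred: Sequence[float],
    radiomics_true: Sequence[float],
    aux_weight: float,
) -> float:
    """Weighted sum of the two task losses.

    aux_weight is a hypothetical knob in [0, 1]: 0 ignores the radiomic
    auxiliary task entirely, 1 trains on the auxiliary task alone.
    """
    if not 0.0 <= aux_weight <= 1.0:
        raise ValueError("aux_weight must lie in [0, 1]")
    primary = binary_cross_entropy(p_failure, failed)
    auxiliary = mean_squared_error(radiomics_pred, radiomics_true)
    return (1.0 - aux_weight) * primary + aux_weight * auxiliary


# Lowering aux_weight toward 0 corresponds to letting the network
# learn without the auxiliary task as the data set grows.
loss_small_data = multitask_loss(0.5, 1, [1.0, 2.0], [1.0, 4.0], aux_weight=0.25)
loss_large_data = multitask_loss(0.5, 1, [1.0, 2.0], [1.0, 4.0], aux_weight=0.0)
```

In this sketch the auxiliary term acts as a regularizer: with few cases it keeps the shared representation close to features already known to carry signal, and its weight can be reduced once the primary task has enough data to stand alone.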
Investigators trained the model on a Cleveland Clinic cohort of 849 patients with stage IA to IVB primary or recurrent lung cancer, or other cancer types with metastatic disease in the lung. Findings showed that patients could be stratified into groups at differing risk of local failure (HR, 3.64; 95% CI, 2.19-6.05). Deep Profiler was then tested on an external validation cohort of 95 patients from Cleveland Clinic satellite sites, where it again separated the risk groups (HR, 5.13; 95% CI, 1.77-14.84), although the smaller cohort produced a wider confidence interval. These results support the transportability of Deep Profiler across disease stages, CT scanner types, and treatment periods, and the investigative team hopes to extend the model’s use to new clinical centers.
Combining the Deep Profiler score with clinical variables such as histology and biologically effective dose yields iGray, a patient-specific radiation dose that reduces the estimated risk of local failure to <5% 1 year after treatment. The range of iGray values, from 21.1 to 277 Gy, overlaps substantially with that of current dosing regimens, yet in several cases the iGray recommended dose was half or even double the dose delivered. This suggests that improvements to standard dosing are possible and that most iGray dose recommendations are clinically tractable.
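The two hazard ratios can be compared more directly by looking at the uncertainty behind each one. The sketch below assumes the reported intervals are symmetric on the log scale, which is the usual convention for hazard ratios but is not stated in the article; the function names are illustrative only.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_hr_standard_error(ci_lower: float, ci_upper: float) -> float:
    """Standard error of log(HR), assuming a log-symmetric 95% interval."""
    if ci_lower <= 0 or ci_upper <= ci_lower:
        raise ValueError("expected 0 < ci_lower < ci_upper")
    return (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * Z_95)


def ci_geometric_midpoint(ci_lower: float, ci_upper: float) -> float:
    """Midpoint of the interval on the log scale; it should sit close to the
    reported HR if the log-symmetry assumption holds."""
    return math.sqrt(ci_lower * ci_upper)


# Intervals as reported in the text above.
training_se = log_hr_standard_error(2.19, 6.05)
validation_se = log_hr_standard_error(1.77, 14.84)

training_mid = ci_geometric_midpoint(2.19, 6.05)
validation_mid = ci_geometric_midpoint(1.77, 14.84)
```

Under this assumption the midpoints land near the reported hazard ratios of 3.64 and 5.13, and the standard error on the log scale comes out at roughly 0.26 for the training cohort and 0.54 for the validation cohort. The larger point estimate in the validation cohort is therefore accompanied by about twice the uncertainty, as expected for 95 patients versus 849.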
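As a rough illustration of the quantities involved, the sketch below computes biologically effective dose (BED) with the standard linear-quadratic formula and then searches for the lowest BED at which a risk model falls to the 5% target. The alpha/beta ratio of 10 Gy is a conventional assumption for tumor tissue, the example fractionation schemes are illustrative inputs, and `toy_risk` is a purely hypothetical stand-in for a fitted model; none of these come from the study.

```python
import math
from typing import Callable, Optional


def biologically_effective_dose(
    dose_per_fraction_gy: float, n_fractions: int, alpha_beta_gy: float = 10.0
) -> float:
    """Linear-quadratic BED = n * d * (1 + d / (alpha/beta)), in Gy."""
    if dose_per_fraction_gy <= 0 or n_fractions <= 0 or alpha_beta_gy <= 0:
        raise ValueError("dose, fraction count, and alpha/beta must be positive")
    total_dose = n_fractions * dose_per_fraction_gy
    return total_dose * (1.0 + dose_per_fraction_gy / alpha_beta_gy)


def lowest_bed_meeting_target(
    risk_at_bed: Callable[[float], float],
    target_risk: float = 0.05,
    lo: float = 0.0,
    hi: float = 300.0,
    tol: float = 0.01,
) -> Optional[float]:
    """Bisection search for the smallest BED with predicted risk <= target_risk.

    Assumes risk_at_bed decreases monotonically as BED increases.
    Returns None when even the upper bound does not reach the target.
    """
    if risk_at_bed(hi) > target_risk:
        return None
    if risk_at_bed(lo) <= target_risk:
        return lo
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if risk_at_bed(mid) <= target_risk:
            hi = mid  # target met: try a lower dose
        else:
            lo = mid  # target missed: a higher dose is needed
    return hi


def toy_risk(bed_gy: float) -> float:
    """Hypothetical dose-response curve for illustration only (not a fitted model)."""
    return math.exp(-bed_gy / 50.0)


bed_50_in_5 = biologically_effective_dose(10.0, 5)   # 50 Gy in 5 fractions
bed_54_in_3 = biologically_effective_dose(18.0, 3)   # 54 Gy in 3 fractions
toy_target_bed = lowest_bed_meeting_target(toy_risk)
```

In a real setting the risk function would depend on the patient's image-derived score and clinical variables, so the dose that meets the target would differ from patient to patient, which is the sense in which the recommendation is individualized.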
Understanding Deep Profiler Function
Although AI models can perform clinically meaningful prediction tasks, they perform analyses in a “black box” and are limited by a lack of interpretability. In the study, investigators attempted to unmask the rationale driving Deep Profiler by using saliency maps, which are heat maps that reveal the relative importance of each CT image voxel to the network prediction. Although the majority of salient voxels were contained within the gross tumor volume, the rest were in the peritumoral region, including the planning target volume and beyond. The investigative group is considering how saliency mapping can be used to identify areas of marginal recurrence and improve the accuracy of tumor delineation through automatic segmentation. The group’s interest in interpretability goes beyond pedagogy: understanding the biological basis of the image-feature space and its relation to treatment failures could reduce false associations, enhance model accuracy across institutions, and provide a target discovery platform that leads not only to outcome predictions but also to a reduction in treatment failures.
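The article does not say which saliency method was used, so the following is only a sketch of the general idea: measure how much the model's score changes when each voxel is nudged, then ask what share of that sensitivity falls inside a region of interest. It uses a finite-difference approximation on a small 2D slice with a hypothetical scoring function; a real network would more typically use gradients obtained by backpropagation over a 3D volume.

```python
from typing import Callable, List

Image = List[List[float]]
Mask = List[List[bool]]


def finite_difference_saliency(
    score_fn: Callable[[Image], float], image: Image, step: float = 1e-3
) -> Image:
    """Approximate |d score / d voxel| by perturbing one voxel at a time."""
    base = score_fn(image)
    saliency: Image = []
    for r, row in enumerate(image):
        out_row = []
        for c in range(len(row)):
            perturbed = [list(values) for values in image]  # copy so the input is untouched
            perturbed[r][c] += step
            out_row.append(abs(score_fn(perturbed) - base) / step)
        saliency.append(out_row)
    return saliency


def fraction_inside_mask(saliency: Image, mask: Mask) -> float:
    """Share of total saliency that falls inside the masked region (eg, a tumor contour)."""
    total = sum(sum(row) for row in saliency)
    if total == 0:
        return 0.0
    inside = sum(
        value
        for s_row, m_row in zip(saliency, mask)
        for value, in_region in zip(s_row, m_row)
        if in_region
    )
    return inside / total


# Hypothetical stand-in for a trained network: a fixed weighted sum of voxels.
WEIGHTS = [[0.0, 1.0], [3.0, 0.0]]


def toy_score(image: Image) -> float:
    return sum(w * v for w_row, v_row in zip(WEIGHTS, image) for w, v in zip(w_row, v_row))


ct_slice = [[0.2, 0.5], [0.9, 0.1]]
tumor_mask = [[False, False], [True, False]]

saliency_map = finite_difference_saliency(toy_score, ct_slice)
share_in_tumor = fraction_inside_mask(saliency_map, tumor_mask)
```

For this toy scoring function the saliency map simply recovers the absolute weights, so three quarters of the sensitivity sits inside the masked voxel. The same summary applied to a real model is one way to quantify how much of the prediction is driven by the tumor itself versus the surrounding tissue.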
Figure.
Despite their potential impact on cancer treatment, AI models still face a number of challenges in clinical implementation. These include small sample sizes, limited interpretability, and regulatory approval. Even with data augmentation techniques, many medical data sets are considered “miniature data,” and sample size remains an issue for supervised learning: most clinical data sets are not large enough to allow for sufficient training and validation of a model. Improved natural language processing and computer vision could accelerate the generation of annotated large-scale clinical data sets. Until these pipelines are in place, an important goal is to facilitate the confederation of data sets via multi-institutional collaborations.
As with other AI models, Deep Profiler’s output is highly scalable: the addition of new data is expected to improve model predictions, an effect not seen with traditional clinical variables. Investigators at Cleveland Clinic are planning for prospective validation of the framework in a clinical trial. This study represents an innovation in its use of deformable radiomic features, multitask learning, and clinically implementable prediction of personalized radiotherapy dose.