Oncology Live®
Krishnansu S. Tewari, MD, and Kimberly Futch, MBA, detail considerations with AI programs and the roles AI may have in early diagnosis, reducing the time radiologists spend examining scans, and more.
The most prominent areas of cancer care where artificial intelligence (AI) programs may have roles are early diagnosis, time reduction for radiologists in the examination of scans, patient recruitment for clinical trials, data management, and determining response on imaging, according to Krishnansu S. Tewari, MD, and Kimberly Futch, MBA. However, limitations such as data privacy, system reliability, and cost remain hurdles for AI.
“Community oncologists should understand that the applications of AI in oncology are going to be very broad,” Tewari said in an interview with OncologyLive. “There are going to be applications in clinical trial design, treatment response, triaging, imaging, pathology, genomics, transcriptomics, proteomics, [and] metabolomics. The applications are going to be remarkable.”
Like many AI programs, large language models (LLMs) have broad potential uses that range from aiding in administrative tasks to decision-making. Findings from a cross-sectional study on the performance of LLMs on medical oncology examination questions revealed that proprietary LLM 2 correctly answered 125 of 147 questions for an accuracy rate of 85.0% (95% CI, 78.2%-90.4%; P < .001 vs random answering). However, additional data showed that if acted upon in clinical practice, 18 of the 22 incorrect answers (81.8%; 95% CI, 59.7%-94.8%) would have a medium or high likelihood of causing moderate to severe harm.1
“There are some limitations [with AI], such as the reliability of the system,” Futch, director of Clinical Operations Strategy at ProPharma, said in an interview with OncologyLive. When addressing how AI can comb through records to gather information, she added that “you have the accuracy there, and it may be accurate for a million records, but is it accurate for tens of millions of records?”
Data published in NEJM AI examined 5 publicly available LLMs on 2044 oncology examination questions and found significant heterogeneity in performance between models (analysis of variance, P < .001). However, investigators noted that relative to a human benchmark based on 2013 and 2014 examination results, the GPT-4 model was the only one that performed above the 50th percentile.2
Additionally, a combination of model selection, prompt repetition, and confidence self-appraisal made it possible to identify high-performing subgroups of questions. The Claude-v1 model was 81.7% accurate, and the GPT-4 model was 81.1% accurate.
Tewari, who is director of the Division of Gynecologic Oncology at the University of California, Irvine School of Medicine, added, “Think of AI as software. When we’re thinking of robots taking over the world, the robots are just the hardware. AI is the software.” Tewari is also a professor and the Philip J. DiSaia, MD, Chair in Gynecologic Oncology in the Department of Obstetrics & Gynecology.
“[AI] enhances decision-making by supplementing human logic into how we can look at vast amounts of data and helps us to identify patterns [as well as] insights that we may overlook from just the human mind,” Futch said. “When you use AI to process large data sets, it may help us to identify the most effective treatment options that are tailored to an individual’s specific cancer profile [by] looking at what their gene sequence is, mutations are, [etc].”
Recent data from a study including 42 patients with PD-L1–negative non–small cell lung cancer (NSCLC) and 94 patients with PD-L1–positive NSCLC showed that a PET/CT-based deep learning radiomics model could predict PD-L1 expression accurately in patients with NSCLC. Findings from the validation data set demonstrated that the area under the curve of the fusion model (0.910; 95% CI, 0.779-0.977) was higher than that of the radiomics model (0.785; 95% CI, 0.628-0.897) and the deep learning model (0.867; 95% CI, 0.724-0.952).3
“AI can be very helpful in determining the objective response rate in serial imaging studies for patients in clinical trials with novel drugs. It can also look for patterns in response to drugs and assist in statistical analyses in clinical trials,” Tewari said. “AI has the ability to triage abnormal scans, pathology, and drug response faster than humans, but at this point, human intervention and oversight are critical.”
“Probably the most powerful [current application of AI is for] early diagnosis,” Futch said. “In cancer care, we [often] have a group of radiologists look at [images and scans] because we want a consensus of what’s being seen. If we use some of these AI algorithms that are trained on millions of medical images to assist in these [situations], it provides a supplemental accuracy that can lead to earlier, more precise diagnoses.”
A deep-learning model named Sybil, which was developed using low-dose computed tomography (LDCT) scans from the National Lung Screening Trial (NLST) and can run in real time in the background on a radiology reading station, accurately predicted future lung cancer risk from a single LDCT scan. The model does not require clinical data or radiologist annotations.4
Sybil was evaluated in 3 independent data sets featuring LDCT images from NLST participants (n = 6282), LDCT images from Massachusetts General Hospital (n = 8821), and LDCT images from Chang Gung Memorial Hospital (n = 12,280). The AI model achieved areas under the receiver operating characteristic curve for lung cancer prediction at 1 year of 0.92 (95% CI, 0.88-0.95), 0.86 (95% CI, 0.82-0.90), and 0.94 (95% CI, 0.91-1.00), respectively.
“[When] combining [AI] with the human mind and radiologists looking at [scans], it’s highlighting areas of concern that help [radiologists] focus on more critical findings,” Futch said.
Furthermore, data from a recent study found there was substantial heterogeneity in the treatment effects of AI assistance among 140 radiologists who used AI across 15 chest x-ray diagnostic tasks. Investigators measured AI’s treatment effect as the improvement in absolute error across all pathologies, observing a range of treatment effects from –1.295 to 1.440 (IQR, 0.797).5
“Because AI algorithms look at large volumes of data, they may have the potential to pull that entire hospital’s medical record and say, ‘How can we find patients who have these key criteria?’ It’s almost an automated system that [can] efficiently and [possibly] more accurately identify potential patients,” Futch noted on the potential application of AI in clinical trial recruitment.
Tewari added that “before we roll AI out into clinical trial designs, [an area] where I do believe it will have an impact, we need to do so cautiously. AI was developed for applications outside of the world of medicine and was not designed for medicine.”
Although programs are being examined in the design of clinical trials, certain AI platforms are already under evaluation in clinical trials. Two ongoing studies with the CURATE.AI platform are examining optimizing immunotherapy doses for patients with solid tumors and chemotherapy doses for those with multiple myeloma. The former phase 1/2 trial (NCT05175235) is examining dose optimization with nivolumab (Opdivo) and pembrolizumab (Keytruda) with CURATE.AI and sequential circulating tumor DNA measurements in patients with solid tumors.6 The latter phase 2/3 trial (NCT03759093) is examining the same program for optimized dosing with bortezomib (Velcade), thalidomide (Thalomid), cyclophosphamide, and lenalidomide (Revlimid) in multiple myeloma.7
“Making sure that you’re looking at ethical considerations around data privacy [is important]. Patients are rightfully concerned about that, and physicians are very concerned about making sure that data are anonymized and you can separate them from those patients and decentralize data,” Futch said. “You need to have a very closely validated system and a clean way of ensuring that everything has been completely anonymized. That’s a limitation that we’re overcoming right now, but it does still exist.”
Tewari added that “the most important shortcoming to be aware of with AI in oncology, particularly in clinical trials and drug development, is that AI has been loaded with illness scripts but does not understand pathophysiology. Therefore, human intervention and oversight are essential.”
As AI applications and research venture into unknown territory, biases, inconsistencies with programs, and costs are limitations that researchers will try to overcome as they continue to develop the software.8
Futch noted that “this is all supplemental to what we have. It builds a stronger foundation for making good decisions of how to treat patients most effectively.”