Y. Liu1, T. A. Lin2, A. Koong3, C. Lin3, J. A. Jaoude4, R. R. Patel4, R. Kouzy5, M. B. El Alam3, T. Meirson6, and E. B. Ludmir7; 1Department of Radiation Oncology, City of Hope National Medical Center, Duarte, CA, 2The University of Texas MD Anderson Cancer Center, Houston, TX, 3Department of Radiation Oncology, University of Texas MD Anderson Cancer Center, Houston, TX, 4Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, 5MD Anderson Cancer Center, Houston, TX, 6Davidoff Cancer Center, Petah-Tikva, Israel, 7Department of Gastrointestinal Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX
Purpose/Objective(s): The standard of care for oncologic treatment is often established and refined through randomized phase III clinical trials. While the success of a trial is generally judged by whether it reaches statistical significance, there are few metrics to measure the robustness of a trial’s outcomes. Using the survival-inferred fragility index (SIFI), we assessed the fragility of oncology clinical trials.
Materials/Methods: We established a database of 332 phase III oncology trials from 2002 to 2020 for which Kaplan-Meier curves were digitally reconstructed to reproduce survival time and censoring at the individual patient level. SIFI values were established by counting the number of patients who needed to be iteratively flipped between treatment arms to cause a positive trial to lose statistical significance or a negative trial to gain statistical significance. This number was then normalized as a percentage of the total trial enrollment. We calculated SIFI by flipping the best (longest) survivor from the intervention arm (SIFI_B), the worst (shortest) survivor from the control arm (SIFI_W), and the median survivor in the control arm (SIFI_M). Trials were classified based on whether they were testing immune-checkpoint inhibitors (ICIs), targeted therapies, or other agents. The Mann-Whitney U test was used for pairwise comparisons and the Kruskal-Wallis test was used for comparisons of more than 2 groups. Statistical analyses were performed in R.
Results: Of the 332 trials assessed, 196 were positive and 136 were negative. There were 190 targeted therapy trials, 52 ICI trials, and 90 trials assessing other agents. The most common primary endpoint was progression-free survival (PFS) (154 trials), followed by overall survival (OS) (136 trials). For all trials, the median SIFI_B was 1.37%, SIFI_W was 1.71%, and SIFI_M was 2.74%. The PACIFIC trial, a practice-changing ICI trial, had a SIFI_B of 1.96%, SIFI_W of 1.40%, and SIFI_M of 1.82%. Targeted therapy trials were significantly more robust than ICI trials, with SIFI_W of 1.93% vs. 1.28% (p=0.037) and SIFI_M of 3.14% vs. 1.98% (p=0.012). Trials with OS as the primary endpoint were significantly more fragile than trials with PFS as the primary endpoint (SIFI_B 0.90% vs. 1.78%, p<0.001; SIFI_W 1.33% vs. 2.96%, p<0.001; SIFI_M 2.13% vs. 4.58%, p<0.001). Lastly, trials with negative outcomes were significantly more fragile than trials with positive outcomes (SIFI_B 1.08% vs. 1.68%, p<0.001; SIFI_W 1.36% vs. 2.32%, p<0.001; SIFI_M 2.21% vs. 3.70%, p<0.001).
Conclusion: We found that targeted therapy trials, trials with PFS as the primary endpoint, and positive trials tend to be the most robust. Further work is needed to validate the use of SIFI in designing more robust clinical trials.
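The flipping procedure described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' R implementation. It covers only the SIFI_B case for a positive trial, and it assumes that significance is judged by a two-sided log-rank test at a 0.05 threshold, which the abstract does not specify. The function names `logrank_p` and `sifi_best`, the tie-breaking rule, and the toy data are all hypothetical and chosen for illustration.

```python
import math


def logrank_p(arm_a, arm_b):
    """Two-sided log-rank p-value for two arms.

    Each arm is a list of (time, event) pairs: event=1 is an observed
    event, event=0 is a censored observation.
    """
    event_times = sorted({t for t, e in arm_a + arm_b if e == 1})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for ti, _ in arm_a if ti >= t)  # at risk in arm A
        n_b = sum(1 for ti, _ in arm_b if ti >= t)  # at risk in arm B
        d_a = sum(1 for ti, e in arm_a if ti == t and e == 1)
        d_b = sum(1 for ti, e in arm_b if ti == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def sifi_best(intervention, control, alpha=0.05):
    """Flip the longest survivor from intervention to control until p >= alpha.

    Returns (number of flips, flips as a percentage of total enrollment).
    Assumption: ties on survival time are broken by list order.
    """
    intervention, control = list(intervention), list(control)
    total = len(intervention) + len(control)
    flips = 0
    while intervention and logrank_p(intervention, control) < alpha:
        best = max(intervention, key=lambda patient: patient[0])
        intervention.remove(best)
        control.append(best)
        flips += 1
    return flips, 100.0 * flips / total


# Toy data (invented): every control patient has an event before any
# intervention patient does, so the starting trial is clearly positive.
control = [(t, 1) for t in range(1, 11)]
intervention = [(t, 1) for t in range(11, 21)]

flips, percent = sifi_best(intervention, control)
print(f"SIFI_B: {flips} flips ({percent:.1f}% of enrollment)")
```

SIFI_W and SIFI_M would follow the same loop with a different rule for selecting the patient to flip, and a negative trial would iterate until the p-value drops below the threshold instead of rising above it.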