Abstract:
The molecular diversity of breast cancer makes it impossible to identify prognostic markers that are applicable to all breast cancers. To overcome limitations of previous multigene prognostic classifiers, we propose a new dynamic predictor: instead of using a single universal training cohort and an identical list of informative genes to predict the prognosis of new cases, a case-specific predictor is developed for each test case. Gene expression data from 3,534 breast cancers with clinical annotation including relapse-free survival is analyzed. For each test case, we select a case-specific training subset including only molecularly similar cases and a case-specific predictor is generated. This method yields different training sets and different predictors for each new patient. The model performance was assessed in leave-one-out validation and also in 325 independent cases. Prognostic discrimination was high for all cases (n = 3,534, HR = 3.68, p = 1.67 E-56). The dynamic predictor showed higher overall accuracy (0.68) than genomic surrogates for Oncotype DX (0.64), Genomic Grade Index (0.61) or MammaPrint (0.47). The dynamic predictor was also effective in triple-negative cancers (n = 427, HR = 3.08, p = 0.0093) where the above classifiers all failed. Validation in independent patients yielded similar classification power (HR = 3.57). The dynamic classifier is available online at http://www.recurrenceonline.com/?q=Re_training. In summary, we developed a new method to make personalized prognostic prediction using case-specific training cohorts. The dynamic predictors outperform static models developed from single historical training cohorts and they also predict well in triple-negative cancers.