There may be several ways to handle missing patient data that a clinical prediction model requires. How well the missing data is handled will determine how accurate the predicted estimate will be.


The best way to handle missing data is to avoid it, by making sure that all of the required data is collected.


Some options for handling missing data include:

(1) If the patient is available and the parameter is stable for the patient (for example, height), then measure the parameter now.

(2) If the probable value is within a limited range, then run the model with values within that range. The predicted values may be close enough that a precise value is not needed.

(3) Select another model that runs with the data that you do have.
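
Option (2) can be sketched in a few lines of Python. This is a minimal illustration only: the logistic model, its coefficients, the predictor names and the helper functions predicted_risk and risk_range are all hypothetical and do not come from any published model. The sketch runs the model at several values across the plausible range of the missing predictor and reports how far apart the lowest and highest predictions are.

```python
import math

# Hypothetical coefficients for illustration only; not from any published model.
INTERCEPT = -5.0
COEFFS = {"age_years": 0.04, "systolic_bp": 0.01, "bmi": 0.05}


def predicted_risk(values):
    """Logistic model: risk = 1 / (1 + exp(-(intercept + sum(coef * value))))."""
    linear = INTERCEPT + sum(COEFFS[name] * values[name] for name in COEFFS)
    return 1.0 / (1.0 + math.exp(-linear))


def risk_range(known, missing_name, low, high, steps=5):
    """Run the model with the missing predictor set to evenly spaced values in [low, high]."""
    risks = []
    for i in range(steps):
        candidate = low + (high - low) * i / (steps - 1)
        risks.append(predicted_risk({**known, missing_name: candidate}))
    return min(risks), max(risks)


# Assumed scenario: BMI is missing but is judged to lie between 22 and 30.
known = {"age_years": 60, "systolic_bp": 140}
lowest, highest = risk_range(known, "bmi", 22.0, 30.0)
print(f"risk between {lowest:.3f} and {highest:.3f} (spread {highest - lowest:.3f})")
```

If the spread is small enough that every value in the range leads to the same clinical decision, a precise value may not be needed. If the spread crosses a decision threshold, the parameter should be measured or another option chosen.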


Janssen et al reported 6 statistical methods for handling missing data. Most of the methods required access to the derivation data and/or the ability to recalculate the regression coefficients. The "multiple imputation" method (several imputed values are generated for each missing value) provided a result closest to the true value.
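
The general idea behind multiple imputation can be sketched as follows. This is a simplified illustration and not the procedure used by Janssen et al: the derivation data, the model coefficients and the function names (fit_line, model_risk, pooled_risk) are all hypothetical. It assumes access to derivation data, imputes the missing predictor several times from a simple regression on an observed predictor plus random noise, runs the model on each imputed value, and averages the predictions. A full multiple imputation would also reflect uncertainty in the imputation model itself, which this sketch omits.

```python
import math
import random
import statistics

# Hypothetical derivation data: (age_years, bmi) pairs. Illustrative values only.
DERIVATION = [(40, 24.0), (50, 26.5), (55, 25.0), (60, 28.0), (65, 27.5), (70, 29.0)]


def fit_line(pairs):
    """Least-squares fit of bmi on age; returns slope, intercept, residual SD."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in pairs]
    resid_sd = math.sqrt(sum(r * r for r in residuals) / (len(pairs) - 2))
    return slope, intercept, resid_sd


def model_risk(age_years, bmi):
    """Hypothetical logistic prediction model (made-up coefficients)."""
    linear = -4.0 + 0.04 * age_years + 0.05 * bmi
    return 1.0 / (1.0 + math.exp(-linear))


def pooled_risk(age_years, n_imputations=20, seed=0):
    """Impute bmi several times, predict with each draw, and average the predictions."""
    rng = random.Random(seed)
    slope, intercept, resid_sd = fit_line(DERIVATION)
    risks = []
    for _ in range(n_imputations):
        # Each draw is the regression prediction plus random residual noise.
        imputed_bmi = rng.gauss(intercept + slope * age_years, resid_sd)
        risks.append(model_risk(age_years, imputed_bmi))
    return statistics.mean(risks)


# Assumed scenario: a 60-year-old patient whose BMI was not recorded.
print(f"pooled risk: {pooled_risk(60):.3f}")
```

Averaging the predicted probabilities is one pooling choice; averaging on the linear predictor scale before converting to a probability is another, and the two can differ slightly.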


The more data that is missing, the less likely it is that the estimate will be appropriate.

