TY - JOUR
T1 - Calibrating Parameters for Microsimulation Disease Models: A Review and Comparison of Different Goodness-of-Fit Criteria
AU - van der Steen, Alexander
AU - van Rosmalen, Joost
AU - Kroep, Sonja
AU - van Hees, Frank
AU - Steyerberg, Ewout
AU - de Koning, Harry
AU - van Ballegooijen, Marjolein
AU - Lansdorp-Vogelaar, Iris
PY - 2016
Y1 - 2016
N2 - Background. Calibration (estimation of model parameters) compares model outcomes with observed outcomes and explores possible model parameter values to identify the set of values that provides the best fit to the data. The goodness-of-fit (GOF) criterion quantifies the difference between model and observed outcomes. There is no consensus on the most appropriate GOF criterion, because a direct performance comparison of GOF criteria in model calibration is lacking. Methods. We systematically compared the performance of commonly used GOF criteria (sum of squared errors [SSE], Pearson chi-square, and a likelihood-based approach [Poisson and/or binomial deviance functions]) in the calibration of selected parameters of the MISCAN-Colon microsimulation model for colorectal cancer. The performance of each GOF criterion was assessed by comparing the 1) root mean squared prediction error (RMSPE) of the selected parameters, 2) computation time of the calibration procedure of various calibration scenarios, and 3) impact on estimated cost-effectiveness ratios. Results. The likelihood-based deviance resulted in the lowest RMSPE in 4 of 6 calibration scenarios and was close to best in the other 2. The SSE had a 25 times higher RMSPE in a scenario with considerable differences in the values of observed outcomes, whereas the Pearson chi-square had a 60 times higher RMSPE in a scenario with multiple studies measuring the same outcome. In all scenarios, the SSE required the most computation time. The likelihood-based approach estimated the cost-effectiveness ratio most accurately (up to -0.15% relative difference versus 0.44% [SSE] and 13% [Pearson chi-square]). Conclusions. The likelihood-based deviance criteria lead to accurate estimation of parameters under various circumstances. These criteria are recommended for calibration in microsimulation disease models in contrast with other commonly used criteria.
AB - Background. Calibration (estimation of model parameters) compares model outcomes with observed outcomes and explores possible model parameter values to identify the set of values that provides the best fit to the data. The goodness-of-fit (GOF) criterion quantifies the difference between model and observed outcomes. There is no consensus on the most appropriate GOF criterion, because a direct performance comparison of GOF criteria in model calibration is lacking. Methods. We systematically compared the performance of commonly used GOF criteria (sum of squared errors [SSE], Pearson chi-square, and a likelihood-based approach [Poisson and/or binomial deviance functions]) in the calibration of selected parameters of the MISCAN-Colon microsimulation model for colorectal cancer. The performance of each GOF criterion was assessed by comparing the 1) root mean squared prediction error (RMSPE) of the selected parameters, 2) computation time of the calibration procedure of various calibration scenarios, and 3) impact on estimated cost-effectiveness ratios. Results. The likelihood-based deviance resulted in the lowest RMSPE in 4 of 6 calibration scenarios and was close to best in the other 2. The SSE had a 25 times higher RMSPE in a scenario with considerable differences in the values of observed outcomes, whereas the Pearson chi-square had a 60 times higher RMSPE in a scenario with multiple studies measuring the same outcome. In all scenarios, the SSE required the most computation time. The likelihood-based approach estimated the cost-effectiveness ratio most accurately (up to -0.15% relative difference versus 0.44% [SSE] and 13% [Pearson chi-square]). Conclusions. The likelihood-based deviance criteria lead to accurate estimation of parameters under various circumstances. These criteria are recommended for calibration in microsimulation disease models in contrast with other commonly used criteria.
U2 - 10.1177/0272989X16636851
DO - 10.1177/0272989X16636851
M3 - Article
C2 - 26957567
SN - 0272-989X
VL - 36
SP - 652
EP - 665
JO - Medical Decision Making
JF - Medical Decision Making
IS - 5
ER -