From OptiStatim, LLC (Dr Nathanson); and the Department of Medicine (Dr Higgins), Baystate Medical Center.
Correspondence to: Brian H. Nathanson, PhD, OptiStatim, LLC, PO Box 60844, Longmeadow, MA 01106; e-mail: firstname.lastname@example.org
Financial/nonfinancial disclosures: The authors have reported to CHEST the following conflicts of interest: Dr Nathanson’s company, OptiStatim, LLC, has an ongoing consulting agreement with the Cerner Corporation and participated in the development of the MPM-III model. Dr Higgins served previously as a consultant to Cerner and participated in the development of the MPM-III model. Dr Higgins also owns stock in Cerner.
Reproduction of this article is prohibited without written permission from the American College of Chest Physicians. See online for more details.
In the October 2012 issue of CHEST, Keegan et al1 compare three severity of illness scoring systems for critically ill patients (Acute Physiology and Chronic Health Evaluation [APACHE], Simplified Acute Physiology Score [SAPS], and Mortality Probability Model [MPM]) with and without a variable indicating the early presence of a do-not-resuscitate (DNR) order. They conclude that the model with the most variables (APACHE) had the best discriminatory ability, that adjusting for early DNR status did not improve model performance, and that all models were calibrated poorly. Although we agree with the authors’ first conclusion that more variables improve discrimination, we question the latter two.
This study implies that DNR status is not an important predictor of mortality in the ICU, even though it is significant in the MPM-III model. The authors saliently note that the SAPS 3 and APACHE IV models, which use more variables (20 and 142, respectively), may simply “capture” whatever predictive information DNR status conveys, making DNR status unnecessary. Unfortunately, the authors do not state whether DNR status was statistically significant when added to the SAPS 3 or APACHE IV models.
Discrimination, as measured by the C statistic (area under the receiver operating characteristic curve), indicates how well the model can separate survivors and nonsurvivors. If all nonsurvivors had higher predicted probabilities of dying than all survivors (regardless of the degree of predicted differences), then the C statistic would be 1, indicating perfect discrimination.2 Calibration, as measured by the Hosmer-Lemeshow statistic, indicates how well predicted probabilities of mortality agree with observed numbers of deaths across deciles of risk.3
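The two measures described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in any of the cited studies: the function names, the half-credit handling of tied predictions, and the toy data are assumptions made for the example.

```python
def c_statistic(probs, died):
    """Fraction of (nonsurvivor, survivor) pairs in which the nonsurvivor
    has the higher predicted risk; tied predictions count one half."""
    dead = [p for p, d in zip(probs, died) if d == 1]
    alive = [p for p, d in zip(probs, died) if d == 0]
    score = 0.0
    for p_dead in dead:
        for p_alive in alive:
            if p_dead > p_alive:
                score += 1.0
            elif p_dead == p_alive:
                score += 0.5
    return score / (len(dead) * len(alive))


def hosmer_lemeshow(probs, died, groups=10):
    """Hosmer-Lemeshow chi-square statistic over risk-ordered groups
    (deciles when groups=10). Assumes no group has a mean risk of 0 or 1."""
    order = sorted(range(len(probs)), key=lambda i: probs[i])
    n = len(order)
    stat = 0.0
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        if not idx:
            continue
        expected = sum(probs[i] for i in idx)   # expected deaths in the group
        observed = sum(died[i] for i in idx)    # observed deaths in the group
        mean_p = expected / len(idx)
        stat += (observed - expected) ** 2 / (len(idx) * mean_p * (1 - mean_p))
    return stat


# Toy data (hypothetical): predicted risks and outcomes (1 = died).
toy_probs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_died = [0, 0, 1, 0, 1, 0, 1, 1]
toy_c = c_statistic(toy_probs, toy_died)
print(toy_c)
```

In the toy data, 13 of the 16 nonsurvivor-survivor pairs are ordered correctly, so the C statistic is 13/16. Only the ordering of the predictions enters that calculation, whereas the Hosmer-Lemeshow statistic depends on the predicted probabilities themselves.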
Cook2 has shown that because the C statistic is based on ranks, it is less sensitive than likelihood measures of model fit. Consequently, a strong predictor often causes minimal change to the C statistic in an existing model but can substantially change the predicted probabilities of those with the predictor. Instead, a Bayesian Information Criterion analysis or a related technique would be better suited to determine whether DNR status should be included in a model.
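The rank-based nature of the C statistic can be illustrated with a small sketch. The two sets of predictions below are hypothetical and order the patients identically, so they share the same C statistic, yet their log-likelihoods differ. The helper names and toy numbers are assumptions for illustration, and eight observations are far too few to say anything about DNR status itself.

```python
import math


def c_statistic(probs, died):
    """Concordance over (nonsurvivor, survivor) pairs; ties count one half."""
    dead = [p for p, d in zip(probs, died) if d == 1]
    alive = [p for p, d in zip(probs, died) if d == 0]
    score = sum(1.0 if pd > pa else 0.5 if pd == pa else 0.0
                for pd in dead for pa in alive)
    return score / (len(dead) * len(alive))


def log_likelihood(probs, died):
    """Bernoulli log-likelihood of the observed outcomes."""
    return sum(math.log(p) if d == 1 else math.log(1.0 - p)
               for p, d in zip(probs, died))


def bic(log_lik, n_params, n_obs):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L); lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_lik


died = [0, 0, 1, 0, 1, 0, 1, 1]
# Hypothetical predictions from two models; same ordering of patients,
# but the second set spreads the risks further apart.
probs_narrow = [0.30, 0.35, 0.40, 0.45, 0.55, 0.60, 0.65, 0.70]
probs_wide = [0.10, 0.20, 0.30, 0.40, 0.60, 0.70, 0.80, 0.90]

c_narrow = c_statistic(probs_narrow, died)
c_wide = c_statistic(probs_wide, died)
ll_narrow = log_likelihood(probs_narrow, died)
ll_wide = log_likelihood(probs_wide, died)

print(c_narrow == c_wide)   # identical ranks give identical C statistics
print(ll_narrow, ll_wide)   # the likelihood still tells the models apart
```

A likelihood-based criterion such as `bic` would be applied to each fitted model with its own parameter count, which is why it can register a change that the C statistic misses.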
The Hosmer-Lemeshow statistic, like any inference test, is more likely to be statistically significant when the sample size is large. A significant Hosmer-Lemeshow test does not necessarily mean that a predictive model is suspect.4 Adjunct measures of model calibration, such as a calibration plot for each model, can be helpful when sample size is large.4
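The sample-size dependence can be made concrete with a short sketch. The risk groups below are hypothetical, and the function names are assumptions for the example. Observed and predicted death rates are held fixed while the number of patients is multiplied by 100.

```python
def hl_statistic(groups):
    """Hosmer-Lemeshow style chi-square sum over risk groups.
    Each group is (n_patients, mean_predicted_risk, observed_deaths)."""
    stat = 0.0
    for n, p, observed in groups:
        expected = n * p
        stat += (observed - expected) ** 2 / (n * p * (1.0 - p))
    return stat


def scale_groups(groups, factor):
    """Same observed and predicted rates, 'factor' times as many patients."""
    return [(n * factor, p, observed * factor) for n, p, observed in groups]


# Hypothetical groups: observed rates of 11%, 32%, and 58% against
# predicted risks of 10%, 30%, and 60%.
small_sample = [(100, 0.10, 11), (100, 0.30, 32), (100, 0.60, 58)]
large_sample = scale_groups(small_sample, 100)

hl_small = hl_statistic(small_sample)
hl_large = hl_statistic(large_sample)
print(hl_small, hl_large)
```

Because each term reduces to n(q - p)^2 / (p(1 - p)), where q is the observed rate, the statistic grows in direct proportion to the number of patients even though the gap between observed and predicted rates is unchanged. A calibration plot of q against p would look the same for both samples.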
Finally, Cook2 has shown that there is a mathematical trade-off between good calibration and good discrimination, with well-calibrated models unable to achieve very high C statistics. Keegan et al’s1 study found that the MPM-III model had better calibration but worse discrimination than the APACHE or SAPS models. This does not suggest that one model is “better” than the others, but that each model has strengths and weaknesses and a more nuanced interpretation is required. Head-to-head comparisons of the three models are rare, and this article does fill a gap in the literature. However, more statistical analyses need to be done before one can definitively say that DNR status is not an important predictor of mortality or that these models are poorly calibrated.