PURPOSE: To develop a reliable and valid statistical regression model that accurately identifies COPD patients with mild disease severity and can be reproduced in administrative databases.
METHODS: A cohort of patients with obstruction demonstrated on pulmonary function tests (PFT) (FEV1/FVC <70% predicted) was selected from the General Electric Healthcare Database, which contains electronic medical records of patients treated at participating physician practices. Patients were stratified into mild COPD disease severity (FEV1>80%) and non-mild COPD disease severity (FEV1<80%). The 50 most frequent diagnoses, prescribed medications, and procedures recorded in the six months before the mild PFT result were tabulated for mild patients and compared with their frequencies in non-mild COPD patients using chi-square tests. Variables that differed significantly were then entered into logistic regression models to predict the likelihood of being mild. The final model was the one that maximized the c-statistic together with the sensitivity and specificity obtained at an optimal predicted probability threshold generated from the model. The model was then validated using a large managed care database.
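The screening and evaluation steps described in METHODS can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the chi-square test is the uncorrected Pearson test on a 2x2 table, the predicted probabilities are assumed to come from an already fitted logistic regression, and Youden's J (sensitivity + specificity - 1) stands in for the unspecified rule used to pick the optimal probability threshold.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table
    [[a, b], [c, d]]: rows = mild / non-mild, columns = code present / absent."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = erfc(sqrt(stat / 2))  # chi-square survival function for 1 df
    return stat, p_value


def c_statistic(y_true, y_prob):
    """Share of (mild, non-mild) pairs in which the mild patient has the
    higher predicted probability; ties count as one half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    score = sum(1.0 if p > q else 0.5 if p == q else 0.0
                for p in pos for q in neg)
    return score / (len(pos) * len(neg))


def sensitivity_specificity(y_true, y_prob, threshold):
    """Classify as mild when the predicted probability is >= threshold."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < threshold)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def best_threshold(y_true, y_prob):
    """Assumption: choose the observed probability that maximizes Youden's J."""
    def youden(t):
        sens, spec = sensitivity_specificity(y_true, y_prob, t)
        return sens + spec - 1
    return max(sorted(set(y_prob)), key=youden)
```

In this sketch, a candidate diagnosis, medication, or procedure would be carried into the regression only if `chi_square_2x2` returns a p-value below a chosen significance level; the abstract does not state the level used.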
RESULTS: A total of 768 patients were identified as having COPD from their PFT, of whom 116 were mild (FEV1>80%). The final logistic regression model included variables readily available in claims databases: demographic variables, history of diseases, any prescribed medications, and procedures ordered. Model variables included sex, age, chest x-ray, ischemia, respiratory symptoms, antimuscarinics, sympathomimetics, antitussives, adrenals, antineoplastic agents, vitamin B complex, smooth muscle relaxants, antifungals, macrolides, replacements, cephalosporins, tetracyclines, and short-acting beta agonists. The final model had a c-statistic of 0.63; at the predicted probability threshold of 0.16, sensitivity was 45.7% and specificity was 58%.
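To make the reported operating point concrete, the arithmetic below back-calculates an approximate confusion matrix from the figures in RESULTS (768 patients, 116 mild, sensitivity 45.7%, specificity 58%). The counts are rounded estimates implied by those percentages, not values reported in the abstract, and the function name `implied_counts` is hypothetical.

```python
def implied_counts(n_total, n_mild, sensitivity, specificity):
    """Approximate confusion-matrix cells implied by reported rates.
    Rounding means these are estimates, not observed counts."""
    n_non_mild = n_total - n_mild
    tp = round(sensitivity * n_mild)      # mild patients flagged as mild
    tn = round(specificity * n_non_mild)  # non-mild patients not flagged
    return {"tp": tp, "fn": n_mild - tp, "tn": tn, "fp": n_non_mild - tn}


counts = implied_counts(768, 116, 0.457, 0.58)

# Predictive values follow from the implied cells.
ppv = counts["tp"] / (counts["tp"] + counts["fp"])
npv = counts["tn"] / (counts["tn"] + counts["fn"])

print(counts)
print(f"PPV ~ {ppv:.3f}, NPV ~ {npv:.3f}")
```

Under these assumptions, roughly 53 of the 116 mild patients would be flagged at the 0.16 threshold, alongside about 274 of the 652 non-mild patients.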
CONCLUSION: Initial development of this statistical model suggests that it may be a reliable and valid method of identifying COPD patients at early stages of disease. Its reproducibility is currently being tested in a separate claims database to provide further evidence of its accuracy.
CLINICAL IMPLICATIONS: Early intervention and management of mild COPD patients may affect disease progression.
DISCLOSURE: Anand Dalal: Employee. Anand Dalal is currently employed by GlaxoSmithKline (GSK). Chris Blanchette: Consultant fee, speaker bureau, advisory committee, etc. Chris Blanchette was a consultant at LRRI who received funding from GSK to complete this analysis. Analysis was done independently and without any direction from GSK. H Petersen was also a consultant at LRRI. No Product/Research Disclosure Information.