Machine Learning Model Provides Rapid Prediction of Clostridium difficile Risk

Hospital-specific models allow for earlier and more accurate identification of high-risk patients to better target infection prevention strategies.

Erica Shenoy, MD, PhD, MGH Division of Infectious Diseases, assistant professor, Harvard Medical School

According to a recently published paper, researchers from Massachusetts General Hospital (MGH), the University of Michigan (U-M) and Massachusetts Institute of Technology (MIT) developed investigational machine learning models tailored to individual institutions to predict a patient’s risk of developing Clostridium difficile infection (CDI) earlier than current methods.

Most prior work on learning risk-stratification models for CDI has taken a one-size-fits-all approach limited to a small number of risk factors. Past research, however, has shown that a model leveraging the entire structured contents of the electronic health record (EHR) can perform statistically better than models based on a limited set of factors.

According to the researchers, there’s considerable evidence that hospital-specific factors play a role in predicting patient risk.

“Despite substantial efforts to prevent CDI and to institute early treatment upon diagnosis, rates of infection continue to increase,” co-senior author of the study, Erica Shenoy, MD, PhD, MGH Division of Infectious Diseases, assistant professor, Harvard Medical School, said in a statement. “We need better tools to identify the highest risk patients so that we can target both prevention and treatment interventions to reduce further transmission and improve patient outcomes.”

In the paper, researchers presented a generalizable machine-learning approach to using the structured data in an EHR to build a CDI risk-stratification model tailored to a specific facility.

The researchers used the approach to derive separate facility-specific risk-stratification models for CDI from EHR data collected during the regular course of patient care at 2 different hospitals.

“We report on the successful application of this approach to different patient populations, different facilities and different EHRs, and we show that it can be used to produce models that predict CDI several days in advance of clinical diagnosis,” study authors wrote. “The approach can be used at other institutions to create facility-specific predictive models that could be utilized prospectively to provide daily, automated, risk prediction for CDI, to target both clinical and infection prevention interventions more effectively.”

The study cohort included EHR data from 191,014 adult admissions to U-M and 65,718 adult admissions to MGH. Researchers extracted patient demographics, admission details, patient history and daily hospitalization details, which resulted in 4836 features from patients at U-M and 1837 from patients at MGH.

On the test data, the models achieved area under the receiver operating characteristic curve (AUROC) values of 0.82 (95% confidence interval [CI], 0.80–0.84) and 0.75 (95% CI, 0.73–0.78) for U-M and MGH, respectively. There were some similarities in the predictive factors between the 2 models, but many top factors differed.
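For readers unfamiliar with the metric, AUROC is the probability that a randomly chosen patient who developed CDI received a higher risk score than a randomly chosen patient who did not. The minimal Python sketch below computes it on made-up toy data and attaches a percentile-bootstrap confidence interval. The article does not say how the study's intervals were calculated, so the bootstrap here is an assumption, and the function names `auroc` and `bootstrap_ci` are illustrative only.

```python
import random


def auroc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC (an assumed method, not the study's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # keep only resamples that contain both classes
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data, not from the study: 1 = later diagnosed with CDI, 0 = not diagnosed.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))  # 0.75
print(bootstrap_ci(toy_labels, toy_scores, n_boot=200))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported 0.82 and 0.75 should be read.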

Overall, the models were highly successful at predicting patients who would ultimately be diagnosed with the infection.

According to the study, in half of those infected, accurate predictions could have been made at least 5 days before diagnostic samples were collected, which would allow the highest-risk patients to be the focus of targeted antimicrobial interventions.
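The "at least 5 days in half of those infected" figure is a statement about median lead time. As a rough illustration, the Python sketch below measures lead time as the number of days between the first daily risk score that crosses a threshold and the day the diagnostic sample was collected. The threshold value, the toy scores and the function name `lead_time_days` are all hypothetical, and the study's own definition of an accurate early prediction may differ.

```python
from statistics import median


def lead_time_days(daily_risk, sample_day, threshold):
    """Days from the first above-threshold score to sample collection.

    daily_risk is a list of one score per hospital day (day 0 = admission).
    Returns None if no score before sample_day reaches the threshold.
    """
    for day, risk in enumerate(daily_risk[:sample_day]):
        if risk >= threshold:
            return sample_day - day
    return None


# Toy admissions, not from the study: (daily risk scores, day the sample was collected).
toy_admissions = [
    ([0.1, 0.6, 0.7, 0.8, 0.9, 0.9, 0.95], 6),  # flagged on day 1
    ([0.2, 0.3, 0.7], 2),                       # never flagged before sampling
    ([0.55, 0.4, 0.6, 0.7], 3),                 # flagged on day 0
]
THRESHOLD = 0.5  # hypothetical cut-off

lead_times = [lead_time_days(risk, day, THRESHOLD) for risk, day in toy_admissions]
flagged = [t for t in lead_times if t is not None]
print(lead_times)       # [5, None, 3]
print(median(flagged))  # 4.0
```

In practice the threshold trades earlier warnings against more false alarms, so any facility applying a model like this would need to choose it against its own data.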

If validated in prospective studies, the risk prediction score may guide early screening for infection.

For those diagnosed earlier in the course of the disease, initiation of treatment could limit the severity of the illness, and patients with a confirmed infection could be isolated and contact precautions instituted to prevent transmission to other patients.

The ability to identify those patients at greatest risk allows clinicians to focus prevention methods on those who would gain maximum benefit.

Shenoy adds that facilities exploring the application of similar algorithms to their own institutions will need to assemble the appropriate local subject-matter experts and validate the performance of the models in their own institutions.

The study, "A Generalizable, Data-Driven Approach to Predict Daily Risk of Clostridium difficile Infection at Two Large Academic Health Centers," was published in Infection Control and Hospital Epidemiology.