Machine learning models improve accuracy in detecting MASH with F2-F3 fibrosis, surpassing noninvasive tests in sensitivity, specificity, and predictive value.
New data suggest artificial intelligence, specifically metabolomics-based machine learning models, could substantially improve screening to identify patients with metabolic dysfunction-associated steatohepatitis (MASH) who would be candidates for resmetirom (Rezdiffra).
Presented at the 22nd Annual World Congress on Insulin Resistance, Diabetes & Cardiovascular Disease (WCIRDC), the study demonstrates the limitations of current noninvasive tests for detecting noncirrhotic MASH with moderate-to-severe fibrosis and highlights the potential of machine learning models to improve on their accuracy, sensitivity, and predictive value.
“We present the first assessment of [noninvasive tests] in identifying MASH with fibrosis F2 to F3,” wrote investigators. “We further develop and demonstrate that novel [machine-learning]-based [noninvasive tests] can accurately determine treatment eligibility alone or in combination with other [noninvasive tests].”
As the hepatology community prepares to enter 2025, the field finds itself in an unprecedented position following the historic FDA approval of resmetirom as the first therapy indicated for noncirrhotic nonalcoholic steatohepatitis (NASH) with moderate to advanced fibrosis. The approval, awarded to Madrigal Pharmaceuticals, was based on a clinical development program that included 12 phase 1 studies, 2 phase 2 studies, and 4 phase 3 studies.
After decades without effective options for the disease, now known as MASH, specialists have begun to discuss the urgent need for improved screening efforts and methodologies.
The current study was led by Christos Mantzoros, MD, DSc, endocrinologist and principal investigator of the Mantzoros Laboratory at Beth Israel Deaconess Medical Center, and presented by Konstantinos Stefanakis, MD. As investigators highlighted, no noninvasive tests have been developed specifically to detect MASH with fibrosis stages F2 to F3 but without cirrhosis. Mantzoros, Stefanakis, and fellow investigators launched the study to evaluate and optimize existing noninvasive tests and to develop machine learning models that detect MASH with fibrosis stages F2 to F3 and rule out cirrhosis across a diverse multinational cohort.
Investigators used a multinational biobank of 905 biopsy-confirmed participants, representing the entire spectrum of MASH as well as healthy controls, as a data source. In this study population, investigators assessed 28 biomarker-, imaging-, and algorithm-based noninvasive tests (NITs), using both standard and re-optimized cutoffs, to detect MASH F2 to F3 and rule out cirrhosis.
Investigators also developed clinical/hormonal-based machine learning models and repeated the analysis in a 443-patient subgroup to create metabolomics-based models. Investigators noted these machine learning models were cross-validated in a 4:1 training:validation scheme.
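For readers unfamiliar with the terminology, a 4:1 training:validation scheme is commonly implemented as 5-fold cross-validation, in which each fifth of the cohort is held out once while the model is trained on the remaining four-fifths. The report does not state whether investigators used this exact procedure, so the Python sketch below is only an illustration under that assumption; the function name and random seed are hypothetical, and only the subgroup size of 443 comes from the study.

```python
import random


def four_to_one_folds(n_samples, n_folds=5, seed=0):
    """Yield (train_idx, val_idx) pairs for a roughly 4:1 train:validation split.

    Assumption: the 4:1 scheme is read as 5-fold cross-validation.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds nearly equal groups.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        val_idx = folds[k]
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train_idx, val_idx


# 443 is the size of the metabolomics subgroup reported in the study.
splits = list(four_to_one_folds(443))
for train_idx, val_idx in splits:
    print(len(train_idx), len(val_idx))
```

Each participant lands in the validation set exactly once, and a "mean cross-validated" metric is then the average of the per-fold values.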
Results suggested existing noninvasive tests attained suboptimal area under the curve (AUC), positive predictive value, and accuracy for MASH F2 to F3, with the exception of FibroScan-AST (FAST) (AUC, 0.67; negative predictive value [NPV], 92%; sensitivity, 81%). In the validation cohort, the novel clinical/hormonal machine learning models achieved a higher AUC (0.86), sensitivity (93%), and NPV (97%) than all previously known noninvasive tests.
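The metrics reported here all derive from the same four counts: true positives, false positives, true negatives, and false negatives, judged against biopsy. The minimal Python sketch below shows the standard definitions; the function name is hypothetical and the counts are invented for illustration, not taken from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return standard diagnostic accuracy metrics from confusion-matrix counts.

    Assumes every denominator is non-zero.
    """
    return {
        "sensitivity": tp / (tp + fn),   # share of diseased patients flagged
        "specificity": tn / (tn + fp),   # share of non-diseased patients cleared
        "ppv": tp / (tp + fp),           # probability of disease given a positive test
        "npv": tn / (tn + fn),           # probability of no disease given a negative test
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Illustrative counts only (not study data).
metrics = diagnostic_metrics(tp=40, fp=20, tn=120, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, PPV and NPV depend on how common the condition is in the tested population, which is why a test can pair a high NPV with a modest PPV.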
Investigators also pointed out the addition of aminotransferases, metabolic syndrome components, BMI, and 3-ureidopropionate to metabolomics-based machine learning models contributed to a mean cross-validated AUC of 0.89 in the validation cohort, which increased to 0.91 following hyperparameter optimization and the addition of alpha-ketoglutarate.
When assessing noninvasive detection of cirrhosis, results indicated AGILE4+ had the highest sensitivity (92%) and NPV (99.7%), but investigators also called attention to results demonstrating machine learning models achieved an AUC up to 0.98, sensitivity of 98%, and NPV up to 99.9% among the validation cohort.