AI Language Model Distinguishes Fibromyalgia From Other Chronic Pain Conditions

Author(s):

Victoria Johnson

The model gave special emphasis to words noting pain, fatigue and depressed moods.

Vincenzo Venerito

Credit: Loop (Frontiers in Medicine)

Large language model (LLM)-driven sentiment analysis, particularly when paired with prompt engineering, may facilitate fibromyalgia (FM) diagnosis by detecting subtle differences in pain expression.¹

“By dissecting and interpreting subjective information, such as patient-generated data from social media platforms, online forums and electronic health records, sentiment analysis can provide a unique lens through which to examine the complexity of emotions in FM,” investigators Vincenzo Venerito and Florenzo Iannone, both from the rheumatology unit in the Department of Precision and Regenerative Medicine and Ionian Area at the University of Bari “Aldo Moro,” Bari, Italy, wrote.¹ “In this proof-of-concept study, we investigated whether a local LLM-driven sentiment analysis might also catch the nuance of pain expression patterns in FM due to specific lexicon by analyzing linguistic patterns and emotional cues.”

The investigators enrolled 40 patients with fibromyalgia diagnosed according to the 2016 American College of Rheumatology criteria and 40 patients with chronic pain not due to fibromyalgia who were referred to rheumatology clinics. They transcribed the patients' responses to questions on pain and sleep, then used the LLM Mistral-7B-Instruct-v0.2 to machine translate the transcripts into English and analyze them, either with prompt engineering that targeted FM-associated language nuances for pain expression or without prompt engineering (the ablated approach). Using rheumatologist diagnosis as ground truth, they calculated accuracy, precision, recall, specificity and area under the receiver operating characteristic curve (AUROC).
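For readers less familiar with these measures, the following is a minimal Python sketch of how accuracy, precision, recall and specificity follow from a 2x2 confusion table when a clinician's diagnosis is treated as ground truth. The class name, function name and counts are hypothetical and chosen only for illustration; they are not the study's data or code.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # fibromyalgia correctly flagged as fibromyalgia
    fn: int  # fibromyalgia missed by the model
    fp: int  # other chronic pain flagged as fibromyalgia
    tn: int  # other chronic pain correctly not flagged


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute four standard metrics from confusion-table counts."""
    total = c.tp + c.fn + c.fp + c.tn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "precision": c.tp / (c.tp + c.fp),
        "recall": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
    }


# Hypothetical counts for illustration only; NOT the study's data.
example = ConfusionCounts(tp=30, fn=10, fp=5, tn=35)
metrics = classification_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

AUROC is left out of the sketch because it requires a continuous score per patient, not only the final labels.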

Venerito and Iannone found that the prompt-engineered approach had an accuracy of 0.87, precision of 0.92, recall of 0.84, specificity of 0.82 and AUROC of 0.86 for distinguishing fibromyalgia from other chronic pain conditions. In comparison, the ablated approach had an accuracy of 0.76, precision of 0.75, recall of 0.77, specificity of 0.75 and AUROC of 0.76 (McNemar’s test P < .001).¹
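McNemar's test compares two classifiers on the same patients by looking only at the discordant cases, those that one approach classified correctly and the other did not. The sketch below shows the exact (binomial) form of the test in Python. Which variant the investigators used is not stated here, so the choice of the exact form is an assumption, and the function name and counts are hypothetical.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from discordant pair counts.

    b: cases only the first approach classified correctly
    c: cases only the second approach classified correctly
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Lower binomial tail under the null hypothesis p = 0.5
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts for illustration; NOT from the study.
p_value = mcnemar_exact(b=12, c=2)
print(f"p = {p_value:.4f}")
```

Cases that both approaches got right, or both got wrong, do not enter the calculation.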

The prompt-engineered approach misclassified 10 patients (25.00%): 2 with axial spondyloarthritis (5.00%), 2 with subacromial bursitis due to calcifying tendinopathy (5.00%), 2 with spinal stenosis (5.00%), 1 with psoriatic arthritis (2.50%), 1 with rheumatoid arthritis (2.50%), 1 with de Quervain’s tenosynovitis (2.50%), and 1 with idiopathic transient osteoporosis of the hip (2.50%). Subacromial bursitis due to calcifying tendinopathy was more likely to be misclassified as fibromyalgia, with an odds ratio of 29.57 (95% CI, 2.70-323.69); this was not seen with the other conditions.¹
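The width of that confidence interval reflects how few patients fall into each cell. As an illustration, the Python sketch below computes an odds ratio with a Wald (Woolf) 95% confidence interval from a 2x2 table. The method is an assumption, since how the investigators estimated the odds ratio is not described here, and the function name and counts are hypothetical.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Wald (Woolf) confidence interval for a 2x2 table.

    a: condition present, misclassified   b: condition present, correct
    c: condition absent, misclassified    d: condition absent, correct
    All four cells must be non-zero for this formula.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only; NOT the study's table.
or_, lo, hi = odds_ratio_ci(a=2, b=1, c=8, d=29)
print(f"OR = {or_:.2f} (95% CI, {lo:.2f}-{hi:.2f})")
```

With cells this small, the standard error of the log odds ratio is large, which is why intervals in small cohorts can span two orders of magnitude.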

The investigators also conducted an attention weight analysis, which found that the prompt-engineered sentiment analysis model placed notable emphasis on words associated with widespread pain, fatigue, depressed mood and dysesthesia, such as ‘everywhere’, ‘spot’ (used to communicate a ‘leopard-spot’ pain), ‘exhaust’, ‘depressed’, ‘electric’, and ‘burning’.

“In conclusion, this study provides early evidence that LLM-driven sentiment analysis could be a useful tool to complement clinical assessment in diagnosing complex conditions like FM. Further validation in larger prospective cohorts is warranted. Additionally, optimizing model interpretability and integrating findings with patient-reported outcomes data could help translate these analytics into clinical impact for patients,” Venerito and Iannone concluded.¹

Other recent research on technology’s impact in fibromyalgia found that digital acceptance and commitment therapy (ACT), a form of cognitive behavioral therapy (CBT), was safe and helped adult patients manage fibromyalgia in the phase 3 PROSPER-FM trial (NCT05243511). Compared with the active control, digital symptom tracking, digital ACT improved patient global impression of change scores, Revised Fibromyalgia Impact Questionnaire scores, Patient-Reported Outcomes Measurement Information System (PROMIS) scores, weekly pain intensity, and weekly pain interference at week 12.²

REFERENCES

1. Venerito V, Iannone F. Large language model-driven sentiment analysis for facilitating fibromyalgia diagnosis. RMD Open. 2024;10:e004367. doi: 10.1136/rmdopen-2024-004367
2. Gendreau RM, McCracken LM, Williams DA, et al. Self-guided digital behavioural therapy versus active control for fibromyalgia (PROSPER-FM): a phase 3, multicentre, randomised controlled trial. Lancet. 2024;404(10450):364-374. doi: 10.1016/S0140-6736(24)00909-7
