Abstract
Background: Oral lichen planus (OLP), oral lichenoid lesions (OLL), and squamous cell carcinoma on a lichenoid background (SCC-over-LP/LLP) overlap clinically, which delays recognition of malignant transformation.

Objective: To evaluate a multimodal large language model (ChatGPT-5) against oral medicine (OM) specialists for tripartite classification (OLP/OLL/SCC-over-LP/LLP) and malignant-risk flagging.

Methods: Cross-sectional, paired diagnostic accuracy study adhering to STARD/STARD-AI. Retrospective, anonymized cases (n = 262; OLP = 100, OLL = 100, SCC-over-LP/LLP = 62) were independently evaluated by ChatGPT-5 and by a comparator panel of board-certified OM specialists, using identical clinical histories and intraoral photographs (no histopathology was provided to either). A separate reference standard panel (three OM experts) established the diagnosis from full clinical data and histopathology before index testing. Primary outcome: paired accuracy (McNemar test). Secondary outcomes: certainty (1-5 scale), management agreement (Gwet’s AC1), and recognition of malignant red-flag features.

Results: Overall accuracy was comparable (84.7% for ChatGPT-5 vs 85.5% for OM specialists; McNemar P = .856, Cohen’s h = 0.03). Sensitivity was high for OLP (0.99) and SCC-over-LP/LLP (0.85); OLL sensitivity was 0.70, with specificity 1.00. Biopsy/referral agreement was near-perfect (AC1 = 0.91). Malignant-risk features were correctly identified in 88% of SCC-over-LP/LLP cases by ChatGPT-5 vs 92% by OM specialists (P = .41).

Conclusions: A multimodal large language model can reach expert-level accuracy for OLP/OLL/SCC-over-LP/LLP classification and reliably flag malignant transformation risk, supporting its role as an adjunctive decision-support tool in OM.
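The primary outcome above is a paired comparison, so only the cases where exactly one of the two raters was correct (the discordant pairs) carry information about the difference. The following is a minimal sketch of a two-sided exact McNemar test, assuming the exact binomial form; the abstract does not say which variant of the test was used. The function name `mcnemar_exact` and the counts in the usage example are hypothetical illustrations and are not taken from the study.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: cases where rater A was correct and rater B was wrong
    c: cases where rater A was wrong and rater B was correct
    Under the null hypothesis, each discordant pair is equally likely
    to favour either rater, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        # No discordant pairs: the data give no evidence of a difference.
        return 1.0
    k = min(b, c)
    # Probability of observing k or fewer discordant pairs in the smaller cell.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the one-sided tail and cap at 1.
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical discordant counts, for illustration only.
    model_only_correct = 14
    specialists_only_correct = 16
    p = mcnemar_exact(model_only_correct, specialists_only_correct)
    print(f"exact McNemar p-value: {p:.3f}")
```

Concordant cases (both raters correct or both wrong) do not enter the calculation, which is why two raters with nearly identical overall accuracy can still yield a large p-value when the discordant cells are balanced.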
| Field | Value |
|---|---|
| Original language | English |
| Article number | 109357 |
| Journal | International Dental Journal |
| Volume | 76 |
| Issue number | 1 |
| State | Published - Feb 2026 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3: Good Health and Well-being
Keywords
- Artificial intelligence
- Diagnostic accuracy
- Large language model
- Oral lichen planus
- Oral lichenoid lesion
- Squamous cell carcinoma
Title: A Multimodal Large Language Model Framework for Clinical Subtyping and Malignant Transformation Risk Prediction in Oral Lichen Planus: A Paired Comparison With Expert Clinicians