TY - JOUR
T1 - Diagnostic accuracy of AI-based models for autism spectrum disorder
T2 - A systematic review and meta-analysis with a focus on Arab populations
AU - Aldakhil, Ali Fahad
AU - Alasim, Khalid N.
N1 - Publisher Copyright: © 2025 Elsevier Ltd.
PY - 2025/12
Y1 - 2025/12
AB - Background: Autism Spectrum Disorder (ASD) is a prevalent neurodevelopmental condition globally, including in Arab countries, where stigma, limited awareness, and scarce specialized services often delay diagnosis and care. Artificial intelligence (AI) offers scalable solutions for screening, early diagnosis, and intervention programmes. Aims: To evaluate the diagnostic accuracy of AI-based models for ASD with a specific focus on Arab cohorts, and to appraise methodological quality and potential cultural influences on model performance. Methods: We searched PubMed, Scopus, and Web of Science for studies published between January 2019 and September 2025. Eligible studies evaluated supervised AI systems, whether machine learning (ML) or deep learning (DL), that classify individuals as ASD versus non-ASD against a clinician-confirmed reference standard. Study quality was assessed using QUADAS-2. Diagnostic accuracy metrics (sensitivity, specificity, likelihood ratios, diagnostic odds ratio) were pooled using a bivariate random-effects model. Results: Fifteen studies were included in the systematic review; ten were eligible for meta-analysis (59 model evaluations; 26,569 instances), comparing AI models against clinician-confirmed autism diagnoses. Pooled sensitivity was 91.8 % (95 % CI [89.0, 94.2]) and pooled specificity was 90.7 % (95 % CI [87.6, 93.5]), yielding a diagnostic odds ratio (DOR) of 109.0 (95 % CI [59.5, 227.9]), a positive likelihood ratio (LR⁺) of 9.8, and a negative likelihood ratio (LR⁻) of 0.09. Subgroup analysis showed that hybrid models (deep feature extractors with classical classifiers) achieved the highest accuracy (sensitivity 95.2 %, specificity 96.0 %), followed by conventional ML (sensitivity 91.6 %, specificity 90.3 %) and DL alone (sensitivity 87.3 %, specificity 86.0 %). In Arab-only cohorts, models showed higher sensitivity (94.2 %) but lower specificity (87.6 %), suggesting stronger rule-out potential but more false positives. Conclusion: To our knowledge, this is the first systematic meta-analysis of AI-based ASD diagnostics; it confirms high accuracy, with hybrid models outperforming both traditional ML and DL alone. In Arab cohorts, models showed higher sensitivity but lower specificity, highlighting the importance of cultural and linguistic tailoring of assessment tools, diagnostic protocols, and datasets, alongside regional challenges such as stigma and limited resources. These findings support AI as a valuable tool for early detection and screening.
KW - Artificial intelligence
KW - Autism spectrum disorder
KW - Deep learning
KW - Diagnostic accuracy
KW - Hybrid models
KW - Machine learning
KW - Sensitivity
KW - Specificity
UR - https://www.scopus.com/pages/publications/105022240680
U2 - 10.1016/j.ridd.2025.105166
DO - 10.1016/j.ridd.2025.105166
M3 - Article
C2 - 41270703
AN - SCOPUS:105022240680
SN - 0891-4222
VL - 167
JO - Research in Developmental Disabilities
JF - Research in Developmental Disabilities
M1 - 105166
ER -