TY - JOUR
T1 - DA-ViT: Deformable Attention Vision Transformer for Alzheimer’s Disease Classification from MRI Scans
AU - Almansour, Abdullah G.M.
AU - Alshomrani, Faisal
AU - Almutairi, Abdulaziz T.M.
AU - Alalwany, Easa
AU - Alshuhri, Mohammed S.
AU - Alshaari, Hussein
AU - Alfahaid, Abdullah
N1 - Publisher Copyright:
Copyright © 2025 The Authors.
PY - 2025
Y1 - 2025
AB - The early and precise identification of Alzheimer’s disease (AD) remains a considerable clinical challenge due to subtle structural alterations and overlapping symptoms across disease stages. This study presents a novel Deformable Attention Vision Transformer (DA-ViT) architecture that integrates deformable Multi-Head Self-Attention (MHSA) with a Multi-Layer Perceptron (MLP) block for efficient classification of AD from magnetic resonance imaging (MRI) scans. In contrast to traditional vision transformers, the deformable MHSA module concentrates selectively on spatially relevant patches through learned offset predictions, markedly reducing computational demands while improving localized feature representation. DA-ViT contains only 0.93 million parameters, making it well suited for deployment in resource-limited settings. We evaluate the model on a class-imbalanced Alzheimer’s MRI dataset comprising 6400 images across four categories, achieving a test accuracy of 80.31%, a macro F1-score of 0.80, and an area under the receiver operating characteristic curve (AUC) of 1.00 for the Mild Demented category. Thorough ablation studies identify the optimal transformer depth, number of attention heads, and embedding dimension. Moreover, comparative experiments indicate that DA-ViT surpasses state-of-the-art pre-trained Convolutional Neural Network (CNN) models in both accuracy and parameter efficiency.
KW - Alzheimer’s disease classification
KW - MRI analysis
KW - Bayesian optimization
KW - deformable attention
KW - vision transformer
UR - https://www.scopus.com/pages/publications/105017127498
DO - 10.32604/cmes.2025.069661
M3 - Article
AN - SCOPUS:105017127498
SN - 1526-1492
VL - 144
SP - 2395
EP - 2418
JF - CMES - Computer Modeling in Engineering and Sciences
IS - 2
ER -