TY - JOUR
T1 - A Novel Noise Removal and Interpretable Deep Learning Model for Diabetic Retinopathy Detection
AU - Alanazi, Sultan
AU - Khan, Sajid Ullah
AU - Alotaibi, Faisal M.
AU - Alonazi, Mohammed
N1 - Publisher Copyright: © 2025 Wiley Periodicals LLC.
PY - 2025/11
Y1 - 2025/11
AB - Diabetic retinopathy (DR) is a primary reason for visual impairment and blindness in individuals with diabetes worldwide. Timely detection of DR is essential to prevent vision loss in diabetics. However, noise and limited model transparency often compromise the accuracy of diagnosing retinal fundus images. Noise and interpretability are the two main challenges occurring in imaging datasets, overshadowing concerns such as class imbalance or device variability. These distortions are present in all datasets and devices, reducing the clarity of diagnostic signals at the pixel level and often obscuring early lesions within background noise. Addressing these challenges, this research introduces an innovative model called Explainable MINet-ViT, which combines advanced noise reduction techniques with explainable deep learning for more reliable identification of DR. The model incorporates a multi-level denoising network (MINet), modified by a noise-specific pre-processing module using a Variance-Stabilizing Transform (VST) and deep residual feature mapping. A hybrid deep learning architecture that combines Convolutional Neural Networks (CNNs) with Vision Transformers (ViTs) is employed to extract both local and global spatial information. We apply explainability strategies, such as Grad-CAM and SHAP, to ensure clinical interpretability by identifying the crucial retinal regions that influence model predictions. Quantitative and qualitative results show improved performance, robustness, and clinical applicability, achieving an accuracy of 97.6%, a sensitivity of 0.96, a specificity of 0.97, a Kappa of 0.92, and an AUC of 96.7%. Analyses of standard datasets reveal that our proposed model outperforms prior models in accuracy, noise robustness, and interpretability, rendering it exceptionally suitable for real-world clinical applications.
KW - denoising
KW - diabetic retinopathy
KW - spatial information
KW - variance stabilizing transform
KW - vision transformers
UR - https://www.scopus.com/pages/publications/105020579161
U2 - 10.1002/ima.70245
DO - 10.1002/ima.70245
M3 - Article
AN - SCOPUS:105020579161
SN - 0899-9457
VL - 35
JO - International Journal of Imaging Systems and Technology
JF - International Journal of Imaging Systems and Technology
IS - 6
M1 - e70245
ER -