TY - JOUR
T1 - Bridging Reality and Synthetics
T2 - Optimizing Image Classification with Hybrid AI-Generated and Real-World Datasets
AU - Alabed, Abdallah Tariq Hasan
AU - Rasheed, Jawad
AU - Yesiltepe, Mirsat
AU - Alsubai, Shtwai
AU - Asuroglu, Tunc
N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd. 2025.
PY - 2025/8
Y1 - 2025/8
N2 - The rapid growth of generative Artificial Intelligence software has extended to the creation and dissemination of synthetic images, thereby establishing a new paradigm for machine learning models. This study investigates the impact of combining real-world and AI-generated synthetic images on the performance of image classification models. Using three traffic-related datasets (potholes, speed bumps, and traffic lights), we applied data augmentation and tested seven configurations with varying real-to-synthetic image ratios. The DenseNet201 model, fine-tuned with the Adam optimizer, was used for all experiments. Results show that a 1:3 real-to-synthetic ratio enhances classification accuracy and generalization, with the highest validation accuracy reaching 97.36%. Our findings demonstrate that synthetic data, when properly integrated, serves as a cost-effective and scalable complement to real data, especially in scenarios with limited labeled samples.
AB - The rapid growth of generative Artificial Intelligence software has extended to the creation and dissemination of synthetic images, thereby establishing a new paradigm for machine learning models. This study investigates the impact of combining real-world and AI-generated synthetic images on the performance of image classification models. Using three traffic-related datasets (potholes, speed bumps, and traffic lights), we applied data augmentation and tested seven configurations with varying real-to-synthetic image ratios. The DenseNet201 model, fine-tuned with the Adam optimizer, was used for all experiments. Results show that a 1:3 real-to-synthetic ratio enhances classification accuracy and generalization, with the highest validation accuracy reaching 97.36%. Our findings demonstrate that synthetic data, when properly integrated, serves as a cost-effective and scalable complement to real data, especially in scenarios with limited labeled samples.
KW - Adam
KW - DenseNet201
KW - Image classification
KW - Machine learning
KW - Synthetic images
UR - http://www.scopus.com/inward/record.url?scp=105010525077&partnerID=8YFLogxK
U2 - 10.1007/s42979-025-04181-0
DO - 10.1007/s42979-025-04181-0
M3 - Article
AN - SCOPUS:105010525077
SN - 2662-995X
VL - 6
JO - SN Computer Science
JF - SN Computer Science
IS - 6
M1 - 632
ER -