TY - JOUR
T1 - Urdu ligature recognition system
T2 - An evolutionary approach
AU - Khan, Naila Habib
AU - Adnan, Awais
AU - Waheed, Abdul
AU - Zareei, Mahdi
AU - Aldosary, Abdallah
AU - Mohamed, Ehab Mahmoud
N1 - Publisher Copyright: © 2021 Tech Science Press. All rights reserved.
PY - 2020
Y1 - 2020
N2 - Recognizing cursive text in Arabic script-based languages such as Urdu is extremely complicated because of the script's diverse and complex characteristics. Evolutionary approaches like genetic algorithms have been used in the past for various optimization and pattern recognition tasks, reporting exceptional results. The proposed Urdu ligature recognition system uses a genetic algorithm for optimization and recognition. The system comprises the following stages: pre-processing, segmentation, feature extraction, hierarchical clustering, classification rules, and genetic algorithm optimization and recognition. The pre-processing stage removes noise from the sentence images, whereas the segmentation stage splits the sentences into ligature components. Fifteen features are extracted from each segmented ligature image. Intra-feature hierarchical clustering is then applied, producing clustered data. Next, classification rules are used to represent the clustered data. The genetic algorithm applies multi-level sorting to the clustered data to improve the classification rules used for recognizing Urdu ligatures. Experiments conducted on the benchmark UPTI dataset for the proposed Urdu ligature recognition system yield promising results, achieving a recognition rate of 96.72%.
AB - Recognizing cursive text in Arabic script-based languages such as Urdu is extremely complicated because of the script's diverse and complex characteristics. Evolutionary approaches like genetic algorithms have been used in the past for various optimization and pattern recognition tasks, reporting exceptional results. The proposed Urdu ligature recognition system uses a genetic algorithm for optimization and recognition. The system comprises the following stages: pre-processing, segmentation, feature extraction, hierarchical clustering, classification rules, and genetic algorithm optimization and recognition. The pre-processing stage removes noise from the sentence images, whereas the segmentation stage splits the sentences into ligature components. Fifteen features are extracted from each segmented ligature image. Intra-feature hierarchical clustering is then applied, producing clustered data. Next, classification rules are used to represent the clustered data. The genetic algorithm applies multi-level sorting to the clustered data to improve the classification rules used for recognizing Urdu ligatures. Experiments conducted on the benchmark UPTI dataset for the proposed Urdu ligature recognition system yield promising results, achieving a recognition rate of 96.72%.
KW - Classification rules
KW - Genetic algorithm
KW - Intra-feature hierarchical clustering
KW - Ligature recognition
KW - Urdu script
UR - http://www.scopus.com/inward/record.url?scp=85097175917&partnerID=8YFLogxK
U2 - 10.32604/cmc.2020.013715
DO - 10.32604/cmc.2020.013715
M3 - Article
AN - SCOPUS:85097175917
SN - 1546-2218
VL - 66
SP - 1347
EP - 1367
JO - Computers, Materials and Continua
JF - Computers, Materials and Continua
IS - 2
ER -