Urdu ligature recognition system: An evolutionary approach

Naila Habib Khan; Awais Adnan; Abdul Waheed; Mahdi Zareei; Abdallah Aldosary; Ehab Mahmoud Mohamed

doi:10.32604/cmc.2020.013715

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

Cursive text recognition of Arabic script-based languages like Urdu is extremely complicated due to the script's diverse and complex characteristics. Evolutionary approaches like genetic algorithms have been used in the past for various optimization and pattern recognition tasks, reporting exceptional results. The proposed Urdu ligature recognition system uses a genetic algorithm for optimization and recognition. Overall, the proposed recognition system consists of pre-processing, segmentation, feature extraction, hierarchical clustering, classification rules, and genetic algorithm optimization and recognition. The pre-processing stage removes noise from the sentence images, whereas in segmentation the sentences are segmented into ligature components. Fifteen features are extracted from each of the segmented ligature images. Intra-feature hierarchical clustering is then performed, producing clustered data. Next, classification rules are used to represent the clustered data. The genetic algorithm performs optimization using multi-level sorting of the clustered data to improve the classification rules used for recognition of Urdu ligatures. Experiments conducted on the benchmark UPTI dataset for the proposed Urdu ligature recognition system yield promising results, achieving a recognition rate of 96.72%.

Original language	English
Pages (from-to)	1347-1367
Number of pages	21
Journal	Computers, Materials and Continua
Volume	66
Issue number	2
DOIs	https://doi.org/10.32604/cmc.2020.013715
State	Published - 2020

Keywords

Classification rules
Genetic algorithm
Intra-feature hierarchical clustering
Ligature recognition
Urdu script

Cite this

@article{0eb932006f6440858cc5208b0fc28a15,

title = "Urdu ligature recognition system: An evolutionary approach",

abstract = "Cursive text recognition of Arabic script-based languages like Urdu is extremely complicated due to its diverse and complex characteristics. Evolutionary approaches like genetic algorithms have been used in the past for various optimization as well as pattern recognition tasks, reporting exceptional results. The proposed Urdu ligature recognition system uses a genetic algorithm for optimization and recognition. Overall the proposed recognition system observes the processes of pre-processing, segmentation, feature extraction, hierarchical clustering, classification rules and genetic algorithm optimization and recognition. The pre-processing stage removes noise from the sentence images, whereas, in segmentation, the sentences are segmented into ligature components. Fifteen features are extracted from each of the segmented ligature images. Intra-feature hierarchical clustering is observed that results in clustered data. Next, classification rules are used for the representation of the clustered data. The genetic algorithm performs an optimization mechanism using multi-level sorting of the clustered data for improving the classification rules used for recognition of Urdu ligatures. Experiments conducted on the benchmark UPTI dataset for the proposed Urdu ligature recognition system yields promising results, achieving a recognition rate of 96.72\%.",

keywords = "Classification rules, Genetic algorithm, Intra-feature hierarchical clustering, Ligature recognition, Urdu script",

author = "Khan, {Naila Habib} and Awais Adnan and Abdul Waheed and Mahdi Zareei and Abdallah Aldosary and Mohamed, {Ehab Mahmoud}",

year = "2020",

doi = "10.32604/cmc.2020.013715",

language = "English",

volume = "66",

pages = "1347--1367",

journal = "Computers, Materials and Continua",

issn = "1546-2218",

publisher = "Tech Science Press",

number = "2",

}

TY - JOUR

T1 - Urdu ligature recognition system

T2 - An evolutionary approach

AU - Khan, Naila Habib

AU - Adnan, Awais

AU - Waheed, Abdul

AU - Zareei, Mahdi

AU - Aldosary, Abdallah

AU - Mohamed, Ehab Mahmoud

PY - 2020

Y1 - 2020

N2 - Cursive text recognition of Arabic script-based languages like Urdu is extremely complicated due to its diverse and complex characteristics. Evolutionary approaches like genetic algorithms have been used in the past for various optimization as well as pattern recognition tasks, reporting exceptional results. The proposed Urdu ligature recognition system uses a genetic algorithm for optimization and recognition. Overall the proposed recognition system observes the processes of pre-processing, segmentation, feature extraction, hierarchical clustering, classification rules and genetic algorithm optimization and recognition. The pre-processing stage removes noise from the sentence images, whereas, in segmentation, the sentences are segmented into ligature components. Fifteen features are extracted from each of the segmented ligature images. Intra-feature hierarchical clustering is observed that results in clustered data. Next, classification rules are used for the representation of the clustered data. The genetic algorithm performs an optimization mechanism using multi-level sorting of the clustered data for improving the classification rules used for recognition of Urdu ligatures. Experiments conducted on the benchmark UPTI dataset for the proposed Urdu ligature recognition system yields promising results, achieving a recognition rate of 96.72%.

AB - Cursive text recognition of Arabic script-based languages like Urdu is extremely complicated due to its diverse and complex characteristics. Evolutionary approaches like genetic algorithms have been used in the past for various optimization as well as pattern recognition tasks, reporting exceptional results. The proposed Urdu ligature recognition system uses a genetic algorithm for optimization and recognition. Overall the proposed recognition system observes the processes of pre-processing, segmentation, feature extraction, hierarchical clustering, classification rules and genetic algorithm optimization and recognition. The pre-processing stage removes noise from the sentence images, whereas, in segmentation, the sentences are segmented into ligature components. Fifteen features are extracted from each of the segmented ligature images. Intra-feature hierarchical clustering is observed that results in clustered data. Next, classification rules are used for the representation of the clustered data. The genetic algorithm performs an optimization mechanism using multi-level sorting of the clustered data for improving the classification rules used for recognition of Urdu ligatures. Experiments conducted on the benchmark UPTI dataset for the proposed Urdu ligature recognition system yields promising results, achieving a recognition rate of 96.72%.

KW - Classification rules

KW - Genetic algorithm

KW - Intra-feature hierarchical clustering

KW - Ligature recognition

KW - Urdu script

UR - https://www.scopus.com/pages/publications/85097175917

U2 - 10.32604/cmc.2020.013715

DO - 10.32604/cmc.2020.013715

M3 - Article

AN - SCOPUS:85097175917

SN - 1546-2218

VL - 66

SP - 1347

EP - 1367

JO - Computers, Materials and Continua

JF - Computers, Materials and Continua

IS - 2

ER -
