Extending ImageNet to Arabic using Arabic WordNet

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

This paper investigates the extension of ImageNet and its millions of English-labeled images to Arabic using Arabic WordNet. The primary finding is the identification of Arabic synsets for 1,219 of the 21,841 synsets used in ImageNet, which together represent 1.1 million images. By leveraging the parent-child structure of synsets in ImageNet, this dataset is extended to 10,462 synsets (and 7.1 million images) that have an Arabic label, which is either a match or a direct hypernym, and to 17,438 synsets (and 11 million images) when a hypernym of a hypernym is included. The evaluated samples suggest that generating Arabic labels for images in ImageNet using hypernyms does indeed produce meaningful results. The precision values for seven evaluated samples exceeded 90%. Moreover, when all the images in the samples were combined, the precision was 93%. For the entire ImageNet, when all hypernyms for a node are considered, an Arabic synset is found for all but four synsets. This represents the major contribution of this work: a dataset of 14,195,756 images that have Arabic labels. The resulting dataset presents Arabic labels for 99.9% of the images in ImageNet.
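The hypernym fallback described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `find_arabic_label`, the toy synset identifiers, and the two dictionaries are all hypothetical, chosen only to show how a label might be taken from an exact match (0 hops), a direct hypernym (1 hop), a hypernym of a hypernym (2 hops), or any ancestor (no hop limit).

```python
from typing import Dict, List, Optional, Tuple

# Toy hypernym graph (child -> parents). Identifiers are illustrative,
# not real ImageNet or WordNet synset IDs.
PARENTS: Dict[str, List[str]] = {
    "husky": ["sled_dog"],
    "sled_dog": ["working_dog"],
    "working_dog": ["dog"],
    "dog": ["animal"],
}

# Toy stand-in for the synsets that have an Arabic WordNet entry.
ARABIC: Dict[str, str] = {"dog": "كلب", "animal": "حيوان"}


def find_arabic_label(
    synset: str,
    parents: Dict[str, List[str]],
    arabic: Dict[str, str],
    max_hops: Optional[int] = None,
) -> Optional[Tuple[str, int]]:
    """Walk up hypernym links level by level.

    Returns (label, hops) for the nearest synset that has an Arabic
    label, or None if none is found within max_hops (None = no limit).
    """
    frontier = [synset]
    seen = {synset}
    hops = 0
    while frontier and (max_hops is None or hops <= max_hops):
        # Check every synset at the current distance first.
        for node in frontier:
            if node in arabic:
                return arabic[node], hops
        # Then move one level up, skipping synsets already visited.
        next_frontier: List[str] = []
        for node in frontier:
            for parent in parents.get(node, []):
                if parent not in seen:
                    seen.add(parent)
                    next_frontier.append(parent)
        frontier = next_frontier
        hops += 1
    return None


# Exact match only, then direct hypernym, then unlimited ancestors.
exact = find_arabic_label("dog", PARENTS, ARABIC, max_hops=0)
direct = find_arabic_label("working_dog", PARENTS, ARABIC, max_hops=1)
too_far = find_arabic_label("sled_dog", PARENTS, ARABIC, max_hops=1)
any_ancestor = find_arabic_label("husky", PARENTS, ARABIC)
```

In this toy graph, raising `max_hops` from 0 to 1 to 2 mirrors the three coverage levels reported in the abstract, and removing the limit corresponds to considering all hypernyms of a node. The returned hop count makes it possible to tell how general a borrowed label is.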

Original language: English
Pages (from-to): 8835-8852
Number of pages: 18
Journal: Multimedia Tools and Applications
Volume: 81
Issue number: 6
State: Published - Mar 2022

Keywords

  • Arabic computer vision
  • Arabic WordNet
  • Computer vision
  • ImageNet
  • Language and computer vision
  • Linked data
