Extending ImageNet to Arabic using Arabic WordNet

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review


Abstract

ImageNet has millions of images that are labeled with English WordNet synsets. This paper investigates the extension of ImageNet to Arabic using Arabic WordNet. The objective is to discover if Arabic synsets can be found for synsets used in ImageNet. The primary finding is the identification of Arabic synsets for 1,219 of the 21,841 synsets used in ImageNet, which represents 1.1 million images. By leveraging the parent-child structure of synsets in ImageNet, this dataset is extended to 10,462 synsets (and 7.1 million images) that have an Arabic label, which is either a match or a direct hypernym, and to 17,438 synsets (and 11 million images) when a hypernym of a hypernym is included. When all hypernyms for a node are considered, an Arabic synset is found for all but four synsets. This represents the major contribution of this work: a dataset of images that have Arabic labels for 99.9% of the images in ImageNet.
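The hypernym fallback described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `nearest_arabic_label`, the toy synset identifiers, and the two dictionaries are all hypothetical, and the sketch assumes the hierarchy is available as a child-to-parents mapping and the Arabic WordNet matches as a synset-to-label mapping. A breadth-first walk up the hierarchy returns the closest labeled synset, with `max_hops` limiting how far to climb (0 for a direct match only, 1 to allow a direct hypernym, 2 to allow a hypernym of a hypernym, `None` for all hypernyms).

```python
from collections import deque
from typing import Dict, List, Optional, Tuple


def nearest_arabic_label(
    synset: str,
    parents: Dict[str, List[str]],
    arabic: Dict[str, str],
    max_hops: Optional[int] = None,
) -> Optional[Tuple[str, int]]:
    """Return (arabic_label, hops) for the closest labeled synset, or None.

    hops is 0 for a direct match, 1 for a direct hypernym, and so on.
    """
    queue = deque([(synset, 0)])
    seen = {synset}
    while queue:
        node, hops = queue.popleft()
        if node in arabic:
            return arabic[node], hops
        # Stop climbing from this node once the hop budget is used up.
        if max_hops is not None and hops >= max_hops:
            continue
        for parent in parents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, hops + 1))
    return None


# Toy hierarchy (hypothetical identifiers, not real WordNet offsets).
parents = {
    "puppy": ["dog"],
    "dog": ["canine"],
    "canine": ["carnivore"],
    "carnivore": ["animal"],
}
# Hypothetical Arabic WordNet matches for two of the synsets.
arabic = {"dog": "كلب", "animal": "حيوان"}

direct = nearest_arabic_label("dog", parents, arabic, max_hops=0)
one_hop = nearest_arabic_label("puppy", parents, arabic, max_hops=1)
too_far = nearest_arabic_label("canine", parents, arabic, max_hops=1)
two_hops = nearest_arabic_label("canine", parents, arabic, max_hops=2)
```

Under these assumptions, raising `max_hops` trades label specificity for coverage, which mirrors the progression from exact matches to direct hypernyms to all hypernyms reported in the abstract.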

Original language: English
Title of host publication: ACL 2020 - Advances in Language and Vision Research, Proceedings of the 1st Workshop
Publisher: Association for Computational Linguistics (ACL)
Pages: 1-6
Number of pages: 6
ISBN (Electronic): 9781952148149
State: Published - 2020
Event: 1st Workshop on Advances in Language and Vision Research, ALVR 2020 at the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020 - Virtual, Online, United States
Duration: 9 Jul 2020 → …

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: 1st Workshop on Advances in Language and Vision Research, ALVR 2020 at the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020
Country/Territory: United States
City: Virtual, Online
Period: 9/07/20 → …
