TY - GEN
T1 - Extending ImageNet to Arabic using Arabic WordNet
AU - Alsudais, Abdulkareem
N1 - Publisher Copyright: © 2020 Association for Computational Linguistics.
PY - 2020
Y1 - 2020
N2 - ImageNet has millions of images that are labeled with English WordNet synsets. This paper investigates the extension of ImageNet to Arabic using Arabic WordNet. The objective is to discover if Arabic synsets can be found for synsets used in ImageNet. The primary finding is the identification of Arabic synsets for 1,219 of the 21,841 synsets used in ImageNet, which represents 1.1 million images. By leveraging the parent-child structure of synsets in ImageNet, this dataset is extended to 10,462 synsets (and 7.1 million images) that have an Arabic label, which is either a match or a direct hypernym, and to 17,438 synsets (and 11 million images) when a hypernym of a hypernym is included. When all hypernyms for a node are considered, an Arabic synset is found for all but four synsets. This represents the major contribution of this work: a dataset of images that have Arabic labels for 99.9% of the images in ImageNet.
AB - ImageNet has millions of images that are labeled with English WordNet synsets. This paper investigates the extension of ImageNet to Arabic using Arabic WordNet. The objective is to discover if Arabic synsets can be found for synsets used in ImageNet. The primary finding is the identification of Arabic synsets for 1,219 of the 21,841 synsets used in ImageNet, which represents 1.1 million images. By leveraging the parent-child structure of synsets in ImageNet, this dataset is extended to 10,462 synsets (and 7.1 million images) that have an Arabic label, which is either a match or a direct hypernym, and to 17,438 synsets (and 11 million images) when a hypernym of a hypernym is included. When all hypernyms for a node are considered, an Arabic synset is found for all but four synsets. This represents the major contribution of this work: a dataset of images that have Arabic labels for 99.9% of the images in ImageNet.
UR - http://www.scopus.com/inward/record.url?scp=85118097845&partnerID=8YFLogxK
U2 - 10.18653/v1/2020.alvr-1.1
DO - 10.18653/v1/2020.alvr-1.1
M3 - Conference contribution
AN - SCOPUS:85118097845
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 1
EP - 6
BT - ACL 2020 - Advances in Language and Vision Research, Proceedings of the 1st Workshop
PB - Association for Computational Linguistics (ACL)
T2 - 1st Workshop on Advances in Language and Vision Research, ALVR 2020 at the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020
Y2 - 9 July 2020
ER -