TY - JOUR
T1 - Remote intelligent perception system for multi-object detection
AU - Alazeb, Abdulwahab
AU - Chughtai, Bisma Riaz
AU - Al Mudawi, Naif
AU - AlQahtani, Yahya
AU - Alonazi, Mohammed
AU - Aljuaid, Hanan
AU - Jalal, Ahmad
AU - Liu, Hui
N1 - Publisher Copyright:
Copyright © 2024 Alazeb, Chughtai, Al Mudawi, AlQahtani, Alonazi, Aljuaid, Jalal and Liu.
PY - 2024
Y1 - 2024
AB - Introduction: In recent years, heightened interest has been shown in classifying scene images depicting diverse robotic environments. This surge can be attributed to significant improvements in visual sensor technology, which have enhanced image analysis capabilities. Methods: Advances in vision technology have a major impact on multiple object detection and scene understanding. These tasks are integral to a variety of technologies, including scene integration in augmented reality, robot navigation, autonomous driving, and tourist information applications. Despite significant strides in visual interpretation, numerous challenges persist, including semantic understanding, occlusion, orientation, insufficient labeled data, uneven illumination (shadows and lighting variation), changes in viewing direction and object size, and changing backgrounds. To overcome these challenges, we propose a scene recognition framework that proves highly effective. First, we preprocess the scene data using kernel convolution. Second, we perform semantic segmentation using a U-Net. We then extract features from the segmented data using the discrete wavelet transform (DWT), Sobel and Laplacian edge operators, and textural analysis (local binary patterns). To recognize objects, we use a deep belief network and then derive object-to-object relations. Finally, AlexNet assigns the relevant label to the scene based on the objects recognized in the image. Results: The performance of the proposed system was validated on three standard datasets: PASCAL VOC-12, Cityscapes, and Caltech 101. The accuracy attained on the PASCAL VOC-12 dataset exceeds 96%, and the model achieves 95.90% on the Cityscapes dataset. Discussion: The model further demonstrates an accuracy of 92.2% on the Caltech 101 dataset, a noteworthy advance over current models.
KW - AlexNet
KW - deep belief network
KW - deep learning
KW - image processing
KW - intelligent perception
KW - remote sensing
KW - robotic environment
UR - http://www.scopus.com/inward/record.url?scp=85195150132&partnerID=8YFLogxK
U2 - 10.3389/fnbot.2024.1398703
DO - 10.3389/fnbot.2024.1398703
M3 - Article
AN - SCOPUS:85195150132
SN - 1662-5218
VL - 18
JO - Frontiers in Neurorobotics
JF - Frontiers in Neurorobotics
M1 - 1398703
ER -