TY - JOUR
T1 - GroupFormer for hyperspectral image classification through group attention
AU - Khan, Rahim
AU - Arshad, Tahir
AU - Ma, Xuefei
AU - Zhu, Haifeng
AU - Wang, Chen
AU - Khan, Javed
AU - Khan, Zahid Ullah
AU - Khan, Sajid Ullah
N1 - Publisher Copyright:
© The Author(s) 2024.
PY - 2024/12
Y1 - 2024/12
N2 - Hyperspectral image (HSI) data contains rich spectral information valuable for numerous tasks, but HSI classification faces challenges such as small and scarce training samples and redundant information. Researchers have introduced various approaches to address these challenges. Convolutional neural networks (CNNs) have achieved significant success in HSI classification; however, CNNs primarily extract low-level features and have a limited ability to capture long-range dependencies due to their confined filter size. In contrast, vision transformers perform well in HSI classification because their attention mechanisms learn long-range dependencies, but they require sufficient labeled training data. To address this challenge, we propose a spectral-spatial feature extractor group attention transformer that uses a multiscale feature extractor to extract low-level (shallow) features and a group attention mechanism to extract high-level semantic features. The proposed model is evaluated on four publicly available HSI datasets: Indian Pines, Pavia University, Salinas, and KSC. Using only 5%, 1%, 1%, and 10% of the samples from these datasets for training, respectively, the proposed approach achieved the best classification results in terms of overall accuracy (OA), average accuracy (AA), and the Kappa coefficient.
AB - Hyperspectral image (HSI) data contains rich spectral information valuable for numerous tasks, but HSI classification faces challenges such as small and scarce training samples and redundant information. Researchers have introduced various approaches to address these challenges. Convolutional neural networks (CNNs) have achieved significant success in HSI classification; however, CNNs primarily extract low-level features and have a limited ability to capture long-range dependencies due to their confined filter size. In contrast, vision transformers perform well in HSI classification because their attention mechanisms learn long-range dependencies, but they require sufficient labeled training data. To address this challenge, we propose a spectral-spatial feature extractor group attention transformer that uses a multiscale feature extractor to extract low-level (shallow) features and a group attention mechanism to extract high-level semantic features. The proposed model is evaluated on four publicly available HSI datasets: Indian Pines, Pavia University, Salinas, and KSC. Using only 5%, 1%, 1%, and 10% of the samples from these datasets for training, respectively, the proposed approach achieved the best classification results in terms of overall accuracy (OA), average accuracy (AA), and the Kappa coefficient.
KW - Attention module
KW - Convolutional neural network
KW - Hyperspectral image classification
KW - Vision transformer
UR - http://www.scopus.com/inward/record.url?scp=85206122842&partnerID=8YFLogxK
U2 - 10.1038/s41598-024-74835-1
DO - 10.1038/s41598-024-74835-1
M3 - Article
C2 - 39396096
AN - SCOPUS:85206122842
SN - 2045-2322
VL - 14
JO - Scientific Reports
JF - Scientific Reports
IS - 1
M1 - 23879
ER -