Gene selection via improved nuclear reaction optimization algorithm for cancer classification in high-dimensional data

Amr A. Abd El-Mageed, Ahmed E. Elkhouli, Amr A. Abohany, Mona Gafar

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

RNA Sequencing (RNA-Seq) is considered a revolutionary technique for gene profiling and quantification. It offers a comprehensive view of the transcriptome, making it more expansive than micro-array. Genes that discriminate malignant from normal samples can be identified from quantitative gene expression. However, the data form a high-dimensional dense matrix in which each sample has more than 20,000 genes, and handling such data is challenging. This paper proposes RBNRO-DE (Relief Binary NRO based on Differential Evolution), a gene selection method applied to RNA-Seq data (rnaseqv2 illuminahiseq rnaseqv2 un edu Level 3 RSEM genes normalized) with more than 20,000 genes. The method picks the most informative genes and is evaluated on 22 cancer datasets. The k-Nearest Neighbor (k-NN) and Support Vector Machine (SVM) classifiers are used to assess the quality of the selected genes. Binary versions of the most common meta-heuristic algorithms are compared with the proposed RBNRO-DE algorithm. On most of the 22 cancer datasets, RBNRO-DE with the k-NN and SVM classifiers achieved the best convergence, with classification accuracy of up to 100% and a reduction in the number of features of up to 98%. The advantage over its counterparts is significant according to Wilcoxon’s rank-sum test (5% significance level).
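The abstract describes a wrapper approach in which a binary mask over genes is scored by a classifier. The sketch below illustrates that general idea only and is not the RBNRO-DE algorithm. The fitness form (a weighted sum of error rate and fraction of genes kept), the weight `alpha`, the leave-one-out 1-NN evaluator, and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
import random


def knn_loo_accuracy(X, y, mask):
    """Leave-one-out 1-NN accuracy using only the columns where mask is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty gene subset cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        best_dist, best_label = float("inf"), None
        for k, xk in enumerate(X):
            if k == i:
                continue  # leave the query sample out
            d = sum((xi[j] - xk[j]) ** 2 for j in cols)
            if d < best_dist:
                best_dist, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X)


def fitness(X, y, mask, alpha=0.99):
    """Lower is better: weighted error rate plus weighted fraction of genes kept.

    alpha is an assumed weighting, not a value reported in the paper.
    """
    error = 1.0 - knn_loo_accuracy(X, y, mask)
    kept = sum(mask) / len(mask)
    return alpha * error + (1.0 - alpha) * kept


# Toy data (hypothetical): 20 samples, 6 "genes"; only gene 0 separates the classes.
random.seed(0)
y = [i % 2 for i in range(20)]
X = [[label * 5.0 + random.random()] + [random.random() for _ in range(5)]
     for label in y]

informative = [1, 0, 0, 0, 0, 0]  # keep only the discriminating gene
everything = [1, 1, 1, 1, 1, 1]   # keep all genes

print(fitness(X, y, informative), fitness(X, y, everything))
```

In this toy setting both masks classify equally well, so the smaller subset gets the lower (better) fitness. A binary meta-heuristic would search over such masks, with a filter such as Relief typically used beforehand to shrink the candidate gene pool.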

Original language: English
Article number: 46
Journal: Journal of Big Data
Volume: 11
Issue number: 1
State: Published - Dec 2024

Keywords

  • Cancer bio-mark
  • Differential evolution (DE)
  • Gene selection
  • High-dimensionality
  • Meta-heuristic
  • Micro-array
  • Nuclear reaction optimization (NRO)
  • Relief algorithm
  • RNA sequencing (RNA-Seq)
