TY - JOUR
T1 - Pyramidal attention with progressive multi-stage iterative feature refinement for salient object segmentation
AU - Khan, Rahim
AU - Alzaben, Nada
AU - Daradkeh, Yousef Ibrahim
AU - Zhu, Xianxun
AU - Ullah, Inam
N1 - Publisher Copyright: © 2025 Elsevier B.V.
PY - 2025/10
Y1 - 2025/10
N2 - Accurate detection of salient objects in complex visual scenes remains a fundamental yet challenging task in visual intelligence, often impeded by significant scale variation, background clutter, and indistinct object boundaries. While recent approaches attempt to exploit multi-level features, they frequently encounter limitations such as semantic misalignment across feature hierarchies, spatial detail degradation, and weak cross-dataset generalization. To overcome these challenges, we propose a novel Pyramidal Attention Mechanism (PAM) with Progressive Multi-stage Iterative Feature Refinement Network (PIFRNet) designed for robust and precise Salient Object Detection (SOD). Specifically, our method begins by hierarchically aggregating features from four representative stages of a powerful backbone, ensuring rich multi-scale context and semantic diversity. To bridge semantic gaps and recover fine structures, we introduce a Progressive Bilateral Feature Refinement (PBFR) module, which enhances early-stage features through cascaded convolutions and spatial attention. Furthermore, the novel PAM, equipped with dilated convolutions, is introduced to refine high-level semantics and reinforce object completeness. The network integrates these components through a multi-stage iterative refinement process, enabling gradual enhancement of spatial precision and structural fidelity. Extensive experiments conducted on five public SOD benchmarks demonstrate that our approach achieves superior performance compared to state-of-the-art methods, both quantitatively and qualitatively. Cross-dataset evaluations further validate its strong generalization capability, making it highly applicable to real-world visual intelligence scenarios.
KW - Bilateral merging
KW - Hierarchical aggregation
KW - Multi-scale representation
KW - Pyramidal attention
KW - Saliency detection
KW - Visual intelligence
UR - http://www.scopus.com/inward/record.url?scp=105012090330&partnerID=8YFLogxK
U2 - 10.1016/j.imavis.2025.105670
DO - 10.1016/j.imavis.2025.105670
M3 - Article
AN - SCOPUS:105012090330
SN - 0262-8856
VL - 162
JO - Image and Vision Computing
JF - Image and Vision Computing
M1 - 105670
ER -