Adaptive heartbeat regulation using double deep reinforcement learning in a Markov decision process framework

Walid Ayadi; Emad Alkhazraji; Haitham khaled; Yassine Bouteraa; Masoud Abedini; Ardashir Mohammadzadeh

doi:10.1038/s41598-025-19411-x

Computer Engineering

Research output: Contribution to journal › Article › peer-review

Abstract

The erratic nature of cardiac rhythms can precipitate a multitude of pathologies. Consequently, the endeavor to achieve stabilization of the human heartbeat has garnered significant scholarly interest in recent years. In this context, an adaptive nonlinear disturbance compensator (ANDC) strategy has been meticulously developed to ensure the stabilization of cardiac activity. Moreover, a double deep reinforcement learning (DDRL) algorithm has been employed to adaptively calibrate the tunable coefficients of the ANDC controller. To facilitate this, as well as to replicate authentic environmental conditions, a dynamic model of the heart has been constructed utilizing the framework of the Markov Decision Process (MDP). The proposed methodology functions in a closed-loop configuration, wherein the ANDC controller guarantees both stability and disturbance mitigation, while the DDRL agent persistently refines control parameters in accordance with the observed state of the system. Two categories of input signals, namely normal signals and MDP-based stochastic signals, are administered to assess the system’s efficacy under both standard and uncertain conditions. Furthermore, the influence of pathological neural activity is emulated through the introduction of external signals characterized by eight discrete frequency components. Quantitative assessments employing metrics such as peak amplitude, signal energy, and zero-crossing rate are performed for each state of the cardiovascular model. The findings substantiate that the ANDC-DDRL strategy effectively stabilizes cardiac rhythms across diverse conditions, surpassing the performance of conventional baseline methods.
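The abstract names three evaluation metrics (peak amplitude, signal energy, zero-crossing rate) without defining them. The following is a minimal sketch that assumes the common discrete-time definitions; the function names and the sample signal are hypothetical and are not taken from the paper, which may normalise or window the signals differently.

```python
from typing import Sequence


def peak_amplitude(x: Sequence[float]) -> float:
    """Largest absolute sample value (assumed definition)."""
    return max(abs(v) for v in x)


def signal_energy(x: Sequence[float]) -> float:
    """Sum of squared samples, i.e. discrete-time energy (assumed definition)."""
    return sum(v * v for v in x)


def zero_crossing_rate(x: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ (assumed definition)."""
    if len(x) < 2:
        raise ValueError("need at least two samples")
    pairs = list(zip(x, x[1:]))
    # A crossing is counted when one sample is negative and its neighbour is not.
    crossings = sum(1 for a, b in pairs if (a < 0) != (b < 0))
    return crossings / len(pairs)


# Hypothetical samples standing in for one state of the heart model.
samples = [1.0, -2.0, 3.0, -1.0, 0.5]
print(peak_amplitude(samples), signal_energy(samples), zero_crossing_rate(samples))
```

Under these assumptions, a lower zero-crossing rate and smaller peak amplitude after control would be one way to read "stabilization" from a recorded trace.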

Original language	English
Article number	35347
Journal	Scientific Reports
Volume	15
Issue number	1
DOIs	https://doi.org/10.1038/s41598-025-19411-x
State	Published - Dec 2025

Keywords

Adaptive nonlinear disturbance compensator (ANDC)
Cardiovascular system
Double deep reinforcement learning (DDRL)
Heartbeat
Markov decision process (MDP)

Cite this

@article{c8315325ce6149ccae71c53eeffe5751,
title = "Adaptive heartbeat regulation using double deep reinforcement learning in a Markov decision process framework",
abstract = "The erratic nature of cardiac rhythms can precipitate a multitude of pathologies. Consequently, the endeavor to achieve stabilization of the human heartbeat has garnered significant scholarly interest in recent years. In this context, an adaptive nonlinear disturbance compensator (ANDC) strategy has been meticulously developed to ensure the stabilization of cardiac activity. Moreover, a double deep reinforcement learning (DDRL) algorithm has been employed to adaptively calibrate the tunable coefficients of the ANDC controller. To facilitate this, as well as to replicate authentic environmental conditions, a dynamic model of the heart has been constructed utilizing the framework of the Markov Decision Process (MDP). The proposed methodology functions in a closed-loop configuration, wherein the ANDC controller guarantees both stability and disturbance mitigation, while the DDRL agent persistently refines control parameters in accordance with the observed state of the system. Two categories of input signals, namely normal signals and MDP-based stochastic signals, are administered to assess the system{\textquoteright}s efficacy under both standard and uncertain conditions. Furthermore, the influence of pathological neural activity is emulated through the introduction of external signals characterized by eight discrete frequency components. Quantitative assessments employing metrics such as peak amplitude, signal energy, and zero-crossing rate are performed for each state of the cardiovascular model. The findings substantiate that the ANDC-DDRL strategy effectively stabilizes cardiac rhythms across diverse conditions, surpassing the performance of conventional baseline methods.",

keywords = "Adaptive nonlinear disturbance compensator (ANDC), Cardiovascular system, Double deep reinforcement learning (DDRL), Heartbeat, Markov decision process (MDP)",

author = "Walid Ayadi and Emad Alkhazraji and Haitham khaled and Yassine Bouteraa and Masoud Abedini and Ardashir Mohammadzadeh",

note = "Publisher Copyright: {\textcopyright} The Author(s) 2025.",

year = "2025",

month = dec,

doi = "10.1038/s41598-025-19411-x",

language = "English",

volume = "15",

journal = "Scientific Reports",

issn = "2045-2322",

publisher = "Nature Research",

number = "1",

}

TY  - JOUR
T1  - Adaptive heartbeat regulation using double deep reinforcement learning in a Markov decision process framework
AU  - Ayadi, Walid
AU  - Alkhazraji, Emad
AU  - khaled, Haitham
AU  - Bouteraa, Yassine
AU  - Abedini, Masoud
AU  - Mohammadzadeh, Ardashir
N1  - Publisher Copyright: © The Author(s) 2025.
PY  - 2025/12
Y1  - 2025/12
AB  - The erratic nature of cardiac rhythms can precipitate a multitude of pathologies. Consequently, the endeavor to achieve stabilization of the human heartbeat has garnered significant scholarly interest in recent years. In this context, an adaptive nonlinear disturbance compensator (ANDC) strategy has been meticulously developed to ensure the stabilization of cardiac activity. Moreover, a double deep reinforcement learning (DDRL) algorithm has been employed to adaptively calibrate the tunable coefficients of the ANDC controller. To facilitate this, as well as to replicate authentic environmental conditions, a dynamic model of the heart has been constructed utilizing the framework of the Markov Decision Process (MDP). The proposed methodology functions in a closed-loop configuration, wherein the ANDC controller guarantees both stability and disturbance mitigation, while the DDRL agent persistently refines control parameters in accordance with the observed state of the system. Two categories of input signals, namely normal signals and MDP-based stochastic signals, are administered to assess the system’s efficacy under both standard and uncertain conditions. Furthermore, the influence of pathological neural activity is emulated through the introduction of external signals characterized by eight discrete frequency components. Quantitative assessments employing metrics such as peak amplitude, signal energy, and zero-crossing rate are performed for each state of the cardiovascular model. The findings substantiate that the ANDC-DDRL strategy effectively stabilizes cardiac rhythms across diverse conditions, surpassing the performance of conventional baseline methods.
KW  - Adaptive nonlinear disturbance compensator (ANDC)
KW  - Cardiovascular system
KW  - Double deep reinforcement learning (DDRL)
KW  - Heartbeat
KW  - Markov decision process (MDP)
UR  - https://www.scopus.com/pages/publications/105018289660
U2  - 10.1038/s41598-025-19411-x
DO  - 10.1038/s41598-025-19411-x
M3  - Article
C2  - 41068235
AN  - SCOPUS:105018289660
SN  - 2045-2322
VL  - 15
JO  - Scientific Reports
JF  - Scientific Reports
IS  - 1
M1  - 35347
ER  -
