Reinforcement Learning-Based Integrated Risk Aware Dynamic Treatment Strategy for Consumer-Centric Next-Gen Healthcare

Divya Nimma; Pinapati Lakshmana Rao; Janjhyam Venkata Naga Ramesh; Fadl Dahan; Desidi Narsimha Reddy; Venkatachalam Selvakumar; Yodgorkhon Ilkhamova; Pradeep Jangir

doi:10.1109/TCE.2025.3565900

Information Systems

Research output: Contribution to journal › Article › peer-review

Abstract

Reinforcement learning (RL) has gained prominence in healthcare due to its ability to optimize treatment strategies without relying on predefined mathematical models. However, existing approaches face critical challenges: (1) the optimality of learned strategies is assessed without considering treatment risks, potentially leading to unsafe recommendations; (2) distribution shift issues cause learned strategies to diverge from physician decisions; and (3) past observational data and treatment history are often overlooked, leading to suboptimal state representations. To address these limitations, we propose a Dynamic Treatment Strategy Generation Model that integrates Dead Ends with an Offline Supervised Actor-Critic approach (DOSAC-DTR). Our model incorporates Dead Ends into the Actor-Critic framework to evaluate risks associated with recommended treatments. Additionally, physician oversight is embedded to mitigate distribution shift and align the learned strategy with expert decisions while maximizing expected outcomes. To enhance state representation, we employ an LSTM-based encoder-decoder model to capture essential patient history, ensuring robust decision-making. Experimental results on real-world datasets (MIMIC-III) demonstrate that DOSAC-DTR significantly reduces mortality rates (Sepsis: 3.51%, Ventilation: 13.74%) and improves treatment alignment with physicians (Jaccard similarity: 0.362, 0.126) compared to baseline models. These findings underscore the potential of reinforcement learning in personalized healthcare, improving both treatment efficacy and patient safety.
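
The abstract reports treatment alignment with physicians as a Jaccard similarity but does not say how the score is computed. The sketch below shows only the general definition, |A ∩ B| / |A ∪ B|, under the assumption that each decision step can be described by a set of discrete treatment labels. The function names, the label strings, and the per-step averaging are all hypothetical and are not taken from the paper.

```python
def jaccard_similarity(recommended, physician):
    """Return |A ∩ B| / |A ∪ B| for two collections of treatment labels."""
    a, b = set(recommended), set(physician)
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(a | b)


def mean_jaccard(pairs):
    """Average the per-step Jaccard similarity over (recommended, physician) pairs."""
    scores = [jaccard_similarity(r, p) for r, p in pairs]
    return sum(scores) / len(scores)


# Hypothetical treatment labels, for illustration only (not from MIMIC-III).
steps = [
    ({"fluid_low", "vaso_none"}, {"fluid_low", "vaso_low"}),  # 1 shared label out of 3
    ({"fluid_high"}, {"fluid_high"}),                         # identical sets
]
print(mean_jaccard(steps))
```

A score of 1 means the recommended and physician treatment sets coincide, and 0 means they share nothing, so higher values indicate closer agreement with clinical practice.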

Original language	English
Journal	IEEE Transactions on Consumer Electronics
DOIs	https://doi.org/10.1109/TCE.2025.3565900
State	Accepted/In press - 2025

Keywords

Actor-Critic
Consumer-Centric Data
Dynamic Treatment
Next-Gen Healthcare
Reinforcement learning

Cite this

@article{2fc25ea3103e402094fb48a02e4be692,

title = "Reinforcement Learning-Based Integrated Risk Aware Dynamic Treatment Strategy for Consumer-Centric Next-Gen Healthcare",

abstract = "Reinforcement learning (RL) has gained prominence in healthcare due to its ability to optimize treatment strategies without relying on predefined mathematical models. However, existing approaches face critical challenges: (1) the optimality of learned strategies is assessed without considering treatment risks, potentially leading to unsafe recommendations; (2) distribution shift issues cause learned strategies to diverge from physician decisions; and (3) past observational data and treatment history are often overlooked, leading to suboptimal state representations. To address these limitations, we propose a Dynamic Treatment Strategy Generation Model that integrates Dead Ends with an Offline Supervised Actor-Critic approach (DOSAC-DTR). Our model incorporates Dead Ends into the Actor-Critic framework to evaluate risks associated with recommended treatments. Additionally, physician oversight is embedded to mitigate distribution shift and align the learned strategy with expert decisions while maximizing expected outcomes. To enhance state representation, we employ an LSTM-based encoder-decoder model to capture essential patient history, ensuring robust decision-making. Experimental results on real-world datasets (MIMIC-III) demonstrate that DOSAC-DTR significantly reduces mortality rates (Sepsis: 3.51\%, Ventilation: 13.74\%) and improves treatment alignment with physicians (Jaccard similarity: 0.362, 0.126) compared to baseline models. These findings underscore the potential of reinforcement learning in personalized healthcare, improving both treatment efficacy and patient safety.",

keywords = "Actor-Critic, Consumer-Centric Data, Dynamic Treatment, Next-Gen Healthcare, Reinforcement learning",

author = "Divya Nimma and Rao, {Pinapati Lakshmana} and Ramesh, {Janjhyam Venkata Naga} and Fadl Dahan and Reddy, {Desidi Narsimha} and Venkatachalam Selvakumar and Yodgorkhon Ilkhamova and Pradeep Jangir",

note = "Publisher Copyright: {\textcopyright} 1975-2011 IEEE.",

year = "2025",

doi = "10.1109/TCE.2025.3565900",

language = "English",

journal = "IEEE Transactions on Consumer Electronics",

issn = "0098-3063",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - Reinforcement Learning-Based Integrated Risk Aware Dynamic Treatment Strategy for Consumer-Centric Next-Gen Healthcare

AU - Nimma, Divya

AU - Rao, Pinapati Lakshmana

AU - Ramesh, Janjhyam Venkata Naga

AU - Dahan, Fadl

AU - Reddy, Desidi Narsimha

AU - Selvakumar, Venkatachalam

AU - Ilkhamova, Yodgorkhon

AU - Jangir, Pradeep

PY - 2025

Y1 - 2025

N2 - Reinforcement learning (RL) has gained prominence in healthcare due to its ability to optimize treatment strategies without relying on predefined mathematical models. However, existing approaches face critical challenges: (1) the optimality of learned strategies is assessed without considering treatment risks, potentially leading to unsafe recommendations; (2) distribution shift issues cause learned strategies to diverge from physician decisions; and (3) past observational data and treatment history are often overlooked, leading to suboptimal state representations. To address these limitations, we propose a Dynamic Treatment Strategy Generation Model that integrates Dead Ends with an Offline Supervised Actor-Critic approach (DOSAC-DTR). Our model incorporates Dead Ends into the Actor-Critic framework to evaluate risks associated with recommended treatments. Additionally, physician oversight is embedded to mitigate distribution shift and align the learned strategy with expert decisions while maximizing expected outcomes. To enhance state representation, we employ an LSTM-based encoder-decoder model to capture essential patient history, ensuring robust decision-making. Experimental results on real-world datasets (MIMIC-III) demonstrate that DOSAC-DTR significantly reduces mortality rates (Sepsis: 3.51%, Ventilation: 13.74%) and improves treatment alignment with physicians (Jaccard similarity: 0.362, 0.126) compared to baseline models. These findings underscore the potential of reinforcement learning in personalized healthcare, improving both treatment efficacy and patient safety.

KW - Actor-Critic

KW - Consumer-Centric Data

KW - Dynamic Treatment

KW - Next-Gen Healthcare

KW - Reinforcement learning

UR - http://www.scopus.com/inward/record.url?scp=105004918295&partnerID=8YFLogxK

U2 - 10.1109/TCE.2025.3565900

DO - 10.1109/TCE.2025.3565900

M3 - Article

AN - SCOPUS:105004918295

SN - 0098-3063

JO - IEEE Transactions on Consumer Electronics

JF - IEEE Transactions on Consumer Electronics

ER -
