Millimeter-wave concurrent beamforming: A multi-player multi-armed bandit approach

Ehab Mahmoud Mohamed; Sherief Hashima; Kohei Hatano; Hani Kasban; Mohamed Rihan

doi:10.32604/cmc.2020.011816

Electrical Engineering

Research output: Contribution to journal › Article › peer-review

17 Scopus citations

Abstract

Communication in the millimeter-wave (mmWave) band, i.e., 30–300 GHz, is characterized by short-range transmissions and the use of antenna beamforming (BF). Thus, multiple mmWave access points (APs) should be installed to fully cover a target environment with gigabits per second (Gbps) connectivity. However, inter-beam interference prevents maximizing the sum rates of the established concurrent links. In this paper, a reinforcement learning (RL) approach is proposed for enabling mmWave concurrent transmissions by finding beam directions that maximize the long-term average sum rates of the concurrent links. Specifically, the problem is formulated as a multiplayer multiarmed bandit (MAB), where mmWave APs act as the players aiming to maximize their achievable rewards, i.e., data rates, and the arms to play are the available beam directions. In this setup, a selfish concurrent multiplayer MAB strategy is advocated. Four different MAB algorithms, namely, ϵ-greedy, upper confidence bound (UCB), Thompson sampling (TS), and exponential weight algorithm for exploration and exploitation (EXP3), are examined by employing them in each AP to selfishly enhance its beam selection based only on its previous observations. After a few rounds of interactions, mmWave APs learn how to select concurrent beams that enhance the overall system performance. The proposed MAB-based mmWave concurrent BF shows comparable performance to the optimal solution.
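The selfish multiplayer formulation described in the abstract can be illustrated with a minimal sketch. Only ϵ-greedy is shown. The toy interference model (`gain`, `leak`), the function names `sum_rates` and `simulate`, and all parameter values are assumptions made for illustration, not the paper's system model or results. Each AP keeps its own running mean reward per beam and sees only its own rate.

```python
import itertools
import math
import random


def sum_rates(gain, leak, choice):
    """Per-AP rates (bits/s/Hz) for one joint beam choice under a toy model."""
    rates = []
    for a, b in enumerate(choice):
        # Interference at AP a's link: leakage from every other AP's chosen beam.
        interference = sum(leak[o][choice[o]] for o in range(len(choice)) if o != a)
        rates.append(math.log2(1.0 + gain[a][b] / (1.0 + interference)))
    return rates


def simulate(num_aps=2, num_beams=4, rounds=2000, epsilon=0.1, seed=0):
    """Selfish epsilon-greedy beam selection; returns (sum-rate history, best sum rate)."""
    rng = random.Random(seed)
    # Hypothetical channel: gain[a][b] is AP a's signal gain on beam b,
    # leak[a][b] is the interference that beam causes to the other links.
    gain = [[rng.uniform(1.0, 10.0) for _ in range(num_beams)] for _ in range(num_aps)]
    leak = [[rng.uniform(0.0, 2.0) for _ in range(num_beams)] for _ in range(num_aps)]
    counts = [[0] * num_beams for _ in range(num_aps)]
    means = [[0.0] * num_beams for _ in range(num_aps)]
    history = []
    for _ in range(rounds):
        choice = []
        for a in range(num_aps):
            if rng.random() < epsilon:
                # Explore: pick a random beam.
                choice.append(rng.randrange(num_beams))
            else:
                # Exploit: pick the beam with the best mean reward seen by this AP.
                choice.append(max(range(num_beams), key=means[a].__getitem__))
        rates = sum_rates(gain, leak, choice)
        for a, b in enumerate(choice):
            # Each AP updates only from its own observed rate (selfish play).
            counts[a][b] += 1
            means[a][b] += (rates[a] - means[a][b]) / counts[a][b]
        history.append(sum(rates))
    # Exhaustive search over joint beam choices as the reference optimum.
    best = max(
        sum(sum_rates(gain, leak, c))
        for c in itertools.product(range(num_beams), repeat=num_aps)
    )
    return history, best


if __name__ == "__main__":
    history, best = simulate()
    tail = history[-200:]
    print("mean sum rate over last 200 rounds:", sum(tail) / len(tail))
    print("exhaustive-search optimum:", best)
```

The exhaustive search is feasible only for small numbers of APs and beams; it serves here as a reference point against which the learned sum rate can be compared.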

Original language	English
Pages (from-to)	1987-2007
Number of pages	21
Journal	Computers, Materials and Continua
Volume	65
Issue number	3
DOIs	https://doi.org/10.32604/cmc.2020.011816
State	Published - 2020

Keywords

Concurrent transmissions
Millimeter wave (mmWave)
Multiarmed bandit (MAB)
Reinforcement learning

Cite this

@article{c078652ec7274986bab3ccee7128a02d,
title = "Millimeter-wave concurrent beamforming: A multi-player multi-armed bandit approach",
abstract = "The communication in the Millimeter-wave (mmWave) band, i.e., 30\textasciitilde{}300 GHz, is characterized by short-range transmissions and the use of antenna beamforming (BF). Thus, multiple mmWave access points (APs) should be installed to fully cover a target environment with gigabits per second (Gbps) connectivity. However, inter-beam interference prevents maximizing the sum rates of the established concurrent links. In this paper, a reinforcement learning (RL) approach is proposed for enabling mmWave concurrent transmissions by finding out beam directions that maximize the long-term average sum rates of the concurrent links. Specifically, the problem is formulated as a multiplayer multiarmed bandit (MAB), where mmWave APs act as the players aiming to maximize their achievable rewards, i.e., data rates, and the arms to play are the available beam directions. In this setup, a selfish concurrent multiplayer MAB strategy is advocated. Four different MAB algorithms, namely, ϵ-greedy, upper confidence bound (UCB), Thompson sampling (TS), and exponential weight algorithm for exploration and exploitation (EXP3) are examined by employing them in each AP to selfishly enhance its beam selection based only on its previous observations. After a few rounds of interactions, mmWave APs learn how to select concurrent beams that enhance the overall system performance. The proposed MAB based mmWave concurrent BF shows comparable performance to the optimal solution.",
keywords = "Concurrent transmissions, Millimeter wave (mmWave), Multiarmed bandit (MAB), Reinforcement learning",
author = "Mohamed, {Ehab Mahmoud} and Sherief Hashima and Kohei Hatano and Hani Kasban and Mohamed Rihan",
year = "2020",
doi = "10.32604/cmc.2020.011816",
language = "English",
volume = "65",
pages = "1987--2007",
journal = "Computers, Materials and Continua",
issn = "1546-2218",
publisher = "Tech Science Press",
number = "3",
}

TY - JOUR
T1 - Millimeter-wave concurrent beamforming
T2 - A multi-player multi-armed bandit approach
AU - Mohamed, Ehab Mahmoud
AU - Hashima, Sherief
AU - Hatano, Kohei
AU - Kasban, Hani
AU - Rihan, Mohamed
PY - 2020
Y1 - 2020
N2 - The communication in the Millimeter-wave (mmWave) band, i.e., 30~300 GHz, is characterized by short-range transmissions and the use of antenna beamforming (BF). Thus, multiple mmWave access points (APs) should be installed to fully cover a target environment with gigabits per second (Gbps) connectivity. However, inter-beam interference prevents maximizing the sum rates of the established concurrent links. In this paper, a reinforcement learning (RL) approach is proposed for enabling mmWave concurrent transmissions by finding out beam directions that maximize the long-term average sum rates of the concurrent links. Specifically, the problem is formulated as a multiplayer multiarmed bandit (MAB), where mmWave APs act as the players aiming to maximize their achievable rewards, i.e., data rates, and the arms to play are the available beam directions. In this setup, a selfish concurrent multiplayer MAB strategy is advocated. Four different MAB algorithms, namely, ϵ-greedy, upper confidence bound (UCB), Thompson sampling (TS), and exponential weight algorithm for exploration and exploitation (EXP3) are examined by employing them in each AP to selfishly enhance its beam selection based only on its previous observations. After a few rounds of interactions, mmWave APs learn how to select concurrent beams that enhance the overall system performance. The proposed MAB based mmWave concurrent BF shows comparable performance to the optimal solution.
KW - Concurrent transmissions
KW - Millimeter wave (mmWave)
KW - Multiarmed bandit (MAB)
KW - Reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85091827824&partnerID=8YFLogxK
U2 - 10.32604/cmc.2020.011816
DO - 10.32604/cmc.2020.011816
M3 - Article
AN - SCOPUS:85091827824
SN - 1546-2218
VL - 65
SP - 1987
EP - 2007
JO - Computers, Materials and Continua
JF - Computers, Materials and Continua
IS - 3
ER -
