Optimal Deep Neural Network-Based Model for Answering Visual Medical Question

Karim Gasmi; Ibtihel Ben Ltaifa; Gaël Lejeune; Hamoud Alshammari; Lassaad Ben Ammar; Mahmood A. Mahmood

doi:10.1080/01969722.2021.2018543

Computer Sciences

Research output: Contribution to journal › Article › peer-review

24 Scopus citations

Abstract

Over the last few years, the amount of available information has increased exponentially in all professional fields, including the medical field. Modern-day patients have access to a wealth of medical information, whether it be from brochures, newspapers, television campaigns, or internet documents. To facilitate and accelerate the search for medical information, more precise systems have been implemented, such as visual question-and-answer systems. A visual question-and-answer system is designed to provide direct and precise answers to questions asked in natural language. In this context, we propose an optimal deep neural network model based on an adaptive optimization algorithm, which takes medical images and natural language questions as input, then provides precise answers as output. Our model starts by classifying medical questions following an embedding phase. We then use a deep learning model for visual and textual feature extraction and emergence. In this paper, we aim to maximize the accuracy rate and minimize the number of epochs in order to accelerate the process. This is a multi-objective optimization problem. The selection of deep learning model parameters, such as epoch number and batch size, is an essential step in improving the model; thus, we use an adaptive genetic algorithm to determine the optimal deep learning parameters. Finally, we propose a dense layer for answer retrieval. To evaluate our model, we used the ImageCLEF 2019 VQA data set. Our model outperforms existing visual question-and-answer systems and offers a significantly higher retrieval accuracy rate.
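
The hyperparameter search described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the search ranges, the synthetic `surrogate_accuracy` function (a stand-in for actually training a VQA model), the weighted-sum scalarization of the two objectives, and the decaying mutation rate used as the "adaptive" element are all assumptions made for illustration only.

```python
import random

# Hypothetical search space; the abstract does not state the actual ranges.
EPOCH_RANGE = (1, 50)
BATCH_SIZES = [8, 16, 32, 64, 128]


def surrogate_accuracy(epochs, batch_size):
    """Synthetic stand-in for training and validating a model (assumption)."""
    # Accuracy saturates with more epochs and is best at batch size 32.
    gain = 1.0 - 0.9 ** epochs
    penalty = 0.02 * abs(BATCH_SIZES.index(batch_size) - BATCH_SIZES.index(32))
    return max(0.0, 0.8 * gain - penalty)


def fitness(individual, epoch_weight=0.2):
    """Scalarize the two objectives: high accuracy, few epochs."""
    epochs, batch_size = individual
    return surrogate_accuracy(epochs, batch_size) - epoch_weight * epochs / EPOCH_RANGE[1]


def random_individual(rng):
    return (rng.randint(*EPOCH_RANGE), rng.choice(BATCH_SIZES))


def mutate(individual, rng, rate):
    """Perturb each gene with probability `rate`, staying inside the ranges."""
    epochs, batch_size = individual
    if rng.random() < rate:
        epochs = min(EPOCH_RANGE[1], max(EPOCH_RANGE[0], epochs + rng.randint(-5, 5)))
    if rng.random() < rate:
        batch_size = rng.choice(BATCH_SIZES)
    return (epochs, batch_size)


def crossover(a, b, rng):
    # Each gene is inherited from one of the two parents.
    return (rng.choice([a[0], b[0]]), rng.choice([a[1], b[1]]))


def genetic_search(generations=30, pop_size=20, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for gen in range(generations):
        # One simple adaptive choice: the mutation rate decays over generations.
        rate = 0.5 * (1 - gen / generations) + 0.05
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng, rate)
            for _ in range(pop_size - 1)
        ]
        # Elitism: the best individual always survives unchanged.
        population = [ranked[0]] + children
    return max(population, key=fitness)


if __name__ == "__main__":
    epochs, batch_size = genetic_search()
    print(f"epochs={epochs}, batch_size={batch_size}")
```

In a real setting, `surrogate_accuracy` would be replaced by a training-and-validation run, and a Pareto-based selection scheme could replace the weighted sum if the trade-off between accuracy and epoch count should not be fixed in advance.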

Original language	English
Pages (from-to)	403-424
Number of pages	22
Journal	Cybernetics and Systems
Volume	53
Issue number	5
DOIs	https://doi.org/10.1080/01969722.2021.2018543
State	Published - 2022

Keywords

Bi-LSTM
deep learning
EfficientNet
genetic algorithm
medical visual question answering
optimization

Cite this

@article{2dcca62c644e4011a6fe19bd92d4f6f8,

title = "Optimal Deep Neural Network-Based Model for Answering Visual Medical Question",

abstract = "Over the last few years, the amount of available information has increased exponentially in all professional fields, including the medical field. Modern-day patients have access to a wealth of medical information, whether it be from brochures, newspapers, television campaigns, or internet documents. To facilitate and accelerate the search for medical information, more precise systems have been implemented, such as visual question-and-answer systems. A visual question-and-answer system is designed to provide direct and precise answers to questions asked in natural language. In this context, we propose an optimal deep neural network model based on an adaptive optimization algorithm, which takes medical images and natural language questions as input, then provides precise answers as output. Our model starts by classifying medical questions following an embedding phase. We then use a deep learning model for visual and textual feature extraction and emergence. In this paper, we aim to maximize the accuracy rate and minimize the number of epochs in order to accelerate the process. This is a multi-objective optimization problem. The selection of deep learning model parameters, such as epoch number and batch size, is an essential step in improving the model, thus, we use an adaptive genetic algorithm to determine the optimal deep learning parameters. Finally, we propose a dense layer for answer retrieval. To evaluate our model, we used the ImageCLEF 2019 VQA data set. Our model outperforms existing visual question-and-answer systems and offers a significantly higher retrieval accuracy rate.",

keywords = "Bi-LSTM, deep learning, EfficientNet, genetic algorithm, medical visual question answering, optimization",

author = "Karim Gasmi and Ltaifa, {Ibtihel Ben} and Ga{\"e}l Lejeune and Hamoud Alshammari and Ammar, {Lassaad Ben} and Mahmood, {Mahmood A.}",

note = "Publisher Copyright: {\textcopyright} 2021 Taylor \& Francis Group, LLC.",

year = "2022",

doi = "10.1080/01969722.2021.2018543",

language = "English",

volume = "53",

pages = "403--424",

journal = "Cybernetics and Systems",

issn = "0196-9722",

publisher = "Taylor and Francis Ltd.",

number = "5",

}

TY - JOUR

T1 - Optimal Deep Neural Network-Based Model for Answering Visual Medical Question

AU - Gasmi, Karim

AU - Ltaifa, Ibtihel Ben

AU - Lejeune, Gaël

AU - Alshammari, Hamoud

AU - Ammar, Lassaad Ben

AU - Mahmood, Mahmood A.

PY - 2022

Y1 - 2022

AB - Over the last few years, the amount of available information has increased exponentially in all professional fields, including the medical field. Modern-day patients have access to a wealth of medical information, whether it be from brochures, newspapers, television campaigns, or internet documents. To facilitate and accelerate the search for medical information, more precise systems have been implemented, such as visual question-and-answer systems. A visual question-and-answer system is designed to provide direct and precise answers to questions asked in natural language. In this context, we propose an optimal deep neural network model based on an adaptive optimization algorithm, which takes medical images and natural language questions as input, then provides precise answers as output. Our model starts by classifying medical questions following an embedding phase. We then use a deep learning model for visual and textual feature extraction and emergence. In this paper, we aim to maximize the accuracy rate and minimize the number of epochs in order to accelerate the process. This is a multi-objective optimization problem. The selection of deep learning model parameters, such as epoch number and batch size, is an essential step in improving the model, thus, we use an adaptive genetic algorithm to determine the optimal deep learning parameters. Finally, we propose a dense layer for answer retrieval. To evaluate our model, we used the ImageCLEF 2019 VQA data set. Our model outperforms existing visual question-and-answer systems and offers a significantly higher retrieval accuracy rate.

KW - Bi-LSTM

KW - deep learning

KW - EfficientNet

KW - genetic algorithm

KW - medical visual question answering

KW - optimization

UR - https://www.scopus.com/pages/publications/85122006090

U2 - 10.1080/01969722.2021.2018543

DO - 10.1080/01969722.2021.2018543

M3 - Article

AN - SCOPUS:85122006090

SN - 0196-9722

VL - 53

SP - 403

EP - 424

JO - Cybernetics and Systems

JF - Cybernetics and Systems

IS - 5

ER -
