TY - JOUR
T1 - Deep convolutional neural network architecture design as a bi-level optimization problem
AU - Louati, Hassen
AU - Bechikh, Slim
AU - Louati, Ali
AU - Hung, Chih Cheng
AU - Ben Said, Lamjed
N1 - Publisher Copyright: © 2021 Elsevier B.V.
PY - 2021/6/7
Y1 - 2021/6/7
AB - During the last decade, deep neural networks have shown strong performance in many machine learning tasks such as classification and clustering. One of the most successful networks is the Convolutional Neural Network (CNN), which has been applied in many domains such as pattern recognition, medical diagnosis, and signal processing. Despite the strong performance of CNNs, their architecture design remains a major challenge for researchers and practitioners. Several works have been proposed in the literature with the aim of finding optimized architectures such as ResNet and VGGNet. Unfortunately, most of these architectures are either manually defined by experts or automatically designed by greedy induction algorithms. Recent works suggest the use of Evolutionary Algorithms (EAs) thanks to their ability to escape locally optimal architectures. Although EAs have shown interesting performance, researchers in this direction have treated the design task as a single-level optimization problem, which is the main research gap we tackle in this paper. Our main contribution is the observation that CNN architecture design has a hierarchical nature and can thus be seen as a Bi-Level Optimization Problem (BLOP) where: (1) the upper level minimizes the network complexity, defined by the number of blocks and the number of nodes per block; and (2) the lower level optimizes the topologies of the convolution block graphs by maximizing the classification accuracy. Motivated by the originality of this observation with respect to the state of the art, we frame for the first time the CNN architecture design problem as a BLOP and then solve it using an adapted version of an existing efficient bi-level EA, by defining the solution encoding, the fitness function, and the variation operators at each level. The adapted EA is named BLOP-CNN and is assessed on the image classification task using the commonly employed CIFAR-10 and CIFAR-100 benchmark data sets. The analysis of our experimental results shows the merits of our proposed method in providing the user with optimized architectures that outperform many recent and prominent architectures from three different approaches, namely manual design, reinforcement learning-based generation, and evolutionary optimization. Moreover, to show the applicability of our approach, we have conducted a case study on the detection of COVID-19 using a set of benchmark chest X-ray and Computed Tomography (CT) images.
KW - Bi-level optimization
KW - Deep CNN architecture design
KW - Deep learning
KW - Evolutionary algorithms
UR - http://www.scopus.com/inward/record.url?scp=85100778682&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2021.01.094
DO - 10.1016/j.neucom.2021.01.094
M3 - Article
AN - SCOPUS:85100778682
SN - 0925-2312
VL - 439
SP - 44
EP - 62
JO - Neurocomputing
JF - Neurocomputing
ER -