Abstract
Text-to-face generation is a sub-domain of text-to-image synthesis. It has a significant impact on new research areas and a wide range of applications in the public safety domain. Because suitable datasets are lacking, research on text-to-face generation remains very limited. Most existing work is based on partially trained generative adversarial networks (GANs), in which a pre-trained text encoder extracts the semantic features of the input sentence and these features are then used to train the image decoder. In this work, we propose a fully trained GAN to generate realistic and natural face images. The proposed approach trains the text encoder and the image decoder at the same time to produce more accurate and efficient results. A further contribution is a dataset created by amalgamating LFW, CelebA, and a locally prepared dataset, labeled according to our defined classes. Through different kinds of experiments, we show that the proposed fully trained GAN achieves better performance, generating good-quality images from the input sentence. Moreover, the visual results support these experiments: the generated face images match the given query.
Original language | English |
---|---|
Article number | 9163356 |
Pages (from-to) | 1250-1260 |
Number of pages | 11 |
Journal | IEEE Access |
Volume | 9 |
State | Published - 2021 |
Keywords
- CNN
- data augmentation
- face synthesis
- GAN
- image generation
- legal identity for all
- text to face