Development of Naïve Algorithm for Generation of Digital Image by Generative Adversarial Text using Convolutional Generative Adversarial Network Algorithm

Authors

  • Bharti Kumari, Research Scholar, SHEAT College of Engineering, Varanasi, Uttar Pradesh, India
  • Sonam Singh, Assistant Professor, SHEAT College of Engineering, Varanasi, Uttar Pradesh, India

Keywords:

Text-to-Image Conversion, Deep Convolutional Generative Adversarial Networks, Virtual Reality, Augmented Reality, MS-COCO Dataset

Abstract

Text-to-image synthesis is a relatively new task within image synthesis. Earlier work approached it mainly as retrieval, matching words to images on the basis of sentences or keywords. Advances in deep learning, particularly deep generative models, have since brought substantial progress in image synthesis. Generative adversarial networks (GANs) are among the most influential generative models and have been applied effectively in computer vision, natural language processing, and other fields. This paper reviews and consolidates recent research on GAN-based text-to-image synthesis. The input to these methods now includes not only conventional text descriptions but also scene layouts and dialogue text. The methods can be grouped into three classes according to their advances in the use of text information, network architecture, and output control conditions. Deep convolutional GANs can produce visually compelling images in specific categories, such as album covers, room interiors, and faces. In this study, we propose a deep architecture and GAN formulation that bridges these advances in text and image modeling, translating visual concepts from characters to pixels. We demonstrate that the model can generate realistic images of birds and flowers from detailed text descriptions.
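The abstract does not spell out the architecture, so the following is only a minimal sketch of one common way a text-conditioned GAN is set up: a sentence embedding is compressed by a small projection, concatenated with Gaussian noise to form the generator input, and the two networks are trained with the standard adversarial cross-entropy losses. All names here (`project`, `generator_input`, `discriminator_loss`, `generator_loss`) and the dimensions are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import math
import random


def leaky_relu(x, slope=0.2):
    """Leaky ReLU activation, commonly used in DCGAN-style networks."""
    return x if x > 0 else slope * x


def project(embedding, weights):
    """Compress a text embedding: linear projection followed by leaky ReLU.

    `weights` is a list of rows, one row per output dimension (assumed shape).
    """
    return [
        leaky_relu(sum(w * e for w, e in zip(row, embedding)))
        for row in weights
    ]


def generator_input(text_embedding, weights, noise_dim, rng):
    """Concatenate Gaussian noise with the compressed text embedding."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return noise + project(text_embedding, weights)


def bce(prob, target, eps=1e-12):
    """Binary cross-entropy for a single discriminator output in (0, 1)."""
    return -(target * math.log(prob + eps)
             + (1.0 - target) * math.log(1.0 - prob + eps))


def discriminator_loss(p_real, p_fake):
    """Discriminator wants real pairs scored as 1 and generated pairs as 0."""
    return bce(p_real, 1.0) + bce(p_fake, 0.0)


def generator_loss(p_fake):
    """Generator wants its generated pairs scored as 1 by the discriminator."""
    return bce(p_fake, 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    embedding = [0.5, -1.0, 2.0, 0.0]          # toy 4-d "sentence embedding"
    weights = [[1, 0, 0, 0], [0, 1, 0, 0]]     # toy 4 -> 2 projection
    z = generator_input(embedding, weights, noise_dim=3, rng=rng)
    print(len(z), discriminator_loss(0.5, 0.5), generator_loss(0.5))
```

In a full model the concatenated vector would feed a stack of transposed convolutions that upsample it to an image, and the discriminator would receive the same text embedding alongside the image; both are omitted here to keep the sketch short.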

Published

13-04-2024

Section

Research Articles

How to Cite

Development of Naïve Algorithm for Generation of Digital Image by Generative Adversarial Text using Convolutional Generative Adversarial Network Algorithm. (2024). International Journal of Scientific Research in Science and Technology, 11(2), 965-974. https://ijsrst.com/index.php/home/article/view/IJSRST24112176
