Generating Scene Layout from Textual Descriptions Using Transformer
- Resource Type
- Conference
- Authors
- Takahashi, Haruka; Kuriyama, Shigeru
- Source
- 2023 10th International Conference on Advanced Informatics: Concept, Theory and Application (ICAICTA), pp. 1-6, Oct. 2023
- Subject
- Computing and Processing; Training; Image synthesis; Generative AI; Computational modeling; Layout; Predictive models; Transformers; Text-to-Image; layout generation; Transformer
- Language
Creating images with generative AI remains challenging because it requires specifying the detailed layout and content of objects. This paper introduces a layout generation method that uses Transformer-based deep neural networks to generate scene representations containing multiple objects, producing an explainable layout of the image's objects automatically from text. Unlike conventional layout generation, which requires sequential object prediction and post-processing to remove duplicate bounding boxes, our end-to-end approach uses parallel decoding. Experiments comparing the quality and computational cost of our method with existing ones demonstrate its effectiveness and efficiency in generating layouts from textual descriptions. Our Text-to-Layout approach offers a practical authoring tool that runs on relatively lightweight networks while facilitating explainable image generation.
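The core idea of parallel decoding, as contrasted with sequential box prediction in the abstract, can be sketched in plain NumPy. This is a minimal illustration under assumed details, not the paper's implementation: all dimensions, weights, and the single cross-attention layer are hypothetical stand-ins for a trained Transformer decoder. A fixed set of object queries attends to the encoded text once, and every bounding box is emitted simultaneously by shared prediction heads, so no autoregressive loop or duplicate-box post-processing is needed.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Toy dimensions (hypothetical; the paper does not specify these values)
d_model = 32      # embedding width
n_queries = 4     # number of object slots decoded in parallel
n_tokens = 6      # length of the encoded text description
n_classes = 5     # object vocabulary size (incl. a "no object" class)

# Stand-in for a text encoder's output over the description tokens
text_memory = rng.normal(size=(n_tokens, d_model))

# Learned object queries: all layout slots are decoded at once,
# instead of predicting one bounding box per sequential step
queries = rng.normal(size=(n_queries, d_model))

# One cross-attention layer (random weights stand in for trained ones)
W_q = rng.normal(size=(d_model, d_model)) / np.sqrt(d_model)
W_k = rng.normal(size=(d_model, d_model)) / np.sqrt(d_model)
W_v = rng.normal(size=(d_model, d_model)) / np.sqrt(d_model)

attn = softmax((queries @ W_q) @ (text_memory @ W_k).T / np.sqrt(d_model))
decoded = attn @ (text_memory @ W_v)          # (n_queries, d_model)

# Shared prediction heads: a class and a box per query, applied in parallel
W_cls = rng.normal(size=(d_model, n_classes))
W_box = rng.normal(size=(d_model, 4))

class_logits = decoded @ W_cls                # (n_queries, n_classes)
boxes = 1.0 / (1.0 + np.exp(-(decoded @ W_box)))  # (x, y, w, h) in [0, 1]

print(boxes.shape)  # (4, 4): four boxes emitted in one forward pass
```

Because every slot is decoded in the same forward pass, the cost is independent of the number of objects in the scene, which is the efficiency advantage the abstract claims over sequential prediction.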