A Survey On Image Captioning
- Resource Type
- Conference
- Authors
- Osaid, Muhammad; Memon, Zulfiqar Ali
- Source
- 2022 International Conference on Emerging Trends in Smart Technologies (ICETST), pp. 1-6, Sep. 2022
- Subject
- Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Computer vision
Neural networks
Market research
Natural language processing
Decoding
Convolutional neural networks
Task analysis
image captioning
supervised
unsupervised
encoder
decoder
LSTM (long short-term memory)
RNN (recurrent neural network)
CNN (convolutional neural network)
- Language
Image captioning is a challenging task that has attracted a great deal of ongoing research in the field of computer vision. It combines natural language processing with computer vision to produce captions for images. The majority of the papers reviewed for this survey use the encoder-decoder framework, but there are many other approaches, including supervised and unsupervised image captioning. One technique used for unsupervised captioning is scene graph alignment. Some authors reconstructed older procedures, rather than adopting newer ones, in order to produce better results. The majority of publications used a CNN as the encoder and an RNN or LSTM as the decoder. Some authors also employed centred graphs to reconstruct several sentences from them. Another strategy is the actor-critic model, in which the actor performs a task and the critic offers feedback that helps improve the results.
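The encoder-decoder data flow mentioned in the abstract can be illustrated with a toy sketch. In the usual setup a convolutional network encodes the image into a feature vector and a recurrent network decodes that vector one word at a time. Nothing below comes from the surveyed papers: the vocabulary, the hand-set score tables `TRANSITION` and `IMAGE_WEIGHTS`, and the functions `encode` and `decode_greedy` are hypothetical stand-ins for learned components, chosen only to show where image features enter a greedy decoding loop.

```python
# Toy vocabulary; a real system would build one from its training captions.
VOCAB = ["<start>", "a", "dog", "cat", "runs", "<end>"]

# Hypothetical hand-set scores standing in for learned decoder weights:
# TRANSITION[prev][word] is how strongly `word` tends to follow `prev`.
TRANSITION = {
    "<start>": {"a": 2.0},
    "a": {"dog": 1.0, "cat": 1.0},
    "dog": {"runs": 2.0},
    "cat": {"runs": 2.0},
    "runs": {"<end>": 2.0},
}

# Hypothetical per-word weights tying words to image features.
IMAGE_WEIGHTS = {"dog": [1.0, 0.0], "cat": [0.0, 1.0]}


def encode(image):
    """Stand-in for a CNN encoder: reduce a 2-row pixel grid to one mean per row."""
    return [sum(row) / len(row) for row in image]


def decode_greedy(features, max_len=10):
    """Stand-in for an RNN/LSTM decoder: pick the best-scoring word at each step."""
    caption, prev = [], "<start>"
    for _ in range(max_len):
        def score(word):
            # Language-model term plus an image-conditioned term.
            weights = IMAGE_WEIGHTS.get(word, [0.0, 0.0])
            image_term = sum(w * f for w, f in zip(weights, features))
            return TRANSITION[prev].get(word, 0.0) + image_term

        prev = max(VOCAB, key=score)
        if prev == "<end>":
            break
        caption.append(prev)
    return caption


top_bright = [[0.9, 0.8], [0.1, 0.2]]
bottom_bright = [[0.1, 0.2], [0.9, 0.8]]
print(" ".join(decode_greedy(encode(top_bright))))
print(" ".join(decode_greedy(encode(bottom_bright))))
```

The two toy images differ only in which row is brighter, and that difference alone changes the noun the decoder selects. In a trained model both score tables would be replaced by network parameters, and greedy selection is often replaced by beam search.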