Academic Paper

GRU-Enhanced Decoding by Lightweight Transformer for Image Captioning

Document Type

Conference

Author

Chaudhary, Daksh; Sharma, Kapil; Sharma, Vikas; Jain, Prasuk

Source

2024 14th International Conference on Cloud Computing, Data Science & Engineering (Confluence), pp. 407-410, Jan 2024

Subject

Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineering Profession
Fields, Waves and Electromagnetics
General Topics for Engineers
Robotics and Control Systems
Signal Processing and Analysis
Visualization
Production
Transformers
Feature extraction
Encoding
Decoding
Machine translation

Language

ISSN

2766-421X

Abstract

Image captioning is the task of generating descriptive phrases for an image by combining visual and textual data. Transformers use an encoder-decoder configuration for language comprehension and machine translation. To obtain a small, lightweight model that is easy to deploy, we present the Lightweight Transformer with an embedded GRU decoder for image captioning. The model slims the usual architecture by cutting the stacks of encoders and decoders down to one encoder and a GRU-integrated decoder. In addition, feeding the encoder multilevel rich visual features from Inception V3 improves its performance. We assess the effectiveness of the proposed Lightweight Transformer architecture through a series of thorough experiments on the VizWiz-Captions dataset.
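
The abstract does not say how the GRU is wired into the decoder, so the following is only a minimal sketch of a generic GRU update step in plain Python, not the authors' implementation. The names `gru_step` and `init_params`, the weight layout, the initialization range, and the gate convention (new state = (1 - z) * old state + z * candidate, one of two common conventions) are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_params(input_size, hidden_size, seed=0):
    # Hypothetical helper: small random weights and zero biases for the
    # update gate, reset gate, and candidate state, in that order.
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    params = []
    for _ in range(3):
        params += [mat(hidden_size, input_size),
                   mat(hidden_size, hidden_size),
                   [0.0] * hidden_size]
    return params


def gru_step(x, h, params):
    # One generic GRU update; returns the new hidden state as a list.
    Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh = params
    # Update gate z and reset gate r, both in (0, 1).
    z = [sigmoid(a + b + c) for a, b, c in zip(matvec(Wz, x), matvec(Uz, h), bz)]
    r = [sigmoid(a + b + c) for a, b, c in zip(matvec(Wr, x), matvec(Ur, h), br)]
    # Candidate state is computed from the reset-scaled previous state.
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b + c)
            for a, b, c in zip(matvec(Wh, x), matvec(Uh, rh), bh)]
    # Blend previous state and candidate (assumed convention, see lead-in).
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


if __name__ == "__main__":
    params = init_params(input_size=4, hidden_size=3)
    h = [0.0] * 3
    # Two toy one-hot inputs standing in for embedded caption tokens.
    for x in ([1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]):
        h = gru_step(x, h, params)
    print(h)
```

A GRU keeps one hidden vector and needs only three weight groups per step, which fits the abstract's stated goal of a small, easily deployed decoder. How the paper combines this recurrence with attention over the encoder output is not specified in this record.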
