Academic Paper

The Visual Assistant - Image-to-Speech Generator

Document Type

Conference

Author

Raj, Amrit; Ghosh, Sanchita; Gupta, Bharat

Source

2023 IEEE 3rd International Conference on Sustainable Energy and Future Electric Transportation (SEFET), pp. 1-6, Aug. 2023

Subject

Aerospace
Computing and Processing
Power, Energy and Industry Applications
Robotics and Control Systems
Transportation
Training
Visualization
Navigation
Roads
Computational modeling
Visual impairment
Urban areas
Flask
CNN
RNN
LSTM
TensorFlow
Keras
GloVe
Greedy Search
Beam Search
EfficientNetV2L

Abstract

In this paper, we present an image-to-voice conversion model designed to assist drivers. The model can be integrated into vehicle navigation systems to analyze images of roads and signs captured by onboard cameras and convert them into voice instructions, providing real-time guidance to the driver. This can be particularly useful in unfamiliar areas or when visibility is limited. Image caption generation is a well-known research area within artificial intelligence that focuses on comprehending an image and producing a linguistic description of it. Producing well-formed sentences requires both a syntactic and a semantic understanding of the language. Additionally, our model can contribute to the development of intelligent transportation systems by providing an accessible means for visually impaired individuals to navigate public spaces. By converting visuals into voice instructions, the model helps individuals with visual impairments travel independently and confidently through urban environments, enhancing their mobility and inclusion.

