Academic Article

Visual World to an Audible Experience: Visual Assistance for the Blind And Visually Impaired
Document Type
Conference
Source
2020 IEEE 17th India Council International Conference (INDICON), pp. 1-6, Dec. 2020
Subject
Aerospace
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Fields, Waves and Electromagnetics
Photonics and Electrooptics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Visualization
Webcams
Blindness
Real-time systems
Integrated circuit modeling
Task analysis
Long short-term memory
Deep Learning
Visual Question Answering
Image Captioning
Real Time
Language
ISSN
2325-9418
Abstract
This paper aims to assist visually impaired people through Deep Learning (DL) by providing a system that can describe the user's surroundings and answer questions about them. The system consists mainly of two models: an Image Captioning (IC) model and a Visual Question Answering (VQA) model. The IC model is a Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) based architecture that incorporates a form of attention while captioning. For the VQA task, the paper proposes two models, one Multi-Layer Perceptron (MLP) based and one Long Short-Term Memory (LSTM) based, that answer questions related to the input image. The IC model achieved an average BLEU-1 score of 0.46, and the LSTM-based VQA model achieved an overall accuracy of 47 percent. These two models are integrated with Speech-to-Text and Text-to-Speech components to form a single system that works in real time.
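
The abstract gives no implementation details, but a minimal sketch of the kind of LSTM-based VQA model it describes might look like the following. It assumes pre-extracted CNN image features (e.g., a 4096-dimensional VGG fully connected layer), element-wise fusion of the image and question encodings, and classification over a fixed answer vocabulary; the class name LSTMVQA, all layer sizes, and the fusion operator are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch (NOT the paper's code) of an LSTM-based VQA model:
# a CNN image feature vector and an LSTM question encoding are fused and
# classified over a fixed set of candidate answers.
import torch
import torch.nn as nn

class LSTMVQA(nn.Module):
    def __init__(self, vocab_size, num_answers,
                 embed_dim=300, hidden_dim=512, img_feat_dim=4096):
        super().__init__()
        # Question branch: word embeddings fed through an LSTM.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Image branch: project pre-extracted CNN features into the
        # same space as the question encoding.
        self.img_proj = nn.Linear(img_feat_dim, hidden_dim)
        # Fusion + classifier over the answer vocabulary.
        self.classifier = nn.Sequential(
            nn.Linear(hidden_dim, 1024),
            nn.ReLU(),
            nn.Linear(1024, num_answers),
        )

    def forward(self, img_feats, question_tokens):
        _, (h, _) = self.lstm(self.embed(question_tokens))
        q = h[-1]                        # final LSTM hidden state
        v = torch.tanh(self.img_proj(img_feats))
        fused = q * v                    # element-wise fusion (one common choice)
        return self.classifier(fused)    # logits over candidate answers

# Smoke test with random inputs: batch of 2, questions of 12 tokens.
model = LSTMVQA(vocab_size=10000, num_answers=1000)
logits = model(torch.randn(2, 4096), torch.randint(0, 10000, (2, 12)))
print(logits.shape)  # torch.Size([2, 1000])
```

Treating VQA as classification over a vocabulary of frequent answers is the standard formulation for MLP- and LSTM-based baselines of this kind; the element-wise product is one common fusion choice among several (concatenation being another).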