Academic Paper

Data Generation Using Sequence-to-Sequence
Document Type
Conference
Source
2018 IEEE Recent Advances in Intelligent Computational Systems (RAICS), pp. 108-112, Dec. 2018
Subject
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineering Profession
General Topics for Engineers
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Keywords
Data models
Decoding
Feeds
Training
Computational modeling
Error analysis
Context modeling
Sequence2Sequence
NLP
transliteration
LSTM
encoder
decoder
Language
English
Abstract
Sequence-to-sequence models have shown a lot of promise for problems such as Neural Machine Translation (NMT), text summarization, and paraphrase generation. Deep Neural Networks (DNNs) work well with large labeled training sets, but in sequence-to-sequence problems the mapping becomes much harder because the source and target differ in syntax, semantics, and length. Moreover, the use of DNNs is constrained by the fixed dimensionality of the input and output, which does not hold for most Natural Language Processing (NLP) problems. Our primary focus was to build transliteration systems for Indian languages. For Indian languages, monolingual corpora are abundantly available, but parallel corpora that can be applied directly to the transliteration problem are scarce. With the available parallel corpus, we could only build weak models. We propose a system that leverages the monolingual corpus to generate a clean, high-quality parallel corpus for transliteration, which is then used iteratively to tune the existing weak transliteration models. Our results confirm the hypothesis that the generation of clean data can be validated objectively by evaluating the models alongside the efficiency with which the system generates data in each iteration.
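This record does not include source code, so the following is only a minimal sketch of the kind of character-level LSTM encoder-decoder that the keywords (LSTM, encoder, decoder, Sequence2Sequence) point to, assuming PyTorch. The class name, vocabulary sizes, and dimensions are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class Seq2SeqTransliterator(nn.Module):
    """Illustrative character-level encoder-decoder for transliteration."""

    def __init__(self, src_vocab, tgt_vocab, emb_dim=128, hid_dim=256):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, src, tgt_in):
        # Encode the source character sequence; keep only the final (h, c)
        # state, which summarizes the whole word in a fixed-size vector.
        _, state = self.encoder(self.src_emb(src))
        # Decode conditioned on that state (teacher forcing: tgt_in is the
        # gold target sequence shifted right by one position).
        dec_out, _ = self.decoder(self.tgt_emb(tgt_in), state)
        return self.out(dec_out)  # (batch, tgt_len, tgt_vocab) logits

# Shape check with random ids: a batch of 2 source words of length 10
# decoded against targets of length 12.
model = Seq2SeqTransliterator(src_vocab=64, tgt_vocab=80)
logits = model(torch.randint(0, 64, (2, 10)), torch.randint(0, 80, (2, 12)))
assert logits.shape == (2, 12, 80)
```

Because the encoder compresses any input word into the same fixed-size state, this layout sidesteps the fixed input/output dimensionality constraint the abstract mentions: words of differing lengths all map to one state shape, and the decoder emits one character at a time.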
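The abstract's iterative scheme (use the weak model to transliterate monolingual words, keep only clean outputs, retune on the grown corpus) can be read as a confidence-filtered self-training loop. The sketch below builds on the Seq2SeqTransliterator above; the BOS/EOS ids, the 0.9 threshold, the mean per-character probability as a cleanliness score, and the train_one_pass helper are all assumptions for illustration, since this record does not specify the paper's filtering criterion.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

BOS, EOS = 1, 2  # illustrative special-token ids

def train_one_pass(model, pairs, lr=1e-3):
    """One tuning pass over the current parallel corpus (teacher forcing).

    Each pair is (src, tgt), both shape (1, T), with tgt framed by BOS/EOS.
    """
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for src, tgt in pairs:
        logits = model(src, tgt[:, :-1])  # predict each next character
        loss = loss_fn(logits.reshape(-1, logits.size(-1)),
                       tgt[:, 1:].reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()

@torch.no_grad()
def greedy_decode(model, src, max_len=30):
    """Decode one source word greedily; return target ids and mean confidence."""
    _, state = model.encoder(model.src_emb(src))
    tok = torch.tensor([[BOS]])
    out, probs = [], []
    for _ in range(max_len):
        dec, state = model.decoder(model.tgt_emb(tok), state)
        p = F.softmax(model.out(dec[:, -1]), dim=-1)
        conf, idx = p.max(dim=-1)
        if idx.item() == EOS:
            break
        out.append(idx.item())
        probs.append(conf.item())
        tok = idx.unsqueeze(0)
    return out, sum(probs) / max(len(probs), 1)

def bootstrap(model, seed_pairs, mono_words, rounds=3, threshold=0.9):
    """Grow a parallel corpus from monolingual words, round by round."""
    parallel = list(seed_pairs)
    for _ in range(rounds):
        train_one_pass(model, parallel)   # tune the weak model on what we have
        for src in mono_words:
            tgt, conf = greedy_decode(model, src)
            if tgt and conf >= threshold:  # keep only "clean" pairs
                parallel.append((src, torch.tensor([[BOS] + tgt + [EOS]])))
    return model, parallel
```

Under this reading, evaluating the model after each round against how many new pairs cleared the threshold mirrors the abstract's claim that data cleanliness can be validated objectively alongside the system's per-iteration generation efficiency.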