학술논문

Unsupervised multiple choices question answering via universal corpus
Document Type
Working Paper
Source
Subject
Computer Science - Computation and Language
Language
Abstract
Unsupervised question answering is a promising yet challenging task, which alleviates the burden of building large-scale annotated data in a new domain. It motivates us to study the unsupervised multiple-choice question answering (MCQA) problem. In this paper, we propose a novel framework designed to generate synthetic MCQA data barely based on contexts from the universal domain without relying on any form of manual annotation. Possible answers are extracted and used to produce related questions, then we leverage both named entities (NE) and knowledge graphs to discover plausible distractors to form complete synthetic samples. Experiments on multiple MCQA datasets demonstrate the effectiveness of our method.
Comment: 5 pages, 1 figures, published to ICASSP 2024