Academic Paper

Refinement of Utterance Fluency Feature Extraction and Automated Scoring of L2 Oral Fluency with Dialogic Features
Document Type
Conference
Source
2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 1312-1320, Nov. 2022
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Keywords
Electric breakdown
Manuals
Information processing
Syntactics
Predictive models
Maintenance engineering
Language
English
ISSN
2640-0103
Abstract
We propose an automated fluency scoring method that is compatible with second language (L2) dialogic responses. Human judgments of L2 oral fluency in dialogue tasks differ in nature from the scoring of monologues, so it is necessary to capture the dialogic aspect of fluency. Because utterances in dialogue tend to be less fluent than in monologue, procedures such as pruning disfluency words and classifying pauses by their syntactic locations are essential for automated scoring systems to extract utterance features strongly correlated with human ratings. However, existing automated utterance feature extractors have struggled to detect disfluency words and pause locations because of technical challenges. Moreover, conventional automated scoring methods for L2 spoken dialogue often predict oral proficiency for each turn, and dialogic features have not been considered properly. To address these gaps between the nature of fluency in dialogue and existing automated scoring methodologies, we refine an automated utterance feature extractor and design a fluency scoring model based on dialogic features. Experiments showed substantial agreement between our feature extractor and human annotators in detecting disfluency words and pause locations (Cohen's κ > 0.61). We also found that the proposed scoring method predicted subjective fluency scores more accurately (quadratic weighted κ = 0.833) than a conventional turn-level scoring model (0.654) and even a manual rating (0.799). We additionally compared assessment approaches with and without disfluency features and pause locations, and found that including them improved the accuracy of predicting subjective fluency scores. These results suggest which utterance and dialogic features should be utilized, and how, in automated scoring of spoken dialogue.
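The abstract reports agreement with Cohen's κ and scoring accuracy with quadratic weighted κ (QWK). The sketch below shows how both metrics can be computed with scikit-learn; the label and score arrays are hypothetical placeholders for illustration only, not data or code from the paper.

```python
# A minimal sketch of the two agreement metrics named in the abstract,
# using scikit-learn. All arrays below are hypothetical examples.
from sklearn.metrics import cohen_kappa_score

# Hypothetical annotations: machine vs. human disfluency-word labels
# (1 = disfluency word, 0 = fluent word) for ten tokens.
machine_labels = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
human_labels = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]

# Plain Cohen's kappa measures detection agreement beyond chance;
# the paper reports kappa > 0.61 for its feature extractor.
kappa = cohen_kappa_score(machine_labels, human_labels)

# Hypothetical ordinal fluency scores (e.g., a 1-5 scale) for eight responses.
predicted_scores = [3, 4, 2, 5, 3, 4, 1, 2]
rater_scores = [3, 4, 3, 5, 3, 3, 1, 2]

# Quadratic weighted kappa (QWK) penalizes large ordinal disagreements
# more heavily; the paper reports QWK = 0.833 for the proposed model.
qwk = cohen_kappa_score(predicted_scores, rater_scores, weights="quadratic")

print(f"Cohen's kappa: {kappa:.3f}, QWK: {qwk:.3f}")
```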