Academic Paper

AI-Powered Toolkit for Automated Swallowing Kinematics Analysis in X-Ray Videofluoroscopy
Document Type
Conference
Source
2022 4th Novel Intelligent and Leading Emerging Sciences Conference (NILES), pp. 71-74, Oct. 2022
Subject
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Stomach
Pulmonary diseases
Heuristic algorithms
Pipelines
Kinematics
Manuals
Bones
Abstract
Dysphagia, or swallowing dysfunction, is any impairment of swallowing that may cause difficulty or discomfort in initiating a swallow or in transferring the bolus from the oral cavity to the stomach. Dysphagia can cause the bolus to reroute into the airway, an event known as aspiration, which can lead to adverse outcomes such as pneumonia or even death. The videofluoroscopic swallowing study (VFSS) is the gold-standard procedure for diagnosing dysphagia. In VFSS, trained clinicians calculate swallowing kinematics and inspect pathophysiological processes frame by frame. Though effective, VFSS evaluation is time-consuming and prone to subjective judgment and human error. In this study, we present a cascaded pipeline that employs several deep learning algorithms to automate VFSS analysis and identify swallowing abnormalities. The pipeline first separates the VFSS video into static and dynamic frames, which carry the swallowing features needed by the subsequent analysis tasks: pharyngeal swallow segmentation, hyoid bone tracking, bolus segmentation, and aspiration detection. This first stage is a shallow neural network (NN) that differentiates between static and dynamic VFSS frames with 98% accuracy using spatio-temporal features from TV-L1 optical flow. Then, a Single Shot MultiBox Detector (SSD) model localizes the body of the hyoid bone with a mean average precision (mAP) of 40% at an intersection over union (IoU) threshold of 0.5; it runs fast and performs above average even when the hyoid bone is occluded by the mandible. So far, the automated pipeline has shown performance comparable to manual analysis by trained clinicians.
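The detection score in the abstract depends on the IoU criterion, which can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the box format `(x_min, y_min, x_max, y_max)` and the names `iou` and `is_true_positive` are assumptions made here. It shows how a predicted hyoid bounding box would count as a correct detection at an IoU threshold of 0.5, the criterion underlying the reported mAP.

```python
from typing import Tuple

# Hypothetical box format (an assumption): (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection over union of two axis-aligned bounding boxes."""
    # Corners of the overlapping rectangle, if the boxes overlap at all.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes yield an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    # Degenerate (zero-area) boxes are treated as having no overlap.
    return inter / union if union > 0 else 0.0


def is_true_positive(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """A prediction counts as correct when its IoU with the annotation meets the threshold."""
    return iou(pred, truth) >= threshold
```

For example, a prediction shifted by half its width relative to the annotation overlaps it with an IoU of 1/3 and would not count as a detection at the 0.5 threshold. Mean average precision then summarizes precision over recall levels for detections ranked by confidence, a step this sketch leaves out.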