Academic Article

DiveNet: Dive Action Localization and Physical Pose Parameter Extraction for High Performance Training
Document Type
Periodical
Source
IEEE Access, vol. 11, pp. 37749-37767, 2023
Subject
Aerospace
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Engineering Profession
Fields, Waves and Electromagnetics
General Topics for Engineers
Geoscience
Nuclear Engineering
Photonics and Electrooptics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Sports
Location awareness
Streaming media
Motion segmentation
Convolutional neural networks
Pose estimation
Training
Deep learning
diving sports pose
action localization
sports analytics
Language
ISSN
2169-3536
Abstract
The tremendous progress of deep convolutional neural networks has produced promising results in the classification of various sports activities. However, accurately localizing a particular sports event or activity in a continuous video stream remains a challenging problem. Accurate detection of sports actions enables objective comparison of different performances. In this work, we propose the DiveNet action localization module to detect springboard diving actions in an unconstrained environment. We use a Temporal Convolutional Network (TCN) over a backbone feature extractor to localize diving actions with low latency. Using the temporal demarcations provided by the action localization step, we estimate the diver's center of mass (COM) trajectory and the peak dive height via the projectile motion formula. In addition, we train a DiveNet pose regression network, which extends the Unipose architecture with direct physical parameter estimation, i.e., COM and 2D joint keypoints. We propose a new method for computing the homography between the diving motion plane and the image view for each dive. This enables the representation of physical parameters in metric scale without any calibration. We release the first publicly available diving sports video dataset, recorded at 60 Hz with a static camera setup for different springboard heights. DiveNet action localization achieves an accuracy of 95% with single-frame latency (< 25 ms). The DiveNet pose regression model shows competitive results of around 70% PCK on different diving pose datasets. We achieve a COM accuracy of 6 pixels, a dive peak height sensitivity of 20 cm, and mean joint angle errors of around 10 degrees.
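The abstract states that peak dive height is obtained from the temporal demarcations via the projectile motion formula, but does not give the computation. As an illustration only, the following minimal sketch shows how a peak height could be derived from take-off and water-entry frame indices under standard projectile motion. It is not the authors' implementation. The function name, the argument names, the neglect of air drag, and the simplification that the COM is at water level at the entry frame are all assumptions made for this example; only the 60 Hz frame rate comes from the abstract.

```python
G = 9.81  # gravitational acceleration in m/s^2


def peak_height_from_frames(takeoff_frame, entry_frame, board_height_m,
                            fps=60.0, g=G):
    """Hypothetical helper: estimate COM peak height from dive timing.

    Assumes pure vertical projectile motion (no drag) and that the COM is
    at water level (height 0) at the entry frame. Both are simplifications
    made for this sketch.
    """
    flight_time = (entry_frame - takeoff_frame) / fps
    if flight_time <= 0:
        raise ValueError("entry_frame must come after takeoff_frame")

    # Solve 0 = h + v0*T - 0.5*g*T^2 for the vertical take-off velocity v0.
    v0 = (0.5 * g * flight_time ** 2 - board_height_m) / flight_time

    # With an upward take-off, the apex is v0^2 / (2g) above the board;
    # with a downward or zero take-off velocity, the board is the apex.
    rise = v0 ** 2 / (2.0 * g) if v0 > 0 else 0.0
    time_to_peak = v0 / g if v0 > 0 else 0.0

    return v0, time_to_peak, board_height_m + rise


if __name__ == "__main__":
    # Example: 3 m springboard, 90 frames of flight at 60 Hz (1.5 s).
    v0, t_peak, peak = peak_height_from_frames(0, 90, board_height_m=3.0)
    print(f"v0 = {v0:.3f} m/s, t_peak = {t_peak:.3f} s, peak = {peak:.3f} m")
```

At 60 Hz, a one-frame error in either boundary shifts the flight time by about 17 ms, so in a timing-based estimate like this one the accuracy of the localization step directly bounds the accuracy of the height.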