Academic Paper

Improving 3D Pose Estimation For Sign Language

Document Type

Conference

Author

Ivashechkin, Maksym; Mendez, Oscar; Bowden, Richard

Source

2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), pp. 1-5, Jun. 2023

Subject

Bioengineering
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Visualization
Solid modeling
Three-dimensional displays
Pose estimation
Neural networks
Gesture recognition
Kinematics
3D pose estimation
hand and body reconstruction

Abstract

This work addresses 3D human pose reconstruction from single images. We present a method that combines Forward Kinematics (FK) with neural networks to ensure fast and valid prediction of 3D pose. Pose is represented as a hierarchical tree/graph whose nodes correspond to human joints and model their physical limits. Given 2D keypoint detections in the image, we lift the skeleton to 3D using neural networks that predict both the joint rotations and the bone lengths. These predictions are then combined with skeletal constraints using an FK layer implemented as a network layer in PyTorch. The result is a fast and accurate approach to the estimation of 3D skeletal pose. Through quantitative and qualitative evaluation, we demonstrate that the method is significantly more accurate than MediaPipe in terms of both per-joint positional error and visual appearance. Furthermore, we demonstrate generalization across different datasets and sign languages. The PyTorch implementation takes between 100 and 200 milliseconds per image (including CNN detection) using CPU only.
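
The FK layer described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ForwardKinematics`, the `parents` list, the `rest_directions` buffer, and the axis-angle rotation parameterization are all assumptions made for illustration, and the joint-limit modelling mentioned in the abstract is omitted. The sketch only shows how predicted per-joint rotations and bone lengths might be composed along a kinematic tree into 3D joint positions using ordinary PyTorch tensor operations.

```python
import torch


def axis_angle_to_matrix(rotvec: torch.Tensor) -> torch.Tensor:
    """Rodrigues' formula: (..., 3) axis-angle vectors -> (..., 3, 3) matrices."""
    # A small epsilon keeps the division well defined for zero rotations.
    angle = torch.sqrt((rotvec ** 2).sum(dim=-1, keepdim=True) + 1e-12)
    x, y, z = (rotvec / angle).unbind(-1)
    zero = torch.zeros_like(x)
    # Skew-symmetric cross-product matrix K of the unit rotation axis.
    K = torch.stack([zero, -z, y, z, zero, -x, -y, x, zero], dim=-1)
    K = K.reshape(rotvec.shape[:-1] + (3, 3))
    angle = angle.unsqueeze(-1)
    eye = torch.eye(3, dtype=rotvec.dtype, device=rotvec.device)
    return eye + torch.sin(angle) * K + (1.0 - torch.cos(angle)) * (K @ K)


class ForwardKinematics(torch.nn.Module):
    """Hypothetical FK layer: local rotations + bone lengths -> joint positions."""

    def __init__(self, parents, rest_directions):
        super().__init__()
        # parents[j] is the index of joint j's parent, or -1 for the root.
        # Assumption: every parent appears before its children in the list.
        self.parents = list(parents)
        # Unit direction of each bone (parent -> joint) in the rest pose.
        self.register_buffer(
            "rest_directions",
            torch.as_tensor(rest_directions, dtype=torch.float32),
        )

    def forward(self, rotations, bone_lengths):
        # rotations: (B, J, 3) local axis-angle; bone_lengths: (B, J)
        local = axis_angle_to_matrix(rotations)                       # (B, J, 3, 3)
        offsets = self.rest_directions * bone_lengths.unsqueeze(-1)   # (B, J, 3)
        global_rot = [None] * len(self.parents)
        positions = [None] * len(self.parents)
        for j, p in enumerate(self.parents):
            if p < 0:
                # The root sits at the origin with its own local rotation.
                global_rot[j] = local[:, j]
                positions[j] = torch.zeros_like(offsets[:, j])
            else:
                # Accumulate rotations down the tree; the bone to joint j is
                # oriented by the parent's accumulated rotation.
                global_rot[j] = global_rot[p] @ local[:, j]
                bone = (global_rot[p] @ offsets[:, j].unsqueeze(-1)).squeeze(-1)
                positions[j] = positions[p] + bone
        return torch.stack(positions, dim=1)                          # (B, J, 3)


# Usage: a three-joint chain along the x-axis with bone lengths 2 and 1.
fk = ForwardKinematics(
    parents=[-1, 0, 1],
    rest_directions=[[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
)
lengths = torch.tensor([[0.0, 2.0, 1.0]])
rotations = torch.zeros(1, 3, 3)
rotations[0, 1, 2] = 3.141592653589793 / 2  # bend the middle joint 90 degrees about z
joints = fk(rotations, lengths)  # approximately [[0, 0, 0], [2, 0, 0], [2, 1, 0]]
```

Because the bone lengths enter only as scale factors on fixed rest directions, the output skeleton in this sketch always has exactly the predicted bone lengths, which is one way a network's output can be kept structurally valid. Every operation is a standard differentiable tensor operation, so a layer like this could sit at the end of a lifting network and be trained against 3D joint positions.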
