Academic Paper

Viewport Forecasting in 360° Virtual Reality Videos with Machine Learning
Document Type
Conference
Source
2019 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR), :74-747, Dec. 2019
Subject
Computing and Processing
Magnetic heads
Videos
Virtual reality
Trajectory
Head
Headphones
Machine learning
cloud gaming
360° video
body motion prediction
eye tracking
head mounted display
Language
Abstract
Objective. Virtual reality (VR) cloud gaming and 360° video streaming are on the rise. With a VR headset, viewers can individually choose the perspective they see on the head-mounted display by turning their head, which creates the illusion of being in a virtual room. In this experimental study, we applied machine learning methods to anticipate future head rotations (a) from preceding head and eye motions, and (b) from the statistics of other spherical video viewers.
Approach. Ten study participants each watched 3 1/3 hours of spherical video clips while their head and eye gaze motions were tracked using a VR headset with a built-in eye tracker. Machine learning models were trained on the recorded head and gaze trajectories to predict (a) changes of head orientation and (b) the viewport from population statistics.
Results. We assembled a dataset of head and gaze trajectories of spherical video viewers with high stimulus variability. We extracted statistical features from these time series and showed that a Support Vector Machine can classify the range of future head movements with a time horizon of up to one second with good accuracy. Even population statistics among only ten subjects show prediction success above chance level. Both approaches resulted in a considerable amount of prediction success using head movements, but using gaze movement did not contribute to prediction performance in a meaningful way. Even basic machine learning models can successfully predict head movement and aspects thereof, while being naive to the visual content.
Significance. Viewport forecasting opens up various avenues to optimize VR rendering and transmission. While the viewer can see only a section of the surrounding 360° sphere, the entire panorama typically has to be rendered and/or broadcast. The reason is rooted in the transmission delay, which has to be taken into account in order to avoid simulator sickness due to motion-to-photon latencies. Knowing in advance where the viewer is going to look may help to make cloud rendering and video streaming of VR content more efficient and, ultimately, the VR experience more appealing.
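The feature-extraction step mentioned under Results can be pictured with a minimal sketch. It is not the authors' pipeline: the abstract does not give the feature set, window length, sampling rate, or class boundaries, so the four velocity statistics, the one-second past window, the 10° threshold, the binary labels, and all function names below are illustrative assumptions. The sketch turns a yaw trace into pairs of (past-window features, future-movement class) that could then be passed to a Support Vector Machine implementation.

```python
import statistics

def angular_diff(a, b):
    """Smallest signed difference a - b in degrees, wrapped to [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

def unwrap(yaw):
    """Turn wrapped yaw samples into a continuous track without 360° jumps."""
    out = [yaw[0]]
    for prev, cur in zip(yaw, yaw[1:]):
        out.append(out[-1] + angular_diff(cur, prev))
    return out

def window_features(yaw, start, length, rate_hz):
    """Assumed feature set: mean, std, min, max of yaw velocity (deg/s)."""
    seg = yaw[start:start + length + 1]
    vel = [(b - a) * rate_hz for a, b in zip(seg, seg[1:])]
    return [statistics.fmean(vel), statistics.pstdev(vel), min(vel), max(vel)]

def future_range(yaw, start, length):
    """Range (max - min) of yaw in degrees over the future horizon."""
    seg = yaw[start:start + length + 1]
    return max(seg) - min(seg)

def make_dataset(yaw_wrapped, rate_hz, past_s=1.0, horizon_s=1.0,
                 threshold_deg=10.0):
    """Build (features, label) pairs; label 1 means a large future movement.

    past_s, horizon_s and threshold_deg are hypothetical defaults.
    """
    yaw = unwrap(yaw_wrapped)
    past = int(past_s * rate_hz)
    horizon = int(horizon_s * rate_hz)
    X, y = [], []
    for t in range(past, len(yaw) - horizon):
        X.append(window_features(yaw, t - past, past, rate_hz))
        y.append(1 if future_range(yaw, t, horizon) > threshold_deg else 0)
    return X, y

# Synthetic 10 Hz trace: head at rest, then a steady turn across the 0° wrap.
trace = [350.0] * 20 + [(350.0 + 5.0 * i) % 360.0 for i in range(1, 21)]
X, y = make_dataset(trace, rate_hz=10)
```

Unwrapping the yaw angle first keeps a turn across the 0°/360° boundary from showing up as a spurious 360° jump in the velocity features. A real pipeline would presumably add pitch, roll, and gaze channels and more than two movement classes.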