Academic Paper

Context-Aware Head-and-Eye Motion Generation with Diffusion Model
Document Type
Conference
Source
2024 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), pp. 157-167, Mar. 2024
Subject
Computing and Processing
Solid modeling
Visualization
Head
Motion segmentation
Computational modeling
Avatars
Virtual environments
Human-centered computing
Human computer interaction (HCI)
Interaction paradigms
Virtual reality
Language
English
ISSN
2642-5254
Abstract
In humanity’s ongoing quest to craft natural and realistic avatars within virtual environments, the generation of authentic eye gaze behaviors stands paramount. Eye gaze not only serves as a primary non-verbal communication cue, but it also reflects cognitive processes, intent, and attentiveness, making it a crucial element in ensuring immersive interactions. However, automatically generating these intricate gaze behaviors presents significant challenges. Traditional methods can be time-consuming and often lack the precision to align gaze behaviors with the intricate nuances of the environment in which the avatar resides. To overcome these challenges, we introduce a novel two-stage approach to generate context-aware head-and-eye motions across diverse scenes. By harnessing the capabilities of advanced diffusion models, our approach adeptly produces contextually appropriate eye gaze points, further leading to the generation of natural head-and-eye movements. Utilizing Head-Mounted Display (HMD) eye-tracking technology, we also present a comprehensive dataset, which captures human eye gaze behaviors in tandem with associated scene features. We show that our approach consistently delivers intuitive and lifelike head-and-eye motions and demonstrates superior performance in terms of motion fluidity, alignment with contextual cues, and overall user satisfaction.
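To make the two-stage idea described in the abstract concrete, the sketch below shows one plausible structure: a DDPM-style reverse diffusion loop samples gaze points conditioned on a scene feature vector (stage 1), and a simple post-process splits each gaze displacement into head and eye contributions (stage 2). This is a minimal illustration under assumed names and parameters (toy_denoiser, sample_gaze_points, gaze_to_head_eye_motion, the 0.6 head/eye ratio, 50 diffusion steps); it is not the authors' implementation, and a trained conditional denoising network would replace the toy denoiser.

```python
# Hypothetical sketch of a two-stage pipeline: (1) sample context-aware gaze
# points with DDPM-style ancestral sampling conditioned on scene features,
# (2) convert the gaze trajectory into head and eye rotation components.
# All function names and constants here are illustrative assumptions.
import numpy as np

RNG = np.random.default_rng(0)
T = 50  # number of diffusion steps (assumed)

def toy_denoiser(x_t, t, scene_feat):
    """Stand-in for a learned noise predictor conditioned on scene features."""
    # A trained network would predict the added noise; this toy version just
    # nudges samples by a scene-dependent scalar so the loop runs end to end.
    return x_t - scene_feat.mean()

def sample_gaze_points(scene_feat, n_points=10, dim=2):
    """Stage 1: DDPM-style reverse process producing 2D gaze points."""
    betas = np.linspace(1e-4, 0.02, T)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = RNG.standard_normal((n_points, dim))  # start from pure Gaussian noise
    for t in reversed(range(T)):
        eps_hat = toy_denoiser(x, t, scene_feat)            # predicted noise
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps_hat) / np.sqrt(alphas[t])    # posterior mean
        noise = RNG.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x  # gaze points in scene/image coordinates

def gaze_to_head_eye_motion(gaze_points, head_ratio=0.6):
    """Stage 2: split each gaze displacement into head and eye motion."""
    deltas = np.diff(gaze_points, axis=0, prepend=gaze_points[:1])
    head = head_ratio * deltas           # slower, larger head contribution
    eye = (1.0 - head_ratio) * deltas    # faster residual eye contribution
    return head, eye

scene_features = RNG.standard_normal(128)  # assumed scene embedding
gaze = sample_gaze_points(scene_features)
head_motion, eye_motion = gaze_to_head_eye_motion(gaze)
print(gaze.shape, head_motion.shape, eye_motion.shape)
```

In practice the fixed head/eye split would be replaced by a learned or biomechanically motivated model of head-eye coordination; the sketch only shows where such a stage sits relative to the diffusion-based gaze sampler.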