Academic Paper

Ensemble Deep Learning for Human-Object Interaction Detection
Document Type
Conference
Source
2022 2nd International Mobile, Intelligent, and Ubiquitous Computing Conference (MIUCC), pp. 81-86, May 2022
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Robotics and Control Systems
Signal Processing and Analysis
Visualization
Computational modeling
Object detection
Predictive models
Feature extraction
Ubiquitous computing
Prediction algorithms
Machine Learning
Computer Vision
Human Activity Recognition
Visual Relationship
Language
Abstract
Human-object interaction (HOI) detection is the task of predicting the visual relationships between humans and their surroundings in images and videos by locating human-object pairs and inferring the interactions between them. Most existing models approach this task by detecting human and object instances and then predicting the interactions between them with a single model that relies on visual features to distinguish between actions. In this paper, we propose a novel method for HOI detection that applies ensemble voting to simple models that use only spatial features to predict the action of each human-object pair. Additionally, we use bipartite matching to remove the false-positive pairs produced by mis-grouping and by non-interacting objects in the image. Our proposed approach outperforms many state-of-the-art models on the V-COCO dataset while requiring less inference time.
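
The two steps named in the abstract, ensemble voting and bipartite matching, can be illustrated with a minimal sketch. The abstract does not specify the voting rule, the matching algorithm, or the score definition, so everything below is an assumption: `majority_vote` uses hard majority voting, `match_pairs` uses a thresholded one-to-one matching found with augmenting paths, and the function names, the `scores` matrix, and the `threshold` value are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the action label predicted by the most ensemble members.

    Ties go to the label that appears first in `predictions`.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def match_pairs(scores, threshold=0.5):
    """Assumed one-to-one human-object matching via augmenting paths.

    scores[h][o] is a hypothetical interaction score for human h and
    object o. Pairs scoring below `threshold` are never matched, which
    is how non-interacting objects are dropped in this sketch.
    Returns a dict mapping human index -> object index.
    """
    n_objects = len(scores[0]) if scores else 0
    owner = [None] * n_objects  # owner[o] = human currently matched to o

    def try_assign(h, seen):
        for o in range(n_objects):
            if scores[h][o] >= threshold and o not in seen:
                seen.add(o)
                # Take a free object, or re-route its current owner.
                if owner[o] is None or try_assign(owner[o], seen):
                    owner[o] = h
                    return True
        return False

    for h in range(len(scores)):
        try_assign(h, set())
    return {h: o for o, h in enumerate(owner) if h is not None}


if __name__ == "__main__":
    # Three hypothetical ensemble members vote on one human-object pair.
    print(majority_vote(["hold", "hold", "ride"]))
    # Two humans, two objects: human 1 only interacts with object 0.
    print(match_pairs([[0.9, 0.6], [0.8, 0.1]]))
```

In this sketch each human is matched to at most one object, which is a simplification; a method that lets one person interact with several objects would need a different matching formulation.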