Academic Paper

Mending of Spatio-Temporal Dependencies in Block Adjacency Matrix
Document Type
Working Paper
Source
Subject
Computer Science - Machine Learning
Electrical Engineering and Systems Science - Image and Video Processing
Language
Abstract
In applications where data evolve across both spatial and temporal dimensions, Graph Neural Networks (GNNs) are often paired with sequence-modeling architectures, such as RNNs and transformers, to capture temporal change. These hybrid models typically arrange the spatial and temporal learning components in series. A pioneering effort to model spatio-temporal dependencies jointly using only GNNs was the Block Adjacency Matrix \(\mathbf{A_B}\) \cite{1}, constructed by diagonally concatenating the adjacency matrices of graphs at different time steps. The result is a single graph that contains the complete spatio-temporal data; however, the graphs from different time steps remain disconnected, which limits GNN message passing to spatially connected nodes. To address this limitation, we propose a novel end-to-end learning architecture designed to mend the temporal dependencies, yielding a well-connected graph. This provides a framework for the learnable representation of spatio-temporal data as graphs. Our method surpasses existing state-of-the-art graph models in accuracy on benchmark datasets such as SurgVisDom and C2D2. It also has far fewer parameters than methods that rely on CLIP and 3D CNN architectures, and therefore significantly lower computational complexity.
Comment: Accepted at ICONIP 2024
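The disconnection problem described in the abstract can be illustrated with a small sketch. The code below builds a block adjacency matrix by placing per-time-step adjacency matrices on the diagonal, then counts connected components to show that each time step forms its own island. The toy path graphs, the function names, and in particular `add_identity_temporal_links` are assumptions made for illustration only: the paper learns the temporal connections end to end, whereas this sketch hard-codes a link between each node and its copy at the next time step as a stand-in.

```python
import numpy as np


def block_adjacency(adjs):
    """Place per-time-step adjacency matrices on the diagonal of one matrix."""
    sizes = [a.shape[0] for a in adjs]
    total = sum(sizes)
    out = np.zeros((total, total), dtype=adjs[0].dtype)
    offset = 0
    for a, n in zip(adjs, sizes):
        out[offset:offset + n, offset:offset + n] = a
        offset += n
    return out


def count_components(adj):
    """Count connected components of an undirected graph via depth-first search."""
    n = adj.shape[0]
    seen = np.zeros(n, dtype=bool)
    components = 0
    for start in range(n):
        if seen[start]:
            continue
        components += 1
        seen[start] = True
        stack = [start]
        while stack:
            node = stack.pop()
            for nb in np.flatnonzero(adj[node]):
                if not seen[nb]:
                    seen[nb] = True
                    stack.append(nb)
    return components


def add_identity_temporal_links(a_b, n, steps):
    """Hypothetical stand-in for learned mending (NOT the paper's method):
    link node i at step t to node i at step t + 1, symmetrically."""
    mended = a_b.copy()
    eye = np.eye(n, dtype=a_b.dtype)
    for t in range(steps - 1):
        r, c = t * n, (t + 1) * n
        mended[r:r + n, c:c + n] = eye
        mended[c:c + n, r:r + n] = eye
    return mended


# Toy data: three time steps, each a 3-node path graph.
path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]])
a_b = block_adjacency([path, path, path])
mended = add_identity_temporal_links(a_b, n=3, steps=3)

print(count_components(a_b))     # one component per time step
print(count_components(mended))  # a single connected graph
```

With the off-diagonal blocks of \(\mathbf{A_B}\) all zero, message passing can never cross a time-step boundary; filling those blocks, by whatever rule, is what allows information to flow through time.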