Academic Paper

Sparse Graph Transformer With Contrastive Learning
Document Type
Periodical
Source
IEEE Transactions on Computational Social Systems (IEEE Trans. Comput. Soc. Syst.), 11(1):892-904, Feb. 2024
Subject
Computing and Processing
Communication, Networking and Broadcast Technologies
General Topics for Engineers
Transformers
Representation learning
Task analysis
Topology
Natural language processing
Training
Optimization
Graph pruning
graph representation learning
graph Transformer
sparse attention
unsupervised learning
Language
ISSN
2329-924X
2373-7476
Abstract
Information aggregation and propagation over networks via graph neural networks (GNNs) play an important role in node or graph representation learning. However, current GNNs depend on computation with a fixed adjacency matrix, suffer from the over-smoothing problem, and are difficult to stack into multiple layers for high-level representations. In contrast, the Transformer calculates an importance score for each node to learn its embedding via the attention mechanism and has achieved great success in many natural language processing (NLP) and computer vision (CV) tasks. However, the Transformer is difficult to extend to graphs, as its input and output must have the same dimension. Allocating attention over a large-scale graph also becomes intractable due to distractions. Moreover, most graph Transformers are trained in a supervised manner, which consumes additional resources to annotate samples with potentially wrong labels and limits the generalization of the learned representations. Therefore, this article attempts to build a new Sparse Graph Transformer with Contrastive learning for graph representation learning, called SGTC. Specifically, we first employ centrality measures to remove redundant topological information from the input graph according to the influence of nodes and edges, then perturb the pruned graph to obtain two different augmented views, and learn node representations in a contrastive manner. In addition, a novel sparse attention mechanism is proposed to capture the structural features of graphs, which effectively saves memory and training time. SGTC produces low-dimensional, high-order node representations that generalize better across multiple tasks. The proposed model is evaluated on three downstream tasks over six networks, and the experimental results confirm its superior performance against state-of-the-art baselines.
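The pipeline outlined in the abstract (centrality-based pruning, two perturbed views, sparse attention, contrastive objective) can be illustrated with a minimal sketch. This is not the authors' implementation. Everything below is an assumption made for illustration: degree centrality as the centrality measure, the sum of endpoint centralities as the edge score, random edge dropping as the perturbation, neighbour-restricted attention without learned projections as the sparse attention, and an InfoNCE-style loss as the contrastive objective. All function names are hypothetical.

```python
import math
import random

def degree_centrality(n, edges):
    """Degree of each node divided by (n - 1); assumes n >= 2."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return [d / (n - 1) for d in deg]

def prune_edges(n, edges, keep_ratio):
    """Keep the top-scoring edges. Assumed score: sum of endpoint centralities."""
    c = degree_centrality(n, edges)
    ranked = sorted(edges, key=lambda e: c[e[0]] + c[e[1]], reverse=True)
    k = max(1, round(keep_ratio * len(edges)))
    return ranked[:k]

def drop_edges(edges, p, rng):
    """Assumed augmentation: drop each edge independently with probability p."""
    return [e for e in edges if rng.random() >= p]

def sparse_attention(x, edges):
    """Each node attends only to itself and its neighbours (no learned weights)."""
    n, d = len(x), len(x[0])
    nbrs = [{i} for i in range(n)]
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    out = []
    for i in range(n):
        idx = sorted(nbrs[i])
        # Scaled dot-product scores over the neighbourhood only.
        scores = [sum(a * b for a, b in zip(x[i], x[j])) / math.sqrt(d) for j in idx]
        m = max(scores)
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        out.append([sum(w[k] / z * x[j][t] for k, j in enumerate(idx)) for t in range(d)])
    return out

def info_nce(z1, z2, tau=0.5):
    """Node i in view 1 is positive with node i in view 2; other nodes are negatives."""
    def cos(a, b):
        na = math.sqrt(sum(v * v for v in a))
        nb = math.sqrt(sum(v * v for v in b))
        return sum(p * q for p, q in zip(a, b)) / (na * nb + 1e-12)
    loss = 0.0
    for i, a in enumerate(z1):
        logits = [cos(a, b) / tau for b in z2]
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_z - logits[i]
    return loss / len(z1)

# Toy usage on a made-up 6-node graph with random 4-dimensional features.
rng = random.Random(0)
n = 6
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4), (4, 5)]
x = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(n)]
pruned = prune_edges(n, edges, keep_ratio=0.67)
view1 = drop_edges(pruned, 0.2, rng)
view2 = drop_edges(pruned, 0.2, rng)
loss = info_nce(sparse_attention(x, view1), sparse_attention(x, view2))
```

In this sketch the attention cost scales with the number of retained edges rather than with the square of the node count, which is the general reason a sparse attention pattern can save memory and time. A trainable version would add learned projections and minimize the loss by gradient descent.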