Academic Journal

INT-Label: Lightweight In-Band Network-Wide Telemetry via Distributed Labeling
Document Type
Periodical
Source
IEEE Transactions on Parallel and Distributed Systems (IEEE Trans. Parallel Distrib. Syst.). 35(5):751-767, May 2024
Subject
Computing and Processing
Communication, Networking and Broadcast Technologies
Labeling
Telemetry
Probes
Network topology
Topology
Monitoring
Bandwidth
Adaptive labeling
distributed labeling
in-band network telemetry
network-wide coverage
P4
Language
ISSN
1045-9219
1558-2183
2161-9883
Abstract
In-band Network Telemetry (INT) enables hop-by-hop device-internal state exposure for maintaining and troubleshooting data center networks. To achieve network-wide telemetry coverage, orchestration on top of the INT primitive is required. A straightforward solution would flood the network with INT probe packets for maximum measurement coverage, which leads to a huge bandwidth overhead. A refined solution leverages the SDN controller to collect the network topology information and carry out centralized probing path planning, which, however, is inefficient in reacting to topology changes. To tackle the above problems, we propose INT-label, a Lightweight In-band Network-Wide Telemetry architecture via the Distributed Labeling approach. INT-label periodically labels the sampled packets with device-internal states. It is cost-effective with a minor bandwidth overhead and able to seamlessly adapt to topology changes. To reduce the number of labeled packets, we introduce a times-based probabilistic labeling algorithm, which allows fewer packets to carry more INT information than the interval-based algorithm. In addition, to counteract the degradation of telemetry resolution due to loss of labeled packets, we design a feedback mechanism that can adaptively change the instant labeling frequency. We provide theoretical proof that INT-label can achieve network-wide telemetry. We analyze the impact of transmission delay on coverage rate and labeling times distribution under the INT-label architecture. Evaluation on software P4 switches suggests that INT-label can achieve 99.72% measurement coverage under the labeling frequency of 20 times per second. With the adaptive labeling enabled, even if 60% of the packets are lost, the coverage can still reach 92%.
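
The interval-based labeling that the abstract mentions as a baseline can be illustrated with a minimal sketch. This is not the authors' implementation or their times-based probabilistic algorithm. It assumes a switch labels a packet on a given port only if at least 1/f seconds have passed since the last labeled packet on that port, where f is the labeling frequency. The names `IntervalLabeler`, `should_label`, and `simulate` are hypothetical, and time is kept in integer microseconds to avoid floating-point drift.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class IntervalLabeler:
    """Hypothetical per-switch labeler (illustrative only, not from the paper)."""
    frequency_hz: int = 20  # assumed labeling frequency, labels per second per port
    last_label_us: Dict[int, int] = field(default_factory=dict)

    def should_label(self, port: int, now_us: int) -> bool:
        """Return True if the packet leaving `port` at `now_us` should carry a label."""
        interval_us = 1_000_000 // self.frequency_hz
        last = self.last_label_us.get(port)
        if last is None or now_us - last >= interval_us:
            # Enough time has passed: label this packet and remember when.
            self.last_label_us[port] = now_us
            return True
        return False


def simulate(labeler: IntervalLabeler, packet_rate_hz: int,
             duration_s: int, port: int = 0) -> Tuple[int, int]:
    """Send evenly spaced packets through one port; return (labeled, total)."""
    gap_us = 1_000_000 // packet_rate_hz
    total = packet_rate_hz * duration_s
    labeled = sum(labeler.should_label(port, i * gap_us) for i in range(total))
    return labeled, total


if __name__ == "__main__":
    labeled, total = simulate(IntervalLabeler(frequency_hz=20),
                              packet_rate_hz=1000, duration_s=1)
    print(f"labeled {labeled} of {total} packets")
```

Under these assumptions, the number of labeled packets per port is bounded by the labeling frequency rather than by the traffic rate, which is the intuition behind the "minor bandwidth overhead" claim.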