Advanced Queue Measurement Techniques Workshop Insights

Explore insights from a workshop on advanced queue measurement techniques, covering topics such as fine-grained queue measurement, high drops with low link utilization, legacy switch challenges, offline analysis methods, high-delay event types, unstable queue investigations, and more.

  • Queue Measurement
  • Workshop Insights
  • Networking Technology
  • Traffic Analysis
  • Buffer Sizing

Presentation Transcript


  1. Fine-grained queue measurement via tapping: an experience report
     Xiaoqi Chen, Department of Computer Science
     Hyojoon Kim, Office of Information Technology
     Princeton University
     Workshop on Buffer Sizing, 12/3/2019

  2. High drops, low link utilization
     • Princeton campus network: 100G peering with Internet2/ESNet
     • Problematic queuing at one border switch:
       • Upstream: 100G to Internet2
       • Downstream: 8x10G to 8 servers
     • High drop count on a particular port
     • Link utilization always <20% at this port
     • What happened?

  3. Legacy switch: lack of visibility
     • Current tools (e.g. SNMP) provide only 1-minute granularity
     • In general, legacy switches provide 1-second granularity at best, by polling the high watermark (max queue length) counter
     • [Figure: queue length (1x to 5x) over time of day (24h)]
     • We know bursts exist, but can't analyze the root cause
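
As a rough illustration of why coarse counters can hide bursts, the sketch below uses a simple fluid model of one short burst arriving at 100G line rate for a single 10G port. The burst length, buffer size, and repetition period are assumptions chosen for the example, not figures from the talk, and `burst_outcome` is a hypothetical helper.

```python
def burst_outcome(burst_rate, burst_len, drain_rate, buffer_bits, period):
    """Fluid model of one burst per period; rates in bits/us, times in us."""
    arrived = burst_rate * burst_len
    drained = drain_rate * burst_len          # sent while the burst lasts
    backlog = max(0, arrived - drained)       # bits that need buffering
    dropped = max(0, backlog - buffer_bits)   # overflow beyond the buffer
    delivered = arrived - dropped
    drop_fraction = dropped / arrived
    utilization = delivered / (drain_rate * period)
    return drop_fraction, utilization


# Assumed numbers (not from the talk): a 100 Gb/s burst lasting 1 ms into a
# 10 Gb/s port with a 1 MB buffer, repeated once per second.
drop_fraction, utilization = burst_outcome(
    burst_rate=100_000,     # 100 Gb/s expressed in bits per microsecond
    burst_len=1_000,        # 1 ms
    drain_rate=10_000,      # 10 Gb/s expressed in bits per microsecond
    buffer_bits=8_000_000,  # 1 MB
    period=1_000_000,       # 1 s
)
print(f"dropped {drop_fraction:.0%} of the burst, utilization {utilization:.2%}")
```

Under these assumed numbers, 82% of the burst is dropped while the port's utilization averaged over one second is 0.18%, so a per-second or per-minute counter would show an almost idle link.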

  4. Offline analysis
     • Can we tap traffic to analyze queuing, even if links today are 100G?
     • Yes! Use a smart NIC or P4 switch:
       • Tap both ingress and egress ports
       • Calculate an invariant signature* to match a packet's two appearances
       • Get the queuing delay
       • When the delay is high, analyze and report
     • Example: packet P, flow ID F, is enqueued at t=2 and dequeued at t=8; delay 8-2=6
     • [Figure: packet P tapped at the ingress and egress ports]
     * See our CoNEXT '19 ConQuest paper for more detail.
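
The matching step on this slide can be sketched offline as follows. This is a minimal illustration, not the authors' implementation: it assumes each tap yields `(timestamp, invariant_bytes)` records, and it uses a SHA-1 hash over those bytes as the signature. Which packet fields count as invariant is not stated in the slides, and `signature` and `queuing_delays` are hypothetical names.

```python
import hashlib


def signature(invariant_bytes: bytes) -> str:
    """Hash the packet bytes assumed not to change across the switch."""
    return hashlib.sha1(invariant_bytes).hexdigest()


def queuing_delays(ingress, egress):
    """Match tapped packets by signature; return (signature, delay) pairs.

    ingress, egress: iterables of (timestamp, invariant_bytes) records.
    """
    enqueue_time = {}
    for ts, data in ingress:
        # Keep the first sighting if two packets share a signature.
        enqueue_time.setdefault(signature(data), ts)
    delays = []
    for ts, data in egress:
        sig = signature(data)
        if sig in enqueue_time:
            delays.append((sig, ts - enqueue_time.pop(sig)))
    return delays


# Packet P of flow F is seen at t=2 on ingress and t=8 on egress.
ingress_tap = [(2, b"flow F / packet P"), (3, b"flow G / packet Q")]
egress_tap = [(4, b"flow G / packet Q"), (8, b"flow F / packet P")]
delays = queuing_delays(ingress_tap, egress_tap)
print([d for _, d in delays])  # [1, 6]
```

The second value reproduces the slide's example (8-2=6). In this sketch, ingress records that never find a match are simply left over; treating them as drops would be a further assumption.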

  5. Result: three types of high-delay events
     1. Steady state: one large flow; queue length stable
     2. Outlier packets: bug, or noise? Slow path (control plane)?
     3. Unstable! Queue oscillates wildly; massive drops
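
One way to separate these three event types in a window of measured delays is sketched below. The slides do not say how the events were told apart, so the heuristic and every threshold here (`high`, `outlier_fraction`, `stable_cv`) are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def classify_window(delays, high=5.0, outlier_fraction=0.05, stable_cv=0.2):
    """Label one window of per-packet queuing delays (hypothetical heuristic)."""
    high_samples = [d for d in delays if d >= high]
    if not high_samples:
        return "normal"
    # Only a few packets are slow: treat them as isolated outliers.
    if len(high_samples) / len(delays) <= outlier_fraction:
        return "outlier packets"
    # Many packets are slow: a low coefficient of variation means the
    # queue sits at a stable length, a high one means it oscillates.
    cv = pstdev(delays) / mean(delays)
    return "steady state" if cv <= stable_cv else "unstable"


print(classify_window([6.0, 6.1, 5.9, 6.0]))
print(classify_window([0.1] * 99 + [9.0]))
print(classify_window([0.5, 9.0, 0.2, 8.5, 0.1, 9.5]))
```

With these assumed thresholds, the three sample windows are labeled steady state, outlier packets, and unstable, in that order.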

  6. Investigation: when queue is unstable
     • Source hosts are perfSonar nodes: an active measurement tool!
     • Throughput tests? Latency tests?
     • Ongoing investigation

  7. Summary, Q&A
     • Scrutinizing queues in production networks is possible, even if the legacy device does not support microsecond granularity
     • Tapping can be used to assist many buffer sizing experiments
     • Interim solution before all switches support fine-grained queue measurement

  8. Backup: Queue size CDF while unstable
