Enhancing Network Performance Monitoring with Co-Design Software and Hardware

Explore how co-designing software and hardware can improve performance measurement in networks, addressing problems such as high tail latencies and inadequate measurement support by exploiting the precise visibility into network conditions that switches provide. Discover better tools for monitoring network performance.

  • Network performance
  • Co-design
  • Software
  • Hardware
  • Performance measurement




Presentation Transcript


  1. Co-Designing Software and Hardware for Declarative Performance Measurement. Srinivas Narayana, MIT CSAIL, October 7, 2016

  2. An example: High tail latencies delay the completion of flows (and applications).

  3. An example: High tail latencies. Where is the queue buildup? How did queues build up? UDP on-off traffic? Fan-in? Which other flows cause queue buildup? Throttle UDP? Change the traffic pattern?

  4. An example: High tail latencies. What measurement support do you need? Where is the queue buildup? How did queues build up? UDP on-off traffic? Fan-in? Which other flows cause queue buildup? Throttle UDP? Change the traffic pattern?

  5. Existing measurement support. Sampling (NetFlow, sFlow): may not sample the packets/events you care about. Counting (OpenSketch, UnivMon, ...): only counts traffic, and the time granularity is too coarse. Packet capture (Endace, Niksun, Fmadio, ...): too much data to collect everywhere and always. Endpoint data collection (Pingmesh, ...): data is distributed over several hosts, with insufficient network visibility.

  6. Network performance questions: flow-level packet drop rates; queue latency EWMA per connection; route flapping incidents; persistently long queues; TCP incast and outcast; interference from bursty traffic; incidence of TCP reordering and retransmissions; understanding congestion control schemes; incidence and lengths of flowlets; high end-to-end latencies; ...

  7. Can we build better performance monitoring tools for networks?

  8. An example: High tail latencies Switches have precise visibility into network conditions

  9. Performance monitoring on switches: (+) Precise visibility of performance (e.g., queue buildup, flows, ...) (+) Speed (line rate across 10-100 ports at 10-100 Gb/s per port) (-) Costly and time-consuming to build hardware (2-3 years). Ideally, new hardware should suit diverse measurement needs.

  10. Co-design software and hardware: (1) design a declarative query language to ask performance questions; (2) design hardware primitives to support the query language at line rate.

  11. Performance query system: network operators and diagnostic apps submit declarative performance queries and receive accurate query results with low overhead.

  12. (1) Declarative language abstraction. Write SQL-like performance queries on an abstract table of (packet headers, queue id, queue size, time at queue ingress, time at queue egress, packet path info), with one row for every packet at every queue.
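
As a rough illustration of what one row of this abstract table carries, here is a minimal Python sketch; the class and field names are illustrative only, not the system's actual schema.

    from dataclasses import dataclass
    from typing import Tuple

    @dataclass
    class PacketRecord:
        headers: dict            # parsed packet headers (e.g., the 5-tuple fields)
        qid: int                 # id of the queue the packet traversed
        qsize: int               # queue size observed by the packet
        tin: float               # time at queue ingress
        tout: float              # time at queue egress
        path: Tuple[int, ...]    # packet path info (e.g., switch ids along the path)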

  13. (1) Example performance queries. Queues and traffic sources with high latency: SELECT srcip, qid FROM T WHERE tout - tin > 1ms. Traffic counters per source and destination: SELECT srcip, dstip, count, sum_len GROUPBY srcip, dstip. User-defined fold functions, e.g., queue latency EWMA: SELECT 5tuple, qid, ewma GROUPBY 5tuple, qid, with def ewma(lat_est, (tin, tout)): lat_est = alpha * lat_est + (1 - alpha) * (tout - tin). Packet ordering matters (recent queue sizes more significant).
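
The ewma fold above is written in the query language's Python-like syntax; as a sanity check of the same computation, a runnable Python 3 version (with an assumed smoothing constant alpha) might look like this:

    ALPHA = 0.9  # assumed smoothing constant; not specified in the slides

    def ewma(lat_est, tin, tout):
        # Fold one packet's queueing latency (tout - tin) into the running estimate.
        return ALPHA * lat_est + (1 - ALPHA) * (tout - tin)

    # Packet ordering matters, so the fold is applied in arrival order:
    est = 0.0
    for tin, tout in [(0.0, 0.3), (1.0, 1.1), (2.0, 2.9)]:
        est = ewma(est, tin, tout)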

  14. (2) Hardware primitives. Good news: many existing primitives prove useful! Selection: match-action rules. Per-packet latency and queue data: in-band network telemetry (INT).

  15. (2) Hardware support for aggregation. SELECT 5tuple, ewma GROUPBY 5tuple maintains one EWMA value per 5-tuple. Run a key-value store on switches? A K-V store supports read-modify-write operations.
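
As a sketch of the read-modify-write pattern such a key-value store applies per packet (names like kv and update_fn are illustrative, not a switch API):

    kv = {}  # key-value state, e.g., 5-tuple -> EWMA latency estimate

    def read_modify_write(key, pkt, update_fn, default=0.0):
        value = kv.get(key, default)    # read the current value (or the default)
        value = update_fn(value, pkt)   # modify it with this packet's fields
        kv[key] = value                 # write it back
        return value

    # Example: per-5-tuple EWMA of queueing latency, with pkt = (tin, tout)
    flow = ("10.0.0.1", "10.0.0.2", 6, 1234, 80)
    read_modify_write(flow, (0.0, 0.4),
                      lambda est, p: 0.9 * est + 0.1 * (p[1] - p[0]))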

  16. (2) Challenges in building a switch K-V store: run at line rate (a ~1 GHz packet rate) and scale to millions of keys (e.g., the number of connections). No existing memory is both fast and large enough!

  17. (2) Split key-value store: an on-chip cache (SRAM) holds key-value pairs and applies the update/initialize operation per packet; when a key is evicted, its value is merged into the off-chip backing store (DRAM).
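
A toy software model of this split store, assuming a fixed cache capacity, an arbitrary eviction victim, and caller-supplied update and merge functions (none of which are dictated by the slides):

    class SplitKV:
        def __init__(self, capacity, default, update_fn, merge_fn):
            self.cache = {}          # on-chip cache (SRAM)
            self.backing = {}        # off-chip backing store (DRAM)
            self.capacity = capacity
            self.default = default
            self.update = update_fn  # per-packet update / initialize
            self.merge = merge_fn    # applied when a key is evicted

        def process(self, key, pkt):
            if key not in self.cache and len(self.cache) >= self.capacity:
                # Evict some key; real hardware would pick a victim slot, e.g., by hashing.
                victim, val = self.cache.popitem()
                old = self.backing.get(victim, self.default)
                self.backing[victim] = self.merge(old, val)
            value = self.cache.get(key, self.default)
            self.cache[key] = self.update(value, pkt)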

  18. Need more information? See our upcoming HotNets '16 paper!

  19. Why performance monitoring? Determine network performance bottlenecks; exonerate the network as the source of problems; and, among new proposals for scheduling and congestion control, figure out what works.

  20. Semantically useful language primitives: per-packet performance attributes (latency, loss); isolating traffic with interesting performance; aggregating performance stats over sets of packets; detecting simultaneous occurrences of multiple conditions; composing results to pose more complex questions.

  21. The SRAM cache

  22. Does caching lead to correct results? In general, no: the cache forgets keys and their values when they are evicted, and previously evicted keys can come back! Merging the SRAM and DRAM values may eventually produce correct results.

  23. Merging example: SELECT COUNT, SUM(pkt_len) GROUPBY srcip, dstip. When a (srcip, dstip) key is evicted from the on-chip cache (SRAM), its value is merged into the off-chip backing store (DRAM); here the true COUNT = COUNT_SRAM + COUNT_DRAM and the true SUM(pkt_len) = SUM_SRAM + SUM_DRAM.
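
For this counter query the merge is just addition, so the correct result can always be recovered; a small illustrative check with made-up values:

    def merge_counters(dram, sram):
        # dram and sram are (count, sum_pkt_len) pairs for one (srcip, dstip) key.
        return (dram[0] + sram[0], dram[1] + sram[1])

    dram = (3, 4500)   # portion already evicted to the backing store
    sram = (2, 1200)   # portion accumulated in the cache since that eviction
    assert merge_counters(dram, sram) == (5, 5700)  # true (COUNT, SUM(pkt_len))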

  24. Linear-in-state condition: state updates of the form S = A * S + B, where A and B are functions of a bounded history of packets.
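
For instance, the queue latency EWMA fits this form with A = alpha (a constant) and B = (1 - alpha) * (tout - tin), which depends only on the current packet; a packet counter fits it with A = 1 and B = 1. A small sketch (alpha is an assumed constant):

    ALPHA = 0.9  # assumed smoothing constant

    def ewma_linear_in_state(S, tin, tout):
        A = ALPHA                        # constant
        B = (1 - ALPHA) * (tout - tin)   # depends only on the current packet
        return A * S + B

    def count_linear_in_state(S, pkt):
        return 1 * S + 1                 # A = 1, B = 1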

  25. Linear-in-state update: correct_DRAM = final_SRAM - A^n * default_SRAM + A^n * previous_DRAM. [Figure: the cached state starts at default_SRAM, reaches final_SRAM after n updates, and is merged with previous_DRAM to obtain correct_DRAM.]
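
A sketch of that merge, plus a quick consistency check using the EWMA update; the per-key update count n would have to be tracked in the cache, and that bookkeeping is assumed here:

    def merge_linear_in_state(previous_dram, final_sram, default_sram, A, n):
        # Replay the A^n factor onto the backing-store value and remove the
        # contribution of the default value the cache started from.
        return final_sram + (A ** n) * (previous_dram - default_sram)

    # Check: evaluating all updates in one place equals cache-then-merge.
    A, default = 0.9, 0.0
    latencies = [0.3, 0.1, 0.9, 0.4]

    true_val = default
    for x in latencies:
        true_val = A * true_val + (1 - A) * x

    previous_dram = default
    for x in latencies[:2]:                      # updates already merged into DRAM
        previous_dram = A * previous_dram + (1 - A) * x
    final_sram = default
    for x in latencies[2:]:                      # updates seen after re-insertion
        final_sram = A * final_sram + (1 - A) * x

    merged = merge_linear_in_state(previous_dram, final_sram, default, A, n=2)
    assert abs(merged - true_val) < 1e-9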

  26. Current work: compiler. Detecting the linear-in-state condition; queries that use multiple key-value stores; network-wide queries; nested queries; an equivalent of linear-in-state for multiple tables?

  27. How YOU can help: What are useful performance diagnostics questions? How would you evaluate this system? What's a reasonable prototype?
