
Query-Driven Streaming Network Telemetry with Flexible Telemetry for Management
Explore the innovative Sonata project led by Jennifer Rexford and team, focusing on query-driven streaming network telemetry. Learn about the spark-like query language, query-driven collection and analysis, and compiling individual operators to enhance network management and security. Discover the power of familiar query language, dataflow operators, and fine-grained tuples for efficient network monitoring and analysis.
Download Presentation

Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.
E N D
Presentation Transcript
Sonata: Query-Driven Streaming Network Telemetry Jennifer Rexford With Arpit Gupta, Rob Harrison, Nick Feamster, Marco Canini (KAUST), Walter Willinger (Niksun)
Flexible Telemetry for Network Management E.g., DNS reflection attacks Many analysis questions Performance diagnosis Cyber attack detection Traffic engineering Familiar query language Packet as a tuple Dataflow operators E.g., Spark-like queries DNS Src: DNS Dst: Victim DNS Src: Victim Dst: DNS Src: DNS Dst: Victim Src: Victim Dst: DNS Attacker Victim 1
Spark-Like Query Language 1 2 3 4 5 6 7 victimIPs = pktStream .filter(p => p.udp.sport == 53) .map(p => (p.dstIP, p.srcIP)) .distinct() .map((dstIP, srcIP) => (dstIP, 1)) .reduce(keys=(dstIP,), sum) .filter((dstIP, count) => count > Th) 2
Query-Driven Collection and Analysis Query partitioning Scalability challenges Many packets and flows Many queries Many switches Query-driven collection Integrate collection and analysis Customize collection to the query Fine-grained with low overhead tuples 3
Compiling Individual Operators (p) filter(p) 1 2 3 4 5 6 7 pvictimIPs = pktStream .filter(p => p.udp.sport == 53) .map(p => (p.dstIP, p.srcIP)) .distinct() .map((dstIP, srcIP) => (dstIP, 1)) .reduce(keys=(dstIP,), sum) .filter((dstIP, count) => count > Th) p udp.sport == 53 4
Compiling Individual Operators (p) filter(p) 1 2 3 4 5 6 7 pvictimIPs = pktStream .filter(p => p.udp.sport == 53) .map(p => (p.dstIP, p.srcIP)) .distinct() .map((dstIP, srcIP) => (dstIP, 1)) .reduce(keys=(dstIP,), sum) .filter((dstIP, count) => count > Th) control ingress { apply(filter_map_table); ... apply(reduce_table); if (m.reduce_table_sum > Th) { apply(report_table) } 5
Compiling Individual Operators f map(f) 1 2 3 4 5 6 7 pvictimIPs = pktStream .filter(p => p.udp.sport == 53) .map(p => (p.dstIP, p.srcIP)) .distinct() .map((dstIP, srcIP) => (dstIP, 1)) .reduce(keys=(dstIP,), sum) .filter((dstIP, count) => count > Th) f m.dstIP = p.dstIP m.srcIP = p.srcIP * 6
Compiling Individual Operators f reduce(f) 1 2 3 4 5 6 7 pvictimIPs = pktStream .filter(p => p.udp.sport == 53) .map(p => (p.dstIP, p.srcIP)) .distinct() .map((dstIP, srcIP) => (dstIP, 1)) .reduce(keys=(dstIP,), sum) .filter((dstIP, count) => count > Th) * idx = hash(m.dstIP) * stateful[idx] += 1 7
Partitioning Decisions pvictimIPs = pktStream .filter(p => p.udp.sport == 53) .map(p => (p.dstIP, p.srcIP)) .distinct() .map((dstIP, srcIP) => (dstIP, 1)) .reduce(keys=(dstIP,), sum) .filter((dstIP, count) => count > Th) .filter((dstIP, count) => count > Th) .filter((dstIP, count) => count > Th) .filter((dstIP, count) => count > Th) 1 2 3 4 5 6 7 pvictimIPs = pktStream .filter(p => p.udp.sport == 53) .map(p => (p.dstIP, p.srcIP)) .distinct() .map((dstIP, srcIP) => (dstIP, 1)) .reduce(keys=(dstIP,), sum) .map((dstIP, srcIP) => (dstIP, 1)) .reduce(keys=(dstIP,), sum) 6 7 .reduce(keys=(dstIP,), sum) 6 7 1 2 3 4 5 6 7 pvictimIPs = pktStream .filter(p => p.udp.sport == 53) .map(p => (p.dstIP, p.srcIP)) .distinct() .map(p => (p.dstIP, p.srcIP)) .distinct() .map((dstIP, srcIP) => (dstIP, 1)) 5 1 2 3 4 5 pvictimIPs = pktStream .filter(p => p.udp.sport == 53) 1 2 3 4 10
Training Data Workload dependence Operators consume common switch resources Determine which operators reduce the tuples the most Detect and adapt to traffic dynamics Sonata uses training data to size hash tables Report collisionsto detect deviation! Iterative refinement Queries that drill down through the data over time 12
Experimental Results Wide range of queries Performance, security, traffic engineering Realistic traffic CAIDA packet traces Practical data-plane model 16 stages, 8 stateful operations, 8 Mb stateful memory Significant scalability benefits Orders of magnitude fewer tuples sent to stream processor 13
Conclusions Flexible and scalable telemetry Query-driven collection and analysis Query partitioning between Spark and P4 Query planning as an integer linear program Ongoing work Network-wide queries (SOSR 18 heavy-hitters paper) Wider range of data-plane targets More sophisticated data structures Open-source software at sonata.cs.princeton.edu, paper to appear at SIGCOMM 18 14