Accelerating NF Processing Through Task Offloading in Networking Stacks

Explore the advancements in accelerating NF processing through task offloading in networking stacks. Learn about the challenges, benefits of programmability, and the evolution of network stack solutions.

  • NF Processing
  • Task Offloading
  • Networking Stacks
  • Programmability
  • Network Functions

Presentation Transcript


  1. Application Agnostic Offloading of Datagram Processing. Helge Reelfs, ITC 2018, Vienna, AT, September 2018. http://comsys.rwth-aachen.de/

  2. Shift towards programmability
     • Current example: SDN & NFV
     • Programmability has advantages: hope for realizing faster & more flexible networks; simpler implementations by modularization; separated control
     • Flexibility has a price: how to make this fast? (NFV = run in VM)
     • Goal: accelerate NF processing by task offloading
     • Figure labels: network function, frequent subtask, offload for acceleration, rule processor
     Jens Helge Reelfs

  3. Networking Stacks: Where are we?
     • Increasing line rates challenge packet processing: CPU speeds do not scale with increasing line rates; performance problems at high line rates (e.g. >>10 Gbps)
     • Main overhead factors: memory allocations and copy operations; system calls and context switches
     • Does it matter after all?
     • Figure: latency breakdown in ns (axis 0 to 5000) for transmission and reception across Socket API, User-Kernel Copy, Protocol Stack, Driver, NIC
     • Source: Larsen et al., Architectural Breakdown of End-to-End Latency in a TCP/IP network, J Parallel Prog, 37:6, (2009)

  4. Current Network Stack & Current Answers (simplified)
     • Figure: three variants side by side (Classical Network Stack, Kernel-Software, Bypassing Kernel with Netmap/DPDK), each drawn as APP, TRANS, NET, and MAC layers across the user, kernel, and HW domains
     • Other HW offload is not shown

  5. But where is the problem?
     • Stock Kernel: too generic (on purpose), therefore slow; other new APIs had limited success and are very specific
     • Kernel-Applications: possible security threat, updatability
     • Kernel Bypass: enslaving the NIC to single applications, tailored stack implementations
     • Other Offloading: highly tailored to application and hardware
     • Isn't there anything in between?

  6. Classical Network Stack (simplified)
     • Figure: the request ("A record for itc30.org?") travels from the hardware (MAC) through the kernel (NET, TRANS) to the user-space DNS server; the result (132.187.12.42) travels back the same way
     • A context switch (CTX SW) and a copy occur at both the hardware/kernel and the kernel/user boundary
     • Annotation: "remove it!"

  7. Application Agnostic Packet Processor (AAPP)
     • Figure: the user-space DNS server (marked "ZZZ") acts as control plane; the AAPP sits in the kernel next to TRANS and NET and answers the request ("A record for itc30.org?") with the result (132.187.12.42)
     • A context switch (CTX SW) and a copy remain only at the hardware/kernel boundary

  8. What is Application-Agnostic?
     • Efficient byte-wise matching: offset + length
     • Simple construction of responses: copy data from incoming packet; copy data from template
     • Chaining of matching & construction rules

     Rule# | Match                                           | Reply
     1     | Offset: 0, Content: "Time is an illusion."      | Offset: 0, Content: "Lunchtime doubly so."
     2     | Offset: 12, Content: www.heise.de [A]           | Offset: 0, Copy: bytes 0-1; Offset: 2, Content: 193.99.144.85 [A]
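
The byte-wise matching and reply construction listed on slide 8 might be sketched as follows. This is a minimal user-space illustration, not the AAPP kernel implementation: the class names (`Match`, `Put`, `Copy`, `Rule`) and the functions `rule_hits`, `build_reply`, and `process` are hypothetical, and the domain name and address are written as plain ASCII placeholders rather than DNS wire format.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Match:
    offset: int      # where in the incoming packet to compare
    content: bytes   # expected bytes (length is implied by the content)


@dataclass
class Put:
    offset: int      # where in the reply to write
    content: bytes   # template bytes


@dataclass
class Copy:
    offset: int      # where in the reply to write
    start: int       # first byte of the request to copy
    end: int         # last byte of the request to copy (inclusive)


@dataclass
class Rule:
    matches: List[Match]
    actions: List[Union[Put, Copy]]


def rule_hits(rule: Rule, packet: bytes) -> bool:
    """A rule hits only if every chained match condition holds."""
    return all(
        packet[m.offset:m.offset + len(m.content)] == m.content
        for m in rule.matches
    )


def build_reply(rule: Rule, packet: bytes) -> bytes:
    """Assemble the reply from template bytes and bytes copied from the request."""
    reply = bytearray()
    for action in rule.actions:
        if isinstance(action, Put):
            data = action.content
        else:
            data = packet[action.start:action.end + 1]
        end = action.offset + len(data)
        if len(reply) < end:
            reply.extend(b"\x00" * (end - len(reply)))
        reply[action.offset:end] = data
    return bytes(reply)


def process(rules: List[Rule], packet: bytes) -> Optional[bytes]:
    """Return a reply for the first matching rule, or None to fall back to the normal stack."""
    for rule in rules:
        if rule_hits(rule, packet):
            return build_reply(rule, packet)
    return None


# The two rules from the table on slide 8 (ASCII placeholders, an assumption).
RULES = [
    Rule([Match(0, b"Time is an illusion.")], [Put(0, b"Lunchtime doubly so.")]),
    Rule([Match(12, b"www.heise.de")], [Copy(0, 0, 1), Put(2, b"193.99.144.85")]),
]
```

In rule 2, offset 12 corresponds to the end of the 12-byte DNS header, and bytes 0-1 are the DNS transaction ID, which is why they are copied from the request into the reply.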

  9. Evaluation: Testbed Setup
     • 4 clients (dnsperf)
     • Uniform shuffled workload across clients
     • Server with AAPP kernel
     • 10 Gbps links
     • Modified Bind DNS server

  10. Evaluation: DNS Server
     • UDP transport
     • Synthetic round-robin workload
     • Speedup of factor 4.9 to 5.5

  11. Evaluation: DNS Server
     • UDP transport
     • Real-world power-law workload (~ ISP trace)
     • Offload top-N zone entries
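
Slide 11 offloads the top-N zone entries under a power-law workload. As a rough illustration of why a small N can cover a large share of queries, the sketch below computes the fraction of queries that hit the N most popular names, assuming a Zipf popularity distribution. The Zipf model, the exponent, and the function name `top_n_coverage` are assumptions for illustration; the slide gives no distribution parameters.

```python
def top_n_coverage(n_offloaded: int, n_total: int, exponent: float = 1.0) -> float:
    """Fraction of queries answered by the offloaded top-N entries.

    Assumes the name at popularity rank r is queried with weight 1 / r**exponent
    (a Zipf model, chosen here only for illustration).
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    n_offloaded = max(0, min(n_offloaded, n_total))
    weights = [1.0 / (rank ** exponent) for rank in range(1, n_total + 1)]
    return sum(weights[:n_offloaded]) / sum(weights)
```

Calling `top_n_coverage(100, 100_000)` would estimate the share of queries served by the 100 most popular of 100,000 names under this assumed model; a real trace may behave differently.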

  12. Beyond the paper. http://comsys.rwth-aachen.de/

  13. Evaluation: static HTTP Server
     • TCP transport
     • Full throttle "GET /" from all clients
     • Speedup of factor up to 2.6

  14. Going deeper! eBPF to the rescue
     • Successor of classical Berkeley Packet Filter
     • eBPF = in-kernel sandboxed virtual machine
     • Running @ Linux TC subsystem
     • Figure: layers APP (user); TRANS, IP, TC (kernel); Driver/XDP; (Smart)NIC (HW), with an arrow labeled "Speedup"

  15. Conclusions
     • Offloading processing has advantages: faster
     • This comes at a cost: matching & response language is restricted; TCP support is limited; encryption is not supported (may work with in-kernel TLS; no user-space encryption, e.g., QUIC)
     • Shift to eBPF enables safe arbitrary offloading
     • Lower layers lack functionality
     • eBPF as a handy tool for network offloading!?
     • Only UDP

  16. Thanks. Helge Reelfs, ITC 2018, Vienna, AT, September 2018. http://comsys.rwth-aachen.de/

  17. Offloading to other Host

  18. DNS Example

  19. AAPP API

  20. SmartNIC Offloading (Droptest)

  21. SmartNIC Offloading (Complex program test)

  22. Related Work
     • Previously: offload packet processing to hardware
     • Current: bypass the kernel, user-land stacks
     • Netmap
     • Mux/Demux flows on specialized NICs! [INFOCOM 01]
     • Bypass the Kernel! [e.g., USENIX ATC 12]
     • Frankenstack Per App Sockets [Submission]
     • Offload computation partly to GPUs! [SIGCOMM 10]
     • Use per-app stacks! [e.g., CCR 14, SIGCOMM 14]
     • Implement TCP in user-level! [e.g., IBM Technical Report]
     • Optimize I/O and cache coherence! [e.g., ANCS 12]
     • Build a user space wrapper for Kernel stack! [USITS 01]

  23. Condition Matching via Hashes
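
The transcript keeps only the title of slide 23, so the following is a guess at the general idea rather than the author's design: instead of comparing a packet against every rule in turn, the bytes at a rule's (offset, length) can be used as a hash-table key. The names `build_index` and `lookup` and the tuple layout of the rules are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (offset, expected bytes, reply bytes); a simplified rule shape assumed for this sketch.
SimpleRule = Tuple[int, bytes, bytes]
Index = Dict[Tuple[int, int], Dict[bytes, bytes]]


def build_index(rules: List[SimpleRule]) -> Index:
    """Group rules by (offset, length) so each group needs one hash lookup."""
    index: Index = {}
    for offset, content, reply in rules:
        index.setdefault((offset, len(content)), {})[content] = reply
    return index


def lookup(index: Index, packet: bytes) -> Optional[bytes]:
    """Hash the packet slice for each (offset, length) group; None means no rule matched."""
    for (offset, length), table in index.items():
        reply = table.get(packet[offset:offset + length])
        if reply is not None:
            return reply
    return None
```

With many rules sharing the same offset and length (for example, many DNS names of equal length at offset 12), the cost per packet depends on the number of distinct (offset, length) groups rather than on the number of rules.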

  24. DNS Real World Workload
