In-Depth Guide to RDMA Network Technology

Explore the intricate workings of RDMA (Remote Direct Memory Access) technology, including RDMA NIC features, RoCE NIC setup, Ethernet data formats, and transport types. Delve into the operational aspects of RDMA NICs, covering request-response processes and data transfer mechanisms. Gain insights into the potential of RDMA technology for enhancing network performance and efficiency.

  • RDMA Technology
  • Network Technology
  • RoCE NIC
  • Ethernet Data Format
  • Data Transfer

Presentation Transcript


  1. RDMA NIC In Lossless Network - Qingchun Song

  2. Background. Microsoft forecast in 2002: reaching 10 Gb/s line rate with TCP requires 5 CPUs on the Tx side and 12 CPUs on the Rx side. A dedicated processor or FPGA can offload TCP, but it adds extra cost and increases the software development workload. New protocols: RDMA (Remote Direct Memory Access) and RoCE (RDMA over Converged Ethernet), which provide CPU offload and kernel bypass.

  3. RoCE NIC (Network Interface Controller) Full Stack. A RoCE NIC is an extension of a regular NIC: besides supporting the Ethernet specification, it must also support the RoCE specification.
     1. Upper-level protocols (application layer): RDMA is a message-based transaction.
     2. Transport layer (the most important layer): the transport header; the Queue Pair (QP), which includes a send work queue and a receive work queue; verbs as the abstraction layer of the RDMA protocol; the shim layer between verbs and other interfaces.
     3. Network layer (traditional kernel layer): unicast and multicast operations.
     4. Link layer: packet relay within the same subnet; flow control, error detection and switching.
     5. Physical layer (RDMA NIC + cable + switch): link width, data encoding, voltage.

  4. Ethernet RoCE RDMA Data Format. For RoCE packets, the IBTA specifies that the InfiniBand BTH+ and payload are carried as the payload of RoCE. The BTH+ includes the definition of the datagram, the RDMA operation type, the acknowledgement of RDMA operations, the destination QP (queue pair), and so on. The IB transport layer guarantees data reliability in hardware. The RoCE payload uses a UDP port to connect to the UDP/IP header.
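
The header layout described above can be sketched in Python. This is a minimal illustration, assuming the RoCEv2 encapsulation (UDP destination port 4791) and the 12-byte InfiniBand Base Transport Header (BTH) layout as commonly documented; `pack_bth` and `unpack_bth` are hypothetical helper names, and the sketch omits the Ethernet, IP and UDP headers, any extended transport headers, and the ICRC.

```python
import struct

ROCEV2_UDP_DPORT = 4791  # assumption: RoCEv2, which uses this UDP destination port


def pack_bth(opcode, dest_qp, psn, pkey=0xFFFF, ack_req=False,
             solicited=False, migreq=False, pad_count=0, tver=0):
    """Pack a 12-byte BTH (hypothetical helper, not a library API)."""
    if not 0 <= dest_qp < (1 << 24):
        raise ValueError("destination QP is a 24-bit field")
    if not 0 <= psn < (1 << 24):
        raise ValueError("packet sequence number is a 24-bit field")
    # Byte 1: SE (1 bit) | M (1 bit) | PadCnt (2 bits) | TVer (4 bits)
    flags = (int(solicited) << 7) | (int(migreq) << 6) \
        | ((pad_count & 0x3) << 4) | (tver & 0xF)
    word2 = dest_qp                        # top 8 bits reserved, left as zero
    word3 = (int(ack_req) << 31) | psn     # AckReq bit, 7 reserved bits, PSN
    return struct.pack("!BBHII", opcode, flags, pkey, word2, word3)


def unpack_bth(data):
    """Decode the fields packed by pack_bth."""
    opcode, flags, pkey, word2, word3 = struct.unpack("!BBHII", data[:12])
    return {
        "opcode": opcode,
        "solicited": bool(flags >> 7),
        "pad_count": (flags >> 4) & 0x3,
        "pkey": pkey,
        "dest_qp": word2 & 0xFFFFFF,
        "ack_req": bool(word3 >> 31),
        "psn": word3 & 0xFFFFFF,
    }


# 0x0A is the RC "RDMA WRITE Only" opcode in the InfiniBand opcode table.
bth = pack_bth(opcode=0x0A, dest_qp=0x12, psn=100, ack_req=True)
fields = unpack_bth(bth)
```

The destination QP in the BTH is what lets the receiving NIC steer the packet to the right queue pair without involving the kernel.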

  5. The Working Principles Of RDMA NIC [figure]

  6. The Transport Type Of RDMA NIC
     Operation             UD    UC    RC
     Send / Receive        Yes   Yes   Yes
     RDMA Write            No    Yes   Yes
     RDMA Read / Atomic    No    No    Yes
     UD: Unreliable Datagram (unreliable, non-connected). UC: Unreliable Connection (unreliable, connected). RC: Reliable Connection (reliable, connected).
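
The support matrix on this slide can be encoded as a small lookup, which is handy when choosing a transport for a given set of operations. This is a sketch only; `SUPPORTED_OPS`, `supports` and `transports_for` are hypothetical names, not part of any RDMA library.

```python
# Operations each transport type supports, as listed on the slide.
SUPPORTED_OPS = {
    "UD": {"send_recv"},
    "UC": {"send_recv", "rdma_write"},
    "RC": {"send_recv", "rdma_write", "rdma_read", "atomic"},
}


def supports(transport, op):
    """Return True if the transport type supports the operation."""
    return op in SUPPORTED_OPS[transport]


def transports_for(ops):
    """Return every transport type that supports all requested operations."""
    wanted = set(ops)
    return [t for t, caps in SUPPORTED_OPS.items() if wanted <= caps]


candidates = transports_for(["send_recv", "rdma_write"])
```

Here `candidates` lists the transports usable by an application that needs both Send/Receive and RDMA Write.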

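  7. The Operation Of RDMA NIC: Send/Receive and RDMA Write. The requester and responder first sync. Send/Receive: the responder posts a receive request (RR) and the requester posts a send request (SR); the requester sends the data, the responder returns an ACK, and both sides poll their completion queue (CQ). RDMA Write: the requester posts an SR carrying data + addr + rkey; the responder returns an ACK, and only the requester polls its CQ.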
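
The difference between the two flows can be sketched with a toy in-process model. It is not the verbs API: `Endpoint`, `send` and `rdma_write` are hypothetical names, the "network" is a direct function call, and the ACK is implied by the requester's completion entry.

```python
from collections import deque


class Endpoint:
    """Toy stand-in for one RDMA NIC plus its host memory (hypothetical)."""

    def __init__(self, mem_size=64):
        self.memory = bytearray(mem_size)
        self.recv_queue = deque()  # posted receive requests: (offset, length)
        self.cq = deque()          # completion queue entries
        self.rkeys = {}            # rkey -> (offset, length) of a registered region

    def register(self, rkey, offset, length):
        self.rkeys[rkey] = (offset, length)

    def post_recv(self, offset, length):
        self.recv_queue.append((offset, length))

    def poll_cq(self):
        return self.cq.popleft() if self.cq else None


def send(requester, responder, data):
    """Send/Receive: consumes a posted RR; both sides get a completion."""
    if not responder.recv_queue:
        raise RuntimeError("responder has no receive request posted")
    offset, length = responder.recv_queue.popleft()
    if len(data) > length:
        raise ValueError("message larger than the posted receive buffer")
    responder.memory[offset:offset + len(data)] = data
    responder.cq.append(("recv", len(data)))
    requester.cq.append(("send", len(data)))  # generated once the ACK arrives


def rdma_write(requester, responder, data, addr, rkey):
    """RDMA Write: the responder's CPU is not involved and gets no completion."""
    base, length = responder.rkeys[rkey]  # KeyError models an invalid rkey
    if addr < base or addr + len(data) > base + length:
        raise PermissionError("write outside the registered region")
    responder.memory[addr:addr + len(data)] = data
    requester.cq.append(("rdma_write", len(data)))


a, b = Endpoint(), Endpoint()
b.post_recv(offset=0, length=16)
send(a, b, b"hello")
b.register(rkey=0x1234, offset=32, length=16)
rdma_write(a, b, b"xyz", addr=32, rkey=0x1234)
```

After this run, the requester's CQ holds two entries while the responder's holds only the receive completion, which matches the three "Poll CQ" steps on the slide.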

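  8. The Operation Of RDMA NIC: RDMA Read and RDMA Atomic. RDMA Read: the requester posts an SR carrying addr + rkey; the responder returns the data, and the requester polls its CQ. RDMA Atomic: the requester posts an SR carrying data + addr + rkey; the responder returns data, and the requester polls its CQ.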
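
A toy model of the responder side makes the two request shapes concrete: a read carries only addr + rkey, while an atomic also carries operand data and returns the original value. `Responder`, `rdma_read`, `fetch_add` and `compare_swap` are hypothetical names; the 64-bit, 8-byte-aligned operand follows InfiniBand atomics, and the little-endian memory layout is an assumption made for the sketch.

```python
class Responder:
    """Toy responder memory guarded by a single rkey (hypothetical)."""

    def __init__(self, size, rkey):
        self.memory = bytearray(size)
        self.rkey = rkey

    def check(self, addr, length, rkey):
        if rkey != self.rkey:
            raise PermissionError("invalid rkey")
        if addr < 0 or addr + length > len(self.memory):
            raise IndexError("access outside the registered region")


def rdma_read(resp, addr, length, rkey):
    """Request carries addr + rkey; the response carries the data."""
    resp.check(addr, length, rkey)
    return bytes(resp.memory[addr:addr + length])


def _load64(resp, addr, rkey):
    resp.check(addr, 8, rkey)
    if addr % 8:
        raise ValueError("atomic target must be 8-byte aligned")
    return int.from_bytes(resp.memory[addr:addr + 8], "little")


def fetch_add(resp, addr, rkey, add):
    """Add to the 64-bit value at addr and return the original value."""
    old = _load64(resp, addr, rkey)
    resp.memory[addr:addr + 8] = ((old + add) % (1 << 64)).to_bytes(8, "little")
    return old


def compare_swap(resp, addr, rkey, compare, swap):
    """Store swap if the current value equals compare; return the original."""
    old = _load64(resp, addr, rkey)
    if old == compare:
        resp.memory[addr:addr + 8] = swap.to_bytes(8, "little")
    return old


r = Responder(size=32, rkey=7)
r.memory[0:4] = b"data"
payload = rdma_read(r, addr=0, length=4, rkey=7)
first = fetch_add(r, addr=8, rkey=7, add=5)
second = fetch_add(r, addr=8, rkey=7, add=1)
third = compare_swap(r, addr=8, rkey=7, compare=6, swap=100)
```

In both cases only the requester sees a completion; the responder's memory is read or updated by the NIC without a posted receive.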

  9. The Lossless Network For RoCE. The sender RDMA NIC is the Reaction Point (RP), the switch is the Congestion Point (CP), and the receiver RDMA NIC is the Notification Point (NP). Congested traffic passing through the switch is ECN-marked, and the receiver sends a congestion notification back to the sender.
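
The RP/CP/NP feedback loop can be sketched as a discrete-time simulation. This is a deliberately simplified model, not the congestion control algorithm of any particular NIC: the function name `simulate`, the multiplicative cut, the additive recovery and every parameter value are assumptions chosen for illustration.

```python
def simulate(steps=50, line_rate=100.0, drain_rate=40.0,
             ecn_threshold=50.0, cut=0.5, recover=2.0):
    """Return a list of (sender_rate, queue_depth, ecn_marked) per time step."""
    rate = line_rate   # Reaction Point: current sending rate of the sender NIC
    queue = 0.0        # Congestion Point: queue depth in the switch
    history = []
    for _ in range(steps):
        # CP: the queue grows by whatever the egress port cannot drain.
        queue = max(0.0, queue + rate - drain_rate)
        # CP: mark packets with ECN once the queue passes the threshold.
        ecn_marked = queue > ecn_threshold
        # NP: the receiver NIC turns an ECN mark into a congestion notification.
        notification = ecn_marked
        # RP: cut the rate on notification, otherwise recover gradually.
        if notification:
            rate = rate * cut
        else:
            rate = min(line_rate, rate + recover)
        history.append((rate, queue, ecn_marked))
    return history


history = simulate()
lowest_rate = min(rate for rate, _, _ in history)
```

The point of the loop is that the sender slows down before the switch buffer overflows, so congestion is handled without dropping packets.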

  10. GPUDirect RDMA. The RDMA NIC is the de facto NIC. [Figure: latency plot (green line: RDMA latency, red line: TCP latency) and bandwidth plot (green line: TCP bandwidth, red line: RDMA bandwidth)]

  11. NVMeOF. NVMe: Non-Volatile Memory Express over PCI Express, an efficient programming interface for accessing NVM devices over a PCIe bus, with lock-free multi-thread/process NVM access. NVMeOF: Non-Volatile Memory Express over Fabrics. The RDMA NIC is the de facto NIC. [Figure: an initiator (CPU, memory, Linux block device API, NVMe driver, transport abstraction, RDMA driver, NVMe SQs/CQs, RDMA SQs/RQs/CQs, data buffers, PCIe NVMe and NIC functions, SSD controller) connected over the RDMA fabric to a target (CPU, memory, NVMe over Fabrics, RDMA driver and NVMe driver control paths, RDMA SQs/RQs/CQs, NVMe SQs/CQs, data buffers, RDMA transport, RDMA NIC, PCIe NVMe functions, SSD controllers)]

  12. Conclusion. RDMA is a game changer for large-scale systems. It makes data-centric computing a reality. It is the ideal network for more and more data centers, AI centers and HPC centers.

  13. Thank You
