Data Streaming and Datagram Features in OpenFabrics 2.0


This presentation lists requirements for data streaming and datagram support in OpenFabrics 2.0. Topics include connection setup, RDMA write operations, posting of receives, timeouts, error handling conventions, and minimum requirements on providers.

  • Data Streaming
  • Datagram
  • OpenFabrics 2.0
  • RDMA
  • High Performance

Presentation Transcript


  1. OpenFabrics 2.0 rsockets+ requirements
     Sean Hefty - Intel Corporation
     Bob Russell, Patrick MacArthur - UNH

  2. Data Streaming
     • Current: RDMA CM for connection setup
     • Single wait object and event queue
         • CM and CQ use same fd
     • In-band disconnect notification
     • Associate transport resource with an fd
         • fstat, dup2
     • Fork support
         • Migrate resources between user space and kernel
     • chroot support
     www.openfabrics.org

  3. Data Streaming
     • Current: RDMA write with immediate
     • Eliminate address and rkey exchange
         • Receiver selects key
         • Sender uses offset
     • Eliminate need for immediate data
         • Generate event based on write: location and length
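The offset-based addressing requested on this slide can be illustrated with a small model. This is only a sketch under stated assumptions: `target_region`, `write_event`, and `apply_offset_write` are hypothetical names invented here, not part of any OpenFabrics API, and the "write" is a plain memory copy standing in for an RDMA write. It shows the receiver choosing the key, the sender supplying only an offset, and an event that reports location and length.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a receiver-owned target region (not a verbs type). */
struct target_region {
    uint32_t key;   /* selected by the receiver, so no rkey exchange is needed */
    uint8_t *base;  /* virtual address stays private to the receiver */
    size_t   len;
};

/* Hypothetical event describing where an incoming write landed. */
struct write_event {
    size_t offset;
    size_t len;
};

/* Apply a write addressed by (key, offset).
 * Returns 0 on success, -1 if the key or the range is invalid. */
int apply_offset_write(struct target_region *r, uint32_t key, size_t offset,
                       const void *data, size_t len, struct write_event *ev)
{
    if (key != r->key || offset > r->len || len > r->len - offset)
        return -1;
    memcpy(r->base + offset, data, len);
    ev->offset = offset;  /* location of the write */
    ev->len = len;        /* length of the write */
    return 0;
}
```

Because the event carries the location and length, the model needs no immediate data to tell the receiver what arrived.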

  4. Data Streaming
     • Eliminate posting receives
         • No buffer is provided
         • Concern is overrunning CQ, not RQ
     • Replace RDMA write with send
         • Receiver posts single buffer that hardware packs multiple messages into
         • Eliminates RDMA header
     • Count of completed sends
         • Full completion data unnecessary
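The "single buffer that hardware packs multiple messages into" idea can be modeled in software. The following is a minimal sketch, assuming a hypothetical `multi_recv_buf` structure and `deliver_message` function (neither exists in any OpenFabrics library); the function plays the role the slide assigns to hardware.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one posted receive buffer that absorbs many sends. */
struct multi_recv_buf {
    uint8_t *base;
    size_t   cap;   /* total size of the posted buffer */
    size_t   used;  /* bytes consumed so far */
    unsigned msgs;  /* simple count of messages placed in the buffer */
};

/* Pack one incoming message directly behind the previous one.
 * Returns the offset at which it was placed, or -1 if it does not fit. */
long deliver_message(struct multi_recv_buf *b, const void *msg, size_t len)
{
    size_t at;

    if (len > b->cap - b->used)
        return -1;
    at = b->used;
    memcpy(b->base + at, msg, len);
    b->used += len;
    b->msgs++;
    return (long)at;
}
```

The `msgs` field mirrors the slide's point that a count may be enough where full completion data is unnecessary.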

  5. Data Streaming
     • Split received data into two buffers
         • Separate header and user data
         • Pack tightly, but use multiple buffers
     • Partial completion event
         • Notification of partial transfer for large requests
         • Allow receive side to begin processing
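Splitting received data into a header buffer and a user-data buffer resembles a POSIX scatter read. The sketch below uses `readv` on an ordinary file descriptor purely as an analogy; it is not an RDMA interface, and the helper name `recv_split` is hypothetical.

```c
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Scatter one read across two buffers: the first hdr_len bytes go to hdr,
 * the remainder goes to payload. Returns the total bytes read, or -1. */
ssize_t recv_split(int fd, void *hdr, size_t hdr_len,
                   void *payload, size_t payload_len)
{
    struct iovec iov[2] = {
        { .iov_base = hdr,     .iov_len = hdr_len },
        { .iov_base = payload, .iov_len = payload_len },
    };

    return readv(fd, iov, 2);
}
```

The payload buffer is filled only after the header buffer is full, so user data lands in its own buffer without an extra copy.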

  6. Data Streaming
     • Nonblocking support
         • Signal when transport is ready to accept new data
         • Available QP and CQ resources, send credits
     • Keepalive support
         • 0-byte send that does not generate a remote event
         • Similar to RDMA-write, but eliminate header

  7. Datagram
     • User selectable transport address (QPN)
         • High QPN lookup costs
     • Message backlog
     • Multi-receive message buffer
         • Single buffer receives multiple messages
     • Split received data into two buffers
         • Separate header and user data
         • Pack tightly, but use multiple buffers

  8. Datagram
     • Fast address resolution
         • Compact address data
     • Multicast support
         • Fast access to multicast group

  9. General Requests
     • Increase size of immediate data
         • Provide easy mechanism to discover if immediate data is supported and size
     • Slab based allocation for receive buffers
         • Eliminate wasted space dealing with max message size
     • Eliminate posting of dummy receive for immediate data

  10. General Requests
     • Add timeout parameters to all CM operations
         • E.g. connect, accept, disconnect, join multicast
     • Timeout parameters for reading events
     • Ability to cancel a pending I/O
         • Including CM operations
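Until an API offers timeout parameters directly, a caller can approximate "timeout parameters for reading events" by polling the event source before reading from it. This is a minimal sketch, assuming the event source exposes a pollable file descriptor; `wait_readable` is a hypothetical helper name, and only the standard POSIX `poll` call is used.

```c
#include <errno.h>
#include <poll.h>

/* Wait up to timeout_ms for fd to become readable.
 * Returns 1 if the descriptor is readable (or in an error/hangup state),
 * 0 on timeout, and -1 on failure with errno set by poll(). */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ret;

    do {
        /* Simplification: an interrupted wait restarts with the full timeout. */
        ret = poll(&pfd, 1, timeout_ms);
    } while (ret < 0 && errno == EINTR);

    if (ret < 0)
        return -1;
    return ret == 0 ? 0 : 1;
}
```

A caller would read the event only when the function returns 1, and treat 0 as the timeout case the slide asks the API to support natively.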

  11. General Requests
     • Error handling must be consistent
         • Do not leave to providers
     • Document which error codes every call can return
         • Similar to POSIX error code documentation
     • Use a single error return convention
         • Return -1 and set errno?
         • Return errno? (preferred)
         • Return +errno?
     • Consistent error values in events
         • Do not mix transport and errno values
     • Easy mechanism to display error text
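Two of the return conventions listed on this slide can be contrasted in a few lines. This is an illustrative sketch only: `open_queue_a`, `open_queue_b`, and `queue_strerror` are hypothetical functions, and the sign used by the second convention is an assumption made here, since the slide does not settle it.

```c
#include <errno.h>
#include <string.h>

/* Convention A: return -1 and report the cause through errno. */
int open_queue_a(int depth)
{
    if (depth <= 0) {
        errno = EINVAL;
        return -1;
    }
    return 0;
}

/* Convention B: return the error code itself (0 on success) and leave
 * errno untouched. A positive value is assumed here for illustration. */
int open_queue_b(int depth)
{
    if (depth <= 0)
        return EINVAL;
    return 0;
}

/* One way to provide an "easy mechanism to display error text". */
const char *queue_strerror(int err)
{
    return strerror(err);
}
```

With convention B the caller can pass the return value straight to `queue_strerror` without consulting any global state, which is one argument for a single, direct convention.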

  12. General Requests
     • Query current status of local queues
     • Generating an async event (e.g. SRQ) compounds the issue of dealing with multiple fds
         • Eliminate need for these events or provide in-band notification
     • Support memory registration across multiple devices
         • Register at the system level, not per PD per HCA

  13. General Requests
     • Need simple, programmatic way to detect memory alignment restrictions
         • Or avoid any alignment needs
     • Need better way to discover supported inline sizes
         • Providers should ensure that reported values actually improve performance

  14. General Requests
     • Define reasonable minimum requirements on providers for:
         • Number of SGEs
         • Inline size
         • Immediate data size
         • CM private data length
             • With a supported minimum for any message

  15. General Requests
     • Asynchronous interface can be source of races
         • E.g. completions before call returns
         • Have provider update user counters before generating completion
     • Support multiple providers at run-time
     • Provide test suite to verify provider conformance to API specifications
         • Example programs
         • Error conventions
         • Min/max values
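The race named on this slide, a completion arriving before the posting call returns, can be modeled in a single thread. The sketch below is an assumption-laden illustration: `endpoint`, `post_send`, and `handle_completion` are hypothetical names, and the immediate callback stands in for a completion that races ahead of the return.

```c
#include <stddef.h>

/* Hypothetical endpoint with a user-visible count of outstanding requests. */
struct endpoint {
    int outstanding;  /* requests posted but not yet completed */
    int lowest;       /* lowest value the counter has ever reached */
    void (*on_complete)(struct endpoint *ep);
};

/* User completion handler: retire one request and track the minimum. */
void handle_completion(struct endpoint *ep)
{
    ep->outstanding--;
    if (ep->outstanding < ep->lowest)
        ep->lowest = ep->outstanding;
}

/* Provider-side post: the counter is updated before any completion can be
 * generated, so the handler never observes a request it has not counted. */
void post_send(struct endpoint *ep)
{
    ep->outstanding++;
    if (ep->on_complete)
        ep->on_complete(ep);  /* models a completion before the call returns */
}
```

If the increment were left to the caller after `post_send` returned, the handler would drive the counter to -1 first. Real providers run completions on other threads, so an actual implementation would also need atomic updates, which this model omits.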
