Endpoint Interface Query Options

Explore the various query interfaces related to optimizing message queues, RMA/Atomics operations, and more in the context of application mapping and MPI implementation. Understand endpoint capabilities, data transfer flags, and event queue associations for efficient communication processes.

  • Query Interfaces
  • Endpoint Optimization
  • RMA/Atomics
  • MPI Implementation
  • Data Transfer

Presentation Transcript


  1. Application Mapping Over OFIWG SFI (Sean Hefty)

  2. MPI Over SFI

     Example MPI implementation over SFI. Demonstrates a possible usage model.

       • Initialization
       • Send injection
       • Send
       • Completions
       • Polling
       • RMA
       • Counters
       • Completions

  3. Query Interfaces: Tagged

       /* Tagged provider */
       /* Reliable unconnected endpoint */
       hints.type = FID_RDM;
       /* Address vector optimized for minimal memory footprint
          and no internal lookups */
       #ifdef MPIDI_USE_AV_MAP
       hints.addr_format = FI_ADDR;
       #else
       hints.addr_format = FI_ADDR_INDEX;
       #endif
       /* Transport agnostic */
       hints.protocol = FI_PROTO_UNSPEC;
       /* Behavior required by endpoint */
       hints.ep_cap = FI_TAGGED | FI_BUFFERED_RECV |
                      FI_REMOTE_COMPLETE | FI_CANCEL;
       /* Default flags to apply to data transfer operations */
       hints.op_flags = FI_REMOTE_COMPLETE;
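The slide does not show how a provider is matched against `hints.ep_cap`. One plausible reading is that a provider qualifies only if it offers every capability bit the application requests. The sketch below illustrates that subset test. The `CAP_*` constants and `caps_satisfied` are hypothetical names with made-up values, not SFI definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits. The real FI_* values are not given in
 * the slides, so these are placeholders for illustration only. */
enum {
    CAP_TAGGED          = 1u << 0,
    CAP_BUFFERED_RECV   = 1u << 1,
    CAP_REMOTE_COMPLETE = 1u << 2,
    CAP_CANCEL          = 1u << 3,
    CAP_RMA             = 1u << 4,
    CAP_ATOMICS         = 1u << 5
};

/* Returns true when every requested bit is also present in offered. */
static bool caps_satisfied(uint64_t requested, uint64_t offered)
{
    return (requested & ~offered) == 0;
}
```

Under this assumption, a provider that also offers RMA still satisfies the tagged request, while one lacking buffered receive does not.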

  4. Query Interfaces: RMA/Atomics

     Separate endpoint for RMA operations.

       /* RMA provider */
       hints.type = FID_RDM;
       #ifdef MPIDI_USE_AV_MAP
       hints.addr_format = FI_ADDR;
       #else
       hints.addr_format = FI_ADDR_INDEX;
       #endif
       hints.protocol = FI_PROTO_UNSPEC;
       /* FI_RMA | FI_ATOMICS: support for RMA and atomic operations */
       /* FI_REMOTE_READ | FI_REMOTE_WRITE: remote RMA read and write support */
       hints.ep_cap = FI_RMA | FI_ATOMICS | FI_REMOTE_COMPLETE |
                      FI_REMOTE_READ | FI_REMOTE_WRITE;
       hints.op_flags = FI_REMOTE_COMPLETE;

  5. Query Interfaces: Message Queue

       /* Event queue optimized to report tagged completions */
       eq_attr.mask = FI_EQ_ATTR_MASK_V1;
       eq_attr.domain = FI_EQ_DOMAIN_COMP;
       eq_attr.format = FI_EQ_FORMAT_TAGGED;
       fi_eq_open(domainfd, &eq_attr, &p2p_eqfd, NULL);

       /* Event queue optimized to report RMA completions */
       eq_attr.mask = FI_EQ_ATTR_MASK_V1;
       eq_attr.domain = FI_EQ_DOMAIN_COMP;
       eq_attr.format = FI_EQ_FORMAT_DATA;
       fi_eq_open(domainfd, &eq_attr, &rma_eqfd, NULL);

       /* Associate endpoints with event queues */
       fi_bind(tagged_epfd, p2p_eqfd, FI_SEND | FI_RECV);
       fi_bind(rma_epfd, rma_eqfd, FI_READ | FI_WRITE);

  6. Query Limits

     Query endpoint limits.

       /* Maximum inject data size. The buffer is reusable
          immediately after the function call returns. */
       optlen = sizeof(max_buffered_send);
       fi_getopt(tagged_epfd, FI_OPT_ENDPOINT, FI_OPT_MAX_INJECTED_SEND,
                 &max_buffered_send, &optlen);

       /* Maximum application level message size */
       optlen = sizeof(max_send);
       fi_getopt(tagged_epfd, FI_OPT_ENDPOINT, FI_OPT_MAX_MSG_SIZE,
                 &max_send, &optlen);

  7. Short Send

     Small sends map directly to a tagged-injectto call. The fabric address is provided directly to the provider.

       int MPIDI_Send(buf, count, datatype, rank, tag, comm,
                      context_offset, **request)
       {
           data_sz = get_size(count, datatype);
           if (data_sz <= max_buffered_send) {
               match_bits = init_sendtag(comm->context_id + context_offset,
                                         comm->rank, tag, 0);
               fi_tinjectto(tagged_epfd, buf, data_sz,
                            COMM_TO_PHYS(comm, rank), match_bits);
           } else { ... }
       }
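The slide calls `init_sendtag` but does not show how the context id, source rank, and tag are combined into `match_bits`, nor what the trailing `0` argument means. The following is a minimal sketch under an assumed 64-bit layout of 16 bits of context, 16 bits of source rank, and 32 bits of tag. The layout, `pack_match_bits`, the `match_*` accessors, and `choose_send_path` are all hypothetical and only illustrate the idea of packing matching fields into one word and selecting the inject path by size.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout (not from the slides): [context:16][source rank:16][tag:32]. */
#define TAG_BITS  32
#define RANK_BITS 16

/* Pack the three matching fields into a single 64-bit word. */
static uint64_t pack_match_bits(uint16_t context_id, uint16_t src_rank,
                                uint32_t tag)
{
    return ((uint64_t)context_id << (RANK_BITS + TAG_BITS)) |
           ((uint64_t)src_rank << TAG_BITS) |
           (uint64_t)tag;
}

/* Accessors that undo the packing above. */
static uint16_t match_context(uint64_t bits) { return (uint16_t)(bits >> (RANK_BITS + TAG_BITS)); }
static uint16_t match_rank(uint64_t bits)    { return (uint16_t)(bits >> TAG_BITS); }
static uint32_t match_tag(uint64_t bits)     { return (uint32_t)bits; }

enum send_path { SEND_INJECT, SEND_WITH_REQUEST };

/* Mirrors the size test on the slide: payloads no larger than the
 * queried inject limit take the inject path, everything else needs
 * a request object. */
static enum send_path choose_send_path(size_t data_sz, size_t max_buffered_send)
{
    return data_sz <= max_buffered_send ? SEND_INJECT : SEND_WITH_REQUEST;
}
```

Packing all matching fields into one word lets the receiver match on a single 64-bit comparison, which is presumably why the tagged interface takes `match_bits` as one argument.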

  8. Large Message Send

     Large sends require request allocation. The SFI completion context is embedded in the request object.

       int MPIDI_Send(buf, count, datatype, rank, tag, comm,
                      context_offset, **request)
       {
           /* code for type calculations, tag creation, etc */
           REQUEST_CREATE(sreq);
           fi_tsendto(MPIDI_Global.tagged_epfd, send_buf, data_sz, NULL,
                      COMM_TO_PHYS(comm, rank), match_bits,
                      &(REQ_OF2(sreq)->of2_context));
           *request = sreq;
       }
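Embedding the completion context inside the request means a completion that reports the context pointer can be mapped back to its request with pointer arithmetic alone, with no lookup table. The sketch below shows one common way to do that using `offsetof`. The struct layouts are hypothetical, and the slides name `context_to_request` without showing its body, so this implementation is an assumption.

```c
#include <stddef.h>

/* Placeholder for provider-owned per-operation state (hypothetical). */
struct op_context {
    void *internal[4];
};

/* Hypothetical request object with the context embedded as a member. */
struct request {
    int kind;
    int completed;
    struct op_context ctx;
};

/* Recover the enclosing request from a pointer to its embedded context
 * by subtracting the member's offset within struct request. */
static struct request *context_to_request(struct op_context *ctx)
{
    return (struct request *)((char *)ctx - offsetof(struct request, ctx));
}
```

The address passed when posting the send would be `&req->ctx`, and the same address comes back in the completion entry.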

  9. Progress/Polling for Completions

     Fields of the tagged entry align with those of the data entry, so one entry type can be used to read from both queues.

       int MPIDI_Progress()
       {
           eq_tagged_entry_t wc;
           fid_eq_t fd[2] = {p2p_eqfd, rma_eqfd};
           for (i = 0; i < 2; i++) {
               MPID_Request *req;
               rc = fi_eq_read(fd[i], (void *)&wc, sizeof(wc));
               handle_errs(rc);
               req = context_to_request(wc.op_context);
               req->callback(req);
           }
       }
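The progress loop can be tried out without a fabric by substituting a small in-memory queue for the provider's event queue. The sketch below does that. `mock_eq`, `mock_eq_push`, `mock_eq_read`, and `progress_once` are hypothetical stand-ins, not SFI calls, and the completion entry here stores the request pointer directly. Unlike the slide, the sketch only dispatches a callback when a read actually returned an entry.

```c
#include <stddef.h>

#define MOCK_EQ_DEPTH 8

struct request {
    void (*callback)(struct request *req);
    int completed;
};

/* Completion entry: op_context carries the pointer given at post time. */
struct eq_entry {
    void *op_context;
};

/* In-memory ring buffer standing in for a provider event queue. */
struct mock_eq {
    struct eq_entry entries[MOCK_EQ_DEPTH];
    size_t head, tail;
};

/* Queue a completion. Returns 0 on success, -1 if the queue is full. */
static int mock_eq_push(struct mock_eq *eq, void *ctx)
{
    if (eq->tail - eq->head == MOCK_EQ_DEPTH)
        return -1;
    eq->entries[eq->tail++ % MOCK_EQ_DEPTH].op_context = ctx;
    return 0;
}

/* Non-blocking read: returns 1 and fills *out, or 0 if the queue is empty. */
static int mock_eq_read(struct mock_eq *eq, struct eq_entry *out)
{
    if (eq->head == eq->tail)
        return 0;
    *out = eq->entries[eq->head++ % MOCK_EQ_DEPTH];
    return 1;
}

static void mark_complete(struct request *req)
{
    req->completed = 1;
}

/* One progress pass: read at most one completion from each queue and
 * run the owning request's callback. Returns the number handled. */
static int progress_once(struct mock_eq *queues[], size_t nqueues)
{
    int handled = 0;
    for (size_t i = 0; i < nqueues; i++) {
        struct eq_entry wc;
        if (mock_eq_read(queues[i], &wc) == 1) {
            struct request *req = wc.op_context;
            req->callback(req);
            handled++;
        }
    }
    return handled;
}
```

Reading one entry per queue per pass keeps either queue from starving the other, which matches the fixed two-iteration loop on the slide.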

  10. RMA Completions (Counters and Completions)

       int MPIDI_Win_fence(MPID_Win *win)
       {
           /* Synchronize software counters via completions */
           PROGRESS_WHILE(win->started != win->completed);
           /* Synchronize hardware counters */
           fi_sync(WIN_OF2(win)->rma_epfd, FI_WRITE | FI_READ | FI_BLOCK, NULL);
           /* Notify any request based objects that use counter completion */
           RequestQ->notify();
       }
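The software-counter half of the fence amounts to driving progress until the number of completed operations catches up with the number started. The sketch below shows that loop in isolation. `struct window`, `fence_wait`, and `fake_progress` are hypothetical, and the fake progress function simply pretends one operation completes per pass. The `PROGRESS_WHILE` macro itself is not defined in the slides, so this is only one way it might expand.

```c
/* Hypothetical per-window software counters. */
struct window {
    unsigned started;    /* operations issued on this window */
    unsigned completed;  /* operations whose completion was processed */
};

typedef void (*progress_fn)(struct window *win);

/* Drive progress until every started operation has completed.
 * Returns the number of progress passes that were needed. */
static unsigned fence_wait(struct window *win, progress_fn progress)
{
    unsigned passes = 0;
    while (win->started != win->completed) {
        progress(win);
        passes++;
    }
    return passes;
}

/* Stand-in progress engine: one outstanding operation completes per pass. */
static void fake_progress(struct window *win)
{
    win->completed++;
}
```

In a real implementation the progress call would poll the event queues, and the completion callbacks would be what increments the completed counter.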
