Optimizing Charm++ Process Launching for Improved User Experience

This presentation covers user-facing improvements to Charm++ process launching: standalone execution, tweaks to the existing launch scheme, specifying the number of nodes directly, limiting the number of hosts used, topology-aware launching, and automatic provisioning options.

  • Charm++
  • Process Launching
  • User Experience
  • Optimization
  • Topology-Aware

Presentation Transcript


  1. User-facing Improvements to Charm++ Process Launching
     Evan Ramos, Charmworks, Inc.

  2. Context
     • Standalone execution
     • Charmrun process launcher
     • netlrts
     • verbs

  3. Tweaks to Existing Launch Scheme

  4. ++n, ++np
     Goal: Specify the number of nodes directly
     Example: 16 processes with 24 threads each
     Previously (16 × 24 = 384):
         ./charmrun ++p 384 ++ppn 24 ./hello
     Now supported:
         ./charmrun ++n 16 ++ppn 24 ./hello
         ./charmrun ++p 384 ++n 16 ./hello
         ./charmrun ++p 384 ++n 16 ++ppn 24 ./hello
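
The arithmetic behind these three equivalent command lines can be sketched in a few lines of Python. This is only an illustration of the relationship p = n × ppn implied by the example; the function name `resolve_launch` is hypothetical, and rejecting inconsistent values is an assumption, not a description of how charmrun itself parses its arguments.

```python
def resolve_launch(p=None, n=None, ppn=None):
    """Fill in whichever of p (total PEs), n (processes), ppn (threads
    per process) is missing, assuming p = n * ppn.

    Hypothetical helper for illustration; not charmrun's real logic.
    """
    if sum(v is not None for v in (p, n, ppn)) < 2:
        raise ValueError("need at least two of p, n, ppn")
    if p is None:
        p = n * ppn
    elif n is None:
        if p % ppn:
            raise ValueError("p is not a multiple of ppn")
        n = p // ppn
    elif ppn is None:
        if p % n:
            raise ValueError("p is not a multiple of n")
        ppn = p // n
    elif p != n * ppn:
        # Assumption: treat inconsistent triples as an error.
        raise ValueError("p != n * ppn")
    return p, n, ppn


if __name__ == "__main__":
    # The four command lines from the slide describe the same launch.
    print(resolve_launch(p=384, ppn=24))
    print(resolve_launch(n=16, ppn=24))
    print(resolve_launch(p=384, n=16))
    print(resolve_launch(p=384, n=16, ppn=24))
```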

  5. ++numHosts
     Goal: Limit the number of hosts used (hosts < PEs)
     Example: 4 hosts, 8 PEs
     Previously: Required nodelist modification
     Now supported:
         ./charmrun ++p 8 ++numHosts 4 ./hello
     Sample nodelist file:
         group main +shell "ssh"
         host ambition.cs.illinois.edu
         host beauty.cs.illinois.edu
         host charity.cs.illinois.edu
         host courage.cs.illinois.edu
         host devotion.cs.illinois.edu
         host esteem.cs.illinois.edu
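
To make the effect of limiting hosts concrete, here is a rough Python sketch that reads the sample nodelist and spreads 8 PEs over 4 hosts. The slides do not say which hosts charmrun selects or how it distributes PEs among them; taking the first N `host` lines and assigning PEs round-robin are both assumptions, and `parse_hosts` and `place_pes` are hypothetical names.

```python
NODELIST = """\
group main +shell "ssh"
host ambition.cs.illinois.edu
host beauty.cs.illinois.edu
host charity.cs.illinois.edu
host courage.cs.illinois.edu
host devotion.cs.illinois.edu
host esteem.cs.illinois.edu
"""


def parse_hosts(nodelist_text):
    """Return host names from lines of the form 'host <name>'."""
    hosts = []
    for line in nodelist_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "host":
            hosts.append(fields[1])
    return hosts


def place_pes(hosts, num_pes, num_hosts):
    """Count PEs per host, using only the first num_hosts hosts.

    Assumption: first-N selection and round-robin assignment; the
    real launcher may choose differently.
    """
    used = hosts[:num_hosts]
    if not used:
        raise ValueError("no hosts available")
    placement = {host: 0 for host in used}
    for pe in range(num_pes):
        placement[used[pe % len(used)]] += 1
    return placement


if __name__ == "__main__":
    # Mirrors: ./charmrun ++p 8 ++numHosts 4 ./hello
    for host, count in place_pes(parse_hosts(NODELIST), 8, 4).items():
        print(host, count)
```

Under these assumptions the last two hosts in the file stay unused, which is the outcome that previously required editing the nodelist.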

  6. Topology-Aware Launching

  7. Topology-Aware Launching
     • Request processes and worker threads per unit of hardware
     • Processor topology queried via the Portable Hardware Locality library (hwloc): https://www.open-mpi.org/projects/hwloc/
     • CPU affinity set automatically

  8. Topology-Aware Launch Options
     Charmrun
       Processes:
         ++processPerHost N
         ++processPerSocket N
         ++processPerCore N
         ++processPerPU N
       Worker Threads:
         ++oneWthPerHost
         ++oneWthPerSocket
         ++oneWthPerCore
         ++oneWthPerPU
     Standalone
       Worker Threads:
         +oneWthPerHost (equivalent to +p1)
         +oneWthPerSocket
         +oneWthPerCore
         +oneWthPerPU
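
As a rough illustration of what these options request, the sketch below counts processes and worker threads for a homogeneous machine description. The `Topology` class and `plan` function are hypothetical, and the interpretation (one worker thread per hardware unit overall, divided evenly among the processes) is an assumption drawn from the option names, not from the Charm++ implementation, which obtains the real topology from hwloc.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topology:
    """Hypothetical homogeneous machine description."""
    hosts: int
    sockets_per_host: int
    cores_per_socket: int
    pus_per_core: int

    def count(self, unit):
        """Total number of hardware units of the given kind."""
        cores = self.sockets_per_host * self.cores_per_socket
        per_host = {
            "Host": 1,
            "Socket": self.sockets_per_host,
            "Core": cores,
            "PU": cores * self.pus_per_core,
        }
        return self.hosts * per_host[unit]


def plan(topo, process_unit, n_per_unit, wth_unit):
    """Return (processes, worker threads per process).

    Models '++processPer<process_unit> N ++oneWthPer<wth_unit>'
    under the assumptions stated above.
    """
    processes = topo.count(process_unit) * n_per_unit
    threads_total = topo.count(wth_unit)
    if threads_total % processes:
        raise ValueError("threads do not divide evenly among processes")
    return processes, threads_total // processes


if __name__ == "__main__":
    # Made-up example: 2 hosts, 2 sockets each, 12 cores per socket, 2 PUs per core.
    topo = Topology(2, 2, 12, 2)
    print(plan(topo, "Socket", 1, "Core"))
    print(plan(topo, "Socket", 1, "PU"))
```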

  9. Automatic Topology-Aware Launch Options
     Charmrun
       Automatic Provisioning:
         ++auto-provision
         ++autoProvision
       Currently equivalent to:
         non-SMP: ++processPerCore 1
         SMP: ++processPerSocket 1 ++oneWthPerPU
     Standalone
       Automatic Provisioning:
         +auto-provision
         +autoProvision
       Currently equivalent to:
         non-SMP: +oneWthPerHost
         SMP: +oneWthPerPU
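
The equivalences on this slide amount to a small lookup table. The Python sketch below restates them for reference; the names `AUTO_PROVISION` and `expand_auto_provision` are hypothetical, and since the slide describes the mapping as what it is "currently" equivalent to, it may change in later releases.

```python
# Keys are (launcher, is_smp_build); values are the explicit options
# the slide lists as equivalent to automatic provisioning.
AUTO_PROVISION = {
    ("charmrun", False): "++processPerCore 1",
    ("charmrun", True): "++processPerSocket 1 ++oneWthPerPU",
    ("standalone", False): "+oneWthPerHost",
    ("standalone", True): "+oneWthPerPU",
}


def expand_auto_provision(launcher, smp):
    """Return the explicit options equivalent to auto-provisioning."""
    return AUTO_PROVISION[(launcher, smp)]


if __name__ == "__main__":
    print(expand_auto_provision("charmrun", smp=True))
```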

  10. Limitations
      • Requires homogeneous processor topology
      • Not implemented for other process launchers

  11. Potential Future Developments
      • Heterogeneity of processes per host: likely straightforward
      • Heterogeneity of threads per process: requires involved changes to RTS
      • Support with process launchers used by other network layers
      • Application-specified optimal launch scheme
      • More parameters for more situations

  12. Feedback and suggestions welcome!
      evan@hpccharm.com
