Optimizing Exascale Network Power Management through Link Control

Explore the challenges of managing power in exascale networks, focusing on the efficient control of network links to reduce power consumption. Discover insights on network utilization, communication patterns, and solutions to minimize energy waste.

  • Exascale Networks
  • Power Management
  • Link Control
  • Network Utilization
  • Communication Patterns

Presentation Transcript


  1. Toward Runtime Power Management of Exascale Networks by On/Off Control of Links. Ehsan Totoni, University of Illinois at Urbana-Champaign, PPL. Charm++ Workshop, April 16, 2013.

  2. Power challenge: Power is a major challenge. Blue Waters consumes up to 13 MW, enough to electrify a small town, and needs matching power and cooling infrastructure. Up to 30% of power goes to the network (projected for future systems by Peter Kogge). Work from Sandia saved 25% of the power in a current Cray XT system by turning down the network.

  3. Network link power: The network is not energy proportional. Its consumption is not related to utilization and stays near peak most of the time, unlike a processor's. A recent study from Google (ISCA '10) found that the network accounts for 50% of the power of a non-HPC data center when the CPUs are underutilized. Up to 65% of the network's power is in the links.

  4. Exascale networks: Dragonfly (IBM PERCS in Power 775 machines; Cray Aries network in XC30 "Cascade"; DOE Exascale Report). High-dimensional tori (5D torus in IBM Blue Gene/Q; 6D torus in the K computer). Higher radix means a lot of links!

  5. Communication patterns: Applications' communication patterns differ, while the network topology is designed for a wide range of applications. Examples: NPB CG and MILC.

  6. Fraction of links ever used

  7. Nearest neighbor usage

  8. More expensive links

  9. Nearest neighbor

  10. Solution to power waste: Many of the links are never used by common applications. Are networks over-built? Maybe; FFTs are crucial. But processors are also overbuilt. Let's make networks energy proportional, consuming according to workload just like processors: turn off unused links. A commercial network exists (Motorola).

  11. Runtime system solution: Hardware can cause delays, according to related work, because it does not have enough application knowledge (small window size). The compiler does not have enough information (input-dependent program flow). The application does not know the hardware, and exposing it would be a significant programming burden. The runtime system is the best place: it mediates all communication, knows the application, and knows the hardware.

  12. Feasibility: Probably not available for your cluster downstairs; hardware vendors need to be convinced. The runtime gives hints to the hardware, with a small delay penalty if a hint is wrong. Multiple jobs cause interference, but isolated allocations are becoming common: Blue Genes already allocate cubes, and capability machines are for big jobs.

  13. Software design choices: Random mapping and indirect routing have similar performance but different link usage.

  14. Power model: We saw that many links are never used. The links that are used are not used all the time, only for a fraction of the iteration time (compute-communicate paradigm). The power model is based on network capacity utilization: the average utilization of all the links. It assumes that links are turned on and off magically at exactly the right time, with no switching overhead. Example: the network is used for one tenth of the iteration time.

  15. Model results

  16. Scheduling on/offs: The runtime roughly knows when a message will arrive, for common iterative HPC applications on low-noise systems (e.g. IBM Blue Genes). There is a delay for switching a link: 10 µs for the current implementation, much smaller than the iteration time. The runtime can be conservative and schedule "on"s earlier, which is similar to having more switching delay.

  17. Delay overhead

  18. Results summary

  19. Questions? Are you convinced?
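
The idealized power model (slide 14) and the conservative switch-on scheduling (slide 16) can be illustrated with a small Python sketch. This is not the author's implementation. It assumes that a link that is off draws no power, that each link's busy intervals within one iteration are known in advance, and that relative link power equals the average on-fraction over all links. The function names, the four-link example, the 10 ms iteration time, and the lead time of 10 (in microseconds) are hypothetical values chosen only for illustration.

```python
def link_on_time(busy, iteration_time, lead=0.0):
    """Time one link spends powered on during one iteration.

    busy: list of (start, end) intervals in which the link carries traffic.
    lead: how much earlier than each interval the link is switched on
          (0.0 reproduces the idealized "magic" on/off model).
    """
    # Widen every interval backward by the lead time and clip to the iteration.
    widened = sorted((max(0.0, start - lead), min(iteration_time, end))
                     for start, end in busy)
    total = 0.0
    current = None  # the merged interval being built
    for start, end in widened:
        if current is not None and start <= current[1]:
            # Overlaps the current interval: extend it.
            current = (current[0], max(current[1], end))
        else:
            if current is not None:
                total += current[1] - current[0]
            current = (start, end)
    if current is not None:
        total += current[1] - current[0]
    return total


def average_link_utilization(links, iteration_time, lead=0.0):
    """Average on-fraction over all links, counting never-used links as 0."""
    on_total = sum(link_on_time(busy, iteration_time, lead) for busy in links)
    return on_total / (len(links) * iteration_time)


# Hypothetical example, times in microseconds: a 10 ms iteration, four links,
# two of them never used and two busy for one tenth of the iteration.
ITERATION = 10_000.0
LINKS = [[(4_000.0, 5_000.0)], [(4_000.0, 5_000.0)], [], []]

ideal = average_link_utilization(LINKS, ITERATION)
conservative = average_link_utilization(LINKS, ITERATION, lead=10.0)
print(f"ideal: {ideal:.4f}, with early switch-on: {conservative:.4f}")
```

Under these assumptions, a single link that is busy for one tenth of the iteration has a utilization of 0.1, never-used links pull the average down further, and a lead time that is small relative to the iteration barely changes the result.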
