NCAP: Network-Driven Packet Context-Aware Power Management

This study introduces NCAP, a power management solution for underutilized servers running online data-intensive applications. By utilizing aggressive processor power management based on network packet context, NCAP significantly reduces processor energy consumption while meeting service level agreements. The methodology, challenges, and evaluation results of this approach are outlined, highlighting its potential to address energy inefficiency in server architectures.

  • Power Management
  • Network-Driven
  • Packet Context-Aware
  • Server Architecture
  • Energy Efficiency

Uploaded on Mar 11, 2025


Presentation Transcript


  1. NCAP: Network-Driven Packet Context-Aware Power Management for Client-Server Architecture
     Mohammad Alian, Ahmed Abulila, Lokesh Jindal, Daehoon Kim, Nam Sung Kim
     Electrical & Computer Engineering, University of Illinois Urbana-Champaign

  2. Executive Summary
     • problem: servers running on-line data-intensive (OLDI) applications (e.g., web serving) are often underutilized
       o energy is wasted at low- to medium-load levels
       o solution? power management
     • challenge: conventional power management policies fail to satisfy SLA (deadline for processing a request)
       o poor low-power state governors
       o overhead of transitioning b/w power states
     • NCAP: aggressive processor power management based on context of network packet through a network interface card (NIC)
       o 37-69% lower processor energy consumption than a baseline server while satisfying SLA

  3. Outline
     • motivation
       o power management challenges in servers running OLDI applications
     • methodology
       o dist-gem5: full-system/cycle-accurate simulator for distributed computer systems
     • background
       o client-server architecture
       o conventional power management
     • network-driven, packet context-aware power management
       o HW/SW joint approach augmenting conventional power management techniques
     • evaluation
     Sections: motivation, methodology, background, NCAP, evaluation, conclusion

  4. Motivation
     • online data intensive (OLDI) services
       o e.g., key-value stores, web-serving
       o client-server architecture
       o obligated to satisfy SLA (Service Level Agreement)
     • load on an OLDI server dynamically changes
       o unpredictable load in short time periods
       o load imbalance amongst servers in a data-center
     • OLDI server energy consumption
       o servers energy-inefficient at low- to medium-load levels
       o causes SLA violation for OLDI services due to slow power state transitioning
     • Server Power Distribution [1]: CPUs 42%, Others 17%, Cooling 15%, Disks 14%, DRAM 12%
     [1] L. A. Barroso, et al. The Datacenter as a Computer, Second Edition, 2013

  5. Methodology
     • run two representative OLDI applications
       o Memcached and ApacheBench
     • model a multi-node computer cluster using dist-gem5
       o one server and several clients at various load levels
     • configuration
       o O3 core: 4 cores, each w/ 15 P states (0.65V/0.8GHz to 1.2V/3.1GHz) & 3 C states (C1, C3, & C6)
       o 10Gbps Intel GbE NIC and switch w/ 1 µs network
       o OS: Linux Ubuntu 11.04 (kernel version 2.6)

  6. dist-gem5
     • parallel/distributed gem5 for simulating distributed computer systems
       o each node is a full-system model: operating system + processor + memory + IO
       o network switch process: models an arbitrary network topology, speed, latency
     • dist-gem5 available @ gem5.org
     • unified infrastructure for evaluating distributed systems
     • enables research cutting across operating system, processor/memory architecture, and network
       o heterogeneous datacenters
       o performance/power studies of networked systems
       o big-data architecture
     [Figure: clients and servers connected through the network; each node shows apps, OS, cores C0 and C1, LLC, memory, disk, IO, and NIC]

  7. Client-Server Architecture
     [Figure: clients and servers exchange requests and responses over the network]

  8. Client-Server Architecture
     [Figure: clients and servers exchange requests and responses over the network; the server shows cores C0 and C1, LLC, memory, disk, IO, and NIC]

  9. Client-Server Architecture
     1. receive a packet encapsulating a request
     2. process packet
     [Figure: a packet (PKT) arrives at the server NIC]

  10. Client-Server Architecture
     1. receive a packet encapsulating a request
     2. process packet
     3. process request
     4. send response
     [Figure: the request (REQ) is handled on core C0 and the response (RES) is produced]

  11. Client-Server Architecture
     1. receive a packet encapsulating a request
     2. process packet
     3. process request
     4. send response
     [Figure: the response (RES) leaves through the NIC; annotation: takes ~80 µs on average]

  12. Client-Server Architecture
     1. receive a packet encapsulating a request
     2. process packet
     3. process request
     4. send response
     [Figure: the response (RES) leaves through the NIC]

  13. Processor Power Management
     • performance (active P) state: P0 to Pn w/ f(Pn-1) > f(Pn)
       o performance and powersave governors: run cores at P0 and Pn, respectively
       o ondemand governor: periodically samples core utilization and adjusts the P-state proportionally
     • power (or idle C) state: C0 (idle), C1 (halt), C3 (sleep), C6 (off)
       o menu governor: transitions cores to different C states based on the transition history
       o ladder governor: gradually transitions cores to deep C states
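The ondemand behavior described on this slide can be sketched roughly as follows. This is a minimal illustration of "sample utilization, then scale frequency proportionally", not the Linux implementation. The frequency range and P-state count come from the methodology slide; the even spacing between P-states and the function names are assumptions made for the example.

```python
# Sketch of an ondemand-style P-state choice as the slide describes it:
# sample core utilization periodically, then scale frequency proportionally.
# Assumptions (not from the slides): P-states are evenly spaced in frequency,
# and the helper names below are hypothetical.

F_MAX_GHZ = 3.1   # P0 frequency from the methodology slide
F_MIN_GHZ = 0.8   # Pn frequency from the methodology slide
NUM_P_STATES = 15


def p_state_frequencies(n=NUM_P_STATES, f_max=F_MAX_GHZ, f_min=F_MIN_GHZ):
    """Return frequencies for P0..P(n-1), highest first, evenly spaced."""
    step = (f_max - f_min) / (n - 1)
    return [f_max - i * step for i in range(n)]


def select_p_state(utilization, freqs):
    """Pick the slowest P-state whose frequency covers utilization * f_max."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be in [0, 1]")
    target = utilization * freqs[0]
    # freqs is sorted high to low, so scan from the slowest state upward.
    for index in range(len(freqs) - 1, -1, -1):
        if freqs[index] >= target:
            return index
    return 0
```

A governor loop would call `select_p_state` once per invocation period with the utilization measured over the previous period, which is why the length of that period bounds how quickly the frequency can follow the load.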

  14. Processor Power Management
     • HW/SW overheads for changing P and C states
       o HW overhead of 5 to 60 µs in Intel i7-3770
       o governor invocation overhead: minimum invocation period of ondemand in Linux is hard coded to 10ms
     [Figure: Apache running at different load levels (LOW, MED, HIGH); normalized response time vs. ondemand invocation period (0 to 40 ms)]
     [Figure: voltage and frequency during a P-state change; voltage ramps at 6.25mV/µs; PLL relocking time ~10 µs]

  15. Conventional Power Management
     • OLDI applications at med-load level (A: Apache, M: Memcached)
     • power management policies

       transition to idle?    V/F level
       no                     highest
       no                     dynamic (ondemand)
       yes (menu)             highest
       yes (menu)             dynamic (ondemand)

     [Figure: normalized energy vs. response time normalized to SLA for each policy, with the SLA violation region marked]

  16. ondemand Governor
     • correlation b/w BW(Rx/Tx) and U(core) (i.e., F(core))
       o BW(Rx/Tx) as a proxy for U(core) changes
     • late reaction to high/low core utilization
       o 11ms and 7ms late frequency increase/decrease
     [Figure: Apache w/ 10ms invocation period for ondemand governor; BW(rx), BW(tx), and U(core) as utilization and F(core) as frequency (GHz) vs. time (s); BW(rx,tx) ~ U(core)]

  17. menu Governor
     • ineffective in waking up cores in time
       o unpredictable BW(Rx) surge
     • several short transitions during BW surge
       o energy efficiency degradation
     [Figure: Apache w/ ondemand and menu governors; BW(rx), BW(tx), and U(core) as utilization and sleep time (ms) in TC1, TC3, and TC6 vs. time (s)]

  18. Proposed NCAP P/C-State Governor
     • desired P/C-state governor
       o react to change in core utilization in a timely manner
     • approaches
       o predict changes in core utilization: core utilization is highly correlated w/ network activity
       o hide P/C-state transition latency: overlap P/C-state transition w/ packet reception and processing
     [Figure: packet receive path: NIC, DMA over the PCIe channel (RC, MC) into rx_desc_ring in DRAM, interrupt handler, SoftIRQ, network stack (skb), copy to user]

  19. NCAP Power Management: BW(Rx) Surge
     • high latency-critical request arrival rates detected w/ special HW in NIC
     • NIC will notify CPU by sending an interrupt to:
       o activate cores
       o boost frequency
       o disable menu governor
     [Figure: a received frame (eth and TCP headers, byte marks 0 and 66, then payload "GET /a.htm...") is compared against the template {GET, POST, XY}; cores C0 and C1 are shown in Max Freq, Active, Idle, and Deep Idle states]

  20. NCAP Power Management: BW(Rx) Surge
     • high latency-critical request arrival rates detected w/ special HW in NIC
     • NIC will notify CPU by sending an interrupt to:
       o activate cores
       o boost frequency
       o disable menu governor
     • overlap P/C state transition time with packet reception and processing
     [Figure: a received frame (eth and TCP headers, byte marks 0 and 66, then payload "GET /a.htm...") is compared against the template {GET, POST, XY}; cores C0 and C1 are shown in Max Freq, Active, Idle, and Deep Idle states]
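The template comparison on these two slides can be illustrated in software, although the slides place it in NIC hardware. The sketch below assumes the payload starts at byte offset 66 (read from the byte marks in the slide figure) and uses only the GET and POST entries of the template set; the names are hypothetical.

```python
# Sketch of the packet-context check the slides attribute to NIC hardware.
# Assumptions (not from the slides): the payload starts at byte offset 66,
# read from the "0 / 66" marks in the slide figure; the template set holds
# only the two methods named on the slide; all names are hypothetical.

PAYLOAD_OFFSET = 66
TEMPLATES = (b"GET", b"POST")


def is_latency_critical(frame, templates=TEMPLATES, offset=PAYLOAD_OFFSET):
    """Return True if the payload begins with one of the templates."""
    payload = frame[offset:]
    return any(payload.startswith(t) for t in templates)


# Usage: a frame with 66 header bytes followed by an HTTP request line.
example_frame = bytes(PAYLOAD_OFFSET) + b"GET /a.htm"
print(is_latency_critical(example_frame))
```

Counting the frames for which this check succeeds over a short window would give the latency-critical request arrival rate that the slides use to detect a surge.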

  21. NCAP Power Management
     • low latency-critical request arrival rates & low Tx interval are detected
       o i.e., processing for bursty requests is done
     • NIC will notify CPU by sending an interrupt to:
       o reduce cores frequency
       o enable menu governor
     [Figure: cores C0 and C1 shown in Max Freq, Active, Idle, and Deep Idle states]

  22. NCAP Power Management
     • low latency-critical request arrival rates & low Tx interval are detected
       o i.e., processing for bursty requests is done
     • NIC will notify CPU by sending an interrupt to:
       o reduce cores frequency
       o enable menu governor
     [Figure: cores C0 and C1 shown in Max Freq, Active, Idle, and Deep Idle states]
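The surge and relax decisions from the NCAP slides can be combined into a small two-state controller. This is only a sketch under stated assumptions: the thresholds, the class name, and the reading of the Tx condition as "transmit activity is low" are not given in the slides. The action lists are taken from the slides.

```python
# Sketch of the NCAP notify/relax decision as a two-state controller.
# Assumptions (not from the slides): threshold values are supplied by the
# caller, the Tx condition is read as "transmit rate is low", and the class
# and method names are hypothetical. The action strings follow the slides.

SURGE_ACTIONS = ("activate cores", "boost frequency", "disable menu governor")
RELAX_ACTIONS = ("reduce core frequency", "enable menu governor")


class NcapDecision:
    def __init__(self, high_req_rate, low_req_rate, low_tx_rate):
        self.high_req_rate = high_req_rate
        self.low_req_rate = low_req_rate
        self.low_tx_rate = low_tx_rate
        self.boosted = False

    def update(self, req_rate, tx_rate):
        """Return the actions the interrupt would request, or () if none."""
        if not self.boosted and req_rate >= self.high_req_rate:
            self.boosted = True
            return SURGE_ACTIONS
        if (self.boosted and req_rate <= self.low_req_rate
                and tx_rate <= self.low_tx_rate):
            self.boosted = False
            return RELAX_ACTIONS
        return ()
```

Using separate high and low thresholds keeps the controller from toggling on every small fluctuation, which matters here because each toggle costs a P/C-state transition.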

  23. Effectiveness of NCAP: Frequency change
     • Apache, med-load level
       o top: ondemand + menu
       o bottom: NCAP
     [Figure: two panels showing BW(rx) as utilization (0 to 1.0) and F(core) as frequency (0 to 3.5 GHz) vs. time (0.2 to 0.4 s)]

  24. Effectiveness of NCAP: Energy and Response Time
     • 95th response time normalized to SLA
     • energy consumption normalized to perf

       config       Idle?    Freq level
       perf         No       Highest
       ond          No       ondemand
       perf.idle    menu     Highest
       ond.idle     menu     ondemand
       ncap         NCAP     NCAP

     [Figure: normalized 95th response time (to SLA) and normalized energy (to "perf") for apache and memcached under each config, at low load and at med load, with the SLA violation level marked]

  25. Effectiveness of NCAP: Energy and Response Time
     • 95th response time normalized to SLA
     • energy consumption normalized to perf
     [Figure: same charts and config table as slide 24; annotations: 21%-49% lower energy, 20%-35% lower energy]

  26. Effectiveness of NCAP: Energy and Response Time
     • 95th response time normalized to SLA
     • energy consumption normalized to perf
     [Figure: same charts and config table as slide 24; annotation: 37%-69% lower energy]
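The two metrics plotted on the evaluation slides can be computed as in the sketch below. It assumes a nearest-rank definition of the 95th percentile and treats a normalized response time above 1.0 as an SLA violation; neither detail is stated in the slides, and the function names are hypothetical.

```python
# Sketch of the evaluation metrics named on the slides.
# Assumptions (not from the slides): nearest-rank percentile, and an SLA
# violation whenever the normalized 95th response time exceeds 1.0.


def percentile_95(samples):
    """Nearest-rank 95th percentile of a non-empty list."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n), 1-based
    return ordered[rank - 1]


def evaluate(resp_times, sla, energy, perf_energy):
    """Normalize response time to the SLA and energy to the perf config."""
    norm_resp = percentile_95(resp_times) / sla
    return {
        "norm_resp_time": norm_resp,
        "norm_energy": energy / perf_energy,
        "sla_violation": norm_resp > 1.0,
    }
```

Under this definition, a configuration is attractive when `norm_energy` is well below 1.0 while `sla_violation` stays false.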

  27. Conclusion
     • conventional power management policies cannot react to the high processing demand of OLDI applications in a timely manner
       o use of ondemand and/or menu governors
       o violation of SLA at unpredictable surge of requests
     • NCAP exploits the following to proactively make power management actions
       o packet reception/processing delay in the network stack
       o correlation between network traffic and core utilization
     • NCAP significantly reduces energy consumption at low to medium load levels w/o violating SLA
       o 37-69% lower processor energy consumption than a baseline server while satisfying SLA

  28. Thank You
