Addressing Shared Resource Contention in Multicore Processors via Scheduling

Explore how contention-aware scheduling and classification schemes can mitigate performance degradation in multicore processors caused by shared resources like the Last Level Cache (LLC). Learn about factors, scheduling algorithms, and evaluation on real systems.

  • Multicore Processors
  • Scheduling Algorithms
  • Resource Contention
  • Embedded Systems
  • Performance Degradation


Presentation Transcript


  1. Addressing Shared Resource Contention in Multicore Processors via Scheduling (snt2426@gmail.com, Embedded System Lab.)

  2. Index: 1. Introduction, 2. Classification Schemes, 3. Factors Causing Performance Degradation, 4. Scheduling Algorithms, 5. Evaluation on Real Systems

  3. Introduction: Figure 1 shows the performance degradation that occurs when an application shares an LLC with another application, relative to running solo (contention-free).

  4. Previous approaches: cache partitioning and page coloring. These require non-trivial changes to the OS or hardware and copying of physical memory, and they address only contention for the shared cache.

  5. Classification Schemes: contention-aware scheduling = scheduling policy + classification scheme. The offline "perfect" scheduling policy proposed by Jiang serves as the baseline. Classification schemes compared: SDC vs. animal classes vs. miss rate vs. the Pain methodology. Evaluation method: the best-case schedule vs. the estimated best schedule produced by each classification scheme.
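The Pain methodology scores a co-schedule by combining two per-application metrics: how sensitive an application is to cache interference, and how much interference (intensity) its co-runner generates. The following is a minimal sketch of that idea only; the function names and the example numbers are hypothetical, and the paper derives sensitivity and intensity from stack distance profiles rather than taking them as given.

```python
def pain(sensitivity, intensity):
    """Pain(A | B): how much application A is expected to suffer when
    co-scheduled with B, modeled as A's cache sensitivity times B's
    cache-access intensity (a sketch of the idea, not the exact model)."""
    return sensitivity * intensity

def pair_pain(app_a, app_b):
    """Total pain of co-scheduling two applications on one shared cache.
    Each app is a (sensitivity, intensity) pair; both directions count,
    since each application hurts the other."""
    return pain(app_a[0], app_b[1]) + pain(app_b[0], app_a[1])
```

A scheduler using this scheme would enumerate candidate pairings and pick the schedule with the lowest total pain.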

  6. Classification Schemes: the weaker schemes fall short because they do not take the miss rate into account. The schemes are built from stack distance profiles (collected with the Pin tool). Contention also arises for other shared resources: the memory controller, the memory bus, and the resources involved in prefetching.
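A stack distance profile records, for each LRU stack position, how many accesses hit at that reuse distance; accesses that would hit only at a position deeper than the cache's associativity become misses. A minimal sketch of deriving a miss rate from such a profile (the function name and the sample numbers are hypothetical):

```python
def miss_rate_from_sdp(sdp_hits, beyond_hits, associativity):
    """Estimate a cache miss rate from a stack distance profile.

    sdp_hits[i]   -- number of accesses that hit at LRU stack position i
    beyond_hits   -- accesses whose reuse distance exceeds the profile
    associativity -- number of ways in the target cache

    Accesses hitting at a position >= associativity would miss in an
    `associativity`-way cache, so they count as misses."""
    total = sum(sdp_hits) + beyond_hits
    hits = sum(sdp_hits[:associativity])
    return (total - hits) / total
```

For example, with a profile of [50, 30, 10, 5] hits, 5 deeper accesses, and a 2-way cache, 80 of 100 accesses hit.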

  7. Factors Causing Performance Degradation: cache space contention is by far not the dominant cause of performance degradation. Nevertheless, the miss rate turned out to be an excellent heuristic for contention.

  8. Factors Causing Performance Degradation: prior work assumed that the dominant cause of performance degradation is contention for space in the shared cache (hence techniques such as cache partitioning and page coloring).

  9. Scheduling Algorithms: miss-rate classification scheme + centralized-sort scheduling policy. The miss rate is very easy to obtain online via hardware performance counters. The policy sorts applications by miss rate and distributes them across cores, implemented as a simple user-level scheduler (using the affinity interfaces provided by Linux). Two variants: 1. DI (Distributed Intensity) scheduler: estimates the miss rate offline from stack distance profiles. 2. DIO (DI Online) scheduler: measures the miss rate online, making it more resilient to changes in application behavior.
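The centralized-sort policy used by DI and DIO can be sketched as follows: sort applications by miss rate, then co-schedule the most cache-intensive remaining application with the least intensive one, so that each shared cache gets one aggressive and one quiet co-runner. This is a simplified sketch assuming two cores per shared cache and exactly 2 × n_caches applications; the names are illustrative, not the paper's code.

```python
def distributed_intensity(apps, n_caches):
    """Sketch of the centralized-sort pairing policy.

    apps     -- list of (name, miss_rate) pairs
    n_caches -- number of shared caches, each with two cores

    Sort by miss rate, then pair the k-th most intensive application
    with the k-th least intensive one, one pair per shared cache."""
    ranked = sorted(apps, key=lambda a: a[1], reverse=True)
    assignment = []
    for cache in range(n_caches):
        hot = ranked[cache]            # high miss rate
        cold = ranked[-(cache + 1)]    # low miss rate
        assignment.append((hot[0], cold[0]))
    return assignment
```

A user-level scheduler would then pin each pair to the cores sharing one cache, e.g. via Linux's sched_setaffinity interface; DIO would additionally re-read the miss rates from performance counters and re-run the pairing periodically.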

  10. Scheduling Algorithms: why does DI use miss rates? DI works because the miss rates of applications are relatively stable over time.

  11. Evaluation on Real Systems: measuring the aggregate workload completion time, DI and DIO perform better than RANDOM and are within 2% of OPTIMAL (13% in the former case, 4% in the isolated case). The biggest advantage of DI and DIO is that their results are stable: they avoid the worst case.

  12. Evaluation on Real Systems: on average, DEFAULT is sometimes below and sometimes above DIO, so DEFAULT is able to perform well on average. For individual applications, however, DIO outperforms DEFAULT.

  13. Evaluation on Real Systems: the deviation of execution times across consecutive runs of the same application in the same workload is DIO < DI < DEFAULT, i.e. DIO is the most stable.

  14. References 1. http://www.slideshare.net/davidkftam/rapidmrcpresentation 2. M. K. Qureshi and Y. N. Patt. Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches. In MICRO 39: Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture, pages 423-432, 2006. 3. S. Cho and L. Jin. Managing Distributed, Shared L2 Caches through OS-Level Page Allocation. In MICRO 39: Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture, pages 455-468, 2006.
