Assessing the Scalability of Garbage Collectors on Many Cores
This study evaluates the scalability of garbage collectors on multi-core systems, analyzing empirical results and factors affecting performance. Explore the impact of varying core counts and GC threads on application scalability, highlighting challenges and negative scalability trends.
Presentation Transcript
Assessing the Scalability of Garbage Collectors on Many Cores
(Funded by ANR projects: Prose and ConcoRDanT)
Lokesh Gidra, Julien Sopena, Gaël Thomas, Marc Shapiro
Regal-LIP6/INRIA
Introduction
Why?
- Managed runtime environments (MREs) are ubiquitous!
- GC is a vital component of an MRE, so its performance is critical.
- Hardware is more and more multi-resourced. Are GCs scaling with such hardware?
- Current solutions have not been evaluated on true many-core machines!
What?
- Assess GC scalability: empirical results.
- Identify possible factors affecting GC scalability.
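Multi-Node Architecture
[Figure: two of the machine's nodes, each showing cores C0, C1, ... C5 with L2 caches, an L3 cache, a memory controller (MC) and DRAM, plus links to other nodes; annotated with the values 15, 40, 125 and 315.]
- Remote access >> local access.
- Our machine has 8 nodes with 6 cores each.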
Parallel Copying Garbage Collection
[Figure: timeline of mutator threads and GC threads showing application time, pause time and total time; heap diagram with live and dead objects in from-space and to-space.]
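The from-space/to-space scheme named on this slide can be illustrated with a minimal, single-threaded sketch of copying collection. It is a generic Cheney-style copy written for illustration, not the collector measured in the talk; `Obj` and `collect` are hypothetical names, and a parallel collector would divide this same work among several GC threads.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(eq=False)
class Obj:
    """Hypothetical heap object: a name plus outgoing references."""
    name: str
    refs: list = field(default_factory=list)


def collect(roots):
    """Copy every object reachable from `roots` into a fresh to-space.

    Returns (new_roots, to_space). Objects that are never reached are
    not copied, which is how dead objects get reclaimed.
    """
    forwarding = {}    # id(from-space object) -> its to-space copy
    to_space = []      # copied objects, in copy order
    pending = deque()  # from-space objects copied but not yet scanned

    def copy(obj):
        # Copy an object once; later visits reuse the forwarding entry.
        key = id(obj)
        if key not in forwarding:
            clone = Obj(obj.name)
            forwarding[key] = clone
            to_space.append(clone)
            pending.append(obj)
        return forwarding[key]

    new_roots = [copy(r) for r in roots]
    while pending:
        # Scan: copy the children and point the clone at the copies.
        old = pending.popleft()
        forwarding[id(old)].refs = [copy(child) for child in old.refs]
    return new_roots, to_space


# Usage: `a` and `b` form a live cycle, `dead` is unreachable.
a, b, dead = Obj("a"), Obj("b"), Obj("dead")
a.refs = [b]
b.refs = [a]
new_roots, to_space = collect([a])
print([o.name for o in to_space])
```

Only the reachable objects end up in to-space. The cost of a collection therefore depends on how fast live objects can be scanned and copied, the two operations the following slides measure.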
GC's Effect on Application Scalability (Lusearch)
Setup: mutator threads = GC threads = number of cores, which is varied.
- Up to 6 cores: 3x performance improvement.
- More than 6 cores: no improvement in total time.
- The proportion of pause time increases up to 50%.
GC Scalability (Lusearch)
Setup: mutator threads = cores = 48, with a varying number of GC threads.
- Pause time increases with the number of GC threads: negative scalability!
1. Remote Scanning
[Figure: GC threads GC0 to GC3 and nodes 0 to 3; legend: live object, dead object, from-space, to-space.]
- Random (default) object allocation.
- 87.7% of scans were remote!
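A back-of-the-envelope check suggests why the remote fraction is so high. Assume, purely for illustration, that each object is equally likely to sit on any node (the slide only says that allocation is random by default). A GC thread then finds an object on its own node with probability 1/N. For the 8-node machine described earlier, that gives 7/8 = 87.5% remote scans, close to the measured 87.7%. The function names below are hypothetical.

```python
import random


def expected_remote_fraction(num_nodes):
    """Probability that a uniformly placed object is on another node."""
    return 1 - 1 / num_nodes


def simulated_remote_fraction(num_nodes, num_scans, seed=0):
    """Monte Carlo estimate of the same quantity."""
    rng = random.Random(seed)
    remote = 0
    for _ in range(num_scans):
        gc_node = rng.randrange(num_nodes)      # node running the GC thread
        object_node = rng.randrange(num_nodes)  # node holding the object
        remote += gc_node != object_node
    return remote / num_scans


print(expected_remote_fraction(8))  # 0.875
print(simulated_remote_fraction(8, 100_000))
```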
2. Remote Copying
[Figure: GC threads GC0 to GC3 and nodes 0 to 3; legend: live object, dead object, from-space, to-space.]
- 82.7% of copies were remote!
3. Load Balancing
- Based on the work-stealing technique.
- One task queue per GC thread.
  - Task queue owner: push and pop.
  - Other GC threads: steal (pop).
  - Shared variable: size (the task queue size).
- Highly unbalanced load: requires a lot of stealing, which continues until all threads are done.
- Performance impact: 2-4 cache misses per steal!
- Disabling work stealing improved pause time by 33.3%!
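The queue discipline described on this slide can be sketched as a small, deterministic simulation. This is an illustrative model only, not the collector's actual implementation: `TaskQueue` and `run` are hypothetical names, the workers take turns instead of running as real threads, and having the owner and the thieves work at opposite ends of the queue is a common design choice assumed here.

```python
import random
from collections import deque


class TaskQueue:
    """Hypothetical per-GC-thread queue."""

    def __init__(self):
        self._tasks = deque()

    @property
    def size(self):
        # Stands in for the shared `size` variable that thieves read.
        return len(self._tasks)

    def push(self, task):
        self._tasks.append(task)

    def pop(self):
        # Owner takes the most recently pushed task.
        return self._tasks.pop() if self._tasks else None

    def steal(self):
        # A thief takes the oldest task, from the opposite end.
        return self._tasks.popleft() if self._tasks else None


def run(queues, seed=0):
    """Round-robin simulation; returns (done, steals) per worker."""
    rng = random.Random(seed)
    n = len(queues)
    done = [0] * n
    steals = [0] * n
    while any(q.size for q in queues):
        for me in range(n):
            task = queues[me].pop()
            if task is None:
                # Own queue is empty: probe a random victim.
                victim = rng.randrange(n)
                if victim != me:
                    task = queues[victim].steal()
                    if task is not None:
                        steals[me] += 1
            if task is not None:
                done[me] += 1
    return done, steals


# Usage: a highly unbalanced load, with all 100 tasks on worker 0.
queues = [TaskQueue() for _ in range(4)]
for t in range(100):
    queues[0].push(t)
done, steals = run(queues)
print(done, steals)
```

With all the work on one queue, the other workers can only make progress by stealing. Each attempt touches another thread's queue state, which is the kind of access the slide associates with cache misses.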
Conclusion
- GC does affect the application's scalability: it matters!
- GC doesn't scale with the hardware!
- Bottlenecks:
  - Remote scanning
  - Remote copying
  - Load balancing
- Future work: fix the bottlenecks. Does that help GC to scale?
DaCapo Benchmarks Scalability
Revisiting App. (Lusearch) Scalability