Two Threads Are Better Than One

Craig Hodgins, zSeries Performance Engineer at Royal Bank of Canada, describes the bank's experience enabling SMT2 (simultaneous multithreading with two threads per core) on zIIP processors. The presentation covers what SMT2 is, how to roll it out and verify it, the new capacity metrics it introduces, why an individual thread runs slower, and the zIIP capacity benefit RBC actually measured.

  • Performance Engineer
  • Threads
  • Royal Bank of Canada
  • Optimization

Uploaded on Feb 17, 2025 | 0 Views


Presentation Transcript


  1. Two Threads Are Better Than One. Craig Hodgins, zSeries Performance Engineer, Royal Bank of Canada

  2. What is SMT2?
       • SMT2 is Simultaneous Multithreading x2
       • A CPU is now called a core
       • An instruction stream is now called a thread
       • SMT2 allows 2 threads to execute on one zIIP core

  3. Why SMT2?
       • Processor speeds are approaching their physical limits
       • SMT2 is an attempt to use parallelism to increase capacity

  4. Single thread versus SMT2
       • One thread per core: faster execution but lower throughput
       • Two threads per core (SMT2): slower execution but higher throughput

  5. SMT2 Requirements
       • Enabled
       • Turned ON

  6. Roll Out Methodology
       • New system measurement metrics may affect performance tools, capacity planning, and chargeback reporting (for example RMF, MXG, TDS)
       • It is desirable to detect and assess any measurement impacts as early as possible on test systems before rolling out to production [sysprog/dev/test/prod]
       • Example: APAR OA47662 (last changed 15/08/07). Problem description: RMF Monitor III PROC and PROCU reports lose precision for the APPL% and EAPPL% fields when running in PROCVIEW CORE mode and MT_1 mode only

  7. Rollout Methodology
       • Enabling at least one LPAR per production sysplex, choosing LPARs with different characteristics and workload mixes, would be useful
       • In other words, don't do the whole sysplex at one time
       • I created a spreadsheet to track the project

  8. SMT2 Verification
       • Review messages after SET OPT=xx
       • Review SDSF
       • Review RMF

  9. Messages After SET OPTxx
       00:27:05 E SET OPT=MH
       00:27:05 E IEE252I MEMBER IEAOPTMH FOUND IN SYS1.PARMLIB
       00:27:05 E IEE536I OPT VALUE MH NOW IN EFFECT
       00:27:06 E IWM066I MT MODE CHANGED FOR PROCESSOR CLASS zIIP. THE MT MODE WAS CHANGED FROM 1 TO 2.
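
The verification step on this slide can be automated when many LPARs are being converted. The following is a minimal Python sketch, not part of the original presentation: it assumes the console output has been captured as plain text and that the IWM066I message has exactly the wording shown on the slide. The function names `mt_mode_changes` and `smt2_active_for` are hypothetical.

```python
import re

# Console lines copied from the slide; a real check would read a captured log.
CONSOLE_LOG = """\
00:27:05 E SET OPT=MH
00:27:05 E IEE252I MEMBER IEAOPTMH FOUND IN SYS1.PARMLIB
00:27:05 E IEE536I OPT VALUE MH NOW IN EFFECT
00:27:06 E IWM066I MT MODE CHANGED FOR PROCESSOR CLASS zIIP. THE MT MODE WAS CHANGED FROM 1 TO 2.
"""

# Assumption: the message text matches the wording shown on the slide.
IWM066I = re.compile(
    r"IWM066I MT MODE CHANGED FOR PROCESSOR CLASS (?P<cls>\w+)\. "
    r"THE MT MODE WAS CHANGED FROM (?P<old>\d+) TO (?P<new>\d+)\."
)


def mt_mode_changes(log_text):
    """Return (processor_class, old_mode, new_mode) for each IWM066I message."""
    return [
        (m.group("cls"), int(m.group("old")), int(m.group("new")))
        for m in IWM066I.finditer(log_text)
    ]


def smt2_active_for(log_text, processor_class="zIIP"):
    """True if the last logged MT mode change for the class ended in mode 2."""
    changes = [c for c in mt_mode_changes(log_text) if c[0] == processor_class]
    return bool(changes) and changes[-1][2] == 2


if __name__ == "__main__":
    print(mt_mode_changes(CONSOLE_LOG))
    print(smt2_active_for(CONSOLE_LOG))
```

A log that contains no IWM066I message yields `False`, which is the case worth flagging after a SET OPT=xx command: the member was found and activated, but the MT mode did not change.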

  10. SDSF D M=CPU

  11. RMF CPC Report

  12. New Metrics
       • MT-2 MAX CF (Capacity Factor) is the ratio of the maximum amount of work that can be accomplished using 2 threads to the amount of work that would have been accomplished with 1 thread
       • MT-2 MAX CF is workload dependent (the maximum value is 2, and IBM expects average values of about 1.4)
       • MT-2 CF is the ratio of the maximum amount of work that has been accomplished using 1 or 2 threads to the amount of work that would have been accomplished with multithreading turned off
       • Average Thread Density shows the average number of threads that were simultaneously active in the measured interval

  13. SMT2 Benefits
       • SMT delivers more throughput per core, and therefore more capacity
       • Less power and cooling are required per unit of capacity
       • But an individual SMT2 thread is slower than a single thread would be (we'll see why in a minute)
       • If an SMT2 core provides 140% of the capacity of a single thread, then two threads will (on average) each run at 70% of the single-thread speed when both threads are active
       • Increased sharing of low-level resources by threads makes the amount of work that a thread can do dependent on what else the core is doing
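
The 140% and 70% figures on this slide follow from dividing the core's capacity factor by the number of active threads. The sketch below works through that arithmetic in Python. It is an illustration only: the function names are hypothetical, and it assumes the capacity is split evenly between the two threads, which the slide states holds only on average.

```python
def per_thread_speed(capacity_factor, active_threads=2):
    """Average speed of each active thread relative to single-thread speed.

    capacity_factor: core capacity with SMT2 relative to one thread
    (1.4 means 140%). Assumes an even split between active threads.
    """
    if not 0 < capacity_factor <= 2:
        raise ValueError("capacity factor must be in (0, 2] for SMT2")
    if active_threads not in (1, 2):
        raise ValueError("SMT2 runs 1 or 2 threads per core")
    return capacity_factor / active_threads


def stretched_cpu_time(single_thread_seconds, capacity_factor):
    """Time for the same work on one SMT2 thread while both threads are busy."""
    return single_thread_seconds / per_thread_speed(capacity_factor)


if __name__ == "__main__":
    print(per_thread_speed(1.4))                      # 0.7
    print(round(stretched_cpu_time(10.0, 1.4), 1))    # 14.3
```

In this example, work that takes 10 seconds on a dedicated core takes about 14.3 seconds on one SMT2 thread, yet the core as a whole completes 40% more work in the same interval.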

  14. What Causes the Slowdown?
       • A major cause is the sharing of processor cache
       • On recent System z processors, there are two levels of cache that are private to each core (L1 and L2)
       • If a core has more than one thread, these caches are shared across both threads
       • Each thread is forced to get by with a smaller footprint in these caches, and so incurs more L1 and L2 misses than if the caches were not shared
       • Other resources must also be shared:
           • The execution pipes
           • The translation lookaside buffer (TLB)
           • Physical general purpose registers
           • Store buffers and other resources on the core

  15. What to Expect
       • Actual throughput for SMT2 can range from less than 100% to close to 200%, depending upon the usage of the shared resources
       • If programs running on the same core use the same resources (competing), they will run slower than before
       • If programs use different resources (complementary), they can run close to the ideal maximum speed
       • Running the same application multiple times shows less repeatable CPU usage, because each run may share the core with different work

  16. What Did RBC See? (using 3 LPARs as a sample)
       • There was no noticeable response time or task delay impact from the slower SMT2 zIIP threads
       • There was no change in zIIP CPU consumption or chargeback volume
       • We realized approximately a 10% reduction in relative physical zIIP utilization on a large LPAR, but only a 3% reduction on smaller LPARs
       • The overall weighted zIIP capacity utilization benefit from SMT2 across all large and small LPARs was about 8% (compare with IBM's claim of an expected 25%-40% zIIP capacity benefit from SMT2)
       • No major issues (23 LPARs converted, with 17 left to go)
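
The "overall weighted" figure on this slide is a weighted average of the per-LPAR reductions. The slide does not give the weights, so the Python sketch below only illustrates the arithmetic: it treats the LPARs as two groups (large at 10%, small at 3%) and derives the share of total weight the large group would need for the overall result to come out at 8%. The two-group simplification and the function names are assumptions, not RBC's actual method.

```python
def weighted_benefit(groups):
    """Weighted average of per-group utilization reductions.

    groups: iterable of (weight, reduction_percent) pairs.
    """
    groups = list(groups)
    total_weight = sum(weight for weight, _ in groups)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(weight * reduction for weight, reduction in groups) / total_weight


def implied_large_share(large, small, overall):
    """Share of total weight the large group must hold to yield `overall`."""
    if large == small:
        raise ValueError("group reductions must differ")
    return (overall - small) / (large - small)


if __name__ == "__main__":
    share = implied_large_share(large=10.0, small=3.0, overall=8.0)
    print(round(share, 3))  # 0.714
    print(round(weighted_benefit([(share, 10.0), (1 - share, 3.0)]), 1))  # 8.0
```

Under these assumptions, the large LPARs would carry roughly 71% of the weight, which is consistent with the overall figure sitting much closer to 10% than to 3%.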

  17. What Does the Future Hold?
       • Other platforms have had SMTx for years
       • IBM currently only supports SMT2 on a zIIP
       • IBM future support?

  18. Considerations
       • Vendors need to catch up with SMT2:
           • IBM (RMF PTF)
           • MXG
       • You may have to make reporting changes internally

  19. Recommendations / Summary
       • SMT2 should be explored in order to exploit capacity and throughput improvements on a z13
       • Enable SMT2 in a formal and controlled manner
       • Compare before/after metrics carefully
       • Workload drives results/benefits
       • Your mileage will vary

  20. References
       • Various CMG and SHARE papers available on the Internet
       • IBM marketing/technical material
       • EPV white papers
       • Google "SMT2 z13"

  21. Q&A and Discussion
       • Are you on z13 boxes?
       • Has your company implemented SMT2?
       • If not, why not?
       • If so, what did you see?
