QoPS: A QoS based Scheme for Parallel Job Scheduling

QoPS: A QoS based Scheme for Parallel Job Scheduling
Slide Note
Embed
Share

Study on independent parallel job scheduling models, resource mapping, backfilling techniques, and differentiated service classes. Focus on achieving soft and hard real-time guarantees based on user-specified deadlines and resource usage cost models.

  • Job scheduling
  • Parallel computing
  • QoS
  • Real-time guarantees
  • Resource mapping

Uploaded on Feb 17, 2025 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. QoPS: A QoS based Scheme for Parallel Job Scheduling M. Islam P. Balaji P. Sadayappan and D. K. Panda Computer and Information Science The Ohio State University Presented by Gerald Sabin

  2. Job Schedulers Today Independent Parallel Job Scheduling Model Dynamically arriving Independent Parallel Jobs Resource mapping: Submitted Jobs to Resources present Number of Techniques studied over the years Backfilling (Ex: Conservative, EASY, No Guarantee) Priority based scheduling Differentiated service to different classes of jobs Soft Real-time or Best Effort guarantees to the completion time Hard Real-Time or Deadline-based scheduling Allow Users to specify the deadline they desire Cost model based on Resources Used AND Deadline Specified Requires a deadline-based scheduling algorithm: LONG OVERDUE ! 06/24/2003 The Ohio State University 2

  3. QoS for Job Scheduling Two Components in providing QoS Cost Model Component Based on Resources Used AND Deadline Specified More urgent jobs are charged more Guarantees the service requested Job Scheduling Component Admission Control Can we meet the specified deadline? Once admitted, cannot miss the specified deadline We only deal with the Job Scheduling Component 06/24/2003 The Ohio State University 3

  4. Overview Related Work The QoPS Algorithm Simulation Approach Experimental Results Conclusions and Future Work 06/24/2003 The Ohio State University 4

  5. Related Work Feitelson s Slack-Based (SB) Scheduling [feit97] Focused on improving Utilization and Turnaround time Jobs have an associated slack, based on their priority This determines how much they can be delayed Ramamritham s Real-Time (RT) Scheduling [krithi90] Deadline-based scheduling algorithm Non-periodic Single Processor Jobs Statically available at start time [feit97]: Supporting Priorities and Improving Utilization of the IBM SP2 Scheduler using Slack based Backfilling , D. Talby, D. G. Feitelson, IPPS, Apr 97 [krithi90]: Efficient Scheduling Algorithms for Real-Time Multiprocessor Systems , K. Ramamritham, J. A. Stankovic, P-F. Shiah, TPDS, Apr 90 06/24/2003 The Ohio State University 5

  6. Slack-Based (SB) Scheduling Algorithm When a job (JN+1) arrives Calculate its slack (based on its priority) If J1, J2, , JN are already present and scheduled in that order Try placing the job (JN+1) in each possible position in this list For each of the N+1 schedules feasible, calculate a cost function f A schedule is feasible if no job exceeds the slack given to it Choose the schedule with the best cost function value fbest f0 f1 f2 f3 f4 f5 f6 J7 J6 J6 J6 J6 J5 J5 J5 J5 J4 J4 J4 J4 J3 J3 J3 J3 J2 J2 J2 J7 J1 J1 J7 J2 J7 J1 J1 Cost Function Evaluation Cost Function Evaluation Cost Function Evaluation 06/24/2003 The Ohio State University 6

  7. Real-Time (RT) Scheduling Algorithm Static Scheme, so there s no concept of new jobs arriving Sort jobs based on a heuristic function Start from a NULL schedule For each of the jobs If placing the job in the current schedule misses its deadline Backtrack to the last known feasible schedule If (number of backtracks > p) Discard the Schedule If all jobs have been placed within their deadlines Accept the Schedule 06/24/2003 The Ohio State University 7

  8. Working of the RT Algorithm Sorted by Earliest Deadline first (EDF) JN JN-1 JN-2 . . . J3 J2 J1 NULL J1 J3 J2 J4 J2 J3 J4 06/24/2003 The Ohio State University 8

  9. Modified Slack-Based (MSB) Algorithm Modified Slack-Based (MSB) Algorithm Motivation of SB: To improve Utilization and Response Time SB assigns slack to jobs based on job priorities MSB assigns slack to jobs based on the deadline specified Rest of the algorithm is unchanged 06/24/2003 The Ohio State University 9

  10. Modified Real-Time (MRT) Algorithm Modified Real-Time (MRT) Algorithm RT was designed for non-periodic uni-processor jobs All jobs are Statically available at the start of the execution MRT involves two modifications to RT To allow dynamically arriving jobs Run the algorithm every time a job arrives To allow scheduling of parallel jobs Allowing backfilling of jobs 06/24/2003 The Ohio State University 10

  11. Overview Related Work The QoPS Algorithm Simulation Approach Experimental Results Conclusions and Future Work 06/24/2003 The Ohio State University 11

  12. The Basic QoPS Algorithm Similar to the MSB algorithm, but Provides more flexibility in reordering scheduled jobs When a job (JN+1) arrives If J1, J2, , JN are already present and scheduled in that order Place the job (JN+1) at the start of all jobs Try scheduling the jobs in that order If all jobs are able to meet their deadlines, Great ! Admit it ! If some job fails, we have two options: Option1: Consider the failed job as a critical job Push the failed job to the start of the schedule and retry k number of such re-orderings of existing jobs are allowed If (number of re-orderings > k) switch to option 2 Option2: Back off exponentially in the position at which you try placing job (JN+1) and retry 06/24/2003 The Ohio State University 12

  13. Working of the QoPS Algorithm J13 J12 J11 J10 J9 J8 J7 J6 J13 J5 J6 J4 J5 J3 J2 J1 J1 J2 J1 J13 J3 J1 J13 J3 J2 J13 J3 J2 J4 Current Violations = 0 Current Violations = 1 Current Violations = 2 Current Violations = 0 J4 J3 J2 J1 J2 J1 J13 J3 J2 J1 J13 J3 J13 Max. Violations Allowed = 2 06/24/2003 The Ohio State University 13

  14. Overview Related Work The QoPS Algorithm Simulation Approach Experimental Results Conclusions and Future Work 06/24/2003 The Ohio State University 14

  15. Simulation Approach CTC/SDSC Trace Load Variation Duplication/Expansion Deadline Calculator Deadline-based Trace QoPS Simulation MSB MRT EASY Simulation Simulation Simulation 06/24/2003 The Ohio State University 15

  16. Trace Generation Many job logs available, but no associated deadlines Synthetic Deadline Generation Generate a schedule for the job trace using EASY For any job J, if the Turnaround time in this schedule is T Deadline for J = Arrival Time + max (runtime, (1-SF) x T) SF is the Stringency factor (0 < SF < 1) 0 would give the least stringent deadlines and 1 the most stringent Some jobs might not come with deadlines Very lax deadlines to prevent starvation If T is the current expected Turnaround time, Deadline = Arrival Time + max (24hrs, R x T) R is the Relaxation Factor of the schedule 06/24/2003 The Ohio State University 16

  17. Overview Related Work The QoPS Algorithm Simulation Approach Experimental Results Conclusions and Future Work 06/24/2003 The Ohio State University 17

  18. Experimental Results Two evaluation scenarios Scenario1: All jobs have deadlines Pure comparison of the three algorithms Scenario2: Mixed jobs: Some have deadlines, others are artificially provided More realistic Tests Conducted: Job Acceptance rate Impact on Non-deadline Jobs Utilization Variation, etc 06/24/2003 The Ohio State University 18

  19. Admittance Capacity Comparison Unadmitted Jobs Vs Load Unadmitted Proc. Secs Vs Load 25 30 QoPS MRT MSB % Unadmitted Proc. Secs. 25 20 % Unadmitted Jobs 20 15 15 10 10 QoPS MRT MSB 5 5 0 0 1 1.1 1.2 1.3 1.4 1.5 1.6 1 1.1 1.2 1.3 1.4 1.5 1.6 Load (Duplication) Load (Duplication) All jobs have deadlines; Stringency Factor = 0.2; CTC Trace QoPS admits the most number of jobs (and Processor Seconds) 06/24/2003 The Ohio State University 19

  20. Utilization Comparison Utilization Vs Load (Stringency Factor = 0.2) Utilization Vs Load (Stringency Factor = 0.5) 0.9 0.9 QoPS MRT MSB EASY 0.8 0.8 0.7 0.7 0.6 0.6 Utilization Utilization 0.5 0.5 0.4 0.4 0.3 0.3 QoPS MRT MSB EASY 0.2 0.2 0.1 0.1 0 0 1 1.1 1.2 1.3 1.4 1.5 1.6 1 1.1 1.2 1.3 1.4 1.5 1.6 Load (Duplication) Load (Duplication) All jobs have deadlines; CTC Trace Deadline-based schemes lose about 10% Utilization 06/24/2003 The Ohio State University 20

  21. Admittance Capacity Comparison (Mixed Jobs) Unadmitted Jobs Vs Load Unadmitted Proc. Secs Vs Load 12 9 QoPS MRT MSB QoPS MRT MSB 8 % Unadmitted Proc. Secs. 10 7 % Unadmitted Jobs 8 6 5 6 4 4 3 2 2 1 0 0 1 1.1 1.2 1.3 1.4 1.5 1.6 1 1.1 1.2 1.3 1.4 1.5 1.6 Load (Duplication) Load (Duplication) 20% jobs have deadlines; Stringency Factor = 0.2; CTC Trace QoPS admits the most number of jobs (and Processor Seconds) 06/24/2003 The Ohio State University 21

  22. Response Time and Slow Down Vs Load Response Time Vs Load Slow Down Vs Load 45 35000 QoPS MRT MSB EASY 40 30000 QoPS MRT MSB EASY Response Time (secs) 35 25000 30 Slow Down 20000 25 20 15000 15 10000 10 5000 5 0 0 1 1.1 1.2 1.3 1.4 1.5 1.6 1 1.1 1.2 1.3 1.4 1.5 1.6 Load (Duplication) Load (Duplication) 20% jobs have deadlines; Stringency Factor = 0.2; CTC Trace QoPS gives the best slow-down in spite of accepting more jobs; Unfair to EASY 06/24/2003 The Ohio State University 22

  23. Utilization Vs Load (Mixed Jobs) Utilization Vs Load 0.85 0.8 0.75 0.7 QoPS MRT MSB EASY Utilization 0.65 0.6 0.55 0.5 0.45 0.4 1 1.1 1.2 1.3 1.4 1.5 1.6 Load EASY has a higher Utilization Accepts more (all) jobs; Unfair to the deadline-based schemes 06/24/2003 The Ohio State University 23

  24. Response Time and Slow Down Vs Utilization 20% jobs have deadlines; Stringency Factor = 0.2; CTC Trace Fairer Comparison; QoPS still performs better in most cases, especially Slow Down 06/24/2003 The Ohio State University 24

  25. Overview Related Work The QoPS Algorithm Simulation Approach Experimental Results Conclusions and Future Work 06/24/2003 The Ohio State University 25

  26. Conclusions Deadline-based scheduling is desirable No such scheme for parallel jobs Previous schemes can be extended, but Not proposed for this kind of scheduling Might not fit in perfectly Proposed the QoPS algorithm Allows jobs to specify required deadlines Admission control checks admissibility Job Scheduler schedules admitted jobs Outperforms extended previous schemes (MSB and MRT) But, the main idea is not performance Deadline Scheduling is a necessity and QoPS is an effort to meet it 06/24/2003 The Ohio State University 26

  27. Future Work Cost Metric component in QoS Currently using a first fit mechanism Best fit is expected to do much better Job Shedding Vs Non Job Shedding If deadline can t be met Should we reject the job (will the user try again?) Should we give it the best available deadline Grid based extensions to QoPS 06/24/2003 The Ohio State University 27

  28. Thank You ! 06/24/2003 The Ohio State University 28

  29. Backup Slides

  30. Admittance Capacity for SDSC trace Unadmitted Jobs Vs Load Unadmitted Proc. Secs Vs Load 25 25 QoPS MRT MSB % Unadmitted Proc. Secs. 20 20 % Unadmitted Jobs 15 15 10 10 QoPS MRT MSB 5 5 0 0 1 1.1 1.2 1.3 1.4 1.5 1.6 1 1.1 1.2 1.3 1.4 1.5 1.6 Load (Duplication) Load (Duplication) All jobs have deadlines; Stringency Factor = 0.2; CTC Trace QoPS admits the most number of jobs (and Processor Seconds) 06/24/2003 The Ohio State University 30

  31. Admittance Capacity with Job Expansion Unadmitted Jobs Vs Load Unadmitted Proc. Secs Vs Load 9.5 22 QoPS MRT MSB 9 21 % Unadmitted Proc. Secs. 8.5 % Unadmitted Jobs 20 8 19 7.5 7 18 6.5 17 QoPS MRT MSB 6 16 5.5 5 15 1 1.1 1.2 1.3 1.4 1.5 1.6 1 1.1 1.2 1.3 1.4 1.5 1.6 Load (Expansion) Load (Expansion) All jobs have deadlines; Stringency Factor = 0.2; CTC Trace QoPS admits the most number of jobs (and Processor Seconds) 06/24/2003 The Ohio State University 31

  32. Impact of Relaxation Factor Average Response Time Vs Load Average Slow Down Vs Load 35000 14 Average Response Time (secs) Factor = 2 Factor = 5 Factor = 10 Factor = 2 Factor = 5 Factor = 10 30000 12 Average Slow Down 25000 10 20000 8 15000 6 10000 4 5000 2 0 0 1 1.1 1.2 1.3 1.4 1.5 1.6 1 1.1 1.2 1.3 1.4 1.5 1.6 Load (Duplication) Load (Duplication) 80% jobs have deadlines; Stringency Factor = 0.2; CTC Trace With low R , Longer jobs perform better (reflects in Resp. Time) 06/24/2003 The Ohio State University 32

Related


More Related Content