Adaptive Random Test Case Prioritization Techniques

Slide Note

This content discusses the importance of test case prioritization in software testing, including coverage-based prioritization techniques and the challenges faced in real-life programs. It also explores adaptive random test case prioritization and its impact on fault detection and code coverage goals.

sriy Follow

Uploaded on Apr 12, 2025 | 1 Views

Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

Download Presentation

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript

Adaptive Random Test Case Prioritization Speaker: Bo Jiang* Co-authors: Zhenyu Zhang*, W.K.Chan , T.H.Tse* *The University of Hong Kong City University of Hong Kong 1

Contents Background Motivation Adaptive Random Test Case Prioritization Experiments and Results Analysis Related Works Conclusion & Future work 2

Regression Testing Techniques Obsolete Test Case Elimination Program P Test Suite T Test Suite T Test Case Reduction Accounts for 50% of the cost of software maintenance. Test Case Augmentation Test Case Selection Test Case Prioritization Test Suite T Test Suite T Test Suite T Program P Test Suite T 3

Test Case Prioritization Definition Test case prioritization permutes a test suite T for execution to meet a chosen testing goal. Typical testing goals Rate of code coverage Rate of fault detection Rate of requirement coverage Merits No impact on the fault detection ability 4

Coverage-based Test Case Prioritization Technique Total-statement/function/branch Highest code coverage first Resolve tie-case randomly Additional-statement/function/branch Additional highest code coverage first Reset when no more coverage can be achieved Resolve tie-case randomly Disadvantages Hard to scale to larger programs 5

Contents Background Motivation Adaptive Random Test Case Prioritization Experiments and Results Analysis Related Works Conclusion & Future work 6

Problem With Total Techniques GREP FLEX APFD Elbaum et al. @ TSE 2002 7

Problem With Total(greedy) Techniques GREP FLEX APFD Total strategy may NOT be effective for real-life program Elbaum et al. @ TSE 2002 8

Problems with Additional Techniques 45 40 35 Time Used for Prioritization 30 25 20 15 10 5 0 1 2 3 4 5 6 Total Siemens Total Unix Random Unix Additional Siemens Additional Unix Random Siemens 9

Problems with Additional Techniques 45 40 Additional Techniques may NOT be efficient for real-life programs. 35 Time Used for Prioritization 30 25 20 15 10 5 0 1 2 3 4 5 6 Total Siemens Total Unix Random Unix Additional Siemens Additional Unix Random Siemens 10

Problems with Additional Techniques 45 40 35 Time Used for Prioritization 30 Can we find a prioritization techniques that is both effective and efficient for real life program? 25 20 15 10 5 0 1 2 3 4 5 6 Total Siemens Total Unix Random Unix Additional Siemens Additional Unix Random Siemens 11

Adaptive Random Testing (ART) Adaptive Random Testing (ART) A technique for test case generation Evenly spread randomly generated test cases across the input domain. In empirical study, ART can detect failures using up to 50% fewer test cases than random testing. 12

Fixed-Sized-Candidate-Set ART Algorithm Random generate a test case and execute it. 13

Fixed-Sized-Candidate-Set ART Algorithm Randomly generate a set of candidate test cases. 14

Fixed-Sized-Candidate-Set ART Algorithm For each candidate test case, find its nearest neighbor within the executed test cases. 15

Fixed-Sized-Candidate-Set ART Algorithm Select the test case which has longest distance with its nearest neighbor and execute it. 16

Fixed-Sized-Candidate-Set ART Algorithm Randomly generate a set of candidate test cases. 17

Fixed-Sized-Candidate-Set ART Algorithm For each candidate test case, find its nearest neighbor within the executed test cases. 18

Fixed-Sized-Candidate-Set ART Algorithm For each candidate test case, find its nearest neighbor within the executed test cases. 19

Fixed-Sized-Candidate-Set ART Algorithm For each candidate test case, find its nearest neighbor within the executed test cases. 20

Fixed-Sized-Candidate-Set ART Algorithm For each candidate test case, find its nearest neighbor within the executed test cases. 21

Fixed-Sized-Candidate-Set ART Algorithm For each candidate test case, find its nearest neighbor within the executed test cases. 22

Fixed-Sized-Candidate-Set ART Algorithm Select the test case which has longest distance with its nearest neighbor and execute it. 23

Fixed-Sized-Candidate-Set ART Algorithm Repeat until a failure is encountered. X 24

Adaptive Random Testing (ART) ART is based on the observation that failure turned to cluster across the input domain. Intuitively, evenly spread the test case may increase the probability of exposing the first fault faster. In test case prioritization, we also want to increase the rate of fault detection. 25

Use ART directly for test case prioritization? The variety of black-box input information makes it hard to define a general distance metric. Video streams Images Xml The white-box coverage information of the previously executed test cases are readily available Statement coverage Branch coverage Function coverage And 26

Distribution of Failures in Profile Space on LilyPond William Dickinson et al. @ FSE, 2001. 27

MDS Display of Distribution of Failures in Profile Space on LilyPond Failures tend to cluster together. William Dickinson et al. @ FSE, 2001. 28

MDS Display of Distribution of Failures in Profile Space on GCC William Dickinson et al. @ FSE, 2001. 29

Distribution of Failures in Profile Space on GCC Failures tend to cluster together. William Dickinson et al. @ FSE, 2001. 30

Use ART directly for test case prioritization? Why NOT use such low-cost white-box information to evenly spread test cases across the code coverage space? The variety of black-box input information makes it hard to define a uniform distance metric. Video streams Images Xml The white-box coverage information of the previously executed test cases are readily available Statement coverage Branch coverage Function coverage 31

Contents Background Motivation Adaptive Random Test Case Prioritization Experiments and Results Analysis Related Works Conclusion & Future work 32

Adaptive Random Test Case Prioritization Generate candidate set Random select a test case into the candidate set If code coverage improve, continue; Otherwise, stop. Merits: No magic number, non-parametric Select the farthest candidate from the prioritized set Distance between test cases Distance between a candidate test case and the already prioritized test cases Repeat until all test cases are prioritized 33

Adaptive Random Test Case Prioritization How to measure the distance of test cases Jaccard Distance | 1 ) , ( 2 1 t t Jaccard = ( ) ( | / | ) 2 ( ) ( | ) 2 s t s t s t s t 1 1 General distance metric for binary data Can also use other distance metric for substitution. How to select the test case from the candidate set that is farthest away from the already prioritized test cases? Maximize the minimum distance (maxmin for short) Chen et al. @ ASIAN '04, LNCS 2004 Maximize the average distance (maxavg for short) Ciupa et al. @ ICSE 2008 Maximize the maximum distance (maxmax for short) 34

Contents Background Motivation Adaptive Random Test Case Prioritization Experiments and Results Analysis Related Works Conclusion & Future Work 35

Research Questions Do different levels of coverage information have significant impact on ART techniques? Do different definitions of test set distances have significant impacts on ART techniques? Are ART techniques efficient? 36

Subject Programs No. of Faulty Versions Subject LOC Test Pool Size tcas 41 133 137 1608 schedule 9 291 294 2650 schedule2 10 261 263 2710 tot_info 23 272 274 1052 print_tokens 7 341 342 4130 print_tokens2 10 350 354 4115 replace 32 508 515 5542 flex 21 8571 10124 567 grep 17 8053 9089 809 gzip 55 4081 5159 217 sed 17 4756 9289 370 37

Techniques Studied in the Paper Group Name Random random Descriptions Random prioritization Level of Coverage Info. statement function branch statement function branch total-st total-fn total-br addtl-st addtl-fn addtl-br Total Additional Test Set Distance (f2) Maximize minimum distance Maximize average distance Maximize maximum distance Maximize minimum distance Maximize average distance Maximize maximum distance Maximize minimum distance Maximize average distance Maximize maximum distance ART Level of Coverage Info. ART-fn-maxmin ART-fn-maxavg ART-fn-maxmax ART-br-maxmin ART-br-maxavg ART-br-maxmax ART-st-maxmin ART-st-maxavg ART-st-maxmax Function ART Branch Statement 38

Experiment Setup Dynamic coverage information collection gcov tool Effectiveness Metric APFD: weighted average of the percentage of faults detected over the life of the suite Process For each of the 11 subject programs, randomly select 20 test suite, and repeat 50 times for each ART techniques. 39

Research Questions Do different levels of coverage information have significant impact on ART techniques? Do different definitions of test set distances have significant impacts on ART techniques? Are ART techniques efficient? 40

Do different levels of coverage information have significant impact on ART techniques? Fix the other variable: definitions of test set distances. Perform multiple comparison between each pair of coverage information and gather the statistics. statement > function statement = function statement < function branch > statement branch = statement branch > function branch = function 41

Do different levels of coverage information have significant impact on ART techniques? Fix the other variable: definitions of test set distances. Perform multiple comparison between each pair of coverage information and gather the statistics. As confirmed by previous research: Branch > Statement > Function statement > function statement = function statement < function branch > statement branch = statement branch > function branch = function 42

Research Questions Do different levels of coverage information have significant impact on ART techniques? Branch > Statement > Function Do different definitions of test set distances have significant impacts on ART techniques? Is ART techniques efficient? 43

The Impact of Test Set Distance Fix the other variable: definitions of coverage information Perform multiple comparison between each pair of test set distance and gather the statistics. maxmin > maxavg maxmin = maxavg maxmin < maxavg maxavg > maxmax maxavg = maxmax maxavg < maxmax maxmin > maxmax maxmin = maxmax 44

The Impact of Test Set Distance Fix the other variable: definitions of coverage information Perform multiple comparison between each pair of test set distance and gather the statistics. Max-Min > Max-Avg Max-Max maxmin > maxavg maxmin = maxavg maxmin < maxavg maxavg > maxmax maxavg = maxmax maxavg < maxmax maxmin > maxmax maxmin = maxmax 45

Best ART Technique ART-br-maxmin is the best ART prioritization Technique 46

Research Questions Do different levels of coverage information have significant impact on ART techniques? Branch > Statement > Function Do different definitions of test set distances have significant impacts on ART techniques? Max-Min > Max-Avg > Max-Max How does ART-br-maxmin compare with greedy? Is ART techniques efficient? 47

Multiple Comparisons for ART-br-maxmin on Siemens random total-st total-fn total-br addtl-st addtl-fn addtl-br ART-st-maxavg ART-st-maxmin ART-st-maxmax ART-fn-maxavg ART-fn-maxmin ART-fn-maxmax ART-br-maxavg ART-br-maxmin ART-br-maxmax 0.8 0.82 0.84 0.86 0.88 0.9 0.92 6 groups have means significantly different from ART-br-maxmin 48

Multiple Comparisons for ART-br-maxmin on Siemens Only maginal difference difference between ART-br-maxmin and traditional coverage- based techniques, and it is not statistical significant. random total-st total-fn total-br addtl-st addtl-fn addtl-br ART-st-maxavg ART-st-maxmin ART-st-maxmax ART-fn-maxavg ART-fn-maxmin ART-fn-maxmax ART-br-maxavg ART-br-maxmin ART-br-maxmax 49 0.8 0.82 0.84 0.86 0.88 0.9 0.92 6 groups have means significantly different from ART-br-maxmin

Multiple Comparisons for ART-br-maxmin on UNIX random total-st total-fn total-br addtl-st addtl-fn addtl-br ART-st-maxavg ART-st-maxmin ART-st-maxmax ART-fn-maxavg ART-fn-maxmin ART-fn-maxmax ART-br-maxavg ART-br-maxmin ART-br-maxmax 0.4 0.5 0.6 0.7 0.8 0.9 1 8 groups have means significantly different from ART-br-maxmin 50

Adaptive Random Test Case Prioritization Techniques

Download Presentation

Presentation Transcript

Related

More Related Content