Automated Testing of Mobile Applications Using Reinforcement Learning
Mobile applications play a crucial role in modern daily life, necessitating thorough testing. GUI testing is a common approach, involving interactions on the device's screen to detect anomalies. This study applies reinforcement learning to automated testing of mobile applications, focusing on state definition, reward mechanisms, and learning methods. Reinforcement-learning-based testing rewards state transitions and learns effective search strategies to maximize those rewards, without depending on the accuracy of a statically constructed model.
Presentation Transcript
Applying Reinforcement Learning for Automated Testing of Mobile Applications: Focusing on State Definition, Reward, and Learning Method. Keita Murase, Shingo Takada (Keio University)
Introduction: Mobile applications are an important part of daily life, and they need to be tested just like any other software. GUI testing is one approach: actually performing operations on the screen displayed on the device and checking for anomalies.
GUI Testing: Android GUI testing finds bugs by conducting actions such as taps and swipes that lead to changes on the screen. Manual GUI testing is expensive, so automatic testing is needed: automatically conduct a series of actions.
Automatic GUI Testing: Random testing [2] chooses actions randomly; its issues are efficiency and stability. Model-based testing [3] statically analyzes code to obtain series of actions represented as graphs; its issue is constructing accurate graphs. We focus on reinforcement learning. [2] Monkey https://developer.android.com/studio/test/monkey [3] Shengqian Yang et al. Static Window Transition Graphs for Android. ASE 2015
Reinforcement-based Testing: Rewards are given for state transitions, and search strategies are learned to obtain large rewards. This does not depend on the accuracy of a model, since the strategy is based on the actual state transitions. Q-learning stores the value of states and actions in a table and updates it using rewards. Example Q-table:

| Action | Title screen | Menu screen |
|---|---|---|
| Tap | 5 | 5 |
| Swipe | 2 | 0 |
| Back | 0 | 1 |
Q-learning: The Q-function $Q(s_t, a_t)$ is the estimated reward when taking action $a_t$ at state $s_t$; it is stored in a table. Updating the Q-function: the reward function gives $r_t = R(s_t, a_t)$, and the estimated reward based on the reward function is $r_t + \gamma \max_{a} Q(s_{t+1}, a)$ ($\gamma$: discount rate). The update ($\alpha$: learning rate) is $Q(s_t, a_t) = (1 - \alpha)\,Q(s_t, a_t) + \alpha \left( r_t + \gamma \max_{a} Q(s_{t+1}, a) \right)$.
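To make the update concrete, here is a minimal tabular Q-learning sketch in Python. The QTable class and its layout are illustrative, and the ALPHA/GAMMA values are placeholders, not values taken from the paper.

```python
from collections import defaultdict

ALPHA = 0.5   # learning rate (alpha); illustrative value, not from the paper
GAMMA = 0.9   # discount rate (gamma); illustrative value

class QTable:
    def __init__(self):
        # q[state][action] -> estimated reward; rows are states, columns are actions
        self.q = defaultdict(dict)

    def update(self, s, a, reward, s_next, next_actions):
        # max over a' of Q(s_{t+1}, a'); 0.0 if the next state is unexplored
        future = max((self.q[s_next].get(a2, 0.0) for a2 in next_actions),
                     default=0.0)
        old = self.q[s].get(a, 0.0)
        # Q(s_t, a_t) = (1 - alpha) * Q(s_t, a_t) + alpha * (r_t + gamma * max Q)
        self.q[s][a] = (1 - ALPHA) * old + ALPHA * (reward + GAMMA * future)
```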
Related work (1): Qdroid [4]: the reward is large when the screen changes, making it easier to avoid meaningless actions that have no effect on the screen. ARES [5]: the reward is large when a screen change occurs, and a reward is also given when a bug is found. Issue: the Q-function may converge to a constant value, causing the same transition to be repeated. [4] Tuyet Vuong et al. Semantic Analysis for Deep Q-Network in Android GUI Testing. SEKE 2019 [5] Andrea Romdhana et al. Deep Reinforcement Learning for Black-Box Testing of Android Apps. ACM TOSEM 2021
Related work (2): Q-testing [6]: the reward changes with time and is large when a similar screen has not already been visited. Issue: learning may not converge when only the immediate reward is updated. [6] Minxue Pan et al. Reinforcement Learning Based Curiosity-Driven Testing of Android Applications. ISSTA 2020
Proposed approach: We focus on state, reward, and learning. State: based on the existence of each UI element, obtained via UIAutomator [7]. Reward: large when there are few transitions similar to past transitions; the reward changes dynamically. Learning: periodically iterate learning over finished transitions to make the Q-function converge. [7] UI Automator2 https://github.com/openatx/uiautomator2
State: Vectorize the information obtained from UIAutomator: the resource ID and the possible operations (clickable, scrollable, etc.). For example, two screens might be encoded as the vectors (0,1,1,1,1,1) and (1,1,1,1,1,1).
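As a sketch of how such vectors could be derived, the snippet below parses a UIAutomator hierarchy dump and reduces each element to a tuple of flags. The exact encoding used in the paper is not shown on the slide; the attribute set chosen here (resource-id, clickable, scrollable, long-clickable) is an assumption.

```python
import xml.etree.ElementTree as ET

def screen_state(hierarchy_xml: str) -> tuple:
    """Map a UIAutomator view-hierarchy dump to a hashable state (sketch)."""
    elements = set()
    for node in ET.fromstring(hierarchy_xml).iter("node"):
        elements.add((
            node.get("resource-id", ""),
            node.get("clickable") == "true",
            node.get("scrollable") == "true",
            node.get("long-clickable") == "true",
        ))
    # The screen's state is the sorted set of element feature tuples
    return tuple(sorted(elements))
```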
Reward: Let $visit(s, a_t)$ be the number of times action $a_t$ occurred at state $s$, and $dis(s, s')$ the distance between states $s$ and $s'$. The penalty for the transition $(s_t, a_t)$ is $penalty(s_t, a_t) = \sum_{s'} \frac{visit(s', a_t)}{dis(s_t, s') + 1}$. Reward: give a constant high reward if the penalty is less than a threshold $T$; give a low reward, computed from the penalty, if the penalty is greater than or equal to $T$.
Penalty example: Consider the penalty for selecting the HELP button at the screen with state vector (0,1,1,1,1,1). If the HELP button has been selected three times in the past on that same screen (distance 0), its contribution is $3/(0+1) = 3$. If the HELP button has also been selected three times on the screen with state vector (1,1,1,1,1,1) (distance 1), its contribution is $3/(1+1) = 1.5$. The total penalty is $3/1 + 3/2 = 4.5$; since this is greater than or equal to the threshold, a low reward is given.
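The following sketch reproduces this arithmetic, assuming visit counts are kept per (state, action) pair and that dis() is a Hamming distance between the binary state vectors; the paper's actual distance function is not specified on the slide.

```python
def dis(s1, s2) -> int:
    # Hamming distance between equal-length state vectors (an assumption)
    return sum(x != y for x, y in zip(s1, s2))

def penalty(visits, s, a) -> float:
    # penalty(s, a) = sum over s' of visit(s', a) / (dis(s, s') + 1)
    return sum(count / (dis(s, s_prime) + 1)
               for (s_prime, action), count in visits.items()
               if action == a)

# Slide example: HELP selected 3 times at distance 0 and 3 times at
# distance 1 from the current state -> 3/1 + 3/2 = 4.5
visits = {((0, 1, 1, 1, 1, 1), "HELP"): 3,
          ((1, 1, 1, 1, 1, 1), "HELP"): 3}
print(penalty(visits, (0, 1, 1, 1, 1, 1), "HELP"))  # 4.5
```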
Learning: The Q-function is kept in a table whose rows are states and whose columns are actions. When a new state is visited, a new row is added and its rewards are initialized. The Q-table is updated periodically, as sketched below: 1. Recalculate the reward for the previous transitions. 2. Take the last N transitions, in order from newest to oldest, and repeatedly update the Q-function. 3. Randomly select previous transitions and repeatedly update the Q-function.
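A possible shape for this periodic update, reusing the QTable and penalty helpers sketched earlier, is shown below. The replay sizes, the threshold, and especially the low-reward branch are assumptions; the slides only state that a constant high reward is given below the threshold and a low, penalty-based reward above it.

```python
import random

def periodic_update(qtable, history, visits, n_recent=50, n_random=50,
                    high_reward=10.0, threshold=4.0):
    # history: list of (state, action, next_state) transitions, oldest first
    def reward(s, a):
        # Step 1: recalculate the reward with the current visit counts
        p = penalty(visits, s, a)
        return high_reward if p < threshold else high_reward / (1.0 + p)  # low branch: assumption

    # Step 2: replay the last N transitions, newest to oldest
    for s, a, s_next in reversed(history[-n_recent:]):
        qtable.update(s, a, reward(s, a), s_next, list(qtable.q[s_next]))

    # Step 3: replay randomly chosen past transitions
    for s, a, s_next in random.sample(history, min(n_random, len(history))):
        qtable.update(s, a, reward(s, a), s_next, list(qtable.q[s_next]))
```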
Implementation: Interactor: based on Qdroid [4]; UI analysis is done by UIAutomator [7]; executes actions on the Android application. Agent: updates the Q-table and selects appropriate actions. [4] Tuyet Vuong et al. Semantic Analysis for Deep Q-Network in Android GUI Testing. SEKE 2019 [7] UI Automator2 https://github.com/openatx/uiautomator2
Interactor: Analyzes the UI using UIAutomator, obtaining the resource ID, possible actions, etc. Adds virtual UI elements: a back button, a menu button, and a clickable button at a random position. Executes actions on the Android application using UIAutomator.
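For illustration, the snippet below shows how such an Interactor might drive the app through UIAutomator2. connect(), click(), swipe(), press(), and dump_hierarchy() are real uiautomator2 calls; the execute() wrapper and its action names are our assumption, not the authors' code.

```python
import uiautomator2 as u2

d = u2.connect()  # connect to the emulator/device under test

def execute(action: str, **kw) -> str:
    """Run one GUI action and return the resulting UI hierarchy dump."""
    if action == "tap":
        d.click(kw["x"], kw["y"])
    elif action == "swipe":
        d.swipe(kw["sx"], kw["sy"], kw["ex"], kw["ey"])
    elif action == "back":
        d.press("back")  # the virtual back button
    return d.dump_hierarchy()  # UI information for the next state
```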
Agent: Obtains UI information from the Interactor, updates the Q-table based on the transition history, and selects the action to execute.
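The slides do not specify the agent's selection policy; an epsilon-greedy rule over the Q-table is a common choice and is sketched here purely as an assumption.

```python
import random

EPSILON = 0.2  # exploration probability; illustrative value

def select_action(qtable, state, available_actions):
    # Explore with probability EPSILON, or when the state is unknown
    if random.random() < EPSILON or state not in qtable.q:
        return random.choice(available_actions)
    # Exploit: pick the available action with the highest Q-value
    row = qtable.q[state]
    return max(available_actions, key=lambda a: row.get(a, 0.0))
```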
Evaluation: Research questions: RQ1: How is the performance compared to other tools? RQ2: How much influence do our changes have? RQ3: How does coverage change over time? Execution environment: OS: Ubuntu 14.04.1 LTS; CPU: AMD Ryzen Threadripper 3990X 64-Core; Memory: 6113 MB; Emulator: Android 4.4.
Evaluation setup: 12 target applications, taken from AndroTest [8], the same set used by Qdroid [4]. Each application was executed three times, two hours each, and the average was computed for analysis. Metrics: method coverage and number of unique crashes. [4] Tuyet Vuong et al. Semantic Analysis for Deep Q-Network in Android GUI Testing. SEKE 2019 [8] Shauvik Roy Choudhary et al. Automated Test Input Generation for Android: Are We There Yet? ASE 2015
Overall Results:

| Application | Coverage (%) Qdroid | Coverage (%) Qdroid2 | Coverage (%) Our Tool | # Crashes Qdroid | # Crashes Our Tool |
|---|---|---|---|---|---|
| Any Memo | 43.2 | 47.1 | 49.3 | 3.7 | 6.0 |
| Dalvik Explorer | 82.6 | 80.5 | 83.7 | 0 | 0 |
| Hot Death | 64.8 | 79.0 | 80.2 | 0 | 0.7 |
| Mileage | 38.2 | 43.2 | 50.4 | 0.7 | 1.7 |
| Mini Note Viewer | 59.6 | 51.4 | 48.7 | 1 | 0.7 |
| Multi SMS Sender | 37.9 | 64.7 | 66.9 | 0 | 0 |
| Munch Life | 53.8 | 92.3 | 92.3 | 0 | 0 |
| My Expenses | 62.5 | 60.4 | 62.3 | 0 | 0 |
| Random Music Player | 58.7 | 58.7 | 58.7 | 0 | 0 |
| Tippy Tipper | 56.1 | 82.8 | 88.1 | 0 | 0 |
| Weight Chart | 46.1 | 64.2 | 73.6 | 0 | 0 |
| Who has my stuff | 89.1 | 75.6 | 83.7 | 0 | 0 |
| Average | 57.7 | 66.7 | 69.8 | 5.4 | 9.1 |
RQ1: Performance comparison. Overall, our approach performed better than Qdroid: coverage 57.7% (Qdroid) vs. 69.8% (Our Tool); crashes 5.4 vs. 9.1. Exceptions: MiniNoteViewer and WhoHasMyStuff, which already had menu buttons, so adding virtual menu buttons was unnecessary.
RQ2: Influence of our changes. We compare our tool against Qdroid2, a variant in which we implemented Qdroid's UI acquisition (adding virtual elements) and app operation in the same way as the proposed method. The change in UI acquisition widened the search area, increasing coverage: on average, Qdroid 57.7%, Qdroid2 66.7%, Our Tool 69.8%. Exceptions stem from the possibility of unnecessary extra action options.
RQ3: Coverage change over time. Most apps showed an initial rapid increase, followed by a gradual increase, since much code is invoked immediately after starting. Some apps reached their maximum value early: the app was small, or a particular type of input was necessary. Some apps were still increasing after two hours: the app was large, or required complicated procedures.
RQ3 example: Munch Life reached its maximum value early (small app; a particular type of input was necessary).
RQ3 example: Mileage was still increasing after two hours (large app; complicated procedures were required).
Conclusion: We applied reinforcement learning to Android testing, focusing on state definition, reward, and periodic learning. Evaluation showed improvement compared to Qdroid. Future work: further evaluation, including more apps and comparison with other tools; extending the supported UI actions; adjusting various parameters.