Randomized Subexponential Algorithms in Games on Graphs

Explore the concept of randomized subexponential algorithms in the context of games on graphs, touching on strategy iteration, values, sums of values, and strategy improvement algorithms. The discussion includes the Switch-All algorithm, randomized algorithms for LP-type problems, and recursive random action selection and removal. Discover the application of these algorithms in optimizing strategies for two-player games.

  • Games
  • Graphs
  • Algorithms
  • Strategy
  • Randomized

Presentation Transcript


  1. Games on Graphs. Uri Zwick, Tel Aviv University. Lecture 6: Randomized Subexponential Algorithms. Started sometime in 2018; last modified 20/12/2020.

  2. Strategy iteration for two-player games
  Start with an arbitrary strategy $\sigma$ of player 0. Compute an optimal counter-strategy $\tau$ to $\sigma$. If there are improving switches for $\sigma$, keeping $\tau$ fixed, perform some of them. Repeat until there are no improving switches. The final strategies are optimal for the two players.
  We assume this time that player 0 is the maximizer.
  Consider the move from $(\sigma,\tau)$ to $(\sigma',\tau')$. From the point of view of player 0, controlling $\sigma$, all switches performed are improving! Thus $(\sigma,\tau) \prec (\sigma',\tau')$, i.e., $\mathrm{val}_{\sigma,\tau} < \mathrm{val}_{\sigma',\tau'}$.
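A minimal sketch of this loop in Python, assuming hypothetical helpers best_response(game, sigma) (an optimal counter-strategy of player 1), values(game, sigma, tau) (the value of every state under the pair), and improving_switches(game, sigma, vals) (player 0's improving switches); none of these names appear in the slides:

    def strategy_iteration(game, sigma):
        """Iteratively improve player 0's strategy until no improving switch remains."""
        while True:
            tau = best_response(game, sigma)           # optimal counter-strategy of player 1
            vals = values(game, sigma, tau)            # value of every state under (sigma, tau)
            switches = improving_switches(game, sigma, vals)
            if not switches:
                return sigma, tau                      # (sigma, tau) is a pair of optimal strategies
            for state, action in switches:             # perform some (here: all) improving switches
                sigma[state] = action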

  3. Values
  $\sigma \preceq \sigma'$ iff $\mathrm{val}_{\sigma,\tau(\sigma)} \le \mathrm{val}_{\sigma',\tau(\sigma')}$, where $\tau(\sigma)$ is a best response to $\sigma$.
  Games with total or discounted cost objectives:
  \[ \mathrm{val}_\sigma(s) \;=\; \mathrm{val}_{\sigma,\tau(\sigma)}(s) \;=\; \min_\tau \mathrm{val}_{\sigma,\tau}(s) . \]
  A strategy $\sigma$ that maximizes $\mathrm{val}_\sigma$ (with respect to $\preceq$) maximizes $\mathrm{val}_{\sigma,\tau(\sigma)}(s)$ for every state $s$.
  If $\sigma'$ is obtained from $\sigma$ by performing improving switches, then $\mathrm{val}_\sigma < \mathrm{val}_{\sigma'}$.
  For games with a limiting average cost objective, the definition of improving switches, and of $\mathrm{val}_\sigma$, are more subtle.

  4. Sum of Values
  In the analysis, it is convenient to assign to each pair of strategies a single value, rather than a vector of values.
  If $\sigma$ is a strategy of max, let $\tau = \tau(\sigma)$ be an optimal counter-strategy of min, and let
  \[ \mathrm{sum}(\sigma) \;=\; \sum_{s} \mathrm{val}_{\sigma,\tau(\sigma)}(s) . \]
  A strategy $\sigma$ that maximizes $\mathrm{sum}(\sigma)$ also maximizes $\mathrm{val}_{\sigma,\tau(\sigma)}(s)$ for every state $s$.
  If $\sigma'$ is obtained from $\sigma$ by performing improving switches, then $\mathrm{sum}(\sigma) < \mathrm{sum}(\sigma')$.
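In the same hypothetical style as the sketch above, the single number attached to a strategy of max would be computed as follows (game.states is an assumed attribute, not from the slides):

    def sum_value(game, sigma):
        """sum(sigma): the sum over all states s of val_{sigma, tau(sigma)}(s)."""
        tau = best_response(game, sigma)          # optimal counter-strategy of min
        vals = values(game, sigma, tau)
        return sum(vals[s] for s in game.states)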

  5. Strategy improvement algorithms
  Switch-All: choose the best switch from each state. [Howard (1960)]
  Switch-All is polynomial for discounted TBSGs; the number of iterations is $O\!\left(\frac{m}{1-\gamma}\log\frac{1}{1-\gamma}\right)$. [Ye (2011)] [Miltersen-Hansen-Z (2012)] [Scherrer (2016)]
  Switch-All is exponential for non-discounted MDPs. [Friedmann (2009)] [Fearnley (2010)]
  All known deterministic variants take exponential time. A randomized variant has a subexponential running time.
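A sketch of a single Switch-All iteration, reusing the hypothetical helpers above plus an assumed one-step lookahead appraisal(game, s, a, vals) and assumed attributes game.player0_states and game.actions; purely illustrative:

    def switch_all_step(game, sigma):
        """Perform the best improving switch at every state (Howard's rule)."""
        tau = best_response(game, sigma)
        vals = values(game, sigma, tau)
        new_sigma = dict(sigma)
        for s in game.player0_states:
            best = max(game.actions[s], key=lambda a: appraisal(game, s, a, vals))
            if appraisal(game, s, best, vals) > vals[s]:   # switch only if it improves
                new_sigma[s] = best
        return new_sigma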

  6. Randomized subexponential algorithms
  $n$ = number of states/vertices, $m$ = number of actions/edges; expected running time $2^{O(\sqrt{m\log n})}$.
  Randomized algorithms for LP and LP-type problems: [Kalai (1992, 1997)] [Matoušek-Sharir-Welzl (1992)] [Gärtner (1995)] [Hansen-Z (2015)]
  Applicability to games: [Ludwig (1995)] [Halman (2007)] [Björklund-Sandberg-Vorobyov (2003)]

  7. Recursive Random Action Selection
  Corresponds to a primal simplex pivoting rule. [Kalai (1992)]
  Start with an arbitrary strategy $\sigma$ of max. Compute an optimal counter-strategy $\tau = \tau(\sigma)$ of player 1, the values $\mathrm{val}_{\sigma,\tau(\sigma)}(s)$, and $\mathrm{sum}(\sigma) = \sum_s \mathrm{val}_{\sigma,\tau(\sigma)}(s)$.
  Randomly select an action $a$ in $\sigma$ and remove its alternatives.
  Run the algorithm recursively on the resulting game to find an optimal strategy $\sigma'$ that uses $a$.
  If there is no improving switch, then $\sigma'$ is optimal. Otherwise, let $\sigma'' = \sigma'[a']$, for some improving switch $a'$, and run the algorithm recursively from $\sigma''$, possibly removing $a$ from the game.
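A sketch of the primal recursion, with assumed game-editing helpers remove_alternatives(game, a), remove_action(game, a) and an assumed improving_switch(game, sigma) that returns a switch or None; all names are illustrative only:

    import random

    def random_action_selection(game, sigma):
        """Primal recursion (Kalai-style): fix a random action of the current strategy."""
        if all(len(game.actions[s]) == 1 for s in game.player0_states):
            return sigma                                 # no choices left: sigma is optimal
        a = random.choice([sigma[s] for s in game.player0_states
                           if len(game.actions[s]) > 1])
        sigma1 = random_action_selection(remove_alternatives(game, a), sigma)
        switch = improving_switch(game, sigma1)          # improving switch in the full game?
        if switch is None:
            return sigma1                                # sigma1 is optimal for the whole game
        s, b = switch                                    # the switch replaces a at its state
        sigma2 = {**sigma1, s: b}
        return random_action_selection(remove_action(game, a), sigma2)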

  8. Recursive Random Action Removal
  Corresponds to a dual simplex pivoting rule. [MSW (1992)]
  Start with an arbitrary strategy $\sigma$ of max. Compute an optimal counter-strategy $\tau = \tau(\sigma)$ of player 1, the values $\mathrm{val}_{\sigma,\tau(\sigma)}(s)$, and $\mathrm{sum}(\sigma) = \sum_s \mathrm{val}_{\sigma,\tau(\sigma)}(s)$.
  Randomly choose an action $a$ not in $\sigma$ and remove it.
  Run the algorithm recursively on the resulting game to find an optimal strategy $\sigma'$ that does not use $a$.
  If $a$ is not an improving switch with respect to $\sigma'$, then $\sigma'$ is optimal. Otherwise, let $\sigma'' = \sigma'[a]$ and run the algorithm recursively from $\sigma''$, possibly removing all alternatives of $a$.
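The dual recursion in the same hypothetical style, with assumed helpers actions_outside(game, sigma), is_improving(game, sigma, a), state_of(game, a) and fix_action(game, a) (remove all alternatives of a):

    import random

    def random_action_removal(game, sigma):
        """Dual recursion (MSW-style Random-Facet): discard a random unused action."""
        outside = actions_outside(game, sigma)           # actions not used by sigma
        if not outside:
            return sigma                                 # sigma is the only strategy left
        a = random.choice(outside)
        sigma1 = random_action_removal(remove_action(game, a), sigma)
        if not is_improving(game, sigma1, a):
            return sigma1                                # optimal for the whole game
        s = state_of(game, a)                            # perform the switch to a ...
        sigma2 = {**sigma1, s: a}
        return random_action_removal(fix_action(game, a), sigma2)   # ... and fix a from now on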

  9. Random Selection vs. Random Removal
  $n$ = number of states/vertices, $m$ = number of actions/edges.
  Both recursions have subexponential bounds of the form $2^{O(\sqrt{m\log n})}$.
  For games, Random Removal is always better.
  On binary games, i.e., games with two actions per state, the two algorithms are identical, and the bound becomes $2^{O(\sqrt{n})}$.

  10. Analysis of the random removal algorithm I
  For every action $a$, let $\overline{\mathrm{val}}(a)$ be the value ($\mathrm{sum}$) of the optimal strategy that does not use $a$.
  Let $\sigma$ be the current strategy (of player 0).
  Let $a_1, a_2, \ldots, a_{m-n}$ be all the actions not in $\sigma$, ordered so that $\overline{\mathrm{val}}(a_1) \le \overline{\mathrm{val}}(a_2) \le \cdots \le \overline{\mathrm{val}}(a_{m-n})$.
  Suppose that the algorithm chooses to exclude $a_i$. Let $\sigma'$ be the optimal strategy not using $a_i$.
  Let $\sigma'' = \sigma'[a_i]$, assuming $a_i$ is improving for $\sigma'$. Then
  \[ \overline{\mathrm{val}}(a_1) \le \cdots \le \overline{\mathrm{val}}(a_i) = \mathrm{sum}(\sigma') < \mathrm{sum}(\sigma'') . \]

  11. Analysis of the random removal algorithm II
  \[ \overline{\mathrm{val}}(a_1) \le \cdots \le \overline{\mathrm{val}}(a_i) = \mathrm{sum}(\sigma') < \mathrm{sum}(\sigma'') \]
  The actions $a_1, a_2, \ldots, a_i$ must all be used by $\sigma''$. (In particular, they belong to different states.)
  Furthermore, as the strategies have increasing values, $a_1, a_2, \ldots, a_i$ must be used by all subsequent strategies, and in particular by the optimal strategy!
  Thus, the states to which $a_1, a_2, \ldots, a_i$ belong are effectively removed from the game. (The algorithm does not know which states are removed.)
  The first recursive call essentially chooses a random number $i$ between 1 and $m-n$ and removes $i$ states from the game!
  If $i > n$, the first recursive call already returns an optimal strategy.
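A compact restatement of the argument, writing $\sigma^{*}$ for an optimal strategy (the ordering of the $a_j$ and the definition of $\overline{\mathrm{val}}$ are as on the previous slide):
\[
\mathrm{sum}(\sigma'') \;>\; \mathrm{sum}(\sigma') \;=\; \overline{\mathrm{val}}(a_i) \;\ge\; \overline{\mathrm{val}}(a_j)
\qquad \text{for every } j \le i .
\]
Since the best strategy that avoids $a_j$ has sum exactly $\overline{\mathrm{val}}(a_j)$, any strategy whose sum exceeds $\overline{\mathrm{val}}(a_j)$ must use $a_j$. Hence $a_1,\dots,a_i$ are used by $\sigma''$, by every later strategy (sums only increase), and in particular by $\sigma^{*}$.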

  12. Recurrence relation
  Let $f(m,n)$ be an upper bound on the expected number of improving switches performed on $m$-action, $n$-state games. The expected running time is polynomially related to $f(m,n)$.
  We may assume that $m \ge 2n$. (States with only one action can be eliminated.)
  \[
  f(m,n) \;=\; f(m-1,n) \;+\; 1 \;+\; \frac{1}{m-n}\sum_{i=1}^{m-n} f(m-2i,\,n-i)
  \]
  \[
  f(2n-1,\,n) \;=\; f(2n-2,\,n-1), \qquad f(m,0) \;=\; 0 .
  \]
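The recurrence can be evaluated directly; a minimal sketch in Python (terms with $n-i \le 0$ are taken to be $0$):

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def f(m, n):
        """Upper bound from the recurrence on the expected number of improving
        switches for an m-action, n-state game (assumes m >= 2n initially)."""
        if n <= 0:
            return 0.0
        if m == 2 * n - 1:                 # a state with a single action is eliminated
            return f(2 * n - 2, n - 1)
        e = m - n
        rec = sum(f(m - 2 * i, n - i) for i in range(1, e + 1))
        return f(m - 1, n) + 1.0 + rec / e

    for n in (2, 4, 8, 16):
        print(n, round(f(2 * n, n), 2))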

  13. Analysis for binary games   [Ludwig (1995)] [Gärtner (2002)]
  A game is binary if every state has exactly two actions.
  Let $g(n)$ be (an upper bound on) the expected number of improving switches performed on $n$-state binary games.
  \[
  g(n) \;=\; g(n-1) \;+\; 1 \;+\; \frac{1}{n}\sum_{i=1}^{n} g(n-i), \qquad g(0)=0 .
  \]
  Simple proof by induction:
  \[
  g(n) \;\le\; \sum_{i=1}^{n}\frac{n^{i}}{(i!)^{2}}
       \;\le\; \left(\sum_{i=0}^{\infty}\frac{n^{i/2}}{i!}\right)^{2}
       \;=\; e^{2\sqrt{n}} .
  \]
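A quick numerical look at the binary recurrence, printed next to $e^{2\sqrt{n}}$ (a minimal sketch; it simply evaluates the recurrence bottom-up):

    import math

    def g_table(N):
        # g(n) = g(n-1) + 1 + (1/n) * sum_{i=1}^{n} g(n-i),   g(0) = 0
        g = [0.0] * (N + 1)
        for n in range(1, N + 1):
            g[n] = g[n - 1] + 1.0 + sum(g[:n]) / n   # g[:n] holds g(n-i) for i = 1..n
        return g

    g = g_table(200)
    for n in (10, 50, 100, 200):
        print(n, round(g[n], 1), round(math.exp(2 * math.sqrt(n)), 1))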

  14. Usual and unusual recurrence relations
  \[
  \begin{array}{ll}
  f(n) = f(n-1) + n & f(n) = \Theta(n^{2})\\
  f(n) = 2f(n/2) + n & f(n) = \Theta(n\log n)\\
  f(n) = 2f(n-1) + 1 & f(n) = \Theta(2^{n})\\
  f(n) = f(n-1) + f(n-2) & f(n) = \Theta(\varphi^{n})\\[2pt]
  f(n) = f(n-1) + 1 + \frac{1}{n}\sum_{i=1}^{n} f(n-i) & f(n) = 2^{\Theta(\sqrt{n})}
  \end{array}
  \]
  The recurrence relation for $f(m,n)$ is similar to the last of these: much harder to analyze, but with fairly similar behavior.

  15. Solving the recurrence relation
  \[
  f(m,n) \;=\; f(m-1,n) + 1 + \frac{1}{m-n}\sum_{i=1}^{m-n} f(m-2i,\,n-i)
  \]
  Let $e = m-n$ be the "excess", and let $h(e,n) = f(n+e,\,n)$, i.e., $f(m,n) = h(m-n,\,n)$. Then
  \[
  h(e,n) \;=\; h(e-1,n) + 1 + \frac{1}{e}\sum_{i=1}^{e} h(e-i,\,n-i) .
  \]
  Theorem [MSW]: $h(e,n) \le \exp\!\left(2\sqrt{e\ln n} + O\!\left(\sqrt{e} + \ln n\right)\right)$.
  Complicated analysis using generating functions.

  16. A simple transformation
  \[
  f(m,n) \;=\; f(m-1,n) + 1 + \frac{1}{m-n}\sum_{i=1}^{m-n} f(m-2i,\,n-i)
  \]
  Let $e = m-n$ be the "excess" and define $h(e,n) = f(n+e,\,n)$, so that $f(m,n) = h(m-n,\,n)$.
  Since $(m-1)-n = e-1$ and $(m-2i)-(n-i) = e-i$, every term on the right is again of the form $h(\cdot,\cdot)$, and the recurrence becomes
  \[
  h(e,n) \;=\; h(e-1,n) + 1 + \frac{1}{e}\sum_{i=1}^{e} h(e-i,\,n-i) .
  \]

  17. Upper bounds on $h(e,n)$
  \[
  h(e,n) \;=\; h(e-1,n) + 1 + \frac{1}{e}\sum_{i=1}^{e} h(e-i,\,n-i)
  \]
  Theorem [MSW]: $h(e,n) \le \exp\!\left(2\sqrt{e\ln n} + O\!\left(\sqrt{e} + \ln n\right)\right)$. Complicated analysis using generating functions.
  Theorem [HZ]: $h(e,n) \le \exp\!\left(O\!\left(\sqrt{n\ln(e/n)} + \sqrt{n} + \ln e\right)\right)$. Also complicated. Interesting when $m, e = O(n)$.
  We sketch a relatively simple proof, by induction, of $h(e,n) \le \exp\!\left(2\sqrt{e\ln n}\right)$.

  18. Simple upper bound by induction (on $e$)
  \[
  h(e,n) \;=\; h(e-1,n) + 1 + \frac{1}{e}\sum_{i=1}^{e} h(e-i,\,n-i)
  \]
  Claim: $h(e,n) \le B(e,n) := \exp\!\left(2\sqrt{e\ln n}\right)$.
  Inductive step: by the recurrence, the induction hypothesis, and $B(e-i,\,n-i) \le B(e-i,\,n)$,
  \[
  h(e+1,n) \;\le\; B(e,n) + 1 + \frac{1}{e+1}\sum_{i=1}^{e+1} B(e+1-i,\,n) .
  \]
  Since $\sqrt{e+1}-\sqrt{e+1-i} \ge i/\bigl(2\sqrt{e+1}\bigr)$, the terms of the sum decrease at least geometrically, with ratio $e^{-\sqrt{\ln n/(e+1)}}$; the right-hand side is then absorbed by the increase $B(e+1,n) - B(e,n) \ge \bigl(1 - e^{-\sqrt{\ln n/(e+1)}}\bigr) B(e+1,n)$, giving $h(e+1,n) \le B(e+1,n)$.
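A small numerical sanity check of the claim; the sketch evaluates the transformed recurrence directly (ignoring the boundary rule of slide 12) and prints it next to $B(e,n)$:

    import math
    from functools import lru_cache

    @lru_cache(maxsize=None)
    def h(e, n):
        # transformed recurrence: h(e,n) = h(e-1,n) + 1 + (1/e) * sum_{i=1}^{e} h(e-i, n-i)
        if e <= 0 or n <= 0:
            return 0.0
        return h(e - 1, n) + 1.0 + sum(h(e - i, n - i) for i in range(1, e + 1)) / e

    def B(e, n):
        return math.exp(2.0 * math.sqrt(e * math.log(n)))

    for n in (5, 10, 20):
        for e in (n, 2 * n, 4 * n):
            print(f"e={e:3d} n={n:3d}   h={h(e, n):12.1f}   B={B(e, n):12.1f}")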

  19. From binary to general?
  A state with $k$ actions can be replaced by a tree of $k-1$ binary states. The total number of states becomes $m-n$.
  \[
  \text{Binarized: } e^{2\sqrt{m-n}} \qquad
  \text{General: } e^{2\sqrt{(m-n)\ln n}} \qquad
  \text{Improved: } e^{2\sqrt{n\ln(m/n)}}
  \]
  The general algorithm is better roughly when $m \ge n\ln n$.

  20. Slightly improved algorithms   [Kalai (1997)] [Gärtner (2002)] [Hansen-Z (2015)]
  Improved bound: roughly $e^{2\sqrt{n\ln(m/n)}}$. When $m = cn$, $c > 2$, this is $e^{O(\sqrt{n\ln c})}$, instead of $e^{2\sqrt{(c-1)n\ln n}}$.
  \[
  f(m,n) \;=\; f(m-1,n) + 1 + \frac{1}{m-n}\sum_{i=1}^{m-n} f(m-2i,\,n-i)
  \]
  For LPs, the algorithm needs to be changed, and becomes much more complicated. For games, the improved analysis applies to the existing algorithm!

  21. Slightly improved binary algorithm   [Hansen-Z (2015)]
  For binary games, the running time becomes $e^{\sqrt{2n}}$, instead of $e^{2\sqrt{n}}$. [Hansen-Z]
  It is not clear how to get running times of $e^{o(\sqrt{n})}$.

  22. END of LECTURE 6
