Game Playing Exercises
5. Adversarial Search
5.1 Suppose you have an oracle, $OM(s)$, that correctly predicts the opponent’s move in any state. Using this, formulate the definition of a game as a (single-agent) search problem. Describe an algorithm for finding the optimal move.
5.2 Consider the problem of solving two 8-puzzles.
-
Give a complete problem formulation in the style of Chapter search-chapter.
-
How large is the reachable state space? Give an exact numerical expression.
-
Suppose we make the problem adversarial as follows: the two players take turns moving; a coin is flipped to determine the puzzle on which to make a move in that turn; and the winner is the first to solve one puzzle. Which algorithm can be used to choose a move in this setting?
-
Does the game eventually end, given optimal play? Explain.
5.3 Imagine that, in Exercise [two-friends-exercise], one of the friends wants to avoid the other. The problem then becomes a two-player game. We assume now that the players take turns moving. The game ends only when the players are on the same node; the terminal payoff to the pursuer is minus the total time taken. (The evader “wins” by never losing.) An example is shown in Figure pursuit-evasion-game-figure.
-
Copy the game tree and mark the values of the terminal nodes.
-
Next to each internal node, write the strongest fact you can infer about its value (a number, one or more inequalities such as “$\geq 14$”, or a “?”).
-
Beneath each question mark, write the name of the node reached by that branch.
-
Explain how a bound on the value of the nodes in (c) can be derived from consideration of shortest-path lengths on the map, and derive such bounds for these nodes. Remember the cost to get to each leaf as well as the cost to solve it.
-
Now suppose that the tree as given, with the leaf bounds from (d), is evaluated from left to right. Circle those “?” nodes that would not need to be expanded further, given the bounds from part (d), and cross out those that need not be considered at all.
-
Can you prove anything in general about who wins the game on a map that is a tree?
5.4 [game-playing-chance-exercise]Describe and implement state descriptions, move generators, terminal tests, utility functions, and evaluation functions for one or more of the following stochastic games: Monopoly, Scrabble, bridge play with a given contract, or Texas hold’em poker.
5.5 Describe and implement a real-time, multiplayer game-playing environment, where time is part of the environment state and players are given fixed time allocations.
5.6 Discuss how well the standard approach to game playing would apply to games such as tennis, pool, and croquet, which take place in a continuous physical state space.
5.7 [minimax-optimality-exercise] Prove the following assertion: For every game tree, the utility obtained by max using minimax decisions against a suboptimal min will never be lower than the utility obtained playing against an optimal min. Can you come up with a game tree in which max can do still better using a suboptimal strategy against a suboptimal min?
Player $A$ moves first. The two players take turns moving, and each player must move his token to an open adjacent space in either direction. If the opponent occupies an adjacent space, then a player may jump over the opponent to the next open space if any. (For example, if $A$ is on 3 and $B$ is on 2, then $A$ may move back to 1.) The game ends when one player reaches the opposite end of the board. If player $A$ reaches space 4 first, then the value of the game to $A$ is $+1$; if player $B$ reaches space 1 first, then the value of the game to $A$ is $-1$.
5.8 Consider the two-player game described in Figure line-game4-figure.
-
Draw the complete game tree, using the following conventions:
-
Write each state as $(s_A,s_B)$, where $s_A$ and $s_B$ denote the token locations.
-
Put each terminal state in a square box and write its game value in a circle.
-
Put loop states (states that already appear on the path to the root) in double square boxes. Since their value is unclear, annotate each with a “?” in a circle.
-
-
Now mark each node with its backed-up minimax value (also in a circle). Explain how you handled the “?” values and why.
-
Explain why the standard minimax algorithm would fail on this game tree and briefly sketch how you might fix it, drawing on your answer to (b). Does your modified algorithm give optimal decisions for all games with loops?
-
This 4-square game can be generalized to $n$ squares for any $n > 2$. Prove that $A$ wins if $n$ is even and loses if $n$ is odd.
5.9 This problem exercises the basic concepts of game playing, using tic-tac-toe (noughts and crosses) as an example. We define $X_n$ as the number of rows, columns, or diagonals with exactly $n$ $X$’s and no $O$’s. Similarly, $O_n$ is the number of rows, columns, or diagonals with just $n$ $O$’s. The utility function assigns $+1$ to any position with $X_3=1$ and $-1$ to any position with $O_3 = 1$. All other terminal positions have utility 0. For nonterminal positions, we use a linear evaluation function defined as ${Eval}(s) = 3X_2(s) + X_1(s) - (3O_2(s) + O_1(s))$.
-
Approximately how many possible games of tic-tac-toe are there?
-
Show the whole game tree starting from an empty board down to depth 2 (i.e., one $X$ and one $O$ on the board), taking symmetry into account.
-
Mark on your tree the evaluations of all the positions at depth 2.
-
Using the minimax algorithm, mark on your tree the backed-up values for the positions at depths 1 and 0, and use those values to choose the best starting move.
-
Circle the nodes at depth 2 that would not be evaluated if alpha–beta pruning were applied, assuming the nodes are generated in the optimal order for alpha–beta pruning.
5.10 Consider the family of generalized tic-tac-toe games, defined as follows. Each particular game is specified by a set $\mathcal S$ of squares and a collection $\mathcal W$ of winning positions. Each winning position is a subset of $\mathcal S$. For example, in standard tic-tac-toe, $\mathcal S$ is a set of 9 squares and $\mathcal W$ is a collection of 8 subsets of $\cal W$: the three rows, the three columns, and the two diagonals. In other respects, the game is identical to standard tic-tac-toe. Starting from an empty board, players alternate placing their marks on an empty square. A player who marks every square in a winning position wins the game. It is a tie if all squares are marked and neither player has won.
-
Let $N= |{\mathcal S}|$, the number of squares. Give an upper bound on the number of nodes in the complete game tree for generalized tic-tac-toe as a function of $N$.
-
Give a lower bound on the size of the game tree for the worst case, where ${\mathcal W} = {{\,}}$.
-
Propose a plausible evaluation function that can be used for any instance of generalized tic-tac-toe. The function may depend on $\mathcal S$ and $\mathcal W$.
-
Assume that it is possible to generate a new board and check whether it is a winning position in 100$N$ machine instructions and assume a 2 gigahertz processor. Ignore memory limitations. Using your estimate in (a), roughly how large a game tree can be completely solved by alpha–beta in a second of CPU time? a minute? an hour?
5.11 Develop a general game-playing program, capable of playing a variety of games.
-
Implement move generators and evaluation functions for one or more of the following games: Kalah, Othello, checkers, and chess.
-
Construct a general alpha–beta game-playing agent.
-
Compare the effect of increasing search depth, improving move ordering, and improving the evaluation function. How close does your effective branching factor come to the ideal case of perfect move ordering?
-
Implement a selective search algorithm, such as B* @Berliner:1979, conspiracy number search @McAllester:1988, or MGSS* @Russell+Wefald:1989 and compare its performance to A*.
5.12 Describe how the minimax and alpha–beta algorithms change for two-player, non-zero-sum games in which each player has a distinct utility function and both utility functions are known to both players. If there are no constraints on the two terminal utilities, is it possible for any node to be pruned by alpha–beta? What if the player’s utility functions on any state differ by at most a constant $k$, making the game almost cooperative?
5.13 Describe how the minimax and alpha–beta algorithms change for two-player, non-zero-sum games in which each player has a distinct utility function and both utility functions are known to both players. If there are no constraints on the two terminal utilities, is it possible for any node to be pruned by alpha–beta? What if the player’s utility functions on any state sum to a number between constants $-k$ and $k$, making the game almost zero-sum?
5.14 Develop a formal proof of correctness for alpha–beta pruning. To do this, consider the situation shown in Figure alpha-beta-proof-figure. The question is whether to prune node $n_j$, which is a max-node and a descendant of node $n_1$. The basic idea is to prune it if and only if the minimax value of $n_1$ can be shown to be independent of the value of $n_j$.
-
Mode $n_1$ takes on the minimum value among its children: $n_1 = \min(n_2,n_21,\ldots,n_{2b_2})$. Find a similar expression for $n_2$ and hence an expression for $n_1$ in terms of $n_j$.
-
Let $l_i$ be the minimum (or maximum) value of the nodes to the left of node $n_i$ at depth $i$, whose minimax value is already known. Similarly, let $r_i$ be the minimum (or maximum) value of the unexplored nodes to the right of $n_i$ at depth $i$. Rewrite your expression for $n_1$ in terms of the $l_i$ and $r_i$ values.
-
Now reformulate the expression to show that in order to affect $n_1$, $n_j$ must not exceed a certain bound derived from the $l_i$ values.
-
Repeat the process for the case where $n_j$ is a min-node.
5.15 Prove that the alpha–beta algorithm takes time $O(b^{m/2})$ with optimal move ordering, where $m$ is the maximum depth of the game tree.
5.16 Suppose you have a chess program that can evaluate 5 million nodes per second. Decide on a compact representation of a game state for storage in a transposition table. About how many entries can you fit in a 1-gigabyte in-memory table? Will that be enough for the three minutes of search allocated for one move? How many table lookups can you do in the time it would take to do one evaluation? Now suppose the transposition table is stored on disk. About how many evaluations could you do in the time it takes to do one disk seek with standard disk hardware?
5.17 Suppose you have a chess program that can evaluate 10 million nodes per second. Decide on a compact representation of a game state for storage in a transposition table. About how many entries can you fit in a 2-gigabyte in-memory table? Will that be enough for the three minutes of search allocated for one move? How many table lookups can you do in the time it would take to do one evaluation? Now suppose the transposition table is stored on disk. About how many evaluations could you do in the time it takes to do one disk seek with standard disk hardware?