Project - tictactoe

player

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
def player(board):
    """
    Returns player who has the next turn on a board.
    """
    x_count = sum(row.count(X) for row in board)
    o_count = sum(row.count(O) for row in board)
    if x_count == o_count:
        return X
    else:
        return O

这里通过遍历 board,统计 X 与 O 的数量来确定当前的回合的 player 是谁

copilot 在审查我的代码的时候给出了比较巧妙的统计方法,我其实对 python 的一些类的方法不是很熟悉,因此自己写的会比较笨

actions

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
def actions(board):
    """
    Returns set of all possible actions (i, j) available on the board.
    """
    actions = set()
    for i in range(3):
        for j in range(3):
            if board[i][j] is EMPTY:
                actions.add((i,j))
    return actions

这里依旧是遍历,然后把空的位置作为 action 给添加

按照要求,这里的 actions 应该是 set 类型

result

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
def result(board, action):
    """
    Returns the board that results from making move (i, j) on the board.
    """
    if not isinstance(action, tuple) or len(action) != 2:
        raise ValueError("Invalid action")
    
    i,j = action
    
    if i < 0 or i > 2 or j < 0 or j > 2:
        raise ValueError("Invalid action")
        
    if board[i][j] is not EMPTY:
        raise ValueError("Invalid action: cell already occupied")
    
    new_board = copy.deepcopy(board)
    symbol = player(board)
    new_board[i][j] = symbol
    return new_board

这里是通过当前的 boardaction,来返回 action 后的结果

按照文档的要求,需要对异常情况进行处理,但我一开始忽视了

同时,这里需要用到深度拷贝,防止更改原本的 board(需要导入 copy

winner

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
def winner(board):
    """
    Returns the winner of the game, if there is one.
    """
    symbol = board[0][0]
    if symbol is not EMPTY and symbol == board[1][1] and symbol == board[2][2]:
        return symbol
    symbol = board[0][2]
    if symbol is not EMPTY and symbol == board[1][1] and symbol == board[2][0]:
        return symbol
    for i in range(3):
        row_symbol = board[i][0]
        col_symbol = board[0][i]
        if row_symbol is not EMPTY and row_symbol == board[i][1] and row_symbol == board[i][2]:
            return row_symbol
        if col_symbol is not EMPTY and col_symbol == board[1][i] and col_symbol == board[2][i]:
            return col_symbol
    return None

这里需要检查是否获胜,也就是某个方向能三个连起来

在检查的时候需要注意,在赋值给 symbol ,比较之前,需要查看是否为 empty,不然的话三个 empty 连着也会 return

terminal

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
def terminal(board):
    """
    Returns True if game is over, False otherwise.
    """
    if winner(board) is not None:
        return True
    else:
        for i in range(3):
            for j in range(3):
                if board[i][j] is EMPTY:
                    return False
        return True

这里是检查是否结束

有赢家的话,就结束

没赢家,且没空位,结束

没有加,有空位,没结束

utility

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
def utility(board):
    """
    Returns 1 if X has won the game, -1 if O has won, 0 otherwise.
    """
    result = winner(board)
    if result is X:
        return 1
    elif result is O:
        return -1
    else:
        return 0

给分环节

minimax

这里是最重要的部分,也是最难的部分,需要实现 minimax

这里我根据课上给的伪代码,实现了min valuemax value 的函数,用于作为辅助函数帮着递归接下来的情况

对于 minimax 函数,需要根据 player 来分情况讨论

X 是追求 max,O 最求 min

这里由于要求是返回 action,而不是 value,因此我们在 minimax 中不能直接套用 min/max value

而是需要遍历 actions,在 value 改变时,同时改变 action 的值,实现记录,以便最终返回

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
def minimax(board):
    """
    Returns the optimal action for the current player on the board.
    """
    current_player = player(board)
    possible_actions = actions(board)
    best_action = None
    
    if terminal(board):
        return None
    if current_player is X:
        best_value = float('-inf')
        for action in actions(board):
            value = min_value(result(board,action))
            if value > best_value:
                best_value = value
                best_action = action
    
    if current_player is O:
        best_value = float('inf')
        for action in actions(board):
            value = min_value(result(board,action))
            if value < best_value:
                best_value = value
                best_action = action
    
    return best_action

def max_value(board):
    v =  float('-inf')
    if terminal(board):
        return utility(board)
    for action in actions(board):
        v = max(v, min_value(result(board,action)))
    return v

def min_value(board):
    v = float('inf')
    if terminal(board):
        return utility(board)
    for action in actions(board):
        v = min(v, max_value(result(board,action)))
    return v
Licensed under CC BY-NC-SA 4.0
最后更新于 May 23, 2025 02:29 UTC
comments powered by Disqus
使用 Hugo 构建
主题 StackJimmy 设计