player
1
2
3
4
5
6
7
8
9
10
|
def player(board):
"""
Returns player who has the next turn on a board.
"""
x_count = sum(row.count(X) for row in board)
o_count = sum(row.count(O) for row in board)
if x_count == o_count:
return X
else:
return O
|
这里通过遍历 board
,统计 X 与 O 的数量来确定当前的回合的 player
是谁
copilot 在审查我的代码的时候给出了比较巧妙的统计方法,我其实对 python 的一些类的方法不是很熟悉,因此自己写的会比较笨
actions
1
2
3
4
5
6
7
8
9
10
|
def actions(board):
"""
Returns set of all possible actions (i, j) available on the board.
"""
actions = set()
for i in range(3):
for j in range(3):
if board[i][j] is EMPTY:
actions.add((i,j))
return actions
|
这里依旧是遍历,然后把空的位置作为 action
给添加
按照要求,这里的 actions
应该是 set
类型
result
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
|
def result(board, action):
"""
Returns the board that results from making move (i, j) on the board.
"""
if not isinstance(action, tuple) or len(action) != 2:
raise ValueError("Invalid action")
i,j = action
if i < 0 or i > 2 or j < 0 or j > 2:
raise ValueError("Invalid action")
if board[i][j] is not EMPTY:
raise ValueError("Invalid action: cell already occupied")
new_board = copy.deepcopy(board)
symbol = player(board)
new_board[i][j] = symbol
return new_board
|
这里是通过当前的 board
和 action
,来返回 action
后的结果
按照文档的要求,需要对异常情况进行处理,但我一开始忽视了
同时,这里需要用到深度拷贝,防止更改原本的 board
(需要导入 copy
)
winner
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
|
def winner(board):
"""
Returns the winner of the game, if there is one.
"""
symbol = board[0][0]
if symbol is not EMPTY and symbol == board[1][1] and symbol == board[2][2]:
return symbol
symbol = board[0][2]
if symbol is not EMPTY and symbol == board[1][1] and symbol == board[2][0]:
return symbol
for i in range(3):
row_symbol = board[i][0]
col_symbol = board[0][i]
if row_symbol is not EMPTY and row_symbol == board[i][1] and row_symbol == board[i][2]:
return row_symbol
if col_symbol is not EMPTY and col_symbol == board[1][i] and col_symbol == board[2][i]:
return col_symbol
return None
|
这里需要检查是否获胜,也就是某个方向能三个连起来
在检查的时候需要注意,在赋值给 symbol
,比较之前,需要查看是否为 empty
,不然的话三个 empty
连着也会 return
terminal
1
2
3
4
5
6
7
8
9
10
11
12
|
def terminal(board):
"""
Returns True if game is over, False otherwise.
"""
if winner(board) is not None:
return True
else:
for i in range(3):
for j in range(3):
if board[i][j] is EMPTY:
return False
return True
|
这里是检查是否结束
有赢家的话,就结束
没赢家,且没空位,结束
没有加,有空位,没结束
utility
1
2
3
4
5
6
7
8
9
10
11
|
def utility(board):
"""
Returns 1 if X has won the game, -1 if O has won, 0 otherwise.
"""
result = winner(board)
if result is X:
return 1
elif result is O:
return -1
else:
return 0
|
给分环节
minimax
这里是最重要的部分,也是最难的部分,需要实现 minimax
这里我根据课上给的伪代码,实现了min value
和 max value
的函数,用于作为辅助函数帮着递归接下来的情况
对于 minimax
函数,需要根据 player 来分情况讨论
X 是追求 max,O 最求 min
这里由于要求是返回 action
,而不是 value
,因此我们在 minimax
中不能直接套用 min/max value
而是需要遍历 actions
,在 value
改变时,同时改变 action
的值,实现记录,以便最终返回
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
|
def minimax(board):
"""
Returns the optimal action for the current player on the board.
"""
current_player = player(board)
possible_actions = actions(board)
best_action = None
if terminal(board):
return None
if current_player is X:
best_value = float('-inf')
for action in actions(board):
value = min_value(result(board,action))
if value > best_value:
best_value = value
best_action = action
if current_player is O:
best_value = float('inf')
for action in actions(board):
value = min_value(result(board,action))
if value < best_value:
best_value = value
best_action = action
return best_action
def max_value(board):
v = float('-inf')
if terminal(board):
return utility(board)
for action in actions(board):
v = max(v, min_value(result(board,action)))
return v
def min_value(board):
v = float('inf')
if terminal(board):
return utility(board)
for action in actions(board):
v = min(v, max_value(result(board,action)))
return v
|