I suggest the following changes to chapter4 gamblers_problem so that the plot shows all best actions for a state when several of them tie:
x_axis = []
y_axis = []
......
# line 63
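# round once so that returns differing only by floating-point noise count as ties
rounded_returns = np.round(action_returns[1:], 5)
# +1 maps indices of the [1:] slice back to indices into actions
max_values = np.where(rounded_returns == np.amax(rounded_returns))[0] + 1
x_axis.extend([state] * len(max_values))
for i in max_values:
    y_axis.append(actions[i])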
......
plt.scatter(x_axis, y_axis)
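
To check the tie-collecting step on its own, here is a minimal, self-contained sketch. The helper name `tied_best_actions` and the toy numbers are hypothetical and not taken from the repository. It only assumes, as in the snippet above, that index 0 is skipped and that returns are compared after rounding to 5 decimals.

```python
import numpy as np


def tied_best_actions(actions, action_returns, decimals=5):
    """Return every action whose rounded return equals the rounded maximum.

    Index 0 is skipped, mirroring the [1:] slice in the snippet above.
    (Hypothetical helper, for illustration only.)
    """
    rounded = np.round(np.asarray(action_returns, dtype=float)[1:], decimals)
    # +1 maps indices of the [1:] slice back to indices into `actions`
    best = np.flatnonzero(rounded == rounded.max()) + 1
    return np.asarray(actions)[best]


# Toy data (made up): five actions for one state; actions 1 and 3 tie
# once the returns are rounded to 5 decimals.
state = 4
actions = np.arange(5)
action_returns = [0.0, 0.4, 0.25, 0.4000001, 0.1]

x_axis, y_axis = [], []
best = tied_best_actions(actions, action_returns)
x_axis.extend([state] * len(best))  # repeat the state once per tied action
y_axis.extend(best.tolist())
```

Passing `x_axis` and `y_axis` to `plt.scatter` then draws one point per tied action, so a state with several best actions shows up as a vertical stack of points.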