What happens if a game has no Nash equilibrium over the space of deterministic strategies? Once again the problem can be alleviated by expanding the strategy space to include randomized strategies. In Section 9.3.3 it was explained that every zero-sum game under Formulation 9.7 has a randomized saddle point on the space of randomized strategies. It was shown by Nash that every nonzero-sum game under Formulation 9.8 has a randomized Nash equilibrium [730]. This is a nice result; however, there are a couple of concerns. There may still exist other admissible equilibria, which means that there is no reliable way to avoid regret unless the players collaborate. The other concern is that randomized Nash equilibria unfortunately cannot be computed using the linear programming approach of Section 9.3.3. The required optimization is instead a form of nonlinear programming [94,664,731], which does not necessarily admit a nice, combinatorial solution.

Recall the definition of randomized strategies from Section
9.3.3. For a pair of randomized strategies,
the expected costs,
and
, can be
computed using (9.57). A pair of
strategies is said to be a *randomized Nash
equilibrium* if

and

In game-theory literature, this is usually referred to as a

Using the cost matrices and , the Nash equilibrium conditions can be written as

and

Unfortunately, the computation of randomized Nash equilibria is
considerably more challenging than computing saddle points. The main
difficulty is that Nash equilibria are not necessarily security
strategies. By using security strategies, it is possible to decouple
the decisions of the players into separate linear programming
problems, as was seen in Example 9.16. For the
randomized Nash equilibrium, the optimization between the players
remains coupled. The resulting optimization is often referred to as
the *linear complementarity problem*. This can be formulated as a
nonlinear programming problem [664,731], which means that
it is a nonlinear optimization that involves both equality and
inequality constraints on the domain (in this particular case, a
*bilinear programming* problem is obtained [59]).

in which the final step uses the fact that and . Similarly, the expected cost for is

If is fixed, then the final equation in (9.75) is linear in ; likewise, if is fixed, then (9.76) is linear in . In the case of computing saddle points for zero-sum games, we were allowed to make this assumption; however, it is not possible here. We must choose to simultaneously optimize (9.75) while and (9.76) while .

It turns out that this problem is simple enough to solve with calculus. Using the classical optimization method of taking derivatives, a candidate solution can be found by computing

(9.77) |

and

(9.78) |

Extrema occur when both of these simultaneously become 0. Solving and yields , which is a randomized Nash equilibrium. The deterministic Nash equilibria are not detected by this method because they occur on the boundary of and , where the derivative is not defined.

The computation method in Example 9.20 did not appear too difficult because there were only two actions per player, and half of the matrix costs were 0. In general, two complicated equations must be solved simultaneously. The expressions, however, are always second-degree polynomials. Furthermore, they each become linear with respect to the other variables if or is held fixed.