Suppose that the nondeterministic model of nature is used. A dynamic programming recurrence, (10.39), will be derived. This directly yields an iterative approach that computes a plan that minimizes the worst-case cost. The following presentation shadows that of Section 2.3.1; therefore, it may be helpful to refer back to this periodically.

An optimal plan will be found by computing optimal cost-to-go functions. For , let denote the worst-case cost that could accumulate from stage to under the execution of the optimal plan (compare to (2.5))

Inside of the 's and 's of (10.33) are the last terms of the cost functional, (10.8). For simplicity, the ranges of each and in the 's and 's of (10.33) have not been indicated. The optimal cost-to-go for is

which is the same as (2.6) for the predictable case.

Now consider making passes over , each time computing from , as ranges from down to . In the first iteration, is copied from . In the second iteration, is computed for each as (compare to (2.7))

in which and . Since and , substitutions are made into (10.35) to obtain (compare to (2.8))

which computes the costs of all optimal one-step plans from stage to stage .

More generally, can be computed once is given. Carefully study (10.33), and note that it can be written as (compare to (2.9))

by pulling the first cost term out of the sum and by separating the minimization over from the rest, which range from to . The second and do not affect the term; thus, can be pulled outside to obtain (compare to (2.10))

The inner 's and 's represent , which yields the recurrence (compare to (2.11))

Steven M LaValle 2012-04-20