Before detailing the method further, the existing names for it
deserve some explanation. Consider the term *reinforcement learning*.
In machine learning, most decision-theoretic models are expressed in
terms of *reward* instead of *cost*. Thus, the task is to
make decisions or find plans that *maximize* a *reward
functional*. Choosing good actions under this model appears to
provide positive reinforcement in the form of a reward. Therefore,
the term *reinforcement* is used. If cost and minimization are used
instead, alternative names might be *decision-theoretic
learning* or *cost-based learning*.

The term *learning* is associated with the problem because
estimating the probability distribution $P(\theta \mid x,u)$
or $P(x' \mid x,u)$
is clearly a learning problem. However, it is important to remember
that there is also the planning problem of computing cost-to-go
functions (or reward-to-go functions) and determining a plan that
optimizes the costs (or rewards). Therefore, the term *reinforcement planning* may be just as reasonable.

The general framework is referred to as *neuro-dynamic
programming* in [97] because the formulation and resulting
algorithms are based on dynamic programming. Most often, a variant of
value iteration is obtained. The *neuro* part refers to a family
of functions that can be used to approximate plans and cost-to-go
values. This term is fairly specific, however, because other function
families may be used. Furthermore, for some problems (e.g., over
small, finite state spaces), the cost values and plans are represented
without approximation.

The name *simulation-based methods*, used in [95], is perhaps
one of the most accurate (when used in the context of
dynamic programming). Thus, *simulation-based dynamic
programming* or *simulation-based planning* nicely reflects the
framework explained here. The term *simulation* comes from the
fact that a Monte Carlo simulator is used to generate samples from
which the required distributions are learned during planning. You
are, of course, welcome to use your favorite name, but keep in mind
that under all of the names, the idea remains the same. This will be
helpful to remember if you intend to study related literature.
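
To make the two ingredients behind all of these names concrete, the following is a minimal sketch, not the method of [95] or [97]. It assumes a hypothetical five-state line world, a made-up noise model inside `simulate`, and unit cost per stage; all function and variable names are illustrative. The learning step estimates the transition distribution from Monte Carlo samples, and the planning step runs value iteration on the estimate to obtain cost-to-go values and a plan.

```python
import random
from collections import defaultdict

# Hypothetical toy problem (an assumption for illustration):
# states 0..4 on a line, goal at state 4, actions move left or right.
STATES = range(5)
ACTIONS = (-1, +1)
GOAL = 4


def simulate(x, u, rng):
    """Monte Carlo simulator: the planner may sample it but not inspect it."""
    # Assumed noise model: the action succeeds with probability 0.8;
    # otherwise the state does not change.
    step = u if rng.random() < 0.8 else 0
    return min(max(x + step, 0), GOAL)


def estimate_transitions(samples_per_pair, rng):
    """Learning step: estimate P(x' | x, u) from sample frequencies."""
    counts = defaultdict(lambda: defaultdict(int))
    for x in STATES:
        for u in ACTIONS:
            for _ in range(samples_per_pair):
                counts[(x, u)][simulate(x, u, rng)] += 1
    return {
        key: {x2: n / samples_per_pair for x2, n in nxt.items()}
        for key, nxt in counts.items()
    }


def expected_cost(P, G, x, u):
    """One-stage cost (1 per stage) plus expected cost-to-go."""
    return 1.0 + sum(p * G[x2] for x2, p in P[(x, u)].items())


def value_iteration(P, iterations=200):
    """Planning step: cost-to-go values and a plan from the estimate."""
    G = {x: 0.0 for x in STATES}
    for _ in range(iterations):
        G = {
            x: 0.0 if x == GOAL
            else min(expected_cost(P, G, x, u) for u in ACTIONS)
            for x in STATES
        }
    plan = {
        x: min(ACTIONS, key=lambda u: expected_cost(P, G, x, u))
        for x in STATES if x != GOAL
    }
    return G, plan


rng = random.Random(0)
P_hat = estimate_transitions(2000, rng)   # learning
G, plan = value_iteration(P_hat)          # planning
```

In this sketch the two steps are run one after the other for clarity. The state space is small and finite, so the cost-to-go values and the plan are stored in plain dictionaries without any function approximation.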

