Figure 2.4 gives a general template of search algorithms, expressed using the state-space representation. At any point during the search, there will be three kinds of states:
The set of alive states is stored in a priority queue, , for which a priority function must be specified. The only significant difference between various search algorithms is the particular function used to sort . Many variations will be described later, but for the time being, it might be helpful to pick one. Therefore, assume for now that is a common FIFO (First-In First-Out) queue; whichever state has been waiting the longest will be chosen when is called. The rest of the general search algorithm is quite simple. Initially, contains the initial state . A while loop is then executed, which terminates only when is empty. This will only occur when the entire graph has been explored without finding any goal states, which results in a FAILURE (unless the reachable portion of is infinite, in which case the algorithm should never terminate). In each while iteration, the highest ranked element, , of is removed. If lies in , then it reports SUCCESS and terminates; otherwise, the algorithm tries applying every possible action, . For each next state, , it must determine whether is being encountered for the first time. If it is unvisited, then it is inserted into ; otherwise, there is no need to consider it because it must be either dead or already in .
The algorithm description in Figure 2.4 omits several details that often become important in practice. For example, how efficient is the test to determine whether in line 4? This depends, of course, on the size of the state space and on the particular representations chosen for and . At this level, we do not specify a particular method because the representations are not given.
One important detail is that the existing algorithm only indicates whether a solution exists, but does not seem to produce a plan, which is a sequence of actions that achieves the goal. This can be fixed by inserting a line after line 7 that associates with its parent, . If this is performed each time, one can simply trace the pointers from the final state to the initial state to recover the plan. For convenience, one might also store which action was taken, in addition to the pointer from to .
Lines 8 and 9 are conceptually simple, but how can one tell whether has been visited? For some problems the state transition graph might actually be a tree, which means that there are no repeated states. Although this does not occur frequently, it is wonderful when it does because there is no need to check whether states have been visited. If the states in all lie on a grid, one can simply make a lookup table that can be accessed in constant time to determine whether a state has been visited. In general, however, it might be quite difficult because the state must be compared with every other state in and with all of the dead states. If the representation of each state is long, as is sometimes the case, this will be very costly. A good hashing scheme or another clever data structure can greatly alleviate this cost, but in many applications the computation time will remain high. One alternative is to simply allow repeated states, but this could lead to an increase in computational cost that far outweighs the benefits. Even if the graph is very small, search algorithms could run in time exponential in the size of the state transition graph, or the search may not terminate at all, even if the graph is finite.
One final detail is that some search algorithms will require a cost to be computed and associated with every state. If the same state is reached multiple times, the cost may have to be updated, which is performed in line 12, if the particular search algorithm requires it. Such costs may be used in some way to sort the priority queue, or they may enable the recovery of the plan on completion of the algorithm. Instead of storing pointers, as mentioned previously, the optimal cost to return to the initial state could be stored with each state. This cost alone is sufficient to determine the action sequence that leads to any visited state. Starting at a visited state, the path back to can be obtained by traversing the state transition graph backward in a way that decreases the cost as quickly as possible in each step. For this to succeed, the costs must have a certain monotonicity property, which is obtained by Dijkstra's algorithm and search, and will be introduced in Section 2.2.2. More generally, the costs must form a navigation function, which is considered in Section 8.2.2 as feedback is incorporated into discrete planning.
Steven M LaValle 2012-04-20