Policy Improvement Algorithm - Generating Equations
The first step in the policy improvement algorithm is to use , and  to solve the following set of 2 equations,
for the unknown values of  and .
In the interactive routine, the  and  are copied from the appropriate tables displayed along the bottom of the screen. For state , the initial decision is . For state , the initial decision is . The resulting equations are shown below, with the coefficients highlighted in the tables below.
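As a rough illustration of this step, the sketch below assumes the usual value-determination form for a two-state problem, g + v_i = c_i + p_i1 v_1 + p_i2 v_2 for i = 1, 2, with v_2 set to 0 so that the two unknowns are g and v_1. The symbol names, the function name value_determination_two_state, and the numbers are assumptions chosen for illustration. They are not the values from the interactive routine's tables.

```python
from fractions import Fraction


def value_determination_two_state(p, c):
    """Solve g + v_i = c_i + sum_j p[i][j] * v_j for two states, with v_2 = 0.

    p: 2x2 transition matrix under the current policy (p[0] is the row for
       state 1, p[1] the row for state 2); each row sums to 1.
    c: expected immediate cost (or reward) of states 1 and 2 under that policy.
    Returns (g, v1): the long-run average per period and the value of
    state 1 relative to state 2.
    """
    # With v_2 = 0 the two equations reduce to
    #   g + (1 - p11) * v1 = c1
    #   g -      p21  * v1 = c2
    # Subtracting the second from the first eliminates g.
    denom = 1 - p[0][0] + p[1][0]
    if denom == 0:
        # Happens when both states are absorbing; the system is then singular.
        raise ValueError("equations are singular for this policy")
    v1 = (c[0] - c[1]) / denom
    g = c[1] + p[1][0] * v1
    return g, v1


# Hypothetical data for one fixed policy (exact fractions avoid rounding noise).
P = [[Fraction(1, 2), Fraction(1, 2)],
     [Fraction(2, 5), Fraction(3, 5)]]
C = [Fraction(6), Fraction(-3)]

g, v1 = value_determination_two_state(P, C)
print(g, v1)  # prints "1 10"
```

Each coefficient in the reduced equations comes straight from one row of the transition matrix and one entry of the cost vector for the decision currently chosen in that state, which is why changing a decision changes only the equation for that state.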