The first step in the policy improvement algorithm is to use , ,
and
to solve the following set of two equations
for the unknown values of
and .
In the interactive routine, the
and
are copied from the appropriate tables displayed along the bottom of the
screen. For state ,
the initial decision is .
For state ,
the initial decision is .
The resulting equations are shown below, with their coefficients highlighted
in the accompanying tables.
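
As a rough illustration of this value-determination step, the sketch below solves a pair of equations for a fixed policy in a two-state problem. It assumes a discounted formulation, v_i = q_i + beta * sum_j p_ij * v_j, which may differ from the one used in the interactive routine. The function name, transition probabilities, rewards, and discount factor are all hypothetical and are not copied from the tables on the screen.

```python
def evaluate_policy(P, q, beta):
    """Solve v = q + beta * P v for a two-state chain under a fixed policy.

    P    -- 2x2 transition matrix for the chosen decisions (assumed values)
    q    -- expected immediate rewards for the chosen decisions (assumed values)
    beta -- discount factor, 0 <= beta < 1 (assumed formulation)
    """
    # Rearrange to the linear system (I - beta * P) v = q.
    a11 = 1.0 - beta * P[0][0]
    a12 = -beta * P[0][1]
    a21 = -beta * P[1][0]
    a22 = 1.0 - beta * P[1][1]

    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("the two equations are not independent")

    # Cramer's rule for a 2x2 system.
    v1 = (q[0] * a22 - a12 * q[1]) / det
    v2 = (a11 * q[1] - a21 * q[0]) / det
    return v1, v2


# Hypothetical data for the initial decision in each state.
P = [[0.5, 0.5],
     [0.4, 0.6]]
q = [6.0, -3.0]
beta = 0.9

v1, v2 = evaluate_policy(P, q, beta)
print("v1 =", v1, "v2 =", v2)
```

Substituting the returned values back into each equation is a quick way to check the solution; the same check applies to the equations produced by the routine.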