Policy Improvement Algorithm - Changing Policies
Old Policy:

New Policy:

After one iteration of the policy improvement algorithm, the policy has been changed (i.e., improved) from using "Operator repair" when the computer is down to using "Expert repair" when the computer is down.
Is the current solution optimal?
We do not know yet. The algorithm confirms optimality only when an iteration leaves the policy unchanged, and here the new policy is different from the old policy. Thus, another iteration of the policy improvement algorithm needs to be performed.
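The stopping rule can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the model has only two states (computer up, computer down), the costs and transition probabilities are invented and are not the data of this example, and the function names are hypothetical. It shows only the mechanics of repeating value determination and policy improvement until the policy stops changing.

```python
from fractions import Fraction as F

# Hypothetical two-state model: state 0 = computer up, state 1 = computer down.
# All costs and probabilities are made up for illustration only.
# ACTIONS[state][action] = (expected cost, probability of moving to state 0)
ACTIONS = {
    0: {"operate": (F(0), F(9, 10))},
    1: {"operator repair": (F(50), F(6, 10)),
        "expert repair": (F(60), F(9, 10))},
}

def value_determination(policy):
    """Solve g + v_i = C_i + sum_j p_ij * v_j with v_1 fixed at 0 (two states only)."""
    c0, p00 = ACTIONS[0][policy[0]]
    c1, p10 = ACTIONS[1][policy[1]]
    v0 = (c0 - c1) / (1 - p00 + p10)
    g = c1 + p10 * v0
    return g, (v0, F(0))

def policy_improvement(policy, v):
    """In each state, pick the action minimizing C_ik + sum_j p_ij(k) * v_j - v_i."""
    new_policy = {}
    for state, choices in ACTIONS.items():
        def test_quantity(action):
            cost, p_to_0 = choices[action]
            return cost + p_to_0 * v[0] + (1 - p_to_0) * v[1] - v[state]
        # On a tie, keep the current action so the policy does not flip needlessly.
        new_policy[state] = min(
            choices, key=lambda a: (test_quantity(a), a != policy[state])
        )
    return new_policy

def policy_iteration(policy):
    """Repeat both steps until the new policy equals the old policy."""
    history = [policy]
    while True:
        g, v = value_determination(policy)
        new_policy = policy_improvement(policy, v)
        if new_policy == policy:  # unchanged policy: stop, it is optimal
            return policy, g, history
        history.append(new_policy)
        policy = new_policy

start = {0: "operate", 1: "operator repair"}
final_policy, gain, history = policy_iteration(start)
print(final_policy[1], gain, len(history))
```

With these invented numbers, the first pass switches the down-state decision from "operator repair" to "expert repair", and the second pass leaves the policy unchanged, which is the signal to stop.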
Next we will determine the set of equations that need to be solved for the new policy.