Policy Improvement Algorithm - Improving Policy
Shown below is the above expression evaluated for state  and . The  and  used in these equations are highlighted at the bottom of the screen.
Decision  clearly minimizes the expression for state 0.
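The improvement step can be sketched in code: for each state, evaluate the expression under every available decision and keep the decision that gives the smallest result. The sketch below is a minimal illustration, not this tutorial's own example. It assumes the expression has the common average-cost form C_ik + sum_j p_ij(k) v_j - v_i, and the names `improve_policy`, `costs`, `trans`, and `values`, along with all of the numbers, are hypothetical.

```python
def improve_policy(costs, trans, values):
    """Return, for each state, the decision that minimizes the expression.

    costs[i][k]  : immediate cost of decision k in state i (hypothetical data)
    trans[i][k]  : list of transition probabilities to states 0, 1, ...
    values[j]    : current value estimate for state j
    Assumed form : costs[i][k] + sum_j trans[i][k][j] * values[j] - values[i]
    """
    policy = {}
    for i in costs:
        best_decision, best_value = None, float("inf")
        for k in costs[i]:
            expected = sum(p * values[j] for j, p in enumerate(trans[i][k]))
            expression = costs[i][k] + expected - values[i]
            if expression < best_value:
                best_decision, best_value = k, expression
        policy[i] = best_decision
    return policy


# Hypothetical two-state problem with decisions 1 and 2 in each state.
costs = {0: {1: 0.0, 2: 4.0}, 1: {1: 3.0, 2: 1.0}}
trans = {
    0: {1: [0.5, 0.5], 2: [1.0, 0.0]},
    1: {1: [0.0, 1.0], 2: [1.0, 0.0]},
}
values = [0.0, 2.0]

new_policy = improve_policy(costs, trans, values)
print(new_policy)
```

With these made-up numbers, state 0 gives 1.0 for decision 1 and 4.0 for decision 2, so decision 1 is kept. State 1 gives 3.0 and -1.0, so decision 2 is kept.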