
And the answer is easily obtained if you just

subtract 3 for each step.

We get 88 and 85 over here.

We could also reach the same value going around here.

So, 85 would have been the right answer,

and this will be the value function after convergence.

It's beautiful to see that the value function is effectively

the distance to the positive absorbing state times 3

subtracted from 100.

So, we have 97, 94, 91, 88, 85 and so on.
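
As a rough sketch of that pattern, the snippet below runs value iteration on a small deterministic grid. The 3x4 layout, the wall position, and names such as `value_iteration` and `next_state` are assumptions made for illustration, not taken from the lecture; only the goal value of 100 and the cost of 3 per step come from the discussion above.

```python
# Assumed layout: 3 rows x 4 columns, goal in the top-right corner,
# one wall cell in the middle. All of this is hypothetical.
ROWS, COLS = 3, 4
GOAL = (0, 3)          # positive absorbing state
WALL = (1, 1)          # blocked cell
GOAL_VALUE = 100.0
STEP_COST = -3.0       # "subtract 3 for each step"
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def next_state(state, move):
    """Deterministic transition: move if possible, otherwise stay put."""
    r, c = state[0] + move[0], state[1] + move[1]
    if 0 <= r < ROWS and 0 <= c < COLS and (r, c) != WALL:
        return (r, c)
    return state


def value_iteration():
    """Sweep all states until no value changes (no discounting)."""
    states = [(r, c) for r in range(ROWS) for c in range(COLS)
              if (r, c) != WALL]
    values = {s: 0.0 for s in states}
    values[GOAL] = GOAL_VALUE
    while True:
        changed = False
        for s in states:
            if s == GOAL:
                continue  # absorbing state keeps its fixed value
            best = max(STEP_COST + values[next_state(s, m)] for m in MOVES)
            if best != values[s]:
                values[s] = best
                changed = True
        if not changed:
            return values


values = value_iteration()
for r in range(ROWS):
    print([values.get((r, c)) for c in range(COLS)])
```

Under these assumptions, the cell next to the goal comes out at 97 and the far corner at 85, whichever way around the wall you go.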

This is a degenerate case,

in which we have a deterministic state transition function.

It gets trickier to calculate in the stochastic case.