A different solution
LP formulation
Minimise Σ_{s ∈ S} V*(s)
Subject to the constraints
For every s, a:
V*(s) ≥ R(s) + c(a) + γ Σ_{s' ∈ S} Pr(s'|a,s) V*(s')
A big LP: one variable per state and one constraint per (s, a) pair. So other tricks are used to solve it!
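
To make the LP concrete, here is a minimal sketch on a hypothetical two-state, two-action MDP. The numbers in `R`, `C`, `P` and the discount `GAMMA` are invented for illustration and do not come from the slides. The sketch does not call an LP solver. It obtains V* by value iteration (used here only as a convenient stand-in, not necessarily one of the "other tricks" meant above) and then checks the two facts that make the LP work: V* satisfies every constraint, and each state has at least one constraint that holds with equality, so V* cannot be lowered without leaving the feasible region.

```python
# Hypothetical 2-state, 2-action MDP; all numbers are made up for illustration.
GAMMA = 0.9
STATES = [0, 1]
ACTIONS = [0, 1]
R = [0.0, 1.0]     # R(s): state term
C = [0.0, -0.5]    # c(a): action term, added as in the constraint above
# P[a][s][s2] = Pr(s2 | a, s); each row sums to 1
P = [
    [[0.9, 0.1], [0.1, 0.9]],   # action 0
    [[0.2, 0.8], [0.8, 0.2]],   # action 1
]


def backup(V, s, a):
    """Right-hand side of the LP constraint for the pair (s, a)."""
    return R[s] + C[a] + GAMMA * sum(P[a][s][s2] * V[s2] for s2 in STATES)


def value_iteration(tol=1e-10):
    """Iterate V(s) <- max_a backup(V, s, a) until the update is below tol."""
    V = [0.0 for _ in STATES]
    while True:
        new_V = [max(backup(V, s, a) for a in ACTIONS) for s in STATES]
        if max(abs(x - y) for x, y in zip(new_V, V)) < tol:
            return new_V
        V = new_V


def slacks(V):
    """Slack V(s) - rhs(s, a) of every constraint; V is feasible iff all >= 0."""
    return {(s, a): V[s] - backup(V, s, a) for s in STATES for a in ACTIONS}


V_star = value_iteration()
slack = slacks(V_star)

# V* satisfies all |S| * |A| constraints (up to numerical tolerance) ...
feasible = all(v >= -1e-8 for v in slack.values())
# ... and in every state at least one constraint is tight.
tight = all(min(slack[(s, a)] for a in ACTIONS) < 1e-8 for s in STATES)

print("V* =", V_star)
print("feasible:", feasible, "| one tight constraint per state:", tight)
```

Even in this toy case the LP has |S| = 2 variables and |S| x |A| = 4 constraints. The constraint count grows with the product of the state and action counts, which is what makes the LP big for realistic problems.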