A different solution
• LP formulation
• Minimise $\sum_{s \in S} V^*(s)$
Under the constraints, for every $s$, $a$:
$$V^*(s) \ge R(s) + c(a) + \gamma \sum_{s' \in S} \Pr(s' \mid a, s)\, V^*(s')$$
This is a big LP (one variable per state, one constraint per state-action pair), so other tricks are used to solve it!
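To make the LP concrete, here is a minimal pure-Python sketch that assembles the objective and the constraints in the standard form "A_ub · V ≤ b_ub", one row per (s, a) pair. The two-state, two-action MDP (`R`, `c`, `P`, `gamma`) and the helper names `build_lp` and `value_iteration` are made up for illustration and are not from the slides. Rather than calling an LP solver, the sketch checks that the value-iteration fixed point is feasible for the LP and that every state has at least one tight constraint, which is what the optimal LP solution looks like.

```python
# Hypothetical toy MDP (all numbers are illustrative assumptions).
R = [1.0, 0.0]                      # R(s): reward for being in state s
c = [0.0, -0.5]                     # c(a): added to the reward, as on the slide
P = [                               # P[a][s][t] = Pr(t | a, s)
    [[0.9, 0.1], [0.1, 0.9]],       # action 0: mostly stay
    [[0.2, 0.8], [0.8, 0.2]],       # action 1: mostly switch
]
gamma = 0.9


def build_lp(R, c, P, gamma):
    """Return (objective, A_ub, b_ub) for: minimise sum_s V(s)
    subject to V(s) >= R(s) + c(a) + gamma * sum_t Pr(t|a,s) V(t).
    Each constraint is rewritten as
    -V(s) + gamma * sum_t Pr(t|a,s) V(t) <= -(R(s) + c(a))."""
    n = len(R)
    objective = [1.0] * n           # coefficients of sum_s V(s)
    A_ub, b_ub = [], []
    for s in range(n):
        for a in range(len(c)):
            row = [gamma * P[a][s][t] for t in range(n)]
            row[s] -= 1.0           # the -V(s) term
            A_ub.append(row)
            b_ub.append(-(R[s] + c[a]))
    return objective, A_ub, b_ub


def value_iteration(R, c, P, gamma, sweeps=2000):
    """Plain value iteration, used here only to cross-check the LP."""
    n = len(R)
    V = [0.0] * n
    for _ in range(sweeps):
        V = [
            max(
                R[s] + c[a] + gamma * sum(P[a][s][t] * V[t] for t in range(n))
                for a in range(len(c))
            )
            for s in range(n)
        ]
    return V


objective, A_ub, b_ub = build_lp(R, c, P, gamma)
V = value_iteration(R, c, P, gamma)

# Slack of each constraint at V: non-negative means the constraint holds.
slacks = [b - sum(r * v for r, v in zip(row, V)) for row, b in zip(A_ub, b_ub)]
print("V =", V)
print("smallest slack =", min(slacks))
```

The three lists are in the shape that generic LP solvers expect. If SciPy is available, for example, they could be passed to `scipy.optimize.linprog(objective, A_ub=A_ub, b_ub=b_ub, bounds=(None, None))`; the explicit `bounds` matters because `linprog` otherwise restricts variables to be non-negative, while values may be negative in general.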