Probabilistic - Total
Modelled as MDPs (also called fully observable MDPs).
Output: Policy (State → Action)
Bellman Equation
$V^*(s) = \min_{a \in A(s)} \Big[ c(a) + \sum_{s' \in S} P(s' \mid s, a)\, V^*(s') \Big]$
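
One common way to solve the Bellman equation numerically is value iteration: apply the right-hand side repeatedly as an update until the values stop changing, then read off the minimising action in each state as the policy. The sketch below is a minimal illustration, not a method taken from these notes. It assumes a cost-minimising MDP with absorbing goal states whose value is fixed at 0 (otherwise the undiscounted sum need not converge). The toy MDP, the state and action names, and the helper names `q_value` and `value_iteration` are all hypothetical.

```python
# Minimal value-iteration sketch for V*(s) = min_a [ c(a) + sum_s' P(s'|s,a) V*(s') ].
# Assumption: goal states are absorbing and have value 0.

def q_value(s, a, V, cost, P):
    """Expected cost of taking action a in state s, then following V."""
    return cost[a] + sum(p * V[s2] for s2, p in P[(s, a)].items())

def value_iteration(states, A, cost, P, goals, tol=1e-9, max_iters=10_000):
    V = {s: 0.0 for s in states}
    for _ in range(max_iters):
        delta = 0.0
        for s in states:
            if s in goals:
                continue  # goal value stays 0
            best = min(q_value(s, a, V, cost, P) for a in A[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # Greedy policy: State -> Action, defined on non-goal states.
    policy = {
        s: min(A[s], key=lambda a: q_value(s, a, V, cost, P))
        for s in states if s not in goals
    }
    return V, policy

# Hypothetical toy MDP: one non-goal state "s0" and one goal state "g".
states = ["s0", "g"]
goals = {"g"}
A = {"s0": ["safe", "risky"]}          # A(s): actions applicable in s
cost = {"safe": 2.0, "risky": 1.0}     # c(a): cost depends on the action only
P = {                                  # P(s'|s,a) as nested dicts
    ("s0", "safe"): {"g": 1.0},
    ("s0", "risky"): {"g": 0.8, "s0": 0.2},
}

V, policy = value_iteration(states, A, cost, P, goals)
print(V, policy)
```

In this toy problem the fixed point can be checked by hand: choosing "risky" gives V = 1 + 0.2 V, so V = 1.25, which is cheaper than the cost 2 of "safe". The computed policy should therefore map "s0" to "risky".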