Disjunctive - Total
Output: Policy (State -> Action)
Bellman Equation
$V^*(s) = \min_{a \in A(s)} \left[ c(a) + \max_{s' \in F(a,s)} V^*(s') \right]$
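As a rough illustration, the equation can be solved by repeated min-max backups. The sketch below is not from the notes and rests on assumptions: goal states are fixed at value 0, all other states start at infinity, every action has at least one outcome, and costs are positive. The names `minimax_value_iteration`, `actions`, `cost`, and `outcomes` are hypothetical stand-ins for A(s), c(a), and F(a,s).

```python
import math

def minimax_value_iteration(states, goals, actions, cost, outcomes, max_iters=1000):
    """Worst-case (min-max) value iteration; returns (V, policy).

    Assumptions (not in the notes): goal states have value 0,
    non-goal states start at infinity, outcomes(a, s) is never empty.
    """
    V = {s: (0.0 if s in goals else math.inf) for s in states}
    policy = {}
    for _ in range(max_iters):
        changed = False
        for s in states:
            if s in goals:
                continue
            best_value, best_action = math.inf, None
            for a in actions(s):
                # Adversarial outcome: the worst successor value in F(a, s).
                worst = max(V[s2] for s2 in outcomes(a, s))
                value = cost(a) + worst
                if value < best_value:
                    best_value, best_action = value, a
            if best_value < V[s]:
                V[s] = best_value
                policy[s] = best_action
                changed = True
        if not changed:
            break
    return V, policy

# Hypothetical toy problem: "risky" is cheap but may land in s1,
# where the only repair action is expensive.
A = {"s0": ["safe", "risky"], "s1": ["fix"]}
C = {"safe": 3, "risky": 1, "fix": 5}
F = {("safe", "s0"): ["g"], ("risky", "s0"): ["g", "s1"], ("fix", "s1"): ["g"]}

V, policy = minimax_value_iteration(
    states=["s0", "s1", "g"],
    goals={"g"},
    actions=lambda s: A[s],
    cost=lambda a: C[a],
    outcomes=lambda a, s: F[(a, s)],
)
print(V)       # {'s0': 3.0, 's1': 5.0, 'g': 0.0}
print(policy)  # {'s0': 'safe', 's1': 'fix'}
```

In this toy problem the policy picks "safe" at s0: "risky" costs 1 but its worst outcome leads to s1 (value 5), giving 6, while "safe" guarantees 3.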