Policy evaluation
• Given a policy $\Pi: S \to A$, find the value of each state under this policy.
• $V^{\Pi}(s) = R(s) + c(\Pi(s)) + \gamma \sum_{s' \in S} \Pr(s' \mid a, s)\, V^{\Pi}(s')$, where $a = \Pi(s)$
• This is a system of |S| linear equations in |S| unknowns, one $V^{\Pi}(s)$ per state.
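
Stacking the equation over all states gives the matrix form $(I - \gamma P_{\Pi})\, V = R + c_{\Pi}$, where row s of $P_{\Pi}$ is $\Pr(\cdot \mid \Pi(s), s)$. The following is a minimal Python sketch of solving it, assuming NumPy and a discount $\gamma < 1$. The function name `evaluate_policy`, the array layouts, and the two-state example are illustrative assumptions, not part of the slide.

```python
import numpy as np

def evaluate_policy(P, R, c, policy, gamma):
    """Solve (I - gamma * P_pi) V = R + c_pi for V.

    Assumed layouts (hypothetical, chosen for this sketch):
      P[a, s, s2] = Pr(s2 | a, s)   shape (|A|, |S|, |S|)
      R[s]        = R(s)            shape (|S|,)
      c[a]        = c(a)            shape (|A|,), added as on the slide
      policy[s]   = index of the action chosen in state s
    """
    n = len(R)
    states = np.arange(n)
    # Row s of P_pi is the transition distribution under action policy[s].
    P_pi = P[policy, states, :]
    # Right-hand side: R(s) + c(policy(s)) for every state.
    rhs = R + c[policy]
    return np.linalg.solve(np.eye(n) - gamma * P_pi, rhs)

# Toy example: action 0 stays in place, action 1 switches state.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [1.0, 0.0]]])
R = np.array([0.0, 1.0])
c = np.array([0.0, -0.5])
policy = np.array([1, 0])  # switch in state 0, stay in state 1
V = evaluate_policy(P, R, c, policy, gamma=0.9)
```

A direct solve costs roughly O(|S|^3) time, so for large state spaces an iterative method is often used instead. With $\gamma < 1$ and each row of $P_{\Pi}$ summing to 1, the matrix $I - \gamma P_{\Pi}$ is invertible, so the solution is unique.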