Job Shop Scheduling

Computation of optimal policy


•	Given the optimal value function V*(s), do one
	Bellman backup in each state s; the action that
	maximises the backed-up term (immediate reward
	plus expected value of the next state) is
	the optimal action for s.
•	The optimal policy is stationary (time
	independent), which is intuitive in the infinite
	horizon case: the remaining horizon looks the
	same at every time step.
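
The extraction step above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the dictionary layout of P and R, the discount factor gamma = 0.9, the helper value_iteration and the two-state toy MDP ("idle"/"busy") are all assumptions made for the example and do not come from the slides.

```python
def backup(P, R, V, s, a, gamma):
    """Backed-up value of taking action a in state s (assumed discounted reward model)."""
    return R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])


def value_iteration(P, R, gamma=0.9, sweeps=500):
    """Hypothetical helper: approximate V* by repeated Bellman backups."""
    V = {s: 0.0 for s in P}
    for _ in range(sweeps):
        V = {s: max(backup(P, R, V, s, a, gamma) for a in P[s]) for s in P}
    return V


def greedy_policy(P, R, V, gamma=0.9):
    """One Bellman backup per state; keep the action with the largest backed-up value."""
    policy = {}
    for s in P:
        policy[s] = max(P[s], key=lambda a: backup(P, R, V, s, a, gamma))
    return policy


# Toy MDP (made up for illustration): P[s][a] is a list of (next_state, probability).
P = {
    "idle": {"wait": [("idle", 1.0)], "start": [("busy", 1.0)]},
    "busy": {"work": [("busy", 0.5), ("idle", 0.5)]},
}
R = {"idle": {"wait": 0.0, "start": 1.0}, "busy": {"work": 2.0}}

V_star = value_iteration(P, R)
policy = greedy_policy(P, R, V_star)
print(policy)  # {'idle': 'start', 'busy': 'work'}
```

The returned policy is a plain mapping from state to action with no time index, which is the stationarity property stated in the second bullet.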