Intelligent control through learning and optimization
AMATH / CSE 579
Spring 2015: MW 3:30-4:50, EEB 045
Emo Todorov
Office: CSE 434
Email: todorov@cs.washington.edu
Course Description
Design of near-optimal controllers for complex dynamical systems, using
analytical techniques, machine learning, and optimization. Topics from
deterministic and stochastic optimal control, reinforcement learning and
dynamic programming, numerical optimization in the context of control, and
robotics. Prerequisites: vector calculus, linear algebra, and Matlab.
Recommended: differential equations, stochastic processes, and optimization.
Homework
Homework 1, due May 3
Homework 2, due May 24
Homework 3, due June 12
Office hours: Thursday, June 11, 3:00pm to 4:30pm, CSE 434.
Code
MDP solver: all problem formulations and algorithms (a generic value-iteration sketch appears after this list)
acrobot.m: acrobot dynamics
testacrobot.m: visualize acrobot dynamics
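For orientation, below is a minimal value-iteration sketch for a finite MDP in Matlab. It is not the interface of the posted MDP solver; the variable names (P, L, gamma, V), the random test problem, and the discounted-cost formulation are illustrative assumptions only.

% Minimal value iteration for a finite MDP (illustrative sketch, not the posted solver).
% Assumed data layout: P(:,:,a) is the transition matrix under action a,
% L(x,a) is the immediate cost of taking action a in state x, gamma is the discount.

nX = 50; nA = 4; gamma = 0.95;
P = rand(nX, nX, nA);                        % random transition kernel for testing
for a = 1:nA
    P(:,:,a) = bsxfun(@rdivide, P(:,:,a), sum(P(:,:,a), 2));   % make each row a distribution
end
L = rand(nX, nA);                            % random state-action costs

V = zeros(nX, 1);                            % cost-to-go estimate
for iter = 1:1000
    Q = zeros(nX, nA);
    for a = 1:nA
        Q(:,a) = L(:,a) + gamma * P(:,:,a) * V;    % Bellman backup for each action
    end
    [Vnew, policy] = min(Q, [], 2);                % greedy cost-to-go and greedy policy
    if max(abs(Vnew - V)) < 1e-8
        V = Vnew;
        break;
    end
    V = Vnew;
end
fprintf('value iteration stopped after %d iterations\n', iter);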
Lecture slides
Introduction
Markov Decision Processes and Bellman Equations
Controlled Diffusions and Hamilton-Jacobi-Bellman Equations
Linearly-Quadratic-Gaussian Controllers and Kalman Filters
Pontryagin's Maximum Principle
Trajectory Optimization
Linearly-Solvable Stochastic Optimal Control Problems
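For reference, the two equations that anchor the first half of the slides are the discrete-time Bellman equation and the continuous-time Hamilton-Jacobi-Bellman (HJB) equation. The notation below (cost ell, dynamics f, noise matrix F, cost-to-go v, discount gamma) is one common convention and not necessarily the one used in the slides:

% Bellman equation for a discounted infinite-horizon MDP:
v(x) = \min_u \left\{ \ell(x,u) + \gamma \sum_{x'} p(x' \mid x, u)\, v(x') \right\}

% HJB equation for the controlled diffusion dx = f(x,u)\,dt + F(x,u)\,d\omega:
-v_t(x,t) = \min_u \left\{ \ell(x,u) + f(x,u)^\top v_x(x,t)
            + \tfrac{1}{2} \operatorname{tr}\!\left( F(x,u) F(x,u)^\top v_{xx}(x,t) \right) \right\}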
Relevant research papers
Least-Squares Policy Iteration
An Analysis of Linear Models, Linear Value-Function Approximation, and Feature Selection for Reinforcement Learning
Value Function Approximation in Reinforcement Learning Using the Fourier Basis
Least Squares Solutions of the HJB Equation With Neural Network Value-Function Approximators
Reinforcement Learning In Continuous Time and Space
Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes
Dynamical Movement Primitives: Learning Attractor Models for Motor Behaviors
An Iterative Path Integral Stochastic Optimal Control Approach for Learning Robotic Tasks
Policy Gradient Methods for Reinforcement Learning with Function Approximation
Compositionality of Optimal Control Laws
Policy Gradients in Linearly-Solvable MDPs
Robot Trajectory Optimization using Approximate Inference
A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning
Natural Actor-Critic
SIMBICON: Simple Biped Locomotion Control
Optimizing Walking Controllers for Uncertain Inputs and Environments
Trajectory Optimization for Full-Body Movements with Complex Contacts
Dynamic Optimization of Human Walking
Computer Optimization of a Minimal Biped Model Discovers Walking and Running
Optimal Sensorimotor Transformations for Balance
Synthesis of Detailed Hand Manipulations Using Contact Sampling
Continuous Character Control with Low-Dimensional Embeddings
Motion Fields for Interactive Character Animation
General Readings
R. Sutton and A. Barto (1998) Reinforcement learning: An introduction (online book)
E. Todorov (2006) Optimal control theory (book chapter)
D. Bertsekas (2008) Dynamic programming (lecture slides)
R. Tedrake (2009) Underactuated robotics: Learning, planning and control (lecture notes)
B. Van Roy (2004) Approximate dynamic programming (lecture notes)
P. Abbeel (2009) Advanced robotics (lecture slides)