# Intelligent control through learning and optimization

## AMATH / CSE 579

### Spring 2015: MW 3:30-4:50, EEB 045

### Emo Todorov

Office: CSE 422

Email: todorov@cs.washington.edu

## Course Description

Design of near-optimal controllers for complex dynamical systems, using
analytical techniques, machine learning, and optimization. Topics from
deterministic and stochastic optimal control, reinforcement learning and
dynamic programming, numerical optimization in the context of control, and
robotics. Prerequisite: vector calculus, linear algebra, and Matlab.
Recommended: differential equations, stochastic processes, and optimization.

## Homework

Homework 1, due April 30

## Code

MDP solver: all problem formulations and algorithms

## Lecture slides

Introduction

Markov Decision Processes and Bellman Equations

Controlled Diffusions and Hamilton-Jacobi-Bellman Equations

## Future lecture slides

Linearly-Solvable Stochastic Optimal Control Problems

Linear-Quadratic-Gaussian Controllers and Kalman Filters

Research Talk

Pontryagin's Maximum Principle

Trajectory Optimization

## Relevant research papers

Least-Squares Policy Iteration

An Analysis of Linear Models, Linear Value-Function Approximation, and
Feature Selection for Reinforcement Learning

Value Function Approximation in Reinforcement Learning Using the Fourier Basis

Least Squares Solutions of the HJB Equation With Neural Network Value-Function Approximators

Reinforcement Learning In Continuous Time and Space

Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes

Dynamical Movement Primitives: Learning Attractor Models for Motor Behaviors

An Iterative Path Integral Stochastic Optimal Control Approach for Learning Robotic Tasks

Policy Gradient Methods for Reinforcement Learning with Function Approximation

Compositionality of Optimal Control Laws

Policy Gradients in Linearly-Solvable MDPs

Robot Trajectory Optimization using Approximate Inference

A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning

Natural Actor-Critic

SIMBICON: Simple Biped Locomotion Control

Optimizing Walking Controllers for Uncertain Inputs and Environments

Trajectory Optimization for Full-Body Movements with Complex Contacts

Dynamic Optimization of Human Walking

Computer Optimization of a Minimal Biped Model Discovers Walking and Running

Optimal Sensorimotor Transformations for Balance

Synthesis of Detailed Hand Manipulations Using Contact Smoothing

Continuous Character Control with Low-Dimensional Embeddings

Motion Fields for Interactive Character Animation

## General Readings

R. Sutton and A. Barto (1998) Reinforcement learning: An introduction (online book)

E. Todorov (2006) Optimal control theory (book chapter)

D. Bertsekas (2008) Dynamic programming (lecture slides)

R. Tedrake (2009) Underactuated robotics: Learning, planning and control (lecture notes)

B. Van Roy (2004) Approximate dynamic programming (lecture notes)

P. Abbeel (2009) Advanced robotics (lecture slides)