PREPRINTS
Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations
Rajeswaran A, Kumar V, Gupta A, Schulman J, Todorov E and Levine S
Movie
Project
Towards generalization and simplicity in continuous control
Rajeswaran A, Lowrey K, Todorov E and Kakade S. To appear in Neural Information Processing Systems 2017.
Movie
arXiv
Goal Directed Dynamics
Todorov E
Movie
Learning dexterous manipulation policies from experience and imitation
Kumar V, Gupta A, Todorov E and Levine S
Movie
Graphical Newton
Srinivasan A and Todorov E
PHD THESES
New techniques in deep representation learning
Galen Andrew (2016). University of Washington
Automated discovery and learning of complex movement behaviors
Igor Mordatch (2015). University of Washington
Design and control of an anthropomorphic robotic hand: Learning advantages from the human body and brain
Zhe (Joseph) Xu (2015). University of Washington
Automating stochastic control
Krishnamurthy Dvijotham (2014). University of Washington
Value-function approximation methods for linearly-solvable Markov decision processes
Mingyuan Zhong (2013). University of Washington
Theory and implementation of biomimetic motor controllers
Yuval Tassa (2011). Hebrew University of Jerusalem
Exploratory studies of human sensorimotor learning with system identification and stochastic optimal control
Alex Simpkins (2009). University of California San Diego
Computational and psychophysical studies of goal-directed arm movements
Dan Liu (2008). University of California San Diego
Optimal control for biological movement systems
Weiwei Li (2006). University of California San Diego
Studies of goal-directed movements
Emanuel Todorov (1998). Massachusetts Institute of Technology
PEER-REVIEWED PUBLICATIONS BY YEAR
 2016 
Real-time state estimation with whole-body multi-contact dynamics: A modified UKF approach
Lowrey K, Dao J and Todorov E. In IEEE/RAS International Conference on Humanoid Robots
Movie
Optimal control with learned local models: Application to dexterous manipulation
Kumar V, Todorov E and Levine S. In IEEE International Conference on Robotics and Automation 2016.
Best Manipulation Paper Award
Movie
Design of a highly biomimetic anthropomorphic robotic hand: Towards artificial limb regeneration
Xu Z and Todorov E. IEEE International Conference on Robotics and Automation 2016.
Movie
 2015 
Interactive control of diverse complex characters with neural networks
Mordatch I, Lowrey K, Andrew G, Popovic Z and Todorov E (2015). In Neural Information Processing Systems.
Movie
Physically consistent state estimation and system identification for contacts
Kolev S and Todorov E (2015). In IEEE/RAS International Conference on Humanoid Robots.
Movie
MuJoCo HAPTIX: A virtual reality system for hand manipulation
Kumar V and Todorov E (2015). In IEEE/RAS International Conference on Humanoid Robots.
Movie
Whole-body model-predictive control applied to the HRP-2 humanoid robot
Koenemann J, Del Prete A, Tassa Y, Todorov E, Stasse O, Bennewitz M and Mansard N (2015). In IEEE/RSJ International Conference on Intelligent Robots and Systems
Movie
Ensemble-CIO: Full-body dynamic motion planning that transfers to physical humanoids
Mordatch I, Lowrey K and Todorov E (2015). In IEEE/RSJ International Conference on Intelligent Robots and Systems
Movie
Simulation tools for model-based robotics: Comparison of Bullet, Havok, MuJoCo, ODE and PhysX
Erez T, Tassa Y and Todorov E (2015). In International Conference on Robotics and Automation
Movie
Convex structured controller design in finite horizon
Dvijotham K, Todorov E and Fazel M (2015). IEEE Transactions on Control of Network Systems, vol 2, issue 1
 2014 
Convex risk-averse control design
Dvijotham K, Todorov E and Fazel M (2014). In IEEE Conference on Decision and Control
Universal convexification via risk-aversion
Dvijotham K, Fazel M and Todorov E (2014). In Uncertainty in Artificial Intelligence
Facebook Best Student Paper
Combining the benefits of function approximation and trajectory optimization
Mordatch I and Todorov E (2014). In Robotics: Science and Systems
Movie
Physically-consistent sensor fusion in contact-rich behaviors
Lowrey K, Kolev S, Tassa Y, Erez T and Todorov E (2014). In IEEE/RSJ International Conference on Intelligent Robots and Systems
Movie
From inverse kinematics to optimal control
Geoffroy P, Mansard N, Raison M, Achiche S, Tassa Y and Todorov E (2014). In Advances in Robot Kinematics 2014
Convex and analytically-invertible dynamics with contacts and constraints: Theory and implementation in MuJoCo
Todorov E (2014). In International Conference on Robotics and Automation
Movie
Control-limited Differential Dynamic Programming
Tassa Y, Mansard N and Todorov E (2014). In International Conference on Robotics and Automation
Movie
Real-time behaviour synthesis for dynamic hand manipulation
Kumar V, Tassa Y, Erez T and Todorov E (2014). In International Conference on Robotics and Automation
Movie
Design, optimization, calibration and a case study of a 3D-printed, low-cost fingertip sensor for robotic manipulation
Xu Z, Kolev S and Todorov E (2014). In International Conference on Robotics and Automation
Movie
 2013 
Animating human lower limbs using Contact-Invariant Optimization
Mordatch I, Wang J, Todorov E and Koltun V (2013). In SIGGRAPH ASIA
Movie
Fast, strong and compliant pneumatic actuation for dexterous tendon-driven hands
Kumar V, Xu Z and Todorov E (2013). In International Conference on Robotics and Automation
Movie
STAC: Simultaneous tracking and calibration
Wu T, Tassa Y, Kumar V, Movellan J and Todorov E (2013). In IEEE/RAS International Conference on Humanoid Robots
Movie
A low-cost and modular, 20-DOF anthropomorphic robotic hand: Design, actuation and modeling
Xu Z, Kumar V and Todorov E (2013). In IEEE/RAS International Conference on Humanoid Robots
Movie
An integrated system for real-time model-predictive control of humanoid robots
Erez T, Lowrey K, Tassa Y, Kumar V, Kolev S and Todorov E (2013). In IEEE/RAS International Conference on Humanoid Robots
Movie
Convex control design via covariance minimization
Dvijotham K, Todorov E and Fazel M (2013). In 51st Annual Allerton Conference on Communication, Control, and Computing
Convexity of optimal linear controller design
Dvijotham K, Theodorou E, Todorov E and Fazel M (2013). In IEEE Conference on Decision and Control
Time-varying nonlinear policy gradients
Theodorou E, Dvijotham K, and Todorov E (2013). In IEEE Conference on Decision and Control
Modeling and identification of pneumatic actuators
Tassa Y, Wu T, Movellan J and Todorov E (2013). In IEEE International Conference on Mechatronics and Automation
Best paper finalist
From information-theoretic dualities to path-integral and Kullback-Leibler control: Continuous and discrete-time formulations
Theodorou E, Dvijotham K and Todorov E (2013). In 16th Yale Workshop on Learning and Adaptive Systems
Multi-robot active SLAM with relative entropy optimization
Kontitsis M, Theodorou E and Todorov E (2013). In American Control Conference
The delta-sensitivity and its application to stochastic optimal control of nonlinear diffusions
Theodorou E and Todorov E (2013). In American Control Conference
Free energy based policy gradients
Theodorou E, Najemnik J and Todorov E (2013). In IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning
Value function approximation and model-predictive control
Zhong M, Johnson M, Tassa Y, Erez T and Todorov E (2013). In IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning
 2012 
Linearly-solvable Markov games
Dvijotham K and Todorov E (2012). In American Control Conference
Information theoretic views on path-integral control
Theodorou E and Todorov E (2012). In NIPS Workshop on Information of Action and Perception
Reduced dimensionality control for the ACT hand
Malhotra M, Rombokas E, Theodorou E, Todorov E and Matsuoka Y (2012). In International Conference on Robotics and Automation
Tendon-driven control of biomechanical and robotic systems: A path-integral reinforcement learning approach
Rombokas E, Theodorou E, Malhotra M, Todorov E and Matsuoka Y (2012). In International Conference on Robotics and Automation
MuJoCo: A physics engine for model-based control
Todorov E, Erez T and Tassa Y (2012). In IEEE/RSJ International Conference on Intelligent Robots and Systems
Design of an anthropomorphic robotic finger system with biomimetic artificial joints
Xu Z, Kumar V, Matsuoka Y and Todorov E (2012). In IEEE Biomedical Robotics and Biomechatronics
Synthesis and stabilization of complex behaviors through online trajectory optimization
Tassa Y, Erez T and Todorov E (2012). In IEEE/RSJ International Conference on Intelligent Robots and Systems
Movie
Trajectory optimization for domains with contacts using inverse dynamics
Erez T and Todorov E (2012). In IEEE/RSJ International Conference on Intelligent Robots and Systems
Stochastic optimal control for nonlinear Markov jump diffusion processes
Theodorou E and Todorov E (2012). In American Control Conference
Relative entropy and free energy dualities: Connections to path integral and KL control
Theodorou E and Todorov E (2012). In IEEE Conference on Decision and Control
Contact-invariant optimization for hand manipulation
Mordatch I, Popovic Z and Todorov E (2012). In Eurographics / ACM SIGGRAPH Symposium on Computer Animation
Project Page
Discovery of complex behaviors through contact-invariant optimization
Mordatch I, Todorov E and Popovic Z (2012). In ACM SIGGRAPH
Project Page
Linearly-solvable optimal control
Dvijotham K and Todorov E (2012). In Reinforcement Learning and Approximate Dynamic Programming for Feedback Control, Lewis (ed), chap. 6, Wiley and IEEE Press
 2011 
Optimal limit-cycle control recast as Bayesian inference
Tassa Y, Erez T and Todorov E (2011). In World Congress of the International Federation of Automatic Control
Aggregation methods for linearly-solvable MDPs
Zhong M and Todorov E (2011). In World Congress of the International Federation of Automatic Control
Finding the most likely trajectories of optimally-controlled stochastic systems
Todorov E (2011). In World Congress of the International Federation of Automatic Control
Inverse optimality design for biological movement systems
Li W, Todorov E and Liu D (2011). In World Congress of the International Federation of Automatic Control
A unifying framework for linearly-solvable control
Dvijotham K and Todorov E (2011). In Uncertainty in Artificial Intelligence
Infinite-horizon model predictive control for nonlinear periodic tasks with contacts
Erez T, Tassa Y and Todorov E (2011). In Robotics: Science and Systems
Neuromuscular stochastic optimal control of a tendon-driven index finger model
Theodorou E, Todorov E and Valero-Cuevas F (2011). In American Control Conference
Design and analysis of an artificial finger joint for anthropomorphic robotic hands
Xu Z, Todorov E, Dellon B, Matsuoka Y (2011). In International Conference on Robotics and Automation
Modular biomimetic robots that can interact with the world the way we do
Simpkins A, Kelley M and Todorov E (2011). In International Conference on Robotics and Automation
A convex, smooth and invertible contact model for trajectory optimization
Todorov E (2011). In International Conference on Robotics and Automation
Movie
First-exit model predictive control of fast discontinuous dynamics: Application to ball bouncing
Kulchenko P and Todorov E (2011). In International Conference on Robotics and Automation
Movie
Complex object manipulation with hierarchical optimal control
Simpkins A and Todorov E (2011). In IEEE Adaptive Dynamic Programming and Reinforcement Learning
Policy gradient methods with model predictive control applied to ball bouncing
Kulchenko P and Todorov E (2011). In IEEE Adaptive Dynamic Programming and Reinforcement Learning
Moving least-squares approximations for linearly-solvable stochastic optimal control problems
Zhong M and Todorov E (2011). J Control Theory Appl, 9(3): 451-463
Moving least-squares approximations for linearly-solvable optimal control problems
Zhong M and Todorov E (2011). In IEEE Adaptive Dynamic Programming and Reinforcement Learning
High-order local dynamic programming
Tassa Y and Todorov E (2011). In IEEE Adaptive Dynamic Programming and Reinforcement Learning
Movie
 2010 
Policy gradients in linearly-solvable MDPs
Todorov E (2010). In Advances in Neural Information Processing Systems 24
Inverse optimal control with linearly-solvable MDPs
Dvijotham K and Todorov E (2010). In International Conference on Machine Learning
Identification and control of a pneumatic robot
Todorov E, Hu C, Simpkins A and Movellan J (2010). In IEEE Biomedical Robotics and Biomechatronics
Movie1;
Movie2;
Movie3
Stochastic complementarity for local control of discontinuous dynamics
Tassa Y and Todorov E (2010). In Robotics: Science and Systems
Movie
Position estimation and control of compact BLDC motors based on analog linear Hall effect sensors
Simpkins A and Todorov E (2010). In American Control Conference
Stochastic differential dynamic programming
Theodorou E, Tassa Y and Todorov E (2010). In American Control Conference
Implicit nonlinear complementarity: A new approach to contact dynamics
Todorov E (2010). In International Conference on Robotics and Automation
Movie
A first optimal control solution for a complex, nonlinear, tendon driven neuromuscular finger model
Theodorou E, Todorov E and Valero-Cuevas F (2010). In ASME Summer Bioengineering Conference
 2009 
Compositionality of optimal control laws
Todorov E (2009). In Advances in Neural Information Processing Systems 22, pp 1856-1864, Bengio et al (eds), MIT Press
Efficient computation of optimal actions
Todorov E (2009). PNAS, 106(28): 11478-11483
Commentary;
Supplementary information
Structured variability of muscle activations supports the minimal intervention principle of motor control
Valero-Cuevas F, Venkadesan M and Todorov E (2009). Journal of Neurophysiology, 102: 59-68
Journal cover
Eigenfunction approximation methods for linearly-solvable optimal control problems
Todorov E (2009). In proceedings of the 2nd IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, pp 161-168
Hierarchical optimal control of a 7-DOF arm model
Liu D and Todorov E (2009). In proceedings of the 2nd IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, pp 50-57
Practical numerical methods for stochastic optimal control of biological systems in continuous time and space
Simpkins A and Todorov E (2009). In proceedings of the 2nd IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, pp 212-218
Real-time motor control using recurrent neural networks
Huh D and Todorov E (2009). In proceedings of the 2nd IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, pp 42-49
Iterative local dynamic programming
Todorov E and Tassa Y (2009). In proceedings of the 2nd IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, pp 90-95
 2008 
General duality between optimal control and estimation
Todorov E (2008). In proceedings of the 47th IEEE Conference on Decision and Control, pp 4286-4292
Optimal tradeoff between exploration and exploitation
Simpkins A, de Callafon R and Todorov E (2008). In proceedings of the American Control Conference, pp 33-38
Parallels between sensory and motor information processing
Todorov E (2008). In The Cognitive Neurosciences, 4th ed, Gazzaniga (ed), MIT Press
 2007 
Predicting reaching targets from human EEG
Hammon P, Makeig S, Poizner H, Todorov E and de Sa V (2007). IEEE Signal Processing Magazine, 25: 69-77
Evidence for the flexible sensorimotor strategies predicted by optimal feedback control
Liu D and Todorov E (2007). Journal of Neuroscience, 27: 9354-9368
Iterative linearization methods for approximately optimal control and estimation of nonlinear stochastic systems
Li W and Todorov E (2007). International Journal of Control, 80: 1439-1453
Probabilistic inference of multi-joint movements, skeletal parameters and marker attachments from diverse sensor data
Todorov E (2007). IEEE Transactions on Biomedical Engineering, 54: 1927-1939
State estimation with finite signal-to-noise models via linear matrix inequalities
Li W, Skelton R and Todorov E (2007). Journal of Dynamic Systems, Measurement and Control, 129: 136-143
 2006 
Linearly-solvable Markov decision problems
Todorov E (2006). In Advances in Neural Information Processing Systems 19: 1369-1376, Scholkopf et al (eds), MIT Press
Iterative optimal control and estimation design for nonlinear stochastic systems
Li W and Todorov E (2006). In proceedings of the 45th IEEE Conference on Decision and Control, pp 3242-3247
Imitation learning for reaching and grasping in virtual environments
Singh N and Todorov E (2006). In proceedings of the 5th International Conference on Development and Learning
Optimal control theory
Todorov E (2006). In Bayesian Brain: Probabilistic Approaches to Neural Coding, Doya K et al (eds), chap 12, pp 269-298, MIT Press
 2005 
From task parameters to motor synergies: A hierarchical framework for approximately-optimal control of redundant manipulators
Todorov E, Li W and Pan X (2005). Journal of Robotic Systems, 22(11): 691-710
Towards an integrated system for estimating multi-joint movement from diverse sensor data
Pan X, Todorov E and Li W (2005). In proceedings of the 27th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp 4982-4985
Hierarchical feedback and learning for multi-joint arm movement control
Li W, Todorov E and Pan X (2005). In proceedings of the 27th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp 4400-4403
A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems
Todorov E and Li W (2005). In proceedings of the American Control Conference, pp 300-306
MATLAB code
Estimation and control of systems with multiplicative noise via linear matrix inequalities
Li W, Skelton R and Todorov E (2005). In proceedings of the American Control Conference, pp 1811-1816
Stochastic optimal control and estimation methods adapted to the noise characteristics of the sensorimotor system
Todorov E (2005). Neural Computation, 17(5): 1084-1108
MATLAB code
 2004 
Hierarchical optimal control of redundant biomechanical systems
Li W, Todorov E and Pan X (2004). In proceedings of the 26th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp 4618-4621
Development of clinician-friendly software for musculoskeletal modeling and control
Davoodi R, Urata C, Todorov E and Loeb G (2004). In proceedings of the 26th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp 4622-4625
Analysis of the synergies underlying complex hand manipulation
Todorov E and Ghahramani Z (2004). In proceedings of the 26th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp 4637-4640
Iterative linear-quadratic regulator design for nonlinear biological movement systems
Li W and Todorov E (2004). In proceedings of the 1st International Conference on Informatics in Control, Automation and Robotics, vol 1, pp 222-229
Optimality principles in sensorimotor control
Todorov E (2004). Nature Neuroscience 7(9): 907-915
 2003 
Optimal control methods suitable for biomechanical systems
Todorov E and Li W (2003). In proceedings of the 25th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp 1758-1761
Unsupervised learning of sensory-motor primitives
Todorov E and Ghahramani Z (2003). In proceedings of the 25th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp 1750-1753
A minimal intervention principle for coordinated movement
Todorov E and Jordan M (2003). In Advances in Neural Information Processing Systems 15: 27-34, Becker et al (eds), MIT Press
On the role of primary motor cortex in arm movement control
Todorov E (2003). In Progress in Motor Control III, ch 6, pp 125-166, Latash and Levin (eds), Human Kinetics
 2002 
Optimal feedback control as a theory of motor coordination
Todorov E and Jordan M (2002). Nature Neuroscience 5(11): 1226-1235
News and views;
Neuroscience news;
Supplementary information
A biomechanical model of the partially paralyzed human arm
Davoodi R, Brown I, Todorov E and Loeb G (2002). In proceedings of the 7th Annual Conference of the International Functional Electric Stimulation Society
Cosine tuning minimizes motor errors
Todorov E (2002). Neural Computation 14(6): 1233-1260
Use of virtual environments in motor learning and rehabilitation
Holden M and Todorov E (2002). In Handbook of Virtual Environments, ch 49, pp 999-1026, Stanney K (ed), Lawrence Erlbaum Associates
 2000 and earlier 
One motor cortex, two different views
Todorov E, debate with Georgopoulos A, Ashe J, Moran D, Schwartz A and Scott S (2000). Nature Neuroscience 3(10): 963-965
Direct cortical control of muscle activation in voluntary arm movements: a model
Todorov E (2000). Nature Neuroscience 3(4): 391-398
News and views
Virtual environment training improves motor performance in two patients with stroke
Holden M, Todorov E, Callahan J and Bizzi E (1999). Neurology Report 23(2): 57-67
Smoothness maximization along a predefined path accurately predicts the speed profiles of complex arm movements
Todorov E and Jordan M (1998). Journal of Neurophysiology 80(2): 696-714
A local circuit approach to understanding integration of long-range inputs in primary visual cortex
Somers D, Todorov E et al (1998). Cerebral Cortex 8(3): 204-211
Augmented feedback presented in a virtual environment accelerates learning of a difficult motor task
Todorov E, Shadmehr R and Bizzi E (1997). Journal of Motor Behavior 29(2): 147-158
Modeling visual cortical contrast adaptation effects
Todorov E, Siapas A, Somers D and Nelson S (1997). In Computational Neuroscience: Trends in Research 5: 525-531, Bower (ed), Kluwer Academic
A local circuit integration approach to understanding visual cortical receptive fields
Somers D, Todorov E and Siapas A (1997). In Computational Neuroscience: Trends in Research 5: 505-510, Bower (ed), Kluwer Academic
A model of recurrent interactions in primary visual cortex
Todorov E, Siapas A and Somers D (1997). In Advances in Neural Information Processing Systems 9: 118-126, Mozer, Jordan, Petsche (eds), MIT Press
Variable gain control in local cortical circuitry supports context-dependent modulation by long-range connections
Somers D, Toth L, Todorov E et al (1996). In Lateral Interactions in Cortex - Structure and Function, ch 4, Sirosh et al (eds), Online Book
Catastrophic interference in human motor learning
Brashers-Krug T, Shadmehr R and Todorov E (1995). In Advances in Neural Information Processing Systems 7: 19-26, Tesauro, Touretzky, Leen (eds), MIT Press
Factorial learning by clustering features
Tenenbaum J and Todorov E (1995). In Advances in Neural Information Processing Systems 7: 561-568, Tesauro, Touretzky, Leen (eds), MIT Press
