MuJoCo is our physics engine, designed for model-based control. It combines recursive dynamics algorithms in generalized coordinates with velocity-stepping based on contact-impulse solvers. The solvers are based on our new formulations of the physics of contact, which preserve realism while facilitating optimal control via contact smoothing. MuJoCo has become the cornerstone of our efforts to design intelligent controllers. It can be downloaded from www.mujoco.org.

Analytically-invertible dynamics with contacts and constraints (ICRA 2014)
MuJoCo: A physics engine for model-based control (IROS 2012)
A convex, smooth and invertible contact model for trajectory optimization (ICRA 2011)
Implicit nonlinear complementarity: A new approach to contact dynamics (ICRA 2010)
Stochastic complementarity for local control of discontinuous dynamics (RSS 2010)


We are developing algorithms and software for model-predictive control (MPC). The speed and accuracy of MuJoCo together with refinements in the underlying optimization method (iterative LQR), contact models and cost function design, have enabled real-time MPC for high-degree-of-freedom systems with rich contact interactions. Important challenges remain: modeling and estimation errors, as well as insufficient computing power for longer-horizon planning. If these challenges are overcome - as we believe they will be - this approach is likely to revolutionize robotics as well as gaming.

Receding-horizon online optimization for dexterous object manipulation
Control-limited differential dynamic programming (ICRA 2014)
Real-time behavior synthesis for dynamic hand manipulation (ICRA 2014)
An integrated system for real-time model-predictive control of humanoid robots (Humanoids 2013)
Synthesis and stabilization of complex behaviors through online trajectory optimization (IROS 2012)
Infinite-horizon model-predictive control for nonlinear periodic tasks with contacts (RSS 2011)
First-exit model-predictive control of fast discontinuous dynamics: Application to ball bouncing (ICRA 2011)
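The receding-horizon loop itself is simple: optimize a control sequence over a short horizon starting from the current state, apply only the first control, and re-plan. A minimal sketch in Python, using a made-up 1-D double-integrator plant and plain finite-difference gradient descent as a crude stand-in for iterative LQR (all gains and horizons here are illustrative):

```python
# Hypothetical plant: a 1-D double integrator (position x, velocity v,
# acceleration command u), discretized with an Euler step.
def step(x, v, u, dt=0.05):
    return x + dt * v, v + dt * u

def rollout_cost(x, v, controls, target, dt=0.05):
    # Quadratic tracking cost accumulated along an open-loop rollout.
    cost = 0.0
    for u in controls:
        x, v = step(x, v, u, dt)
        cost += (x - target) ** 2 + 0.01 * u ** 2
    return cost

def mpc_step(x, v, target, horizon=20, iters=40, lr=0.5, eps=1e-4):
    # Optimize the control sequence over a short horizon by
    # finite-difference gradient descent (standing in for iLQR),
    # then return only the first control: the receding-horizon idea.
    controls = [0.0] * horizon
    for _ in range(iters):
        base = rollout_cost(x, v, controls, target)
        grad = []
        for i in range(horizon):
            controls[i] += eps
            grad.append((rollout_cost(x, v, controls, target) - base) / eps)
            controls[i] -= eps
        controls = [u - lr * g for u, g in zip(controls, grad)]
    return controls[0]

# Closed loop: re-plan from the current state at every time step.
x, v, target = 0.0, 0.0, 1.0
for _ in range(60):
    u = mpc_step(x, v, target)
    x, v = step(x, v, u)
print(round(x, 2))  # the state approaches the target
```

Because the plan is recomputed from the measured state at every step, the controller remains a feedback law even though each inner solve is open-loop; our actual system replaces the toy plant and optimizer with MuJoCo dynamics and iLQR, but the outer loop has this structure.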


Contact-invariant optimization (CIO) is a new approach to optimal control, introducing auxiliary decision variables that enable the optimizer to reason in a continuous fashion about discrete contact events. It is the most powerful dynamic planning method to date: it generates remarkably rich and complex behaviors fully automatically, without the need for motion capture, manual scripting or careful initialization. CIO is currently used offline, but we hope to adapt it to the MPC setting. Another application is detailed optimal control modeling of biological movements.

Ensemble-CIO: Full-body dynamic motion planning for physical humanoids
Animating human lower limbs using Contact-Invariant Optimization (SIGGRAPH ASIA 2013)
Contact-Invariant Optimization for hand manipulation (SCA 2012)
Discovery of complex behaviors through Contact-Invariant Optimization (SIGGRAPH 2012)
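The core idea can be conveyed by the shape of the auxiliary cost term; the simplified form below is illustrative rather than the exact term from the papers:

```python
# The auxiliary contact variable c >= 0 expresses how strongly the
# optimizer currently "believes" a given contact is active.
def cio_term(c, gap, slip):
    # An active contact (large c) must touch the surface (zero gap) and
    # not slide (zero tangential slip velocity); an inactive contact
    # (c = 0) is unconstrained.
    return c * (gap ** 2 + slip ** 2)

# The term is smooth in all arguments, so a continuous optimizer can
# gradually switch contacts on and off instead of searching over
# discrete contact sequences:
print(cio_term(1.0, 0.0, 0.0))   # consistent active contact: no penalty
print(cio_term(1.0, 0.1, 0.0))   # active but separated: penalized
print(cio_term(0.0, 0.5, 2.0))   # inactive contact: no penalty
```

A companion term (not shown) scales the allowed contact forces by c, so the dynamics can only be "explained" by contacts the optimizer has declared active.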


We are adapting our optimal control machinery to solve state estimation and system identification problems. Instead of treating the two problems as being separate, we solve them simultaneously - by jointly optimizing the movement trajectory and the model parameters so as to fit the available measurements. This is similar in spirit to SLAM, except here the estimation problem (corresponding to localization in SLAM) is defined in a high-dimensional configuration space. Applications include sensor calibration, inference of kinematic and dynamic model parameters, and robust tracking. Having accurate models is essential for model-based approaches to control, thus we are motivated to make this work and scale it up to complex robots.

Physically-consistent sensor fusion in contact-rich behaviors (IROS 2014)
STAC: Simultaneous tracking and calibration (Humanoids 2013)
Modeling and identification of pneumatic actuators (2013)
Identification and control of a pneumatic robot (BioRob 2010)
Optimal trade-off between exploration and exploitation (ACC 2008)
Probabilistic inference of trajectories, skeletal parameters and marker attachments (IEEE TBE 2007)
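A toy version of the joint optimization can be sketched in a few lines of Python (the scalar system, noise level and penalty weight are invented for illustration): latent states and an unknown model parameter are fit simultaneously to noisy measurements by descending a single cost that combines measurement error and dynamics violation.

```python
import random

# Generate noisy measurements y of the scalar system x[t+1] = a * x[t].
random.seed(0)
a_true, T, lam, lr = 0.9, 30, 10.0, 0.005
x_true = [1.0]
for _ in range(T - 1):
    x_true.append(a_true * x_true[-1])
y = [xi + random.gauss(0.0, 0.01) for xi in x_true]

# Jointly estimate the trajectory x and the parameter a by gradient
# descent on  sum (x[t]-y[t])^2 + lam * sum (x[t+1]-a*x[t])^2.
x = y[:]      # initialize the latent states at the measurements
a = 0.5       # deliberately wrong initial model parameter
for _ in range(10000):
    gx = [2.0 * (xi - yi) for xi, yi in zip(x, y)]   # measurement term
    ga = 0.0
    for t in range(T - 1):                           # dynamics term
        r = x[t + 1] - a * x[t]
        gx[t + 1] += 2.0 * lam * r
        gx[t] -= 2.0 * lam * a * r
        ga -= 2.0 * lam * r * x[t]
    x = [xi - lr * gi for xi, gi in zip(x, gx)]
    a -= lr * ga
print(round(a, 2))  # recovers a value close to a_true
```

The same structure scales up conceptually: in our work the trajectory lives in a high-dimensional configuration space, the dynamics residual comes from MuJoCo, and the parameters include sensor calibrations and kinematic/dynamic model properties.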


Our detour into the world of hardware is motivated by an interest in dexterous object manipulation combined with a shortage of suitable robotic hands. After working with electric motors, we settled on pneumatic actuation - which we find to be a lot better than its reputation suggests. We now have two 20-DOF hands actuated by a custom 40-cylinder pneumatic drive. These hands are faster than human hands and almost as compliant. We are not aware of a tactile sensing solution of similar quality, and human studies indicate that tactile sensing is critical. Nevertheless we look forward to applying our control methodology to dexterous object manipulation, with fingers crossed.

A low-cost, 20-DOF anthropomorphic robotic hand: Design, actuation and modeling (Humanoids 2013)
Fast, strong and compliant pneumatic actuation for dexterous tendon-driven hands (ICRA 2013)
Design of an anthropomorphic robotic finger system with biomimetic artificial joints (BioRob 2012)
Modular bio-mimetic robots that can interact with the world the way we do (ICRA 2011)


We identified a family of stochastic optimal control problems that are linearly-solvable, in the sense that the exponentiated optimal value function z(x) satisfies a linear equation, z(x) = exp(-q(x)) Σ_y p(y|x) z(y), involving the state cost q(x) and the uncontrolled stochastic dynamics p(y|x). The key restriction is that the controls and the noise must act in the same subspace. This problem family has many surprising properties: duality with Bayesian inference; compositionality of optimal control laws; a deterministic characterization of the most likely trajectory under the stochastic optimal control law; and convexity of both the inverse optimal control problem and a related policy-gradient learning problem. We hope to translate these theoretical advances into practical methods for large-scale control problems, by leveraging our efficient trajectory optimizers to bootstrap learning of global controllers.

Relative entropy and free energy dualities: Connections to path integral and KL control (CDC 2012)
Linearly-solvable optimal control (Book chapter, 2012)
Linearly-solvable Markov games (ACC 2012)
Aggregation methods for linearly-solvable MDPs (IFAC 2011)
Finding the most likely trajectories of optimally-controlled stochastic systems (IFAC 2011)
Policy gradients in linearly-solvable MDPs (NIPS 2010)
Inverse optimal control with linearly-solvable MDPs (ICML 2010)
Efficient computation of optimal actions (PNAS 2009)
General duality between optimal control and estimation (CDC 2008)
Linearly-solvable Markov decision problems (NIPS 2006)
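For a concrete first-exit example, here is a tiny chain MDP in Python (the states, costs and passive dynamics are invented for illustration). The desirability function z is obtained by iterating the linear Bellman operator, and the optimal control law is simply the passive dynamics reweighted by z:

```python
import math

# Five states on a chain; state 4 is terminal with zero exit cost.
N = 5
q = [0.2, 0.2, 0.2, 0.2, 0.0]   # state costs q(x)

def passive(x):
    # Uncontrolled random walk p(y|x): step left or right with equal
    # probability, reflecting at the left boundary.
    if x == 0:
        return {0: 0.5, 1: 0.5}
    return {x - 1: 0.5, x + 1: 0.5}

# Linear Bellman equation  z(x) = exp(-q(x)) * sum_y p(y|x) z(y),
# solved by fixed-point iteration; z at the terminal state stays fixed
# at exp(-q_terminal) = 1.
z = [1.0] * N
for _ in range(1000):
    z_new = z[:]
    for x in range(N - 1):
        z_new[x] = math.exp(-q[x]) * sum(p * z[y] for y, p in passive(x).items())
    z = z_new

v = [-math.log(zi) for zi in z]       # optimal cost-to-go v = -log z
print([round(vi, 2) for vi in v])     # decreases toward the terminal state

# The optimal control law reweights the passive dynamics by z:
# pi(y|x) proportional to p(y|x) z(y).
def policy(x):
    w = {y: p * z[y] for y, p in passive(x).items()}
    s = sum(w.values())
    return {y: wi / s for y, wi in w.items()}

print(policy(2))                       # biased toward the goal at state 4
```

Note that no optimization over actions is ever performed: solving the control problem reduces to solving a linear equation in z, which is what makes the framework attractive for large-scale problems.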


Hierarchical control is a tempting idea which we revisit every couple of years. Ideally we will have an automated method that is generally applicable and outperforms the best non-hierarchical methods. The hard part is constructing the necessary low-level representations/features automatically instead of relying on human intuition. While such a method does not yet exist, we believe that some yet-undiscovered form of unsupervised learning will eventually succeed. In the meantime, we have developed general methodology that can leverage pre-existing low-level representations - which we treat as feedback controllers, transforming the plant into an augmented plant that accepts higher-level control signals.

Compositionality of optimal control laws (NIPS 2009)
Hierarchical framework for approximately-optimal control of redundant manipulators (JRS 2005)
Analysis of the synergies underlying complex hand manipulation (IEEE EMBC 2004)
Unsupervised learning of sensory-motor primitives (IEEE EMBC 2003)
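The augmented-plant construction is easy to illustrate: wrap a low-level feedback controller (here a PD position servo with made-up gains) around the raw plant, so that the high-level controller issues setpoints rather than torques:

```python
# Raw plant: a 1-D double integrator driven by a torque-like input u.
def plant_step(x, v, u, dt=0.01):
    return x + dt * v, v + dt * u

def augmented_step(x, v, target, kp=50.0, kd=10.0, dt=0.01):
    # Low-level feedback controller: converts the high-level command
    # (a desired position) into an input for the underlying plant.
    # The augmented plant exposes `target` as its control signal.
    u = kp * (target - x) - kd * v
    return plant_step(x, v, u, dt)

# The high-level controller now just emits setpoints; the low-level
# loop handles stabilization.
x, v = 0.0, 0.0
for _ in range(1000):
    x, v = augmented_step(x, v, target=1.0)
print(round(x, 2))  # prints 1.0
```

Any pre-existing low-level representation can play the role of the inner loop here; the open question discussed above is how to discover such representations automatically rather than design them by hand.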


We have developed a range of computational models based on the idea that the brain controls the body in the best way possible, i.e. optimally. Given how hard it is to find the exact solutions to optimal control problems (perhaps even for the brain), it may be more useful to think of optimization as a process that improves behavior over time, instead of being a normative model. The same neural machinery that discovers the optimal solution in simpler or well-practiced tasks may only approximate the optimal solution in more challenging or un-ecological or novel tasks. Another important issue - which is the main difference among competing optimal control models - is identifying the specific cost functions and constraints that define the optimization problem. Further progress in this area requires qualitatively better methods for computing what is optimal, allowing researchers to go beyond point-mass dynamics. Now that we have developed such methods, we look forward to applying them to richer motor behaviors, and in particular to inferring cost functions from data in an automated way. Replicating complex biological movements in robotic systems is another promising direction, and we now have both the hardware and the control methodology to pursue it.

EMG variability supports the minimal intervention principle of motor control (J Neurophys 2009)
Evidence for the flexible sensorimotor strategies predicted by optimal feedback control (J Neurosci 2007)
Stochastic optimal control and estimation methods with signal-dependent noise (Neural Comp 2005)
Optimality principles in sensorimotor control (Nat Neurosci 2004)
Optimal feedback control as a theory of motor coordination (Nat Neurosci 2002)
Cosine tuning minimizes motor errors (Neural Comp 2002)
Path-constrained smoothness maximization predicts complex speed profiles (J Neurophys 1998)

Copyright © 2010-2013 Emo Todorov