NIPS 2016 Workshop, Fri Dec 9th 08:00 AM – 06:30 PM

Neurorobotics: A Chance for New Ideas,
Algorithms and Approaches

Fields of Research in Neurorobotics
Modern robots are complex machines with many compliant actuators and various types of sensors, including depth and vision cameras, tactile electrodes and dozens of proprioceptive sensors. The obvious challenges are to process these high-dimensional input patterns, to memorize low-dimensional representations of them, and to generate the desired motor commands for interacting with dynamically changing environments. Similar challenges exist in brain-machine interfaces (BMIs), where complex prostheses with perceptual feedback are controlled, and in motor neuroscience, where cognitive features must be considered in addition. Despite this broad overlap, research in these fields has developed largely in parallel, and results have rarely been ported to or exploited in the related domains. The main bottleneck for collaborative studies has been a lack of interaction between the core robotics, machine learning and neuroscience communities.

→ Slides of the Speakers
→ Schedule
→ Accepted papers



Workshop Topics

→ Convolutional Networks and Real-Time Robotic and Prosthetic Applications.
→ Deep Learning for Robotics and Prosthetics.
→ End-to-End Robot Learning.
→ Feature Representations for Big Data.
→ Movement Representations, Movement Primitives and Muscle Synergies.
→ Neural Network Hardware Implementations, Neuromorphic Hardware.
→ Recurrent Networks and Reservoirs for Control of High-Dimensional Systems.
→ Reinforcement Learning and Bayesian Optimization in Neural Networks from Multiple Reward Sources.
→ Sampling Methods and Spiking Networks for Robotics.
→ Theoretical Learning Concepts, Synaptic Plasticity Rules for Neural Networks.


The goal of this workshop is to bring together researchers from the robotics, machine learning and neuroscience communities. Robotic applications can be a source of inspiration for theoretical concepts, while sophisticated networks can provide the basis for new ideas and models in neuroscience. In this context, the questions we intend to tackle include:

1. Reinforcement Learning, Imitation, and Active Learning:

→Which adaptations to reinforcement learning algorithms are necessary to learn from as few samples as animals do?
→How to learn from structured rewards?
→How to learn abstract concepts of complex behavior that can be used in transfer learning?
→Which reinforcement learning methods can learn on multiple/different time-scales?

2. Model Representations and Features:

→How to adapt to redundant input features?
→What are proper models of state dependent signal noise?
→How to represent multi-dimensional and complex solution spaces?
→How to represent and learn causal dependencies between features in different spaces?

3. Feedback and Control:

→How can models of state dependent signal noise be used for control?
→Which model features are sufficient to react to dynamically changing distributions of the inputs or the motor commands?
→Which movement policies and models are needed to control underactuated systems?
→How to adapt to failures in control and in learning?

Schedule

Day One, Neurorobotics WS, Fri Dec 9th 2016
14.20-14.30 Introduction by Elmar Rueckert and Martin Riedmiller
Session One: Reinforcement Learning, Imitation, and Active Learning
1 14.30-15.00 Juergen Schmidhuber (Scientific Director of the Swiss AI Lab IDSIA)
15.00-15.30 Posters and Coffee
2 15.30-16.00 Sergey Levine (University of California, Berkeley)
3 16.00-16.30 Pieter Abbeel (University of California, Berkeley)
4 16.30-17.00 Johanni Brea (École polytechnique fédérale de Lausanne, EPFL)
17.00-17.20 Posters and Coffee
5 17.20-17.45 Paul Schrater (University of Minnesota)
6 17.45-18.10 Frank Hutter (University of Freiburg)
7 18.10-18.35 Raia Hadsell (Google DeepMind)
18.35-19.00 Panel Discussion, Session One

Day Two, Neurorobotics WS, Sat Dec 10th 2016
Session One: Reinforcement Learning, Imitation, and Active Learning
08.30-08.35 Introduction by Elmar Rueckert and Martin Riedmiller
8 08.35-09.05 Robert Legenstein (Graz University of Technology)
9 09.05-09.35 Sylvain Calinon (Idiap Research Institute, EPFL Lausanne)
10 09.35-10.05 Chelsea Finn (University of California, Berkeley)
11 10.05-10.35 Peter Stone (University of Texas at Austin)
10.35-11.00 Posters and Coffee
12 11.00-11.30 Paul Verschure (Catalan Institute of Advanced Research)
Session Two: Model Representations and Features
13 11.30-12.00 Tobi Delbrück (University of Zurich and ETH Zurich)
14 12.00-12.30 Moritz Grosse-Wentrup (Max Planck Institute Tuebingen)
15 12.30-13.00 Kristian Kersting (Technische Universität Dortmund)
13.00-14.00 Lunch break
Session Three: Feedback and Control
16 14.00-14.30 Emo Todorov (University of Washington)
17 14.30-15.00 Richard Sutton (University of Alberta)
15.00-15.30 Posters and Coffee
18 15.30-16.00 Bert Kappen (Radboud University)
19 16.00-16.30 Jean-Pascal Pfister (University of Zurich and ETH Zurich)
16.30-17.00 Posters and Coffee
20 17.00-17.30 Jan Babic (Jožef Stefan Institute, Ljubljana)
21 17.30-18.00 Martin Giese (University Clinic Tübingen)
18.00-18.30 Panel Discussion, Session Two and Session Three

The Speakers

→ Pieter Abbeel
→ Jan Babic
→ Johanni Brea
→ Sylvain Calinon
→ Tobi Delbrück
→ Chelsea Finn
→ Martin Giese
→ Moritz Grosse-Wentrup
→ Raia Hadsell
→ Frank Hutter
→ Bert Kappen
→ Kristian Kersting
→ Robert Legenstein
→ Sergey Levine
→ Jean-Pascal Pfister
→ Juergen Schmidhuber
→ Paul Schrater
→ Peter Stone
→ Richard Sutton
→ Emo Todorov
→ Paul Verschure

Talk by Juergen Schmidhuber
(Scientific Director of the Swiss AI Lab IDSIA)

On Learning to Think:
Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models

I address the general problem of reinforcement learning (RL) in partially observable environments. In July 2013, we published our large RL recurrent neural networks (RNNs), which learned from scratch to drive simulated cars from high-dimensional video input. However, real brains are more powerful in many ways. In particular, they learn a predictive model of their initially unknown environment, and somehow use it for abstract (e.g., hierarchical) planning and reasoning. Guided by algorithmic information theory, we describe RNN-based AIs (RNNAIs) designed to do the same. Such an RNNAI can be trained on never-ending sequences of tasks, some of them provided by the user, others invented by the RNNAI itself in a curious, playful fashion, to improve its RNN-based world model. Unlike our previous model-building RNN-based RL machines dating back to 1990, the RNNAI learns to actively query its model for abstract reasoning, planning and decision making, essentially "learning to think." The basic ideas can be applied to many other cases where one RNN-like system exploits the algorithmic information content of another. They also explain concepts such as "mirror neurons."
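The core loop in the abstract, learning a model from interaction and then querying it with "mental" rollouts before acting, can be illustrated with a deliberately tiny toy. This is my own sketch, not the RNNAI architecture: a tabular transition model on a 1-D grid stands in for the RNN world model, and exhaustive rollout search stands in for learned abstract planning.

```python
import itertools
import random

GOAL = 5
ACTIONS = (-1, 0, 1)

def true_step(state, action):
    """Ground-truth environment: move on a 0..9 line, clamped at the edges."""
    return max(0, min(9, state + action))

def learn_model(n_samples=2000, seed=0):
    """'Learn' a tabular world model from random exploratory interaction."""
    rng = random.Random(seed)
    model = {}
    for _ in range(n_samples):
        s, a = rng.randrange(10), rng.choice(ACTIONS)
        model[(s, a)] = true_step(s, a)   # deterministic world
    return model

def plan(model, state, horizon=5):
    """Query the model with every action sequence of the given horizon and
    return the first action of the sequence with least accumulated cost."""
    best_action, best_cost = 0, float("inf")
    for seq in itertools.product(ACTIONS, repeat=horizon):
        s, cost = state, 0
        for a in seq:
            s = model.get((s, a), s)      # unseen transition: assume no change
            cost += abs(s - GOAL)
        if cost < best_cost:
            best_action, best_cost = seq[0], cost
    return best_action

model = learn_model()
state = 0
for _ in range(8):                        # act in the real world
    state = true_step(state, plan(model, state))
print(state)  # → 5: the planner steers the agent to the goal state
```

Only the first action of the best imagined sequence is executed before replanning, so errors in the learned model are corrected by fresh observations at every step.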

Talk by Sergey Levine
(University of California, Berkeley)

Deep Sensorimotor Learning at Scale

The problem of building an autonomous robot has traditionally been viewed as one of integration: connecting together modular components, each one designed to handle some portion of the perception and decision making process. For example, a vision system might be connected to a planner that might in turn provide commands to a low-level controller that drives the robot's motors. In this talk, I will discuss how ideas from deep learning can allow us to build robotic control mechanisms that combine both perception and control into a single system. This system can then be trained end-to-end on the task at hand. I will show how this end-to-end approach actually simplifies the perception and control problems, by allowing the perception and control mechanisms to adapt to one another and to the task. I will also present recent work on scaling up deep robotic learning on a cluster consisting of multiple robotic arms, and demonstrate results for learning grasping strategies that involve continuous feedback and hand-eye coordination using deep convolutional neural networks, as well as model-based visual control using predictive video models.
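The end-to-end idea, a single trainable map from raw observations straight to motor commands optimized on the task loss, can be caricatured in a few lines. This is a hypothetical minimal sketch, not the system from the talk: the "image" is an 8-pixel line with one bright target pixel, and the policy is a single linear readout trained by stochastic gradient descent.

```python
import random

WIDTH = 8  # our "camera" is an 8-pixel 1-D image

def render(target):
    """Raw observation: one bright pixel at the target position."""
    return [1.0 if i == target else 0.0 for i in range(WIDTH)]

def forward(w, image):
    """Policy: the motor command is a single linear readout of the pixels."""
    return sum(wi * xi for wi, xi in zip(w, image))

def train(steps=4000, lr=0.5, seed=0):
    """Train perception and control jointly on the task loss alone."""
    rng = random.Random(seed)
    w = [0.0] * WIDTH
    for _ in range(steps):
        target = rng.randrange(WIDTH)
        image = render(target)
        command = forward(w, image)
        ideal = target - WIDTH / 2          # "reach toward the target"
        err = command - ideal               # task loss: 0.5 * err**2
        for i in range(WIDTH):              # gradient step through the pixels
            w[i] -= lr * err * image[i]
    return w

w = train()
print(round(forward(w, render(6)), 2))  # → 2.0 (= 6 - WIDTH/2)
```

No separate vision module is ever specified; the weights learn whatever pixel-to-command mapping minimizes the task loss, which is the point of training the whole pipeline end to end.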

Talk by Paul Schrater
(University of Minnesota)

Controlling your own learning: Autodidactic learning architectures

Human learners express an array of meta-cognitive learning behaviors that are difficult to capture in basic RL architectures. Humans can reason about what is learnable: we make intelligent (and predictably incorrect) choices about time allocation, we predict our likely asymptotic performance on new tasks, and we track our competency in the tasks we train on. We argue that the ability to represent and reason about our own competencies is essential for a rational agent to ameliorate complexity/generalization and time-investment trade-offs in learning. We show that these abilities can be captured by a mixed architecture: an intrinsically motivated model-based component that explicitly computes predictability and competence, and an extrinsically motivated Dyna-like model-free component that optimizes external reward signals. Computing competence explicitly produces a novel architecture, which we term autodidactic, that arises from latent variables in a variational Bayes bound on information-theoretic intrinsic motivational terms. By representing and reasoning about their own learning trade-offs, autodidactic architectures are a better match to human learners and provide another step toward fully autonomous agents.
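A minimal caricature of competence tracking (my own illustration with hypothetical, deterministic competence curves, not the variational architecture from the talk): the agent logs its competence on each task and allocates the next practice trial by expected learning progress, which steers it away from the unlearnable task.

```python
def competence(task, n_practiced):
    """Hypothetical competence curves for two tasks."""
    if task == "learnable":
        return 1.0 - 0.9 ** n_practiced   # practice pays off, then saturates
    return 0.5                            # unlearnable task: flat competence

def allocate(n_steps=50):
    """Greedy allocation of practice by expected learning progress."""
    counts = {"learnable": 1, "unlearnable": 1}   # one warm-up trial each
    for _ in range(n_steps):
        # learning progress = expected competence gain from one more trial
        progress = {t: competence(t, counts[t] + 1) - competence(t, counts[t])
                    for t in counts}
        best = max(progress, key=progress.get)
        counts[best] += 1
    return counts

counts = allocate()
print(counts)  # → {'learnable': 51, 'unlearnable': 1}
```

Because the flat task never shows progress, a progress-driven learner abandons it immediately, the same qualitative behavior ascribed above to human reasoning about what is learnable.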

Talk by Robert Legenstein
(Graz University of Technology)

From policy gradient to policy sampling in spiking neural networks

Spine dynamics in cortical neurons appear to be inherently stochastic. This raises the question of how goal-directed network configurations can emerge through an interplay of a large number of local stochastic processes. We show that these local stochastic synaptic plasticity processes give rise to a well-defined stationary distribution of network configurations, and thereby also to concrete computational functions of the network. When the neural network is taken to represent a policy in the reinforcement learning framework, this corresponds to sampling the policy from the posterior distribution over policies that maximize reward.
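The flavor of policy sampling can be shown with a toy Markov chain over network configurations (an illustration under strong assumptions, not the paper's plasticity model): local stochastic synapse flips with Metropolis acceptance have the stationary distribution p(config) ∝ exp(reward/T), so the chain spends most of its time in high-reward policies.

```python
import math
import random

TARGET = (1, 0, 1)      # hypothetical best "network configuration"
T = 0.3                 # "temperature" of the stochastic plasticity

def reward(config):
    """Reward: number of synapses matching the target pattern."""
    return sum(int(a == b) for a, b in zip(config, TARGET))

def sample_policies(n_steps=20000, seed=0):
    """Local stochastic plasticity as a Metropolis chain over 3 binary synapses."""
    rng = random.Random(seed)
    config = (0, 0, 0)
    visits = {}
    for _ in range(n_steps):
        i = rng.randrange(3)                       # one synapse flips at random
        proposal = tuple(1 - c if j == i else c
                         for j, c in enumerate(config))
        accept = math.exp((reward(proposal) - reward(config)) / T)
        if rng.random() < accept:                  # Metropolis acceptance
            config = proposal
        visits[config] = visits.get(config, 0) + 1
    return visits

visits = sample_policies()
print(max(visits, key=visits.get))  # → (1, 0, 1)
```

No single configuration is ever "converged to"; the network keeps fluctuating, but the most-visited configuration is the reward-maximizing one, which is the sense in which the policy is sampled rather than optimized.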

Talk by Sylvain Calinon
(Idiap Research Institute, EPFL Lausanne)

Learning and control with movement primitives in multiple coordinate systems

Human-centric robot applications require a tight integration of learning and control. This connection can be facilitated by representing the tasks to be achieved in a probabilistic form. For modern robots in dynamically changing environments, this representation can take various forms. Movements must be enriched with perception, force and impedance information to anticipate the users' behaviors and generate coordinated, safe and natural gestures.
In these applications, the developed models often have to serve several purposes (recognition, prediction, synthesis), and are shared by different learning strategies (imitation, emulation, incremental refinement or stochastic optimization). The aim is to facilitate the transfer of skills from end-users to robots, or in-between robots, by exploiting multimodal sensory information and by developing intuitive teaching interfaces.
In this presentation, I will show an approach combining model predictive control with a compact representation of movement/skill primitives in multiple coordinate systems. It provides a structure to the problem that allows robots to start learning from a reduced number of examples while satisfying a wide range of scenarios. The proposed approach will be illustrated in varied applications, with robots that are either close to us (robot for dressing assistance), part of us (prosthetic hand with EMG and tactile sensing), or far away from us (teleoperation of bimanual robot in deep water).
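The fusion step behind encoding a primitive in several coordinate systems can be reduced to one line of algebra: each frame contributes a Gaussian over the commanded variable, and their product is a precision-weighted average. A simplified 1-D sketch follows (my own illustration, not the full task-parameterized model):

```python
def fuse(estimates):
    """estimates: list of (mean, variance) pairs, one per coordinate frame.
    Returns the mean and variance of their product of Gaussians."""
    precision = sum(1.0 / var for _, var in estimates)
    mean = sum(mu / var for mu, var in estimates) / precision
    return mean, 1.0 / precision

# Hypothetical frames: the one attached to the object says "go to 2.0"
# confidently (low variance); the one attached to the robot base says
# "go to 3.0" but is uncertain (high variance).
mean, var = fuse([(2.0, 0.1), (3.0, 0.9)])
print(round(mean, 2), round(var, 2))  # → 2.1 0.09
```

The fused command is pulled toward the confident frame, and the fused variance is smaller than either input's, which is why frames that matter for the current phase of a task automatically dominate the reproduction.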

Talk by Tobi Delbrück
(University of Zurich and ETH Zurich)

Silicon Retina Technology

Machine vision based on conventional imagers faces a fundamental latency-power tradeoff, where decreasing latency is only possible by burning more power to process frames at a faster rate. This tradeoff makes it hard to achieve fast, always-on operation, which is desirable in applications ranging from human interaction in mobile electronics to robotics and vision prosthetics. A potential solution is offered by asynchronous "silicon retina" vision sensors that produce a spike-event output like biological retinas. These neuromorphic sensors come at the cost of a roughly 5x larger pixel size, but offer advantages in terms of sub-millisecond latency, >100 dB dynamic range, 1 µs temporal resolution, and post-processing costs that are typically 10% of the frame-based equivalent. These event-based sensors open opportunities for theoretical and practical development of new classes of algorithms aimed at many dynamic vision applications. The presentation will include a demonstration of a recent-vintage DAVIS sensor, whose output for a spinning dot is shown in the figure.
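The event-generation principle is easy to simulate (a software sketch of the idea, not the DAVIS circuit; the threshold value is made up): a pixel emits an ON or OFF event whenever its log-intensity has moved by more than a contrast threshold since the last event, instead of reporting absolute values at a fixed frame rate.

```python
import math

THRESHOLD = 0.1   # contrast threshold in log-intensity units (made up)

def events_from_signal(intensities):
    """Convert one pixel's intensity trace into sparse ON/OFF events."""
    events = []
    ref = math.log(intensities[0])        # reference level at the last event
    for t, value in enumerate(intensities[1:], start=1):
        level = math.log(value)
        while level - ref >= THRESHOLD:   # brightness rose: ON events
            ref += THRESHOLD
            events.append((t, "ON"))
        while ref - level >= THRESHOLD:   # brightness fell: OFF events
            ref -= THRESHOLD
            events.append((t, "OFF"))
    return events

# A static scene produces no events at all; a brightening step produces a
# short burst of ON events, so the output is sparse and change-driven.
print(events_from_signal([1.0, 1.0, 1.0, 1.5, 1.5]))  # → four ON events at t=3
```

Working in log-intensity is what gives the sensor its large dynamic range: the same relative contrast triggers an event in dim and bright light alike.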

Talk by Moritz Grosse-Wentrup
(Max Planck Institute Tübingen)

How does the global configuration of brain rhythms shape motor skill?

Motor skills are not constant but vary on multiple time scales. For instance, we all have experienced good and bad days, i.e., days on which our performance in a well-trained motor task is either well above or well below our average performance. The neural basis of such performance variations, however, remains largely unknown. In this talk, I argue that the global configuration of brain rhythms explains (a substantial percentage of) variations in motor skill. In particular, I demonstrate that the performance of healthy subjects in a visuo-motor adaptation task can be predicted from their global configuration of resting-state brain rhythms. I discuss the implications of this finding for understanding disorders of the motor system, and argue that BCI-based motor rehabilitation should extend its focus beyond sensorimotor rhythms.

Talk by Martin Giese
(University Clinic Tübingen)

Integration of movement primitive-based long-term predictive planning with model predictive control of the humanoid HRP-2

The modeling of complex coordinated human movements is essential for the understanding of human motor control and for the human-like control of complex movements in humanoid robots. Human motor control is highly predictive and often synthesizes movements taking into account goals and constraints that become relevant only for later subsequent movements. This requires movement planning over multiple actions or steps. We present an architecture that realizes such highly predictive motor control for the humanoid robot HRP-2 for the task of reaching while walking. The system implements highly flexible online planning of full-body movements for multi-step sequences, realizing human-like coordination of periodic and non-periodic movement primitives, with a guarantee of the dynamic feasibility of the generated movement for the HRP-2. As opposed to optimal control approaches, which are often too slow for the online computation of multi-step sequences, our architecture allows the control of such complex behaviors in real time. We demonstrate the superiority of the proposed approach over simpler machine learning-based approaches that train control signals on dynamically feasible training trajectories. The proposed system plans such multi-step walking-reaching behaviors according to the 'end-state comfort principle', which has been observed for such tasks in humans. Supported by the European Union under grant agreements FP7-ICT-2013-10/611909 (Koroibot) and H2020 ICT-23-2014/644727 (COGIMON), and by DFG KA 1258/15-1.

Talk by Kristian Kersting
(Technische Universität Dortmund)

Thinking Machine Learning

Our minds make inferences that appear to go far beyond standard data science approaches. Whereas people can learn rich representations and use them for a wide range of machine learning tasks, machine learning algorithms have mainly been employed in a stand-alone context, constructing a single function from a table of training examples. In this talk, I shall touch upon an approach to machine learning that can capture these human learning aspects by combining graphs, databases, and relational logic in general with statistical learning and optimization. Here, high-level (logical) features such as individuals, relations, functions, and connectives provide declarative clarity and succinct characterizations of the machine learning problem. While attractive from a modeling viewpoint, this declarative machine learning programming often considerably complicates the underlying model, making it potentially very slow to solve. Hence, I shall also touch upon ways to reduce the solver costs. One promising direction for speeding up inference is to cache local structures in the computational models. I shall illustrate this for probabilistic inference, linear programs, and convex quadratic programs, all workhorses of machine learning.
Based on joint work with Martin Mladenov, Amir Globerson, Martin Grohe, Vaishak Belle and many others.
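A toy example of how symmetry, one kind of local structure, can cut inference cost (my own illustration, not the lifted-inference machinery from the talk): for n exchangeable binary variables, a query over 2^n joint states collapses to a sum over the n+1 count classes.

```python
import math

def p_at_least_naive(n, k, p):
    """Enumerate all 2**n joint states (intractable for large n)."""
    total = 0.0
    for state in range(2 ** n):
        ones = bin(state).count("1")
        if ones >= k:
            total += p ** ones * (1 - p) ** (n - ones)
    return total

def p_at_least_lifted(n, k, p):
    """Sum over the n+1 symmetry classes, weighted by class size:
    every state with m successes has the same probability, and there
    are C(n, m) of them, so the query costs O(n) instead of O(2**n)."""
    return sum(math.comb(n, m) * p ** m * (1 - p) ** (n - m)
               for m in range(k, n + 1))

print(round(p_at_least_lifted(10, 3, 0.5), 4))  # → 0.9453, matching the naive sum
```

Caching and reusing the per-class computation is the same move at a small scale: identical substructures of the model are solved once and counted, not re-solved.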

Talk by Jan Babic
(Jožef Stefan Institute, Ljubljana)

Bilateral sensorimotor adaptation for human-robot collaboration

In this talk, I will describe a concept for obtaining complex robotic skills based on human sensorimotor learning capabilities. The idea is to include the human in the robot control loop and to consider the target robotic platform as a tool that can be iteratively controlled by a human. The skilled control of the robot by the human provides data that are simultaneously used for the construction of autonomous controllers that control the robot either independently or in collaboration with the human. Moreover, I will explain how we use the same concept in the opposite direction to investigate the human motor control mechanisms employed by the central nervous system during whole-body motion. To demonstrate the applicability of the concept, I will present several robotic examples, including cooperative dynamic manipulation skills and adaptive control of exoskeleton robots, as well as several studies of human motor control in which we investigated how humans adapt the motion of the body during real-world motor learning tasks.

Accepted Papers

The following abstracts were accepted and will be part of a special issue of Frontiers in Neurorobotics.

→ Kyuhwa Lee, Ruslan Aydarkhanov, Luca Randazzo and José Millán. Neural Decoding of Continuous Gait Imagery from Brain Signals. (ID 2)
→ Aviv Tamar, Garrett Thomas, Tianhao Zhang, Sergey Levine and Pieter Abbeel. Episodic MPC Improvement with the Hindsight Plan. (ID 11)
→ Jim Mainprice, Arunkumar Byravan, Daniel Kappler, Dieter Fox, Stefan Schaal and Nathan Ratliff. Functional manifold projections in Deep-LEARCH. (ID 12)
→ Nutan Chen, Maximilian Karl and Patrick van der Smagt. Dynamic Movement Primitives in Latent Space of Time-Dependent Variational Autoencoders. (ID 1)
→ Alexander Gabriel, Riad Akrour and Gerhard Neumann. Empowered Skills. (ID 7)
→ Dieter Buechler, Roberto Calandra and Jan Peters. Modeling Variability of Musculoskeletal Systems with Heteroscedastic Gaussian Processes. (ID 10)
→ David Sharma, Daniel Tanneberg, Moritz Grosse-Wentrup, Jan Peters and Elmar Rueckert. Adapting Brain Signals with Reinforcement Learning Strategies for Brain Computer Interfaces. (ID 16)
→ Dmytro Velychko, Benjamin Knopp and Dominik Endres. The Variational Coupled Gaussian Process Dynamical Model. (ID 5)
→ Felix End, Riad Akrour and Gerhard Neumann. Layered Direct Policy Search for Learning Hierarchical Skills. (ID 6)
→ Erwan Renaudo, Benoît Girard, Raja Chatila and Mehdi Khamassi. Bio-inspired habit learning in a robotic architecture. (ID 9)

The two awards for the best paper and the best poster will be announced in Friday's panel discussion, Dec. 9th, 2016, at 6 pm.


Elmar Rueckert is a postdoctoral scholar at the Intelligent Autonomous Systems lab headed by Jan Peters. He has strong expertise in learning spiking neural networks, probabilistic planning and robot control. Before joining IAS in 2014, he was with the Institute for Theoretical Computer Science at Graz University of Technology, where he received his Ph.D. under the supervision of Wolfgang Maass. His thesis, "On Biologically Inspired Motor Skill Learning in Robotics through Probabilistic Inference", concentrated on probabilistic inference for motor skill learning and on learning biologically inspired movement representations.

Martin Riedmiller joined Google DeepMind in 2015 as a research scientist. He received a Diploma in Computer Science in 1992 and a PhD on self-learning neural controllers in 1996 from the University of Karlsruhe. He has been a professor at TU Dortmund (2002), the University of Osnabrück (2003-2009), and the University of Freiburg (2009-2015), where he headed the Machine Learning Lab. His general research interest is applying machine learning techniques to interesting real-world problems. His RoboCup team Brainstormers won five international competitions in the 2D Simulation and Middle Size leagues.