Philipp Kratzer, Marc Toussaint and Jim Mainprice

Machine Learning and Robotics Lab, University of Stuttgart

- Planning robot motion in close proximity to humans is challenging
- Key ability is to forecast human behavior
- In this work:
  - Novel prediction method for full-body motion
  - Joint human-robot planning framework

- Given a trajectory of full-body motion $s_{0:t}$, predict the subsequent states $s_{t+1:T}$
- A good prediction respects both **human body dynamics** and **environmental context**

- Good results using **recurrent neural networks (RNNs)**^{1, 2}
- No notion of environmental context $\rightarrow$ challenging to incorporate

- K. Fragkiadaki, S. Levine, P. Felsen, and J. Malik, "Recurrent network models for human dynamics" (2015)
- H. Wang and J. Feng, "VRED: A position-velocity recurrent encoder-decoder for human motion prediction" (2019)

- Gradient-based optimization algorithms widely used in robotics
- Flexible framework for motion planning
- Motion planning with non-convex obstacles possible (e.g. CHOMP^{1})

- N. Ratliff, M. Zucker, J. A. Bagnell, and S. Srinivasa, "CHOMP: Gradient optimization techniques for efficient motion planning" (2009)

Offline: Learn a state-of-the-art RNN model from data $\mathcal{D}$

Online: Use the model to predict a trajectory

Extract constraints from environment and update the prediction using numerical optimization techniques

RNN cells with multiple Gated Recurrent Units and a final linear layer

We input the base rotation, joint angles and velocities to the network and make it predict the next velocity

By adding the velocities to the state we can retrieve the following state $s_{t+1}$

By looping the velocity and states back into the RNN cell, multiple future steps can be predicted

Finally, we add our controls $\delta$ to the velocity input. This lets us change the network's prediction
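The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: `rollout` and `damped_model` are hypothetical names, `damped_model` is a toy stand-in for the trained GRU network, and the recurrent hidden state is left out for brevity.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def rollout(
    predict_velocity: Callable[[Vector, Vector], Vector],
    state: Sequence[float],
    velocity: Sequence[float],
    controls: Sequence[Sequence[float]],
) -> List[Vector]:
    """Predict len(controls) future states by feeding outputs back as inputs."""
    s, v = list(state), list(velocity)
    states: List[Vector] = []
    for delta in controls:
        # The control for this step perturbs the velocity input.
        v_in = [vi + di for vi, di in zip(v, delta)]
        # The model predicts the next velocity from state and velocity.
        v = predict_velocity(s, v_in)
        # Adding the predicted velocity to the state gives the next state.
        s = [si + vi for si, vi in zip(s, v)]
        states.append(s)
    return states


def damped_model(state: Vector, velocity: Vector) -> Vector:
    """Hypothetical stand-in for the trained network: halves the velocity."""
    return [0.5 * vi for vi in velocity]


# Zero controls reproduce the unmodified prediction; non-zero controls shift it.
unchanged = rollout(damped_model, [0.0], [1.0], [[0.0], [0.0]])
steered = rollout(damped_model, [0.0], [1.0], [[1.0], [0.0]])
```

With all controls set to zero the rollout is the plain network prediction, so the optimizer can start from $\delta = 0$ and only move away from it where the objectives require.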

$f_\text{RNN} \ldots$ recurrent neural network

$p_d=\phi_d(f_\text{RNN}(\delta)) \ldots$ position at time $d$ and $\Delta_H=\|p_{d+1}-p_d\|$

SDF $\ldots$ signed distance function

$\lambda, \alpha \ldots$ hyperparameters

- *Low-level:* Keep changes to the inputs small
- *Goalset:* End up close to a specified point $p^*$
- *Collision:* Do not collide with obstacles

- Idea: Optimize human and robot trajectories **at the same time**
- Allows us to **plan for the robot** while **predicting human motion**
- Bidirectional information flow

*Human-robot:* The human prediction and a robotic agent $x$ should not collide:

$$ c_{\text{j}}(\delta, x) = \sum_{d=t+1}^T \exp \big\{ -\alpha \| p_d -x_d \| \big\} \Delta_H \Delta_R $$

$p_d=\phi_d(f_\text{RNN}(\delta)) \ldots$ position at time $d$ and $\Delta_H=\|p_{d+1}-p_d\|$

$\phi_d \ldots$ forward kinematics map at time $d$ (joint state to position)

$f_\text{RNN} \ldots$ recurrent neural network

$x_d \ldots$ robot position at time $d$ and $\Delta_R=\|x_{d+1}-x_d\|$

$\alpha \ldots$ hyperparameter
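As a rough illustration, the human-robot term $c_{\text{j}}$ can be evaluated on two already-computed position sequences. The function name `joint_cost` is hypothetical, and the sketch assumes that the sum stops one step before the end of the sequences, since $\Delta_H$ and $\Delta_R$ need the following point.

```python
import math
from typing import Sequence

Point = Sequence[float]


def joint_cost(human: Sequence[Point], robot: Sequence[Point], alpha: float) -> float:
    """Sum of exp(-alpha * ||p_d - x_d||) * Delta_H * Delta_R over the horizon."""
    if len(human) != len(robot):
        raise ValueError("human and robot trajectories must have equal length")
    cost = 0.0
    # Assumption: the last index is skipped because it has no successor.
    for d in range(len(human) - 1):
        proximity = math.exp(-alpha * math.dist(human[d], robot[d]))
        delta_h = math.dist(human[d + 1], human[d])  # human step length
        delta_r = math.dist(robot[d + 1], robot[d])  # robot step length
        cost += proximity * delta_h * delta_r
    return cost


human_path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
overlapping_robot = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
distant_robot = [(0.0, 100.0), (1.0, 100.0), (2.0, 100.0)]

cost_overlapping = joint_cost(human_path, overlapping_robot, alpha=1.0)
cost_distant = joint_cost(human_path, distant_robot, alpha=1.0)
```

The exponential makes the penalty decay quickly with distance, and weighting by the two step lengths makes the sum approximate a path integral, so the cost depends less on how finely the trajectories are discretized.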

- Final loss is weighted sum
- Compute gradients using automatic differentiation
- We optimize using a gradient-based optimizer (limited-memory BFGS)
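The optimization step can be sketched with SciPy's `L-BFGS-B` solver on a toy problem. Everything below is an assumption for illustration: the trajectory model is a simple velocity integrator rather than the RNN, only a low-level and a goal term are included, their quadratic forms and weights are made up, and SciPy's finite-difference gradients stand in for automatic differentiation.

```python
import numpy as np
from scipy.optimize import minimize

T, DIM = 5, 2
# Hypothetical initial prediction: constant velocity along x.
nominal_velocity = np.tile([1.0, 0.0], (T, 1))
start = np.zeros(DIM)
goal = np.array([5.0, 2.0])  # hypothetical goal point p*
weights = {"low_level": 0.1, "goal": 1.0}  # hypothetical weights


def positions(delta_flat: np.ndarray) -> np.ndarray:
    """Integrate the controlled velocities into a position trajectory."""
    delta = delta_flat.reshape(T, DIM)
    return start + np.cumsum(nominal_velocity + delta, axis=0)


def loss(delta_flat: np.ndarray) -> float:
    """Weighted sum of the individual objective terms."""
    low_level = np.sum(delta_flat ** 2)  # keep the controls small
    goal_term = np.sum((positions(delta_flat)[-1] - goal) ** 2)  # end near p*
    return float(weights["low_level"] * low_level + weights["goal"] * goal_term)


delta0 = np.zeros(T * DIM)  # start from the unmodified prediction
result = minimize(loss, delta0, method="L-BFGS-B")
optimized = positions(result.x)
```

Starting from `delta0 = 0`, the solver bends the trajectory toward the goal while the low-level term keeps the controls small, which is the trade-off the weights express.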

- 1 actor, total of ~120 min
- Data split into training and testing (9:1)
- 3 data sets: 1) *walking* 2) *pick and place* 3) *pick and place with chairs as obstacles*
- Purely kinematic model with joint angles

Method | 0.5 s | 1 s | 1.5 s | 2 s |
---|---|---|---|---|
Zero velocity baseline | 2.45 | 5.50 | 9.49 | 11.74 |
Initial prediction without optimization | 1.25 | 2.48 | 4.56 | 6.39 |
Ours with goal constraint | 1.25 | 2.27 | 2.52 | 2.18 |
Ours with goal and obstacle objective | 1.25 | 2.18 | 2.41 | 2.10 |

- Averaged distance to ground truth over 22 reaching trajectories
- Best performance with goal and obstacle optimization

- Presented novel prediction method for full-body motion
- Possible to adapt prediction to environmental context
- Combined human and robot trajectory optimization
- More compact derivative computation
- Object affordances (e.g. similar to Koppula and Saxena^{1}) could be used to guide the optimizer

- H. S. Koppula and A. Saxena, "Anticipating human activities using object affordances for reactive robotic response" (2015)