The object and the setting is the same as in the previous EKF project (to fuse lidar and radar measurements in order to track a bicyclist), but this time a more advanced filter is used. This more advanced filter is called the Unscented Kalman Filter (UKF). A good paper on the UKF can be found here.

The key differences to the EKF project are:

  • a more complex, non-linear motion model,
  • the way non-linearities are dealt with.


What’s wrong with the EKF, why the need for another KF? As discussed in the previous post the EKF handles non-linearities by applying a first-order linearization as an approximation. This can be good enough, but it is always an approximation and may fail in certain situations. The first project only needed this linearization for the non-linear mapping from state space to radar measurement space, since the motion model, represented by the dot product
F * x was still linear. The UKF is able to handle non-linearities more accurate. It captures the posterior mean and covariance to the 3rd order Taylor series expansion for any non-linearity, whereas the EKF only achieves first order accuracy (Wan, Merwe).

State Vector & Motion Model

The state vector x in the EKF project consists of the four variables [x, y, v_x, v_y] and uses the very basic constant velocity (CV) motion model. In this project we’ll use the constant turn rate and velocity magnitude model (CTRV). The new state vector contains the x,y position, the vehicle’s velocity v as well as the yaw angle psi and its rate of change psi_dot (derivative).

The state transition from timestep k to timestep k+1 can then be defined as follows:

where nu is the noise vector.

Source: Udacity’s course lectures

The solution of the differential equation for the case
yaw rate != 0 is shown below:

To keep this article short, I’ll explain the derivation of the solution in an extra article.

View derivation


The key idea of the UKF is to use so-called Sigma points to map distributions through a non-linear function instead of approximating this non-linear function with a linear one. So the UKF creates some Sigma points, maps them through the non-linear transition function and then tries to reconstruct what the original distribution would look like in the new space. The UKF can be structured into the following steps which are then repeated:



If measurement model is non-linear:

Else if measurement model is linear:

  • normal KF update

Before we start there is a trick…

UKF Augmentation

The process model is defined as

which means that it depends on the state vector as well as on the noise vector. But the noise vector also has a non-linear effect :scream:. Luckily the UKF is pretty :sunglasses: about this. We just have to augment our state vector with the noise variables, which are both zero (white noise and stuff…). This also means that the process covariance matrix gets expanded. The new state and covariance matrix looks like:

Prediction Steps

1 Generate Sigma points

The Sigma points are sampled from specific locations of the distribution. Together they form a matrix of dimensions
n_a X n_sigma + 1, where n_a is the number of state variables (after augmentation) and n_sigma the number of Sigma points. It is defined as follows:

To fit the image I had to use the transpose here. The original equation can be shown with right click on the graphic below.

with the scaling parameter lambda set to:

How many Sigma points to choose? This many:

In C++ I use the Eigen Library which is also able to take the square root of a matrix using the Cholesky decomposition. Each sigma point vector is the sum of the state vector and only one column of the covariance matrix. So we are basically stacking vectors horizontally and make a matrix sandwich. This may be more clear in code:

  # Prior state and covariance
  VectorXd x = VectorXd(n_x);
  x << ... # some values
  MatrixXd P = MatrixXd(n_x, n_x);
  P << ... # some values

  MatrixXd Xsig = MatrixXd(n_x, 2 * n_x + 1);
  //calculate square root of P
  MatrixXd A = P.llt().matrixL();

  Xsig.col(0) = x;
  for (int i = 0; i < n_x; ++i) {
    Xsig.col(i + 1) = x + sqrt(lambda + n_x) * A.col(i);
    Xsig.col(i + 1 + n_x) = x - sqrt(lambda + n_x) * A.col(i);

Back to overview

2 Predict Sigma points

Now every Sigma point (which really is a vector with 7 values) is mapped through the process model into the state space, defined by the 5 state variables. So the process model performs the mapping . The process model is not only a space mapping, but also predicts where the points should be at the next timestep, based on the motion model. We apply the process model for every column of the augmented Sigma point matrix to predict a new state vector. The result is a matrix with the dimensions n_x_orig X n_sigma + 1. This gives us the a priori estimation for the Sigma points matrix. I’ll drop the sigma from here on, too many indices…

The process model makes predictions for every Sigma point (=column) in the Sigma point matrix. The result is the predicted Sigma point matrix in state space.

Back to overview

3 Predict state vector and covariance

Based on the predictions for the Sigma points, we are now able to predict what we’re really after, namely the real state vector and its covariance matrix. The prediction for the state vector is the weighted sum of the predicted Sigma points, where the index i of the predicted Sigma point matrix denotes the specific column:

The weights are defined as

The predicted covariance matrix is the weighted square difference between each Sigma point and the predicted mean

Back to overview

Update Steps

I only describe the case for a non-linear measurement model. This is the case for radar measurements, but not for lidar. The processing chain for lidar measurements from here on is the same as for the vanilla KF.

4 Map predicted Sigma points to measurement space

If the state to measurement space transform is non-linear, we could start the unscented transform again, based on our predicted mean and covariance matrix. But instead of generating new Sigma points, we’re going green and recycle our predicted Sigma point matrix. The measurement model is described by

We can get around the augmentation step, because the noise vector omega has a purely additive effect and we can set it to 0 if we compensate for that later, when we calculate the covariance in measurement space.

The transformation can be described as

The mapping is performed by applying the measurement model on each column of the predicted Sigma point matrix. Since the predicted Sigma points live in the original state space with 5 dimensions and radar measurements contain 3 variables, this gives us . The measurement transform for the whole predicted Sigma point matrix can then be described as .

The radar measurement vector is given by

and the mapping from state variables to measurement variables is defined with

With the Sigma point matrix transformed to measurement space, we can use it to calculate the predicted mean in measurement space the same way we did before in state space:

and also the covariance in measurement space, but now we compensate for the (linear) noise by also adding the sensor covariance matrix R.

The sensor covariance matrix R contains the standard deviations for every measurement variable (assuming they are independent from each other) and is given by the sensor manufacturer.

Almost done! What’s left is the final update step :sweat_smile:.

Back to overview

5 Update state and covariance

Now we connect the uncertainty of the state and measurement space by calculating the cross-correlation matrix T, which we need to calculate the Kalman Gain K. Then we can finally perform the posterior update.

And now the drum roll please … here comes the final update for the state vector and its covariance matrix:

And this is why we can’t have nice things… :unamused: Now that I got that out of my head, I can finally make my whiteboard white again erase my whiteboard!

This helped me to keep myself sane during the project…

Back to overview


View on Github