2020-07-16 UPDATE: I have modified this post from its original publication. The previous version (available from the Internet Archive) used the {RStan} package, which, as it turns out, is a nightmare to get working on Windows. This version uses the in-active-development {CmdStanR} interface – a lightweight interface to Stan based on CmdStan and works exactly the same as its Python counterpart CmdStanPy.

2020-07-30 UPDATE: CmdStan 2.24 has improved performance with ODEs and I’ve updated the Stan model code to use the new API. See release candidate announcement for more details. I kept the original, pre-2.24 model code at the bottom of the post for reference.

This post – inspired by / based on Demetri Pananos’s Introduction of pymc3.ode API – is my first since moving my whole website and blog from Squarespace to a site created with {blogdown} (which uses R Markdown and Hugo), version-controlled on GitHub, and hosted on Netlify. That process wasn’t too hard, mostly just long and boring.

Anyway, I wanted to (1) use the new setup in a cool way, (2) share something I learned – namely, how ODE models are specified in Stan, and (3) figure out how to get everything working correctly (e.g. MathJax, SVG charts, Wikipedia context cards).

These are the packages I’ll be using:

library(dplyr) # data wrangling
library(cmdstanr) # inference
library(zeallot) # multi-assignment
# Visualization:
library(ggplot2)
library(hrbrthemes)
theme_set(theme_ipsum_rc(
  base_size = 14, subtitle_size = 16, axis_title_size = 12
))

Problem

An object of mass \(m = 2\) is brought to some height and allowed to fall freely until it reaches the ground. A differential equation describing the object’s speed \(y\) over time is

$$ y’ = mg - \gamma y $$

The force the object experiences in the downwards direction is \(mg\) – where \(g = 9.8\) is the force of gravity – while the force the object experiences in the opposite direction (due to air resistance \(\gamma = 0.4\)) is proportional to how fast the object is presently moving.

freefall <- function(t, y, params) {
  dy <- 2.0 * params["g"] - params["gamma"] * y
  return(list(dy))
}
times <- seq(0, 10, 0.5)
c(gamma, g, y0, sigma, n) %<-% c(0.4, 9.8, -2, 2, length(times))

set.seed(42)
speed <- deSolve::ode(
  y0, times, freefall,
  parms = c("g" = g, "gamma" = gamma)
) %>%
  as.data.frame %>%
  setNames(c("time", "y")) %>%
  mutate(y_obs = y + rnorm(n, 0, sigma)) # measurement noise

The results look like this:

The goal is to infer \(\gamma\), \(g\), and \(\sigma\) from the data. We will use Stan to perform Bayesian inference.

Solution

We compile the following Stan model as speed_model (via cmdstan_model):

functions {
  vector freefall(real t,        // time
                  vector y,      // state
                  vector theta,  // parameters
                  real m) {      // mass
    vector[1] dydt;
    // theta[2] is gravity, theta[1] is gamma
    dydt[1] = m * theta[2] - theta[1] * y[1];
    return dydt;
  }
}
data {
  int<lower = 1>   T;
  real             t0;
  real<lower = t0> ts[T];
  real             y0_obs;
  real             y_obs[T];
  real             m; // mass of object
}
parameters {
  real<lower = 0>      sigma; // noise
  vector<lower = 0>[2] theta; // [gamma, g]
  vector[1]            y0;    // initial condition to be inferred
}
model {
  vector[1] mu[T] = ode_bdf(freefall, y0, t0, ts, theta, m);
  sigma ~ cauchy(0, 5);
  theta[1] ~ gamma(2, 0.2); // mostly [0, 1]
  theta[2] ~ gamma(3, 2);   // mostly [0, 15]
  y0 ~ std_normal();        // prior on initial condition
  y0_obs ~ normal(y0[1], sigma);
  for (t in 1:T) {
    y_obs[t] ~ normal(mu[t], sigma);
  }
}

Stan’s ode_bdf expects times \(t = 1, \ldots, T\), not starting at 0 – since \(t_0\) is provided along with \(y_0\). (This is important!)

If we didn’t want to infer the initial condition \(y_0\), we would need to:

  • change y0_obs to y0 in the data block
  • remove:
    • vector[1] y0; from the parameters block
    • the prior y0 ~ std_normal(); from the model block
    • the likelihood y0_obs ~ normal(y0[1], sigma); from he model block

Nothing in the ode_bdf call needs to change since it can handle a y0 that’s either data or parameter, it just has to be a vector type. For more information see Ordinary Differential Equations in Stan User Guide and ODE Solvers in Stan Functions Reference.

Okay, now on to inference by calling $sample() on the compiled model and providing the data:

speed_data <- list(
  T = n - 1,
  t0 = 0,
  ts = times[-1],
  m = 2,
  y0_obs = speed$y_obs[1],
  y_obs = speed$y_obs[-1]
)

speed_fit <- speed_model$sample(
  data = speed_data,
  seed = 42L,
  refresh = 0, # silence progress notification
  output_dir = here("temp")
)
## Running MCMC with 4 parallel chains...
## 
## Chain 1 finished in 45.5 seconds.
## Chain 2 finished in 45.6 seconds.
## Chain 4 finished in 45.9 seconds.
## Chain 3 finished in 48.1 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 46.3 seconds.
## Total execution time: 48.6 seconds.

Note: The output_dir is optional. I specify it here to help with caching my RMarkdown chunks.

Here are the results:

speed_fit$summary()
## # A tibble: 5 x 10
##   variable     mean   median     sd    mad      q5     q95  rhat ess_bulk
##   <chr>       <dbl>    <dbl>  <dbl>  <dbl>   <dbl>   <dbl> <dbl>    <dbl>
## 1 lp__     -45.3    -44.9    1.57   1.32   -48.3   -43.5    1.00    1283.
## 2 sigma      2.76     2.69   0.507  0.464    2.08    3.70   1.00    1841.
## 3 theta[1]   0.398    0.398  0.0377 0.0363   0.337   0.459  1.00    1416.
## 4 theta[2]   9.67     9.68   0.711  0.681    8.52   10.8    1.00    1375.
## 5 y0[1]      0.0352   0.0273 0.909  0.907   -1.41    1.57   1.00    2016.
## # ... with 1 more variable: ess_tail <dbl>

And here’s a pretty plot of those results:

Pre-2.24 model code

functions {
  real[] freefall(real t,        // time
                  real[] y,      // state
                  real[] theta,  // parameters
                  real[] x_r,    // data (real)
                  int[] x_i) {   // data (integer))
    real dydt[1];
    //  x_r[1] is mass, theta[2] is gravity, theta[1] is gamma
    dydt[1] = x_r[1] * theta[2] - theta[1] * y[1];
    return dydt;
  }
}
data {
  int<lower = 1>   T;
  real             t0;
  real<lower = t0> times[T];
  real             y0_obs;
  real             y_obs[T, 1];
  real             m; // mass of object
}
transformed data {
  real x_r[1];
  int  x_i[0];
  x_r[1] = m;
}
parameters {
  real<lower = 0> sigma;    // noise
  real<lower = 0> theta[2]; // [gamma, g]
  real            y0[1];    // initial condition to be inferred
}
model {
  real y[T, 1];
  // Priors:
  sigma ~ cauchy(0, 5);
  theta[1] ~ gamma(2, 0.2); // mostly [0, 1]
  theta[2] ~ gamma(3, 2);   // mostly [0, 15]
  y0 ~ std_normal(); // prior on initial condition
  // Likelihood:
  y0_obs ~ normal(y0, sigma);
  y = integrate_ode_bdf(freefall, y0, t0, times, theta, x_r, x_i);
  y_obs[, 1] ~ normal(y[, 1], sigma);
}