This post – inspired by / based on Demetri Pananos’s Introduction of pymc3.ode API – is my first since moving my whole website and blog from Squarespace to a site created with {blogdown} (which uses R Markdown and Hugo), version-controlled on GitHub, and hosted on Netlify. That process wasn’t too hard, mostly just long and boring.

Anyway, I wanted to (1) use the new setup in a cool way, (2) share something I learned – namely, how ODE models are specified in Stan, and (3) figure out how to get everything working correctly (e.g. MathJax, SVG charts, Wikipedia context cards).

These are the packages I’ll be using:

library(dplyr) # data wrangling
library(rstan) # inference
library(zeallot) # multi-assignment
# Visualization:
library(ggplot2)
library(hrbrthemes)
theme_set(theme_ipsum_rc(
  base_size = 14, subtitle_size = 16, axis_title_size = 12
))

Problem

An object of mass \(m = 2\) is brought to some height and allowed to fall freely until it reaches the ground. A differential equation describing the object’s speed \(y\) over time is

$$ y' = mg - \gamma y $$

The force the object experiences in the downwards direction is \(mg\) – where \(g = 9.8\) is the acceleration due to gravity – while the force it experiences in the opposite direction, due to air resistance with drag coefficient \(\gamma = 0.4\), is proportional to how fast the object is currently moving.
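For reference, this linear ODE has a closed-form solution (handy as a sanity check on the numerical solver used below):

$$ y(t) = \frac{mg}{\gamma} + \left( y_0 - \frac{mg}{\gamma} \right) e^{-\gamma t} $$

where \(y_0 = y(0)\), so the speed approaches the terminal value \(mg / \gamma = 49\).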

# ODE right-hand side: dy/dt = m * g - gamma * y, with mass m = 2 hard-coded
freefall <- function(t, y, params) {
  dy <- 2.0 * params["g"] - params["gamma"] * y
  return(list(dy))
}
times <- seq(0, 10, 0.5)
# True gamma, g, initial speed y0, noise SD sigma, and number of time points n
c(gamma, g, y0, sigma, n) %<-% c(0.4, 9.8, -2, 2, length(times))

set.seed(42)
speed <- deSolve::ode(
  y0, times, freefall,
  parms = c("g" = g, "gamma" = gamma)
) %>%
  as.data.frame %>%
  setNames(c("time", "y")) %>%
  mutate(y_obs = y + rnorm(n, 0, sigma)) # measurement noise

The results look like this:
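Here’s roughly how that plot can be drawn with the packages loaded above (a sketch – the actual figure’s styling may differ):

ggplot(speed, aes(x = time)) +
  geom_line(aes(y = y)) +         # true ODE solution
  geom_point(aes(y = y_obs)) +    # noisy measurements
  labs(
    title = "Simulated free fall",
    x = "Time", y = "Speed"
  )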

The goal is to infer \(\gamma\), \(g\), and \(\sigma\) from the data. We will use Stan to perform Bayesian inference.

Solution

We compile the following Stan model as speed_model (via stan_model):

functions {
  real[] freefall(real t,        // time
                  real[] y,      // state
                  real[] theta,  // parameters
                  real[] x_r,    // data (real)
                  int[] x_i) {   // data (integer)
    real dydt[1];
    //  x_r[1] is mass, theta[2] is gravity, theta[1] is gamma
    dydt[1] = x_r[1] * theta[2] - theta[1] * y[1];
    return dydt;
  }
}
data {
  int<lower = 1>   T;
  real             t0;
  real<lower = t0> times[T];
  real             y0_obs;
  real             y_obs[T, 1];
  real             m; // mass of object
}
transformed data {
  real x_r[1];
  int  x_i[0];
  x_r[1] = m;
}
parameters {
  real<lower = 0> sigma;    // noise
  real<lower = 0> theta[2]; // [gamma, g]
  real            y0[1];    // initial condition to be inferred
}
model {
  real y[T, 1];
  // Priors:
  sigma ~ cauchy(0, 5);
  theta[1] ~ gamma(2, 0.2); // mostly [0, 1]
  theta[2] ~ gamma(3, 2);   // mostly [0, 15]
  y0 ~ std_normal(); // prior on initial condition
  // Likelihood:
  y0_obs ~ normal(y0, sigma);
  y = integrate_ode_bdf(freefall, y0, t0, times, theta, x_r, x_i);
  y_obs[, 1] ~ normal(y[, 1], sigma);
}
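The compilation step itself would look something like this – assuming the Stan program above is saved to a file, here called speed.stan purely for illustration:

speed_model <- stan_model(
  file = "speed.stan",        # file containing the Stan program above
  model_name = "speed_model"
)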

Stan’s integrate_ode_bdf expects the T requested solution times \(t_1, \ldots, t_T\) to come strictly after \(t_0\) – the initial time \(t_0\) is provided separately, along with \(y_0\) – so the times array must not start at 0. (This is important!)

If we didn’t want to infer the initial condition \(y_0\), we would need to:

  • change y0_obs to y0 in the data block
  • remove:
    • real y0[1]; from the parameters block
    • the prior y0 ~ std_normal(); from the model block
    • the likelihood y0_obs ~ normal(y0, sigma); from the model block

Nothing in the integrate_ode_bdf call needs to change, since it can handle a y0 that’s either data or a parameter. For more information, refer to the Ordinary Differential Equations chapter in the Stan User’s Guide.

Okay, now on to inference by sampling:

speed_data <- list(
  T = n - 1,
  t0 = 0,
  times = times[-1],
  m = 2,
  y0_obs = speed$y_obs[1],
  y_obs = array(speed$y_obs[-1], dim = c(n - 1, 1))
)

speed_fit <- sampling(
  speed_model,
  data = speed_data,
  refresh = 0 # silence progress notification
)
print(speed_fit)
## Inference for Stan model: speed_model.
## 4 chains, each with iter=2000; warmup=1000; thin=1; 
## post-warmup draws per chain=1000, total post-warmup draws=4000.
## 
##            mean se_mean   sd   2.5%    25%    50%    75%  97.5% n_eff Rhat
## sigma      2.74    0.01 0.52   1.96   2.37   2.66   3.03   3.93  1689 1.00
## theta[1]   0.40    0.00 0.04   0.32   0.38   0.40   0.42   0.48  1189 1.00
## theta[2]   9.69    0.02 0.73   8.21   9.23   9.70  10.17  11.15  1203 1.00
## y0[1]      0.05    0.02 0.91  -1.76  -0.54   0.02   0.64   1.86  1886 1.00
## lp__     -45.29    0.04 1.62 -49.48 -46.07 -44.91 -44.12 -43.34  1355 1.01
## 
## Samples were drawn using NUTS(diag_e) at Fri May 22 09:41:29 2020.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

And here’s a pretty plot of the results:
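One way to put such a plot together is to summarise the posterior draws and overlay the implied curve on the observations – a sketch using posterior means and the closed-form solution from earlier (the actual figure may have been built differently):

post      <- rstan::extract(speed_fit)
# Posterior means as point estimates (one simple summary choice)
gamma_hat <- mean(post$theta[, 1])
g_hat     <- mean(post$theta[, 2])
y0_hat    <- mean(post$y0[, 1])
m <- 2

fitted <- data.frame(
  time  = times,
  y_fit = m * g_hat / gamma_hat +
    (y0_hat - m * g_hat / gamma_hat) * exp(-gamma_hat * times)
)

ggplot(speed, aes(x = time)) +
  geom_point(aes(y = y_obs)) +                 # observed speeds
  geom_line(data = fitted, aes(y = y_fit)) +   # posterior mean fit
  labs(
    title = "Posterior mean fit vs. observed speeds",
    x = "Time", y = "Speed"
  )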