# Probabilistic programming languages for statistical inference

## Introduction

This post was inspired by a question about JAGS vs BUGS vs Stan:

> right, that's what got me confused! so they.. do the same thing? @RallidaeRule
>
> Andrew MacDonald 🌈 (@polesasunder), January 10, 2017

Explaining the differences would be too much for Twitter, so I’m just gonna give a quick explanation here.

**2020-05-18 update**: I come from a background of statistical inference in academia and research using R, where these have been the prevalent probabilistic programming languages (PPLs) for quite some time, so I admittedly have a bit of a blind spot for PyMC3. I know it’s *very* popular in industry (especially for probabilistic machine learning) and can even be used for supply chain optimization. Furthermore, the rise of deep probabilistic programming in the past few years has yielded PPLs like the TensorFlow-based Edward (superseded by Edward2) and TensorFlow Probability, and the PyTorch-based Pyro and Brancher. I also don’t have a lot of experience with Julia, so my blind spot extends to Turing.jl and Gen.jl. Clearly there are…a *lot* of choices, so I would recommend checking out Colin Carroll’s tour of PPL APIs.

## Bayesian modelling languages

### BUGS (Bayesian inference Using Gibbs Sampling)

I was taught to do Bayesian stats using WinBUGS, which is now a very outdated (but stable) piece of software for Windows. There’s also OpenBUGS, an open source version that can run on Macs and Linux PCs. The main benefit of knowing BUGS is that academic papers and textbooks written in the 1980s, 1990s, and early 2000s that use Bayesian stats might include models written in BUGS. For example, *Bayesian Data Analysis* (1st and 2nd editions) and *Data Analysis Using Regression and Multilevel/Hierarchical Models* use BUGS.

### JAGS (Just Another Gibbs Sampler)

JAGS, like OpenBUGS, is available on multiple platforms. The language it uses is basically BUGS, but with a few minor differences that require you to port BUGS models to JAGS before you can run them.

I used JAGS during my time at the University of Pittsburgh’s neuropsych research program for three reasons: we used Macs, I liked that JAGS was written from scratch, and I preferred the R interface to JAGS over the R interfaces to WinBUGS/OpenBUGS.
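Both BUGS and JAGS have "Gibbs" in their names, so here is a minimal sketch of what a Gibbs sampler does, written in plain Python rather than in the BUGS language. It is not how either program is implemented. The model (a normal likelihood with conjugate priors), the function name `gibbs_normal`, and all the prior settings are assumptions I picked for illustration. The idea is to repeatedly draw each parameter from its distribution conditional on the current value of the other one.

```python
import random


def gibbs_normal(y, n_iter=5000, burn_in=1000, seed=1,
                 mu0=0.0, tau0=1e-4, a=0.01, b=0.01):
    """Gibbs sampler for y[i] ~ Normal(mu, 1/tau), tau being a precision.

    Illustrative conjugate priors:
      mu  ~ Normal(mu0, 1/tau0)
      tau ~ Gamma(shape=a, rate=b)
    """
    rng = random.Random(seed)
    n = len(y)
    sum_y = sum(y)
    mu, tau = sum_y / n, 1.0  # starting values
    draws = []
    for i in range(n_iter):
        # Draw mu given tau: Normal with precision tau0 + n * tau
        prec = tau0 + n * tau
        mean = (tau0 * mu0 + tau * sum_y) / prec
        mu = rng.gauss(mean, prec ** -0.5)
        # Draw tau given mu: Gamma(a + n/2, rate = b + sum of squares / 2)
        ss = sum((yi - mu) ** 2 for yi in y)
        rate = b + 0.5 * ss
        # random.gammavariate takes (shape, scale), so pass 1 / rate
        tau = rng.gammavariate(a + 0.5 * n, 1.0 / rate)
        if i >= burn_in:
            draws.append((mu, tau))
    return draws


# Simulated data with true mean 5 and true standard deviation 2
data_rng = random.Random(42)
y = [data_rng.gauss(5.0, 2.0) for _ in range(200)]

draws = gibbs_normal(y)
post_mu = sum(d[0] for d in draws) / len(draws)
post_sigma = sum(d[1] ** -0.5 for d in draws) / len(draws)
print(f"posterior mean of mu:    {post_mu:.2f}")
print(f"posterior mean of sigma: {post_sigma:.2f}")
```

With BUGS or JAGS you only write down the model, and the software works out a sampling scheme for you. That is the whole appeal of a modelling language.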

### Stan

Stan is a newcomer and it’s pretty awesome. It has a bunch of interfaces to modern data analysis tools. The language syntax was designed from scratch by people who wrote BUGS programs, thought it could be better, and were inspired by R’s vectorized functions. It’s strict about the type of data (integer vs real number) and about parameters vs transformed parameters. That might make it harder to get into than BUGS, which gives you a lot of leeway (kind of like R does), but I personally like constraints and precision since that’s what allows it to be hella fast: Stan compiles your models into C++, hence the need for strictness. I also really like Stan’s Shiny app for exploring the posterior samples, which also supports MCMC output from JAGS and others.
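To make that strictness concrete, here is a small sketch of a Stan program for a simple linear regression. The model and every variable name in it are my own illustrative assumptions. I’ve kept the program in a Python string so the snippet runs without Stan installed. Nothing is compiled or fitted here, and the hypothetical `block_names` helper only lists the program blocks.

```python
import re

# A Stan program as plain text; to fit it you would hand it to a Stan interface.
STAN_PROGRAM = """
data {
  int<lower=0> N;            // number of observations: must be an integer
  vector[N] x;               // predictor: real-valued
  vector[N] y;               // outcome: real-valued
}
parameters {
  real alpha;                // intercept
  real beta;                 // slope
  real<lower=0> sigma;       // residual sd, constrained to be positive
}
transformed parameters {
  vector[N] mu = alpha + beta * x;   // vectorized: no loop over observations
}
model {
  y ~ normal(mu, sigma);     // vectorized sampling statement
}
"""


def block_names(program):
    """Return the names of the top-level blocks in a Stan program string."""
    pattern = r"^([a-z]+(?: [a-z]+)*)\s*\{\s*$"
    return re.findall(pattern, program, flags=re.MULTILINE)


print(block_names(STAN_PROGRAM))
```

Every variable is declared with a type and, where relevant, a constraint, and each one lives in a block that says whether it is data, a parameter, or something derived from parameters. BUGS doesn’t ask for any of that.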

The latest (3rd) edition of *Bayesian Data Analysis* has examples in Stan and *Statistical Rethinking* uses R and Stan, so if you’re using modern textbooks to learn Bayesian statistics, you’re more likely to find examples in Stan.

There are a few pretty cool R interfaces to Stan that make it easier to specify your models. The first one is rethinking (which accompanies the *Statistical Rethinking* book mentioned earlier), and then there are brms and rstanarm, which use a formula syntax similar to lme4.

## Conclusion

> can I just do Stan forever?
>
> Andrew MacDonald 🌈 (@polesasunder), January 10, 2017

Stan has an active discussion board and active development, so if you run into issues with a particular model or distribution, or if you’re trying to do something that Stan doesn’t support, you can reach out there. You’ll receive help, and maybe they’ll even add support for whatever it is you were trying to do.