# Even faster matrix math in R on macOS with M1

I recently picked up an M1 Mac mini to replace my more-than-a-decade-old Mac mini. I wanted to do it while the old one was still operational, so that transferring files (namely my music library) would be easier. Around the same time I did a full Windows reset on my Dell laptop. I had used it for both gaming and development, but after seeing how much RAM various background processes were taking up, I decided to use it for gaming and nothing else.

So I decided to make the new Mac mini my blogging computer. Turns out it’s great!

Sometimes I blog about Bayesian modeling with Stan, and when I re-knit the post about ODEs I saw sampling time drop from 30 seconds per chain (on that Dell G5 laptop with a Core i7 CPU) to 15 seconds per chain. The improvement in computing performance from the M1 is going to be really nice for any future technical blog posts I write.

After that the second thing I wanted to do was revisit an earlier blog post – partly because of my own curiosity, but also partly because a reader wrote in:

> I recently purchased a new MacBook pro with the M1 chip and RStudio runs are much slower than on my old 2012 MacBook. Through lots of google searching I learned that the old MacBook has BLAS enabled, but the new one does not. I tried using your advice from https://mpopov.com/blog/2019/06/04/faster-matrix-math-in-r-on-macos/ but it is not updating when I recheck the session info.

Okay, there are a couple of things going on here. First, if you transferred your system from the old MacBook to the new one then you’re using the “Intel 64-bit” version of R that runs via Rosetta 2, rather than the arm64 version of R which was released to run natively on Apple Silicon. If that’s the case I recommend downloading and installing *that* (available here), together with a copy of GNU Fortran for arm64 (available here) and the latest version of RStudio, since v1.4 is when they added support for R 4.1 *and* native arm64 builds of R.
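If you're not sure which build you currently have, R can report it. Here is a minimal sketch, assuming the native Apple Silicon build reports its architecture as `aarch64` while the Intel build reports `x86_64` even when it runs through Rosetta 2. The variable names are only for illustration.

```r
# Architecture this R build was compiled for.
# Assumption: "aarch64" means a native Apple Silicon (arm64) build, while
# "x86_64" on an M1 Mac means the Intel build is running through Rosetta 2.
arch <- R.version$arch
is_native_arm64 <- identical(arch, "aarch64")

cat("R", as.character(getRversion()), "built for", arch, "\n")
cat("Native arm64 build:", is_native_arm64, "\n")
```

If this prints `x86_64` on an M1 machine, installing the arm64 build is the first thing to fix before touching BLAS.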

In writing this blog post I installed both versions of R ~~4.1.1~~ 4.2.0 and used RSwitch to switch between them.

Following instructions posted to the R-SIG-Mac mailing list, I swapped R's bundled BLAS library for Apple’s vecLib version by re-pointing the `libRblas.dylib` symlink:

```
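# Go to the lib directory of the currently active R version
cd /Library/Frameworks/R.framework/Resources/lib/
# Point libRblas.dylib at the vecLib build (-i asks before overwriting, -v prints the link)
ln -s -i -v libRblas.vecLib.dylib libRblas.dylib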
```
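To confirm that the switch took effect, restart R and check which libraries the new session is linked against. This is a sketch rather than an official procedure: `extSoftVersion()` and `La_library()` are base R functions, but the `using_veclib` check is a heuristic that assumes the resolved BLAS path contains the string "vecLib".

```r
# Path of the BLAS library this session is using
# (may be an empty string if R cannot determine it).
blas_path <- extSoftVersion()[["BLAS"]]

# Path of the LAPACK library this session is using.
lapack_path <- La_library()

cat("BLAS:  ", blas_path, "\n")
cat("LAPACK:", lapack_path, "\n")

# Heuristic (assumption): the path mentions vecLib once the symlink is swapped.
using_veclib <- grepl("vecLib", blas_path, fixed = TRUE)
cat("Looks like vecLib:", using_veclib, "\n")
```

`sessionInfo()` prints the same BLAS and LAPACK paths, so if it still shows R's own `libRblas` after the swap, the session probably predates the change or a different R installation is active.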

## Benchmark

This is the code I benchmarked on all four configurations (the arm64 and Intel-via-Rosetta 2 builds of R, each with R's own BLAS and with Apple's vecLib):

```
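library(microbenchmark)

set.seed(20211010)

# Square matrix for tcrossprod(), solve(), and svd()
d <- 1e2
a <- matrix(rnorm(d^2), d, d)

# Simulated regression data: n observations, p predictors plus an intercept
n <- 1e3
p <- 1e2
b <- rnorm(p + 1, 0, 10)
x <- matrix(runif(n * p, -10, 10), ncol = p, nrow = n)
y <- cbind(1, x) %*% b + rnorm(n, 0, 2)

# Run each expression 1,000 times and report timings in milliseconds
mb <- microbenchmark(
  tcrossprod(a), solve(a), svd(a), lm(y ~ x),
  times = 1000L,
  unit = "ms"
)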
```
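The timings can then be reduced to a five-number summary per expression. The sketch below is one way to do that, not necessarily how the numbers in this post were produced. It assumes the microbenchmark package is installed (it skips quietly otherwise), and it uses a smaller, single-expression run so it finishes quickly.

```r
# Columns of summary(<microbenchmark>) to keep: expression plus five-number summary.
summary_cols <- c("expr", "min", "lq", "median", "uq", "max")

if (requireNamespace("microbenchmark", quietly = TRUE)) {
  set.seed(1)
  a_small <- matrix(rnorm(100^2), 100, 100)
  # Small illustrative run: 100 repetitions of a single expression
  mb_small <- microbenchmark::microbenchmark(
    tcrossprod(a_small),
    times = 100L
  )
  # Report the summary statistics in milliseconds
  print(summary(mb_small, unit = "ms")[, summary_cols])
}
```

Reporting the median alongside the quartiles is useful here because the maximum is often dominated by a single slow run.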

## Results

M1 Mac mini benchmarks of various operations involving matrix math. All values are execution times in milliseconds.

| Operation | Configuration | Minimum | Lower Quartile | Median | Upper Quartile | Maximum |
|---|---|---|---|---|---|---|
| tcrossprod(a) | R 4.2.0 with R's BLAS on Apple Silicon arm64 | 0.19 | 0.20 | 0.20 | 0.20 | 3.73 |
| tcrossprod(a) | R 4.2.0 with R's BLAS on Intel 64-bit via Rosetta 2 | 0.22 | 0.22 | 0.23 | 0.23 | 3.11 |
| tcrossprod(a) | R 4.2.0 with Apple vecLib on Apple Silicon arm64 | 0.02 | 0.02 | 0.02 | 0.02 | 6.39 |
| tcrossprod(a) | R 4.2.0 with Apple vecLib on Intel 64-bit via Rosetta 2 | 0.04 | 0.07 | 0.07 | 0.08 | 2.90 |
| svd(a) | R 4.2.0 with R's BLAS on Apple Silicon arm64 | 2.57 | 2.61 | 2.62 | 2.64 | 32.70 |
| svd(a) | R 4.2.0 with R's BLAS on Intel 64-bit via Rosetta 2 | 2.91 | 2.93 | 2.94 | 2.97 | 29.78 |
| svd(a) | R 4.2.0 with Apple vecLib on Apple Silicon arm64 | 1.02 | 1.07 | 1.09 | 1.28 | 5.59 |
| svd(a) | R 4.2.0 with Apple vecLib on Intel 64-bit via Rosetta 2 | 1.70 | 1.73 | 1.76 | 1.79 | 7.44 |
| solve(a) | R 4.2.0 with R's BLAS on Apple Silicon arm64 | 0.51 | 0.53 | 0.53 | 0.54 | 30.58 |
| solve(a) | R 4.2.0 with R's BLAS on Intel 64-bit via Rosetta 2 | 0.57 | 0.58 | 0.59 | 0.60 | 3.53 |
| solve(a) | R 4.2.0 with Apple vecLib on Apple Silicon arm64 | 0.12 | 0.14 | 0.14 | 0.15 | 3.77 |
| solve(a) | R 4.2.0 with Apple vecLib on Intel 64-bit via Rosetta 2 | 0.22 | 0.26 | 0.27 | 0.29 | 2.97 |
| lm(y ~ x) | R 4.2.0 with R's BLAS on Apple Silicon arm64 | 8.28 | 8.37 | 8.41 | 8.52 | 15.27 |
| lm(y ~ x) | R 4.2.0 with R's BLAS on Intel 64-bit via Rosetta 2 | 7.85 | 7.92 | 7.97 | 8.13 | 34.30 |
| lm(y ~ x) | R 4.2.0 with Apple vecLib on Apple Silicon arm64 | 4.38 | 4.55 | 4.61 | 4.71 | 34.63 |
| lm(y ~ x) | R 4.2.0 with Apple vecLib on Intel 64-bit via Rosetta 2 | 3.68 | 3.74 | 3.78 | 3.91 | 29.09 |
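
To put the table in relative terms, the median times can be turned into speedup ratios. This is a small sketch using the medians copied from the table above. The matrix layout and the column names (`rblas_arm64` and so on) are my own labels for the four configurations.

```r
# Median execution times (ms) copied from the results table.
medians <- rbind(
  "tcrossprod(a)" = c(0.20, 0.23, 0.02, 0.07),
  "svd(a)"        = c(2.62, 2.94, 1.09, 1.76),
  "solve(a)"      = c(0.53, 0.59, 0.14, 0.27),
  "lm(y ~ x)"     = c(8.41, 7.97, 4.61, 3.78)
)
# Hypothetical short labels for the four configurations
colnames(medians) <- c(
  "rblas_arm64", "rblas_rosetta", "veclib_arm64", "veclib_rosetta"
)

# Speedup from switching R's BLAS to vecLib, within each build of R
speedup <- cbind(
  arm64   = medians[, "rblas_arm64"] / medians[, "veclib_arm64"],
  rosetta = medians[, "rblas_rosetta"] / medians[, "veclib_rosetta"]
)
print(round(speedup, 1))
```

With these rounded medians, vecLib comes out about 10 times faster than R's BLAS for `tcrossprod(a)` on arm64, and between about 1.7 and 3.8 times faster in every other case. Because the table reports only two decimal places, the ratios for the fastest operations are approximate.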

Posted on: October 10, 2021
