Mostly-free resources for learning data science
In the past year or two, several friends have approached me about learning statistics because their employer or organization was moving toward a more data-driven approach to decision making. (This brought me a lot of joy.) I firmly believe you don’t need a fancy degree and tens of thousands of dollars in tuition debt to engage with data, glean insights, and make inferences from it. And thanks to many wonderful statisticians on the Internet, there is now a plethora of freely accessible resources that enable curious minds to learn the art and science of statistics.
First, I recommend installing R, along with RStudio for actually using it. Both are free, and they’re what I use for almost all of my statistical analyses. Most of the links in this post involve learning by doing statistics in R.
Okay, now on to learning stats…
Free, self-paced online courses from trustworthy institutions:
- Data Analysis and Statistical Inference
- Carnegie Mellon University’s Open Learning Initiative:
- Harvard University’s Data Analysis for Life Sciences 1: Statistics and R on edX
- University of Toronto’s Statistics: Making Sense of Data on Coursera
Not free online courses from trustworthy institutions:
- Johns Hopkins University’s Data Science Specialization on Coursera
- Introduction to Data Science with R by Garrett Grolemund of RStudio
Free books and other resources:
- DataSciGuide (Data Science Learning Directory) by Renee Marie Parilak Teate of Becoming A Data Scientist
- Computer Age Statistical Inference by Bradley Efron and Trevor Hastie (free PDF)
- R for Data Science by Garrett Grolemund and Hadley Wickham
- Python Data Science Handbook by Jake VanderPlas, in the form of Jupyter notebooks
- Think Bayes: Bayesian Statistics Made Simple by Allen B. Downey
- An Introduction to Statistical Learning by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani (free PDF available from www-bcf.usc.edu/~gareth/ISL/)
- Learn R in R with swirl
- Probability and statistics ebook by UCLA’s Statistics Online Computational Resource
- AP Statistics Tutorial
Book recommendations:
- Introductory Statistics with R by Peter Dalgaard
- Doing Data Science: Straight Talk from the Frontline by Cathy O’Neil and Rachel Schutt
- Statistics in a Nutshell by Sarah Boslaugh
- Principles of Uncertainty by Jay Kadane (free PDF at http://uncertainty.stat.cmu.edu/)
- Statistical Rethinking: A Bayesian Course with Examples in R and Stan by Richard McElreath
Phew! Okay, that should be enough. Feel free to suggest more in the comments below.