# Math for Machine Learning: Top Math Resources for Data Scientists

2 May 2017

*At some point, every aspiring data scientist has to get familiar with mathematics for machine learning.*

To be blunt, the more serious you are about data science, the more math you’ll need to learn for machine learning. If you have a strong math background, this is likely to be little issue. In my case, I’ve had to relearn much of the mathematics that I took at university (note – I’m not done yet!), as my professional life had allowed my math skills to atrophy.

Based on my experience teaching our bootcamp, there is also a group of aspiring data scientists whose formal math training needs to be augmented. For example, we have many students who come from marketing backgrounds, where studying linear algebra was never a requirement.

## What Math Skills Do Data Scientists Need?

Forms of the question “what math do I need for data science” and “what math do I need for machine learning” are popular on sites like Quora. I would encourage all aspiring data scientists to perform their own research on this subject and not to take my post as gospel. However, as I often get asked for my opinion on what math aspiring data scientists need to know/study, I will provide my own list:

- Basic statistics and probability (e.g., normal and Student’s t distributions, confidence intervals, t-tests of significance, p-values, etc.).
- Linear algebra (e.g., eigenvectors).
- Single variable calculus (e.g., minimization/maximization using derivatives).
- Multivariate calculus (e.g., minimization/maximization with gradients).

Please note that the above is **not** an exhaustive list. To be honest, you can likely never know enough math to help you as a data scientist. What I would argue is that the above list represents the 80/20 rule – the 20% of math that you will use 80% of the time as a practicing data scientist.
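To make the two calculus items in the list a bit more concrete, here is a minimal Python sketch of minimization using a derivative (single variable) and a gradient (multivariate). The function names, example functions, learning rate, and step count are all my own assumptions for illustration; they do not come from any of the resources discussed in this post, and real machine learning libraries do this far more carefully.

```python
# Illustrative sketch only: the helper names, example functions, and
# hyperparameters below are assumptions made up for this post.

def minimize_1d(derivative, x0, learning_rate=0.1, steps=200):
    """Single-variable minimization: repeatedly step against the derivative."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * derivative(x)
    return x


def minimize_nd(gradient, point, learning_rate=0.1, steps=200):
    """Multivariate minimization: repeatedly step against the gradient vector."""
    point = list(point)
    for _ in range(steps):
        grad = gradient(point)
        point = [p - learning_rate * g for p, g in zip(point, grad)]
    return point


# f(x) = (x - 2)^2 has derivative 2(x - 2) and its minimum at x = 2.
x_min = minimize_1d(lambda x: 2 * (x - 2), x0=10.0)

# g(x, y) = (x - 3)^2 + (y + 1)^2 has gradient (2(x - 3), 2(y + 1))
# and its minimum at (x, y) = (3, -1).
xy_min = minimize_nd(lambda p: [2 * (p[0] - 3), 2 * (p[1] + 1)], [0.0, 0.0])

print(round(x_min, 4), [round(v, 4) for v in xy_min])
```

The only difference between the two helpers is that the second one updates every coordinate at once, which is why comfort with single variable derivatives makes gradients much easier to pick up.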

## A List of Top Math Resources

Here’s my list of the top 80/20 math resources for aspiring data scientists:

The Cartoon Guide to Statistics is one of the books we provide to our bootcamp students, and it is an excellent resource for gently learning – or refreshing – your statistics knowledge. It covers many of the basic concepts in statistics in an easy-to-consume and entertaining fashion. Well worth a read.

Coursera’s Statistics with R Specialization is a must for every aspiring data scientist. The accompanying textbook is also a great read. I liked the book so much I picked up a hard copy from Amazon.

Interestingly, I’ve found that University of California Irvine’s free UCI Open course Math 4: Math for Economists is a most excellent resource for focusing on the specific aspects of linear algebra and multivariate calculus that aspiring data scientists need. The accompanying textbook is also quite good and covers a number of interesting subjects, including single variable calculus for folks who need a refresher.

## The Takeaway

Studying the above resources will take you a long way in developing the math skills required for data science. For example, you will be well-prepared to study books like Intro to Statistical Learning, Elements of Statistical Learning, and Applied Predictive Modeling, including all the mathematics related to the algorithms.

Until next time! I wish you happy data sleuthing!