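A QQ (quantile-quantile) plot helps us check whether a dataset follows a given distribution by plotting the quantiles of the data against the quantiles of that distribution.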

Let’s take the normal distribution as an example:

Steps to make a QQ plot for a normal distribution

Step 1: Sort the data and give each point its own quantile

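Step 2: Pick a normal curve (any mean and standard deviation will do, since changing them only shifts and rescales the x axis and does not change whether the dots form a straight line)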

Step 3: Divide the curve into the same number of quantiles as there are data points

Step 4: Plot the QQ plot

For each quantile, take the x coordinate from the normal curve and the y coordinate from the corresponding value in the dataset. Drawing one dot per quantile gives us the QQ plot.
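The four steps above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library: the function name normal_qq_points, the sample values, and the plotting positions (i + 0.5) / n are assumptions made for the example (other conventions for assigning quantiles exist). It computes the coordinates of the dots, which can then be passed to any plotting library.

```python
from statistics import NormalDist

def normal_qq_points(data, mu=0.0, sigma=1.0):
    """Return (x, y) pairs for a normal QQ plot (hypothetical helper)."""
    # Step 1: sort the data so the i-th smallest value gets the i-th quantile.
    ys = sorted(data)
    n = len(ys)
    # Assumed convention: probabilities (i + 0.5) / n, which stay strictly
    # between 0 and 1 so the inverse CDF is always defined.
    probs = [(i + 0.5) / n for i in range(n)]
    # Steps 2 and 3: pick a normal curve and read off the same number of quantiles.
    curve = NormalDist(mu, sigma)
    xs = [curve.inv_cdf(p) for p in probs]
    # Step 4: x comes from the normal curve, y comes from the data.
    return list(zip(xs, ys))

# Made-up sample values, for illustration only.
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.8, 2.7]
points = normal_qq_points(sample)
for x, y in points:
    print(f"{x:7.3f}  {y:4.1f}")
```

Passing a different mu or sigma changes the x values but not whether the dots line up, which is why any normal curve works in Step 2.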

Interpret the results

If most dots fall close to a straight line, the data is approximately normally distributed. If many of them do not, we should consider other distributions.
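Judging "close to a straight line" is usually done by eye, but one rough way to put a number on it is the correlation between the x and y coordinates of the dots. The sketch below is an assumption added for illustration, not a formal normality test: the helper qq_correlation and both example datasets are hypothetical, and the datasets are constructed deterministically rather than measured.

```python
from math import log
from statistics import NormalDist, mean

def qq_correlation(data):
    """Pearson correlation between sorted data and standard normal quantiles."""
    ys = sorted(data)
    n = len(ys)
    # Same assumed plotting positions as a basic QQ plot: (i + 0.5) / n.
    xs = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

n = 50
probs = [(i + 0.5) / n for i in range(n)]
# Illustrative data: exact quantiles of a normal curve with mean 10 and sd 2 ...
normal_like = [NormalDist(10, 2).inv_cdf(p) for p in probs]
# ... and exact quantiles of a right-skewed exponential curve.
skewed = [-log(1 - p) for p in probs]

r_normal = qq_correlation(normal_like)
r_skewed = qq_correlation(skewed)
print(r_normal, r_skewed)
```

A value very close to 1 means the dots sit almost exactly on a line; the skewed data gives a lower value because its dots curve away from the line.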

Compare two datasets with a QQ plot

A QQ plot can also be used to compare two datasets and see whether they have similar distributions.

Example: Compare a dataset to another, smaller dataset

The new dataset only has 4 quantiles (one per data point).

The steps are similar. We take the number of quantiles from the smaller dataset and divide the other dataset into the same number of quantiles. Then we get the x coordinates from the new dataset and the y coordinates from the original dataset.

Once we have the dots, we can add a straight line to see how well they fit. If they fit well, the two datasets have similar distributions.
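A minimal sketch of this two-dataset comparison follows, again using only the Python standard library. The helper names quantile and two_sample_qq_points, the linear interpolation between neighbouring values, and the example numbers are all assumptions made for illustration; other quantile conventions would give slightly different dots.

```python
def quantile(sorted_values, p):
    """Interpolated quantile, treating the j-th sorted value as probability (j + 0.5) / n."""
    n = len(sorted_values)
    # Convert the probability to a fractional index and clamp it to the valid range.
    pos = min(max(p * n - 0.5, 0.0), n - 1.0)
    lo = int(pos)
    hi = min(lo + 1, n - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])

def two_sample_qq_points(original, new):
    """x from the new (smaller) dataset, y from the original (larger) dataset."""
    small = sorted(new)
    large = sorted(original)
    m = len(small)
    # The smaller dataset sets the number of quantiles: one per data point.
    probs = [(i + 0.5) / m for i in range(m)]
    ys = [quantile(large, p) for p in probs]
    return list(zip(small, ys))

# Made-up data: 12 values in the original dataset, 4 in the new one.
original = list(range(1, 13))
new = [3, 1, 4, 2]
points = two_sample_qq_points(original, new)
print(points)
```

In this example the four dots fall on a single straight line, which is what we would expect from two datasets that are both evenly spread over their ranges.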