
Used to compare the means of two groups.

The goal of a t-test is to compare the two group means and see whether they are significantly different from each other.

Main concept

The two main concepts from linear regression also apply to the t-test.

  1. Is the x variable useful for predicting y? (via R-squared)
  2. How likely is it that we would see that relationship by chance? (via p-value)

Steps for conducting a t-test, compared with linear regression

Review of the t-test: it compares the means of two groups and checks whether they are significantly different.

Note: the images below compare linear regression (left) and the t-test (right) to show how each step differs between them.

Step 1: Find overall mean

Find the overall mean of the y values, ignoring x in linear regression and ignoring group membership in the t-test.

Step 2: Calculate SS(mean)

SS(mean) = sum of squared residuals around the mean.
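
Steps 1 and 2 can be sketched in a few lines of plain Python. The data values and the helper name `ss_around_mean` are made up for illustration; they are not from the original notes.

```python
# Toy data (hypothetical, for illustration only): y values for two groups.
group1 = [2.0, 3.0, 4.0]
group2 = [6.0, 7.0, 8.0]


def ss_around_mean(values):
    """Sum of squared residuals around the mean of `values`."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


# Steps 1 and 2 ignore group membership, so pool all the points first.
pooled = group1 + group2
overall_mean = sum(pooled) / len(pooled)  # Step 1: overall mean
ss_mean = ss_around_mean(pooled)          # Step 2: SS(mean)

print("overall mean:", overall_mean)
print("SS(mean):", ss_mean)
```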

Step 3: Fit a line to the data

Now we put the x-axis back. To fit a line to the data in linear regression, we can use least squares.

For the t-test, if we also use least squares, the fit is two horizontal lines, one at the mean of each group. We can then combine the two lines into a single equation, so that a computer can fit it the same way it fits a regression line.

Get the equation by combining the two mean lines

For group 1, apply y = 1 * mean1 + 0 * mean2 + the residual for each data point.

Do the same for group 2, but swap the 1 and 0 in the equation. The 1s and 0s form a design matrix (note that it is not a standard design matrix), and the two equations can be combined into:

y = column1 * mean1 + column2 * mean2 + residual

where column1 and column2 act as switches that turn each mean on or off.
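
As a rough sketch of what "fitting" means here, the snippet below builds the 1/0 design matrix for some made-up data and solves the least-squares problem through the 2x2 normal equations. The data and the function name `fit_two_means` are assumptions for illustration; a real analysis would normally hand this to a linear-algebra or statistics library.

```python
# Toy data (hypothetical): y values for two groups.
group1 = [2.0, 3.0, 4.0]
group2 = [6.0, 7.0, 8.0]

y = group1 + group2
# One row per data point: [column1, column2].
# Group 1 rows switch mean1 on, group 2 rows switch mean2 on.
design = [[1, 0] for _ in group1] + [[0, 1] for _ in group2]


def fit_two_means(design, y):
    """Least-squares fit of y = column1*mean1 + column2*mean2 + residual.

    Solves the normal equations (X^T X) b = X^T y for a two-column
    design matrix using Cramer's rule.
    """
    a11 = sum(row[0] * row[0] for row in design)
    a12 = sum(row[0] * row[1] for row in design)
    a22 = sum(row[1] * row[1] for row in design)
    b1 = sum(row[0] * yi for row, yi in zip(design, y))
    b2 = sum(row[1] * yi for row, yi in zip(design, y))
    det = a11 * a22 - a12 * a12
    mean1 = (b1 * a22 - a12 * b2) / det
    mean2 = (a11 * b2 - a12 * b1) / det
    return mean1, mean2


mean1, mean2 = fit_two_means(design, y)
print("fitted means:", mean1, mean2)
```

The fitted coefficients come out equal to the plain group averages, which is why the least-squares "line" for a t-test is just the two mean lines.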

Step 4: Calculate SS(fit)

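SS(fit) = sum of squared residuals around the fitted line(s).

For linear regression, SS(fit) is simple because there is only one line.

For the t-test, SS(fit) is the sum of squared residuals around the two mean lines, with each point measured against its own group's mean.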
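
For the t-test case, SS(fit) can be sketched as the sum of each group's squared residuals around its own mean. The data and the helper name `ss_around_mean` are hypothetical, chosen only to make the arithmetic easy to follow.

```python
# Toy data (hypothetical): y values for two groups.
group1 = [2.0, 3.0, 4.0]
group2 = [6.0, 7.0, 8.0]


def ss_around_mean(values):
    """Sum of squared residuals around the mean of `values`."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


# Each point is compared with its own group's mean line,
# then the two sums are added together.
ss_fit = ss_around_mean(group1) + ss_around_mean(group2)

print("SS(fit):", ss_fit)
```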

Step 5: Calculate F-value

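Plug the numbers from the previous steps into the F-value formula:

$$F = \frac{(\text{SS(mean)} - \text{SS(fit)}) / (p_{\text{fit}} - p_{\text{mean}})}{\text{SS(fit)} / (n - p_{\text{fit}})}$$

where $n$ is the number of data points, $p_{\text{mean}}$ is the number of parameters in the mean-only model, and $p_{\text{fit}}$ is the number of parameters in the fitted model.

Note that here:

For linear regression, $p_{\text{mean}} = 1$ (the overall mean) and $p_{\text{fit}} = 2$ (the y-intercept and the slope).

For the t-test, $p_{\text{mean}} = 1$ (the overall mean) and $p_{\text{fit}} = 2$ (the mean of each group, estimated separately).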
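
The sketch below puts the steps together for made-up data. It assumes the usual form of the F statistic, F = [(SS(mean) - SS(fit)) / (p_fit - p_mean)] / [SS(fit) / (n - p_fit)], and the usual definition R-squared = (SS(mean) - SS(fit)) / SS(mean); the names `f_value`, `p_mean`, and `p_fit` are illustrative. As a sanity check it also computes the classic pooled-variance t statistic, whose square should match F.

```python
import math

# Toy data (hypothetical): y values for two groups.
group1 = [2.0, 3.0, 4.0]
group2 = [6.0, 7.0, 8.0]


def ss_around_mean(values):
    """Sum of squared residuals around the mean of `values`."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def f_value(ss_mean, ss_fit, n, p_mean=1, p_fit=2):
    """F = variation explained per extra parameter / leftover variation per df."""
    explained = (ss_mean - ss_fit) / (p_fit - p_mean)
    unexplained = ss_fit / (n - p_fit)
    return explained / unexplained


n = len(group1) + len(group2)
ss_mean = ss_around_mean(group1 + group2)
ss_fit = ss_around_mean(group1) + ss_around_mean(group2)

F = f_value(ss_mean, ss_fit, n)
r_squared = (ss_mean - ss_fit) / ss_mean

# Cross-check against the classic pooled-variance two-sample t statistic.
mean1 = sum(group1) / len(group1)
mean2 = sum(group2) / len(group2)
pooled_var = ss_fit / (n - 2)
t = (mean2 - mean1) / math.sqrt(pooled_var * (1 / len(group1) + 1 / len(group2)))

print("F:", F)
print("t squared:", t ** 2)
print("R-squared:", r_squared)
```

For two groups, F and t squared agree up to floating-point rounding, which is the sense in which the t-test is a special case of the regression machinery.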

Once we have the F-value, we can convert it to a p-value.
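
One way to do that conversion without a statistics library is sketched below. It relies on the fact that an F statistic with 1 and n - 2 degrees of freedom is the square of a t statistic with n - 2 degrees of freedom, and it integrates the Student's t density numerically with Simpson's rule. The function names and the example inputs (F = 24 with 4 residual degrees of freedom) are assumptions for illustration; in practice a library routine such as `scipy.stats.f.sf` would be the more typical choice.

```python
import math


def t_density(x, df):
    """Probability density of Student's t distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def p_value_from_f(f_stat, df_resid, steps=2000):
    """Two-sided p-value for an F statistic with (1, df_resid) degrees of freedom.

    Uses F = t^2, then integrates the t density from 0 to |t| with
    Simpson's rule (`steps` must be even).
    """
    t = math.sqrt(f_stat)
    h = t / steps
    total = t_density(0.0, df_resid) + t_density(t, df_resid)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_density(i * h, df_resid)
    area_0_to_t = total * h / 3
    # The density is symmetric, so both tails together hold 1 - 2 * area.
    return 1 - 2 * area_0_to_t


# Hypothetical example: F = 24 from 6 data points (6 - 2 = 4 residual df).
p = p_value_from_f(24.0, df_resid=4)
print("p-value:", round(p, 4))  # about 0.008
```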