Link: Supervised learning
The main idea of linear regression
 Use least squares to fit a line to the data
 Calculate $R^2$
 Calculate a p-value for $R^2$
Terminology
SS(mean): the sum of squares around the mean
Var(mean): the variation around the mean
In this way, Var() can be viewed as the average sum of squares: Var = SS / n.
SS(fit): the sum of squares around the least-squares fit
Var(fit): the variation around the fitted line
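As a quick sketch of these terms, here is a minimal computation on a hypothetical toy dataset (the `x` and `y` values are made up for illustration):

```python
import numpy as np

# Hypothetical toy dataset, chosen only for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])

# SS(mean): sum of squared residuals around the mean of y
ss_mean = np.sum((y - y.mean()) ** 2)
var_mean = ss_mean / len(y)          # Var(mean) = SS(mean) / n

# SS(fit): sum of squared residuals around the least-squares line
slope, intercept = np.polyfit(x, y, 1)
y_hat = slope * x + intercept
ss_fit = np.sum((y - y_hat) ** 2)
var_fit = ss_fit / len(y)            # Var(fit) = SS(fit) / n
```

Because the least-squares line follows the data more closely than a flat line at the mean, SS(fit) is never larger than SS(mean).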
How does it work

Measure $SS(mean)$

Measure $SS(fit)$

Plug in and we get $R^2$:
$R^2 = \frac{SS(mean) - SS(fit)}{SS(mean)}$. Note that when $SS(fit)$ is 0, $R^2 = 1$.
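This formula can be checked numerically. A minimal sketch on a made-up dataset, cross-checked against the squared correlation coefficient (for simple linear regression the two agree):

```python
import numpy as np

# Hypothetical toy dataset for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])

ss_mean = np.sum((y - y.mean()) ** 2)
slope, intercept = np.polyfit(x, y, 1)   # least-squares fit
ss_fit = np.sum((y - (slope * x + intercept)) ** 2)

# R^2 = (SS(mean) - SS(fit)) / SS(mean)
r_squared = (ss_mean - ss_fit) / ss_mean
# A perfect fit (SS(fit) = 0) gives R^2 = 1; a fit no better than the mean gives 0.
```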

Use a p-value to determine whether $R^2$ is statistically significant. When there are only two data points, the line passes through both of them, so $SS(fit) = 0$ and $R^2 = 1$ no matter what the data are. We need a p-value to catch cases like this. The p-value for $R^2$ comes from $F$, the F-value.
F-value
The larger the F-value, the more of the variation in y is explained by x.
$F = \frac{(SS(mean) - SS(fit)) / (p_{fit} - p_{mean})}{SS(fit) / (n - p_{fit})}$, where $p_{fit}$ is the number of parameters in the fitted line (two: slope and intercept) and $p_{mean}$ is the number of parameters in the mean line (one). The quantities $p_{fit} - p_{mean}$ and $n - p_{fit}$ are the degrees of freedom.
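A minimal sketch of this F-value on a hypothetical toy dataset; as a sanity check, for simple regression F should equal $\frac{R^2 / 1}{(1 - R^2)/(n - 2)}$:

```python
import numpy as np

# Hypothetical toy dataset for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])
n = len(y)
p_fit, p_mean = 2, 1   # parameters: slope + intercept vs. mean only

ss_mean = np.sum((y - y.mean()) ** 2)
slope, intercept = np.polyfit(x, y, 1)
ss_fit = np.sum((y - (slope * x + intercept)) ** 2)

# F = variation explained per extra parameter / residual variation per remaining dof
f_value = ((ss_mean - ss_fit) / (p_fit - p_mean)) / (ss_fit / (n - p_fit))
```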
Steps to convert the F-value to a p-value:

Generate a set of random data

Calculate the mean and $SS(mean)$, then the least-squares fit and $SS(fit)$

Plug in to get the $F$-value, and plot it in a histogram

Repeat many times for random datasets

Do the same for the original dataset to get $F_0$. The p-value is the number of F-values more extreme than $F_0$ divided by the total number of F-values.
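The steps above can be sketched as a small simulation. This is a minimal illustration, not a substitute for the exact F-distribution; the dataset and the number of random repeats are made up:

```python
import numpy as np

rng = np.random.default_rng(0)

def f_value(x, y, p_fit=2, p_mean=1):
    """F-value for a simple linear fit versus the mean-only model."""
    n = len(y)
    ss_mean = np.sum((y - y.mean()) ** 2)
    slope, intercept = np.polyfit(x, y, 1)
    ss_fit = np.sum((y - (slope * x + intercept)) ** 2)
    return ((ss_mean - ss_fit) / (p_fit - p_mean)) / (ss_fit / (n - p_fit))

# F_0 for the (hypothetical) original dataset
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])
f0 = f_value(x, y)

# Repeat for many random datasets where x and y are unrelated
random_fs = np.array([
    f_value(rng.normal(size=5), rng.normal(size=5))
    for _ in range(2000)
])

# p-value = fraction of random F-values at least as extreme as F_0
p_value = np.mean(random_fs >= f0)
```

With a strong linear relationship in the original data, $F_0$ is large and very few random datasets beat it, so the estimated p-value is small.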
F-distributions
In the example above, the red curve comes from a smaller sample size than the blue curve, while the other parameters are the same. The blue curve is steeper, so when we calculate the probability of values more extreme than $F_0$ to get the p-value, the p-value comes out smaller. Therefore, a larger sample size leads to a smaller p-value.