Population and population parameters
- Population: the entire group of subjects we want to study
- Population parameters: the values that describe the population, such as the mean and standard deviation. For a normal distribution these are the population mean and population SD; for an exponential distribution, the population rate.
Sample and estimates
The goal is to use a sample to estimate the population parameters, because limited resources usually prevent us from measuring the whole population.
- Sample: the part of the population from which we collect information
- Estimates (statistics): the estimated values of the population parameters, computed from the sample
The more data we have, the more confidence we can have in the accuracy of the estimates. P-values and confidence intervals quantify that confidence. Each sample will produce different estimates, and a p-value or confidence interval tells us whether the difference between them is statistically significant.
Note: even a small sample can produce an estimate that is close to the parameter.
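To make the relationship between sample size and confidence concrete, here is a minimal sketch that draws samples of increasing size from a simulated normal population and reports the estimated mean with an approximate 95% confidence interval. The population values (mean 50, SD 10), the helper name estimate_mean_with_ci, and the normal-approximation multiplier z = 1.96 are assumptions chosen for illustration. For very small samples a t-distribution multiplier would be more appropriate.

```python
import math
import random
import statistics


def estimate_mean_with_ci(sample, z=1.96):
    """Return (mean, lower, upper) for a normal-approximation confidence interval.

    z=1.96 corresponds to roughly 95% coverage; this is an approximation
    that is less reliable for very small samples.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)  # sample SD (n - 1 denominator)
    half_width = z * sd / math.sqrt(n)
    return mean, mean - half_width, mean + half_width


# Assumed population parameters, used only to simulate data.
POP_MEAN, POP_SD = 50.0, 10.0
rng = random.Random(0)  # fixed seed so the run is repeatable

for n in (10, 100, 10_000):
    sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(n)]
    mean, lower, upper = estimate_mean_with_ci(sample)
    print(f"n={n:>6}: estimate={mean:.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
```

The interval's half-width is proportional to 1/sqrt(n), so it generally narrows as the sample grows, while the estimate from a small sample may or may not happen to land near the parameter.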
Sample in machine learning
We can think of the sample as the training dataset in machine learning, while the population curve is the result we want to predict using a machine learning method.
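The analogy can be sketched with a toy example: a noisy sample plays the role of the training dataset, and a fitted line is the estimate of the underlying population curve. The "true" curve y = 2x + 1, the noise level, and the helper fit_line are all hypothetical choices made for this illustration. Ordinary least squares stands in for "a machine learning method" here.

```python
import random


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Assumed population curve (unknown in practice), used only to simulate data.
TRUE_SLOPE, TRUE_INTERCEPT = 2.0, 1.0
rng = random.Random(42)  # fixed seed so the run is repeatable

# The training dataset: a finite, noisy sample from the population curve.
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [TRUE_SLOPE * x + TRUE_INTERCEPT + rng.gauss(0, 1.0) for x in xs]

slope, intercept = fit_line(xs, ys)
print(f"estimated curve: y = {slope:.2f}x + {intercept:.2f}")
print(f"population curve: y = {TRUE_SLOPE:.2f}x + {TRUE_INTERCEPT:.2f}")
```

The fitted slope and intercept are estimates in the same sense as a sample mean: a different training sample would give slightly different values.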