Population and population parameters

  • Population: the entire group of subjects we want to study
  • Population parameters: the values that characterize the population's distribution. For a normal distribution these are the population mean and population standard deviation (SD); for an exponential distribution, the population rate.

Sample and estimates

The goal is to use a sample to estimate the population parameters, because limited resources usually make it impractical to measure the entire population.

  • Sample: the part of the population from which we collect information
  • Estimates (statistics): the estimated population parameters, computed from the sample
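
As a rough illustration of the two definitions above, the sketch below draws samples from simulated populations and computes the estimates. The population values (mean 20, SD 10, rate 0.5), the sample size of 200, and the helper names are assumptions chosen only for this example; they do not come from real data.

```python
import random
import statistics


def estimate_normal(sample):
    """Return (mean, SD) estimates for a sample assumed to be normal."""
    return statistics.mean(sample), statistics.stdev(sample)


def estimate_exponential_rate(sample):
    """Return the rate estimate (1 / sample mean) for an exponential sample."""
    return 1.0 / statistics.mean(sample)


# Hypothetical population parameters, chosen only for illustration.
POP_MEAN, POP_SD, POP_RATE = 20.0, 10.0, 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# In practice we never see the population; here we simulate drawing from it.
normal_sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(200)]
exp_sample = [rng.expovariate(POP_RATE) for _ in range(200)]

mean_hat, sd_hat = estimate_normal(normal_sample)
rate_hat = estimate_exponential_rate(exp_sample)

print(f"estimated mean: {mean_hat:.2f} (population mean: {POP_MEAN})")
print(f"estimated SD:   {sd_hat:.2f} (population SD: {POP_SD})")
print(f"estimated rate: {rate_hat:.3f} (population rate: {POP_RATE})")
```

The estimates will generally be close to, but not equal to, the population parameters, and a different seed (a different sample) gives different estimates.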

The more data we have, the more confidence we can have in the accuracy of the estimates. P-values and confidence intervals quantify that confidence. Each sample produces different estimates, and a p-value or confidence interval helps tell whether the difference between them is statistically significant.
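
To make the link between sample size and confidence concrete, here is a minimal sketch that computes an approximate 95% confidence interval for the mean at several sample sizes. It assumes a simulated normal population (mean 20, SD 10, both hypothetical) and uses the normal approximation with z = 1.96; for small samples a t-based interval would be somewhat wider. The function name `mean_confidence_interval` is made up for this example.

```python
import math
import random
import statistics


def mean_confidence_interval(sample, z=1.96):
    """Approximate 95% CI for the mean, using the normal approximation."""
    m = statistics.mean(sample)
    # Standard error of the mean: sample SD divided by sqrt(n).
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return m - z * se, m + z * se


# Hypothetical population parameters, chosen only for illustration.
POP_MEAN, POP_SD = 20.0, 10.0

rng = random.Random(1)  # fixed seed so the run is repeatable

widths = {}
for n in (10, 100, 1000):
    sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(n)]
    low, high = mean_confidence_interval(sample)
    widths[n] = high - low
    print(f"n={n:5d}  mean={statistics.mean(sample):6.2f}  "
          f"95% CI=({low:6.2f}, {high:6.2f})  width={widths[n]:.2f}")
```

The interval width shrinks roughly in proportion to 1/sqrt(n), so a sample 100 times larger gives an interval about 10 times narrower.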

Note: even a small sample can produce an estimate that is close to the parameter.

Sample in machine learning

We can think of a sample as the training dataset in machine learning, while the population curve is what we want to predict with a machine learning method.