Link: Bayes classifier
What is LDA?
It reduces dimensions, just like PCA.
Example
Assume we use the expression of one gene to decide whether a drug works for different people
Use one predictor vs two predictors (genes)
Use three or more predictors: the chart becomes 3D or higher-dimensional
How does it reduce dimension?
We can’t simply drop one of the axes (e.g. gene y) because that would lose information. Instead, LDA creates a new axis and projects the data onto it. LDA creates the new axis based on two criteria:
 Maximize the distance between the two means, $μ_{1}$ and $μ_{2}$
 Minimize the variation (“scatter”, $s^{2}$) within each category
 Consider the two above simultaneously by maximizing the ratio $\frac{(μ_{1}−μ_{2})^{2}}{s_{1}^{2}+s_{2}^{2}}$
where $(μ_{1}−μ_{2})$ is also called $d$ (d for distance).
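A minimal sketch of how a candidate axis could be scored by combining the two criteria into one number, d² divided by the total scatter. It assumes "scatter" means the sum of squared deviations from the category mean; the function name `fisher_ratio` and the sample values are made up for illustration.

```python
import numpy as np

def fisher_ratio(a, b):
    """Return (mean_a - mean_b)^2 / (scatter_a + scatter_b) for two 1-D samples."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    d = a.mean() - b.mean()                    # distance between the two means
    scatter_a = ((a - a.mean()) ** 2).sum()    # assumed definition of scatter
    scatter_b = ((b - b.mean()) ** 2).sum()
    return d ** 2 / (scatter_a + scatter_b)

# Hypothetical projected values on a candidate axis: tight groups
works = [1.0, 1.2, 0.8, 1.1]
fails = [3.0, 3.3, 2.9, 3.2]
tight = fisher_ratio(works, fails)

# Same two means, but much more scatter within each category
works_wide = [0.0, 2.2, -0.2, 2.1]
fails_wide = [2.0, 4.3, 1.9, 4.2]
wide = fisher_ratio(works_wide, fails_wide)

print(tight > wide)  # the tighter grouping scores higher
```

The distance between the means is identical in both cases, so the score only drops because the scatter grows.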
Why we need the two criteria
If we only maximize $d$, many points from the two categories overlap on the new axis, which makes them hard to separate.
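To make the overlap concrete, here is a rough sketch comparing an axis chosen only to maximize $d$ with an axis that also accounts for scatter. The data is randomly generated and hypothetical. The second axis uses the standard two-category solution, where the direction is proportional to the inverse within-category scatter matrix times the difference of the means; that formula is not stated in these notes.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: two genes, strongly correlated within each category
cov = [[4.0, 3.6], [3.6, 4.0]]
a = rng.multivariate_normal([0.0, 0.0], cov, size=200)
b = rng.multivariate_normal([3.0, 1.0], cov, size=200)

def unit(v):
    """Scale a vector to length 1 so projected distances are comparable."""
    return v / np.linalg.norm(v)

def project_stats(w):
    """Project both categories onto axis w; return (d, d^2 / total scatter)."""
    pa, pb = a @ w, b @ w
    d = abs(pa.mean() - pb.mean())
    scatter = ((pa - pa.mean()) ** 2).sum() + ((pb - pb.mean()) ** 2).sum()
    return d, d ** 2 / scatter

mean_diff = a.mean(axis=0) - b.mean(axis=0)
a_c, b_c = a - a.mean(axis=0), b - b.mean(axis=0)
S_w = a_c.T @ a_c + b_c.T @ b_c                   # within-category scatter matrix

w_d_only = unit(mean_diff)                        # maximizes d alone
w_both = unit(np.linalg.solve(S_w, mean_diff))    # maximizes d^2 / scatter

d_only, ratio_d_only = project_stats(w_d_only)
d_both, ratio_both = project_stats(w_both)
print(d_only >= d_both, ratio_both >= ratio_d_only)
```

The first axis gives the larger distance between the means, but the second gives the larger ratio, which is what keeps the projected points from overlapping.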
Three predictors (genes)
The same process creates the new axis, but the original data is now 3-dimensional.
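As a sketch of the 3-dimensional case, the snippet below projects samples described by three genes onto a single new axis. The expression levels are randomly generated and hypothetical, and the axis comes from the standard two-category solution (inverse within-category scatter times the mean difference), which is an assumption beyond what these notes state.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical expression levels for three genes (columns) in two categories
works = rng.normal(loc=[2.0, 5.0, 1.0], scale=1.0, size=(30, 3))
fails = rng.normal(loc=[4.0, 5.0, 3.0], scale=1.0, size=(30, 3))

def scatter_matrix(x):
    """Sum of outer products of deviations from the category mean."""
    centered = x - x.mean(axis=0)
    return centered.T @ centered

S_w = scatter_matrix(works) + scatter_matrix(fails)
w = np.linalg.solve(S_w, works.mean(axis=0) - fails.mean(axis=0))
w /= np.linalg.norm(w)          # unit-length direction of the new axis

# Each 3-D sample becomes a single number on the new axis
works_1d = works @ w
fails_1d = fails @ w
print(w.shape, works_1d.shape, fails_1d.shape)
```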
Three categories
The process follows the same rules, but the implementation is a bit different.

Maximize $d$
 Find the main central point of all the data
 Find the central point of each category
 Get $d_{1}$, $d_{2}$, $d_{3}$, the distances between each category's central point and the main central point
 Maximize $d=d_{1}+d_{2}+d_{3}$. We now have three distances, so we add them up and maximize the sum

Minimize the scatter for each category
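$s_{1}^{2}+s_{2}^{2}+s_{3}^{2}$
Consider the above two simultaneously by maximizing the ratio of the distances (each squared) to the scatter: $\frac{d_{1}^{2}+d_{2}^{2}+d_{3}^{2}}{s_{1}^{2}+s_{2}^{2}+s_{3}^{2}}$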

Instead of creating one axis, LDA now creates two axes to separate the data. This is because the central points of three categories define a plane, not a line.
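The three-category procedure can be sketched with scatter matrices. This follows the common textbook formulation, which is an assumption rather than something spelled out in these notes: the between-category scatter is weighted by the number of samples in each category, and the new axes are the leading eigenvectors of the inverse within-category scatter times the between-category scatter. The data is random and hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical data: 3 categories, 4 genes, 40 samples per category
centers = np.array([[0, 0, 0, 0], [4, 1, 0, 2], [1, 5, 2, 0]], dtype=float)
X = np.vstack([rng.normal(c, 1.0, size=(40, 4)) for c in centers])
y = np.repeat([0, 1, 2], 40)

overall_mean = X.mean(axis=0)          # main central point of all the data
S_w = np.zeros((4, 4))                 # scatter within the categories
S_b = np.zeros((4, 4))                 # scatter between category centers
for k in range(3):
    Xk = X[y == k]
    mk = Xk.mean(axis=0)               # central point of category k
    S_w += (Xk - mk).T @ (Xk - mk)
    diff = (mk - overall_mean).reshape(-1, 1)
    S_b += len(Xk) * (diff @ diff.T)

# Directions that maximize between-category spread relative to scatter
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_w, S_b))
order = np.argsort(eigvals.real)[::-1]
axes = eigvecs[:, order[:2]].real      # keep the two leading axes
X_lda = X @ axes                       # every sample now has 2 coordinates

n_useful = int(np.sum(eigvals.real > 1e-8 * eigvals.real.max()))
print(X_lda.shape, n_useful)
```

Only two eigenvalues are meaningfully above zero, which matches the point that three categories need two axes.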
An example of usage in application
Say we have 3 categories and 10,000 genes. If we plot the raw data without reducing dimensions, it would require 10,000 axes. However, LDA can reduce the axes to only 2.
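A small sketch of why two axes are enough no matter how many genes there are: once the main central point is subtracted, the three category central points span at most two dimensions. The data below is random and hypothetical, with equal-sized categories.

```python
import numpy as np

rng = np.random.default_rng(3)
n_genes = 10_000
# Hypothetical data: 3 categories, 20 samples each, 10,000 genes
X = rng.normal(size=(60, n_genes))
y = np.repeat([0, 1, 2], 20)
X[y == 1, :50] += 2.0       # shift some genes for category 1
X[y == 2, 50:100] += 2.0    # shift other genes for category 2

overall_mean = X.mean(axis=0)
class_means = np.vstack([X[y == k].mean(axis=0) for k in range(3)])
centered_means = class_means - overall_mean

# Number of independent directions along which the category centers differ
rank = np.linalg.matrix_rank(centered_means)
print(centered_means.shape, rank)
```

The rank is the number of axes that can carry between-category separation, which is at most the number of categories minus one.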
LDA vs PCA
 PCA doesn’t focus on separating categories. It is mainly used to find the directions of maximum variation
 Both reduce dimensions by creating new axes.
 PCA creates the 1st axis along the direction of most variation in the data, the 2nd axis along the next most, and so on
 LDA creates the 1st axis to account for the most variation between categories, the 2nd axis for the next most, and so on
We can also dig in to see which predictors (genes) are most impactful
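The contrast between the two methods can be sketched on data where the largest variation is not the direction that separates the categories. The data is random and hypothetical. PCA is computed from the covariance matrix and ignores the labels; the LDA axis uses the standard two-category solution, which is an assumption beyond these notes.

```python
import numpy as np

rng = np.random.default_rng(4)
# Hypothetical data: large spread along gene 1, categories separated along gene 2
a = rng.normal(loc=[0.0, 0.0], scale=[5.0, 0.5], size=(200, 2))
b = rng.normal(loc=[0.0, 3.0], scale=[5.0, 0.5], size=(200, 2))
X = np.vstack([a, b])

# PCA: first axis is the direction of maximum variance, labels are ignored
eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
pca_axis = eigvecs[:, -1]              # eigh returns eigenvalues in ascending order

# LDA: first axis separates the category means relative to within-category scatter
a_c, b_c = a - a.mean(axis=0), b - b.mean(axis=0)
S_w = a_c.T @ a_c + b_c.T @ b_c
lda_axis = np.linalg.solve(S_w, a.mean(axis=0) - b.mean(axis=0))
lda_axis /= np.linalg.norm(lda_axis)

# Entries are the weights of gene 1 and gene 2 on each new axis
print(np.round(np.abs(pca_axis), 2), np.round(np.abs(lda_axis), 2))
```

The absolute size of each weight shows how much a gene contributes to the axis. Here the PCA axis is dominated by gene 1 and the LDA axis by gene 2.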
Summary
LDA is a method to reduce dimensions. It’s useful for separating categories.