Preliminary Definitions
Dataset and Class Labels
Suppose you have:
- A data matrix $D\in\R^{d\times N}$, where each column is one data sample (feature dimensionality $d$ and $N$ total samples).
- A label array $L\in\{0,1,\dots,K-1\}^N$, with $K$ distinct classes in total.
For Iris, $d=4$ (sepal length, sepal width, petal length, petal width) and $K=3$ classes (Setosa, Versicolor, Virginica).
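As a minimal sketch of this layout, the snippet below builds a stand-in for $D$ and $L$ with NumPy. The values are synthetic random numbers, not the real Iris measurements; only the shapes mirror Iris (assuming its usual 150 samples, 50 per class). The variable names `D_0` and `n_c` are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Iris-like sizes; the data itself is synthetic (assumption for illustration).
d, N, K = 4, 150, 3

# Data matrix: one column per sample, so D[:, i] is the i-th sample.
D = rng.normal(size=(d, N))

# Label array: integer labels 0..K-1, here 50 consecutive samples per class.
L = np.repeat(np.arange(K), N // K)

# Samples of one class are selected with a boolean mask over the columns.
D_0 = D[:, L == 0]

# Number of samples per class.
n_c = np.bincount(L, minlength=K)
```

Storing samples as columns means per-class selection masks the second axis (`D[:, L == c]`), which is the opposite of the row-per-sample convention used by many libraries.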
Within-Class and Between-Class Covariance
We recall the two main matrices in LDA:
- Between-Class Covariance $S_B$.
- Within-Class Covariance $S_W$.
They are defined (in normalized form, i.e. divided by the total number of samples $N$) as follows:
$$
S_B=\frac{1}{N}\sum_{c=1}^Kn_c(\mu_c-\mu)(\mu_c-\mu)^T
$$
$$
S_W=\frac{1}{N}\sum_{c=1}^K\sum_{i=1}^{n_c}(x_{c,i}-\mu_c)(x_{c,i}-\mu_c)^T
$$
where:
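- $n_c$ is the number of samples of class $c$ (so $\sum_{c=1}^Kn_c=N$). The sums index classes as $c=1,\dots,K$; with labels in $\{0,\dots,K-1\}$, class $c$ corresponds to label $c-1$.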
- $\mu_c$ is the mean of the samples of class $c$.
- $\mu$ is the overall mean of all samples in $D$.
- $x_{c,i}$ is the $i$-th sample in class $c$.
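The two definitions above can be sketched directly in NumPy. This is an illustrative implementation under the column-per-sample convention for $D$, not necessarily the one used later in this text; the function name `scatter_matrices` and the synthetic test data are assumptions. As a sanity check, it relies on the standard identity that $S_B+S_W$ equals the total covariance $\frac{1}{N}\sum_{i}(x_i-\mu)(x_i-\mu)^T$.

```python
import numpy as np

def scatter_matrices(D, L):
    """Return (S_B, S_W), both normalized by N, for column-sample data D."""
    d, N = D.shape
    mu = D.mean(axis=1, keepdims=True)           # overall mean, shape (d, 1)
    S_B = np.zeros((d, d))
    S_W = np.zeros((d, d))
    for c in np.unique(L):
        D_c = D[:, L == c]                       # samples of class c
        n_c = D_c.shape[1]
        mu_c = D_c.mean(axis=1, keepdims=True)   # class mean, shape (d, 1)
        diff = mu_c - mu
        S_B += n_c * (diff @ diff.T)             # n_c (mu_c - mu)(mu_c - mu)^T
        centered = D_c - mu_c
        S_W += centered @ centered.T             # sum_i (x - mu_c)(x - mu_c)^T
    return S_B / N, S_W / N

# Synthetic data (not Iris): 4 features, 3 classes, 10 samples each.
rng = np.random.default_rng(0)
D = rng.normal(size=(4, 30))
L = np.repeat(np.arange(3), 10)

S_B, S_W = scatter_matrices(D, L)

# Total covariance normalized by N (bias=True); rows of D are the variables.
S_T = np.cov(D, bias=True)
```

Both matrices are $d\times d$ and symmetric, and `np.allclose(S_B + S_W, S_T)` should hold up to floating-point error, which is a quick way to catch indexing or normalization mistakes.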
Computing Class Means