Preliminary Definitions

Dataset and Class Labels

Suppose you have:

A data matrix $D\in\mathbb{R}^{d\times N}$, where each column is one data sample ($d$ is the feature dimensionality and $N$ the total number of samples).
A label array $L\in\{0,1,\dots,K-1\}^N$, with $K$ different classes in total.

For Iris, $d=4$ (sepal length, sepal width, petal length, petal width) and $K=3$ classes (Setosa, Versicolor, Virginica).
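
To make the column-per-sample convention concrete, here is a minimal NumPy sketch. It uses randomly generated toy data with the same shape as Iris rather than the real measurements, and the names `n_per_class`, `D_c`, and `counts` are introduced only for illustration. Note that libraries such as scikit-learn store samples as rows, so an array loaded from there would need to be transposed to match $D$.

```python
import numpy as np

# Toy stand-in for Iris (assumption: random values, not the real dataset).
rng = np.random.default_rng(0)
d, K, n_per_class = 4, 3, 50      # 4 features, 3 classes, 50 samples each
N = K * n_per_class               # 150 samples in total

# D has shape (d, N): each COLUMN is one sample.
D = rng.normal(size=(d, N))

# L has shape (N,): integer labels in {0, ..., K-1}, one per column of D.
L = np.repeat(np.arange(K), n_per_class)

# The samples of one class are selected with a boolean mask over columns.
D_c = D[:, L == 1]                # all samples with label 1, shape (d, 50)

# Number of samples per class, indexed by label.
counts = np.bincount(L, minlength=K)

print(D.shape, L.shape, D_c.shape, counts)
```

Keeping samples in columns means per-class selection is always a column mask (`D[:, L == c]`), and means over samples are taken along `axis=1`.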

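We recall the two main matrices in LDA: the between-class scatter matrix $S_B$ and the within-class scatter matrix $S_W$. In the sums below, the class index $c=1,\dots,K$ corresponds to the labels $0,\dots,K-1$.

They are defined (in normalized form) as follows: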

$$ S_B=\frac{1}{N}\sum_{c=1}^Kn_c(\mu_c-\mu)(\mu_c-\mu)^T $$

$$ S_W=\frac{1}{N}\sum_{c=1}^K\sum_{i=1}^{n_c}(x_{c,i}-\mu_c)(x_{c,i}-\mu_c)^T $$

where: