-rwxr-xr-x  report/paper.md | 50
1 file changed, 13 insertions(+), 37 deletions(-)
diff --git a/report/paper.md b/report/paper.md
index 978cb11..11ed9fd 100755
--- a/report/paper.md
+++ b/report/paper.md
@@ -262,9 +262,7 @@ being slightly slower than PCA, and not as good for face reconstruction.
To combine both methods it is possible to perform LDA in a subspace generated by PCA. In order to
maximize class separation while minimizing the distance between elements of the same class it is necessary to
-maximize the function J(W) (Fisher's Criterion):
-
-$$ J(W) = \frac{W\textsuperscript{T}S\textsubscript{B}W}{W\textsuperscript{T}S\textsubscript{W}W}\textrm{ or } J(W) = \frac{W\textsuperscript{T}S\textsubscript{B}W}{W\textsuperscript{T}S\textsubscript{t}W} $$
+maximize the function J(W) (the generalized Rayleigh quotient): $J(W) = \frac{W^TS_BW}{W^TS_WW}$
With $S_B$ being the between-class scatter matrix, $S_W$
being the within-class scatter matrix and $W$ being the set of projection vectors. $\mu$
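As a concrete illustration of these definitions, the following is a minimal NumPy sketch (not from the report; the toy two-class data and the test direction `w` are assumptions) that builds $S_B$ and $S_W$ and evaluates Fisher's criterion:

```python
import numpy as np

# Toy two-class data, assumed purely for illustration.
rng = np.random.default_rng(0)
X1 = rng.normal(loc=0.0, scale=1.0, size=(50, 2))   # class 1 samples
X2 = rng.normal(loc=3.0, scale=1.0, size=(50, 2))   # class 2 samples

mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)

# Between-class scatter S_B and within-class scatter S_W (two-class case)
S_B = np.outer(mu1 - mu2, mu1 - mu2)
S_W = (X1 - mu1).T @ (X1 - mu1) + (X2 - mu2).T @ (X2 - mu2)

def J(w):
    # Fisher's criterion: ratio of projected between- to within-class scatter
    return (w @ S_B @ w) / (w @ S_W @ w)

print(J(np.array([1.0, 1.0])))   # criterion value for an arbitrary direction
```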
@@ -285,48 +283,26 @@ $$ S\textsubscript{W}\textsuperscript{-1}S\textsubscript{B}W - JW = 0 $$
From here it follows:
-$$ W\textsuperscript{*} = arg\underset{W}max(\frac{W\textsuperscript{T}S\textsubscript{B}W}{W\textsuperscript{T}S\textsubscript{W}W}) = S\textsubscript{W}\textsuperscript{-1}(\mu\textsubscript{1} - \mu\textsubscript{2}) $$
-
-By isomorphic mapping where P are the eigenvectors generated through PCA:
-$W = PX$
-
-We can substitute for W in the J(W) expression, obtaining:
-
-$$ J(W) = \frac{X\textsuperscript{T}P\textsuperscript{T}S\textsubscript{B}PX}{X\textsuperscript{T}P\textsuperscript{T}S\textsubscript{t}PX} $$
-
-We can rewrite such expression substituting for:
-
-$$ P\textsuperscript{T}S\textsubscript{B}P = \widetilde{S}\textsubscript{B} \textrm{ and } P\textsuperscript{T}S\textsubscript{t}P = \widetilde{S}\textsubscript{t} $$
-$$ J(W) = \widetilde{J}(W) = \frac{X\textsuperscript{T}\widetilde{S}\textsubscript{B}X}{X\textsuperscript{T}\widetilde{S}\textsubscript{t}X} $$
-
-$\widetilde{S}\textsubscript{B} \textrm{ and } \widetilde{S}\textsubscript{t}$
-are respectively semi-positive definite and positive definite. $\widetilde{J}(X)$
-similarly to the original J(X), applies Fisher's criterion in a PCA generated subspace.
-This enables to perform LDA minimizing loss of data.
+$$ W_{opt} = \arg\max_W \left( \frac{W^TS_BW}{W^TS_WW} \right) = S_W^{-1}(\mu_1 - \mu_2) $$
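Continuing the toy sketch above, the two-class optimum can be computed in one line; this is an illustrative continuation, not the report's code:

```python
# Closed-form two-class optimum: solve S_W w = (mu1 - mu2)
# rather than forming the explicit inverse of S_W.
w_opt = np.linalg.solve(S_W, mu1 - mu2)
w_opt /= np.linalg.norm(w_opt)   # the scale of w does not change J(w)
print(J(w_opt), J(np.array([1.0, 0.0])))  # the optimum scores at least as high
```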
-R\textsuperscript{n} contains the optimal discriminant vectors for LDA.
-However S\textsubscript{t} is singular and the vectors are found through
-an expensive computational process. The soultion to such issue is derivation
-from a lower space.
+However $S_W$ is often singular, since the rank of $S_W$
+is at most $N-c$ and the number of samples $N$ is usually smaller than the dimension $D$ of the image space.
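A quick numeric check of this rank bound (the sizes below are assumed toy values, not the report's dataset):

```python
import numpy as np

rng = np.random.default_rng(1)
N, D, c = 30, 100, 10            # fewer samples than dimensions: N < D
X = rng.normal(size=(N, D))
labels = np.arange(N) % c

# Within-class scatter: outer products of samples centred on their class mean
S_W = np.zeros((D, D))
for k in range(c):
    Xk = X[labels == k]
    Xk = Xk - Xk.mean(axis=0)
    S_W += Xk.T @ Xk

print(np.linalg.matrix_rank(S_W))   # at most N - c = 20, far below D = 100
```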
-The M biggest eigenvectors of S\textsubscript{t} are positive and non zero and
-*rank*(S\textsubscript{t})=M.
+In such a case it is possible to use Fisherfaces. The optimal solution to this
+problem is given by $W^T_{opt} = W^T_{lda}W^T_{pca}$,
-**Theorem:**
-*$$ \textrm{For any arbitrary } \varphi \in R\textsuperscript{n}, \varphi
-\textrm{ can be denoted by } \varphi = X + \xi, $$
-$$ \textrm{ where, }X \in \phi\textsubscript{t}\textrm{ and } \xi \in
-\phi\textsubscript{t}\textsuperscript{perp}\textrm{, and
-satisfies }J(\varphi)=J(X). $$*
+where $W_{pca}$ is chosen to maximize the determinant of the total scatter matrix
+of the projected samples: $$ W_{pca} = \arg\max_W |W^TS_TW| $$
+and
+$$ W_{lda} = \arg\max_W \frac{|W^TW_{pca}^TS_BW_{pca}W|}{|W^TW_{pca}^TS_WW_{pca}W|} $$
-The theorem indicates that the optimal discriminant vectors can be derived
-through the reduced space obtained through PCA without losing information
-according to the Fisher's criterion.
+This result indicates that the optimal discriminant vectors can be derived
+from the reduced feature space of dimension $M_{pca}$ ($\leq N-c$) obtained through PCA,
+and then applying FLD to further reduce the dimension to $M_{lda}$ ($\leq c-1$).
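A minimal NumPy sketch of this two-stage projection follows. It is an illustrative reading of the method above, not the report's implementation; the function name `fisherfaces` and the toy sizes are assumptions:

```python
import numpy as np

def fisherfaces(X, labels, c):
    # X: (N, D) data matrix, labels: (N,) class ids in 0..c-1.
    N, D = X.shape
    Xc = X - X.mean(axis=0)

    # Stage 1: PCA to M_pca = N - c dimensions, so that the
    # projected within-class scatter is (generically) nonsingular.
    M_pca = N - c
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W_pca = Vt[:M_pca].T                         # (D, M_pca)
    Y = Xc @ W_pca                               # projected samples

    # Stage 2: LDA in the PCA subspace, keeping at most c - 1 discriminants.
    S_B = np.zeros((M_pca, M_pca))
    S_W = np.zeros((M_pca, M_pca))
    for k in range(c):
        Yk = Y[labels == k]
        mk = Yk.mean(axis=0)
        S_B += len(Yk) * np.outer(mk - Y.mean(axis=0), mk - Y.mean(axis=0))
        d = Yk - mk
        S_W += d.T @ d

    # Leading eigenvectors of S_W^{-1} S_B give the LDA projection.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_W, S_B))
    order = np.argsort(-eigvals.real)[: c - 1]
    W_lda = eigvecs[:, order].real               # (M_pca, c-1)

    return W_pca @ W_lda                         # combined W_opt, shape (D, c-1)

# Example usage with toy data (N=60 images of D=200 pixels, c=6 classes):
rng = np.random.default_rng(2)
X = rng.normal(size=(60, 200))
labels = np.arange(60) % 6
print(fisherfaces(X, labels, c=6).shape)         # (200, 5) = (D, c-1)
```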
In conclusion, this method is theoretically better than LDA and PCA alone.
The Fisherfaces method has lower computational complexity and takes less time than
pure LDA, and it improves recognition performance with respect to both PCA and LDA.
-Fisherfaces method is effective because it requires less computation
# Question 3, LDA Ensemble for Face Recognition, PCA-LDA