From 16bdf5149292f8f2a8c6e5957a1f644c3d81c9ab Mon Sep 17 00:00:00 2001
From: nunzip <np.scarh@gmail.com>
Date: Thu, 15 Nov 2018 10:45:08 +0000
Subject: Add details in Q2 and Q1

---
 report/paper.md | 33 +++++++++++++++++++++++++--------
 1 file changed, 25 insertions(+), 8 deletions(-)

(limited to 'report')

diff --git a/report/paper.md b/report/paper.md
index 68df6fd..de62e4f 100755
--- a/report/paper.md
+++ b/report/paper.md
@@ -143,11 +143,13 @@ use effectively 97% of the information from our initial training data for recons
 \end{center}
 \end{figure}
 
-The analysed classification methods used for face recognition are **Nearest Neighbor** and
-**alternative method** through reconstruction error. 
-EXPLAIN THE METHODS
+The classification methods analysed for face recognition are Nearest Neighbor and
+an alternative method based on reconstruction error.
 
-REFER TO ACCURACY GRAPH 1 FOR NN. MAYBE WE CAN ALSO ADD SAME GRAPH WITH DIFFERENT K
+Nearest Neighbor projects the test data onto the generated subspace and finds the element
+closest to the projected test image, assigning it the class of that neighbor.
+
+The recognition accuracy of NN classification is shown in Figure 4 (TODO: replace with a cross-reference that always points at the graph).
 
 A confusion matrix showing success and failure cases for Nearest Neighbor classfication
 can be observed below:
@@ -179,6 +181,11 @@ classification.
 \end{center}
 \end{figure}
 
+The process for the alternative method is somewhat similar to LDA. A separate
+subspace is generated for each class. These subspaces are then used to reconstruct
+the test image, and the class of the subspace that yields the minimum reconstruction
+error is assigned.
+
 The alternative method shows overall a better performance, with peak accuracy of 69%
 for M=5. The maximum M non zero eigenvectors that can be used will in this case be at most
 the amount of training samples per class minus one, since the same amount of eigenvectors
@@ -225,7 +232,15 @@ The pictures on the right show the reconstructed images.
 
 # Question 2, Generative and Discriminative Subspace Learning
 
-Maximize function J(W) (Fisher's Criterion): 
+As mentioned in the introduction, PCA is a generative method that performs dimensionality
+reduction while keeping most of the information from the initial training data. It is a very good method for
+reconstruction and allows very fast computation. LDA is instead a discriminative method that uses a
+high-dimensional space for computation. It achieves very high classification accuracy, with the tradeoff of
+being slightly slower than PCA and not as good for face reconstruction.
+
+To combine both methods, it is possible to perform LDA in a generative subspace created by PCA. To
+maximize class separation and minimize the distance between elements of the same class, it is necessary to
+maximize the function J(W) (Fisher's Criterion):
 
 $$ J(W) = \frac{W\textsuperscript{T}S\textsubscript{B}W}{W\textsuperscript{T}S\textsubscript{W}W}\textrm{  or  } J(W) = \frac{W\textsuperscript{T}S\textsubscript{B}W}{W\textsuperscript{T}S\textsubscript{t}W} $$ 
 
@@ -263,12 +278,14 @@ $$ P\textsuperscript{T}S\textsubscript{B}P = \widetilde{S}\textsubscript{B} \tex
 $$ J(W) = \widetilde{J}(W) = \frac{X\textsuperscript{T}\widetilde{S}\textsubscript{B}X}{X\textsuperscript{T}\widetilde{S}\textsubscript{t}X} $$
 
 $\widetilde{S}\textsubscript{B} \textrm{  and  } \widetilde{S}\textsubscript{t}$ 
-are respectively semi-positive definite and positive definite. So $\widetilde{J}(X)$ 
-acts like Fisher's criterion but in PCA transformed space. This method 
-does not result in any loss of data.
+are respectively positive semi-definite and positive definite. $\widetilde{J}(X)$,
+like the original J(W), applies Fisher's criterion, but in a PCA-generated subspace.
+This makes it possible to perform LDA while minimizing loss of data.
 
 *Proof:*
 
+REWRITE FROM HERE
+
 The set of optimal discriminant vectors can be found in R\textsuperscript{n}
 for LDA. But, this is a difficult computation because the dimension is very 
 high. Besides, S\textsubscript{t} is always singular. Fortunately, it is possible 
-- 
cgit v1.2.3-70-g09d2