 report/metadata.yaml |  2 +-
 report/paper.md      | 10 +++++-----
 2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/report/metadata.yaml b/report/metadata.yaml
index b404339..7113dce 100755
--- a/report/metadata.yaml
+++ b/report/metadata.yaml
@@ -4,7 +4,7 @@ author:
 - name: Vasil Zlatanov, Nunzio Pucci
   affilation: Imperial College
   location: London, UK
-  email: vz215@ic.ac.uk, np1915@ic.ac.uk
+  email: CID:01120518, CID:01113180
 numbersections: yes
 lang: en
 babel-lang: english
diff --git a/report/paper.md b/report/paper.md
index fc784b7..7fb0961 100755
--- a/report/paper.md
+++ b/report/paper.md
@@ -230,13 +230,13 @@ affect recognition the most are: glasses, hair, sex and brightness of the pictur
 To combine both method it is possible to perform LDA in a generative subspace created by PCA.
 In order to maximize class separation and minimize the distance between elements of the same class it is necessary to
-maximize the function J(W) (generalized Rayleigh quotient): $J(W) = \frac{W\textsuperscript{T}S\textsubscript{B}W}{W\textsuperscript{T}S\textsubscript{W}W}$
+maximize the function J(W) (generalized Rayleigh quotient): $J(W) = \frac{W\textsuperscript{T}S\textsubscript{B}W}{W\textsuperscript{T}S\textsubscript{W}W}$.
 
 With S\textsubscript{B} being the scatter matrix between classes, S\textsubscript{W} being the within-class
 scatter matrix and W being the set of projection vectors.
 $\mu$ represents the mean of each class.
 
-It can be proven that when we have a singular S\textsubscript{W} we obtain [@lecture-notes]: $W\textsubscript{opt} = arg\underset{W}max\frac{|W\textsuperscript{T}S\textsubscript{B}W|}{|W\textsuperscript{T}S\textsubscript{W}W|} = S\textsubscript{W}\textsuperscript{-1}(\mu\textsubscript{1} - \mu\textsubscript{2})$
+It can be proven that when we have a singular S\textsubscript{W} we obtain [@lecture-notes]: $W\textsubscript{opt} = arg\underset{W}max\frac{|W\textsuperscript{T}S\textsubscript{B}W|}{|W\textsuperscript{T}S\textsubscript{W}W|} = S\textsubscript{W}\textsuperscript{-1}(\mu\textsubscript{1} - \mu\textsubscript{2})$.
 
 However S\textsubscript{W} is often singular since the rank of
 S\textsubscript{W} is at most N-c and usually N is smaller than D.
 In such case it is possible to use
@@ -258,7 +258,7 @@ small number) are H\textsubscript{pca}(*e*)=
 
 Through linear interpolation, for $0\leq t \leq 1$: $F\textsubscript{t}(e)=\frac{1-t}{2}
 H\textsubscript{pca}(e)+\frac{t}{2}H\textsubscript{lda}(e)=
-\frac{1-t}{2}<e,S\textsubscript{e}>+\frac{t}{2}\frac{<e, S\textsubscript{B}e>}{<e,S\textsubscript{W}e> + \epsilon}$
+\frac{1-t}{2}<e,S\textsubscript{e}>+\frac{t}{2}\frac{<e, S\textsubscript{B}e>}{<e,S\textsubscript{W}e> + \epsilon}$.
 
 The objective is to find a unit vector *e\textsubscript{t}* in **R**\textsuperscript{n}
 (with n being the number of samples) such that: $e\textsubscript{t}=arg\underset{et}min F\textsubscript{t}(e)$.
@@ -268,7 +268,7 @@ We can model the Lagrange optimization problem under the constraint of ||*e*||
 
 To minimize we take the derivative with respect to *e* and equate L to zero:
 $\frac {\partial L(e\lambda)}{\partial e}=\frac{\partial F\textsubscript{t}(e)}{\partial e}
-+\frac{\partial\lambda(||e||\textsuperscript{2}-1)}{\partial e}=0$
++\frac{\partial\lambda(||e||\textsuperscript{2}-1)}{\partial e}=0$.
 
 Being $\nabla F\textsubscript{t}(e)=
 (1-t)Se+\frac{t}{<e,S\textsubscript{W}e> +\epsilon}S\textsubscript{B}e-t\frac{<e,S\textsubscript{B}e>}{(<e,S\textsubscript{W}
@@ -278,7 +278,7 @@ parallel to *e*. Since S is positive semi-definite, $<\nabla F\textsubscript{t}(
 It means that $\lambda$ needs to be greater than zero. In such case, normalizing both sides
 we obtain $\frac{\nabla F\textsubscript{t}(e)}{||\nabla F\textsubscript{t}(e)||}=e$.
 
-We can express *T(e)* as $T(e) = \frac{\alpha e+ \nabla F\textsubscript{t}(e)}{||\alpha e+\nabla F\textsubscript{t}(e)||}$ (adding a positive multiple of *e*, $\alpha e$ to prevent $\lambda$ from vanishing
+We can express *T(e)* as $T(e) = \frac{\alpha e+ \nabla F\textsubscript{t}(e)}{||\alpha e+\nabla F\textsubscript{t}(e)||}$ (adding a positive multiple of *e*, $\alpha e$ to prevent $\lambda$ from vanishing).
 
 It is then possible to use the gradient descent optimization method to perform an iterative
 procedure that solves our optimization problem, using e\textsubscript{n+1}=T(e\textsubscript{n}) and updating
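The fixed-point procedure patched above can be sketched numerically. Below is a minimal NumPy illustration of the interpolated criterion $F_t$ and the update $e_{n+1}=T(e_n)$ from the paper's text; the two-class toy data, the dimension, and the choices $t=0.5$, $\alpha=1$ are assumptions for demonstration only, not values from the report.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-class toy data (stand-in for the face dataset)
X1 = rng.normal(0.0, 1.0, (50, 4)) + np.array([2.0, 0.0, 0.0, 0.0])
X2 = rng.normal(0.0, 1.0, (50, 4)) - np.array([2.0, 0.0, 0.0, 0.0])
X = np.vstack([X1, X2])
mu, mu1, mu2 = X.mean(0), X1.mean(0), X2.mean(0)

# Scatter matrices as defined in the text: S_B between classes,
# S_W within classes, S the (positive semi-definite) total scatter
S_B = np.outer(mu1 - mu, mu1 - mu) + np.outer(mu2 - mu, mu2 - mu)
S_W = (X1 - mu1).T @ (X1 - mu1) + (X2 - mu2).T @ (X2 - mu2)
S = S_B + S_W
eps, t, alpha = 1e-6, 0.5, 1.0   # assumed values for the sketch

def grad_F(e):
    """Gradient of F_t(e) as given in the hunk at paper.md line 268."""
    q = e @ S_W @ e + eps
    return ((1 - t) * (S @ e)
            + (t / q) * (S_B @ e)
            - t * (e @ S_B @ e) / q**2 * (S_W @ e))

def T(e):
    """One step of the normalized update T(e) = (alpha*e + grad F_t) / ||.||."""
    v = alpha * e + grad_F(e)
    return v / np.linalg.norm(v)

e = np.ones(4) / 2.0             # unit-norm starting vector
for _ in range(500):             # iterate e_{n+1} = T(e_n) to a fixed point
    e = T(e)
# At convergence e satisfies the stationarity condition grad F_t(e) ∝ e
```

The $\alpha e$ term plays exactly the role described in the text: it keeps the update's component along $e$ strictly positive, so the normalization never degenerates and the iteration settles on a unit vector satisfying the stationarity condition.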