author     nunzip <np.scarh@gmail.com>    2018-11-15 00:44:59 +0000
committer  nunzip <np.scarh@gmail.com>    2018-11-15 00:44:59 +0000
commit     70692e20466329b41ea73192e177f2fb5150f1aa (patch)
tree       3177b519dcd059d54a228aa3d3c4706140f4bf27
parent     696d63052406d8c30895c6ffb106d0c6ab4743a2 (diff)
download   vz215_np1915-70692e20466329b41ea73192e177f2fb5150f1aa.tar.gz
           vz215_np1915-70692e20466329b41ea73192e177f2fb5150f1aa.tar.bz2
           vz215_np1915-70692e20466329b41ea73192e177f2fb5150f1aa.zip
Modify values for max accuracy
-rwxr-xr-x  report/paper.md  39
1 file changed, 18 insertions, 21 deletions
diff --git a/report/paper.md b/report/paper.md
index d286b08..65bacca 100755
--- a/report/paper.md
+++ b/report/paper.md
@@ -7,7 +7,7 @@ the same amount of elements. The test data will instead
be taken from the remaining samples. Testing accuracy
with respect to the data partition indicates that the maximum
accuracy is obtained when using 90% of the data for
-training. Despite such results we will be using 80% of the data
+training. Despite such results we will be using 70% of the data
for training as a standard. This will allow us to give more than one
example of success and failure for each class when classifying the
test_data.
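
A minimal sketch of such a per-class split, assuming the face images are stored column-wise in a matrix `X` with a label vector `y` (both illustrative names), could look as follows:

```python
# Sketch only: split each class 70%/30% into training and test sets.
import numpy as np

def split_data(X, y, train_fraction=0.7, seed=0):
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)   # columns belonging to this class
        rng.shuffle(idx)
        cut = int(round(train_fraction * len(idx)))
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return X[:, train_idx], y[train_idx], X[:, test_idx], y[test_idx]
```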
@@ -23,7 +23,7 @@ and eigenvectors. The amount of non-zero eigenvalues and
eigenvectors obtained will only be equal to the number of
training samples minus one. This can be observed in the
graph below as a sudden drop in the eigenvalues after the
-415th.
+363rd.
\begin{center}
\includegraphics[width=20em]{fig/eigenvalues.pdf}
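
As a quick check, under the assumption that `X_train` is the D x N training matrix returned by the split sketched earlier, the number of non-zero eigenvalues can be counted directly:

```python
# Sketch only: with N training images the centred data matrix A (D x N) has
# rank at most N-1, so the covariance has at most N-1 non-zero eigenvalues.
import numpy as np

A = X_train - X_train.mean(axis=1, keepdims=True)       # subtract the mean face
eigenvalues = np.linalg.eigvalsh(A.T @ A / A.shape[1])   # same non-zero spectrum as (1/N)AA^T
nonzero = np.sum(eigenvalues > 1e-8 * eigenvalues.max())
print(nonzero, "non-zero eigenvalues from", A.shape[1], "training samples")
```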
@@ -42,7 +42,7 @@ for our standard seed can be observed below.
To perform face recognition we choose the best M eigenvectors
associated with the largest eigenvalues. We tried
different values of M, and we found an optimal point for
-M=42 with accuracy=66.3%. After such value the accuracy starts
+M=99 with accuracy=57%. Beyond this value the accuracy starts
to flatten, with some exceptions for points at which accuracy decreases.
Physically, each eigenvalue measures the variance of the training data along
its eigenvector, so the M eigenvectors with the largest eigenvalues span the
directions in face space that carry the most information about the data.
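
A sketch of the accuracy-versus-M sweep behind these numbers, assuming `eigenfaces` (eigenvectors sorted by decreasing eigenvalue), the centred training matrix `A`, `mean_face`, `X_test` and the label vectors are already available (all illustrative names):

```python
# Sketch only: nearest-neighbour classification in the M-dimensional eigenspace.
from scipy.spatial.distance import cdist

def accuracy_for(M):
    W = eigenfaces[:, :M]                      # keep the best M eigenvectors
    train_proj = W.T @ A                       # project training faces
    test_proj = W.T @ (X_test - mean_face)     # project test faces
    nearest = cdist(test_proj.T, train_proj.T).argmin(axis=1)
    return (y_train[nearest] == y_test).mean()

accuracies = {M: accuracy_for(M) for M in range(1, eigenfaces.shape[1] + 1)}
```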
@@ -60,9 +60,7 @@ two computation techniques used shows that the difference
is very small (due to rounding errors in the np.eigh function
when calculating the eigenvalues
and eigenvectors of the matrices A\textsuperscript{T}A (NxN) and AA\textsuperscript{T}
-(DxD))
-
-I MIGHT HAVE SWAPPED THE DIMENSIONS, NOT SURE
+(DxD)).
The ten largest eigenvalues obtained with each method
are shown in the table below.
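
A sketch of this comparison (using `np.linalg.eigh`, which returns eigenvalues in ascending order), assuming `A` is the centred D x N training matrix:

```python
# Sketch only: the ten largest eigenvalues of A^T A and AA^T agree up to rounding.
import numpy as np

vals_small = np.linalg.eigh(A.T @ A)[0][::-1]   # N x N  (fast method)
vals_large = np.linalg.eigh(A @ A.T)[0][::-1]   # D x D  (direct method)
print(vals_small[:10])
print(vals_large[:10])
```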
@@ -92,15 +90,15 @@ Computing the eigenvectors **u\textsubscript{i}** for the DxD matrix AA\textsupe
we obtain a very large matrix. The computation process can become very expensive when D>>N.
For this reason we compute the eigenvectors **v\textsubscript{i}** of the NxN
-matrix A\textsuperscript{T}A. From the computation it follows that $$ A\textsuperscript{T}A\boldsymbol{v\textsubscript{i}} = \lambda \textsubscript{i}\boldsymbol{v\textsubscript{i}} $$
+matrix A\textsuperscript{T}A. From the computation it follows that $A\textsuperscript{T}A\boldsymbol{v\textsubscript{i}} = \lambda \textsubscript{i}\boldsymbol{v\textsubscript{i}}$.
-Multiplying both side by A we obtain:
+Multiplying both sides by A we obtain:
$$ AA\textsuperscript{T}A\boldsymbol{v\textsubscript{i}} = \lambda \textsubscript{i}A\boldsymbol{v\textsubscript{i}} \rightarrow SA\boldsymbol{v\textsubscript{i}} = \lambda \textsubscript{i}A\boldsymbol{v\textsubscript{i}} $$
-We know that $$ S\boldsymbol{u\textsubscript{i}} = \lambda \textsubscript{i}\boldsymbol{u\textsubscript{i}} $$
+We know that $S\boldsymbol{u\textsubscript{i}} = \lambda \textsubscript{i}\boldsymbol{u\textsubscript{i}}$.
-From here it follows that AA\textsuperscript{T} and A\textsuperscript{T}A have the same eigenvalues and their eigenvectors follow the relationship $$ \boldsymbol{u\textsubscript{i}} = A\boldsymbol{v\textsubscript{i}} $$
+From here it follows that AA\textsuperscript{T} and A\textsuperscript{T}A have the same non-zero eigenvalues and their eigenvectors follow the relationship $\boldsymbol{u\textsubscript{i}} = A\boldsymbol{v\textsubscript{i}}$.
Using the computational method for fast PCA, face reconstruction is then performed.
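
A sketch of this fast-PCA reconstruction, assuming `A` is the centred training matrix, `mean_face` its column mean and `X_test` holds the test faces (illustrative names):

```python
# Sketch only: eigenfaces from the small N x N problem, then reconstruction.
import numpy as np

M = 99                                       # number of eigenfaces to keep
vals, V = np.linalg.eigh(A.T @ A)            # eigenpairs of the small N x N matrix
order = np.argsort(vals)[::-1][:M]           # indices of the M largest eigenvalues
U = A @ V[:, order]                          # u_i = A v_i
U /= np.linalg.norm(U, axis=0)               # normalise each eigenface

face = X_test[:, 0:1]                        # one test face, kept as a column
weights = U.T @ (face - mean_face)           # projection coefficients
reconstruction = mean_face + U @ weights     # reconstructed face
```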
@@ -139,12 +137,13 @@ can be observed below:
An example of failed classification is a test face from class 2, wrongly labeled as class 5:
\begin{center}
-\includegraphics[width=20em]{fig/failure_2_5.pdf}
+\includegraphics[width=7em]{fig/face2.pdf}
+\includegraphics[width=7em]{fig/face5.pdf}
%![Class 2 (left) labeled as class 5 (right)](fig/failure_2_5.pdf)
\end{center}
-The alternative method shows overall a better performance, with peak accuracy of 73%
-for M=3. The maximum M non zero eigenvectors that can be used will in this case be at most
+The alternative method shows an overall better performance, with peak accuracy of 69%
+for M=5. The maximum number M of non-zero eigenvectors that can be used will in this case be at most
the number of training samples per class minus one, since the same number of eigenvectors
will be used for each generated class-subspace.
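
A sketch of this class-subspace approach, assuming `X_train`/`y_train` come from the split above (illustrative names); each class gets its own PCA basis and a test face is assigned to the class whose subspace reconstructs it with the smallest error:

```python
# Sketch only: per-class PCA subspaces and classification by reconstruction error.
import numpy as np

def fit_class_subspaces(X_train, y_train, M=5):
    subspaces = {}
    for label in np.unique(y_train):
        Xc = X_train[:, y_train == label]
        mean = Xc.mean(axis=1, keepdims=True)
        Ac = Xc - mean
        vals, V = np.linalg.eigh(Ac.T @ Ac)
        U = Ac @ V[:, np.argsort(vals)[::-1][:M]]   # M <= samples per class - 1
        U /= np.linalg.norm(U, axis=0)
        subspaces[label] = (mean, U)
    return subspaces

def classify(face, subspaces):
    errors = {label: np.linalg.norm((face - mean) - U @ (U.T @ (face - mean)))
              for label, (mean, U) in subspaces.items()}
    return min(errors, key=errors.get)
```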
@@ -166,15 +165,16 @@ instance of mislabel of the same face of class 2 as class 5. An additional class
failure of class 6 labeled as class 7 can be observed below:
\begin{center}
-\includegraphics[width=20em]{fig/failure_6_7.pdf}
+\includegraphics[width=14em]{fig/failure_6_7.pdf}
%![Class 6 (left) labeled as class 7 (right)](fig/failure_6_7.pdf)
+\includegraphics[width=7em]{fig/rec_6.pdf}
\end{center}
# Question 2, Generative and Discriminative Subspace Learning
Maximize the function J(W) (Fisher's Criterion):
-$$ J(W) = \frac{W\textsuperscript{T}S\textsubscript{B}W}{W\textsuperscript{T}S\textsubscript{W}W}\textrm{ or } J(W) = \frac{W\textsuperscript{T}S\textsubscript{B}W}{W\textsuperscript{T}S\textsubscript{t}W}$$
+$$ J(W) = \frac{W\textsuperscript{T}S\textsubscript{B}W}{W\textsuperscript{T}S\textsubscript{W}W}\textrm{ or } J(W) = \frac{W\textsuperscript{T}S\textsubscript{B}W}{W\textsuperscript{T}S\textsubscript{t}W} $$
Here S\textsubscript{B} is the between-class scatter matrix, S\textsubscript{W}
is the within-class scatter matrix and W is the set of projection vectors. $\mu$
@@ -191,7 +191,6 @@ $$ (W\textsuperscript{T}S\textsubscript{W}W)2S\textsubscript{B}W - (W\textsupers
$$ S\textsubscript{B}W - JS\textsubscript{W}W = 0 $$
Multiplying by the inverse of S\textsubscript{W} we obtain:
-
$$ S\textsubscript{W}\textsuperscript{-1}S\textsubscript{B}W - JW = 0 $$
From here it follows:
@@ -199,8 +198,7 @@ From here it follows:
$$ W\textsuperscript{*} = \underset{W}{\arg\max}\left(\frac{W\textsuperscript{T}S\textsubscript{B}W}{W\textsuperscript{T}S\textsubscript{W}W}\right) = S\textsubscript{W}\textsuperscript{-1}(\mu\textsubscript{1} - \mu\textsubscript{2}) $$
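
For two classes this closed form can be computed directly. A minimal sketch, assuming `X1` and `X2` hold the column-wise samples of the two classes and that S\textsubscript{W} is invertible (which for face images generally requires a prior dimensionality reduction, since D >> N):

```python
# Sketch only: closed-form two-class Fisher direction.
import numpy as np

mu1 = X1.mean(axis=1, keepdims=True)
mu2 = X2.mean(axis=1, keepdims=True)
# within-class scatter; assumed non-singular here
S_W = (X1 - mu1) @ (X1 - mu1).T + (X2 - mu2) @ (X2 - mu2).T
W = np.linalg.solve(S_W, mu1 - mu2)          # W* = S_W^{-1}(mu_1 - mu_2)
W /= np.linalg.norm(W)                       # the scale of W does not affect J(W)
```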
By isomorphic mapping, where P is the matrix of eigenvectors generated through PCA:
-
-$$ W = PX $$
+$W = PX$
We can substitute for W in the J(W) expression, obtaining:
@@ -211,9 +209,8 @@ We can rewrite such expression substituting for:
$$ P\textsuperscript{T}S\textsubscript{B}P = \widetilde{S}\textsubscript{B} \textrm{ and } P\textsuperscript{T}S\textsubscript{t}P = \widetilde{S}\textsubscript{t} $$
$$ J(W) = \widetilde{J}(W) = \frac{X\textsuperscript{T}\widetilde{S}\textsubscript{B}X}{X\textsuperscript{T}\widetilde{S}\textsubscript{t}X} $$
-$$ \widetilde{S}\textsubscript{B} \textrm{ and } \widetilde{S}\textsubscript{t} $$
-
-are respectively semi-positive definite and positive definite. So $$ \widetilde{J}(X) $$
+$\widetilde{S}\textsubscript{B} \textrm{ and } \widetilde{S}\textsubscript{t}$
+are respectively positive semi-definite and positive definite. So $\widetilde{J}(X)$
acts like Fisher's criterion but in the PCA-transformed space. This method
does not result in any loss of data.
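
A sketch of the resulting PCA-then-LDA pipeline (W = PX), assuming `X_train`, `y_train` and the centred matrix `A` from earlier, with hypothetical parameter choices `M_pca` and `M_lda`:

```python
# Sketch only: PCA projection followed by Fisher's criterion in the reduced space.
import numpy as np
from scipy.linalg import eigh

M_pca, M_lda = 100, 30                        # illustrative choices

# P: the M_pca leading eigenfaces, computed with the fast method as before
vals, V = np.linalg.eigh(A.T @ A)
P = A @ V[:, np.argsort(vals)[::-1][:M_pca]]
P /= np.linalg.norm(P, axis=0)

Z = P.T @ A                                   # training data in PCA space

# scatter matrices in the reduced space, where S_t is no longer singular
mu = Z.mean(axis=1, keepdims=True)
S_t = (Z - mu) @ (Z - mu).T
S_B = np.zeros_like(S_t)
for label in np.unique(y_train):
    Zc = Z[:, y_train == label]
    d = Zc.mean(axis=1, keepdims=True) - mu
    S_B += Zc.shape[1] * (d @ d.T)

# X: generalised eigenvectors maximising the Fisher criterion in PCA space
_, X_lda = eigh(S_B, S_t)                     # eigenvalues returned in ascending order
W = P @ X_lda[:, ::-1][:, :M_lda]             # W = PX, keep the M_lda best directions
```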