# Question 1, Eigenfaces 

The data is partitioned so that the same number of samples is
randomly selected from each class. This is done to prevent
some classes from being over-represented with respect to others:
each class contributes the same number of elements to the
training vector space. The test data is taken from the
remaining samples. Measuring classification accuracy with
respect to the data partition indicates that the maximum
accuracy is obtained when 90% of the data is used for
training. Despite this result we use 80% of the data for
training as a standard, since it leaves more than one
example of success and failure for each class when classifying the
test data.
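
The per-class partition described above can be sketched as follows; the function name, arguments and 80% default are illustrative, not taken from our code:

```python
import numpy as np

def split_per_class(X, y, train_fraction=0.8, seed=0):
    """Randomly pick the same fraction of samples from every class for
    training; the remaining samples form the test set.
    X: (n_samples, n_features) data, y: (n_samples,) class labels.
    (Illustrative sketch; names and defaults are assumptions.)"""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)       # samples of class c
        rng.shuffle(idx)
        k = int(round(train_fraction * len(idx)))
        train_idx.extend(idx[:k])          # same count from each class
        test_idx.extend(idx[k:])
    return np.array(train_idx), np.array(test_idx)
```

With 80% training, a class with 10 samples keeps 2 test samples, which is what allows more than one success/failure example per class.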

![Classification Accuracy of Test Data vs % of data used for training](fig/partition.pdf "Partition")

After partitioning the data into training and testing sets,
PCA is applied. The covariance matrix, S, of dimension
2576x2576 (features x features), has 2576 eigenvalues
and eigenvectors. However, the number of non-zero eigenvalues
and eigenvectors obtained is only equal to the number of
training samples minus one. This can be observed in the
graph below as a sudden drop in the eigenvalues after the
415th.
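
The rank argument can be checked numerically; the dimensions below are toy values (the report's data has D=2576 features and roughly 416 training samples):

```python
import numpy as np

# With N mean-centred training faces of D pixels each, the D x D covariance
# matrix has at most N-1 non-zero eigenvalues, because subtracting the mean
# reduces the rank of the data matrix to at most N-1.
rng = np.random.default_rng(0)
N, D = 20, 100                           # toy sizes, not the report's
A = rng.standard_normal((N, D))          # rows are training faces
A = A - A.mean(axis=0)                   # subtract the mean face
S = (A.T @ A) / N                        # D x D covariance matrix
eigvals = np.linalg.eigvalsh(S)
print((eigvals > 1e-10).sum())           # at most N-1 non-zero eigenvalues
```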

![Log PCA Eigenvalues](fig/eigenvalues.pdf "Eigenvalues")

The mean image is calculated by averaging the features of the
training data. Changing the randomization seed gives
very similar values, since the vast majority of the training
faces used for averaging are the same. The mean face
for our standard seed can be observed below.

![Mean Face](fig/mean_face.pdf){ width=1em }
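
The stability of the mean face across seeds can be illustrated with toy pixel values (the real images have 2576 pixels; the numbers here are assumptions for the sketch):

```python
import numpy as np

# Two different random 80% training subsets give near-identical mean faces,
# because most of the faces being averaged are shared between the subsets.
rng = np.random.default_rng(0)
faces = rng.random((100, 2576))                 # rows are toy "faces"
sub1 = rng.choice(100, size=80, replace=False)  # one seed's training split
sub2 = rng.choice(100, size=80, replace=False)  # another seed's split
mean1 = faces[sub1].mean(axis=0)
mean2 = faces[sub2].mean(axis=0)
print(np.abs(mean1 - mean2).max())              # small: the means nearly agree
```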


To perform face recognition we choose the best M eigenvectors,
those associated with the largest eigenvalues. We tried
different values of M and found an optimal point at
M=120. Beyond this value the accuracy starts to flatten, with
some exceptions at points where the accuracy decreases.
<!-- TODO: add the physical meaning of M -->

![Recognition Accuracy of Test data varying M](fig/accuracy.pdf "Accuracy1")
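
The recognition step can be sketched as follows; the toy dimensions and the nearest-neighbour rule in eigenspace are illustrative assumptions (the report found M=120 optimal on the real data):

```python
import numpy as np

# Project mean-centred faces onto the M leading eigenvectors of the
# covariance matrix, then classify a probe face by the nearest training
# projection in the M-dimensional eigenspace.
rng = np.random.default_rng(0)
N, D, M = 30, 64, 5                      # toy sizes, not the report's
train = rng.standard_normal((N, D))      # rows are training faces
labels = np.arange(N) % 10               # toy class labels
mean_face = train.mean(axis=0)
A = train - mean_face
S = (A.T @ A) / N
eigvals, eigvecs = np.linalg.eigh(S)     # eigenvalues in ascending order
W = eigvecs[:, -M:]                      # D x M: the M leading eigenfaces
train_proj = A @ W                       # N x M training coefficients

def classify(face):
    coeffs = (face - mean_face) @ W                      # project the probe
    dists = np.linalg.norm(train_proj - coeffs, axis=1)  # distance to each
    return labels[np.argmin(dists)]                      # nearest neighbour
```

A training face projects exactly onto its own coefficients, so `classify` returns its own label.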

# Question 1, Application of eigenfaces

By performing the low-dimensional computation of the
eigenspace for PCA we obtain the same accuracy results
as the high-dimensional computation previously used. A
comparison between the eigenvalues and eigenvectors of the
two computation techniques shows that the difference
is very small. The difference we observed is due to rounding
in the `numpy.linalg.eigh` function when calculating the eigenvalues
and eigenvectors of the matrices $A^TA$ (DxD) and $AA^T$
(NxN).
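
The agreement of the two computations can be checked directly; the dimensions below are toy values standing in for the report's D=2576 and N=416:

```python
import numpy as np

# The small N x N matrix A A^T shares its non-zero eigenvalues with the
# large D x D matrix A^T A, so the low-dimensional problem recovers the
# same spectrum up to rounding.
rng = np.random.default_rng(0)
N, D = 15, 200                                       # toy sizes
A = rng.standard_normal((N, D))                      # rows are training faces
A = A - A.mean(axis=0)                               # mean-centred

vals_big = np.linalg.eigvalsh(A.T @ A)[-(N - 1):]    # top N-1 of D x D problem
vals_small = np.linalg.eigvalsh(A @ A.T)[-(N - 1):]  # top N-1 of N x N problem
print(np.allclose(vals_big, vals_small))             # agree up to rounding
```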

The ten largest eigenvalues obtained with each method
are shown in the table below.

\begin{table}[ht]
\centering
\begin{tabular}[t]{cc}
PCA &Fast PCA\\
2.9755E+05 &2.9828E+05\\
1.4873E+05 &1.4856E+05\\
1.2286E+05 &1.2259E+05\\
7.5084E+04 &7.4950E+04\\
6.2575E+04 &6.2428E+04\\
4.7024E+04 &4.6921E+04\\
3.7118E+04 &3.7030E+04\\
3.2101E+04 &3.2046E+04\\
2.7871E+04 &2.7814E+04\\
2.4396E+04 &2.4339E+04\\
\end{tabular}
\caption{Comparison of eigenvalues obtained with the two computation methods}
\end{table}

It can be shown that the non-zero eigenvalues obtained with the
two methods are the same: if $AA^T\mathbf{v} = \lambda\mathbf{v}$
with $\lambda \neq 0$, then multiplying both sides on the left by
$A^T$ gives $A^TA(A^T\mathbf{v}) = \lambda(A^T\mathbf{v})$, so
$\lambda$ is also an eigenvalue of $A^TA$, with corresponding
eigenvector $A^T\mathbf{v}$.

Reconstruction is then performed on a chosen face.

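
A reconstruction step of this kind can be sketched as follows; the dimensions and the choice of M values are illustrative assumptions:

```python
import numpy as np

# Rebuild a face as the mean face plus its first M eigenface components;
# the reconstruction error shrinks as M grows, and vanishes once M reaches
# the rank of the centred training data.
rng = np.random.default_rng(0)
N, D = 40, 100                                   # toy sizes
train = rng.standard_normal((N, D))              # rows are training faces
mean_face = train.mean(axis=0)
A = train - mean_face
_, eigvecs = np.linalg.eigh(A.T @ A)             # ascending eigenvalue order

face = train[0]
errors = []
for M in (5, 20, 39):
    W = eigvecs[:, -M:]                          # M leading eigenfaces
    recon = mean_face + ((face - mean_face) @ W) @ W.T
    errors.append(np.linalg.norm(face - recon))
print(errors)                                    # decreasing with M
```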
# Cites


# Conclusion


# References