author     nunzip <np.scarh@gmail.com>  2018-12-10 20:35:38 +0000
committer  nunzip <np.scarh@gmail.com>  2018-12-10 20:35:38 +0000
commit     d255f1d53f47599c6b5c422031e2432472049159 (patch)
tree       84c36e027cbdde9c392c2d10fa34d34c0c2b4eaa
parent     1fd15d053253ec82326df5816894c34e5de73c22 (diff)
parent     410b6adba3cf266fbbe6455caae9c33712b1b8c5 (diff)
Merge branch 'master' of git.skozl.com:e4-pattern
-rwxr-xr-x  report2/paper.md  15
1 file changed, 12 insertions, 3 deletions
diff --git a/report2/paper.md b/report2/paper.md
index cf8221a..98915e8 100755
--- a/report2/paper.md
+++ b/report2/paper.md
@@ -75,8 +75,11 @@ identification is shown in red.
\end{center}
\end{figure}
-Normalization of the feature vectors does not improve accuracy results of the
-baseline as it can be seen in figure \ref{fig:baselineacc}. ###EXPLAIN WHY
+Magnitude normalization of the feature vectors does not improve the accuracy
+of the baseline, as can be seen in figure \ref{fig:baselineacc}.
+This is because the feature vectors are already scaled relative to their
+significance for optimal distance classification, and normalising discards
+this importance-based scaling that was previously introduced to the features.
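+
+As a rough illustration of the difference between the two pipelines, the sketch
+below matches a query against the gallery with raw and with L2-normalised
+features; the arrays, shapes and names are hypothetical and not taken from our
+implementation.
+
+```python
+# Minimal sketch (hypothetical data): nearest-neighbour matching with raw and
+# with L2-normalised feature vectors. Normalisation removes per-vector
+# magnitude before the Euclidean distance is computed.
+import numpy as np
+
+rng = np.random.default_rng(0)
+gallery = rng.normal(size=(1487, 256))            # hypothetical gallery features
+gallery *= rng.uniform(0.5, 2.0, size=(1487, 1))  # per-vector magnitude differences
+query = gallery[0] + 0.05 * rng.normal(size=256)  # noisy copy of gallery item 0
+
+def nearest(q, g):
+    """Index of the gallery vector closest to q in Euclidean distance."""
+    return int(np.argmin(np.linalg.norm(g - q, axis=1)))
+
+def l2_normalise(x):
+    """Scale each vector (last axis) to unit L2 norm."""
+    return x / np.linalg.norm(x, axis=-1, keepdims=True)
+
+print(nearest(query, gallery))                              # raw features
+print(nearest(l2_normalise(query), l2_normalise(gallery)))  # normalised features
+```
+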
## k-Means Clustering
@@ -89,8 +92,14 @@ classify the query image.
This method did not bring any major improvement over the baseline, as can be seen in
figure \ref{fig:baselineacc}. It is noticeable how the number of clusters affects
performance, showing better identification accuracy for a number of clusters away from
-the local minimum achieved at 60 clusters (figure \ref{fig:kmeans}). ###EXPLAIN WHY
+the local minimum achieved at 60 clusters (figure \ref{fig:kmeans}). This trend
+can likely be explained by the number of distance comparisons performed.
+We would expect clustering with $k=1$ and $k=\textrm{label count}$ to have the
+same performance as the baseline approach without clustering, as we are
+performing the same number of comparisons.
+
+Clustering greatly reduces computation time. Assuming 39 clusters of 39
+neighbours each, we would perform only 78 distance computations for a gallery
+of size 1487, instead of the original 1487. This however comes at the cost of
+ignoring neighbours from other clusters which may be closer. Since clusters do
+not necessarily contain the same number of datapoints (sizes are uneven), we
+find that the lowest average number of comparisons occurs at around 60
+clusters, which also appears to be the worst performing number of clusters.
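+
+As a rough sketch of this comparison count (hypothetical data, and using
+scikit-learn's KMeans rather than our own pipeline), clustering the gallery and
+searching only the nearest cluster costs roughly $k$ plus one cluster's worth
+of distance computations:
+
+```python
+# Count distance comparisons for cluster-then-search versus brute force.
+# Sketch only: the gallery features and sizes here are made up.
+import numpy as np
+from sklearn.cluster import KMeans
+
+rng = np.random.default_rng(0)
+gallery = rng.normal(size=(1487, 256))   # hypothetical gallery features
+query = rng.normal(size=256)             # hypothetical query feature
+
+k = 39
+km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(gallery)
+
+# 1) compare the query against the k cluster centroids
+nearest_cluster = np.argmin(np.linalg.norm(km.cluster_centers_ - query, axis=1))
+# 2) compare the query against the members of that cluster only
+members = np.flatnonzero(km.labels_ == nearest_cluster)
+
+print("with clustering:", k + len(members))   # about 78 when clusters are balanced
+print("without clustering:", len(gallery))    # 1487
+```
+
+Running the same sketch with $k=1$ gives $1 + 1487$ comparisons, matching the
+intuition that degenerate cluster counts reduce to the baseline search.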
+
+We find that, for this query and gallery set, clustering does not seem to
+improve identification accuracy, and we therefore treat it as an additional
+baseline.
\begin{figure}
\begin{center}