authornunzip <np.scarh@gmail.com>2019-02-13 18:21:16 +0000
committernunzip <np.scarh@gmail.com>2019-02-13 18:21:16 +0000
commitd29d8573cbc149fb23339b787c769f3e7c860362 (patch)
treeb7f3e8845b1bf68e642509443cd53d61e3ba472f
parent3da81b76d224d2d81d576b77193703150c3d67d9 (diff)
downloade4-vision-d29d8573cbc149fb23339b787c769f3e7c860362.tar.gz
e4-vision-d29d8573cbc149fb23339b787c769f3e7c860362.tar.bz2
e4-vision-d29d8573cbc149fb23339b787c769f3e7c860362.zip
Add comments to part 3
-rw-r--r--report/paper.md10
1 files changed, 9 insertions, 1 deletions
diff --git a/report/paper.md b/report/paper.md
index 51df666..0674557 100644
--- a/report/paper.md
+++ b/report/paper.md
@@ -95,9 +95,11 @@ Figure \ref{fig:km_cm} shows a confusion matrix for K-means+RF Classifier with 2
# RF codebook
-An alternative to codebook creation via K-means involves using an ensemble of totally random trees. We code each decriptor according to which leaf of each tree in the ensemble it is sorted. This effectively performs and unsupervised transformation of our dataset to a high-dimensional spare representation. The dimension of the vocabulary size is determined by the number of leaves in each random tree and the ensemble size. From comparing execution times of K-means in figure \ref{fig:km_vocsize} and the RF codebook in \ref{fig:p3_voc} we observe considerable speed gains from utilising the RF codebook. This may be attributed to the reduce complexity of RF Codebook creation,
+An alternative to codebook creation via K-means involves using an ensemble of totally random trees. We code each descriptor according to the leaf of each tree in the ensemble it is sorted into. This effectively performs an unsupervised transformation of our dataset to a high-dimensional sparse representation. The vocabulary size is determined by the number of leaves in each random tree multiplied by the ensemble size. Comparing the execution times of K-means in figure \ref{fig:km_vocsize} and of the RF codebook in figure \ref{fig:p3_voc}, we observe considerable speed gains from utilising the RF codebook. This may be attributed to the reduced complexity of RF codebook creation,
which is $O(\sqrt{D} N \log K)$ compared to $O(DNK)$ for K-means. Codebook mapping given a created vocabulary is also quicker than K-means, $O(\log K)$ (assuming a balanced tree) vs $O(KD)$.
+The effect of vocabulary size on classification accuracy can be observed both in figure \ref{fig:p3_voc}, in which we independently vary the number of leaves and the ensemble size, and in figure \ref{fig:p3_colormap}, in which both parameters are varied simultaneously. Classification accuracy plateaus for *leaves>80* and *estimators>100*. The peaks at 82% shown in figure \ref{fig:p3_colormap} are attributable to the random seed.
+
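The RF codebook described above can be sketched with scikit-learn's `RandomTreesEmbedding`, which implements this totally-random-trees coding: each descriptor is mapped to an indicator vector over the leaves it reaches, and summing a set of descriptor codes gives a bag-of-words histogram. This is an illustrative sketch on synthetic descriptors, not the coursework's actual implementation; the descriptor dimensions and parameter values here are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomTreesEmbedding

rng = np.random.default_rng(0)
descriptors = rng.normal(size=(500, 128))  # stand-in for e.g. 500 SIFT descriptors

# Totally random trees: each descriptor is coded by the leaf it reaches in each tree,
# giving a sparse (n_descriptors, total_leaves) indicator matrix.
coder = RandomTreesEmbedding(n_estimators=100, max_depth=5, random_state=0)
codes = coder.fit_transform(descriptors)

# A bag-of-words histogram for an image is the sum of its descriptors' leaf codes.
hist = np.asarray(codes.sum(axis=0)).ravel()
```

Each descriptor lands in exactly one leaf per tree, so each row of `codes` has exactly `n_estimators` ones, and the vocabulary size (`codes.shape[1]`) is bounded by leaves-per-tree times ensemble size.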
\begin{figure}[H]
\begin{center}
\includegraphics[width=12em]{fig/error_depth_p3.pdf}
@@ -107,6 +109,8 @@ which is $O(\sqrt{D} N \log K)$ compared to $O(DNK)$ for K-means. Codebook mappi
\end{center}
\end{figure}
+Similarly to the K-means+RF Classifier case, we find the optimal tree depth and number of trees to be close to 5 and 100 respectively, as can be seen in figure \ref{fig:p3_trees}. Classification accuracy is on average 1% to 2% lower (78% on average, peaking at 82%).
+
\begin{figure}[H]
\begin{center}
\includegraphics[width=12em]{fig/p3_vocsize.pdf}
@@ -116,6 +120,10 @@ which is $O(\sqrt{D} N \log K)$ compared to $O(DNK)$ for K-means. Codebook mappi
\end{center}
\end{figure}
+Varying the randomness parameter (figure \ref{fig:kmeanrandom}) achieves results similar to the K-means+RF Classifier case.
+
+Figure \ref{fig:p3_cm} shows the confusion matrix for a case in which we choose Codebook Forest Size=256, Classifier Forest Size=100 and Tree Depth=5 (examples of success and failure in figure \ref{fig:p3_succ}). The classification accuracy for this case is 79%, with *windsor_chair* the top performing class. As with K-means, *water_lilly* was the worst performing class. The results are in general similar to those obtained with K-means+RF Classifier, but the gain in time performance of RF Codebook+RF Classifier is significant.
+
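The full RF Codebook+RF Classifier pipeline for the configuration above can be sketched as follows. This is a hedged illustration on toy data: it assumes "Codebook Forest Size" means the number of trees in the embedding ensemble, and the image counts, descriptor counts and class labels are synthetic placeholders, not the Caltech data used in the report.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomTreesEmbedding

rng = np.random.default_rng(0)
n_images, descs_per_image, dim = 40, 50, 128
X = rng.normal(size=(n_images * descs_per_image, dim))  # toy descriptors
y = rng.integers(0, 4, size=n_images)                   # 4 hypothetical classes

# RF codebook: 256 totally random trees code every descriptor by its leaves.
codebook = RandomTreesEmbedding(n_estimators=256, max_depth=5, random_state=0)
codes = codebook.fit_transform(X)

# Pool each image's descriptor codes into a single bag-of-words histogram.
hists = np.stack([
    np.asarray(codes[i * descs_per_image:(i + 1) * descs_per_image]
               .sum(axis=0)).ravel()
    for i in range(n_images)
])

# RF classifier on the histograms, with the depth/size found optimal above.
clf = RandomForestClassifier(n_estimators=100, max_depth=5, random_state=0)
clf.fit(hists, y)
pred = clf.predict(hists)
```

Since the codebook stage is only a sequence of random splits, both fitting and mapping are cheap relative to K-means, which is the speed gain noted above.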
# Comparison of methods and conclusions
Overall we observe slightly higher accuracy when using K-means codebooks compared to RF codebooks, at the expense of higher execution times for training and testing.