 report/paper.md | 72 ++++++++++++++++++++++++++++++++----------------------------------
 1 file changed, 38 insertions(+), 34 deletions(-)
diff --git a/report/paper.md b/report/paper.md
index 63e20d0..51df666 100644
--- a/report/paper.md
+++ b/report/paper.md
@@ -75,6 +75,8 @@ This parameter also affects correlation between trees. We expect in fact trees t
Changing the randomness parameter had no significant effect on execution time. This may be attributed to the increased tree depth required to purify the training set.
+Effects of vocabulary size on accuracy and time performance are shown in section I, figure \ref{fig:km_vocsize}. Time increases linearly with vocabulary size. The optimal number of cluster centers was found to be around 100, giving a good tradeoff between time and accuracy. As shown in figure \ref{fig:km_vocsize}, the classification error does not fully plateau, although its gradient decreases significantly.
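+
+As a minimal sketch of the vocabulary construction this experiment varies, assuming scikit-learn and hypothetical descriptor arrays (not the code used for these results):
+
+```python
+import numpy as np
+from sklearn.cluster import KMeans
+
+# Hypothetical stack of local descriptors from all training images,
+# e.g. 128-D SIFT vectors.
+train_descriptors = np.random.rand(10000, 128)
+
+# K cluster centres define the visual vocabulary; K ~ 100 gave the best
+# time/accuracy tradeoff in our tests.
+kmeans = KMeans(n_clusters=100, n_init=10).fit(train_descriptors)
+
+def bag_of_words(descriptors):
+    """Quantise one image's descriptors into a normalised histogram."""
+    words = kmeans.predict(descriptors)
+    hist = np.bincount(words, minlength=kmeans.n_clusters).astype(float)
+    return hist / hist.sum()
+```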
+
## Weak Learner comparison
In figure \ref{fig:2pt} it is possible to notice an improvement of 2% in recognition accuracy,
@@ -91,23 +93,6 @@ test theoretically brings a slight deacrease in time performance due to complexi
Figure \ref{fig:km_cm} shows the confusion matrix for the K-means + RF classifier with 256 centroids, a forest size of 100 and a tree depth of 5. The reported accuracy for this case is 82%. Figure \ref{fig:km_succ} shows examples of success and failure cases from this test: the top performing classes are *trilobite* and *windsor_chair*, while *water_lilly* performed worst on average.
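+
+A minimal sketch of how such a confusion matrix and accuracy figure can be produced, assuming scikit-learn and randomly generated stand-ins for the bag-of-words histograms:
+
+```python
+import numpy as np
+from sklearn.ensemble import RandomForestClassifier
+from sklearn.metrics import confusion_matrix, accuracy_score
+
+# Hypothetical 256-D bag-of-words histograms and class labels.
+rng = np.random.default_rng(0)
+X_train, y_train = rng.random((150, 256)), rng.integers(0, 10, 150)
+X_test, y_test = rng.random((50, 256)), rng.integers(0, 10, 50)
+
+# Forest size 100 and depth 5, matching the reported configuration.
+rf = RandomForestClassifier(n_estimators=100, max_depth=5)
+rf.fit(X_train, y_train)
+y_pred = rf.predict(X_test)
+
+print(confusion_matrix(y_test, y_pred))  # rows: true class, columns: predicted
+print(accuracy_score(y_test, y_pred))
+```
+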
-\begin{figure}[H]
-\begin{center}
-\includegraphics[width=14em]{fig/e100k256d5_cm.pdf}
-\caption{Confusion Matrix: K=256, ClassifierForestSize=100, Depth=5}
-\label{fig:km_cm}
-\end{center}
-\end{figure}
-
-\begin{figure}[H]
-\begin{center}
-\includegraphics[width=8em]{fig/success_km.pdf}
-\includegraphics[width=8em]{fig/fail_km.pdf}
-\caption{K-means + RF Classifier: Success (left); Failure (right)}
-\label{fig:km_succ}
-\end{center}
-\end{figure}
-
# RF codebook
An alternative to codebook creation via K-means involves using an ensemble of totally random trees. We encode each descriptor according to the leaf of each tree in the ensemble into which it is sorted. This effectively performs an unsupervised transformation of our dataset to a high-dimensional sparse representation. The vocabulary size is determined by the number of leaves in each random tree and the ensemble size. From comparing execution times of K-means in figure \ref{fig:km_vocsize} and the RF codebook in figure \ref{fig:p3_voc} we observe considerable speed gains from utilising the RF codebook. This may be attributed to the reduced complexity of RF codebook creation,
@@ -115,23 +100,6 @@ which is $O(\sqrt{D} N \log K)$ compared to $O(DNK)$ for K-means. Codebook mappi
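+A minimal sketch of this encoding, assuming scikit-learn, whose RandomTreesEmbedding implements a totally random tree ensemble of this kind (our own implementation may differ in detail):
+
+```python
+import numpy as np
+from sklearn.ensemble import RandomTreesEmbedding
+
+# Hypothetical local descriptors, e.g. 128-D SIFT vectors.
+descriptors = np.random.rand(10000, 128)
+
+# Each descriptor is sorted to one leaf per tree; concatenating the one-hot
+# leaf indicators gives a sparse code of up to n_estimators * n_leaves dims.
+embedder = RandomTreesEmbedding(n_estimators=256, max_depth=5).fit(descriptors)
+codes = embedder.transform(descriptors)  # sparse matrix, one row per descriptor
+print(codes.shape)
+```
+
+Pooling these sparse codes over an image's descriptors would then yield the histogram fed to the classifier.
+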
\begin{figure}[H]
\begin{center}
-\includegraphics[width=14em]{fig/256t1_e200D5_cm.pdf}
-\caption{Confusion Matrix: CodeBookForestSize=256; ClassifierForestSize=200; Depth=5}
-\label{fig:p3_cm}
-\end{center}
-\end{figure}
-
-\begin{figure}[H]
-\begin{center}
-\includegraphics[width=8em]{fig/success_3.pdf}
-\includegraphics[width=8em]{fig/fail_3.pdf}
-\caption{RF Codebooks + RF Classifier: Success (left); Failure (right)}
-\label{fig:p3_succ}
-\end{center}
-\end{figure}
-
-\begin{figure}[H]
-\begin{center}
\includegraphics[width=12em]{fig/error_depth_p3.pdf}
\includegraphics[width=12em]{fig/trees_p3.pdf}
\caption{RF-codebook classification error varying tree depth (left) and number of trees (right)}
@@ -163,10 +131,46 @@ is the one that gets misclassified the most, both in k-means and RF-codebook (re
from this class do not guarantee very discriminative splits, hence the first splits in the trees
will prioritize features taken from other classes.
+\newpage
+
# Appendix
\begin{figure}[H]
\begin{center}
+\includegraphics[width=14em]{fig/e100k256d5_cm.pdf}
+\caption{Confusion Matrix: K=256, ClassifierForestSize=100, Depth=5}
+\label{fig:km_cm}
+\end{center}
+\end{figure}
+
+\begin{figure}[H]
+\begin{center}
+\includegraphics[width=8em]{fig/success_km.pdf}
+\includegraphics[width=8em]{fig/fail_km.pdf}
+\caption{K-means + RF Classifier: Success (left); Failure (right)}
+\label{fig:km_succ}
+\end{center}
+\end{figure}
+
+\begin{figure}[H]
+\begin{center}
+\includegraphics[width=14em]{fig/256t1_e200D5_cm.pdf}
+\caption{Confusion Matrix: CodeBookForestSize=256; ClassifierForestSize=200; Depth=5}
+\label{fig:p3_cm}
+\end{center}
+\end{figure}
+
+\begin{figure}[H]
+\begin{center}
+\includegraphics[width=8em]{fig/success_3.pdf}
+\includegraphics[width=8em]{fig/fail_3.pdf}
+\caption{RF Codebooks + RF Classifier: Success (left); Failure (right)}
+\label{fig:p3_succ}
+\end{center}
+\end{figure}
+
+\begin{figure}[H]
+\begin{center}
\includegraphics[width=14em]{fig/p3_colormap.pdf}
\caption{Varying leaves and estimators: effect on accuracy}
\label{fig:p3_colormap}