author      nunzip <np.scarh@gmail.com>    2019-02-13 22:48:33 +0000
committer   nunzip <np.scarh@gmail.com>    2019-02-13 22:48:33 +0000
commit      3c5784b1fcd2321ab598b04757943a4b8be11e9c (patch)
tree        7ee89c7a8ec1f436ab97c7e0cb3f9b3b2db9aecb
parent      61fba78fe0b42447addb8d59413dfe081a4c5b63 (diff)
Move confusion matrices to main body
-rw-r--r--  report/paper.md | 75
1 file changed, 34 insertions(+), 41 deletions(-)
diff --git a/report/paper.md b/report/paper.md
index ba2653c..57cbf23 100644
--- a/report/paper.md
+++ b/report/paper.md
@@ -18,10 +18,10 @@ The number of clusters or the number of centroids determines the vocabulary size
Example histograms for training and testing images are shown in figure \ref{fig:histo_tr}, computed with a vocabulary size of 100. The histograms of the same class appear to have comparable magnitudes for their respective keywords, showing that a similar number of descriptors mapped to each of the clusters. The effect of vocabulary size (as determined by the number of K-means centroids) on classification accuracy is shown in figure \ref{fig:km_vocsize}. A small vocabulary size tends to misrepresent the information contained in the different patches, resulting in poor classification accuracy. Conversely, a large vocabulary size (many K-means centroids) may lead to overfitting. In our tests we observe a plateau after a cluster count of 60 in figure \ref{fig:km_vocsize}.
-\begin{figure}[H]
+\begin{figure}
\begin{center}
\includegraphics[width=12em]{fig/kmeans_vocsize.pdf}
-\includegraphics[width=12em]{fig/time_kmeans.pdf}
+\includegraphics[width=12em]{fig/kmean_train_test_time.pdf}
\caption{Effect of vocabulary size; classification error (left) and time (right)}
\label{fig:km_vocsize}
\end{center}
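
A minimal sketch of the quantisation and histogram step described above, assuming descriptors are already available as NumPy arrays and using scikit-learn's `KMeans`; the function names and the normalisation choice are illustrative, not taken from the report's implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_vocabulary(train_descriptors, k=100, seed=0):
    """Cluster the pooled training descriptors into k visual words."""
    kmeans = KMeans(n_clusters=k, n_init=1, random_state=seed)
    kmeans.fit(np.vstack(train_descriptors))
    return kmeans

def bow_histogram(kmeans, image_descriptors):
    """Assign each descriptor to its nearest centroid and count occurrences."""
    words = kmeans.predict(image_descriptors)
    hist = np.bincount(words, minlength=kmeans.n_clusters)
    return hist / hist.sum()  # normalise so the descriptor count per image does not matter
```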
@@ -33,7 +33,7 @@ The time complexity of quantisation with a K-means codebooks is $O(DNK)$, where
K-means is a process that converges to local optima and depends heavily on the initialization values of the centroids.
Initializing K-means is an expensive process based on sequential attempts of centroid placement. Running multiple initializations significantly increases the computational cost, leading to a linear increase in execution time. We did not observe an increase in accuracy with a K-means estimator count larger than one, and therefore present accuracy and execution time results for a single K-means estimator.
-\begin{figure}[H]
+\begin{figure}
\begin{center}
\includegraphics[width=12em]{fig/trainhist.pdf}
\includegraphics[width=12em]{fig/testhist.pdf}
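
To illustrate the linear cost of repeated initialization discussed above, a small timing sketch; the descriptor array is random placeholder data and the timings are not measurements from the report.

```python
import time
import numpy as np
from sklearn.cluster import KMeans

descriptors = np.random.rand(10_000, 128)  # placeholder for pooled SIFT-like descriptors

for n_init in (1, 5, 10):
    start = time.time()
    KMeans(n_clusters=100, n_init=n_init, random_state=0).fit(descriptors)
    # fit time grows roughly linearly with the number of initialization attempts,
    # while downstream classification accuracy stayed flat in our tests
    print(f"n_init={n_init}: {time.time() - start:.1f}s")
```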
@@ -52,7 +52,7 @@ Figure \ref{fig:km-tree-param} shows the effect of tree depth and number of tree
Optimal values for tree depth and number of trees were found to be 5 and 100 respectively, as shown in figure \ref{fig:km-tree-param}. Running with multiple seeds shows an average accuracy of 80% for these two parameters, peaking at 84% in specific cases.
We would expect a large tree depth to lead to overfitting; however, for the data analysed we only observe a plateau in classification performance.
-\begin{figure}[H]
+\begin{figure}
\begin{center}
\includegraphics[width=12em]{fig/error_depth_kmean100.pdf}
\includegraphics[width=12em]{fig/trees_kmean.pdf}
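
A sketch of the classifier sweep behind these plots, assuming scikit-learn's `RandomForestClassifier` and that `X_train`/`y_train` hold the BoW histograms and labels; the grid values are examples, not the exact ones used for the figure.

```python
from sklearn.ensemble import RandomForestClassifier

def rf_accuracy(X_train, y_train, X_test, y_test, depth=5, trees=100, seed=0):
    """Train an RF classifier on BoW histograms and report test accuracy."""
    clf = RandomForestClassifier(n_estimators=trees, max_depth=depth,
                                 criterion="gini", random_state=seed)
    clf.fit(X_train, y_train)
    return clf.score(X_test, y_test)

# example sweep for depth/tree-count plots (values are illustrative)
# scores = {(d, t): rf_accuracy(X_train, y_train, X_test, y_test, d, t)
#           for d in range(2, 11) for t in (10, 50, 100, 200)}
```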
@@ -64,7 +64,7 @@ We expect a large tree depth to lead into overfitting. However for the data anal
Random forests select a random subset of features on which to apply a weak learner (such as an axis-aligned split) and then choose the best of the sampled features to split on, based on a given criterion (our results use the *Gini index*). The fewer features compared at each split, the quicker the trees are built and the more random they are; the randomness parameter can therefore be considered the number of features used when making splits. We evaluate accuracy for different randomness values when using a K-means vocabulary in figure \ref{fig:kmeanrandom}. The results in figure \ref{fig:kmeanrandom} use a forest size of 100, as we inferred that this is the estimator count at which performance gains plateau (when selecting $\sqrt{n}$ random features).
This parameter also affects correlation between trees: we expect trees to be more correlated when a larger number of features is used for splits.
-\begin{figure}[H]
+\begin{figure}
\begin{center}
\includegraphics[width=12em]{fig/new_kmean_random.pdf}
\includegraphics[width=12em]{fig/p3_rand.pdf}
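
In scikit-learn terms the randomness parameter corresponds to `max_features`; a sketch of such a sweep follows, with the function and variable names assumed for illustration.

```python
from sklearn.ensemble import RandomForestClassifier

def randomness_sweep(X_train, y_train, X_test, y_test, seed=0):
    """Vary the number of candidate features per split; fewer candidates
    give faster training and more de-correlated (more random) trees."""
    scores = {}
    for max_features in (1, "sqrt", 0.5, None):  # None = consider every feature
        clf = RandomForestClassifier(n_estimators=100, max_depth=5,
                                     max_features=max_features,
                                     random_state=seed)
        clf.fit(X_train, y_train)
        scores[max_features] = clf.score(X_test, y_test)
    return scores
```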
@@ -83,7 +83,7 @@ In figure \ref{fig:2pt} it is possible to notice an improvement in recognition a
with the two-pixel test, which achieves better results than the axis-aligned counterpart. The two-pixel
test theoretically brings a slight decrease in time performance due to its added complexity, since it adds one dimension to the computation. This is difficult to measure in our case, since the difference should be less than a second.
-\begin{figure}[H]
+\begin{figure}
\begin{center}
\includegraphics[width=14em]{fig/2pixels_kmean.pdf}
\caption{K-means classification accuracy for different types of weak learner}
@@ -91,7 +91,15 @@ test theoretically brings a slight deacrease in time performance due to complexi
\end{center}
\end{figure}
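
One way to emulate a two-pixel test with an off-the-shelf axis-aligned forest is to append differences of random feature pairs to the histograms, so a single-feature threshold on a new column acts as a two-feature comparison; this is only a sketch of the idea, not the learner actually used for figure \ref{fig:2pt}.

```python
import numpy as np

def add_two_pixel_features(X, n_pairs=200, seed=0):
    """Append differences of random feature pairs so that an axis-aligned
    threshold on a new column behaves like a two-pixel comparison."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    pairs = rng.integers(0, d, size=(n_pairs, 2))
    diffs = X[:, pairs[:, 0]] - X[:, pairs[:, 1]]
    return np.hstack([X, diffs])
```

The same classifier as before can then be trained on `add_two_pixel_features(X_train)` and evaluated on the equally augmented test set; the fixed seed keeps the sampled pairs consistent between the two.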
-Figure \ref{fig:km_cm} shows a confusion matrix for K-means+RF CLassifier with 256 centroids, a forest size of 100 and trees depth of 5. The reported accuracy for this case is 82%. Figure \ref{fig:km_succ} reports examples of failure and success cases obtained from this test, with the top performing classes being *trilobite* and *windsor_chair*. *Water_lilly* was the one that on average performed worst.
+Figure \ref{fig:km_cm} shows a confusion matrix for the K-means+RF classifier with 256 centroids, a forest size of 100 and a tree depth of 5. The reported accuracy for this case is 82%. Figure \ref{fig:km_succ} reports examples of failure and success cases obtained from this test, with the top performing classes being `trilobite` and `windsor_chair`. `Water_lilly` was on average the worst performing class.
+
+\begin{figure}
+\begin{center}
+\includegraphics[width=14em]{fig/e100k256d5_cm.pdf}
+\caption{Confusion Matrix: K=256, ClassifierForestSize=100, Depth=5}
+\label{fig:km_cm}
+\end{center}
+\end{figure}
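
The confusion matrix above can be produced from a fitted classifier with scikit-learn's `ConfusionMatrixDisplay`; the variable names, and the assumption that this is how the figure was generated, are ours.

```python
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

def plot_confusion(clf, X_test, y_test, class_names, out="fig/e100k256d5_cm.pdf"):
    """Per-class confusion matrix for the fitted K-means + RF classifier."""
    ConfusionMatrixDisplay.from_estimator(clf, X_test, y_test,
                                          display_labels=class_names,
                                          xticks_rotation="vertical")
    plt.tight_layout()
    plt.savefig(out)
```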
# RF codebook
@@ -100,7 +108,7 @@ which is $O(\sqrt{D} N \log K)$ compared to $O(DNK)$ for K-means. Codebook mappi
The effect of vocabulary size on classification accuracy can be observed both in figure \ref{fig:p3_voc}, in which we independently vary the number of leaves and the ensemble size, and in figure \ref{fig:p3_colormap}, in which both parameters are varied simultaneously. It is possible to notice that classification accuracy plateaus for *leaves*$>80$ and *estimators*$>100$. The peaks of 82% accuracy visible on the heatmap in figure \ref{fig:p3_colormap} are highly dependent on the seed and indicate the range of *good* hyperparameters.
-\begin{figure}[H]
+\begin{figure}
\begin{center}
\includegraphics[width=12em]{fig/error_depth_p3.pdf}
\includegraphics[width=12em]{fig/trees_p3.pdf}
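
A sketch of the RF-codebook mapping, assuming scikit-learn's `RandomTreesEmbedding` as the forest of random trees; the report's implementation may differ, so this only illustrates how leaf indices can replace K-means centroids as visual words.

```python
import numpy as np
from sklearn.ensemble import RandomTreesEmbedding

def fit_rf_codebook(train_descriptors, n_trees=256, depth=5, seed=0):
    """Fit a forest of random trees on the pooled descriptors; each leaf
    acts as a visual word, so mapping a descriptor costs one traversal
    per tree instead of K distance computations."""
    codebook = RandomTreesEmbedding(n_estimators=n_trees, max_depth=depth,
                                    random_state=seed)
    codebook.fit(np.vstack(train_descriptors))
    return codebook

def rf_histogram(codebook, image_descriptors):
    """Leaf-occupancy histogram (the BoW vector) for one image."""
    leaves = codebook.transform(image_descriptors)  # sparse one-hot leaf encoding
    return np.asarray(leaves.sum(axis=0)).ravel()
```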
@@ -111,10 +119,10 @@ The effect of vocabulary size on classification accuracy can be observed both in
Similarly to the K-means codebook, we find that for the RF codebook the optimal tree depth and number of trees are around 5 and 100, as can be seen in figure \ref{fig:p3_trees}. Classification accuracy is on average 1% to 2% lower (78% on average, peaking at 82%).
-\begin{figure}[H]
+\begin{figure}
\begin{center}
\includegraphics[width=12em]{fig/p3_vocsize.pdf}
-\includegraphics[width=12em]{fig/p3_time.pdf}
+\includegraphics[width=12em]{fig/p3_train_test_time.pdf}
\caption{RF codebook: effect of vocabulary size; classification error (left) and time (right)}
\label{fig:p3_voc}
\end{center}
@@ -124,9 +132,17 @@ Varying the randomness parameter of the RF classifier (as in figure \ref{fig:kme
Figure \ref{fig:p3_cm} shows the confusion matrix for results with Codebook Forest Size=256, Classifier Forest Size=100 and Tree Depth=5 (examples of success and failure in figure \ref{fig:p3_succ}). The classification accuracy for this case is 79%, with the top performing class being `windsor_chair`. In our tests, we observed the poorest performance with the `water_lilly` class. The per-class classification accuracy with the RF codebook is similar to that of K-means coded data, but we observe a significant speedup in training when building the RF-tree-based vocabulary.
+\begin{figure}
+\begin{center}
+\includegraphics[width=14em]{fig/256t1_e200D5_cm.pdf}
+\caption{Confusion Matrix: CodeBookForestSize=256; ClassifierForestSize=200; Depth=5}
+\label{fig:p3_cm}
+\end{center}
+\end{figure}
+
# Comparison of methods and conclusions
-Overall we observe marginally higher accuracy when using K-means codebooks compared to RF codebook at the expense of a higher execution time for training and testing.
+Overall we observe marginally higher accuracy when using the K-means codebook compared to the RF codebook, at the expense of a higher training execution time. Testing time is similar for both methods, with RF codebooks being slightly faster, as explained in section III.
As discussed in section I, due to the initialization process required for optimal centroid placement, K-means can be undesirable for large
descriptor counts (and in the absence of methods for dimensionality reduction).
@@ -140,15 +156,9 @@ The `water_lilly` is the most misclassified class, both in k-means and RF codebo
# Appendix
-\begin{figure}[H]
-\begin{center}
-\includegraphics[width=14em]{fig/e100k256d5_cm.pdf}
-\caption{Confusion Matrix: K=256, ClassifierForestSize=100, Depth=5}
-\label{fig:km_cm}
-\end{center}
-\end{figure}
+The appendix includes additional figures supporting some of the points presented in the main report.
-\begin{figure}[H]
+\begin{figure}
\begin{center}
\includegraphics[width=8em]{fig/success_km.pdf}
\includegraphics[width=8em]{fig/fail_km.pdf}
@@ -157,15 +167,15 @@ The `water_lilly` is the most misclassified class, both in k-means and RF codebo
\end{center}
\end{figure}
-\begin{figure}[H]
+\begin{figure}
\begin{center}
-\includegraphics[width=14em]{fig/256t1_e200D5_cm.pdf}
-\caption{Confusion Matrix: CodeBookForestSize=256; ClassifierForestSize=200; Depth=5}
-\label{fig:p3_cm}
+\includegraphics[width=14em]{fig/p3_colormap.pdf}
+\caption{Varying leaves and estimators: effect on accuracy}
+\label{fig:p3_colormap}
\end{center}
\end{figure}
-\begin{figure}[H]
+\begin{figure}
\begin{center}
\includegraphics[width=8em]{fig/success_3.pdf}
\includegraphics[width=8em]{fig/fail_3.pdf}
@@ -174,23 +184,6 @@ The `water_lilly` is the most misclassified class, both in k-means and RF codebo
\end{center}
\end{figure}
-\begin{figure}[H]
-\begin{center}
-\includegraphics[width=14em]{fig/p3_colormap.pdf}
-\caption{Varying leaves and estimators: effect on accuracy}
-\label{fig:p3_colormap}
-\end{center}
-\end{figure}
-
-\begin{figure}[H]
-\begin{center}
-\includegraphics[width=12em]{fig/kmean_train_test_time.pdf}
-\includegraphics[width=12em]{fig/p3_train_test_time.pdf}
-\caption{Train and Test time; K-means (left), RF-Codebooks (right)}
-\label{fig:train_test_times}
-\end{center}
-\end{figure}
-
# References