author     nunzip <np.scarh@gmail.com>  2019-02-14 18:36:46 +0000
committer  nunzip <np.scarh@gmail.com>  2019-02-14 18:36:46 +0000
commit     e7bb63c5f3195ee0505ff834df5110f1e2a51c70
tree       5178c0105514e52f5a2a228a0c389ba722d09a8c
parent     809dc0aa8259b180db5da3bd0ed2c93a7e50f786
parent     cd152dae483571d07b47d69e928c4f2da3f4ea45
Merge branch 'master' of skozl.com:e4-vision
-rw-r--r--  report/paper.md  10
1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/report/paper.md b/report/paper.md
index 6f454fa..40a2137 100644
--- a/report/paper.md
+++ b/report/paper.md
@@ -14,7 +14,7 @@ The number of clusters or the number of centroids determines the vocabulary size
## Bag-of-words histogram quantisation of descriptor vectors
-An example of histograms for training and testing images is shown in figure \ref{fig:histo_tr}, computed with a vocabulary size of 100. The histograms of the same class appear to have comparable magnitudes for their respective keywords, demonstrating that a similar number of descriptors mapped to each of the clusters. The effect of the vocabulary size (as determined by the number of K-means centroids) on the classification accuracy is shown in figure \ref{fig:km_vocsize}. A small vocabulary size tends to misrepresent the information contained in the different patches, resulting in poor classification accuracy. Conversely, a large vocabulary size (many K-means centroids) may lead to overfitting. In our tests, we begin to observe a plateau-like effect after a cluster count of 60 in figure \ref{fig:km_vocsize}.
+An example of histograms for training and testing images is shown in figure \ref{fig:histo_tr}, computed with a vocabulary size of 100. The histograms of the same class appear to have comparable magnitudes for their respective keywords, demonstrating that a similar number of descriptors mapped to each of the clusters. The effect of the vocabulary size (as determined by the number of K-means centroids) on the classification accuracy is shown in figure \ref{fig:km_vocsize}. A small vocabulary size tends to misrepresent the information contained in the different patches, resulting in poor classification accuracy. Conversely, a large vocabulary size (many K-means centroids) may lead to overfitting. In our tests, we begin to observe a plateau after a cluster count of 60 in figure \ref{fig:km_vocsize}. This process of partitioning the input space into K distinct clusters is a form of **vector quantisation**.
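
As a rough illustration of this quantisation step, a minimal sketch using scikit-learn and NumPy is given below; the function names, defaults and descriptor layout are assumptions for illustration, not the code used for this report.

```python
import numpy as np
from sklearn.cluster import KMeans

# Minimal sketch, assuming dense patch descriptors stacked row-wise;
# function names and defaults are illustrative, not the report's code.
def build_vocabulary(train_descriptors, vocab_size=100):
    """Cluster pooled training descriptors into `vocab_size` visual words."""
    kmeans = KMeans(n_clusters=vocab_size, random_state=0)
    kmeans.fit(train_descriptors)        # train_descriptors: (n_descriptors, d)
    return kmeans

def bow_histogram(image_descriptors, kmeans):
    """Quantise one image's descriptors to their nearest centroid and count hits."""
    words = kmeans.predict(image_descriptors)
    hist, _ = np.histogram(words, bins=np.arange(kmeans.n_clusters + 1))
    return hist
```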
\begin{figure}
\begin{center}
@@ -89,7 +89,7 @@ test theoretically brings a slight deacrease in time performance due to complexi
\end{center}
\end{figure}
-Figure \ref{fig:km_cm} shows the confusion matrix for the K-means+RF classifier with 256 centroids, a forest size of 100 and a tree depth of 5. The reported accuracy for this case is 82%. Figure \ref{fig:km_succ} reports examples of failure and success cases obtained from this test, with the top-performing classes being `trilobite` and `windsor_chair`. `Water_lilly` performed worst on average.
+Figure \ref{fig:km_cm} shows the confusion matrix for RF classification on K-means coded descriptors with 256 centroids, a forest size of 100 and a tree depth of 5. The reported accuracy for this case is 82%. Figure \ref{fig:km_succ} reports examples of failure and success cases obtained from this test, with the top-performing classes being `trilobite` and `windsor_chair`. `Water_lilly` performed worst on average.
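
A minimal sketch of this classification stage is given below, assuming scikit-learn; only the hyperparameters (100 trees, depth 5) come from the text above, while the function and argument names are illustrative.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical sketch of the classification stage; function and argument
# names are assumptions, only the hyperparameters follow the text.
def classify_bow(train_hists, train_labels, test_hists, test_labels):
    """Fit an RF on BoW histograms (e.g. built from 256 K-means centroids)."""
    clf = RandomForestClassifier(n_estimators=100, max_depth=5, random_state=0)
    clf.fit(train_hists, train_labels)
    pred = clf.predict(test_hists)
    # Rows of the confusion matrix are true classes, columns are predictions.
    return accuracy_score(test_labels, pred), confusion_matrix(test_labels, pred)
```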
\begin{figure}
\begin{center}
@@ -101,7 +101,7 @@ Figure \ref{fig:km_cm} shows a confusion matrix for K-means+RF CLassifier with 2
# RF codebook
-An alternative to codebook creation via K-means involves using an ensemble of totally random trees. We code each descriptor according to which leaf of each tree in the ensemble it is sorted into. This effectively performs an unsupervised transformation of our descriptors to a high-dimensional sparse representation. The vocabulary size is determined by the number of leaves in each random tree multiplied by the ensemble size. Comparing the execution times of K-means in figure \ref{fig:km_vocsize} and the RF codebook in figure \ref{fig:p3_voc}, we observe considerable speed gains from utilising the RF codebook. This may be attributed to the reduced complexity of RF codebook creation,
+An alternative to codebook creation via K-means involves using an ensemble of totally random trees. We code each descriptor according to which leaf of each tree in the ensemble it is sorted into. This effectively performs an unsupervised quantisation of our descriptors. The vocabulary size is determined by the number of leaves in each random tree multiplied by the ensemble size. Comparing the execution times of K-means in figure \ref{fig:km_vocsize} and the RF codebook in figure \ref{fig:p3_voc}, we observe considerable speed gains from utilising the RF codebook. This may be attributed to the reduced complexity of RF codebook creation,
which is $O(\sqrt{D} N \log K)$ compared to $O(DNK)$ for K-means. Codebook mapping given a created vocabulary is also quicker than K-means, $O(\log K)$ (assuming a balanced tree) vs $O(KD)$.
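
A sketch of such an RF codebook is given below, using scikit-learn's `RandomTreesEmbedding` (an ensemble of totally random trees) as one possible implementation; the function names and parameter values are illustrative assumptions, not the repository's code.

```python
import numpy as np
from sklearn.ensemble import RandomTreesEmbedding

# Hypothetical sketch of an RF codebook built with totally random trees;
# names and parameter values are illustrative assumptions.
def build_rf_codebook(train_descriptors, n_trees=100, n_leaves=80):
    """Fit totally random trees on pooled descriptors (unsupervised, no labels)."""
    embed = RandomTreesEmbedding(n_estimators=n_trees, max_leaf_nodes=n_leaves,
                                 max_depth=None, random_state=0)
    embed.fit(train_descriptors)
    return embed

def rf_bow_histogram(image_descriptors, embed):
    """Sum each descriptor's sparse leaf-indicator code for one image.

    The histogram has one bin per (tree, leaf) pair, i.e. a vocabulary of
    roughly n_trees * n_leaves visual words.
    """
    codes = embed.transform(image_descriptors)  # sparse (n_descriptors, n_bins)
    return np.asarray(codes.sum(axis=0)).ravel()
```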
The effect of vocabulary size on classification accuracy can be observed both in figure \ref{fig:p3_voc}, in which we independently vary the number of leaves and the ensemble size, and in figure \ref{fig:p3_colormap}, in which both parameters are varied simultaneously. Classification accuracy plateaus for *leaves*$>80$ and *estimators*$>100$. The peaks of 82% accuracy visible on the heatmap in figure \ref{fig:p3_colormap} are highly dependent on the seed and indicate the range of *good* hyperparameters.
@@ -150,10 +150,6 @@ For the Caltech_101 dataset, a RF codebook seems to be the most suitable method
The `water_lilly` class is the most misclassified, for both the K-means and RF codebooks (refer to figures \ref{fig:km_cm} and \ref{fig:p3_cm}). This indicates that the features obtained from this class do not yield very discriminative splits, resulting in the prioritisation of other features in the first nodes of the decision trees.
-All code/graphs and configurable scripts can be found on our repository:
-
-``git clone https://git.skozl.com/e4-vision/``
-
# References
<div id="refs"></div>