author: Vasil Zlatanov <v@skozl.com> 2019-02-15 18:06:25 +0000
committer: Vasil Zlatanov <v@skozl.com> 2019-02-15 18:06:25 +0000
commit: 3515d8e373173c59b639ae307804dadfd2c80da3
tree: 07d5c2b4f70524863ace0975179b84c626c74d61
parent: 28e0eee3ced106210084054319f94e22515e82b3
parent: 3a8dc4fae2626d0c9e8e9227ff36b0798741221e
Fixes to intro (HEAD, master)
-rw-r--r-- report/paper.md 2
1 files changed, 1 insertions, 1 deletions
diff --git a/report/paper.md b/report/paper.md
index 65755ce..f88d761 100644
--- a/report/paper.md
+++ b/report/paper.md
@@ -101,7 +101,7 @@ Figure \ref{fig:km_cm} shows a confusion matrix for RF Classification on K-means
# RF codebook
-An alternative to codebook creation via K-means involves using an ensemble of totally random trees. We code each descriptor according to which leaf of each tree in the ensemble it is sorted. This effectively performs an unsupervised quantization of our descriptors. The vocabulary size is determined by the number of leaves in each random tree multiplied by the ensemble size. The returned leafe node ID's are then binned into a histogram to create a bag-of-words, much like the one created using K-Means. From comparing execution times of K-means in figure \ref{fig:km_vocsize} and the RF codebook in figure \ref{fig:p3_voc} we observe considerable speed gains from utilising the RF codebook. This may be attributed to the reduced complexity of RF Codebook creation,
+An alternative to codebook creation via K-means involves using an ensemble of totally random trees. We code each descriptor according to which leaf of each tree in the ensemble it is sorted to. This effectively performs an unsupervised quantization of our descriptors. The vocabulary size is determined by the number of leaves in each random tree multiplied by the ensemble size. The returned leaf node IDs are then binned into a histogram to create a bag-of-words, much like the one created using K-Means. From comparing execution times of K-means in figure \ref{fig:km_vocsize} and the RF codebook in figure \ref{fig:p3_voc} we observe considerable speed gains from utilising the RF codebook. This may be attributed to the reduced complexity of RF Codebook creation,
which is $O(\sqrt{D} N \log K)$ compared to $O(DNK)$ for K-means. Codebook mapping given a created vocabulary is also quicker than K-means, $O(\log K)$ (assuming a balanced tree) vs $O(KD)$.
The effect of vocabulary size on classification accuracy can be observed both in figure \ref{fig:p3_voc}, in which we independently vary number of leaves and ensemble size, and figure \ref{fig:p3_colormap}, in which both parameters are varied simultaneously. It is possible to notice that these two parameters make classification accuracy plateau for *leaves*$>80$ and *estimators*$>100$. The peaks of 82% accuracy visible on the heatmap in figure \ref{fig:p3_colormap} are highly dependent on the seed and indicate the range of *good* hyper-parameters.
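
For readers of this hunk, the coding scheme the paragraph describes (sort each descriptor to a leaf in every totally random tree, then bin the leaf IDs into a histogram) can be sketched roughly as below. This is a minimal pure-Python illustration and not the repository's implementation: the names `grow_tree`, `leaf_of`, `fit_codebook` and `encode`, the depth-based stopping rule (the paper limits the number of leaves instead), and the synthetic 8-dimensional descriptors are all assumptions made for the example.

```python
import random


def grow_tree(data, depth, rng, counter):
    """Grow one totally random tree over `data` (a list of equal-length float lists).

    Each split uses a random feature and a random threshold; no labels are used.
    `counter` is a one-element list used to hand out consecutive leaf IDs.
    """
    def make_leaf():
        leaf_id = counter[0]
        counter[0] += 1
        return ("leaf", leaf_id)

    if depth == 0 or len(data) < 2:
        return make_leaf()
    feat = rng.randrange(len(data[0]))
    lo = min(row[feat] for row in data)
    hi = max(row[feat] for row in data)
    if lo == hi:
        return make_leaf()
    thr = rng.uniform(lo, hi)
    left = [row for row in data if row[feat] <= thr]
    right = [row for row in data if row[feat] > thr]
    if not left or not right:
        return make_leaf()
    return ("split", feat, thr,
            grow_tree(left, depth - 1, rng, counter),
            grow_tree(right, depth - 1, rng, counter))


def leaf_of(tree, x):
    """Walk descriptor `x` down `tree` and return the ID of the leaf it lands in."""
    while tree[0] == "split":
        _, feat, thr, left, right = tree
        tree = left if x[feat] <= thr else right
    return tree[1]


def fit_codebook(descriptors, n_trees, depth, seed=0):
    """Build the ensemble; the vocabulary size is the total leaf count over all trees."""
    rng = random.Random(seed)
    trees, sizes = [], []
    for _ in range(n_trees):
        counter = [0]
        trees.append(grow_tree(descriptors, depth, rng, counter))
        sizes.append(counter[0])
    return trees, sizes


def encode(trees, sizes, image_descriptors):
    """Bin the leaf IDs of one image's descriptors into a bag-of-words histogram."""
    hist = [0] * sum(sizes)
    offset = 0
    for tree, size in zip(trees, sizes):
        for x in image_descriptors:
            hist[offset + leaf_of(tree, x)] += 1
        offset += size  # each tree owns its own block of histogram bins
    return hist


# Synthetic stand-in for image descriptors (assumption: 200 descriptors, 8 dimensions).
data_rng = random.Random(1)
train = [[data_rng.random() for _ in range(8)] for _ in range(200)]
image = train[:50]

trees, sizes = fit_codebook(train, n_trees=5, depth=4, seed=0)
hist = encode(trees, sizes, image)
print("vocabulary size:", len(hist), "total votes:", sum(hist))
```

Each descriptor casts one vote per tree, so the histogram total equals the number of descriptors times the ensemble size. Mapping a descriptor costs one root-to-leaf walk per tree, which is where the $O(\log K)$ lookup quoted in the paragraph comes from when the trees are balanced.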